The geodatabase compress operation
The geodatabase compress operation removes unnecessary states and rows from the system tables that track versions and versioned edits.
To understand compression, you must first understand how versioning works. If you are unfamiliar with this concept, see Understanding versioning.
What is a compress operation?
The compress operation removes the states that are no longer referenced by a version and can move rows in the delta tables to the business table. A compress operation can only be performed by the ArcSDE administrator and operates against all states in the geodatabase, regardless of the version owner.
Compress operations are necessary because, as a geodatabase is edited over time, the delta tables increase in size and the number of states increases. The larger the tables and the more states, the more data ArcGIS must process every time you display or query a version. Therefore, the greatest impact on performance is not the number of versions but the amount of change contained in the delta tables for each version. As a result, versions can have different query response times.
To maintain database performance, the ArcSDE administrator must periodically run a compress operation to remove unused data.
You can compress a geodatabase with the ArcCatalog Compress command, the Compress geoprocessing tool, or a Python script. See Adding the Compress command to ArcCatalog for information on how to make the Compress command available in ArcCatalog, or Compress for information on the geoprocessing tool or script.
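As a minimal sketch of the scripted route, the following Python wraps the Compress geoprocessing tool as exposed through arcpy (arcpy.Compress_management). It assumes a release that provides the arcpy site package (ArcGIS 10.0 or later) and a connection made as the ArcSDE administrator. The connection file path in the usage comment is hypothetical, and the optional compress parameter is an illustrative addition so the wrapper can be exercised on a machine without ArcGIS.

```python
def compress_geodatabase(admin_connection, compress=None):
    """Run the Compress tool against an ArcSDE geodatabase.

    admin_connection: path to a connection file that connects as the
        ArcSDE administrator (hypothetical example path below).
    compress: optional callable used in place of the arcpy tool;
        this parameter is an assumption added for illustration.
    """
    if compress is None:
        # Imported lazily so the module loads without ArcGIS installed.
        import arcpy
        compress = arcpy.Compress_management
    return compress(admin_connection)


# Example call (hypothetical path; requires ArcGIS and admin credentials):
# compress_geodatabase(r"C:\connections\gdb_admin.sde")
```

Because the operation runs against every state in the geodatabase, a script like this is usually scheduled for a time when few or no users are connected.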
What happens during a compress operation?
The compress operation first reads the instance's state tree configuration into memory. Using this information, it deletes all states that do not participate in any version's lineage. Deleting a state also deletes all the rows in the delta tables that are associated with that state.
Next, the compress operation collapses any candidate lineage of states into one state. A candidate lineage is a collection of states that can be compressed into one state without affecting the logical representation of any table in a given version.
The final step, when applicable, is to move rows from the delta tables into the business tables.
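The first step can be illustrated with a toy model. The sketch below is not ArcSDE's implementation: the dictionaries and row layout are hypothetical stand-ins for the state tree, the versions, and the delta table rows. It shows only the trimming of states that lie outside every version's lineage, and it does not model collapsing lineages or moving rows to the business tables.

```python
def lineage(state_id, parent_of):
    """Return the set of states from state_id back to the root state."""
    states = set()
    current = state_id
    while current is not None:
        states.add(current)
        current = parent_of.get(current)
    return states


def trim_unreferenced_states(parent_of, versions, delta_rows):
    """Drop states outside every version's lineage, plus their delta rows."""
    keep = set()
    for state_id in versions.values():
        keep |= lineage(state_id, parent_of)
    trimmed_parents = {s: p for s, p in parent_of.items() if s in keep}
    trimmed_rows = [row for row in delta_rows if row["state_id"] in keep]
    return trimmed_parents, trimmed_rows


# Hypothetical state tree: state 0 is the root; states 3 and 4 belonged
# to a version that has since been deleted.
parent_of = {0: None, 1: 0, 2: 1, 3: 1, 4: 3}
versions = {"DEFAULT": 2}
delta_rows = [
    {"state_id": 1, "table": "parcels"},
    {"state_id": 2, "table": "parcels"},
    {"state_id": 4, "table": "parcels"},
]

trimmed_parents, trimmed_rows = trim_unreferenced_states(
    parent_of, versions, delta_rows
)
```

In this model, states 3 and 4 and the delta row tied to state 4 are removed, while the lineage of DEFAULT (states 0, 1, and 2) is kept.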
For each step of the operation, database transactions are started and stopped for each table being compressed. These transactions ensure that each table remains consistent during each step of the process.
Because the operation is designed to be transactionally consistent, it can be stopped, or killed, while it is executing. If the operation encounters an error, fails, or abruptly stops, the versioned tables being compressed are still logically correct with respect to any version's representation. For example, if you start a compress operation while users are connected to the geodatabase and then discover it is consuming a large amount of system resources, you can stop it and run it again when fewer or no users are connected.
Since the compress transaction can be large, be sure to create enough logical logs and have enough log file space to handle the transaction.
Fully compressing a geodatabase
In a fully compressed geodatabase, there are no rows in the delta tables and the state tree is trimmed back to zero. Performance improvement is greatest if the geodatabase is fully compressed. To achieve this, do the following:
- Reconcile and post all outstanding changes in child versions to the DEFAULT version.
- Delete the versions themselves.
- Make sure no user is connected.*
- Perform the compress operation.
*ArcIMS services (except ArcIMS map services) do not acquire locks on states and, therefore, would not influence the compress operation. ArcGIS clients, including ArcIMS map services, do acquire locks and, therefore, will influence the compress operation.
You can see the results of each compress operation in the COMPRESS_LOG table in the geodatabase (SDE_compress_log in SQL Server and PostgreSQL databases). You can also check the VERSIONS table (SDE_versions in SQL Server and PostgreSQL databases) to see if the state ID for the DEFAULT version has returned to zero. If it has and there are no other outstanding versions, full compression has been achieved.
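The check described above can be expressed as a small Python helper. This is a sketch under stated assumptions: each row is a hypothetical (version name, state ID) pair that you have already read from the VERSIONS table, and fetching the rows is left out because it depends on your DBMS and connection method.

```python
def is_fully_compressed(version_rows):
    """Return True if DEFAULT is the only version and its state ID is 0.

    version_rows: iterable of (version_name, state_id) pairs; this
        input shape is an assumption made for illustration.
    """
    rows = list(version_rows)
    default_states = [state for name, state in rows if name.upper() == "DEFAULT"]
    other_versions = [name for name, _ in rows if name.upper() != "DEFAULT"]
    return default_states == [0] and not other_versions


# Hypothetical rows before and after a full compress.
before = [("DEFAULT", 42), ("design_v1", 37)]
after = [("DEFAULT", 0)]
```

With these sample rows, the helper returns False for before (an outstanding version exists and DEFAULT is not at state 0) and True for after.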
It may not always be possible to reconcile, post, delete versions, and disconnect all users before a compress operation. For instance, if you are tracking history using versions or need to maintain design versions for a project, the historic and design versions remain pinned to a state within the state tree; therefore, these states will not be removed during compression of the geodatabase. You can successfully compress without doing all these steps, and you will still see some performance improvements.
See Compressing an ArcSDE geodatabase licensed under ArcGIS Server Enterprise to learn how to perform a compress operation.
Frequency of compress operations
The frequency with which you need to perform a compress operation is based on the amount of editing that takes place in your geodatabase. If you have a high volume of edits, you should probably compress the geodatabase once a day. For average or low edit volumes, you should compress at least once a week.
It is important not to wait too long between compress operations; the greater the amount of versioned editing activity that takes place, the longer it will take to compress the geodatabase. If you do not compress the geodatabase at least once a week, compression could take several hours to complete when you do finally run it.
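If compress operations are scheduled by script, the guidance above can be encoded as a simple check. The sketch below assumes you record when the last compress finished and classify the edit volume yourself. The function name and parameters are hypothetical, and the one-day and seven-day intervals are taken directly from the recommendations in this section.

```python
from datetime import datetime, timedelta


def compress_due(last_compress, now, high_edit_volume):
    """Return True if the recommended interval since the last compress has passed."""
    # Daily for high edit volumes, weekly otherwise.
    interval = timedelta(days=1) if high_edit_volume else timedelta(days=7)
    return now - last_compress >= interval


# Example: last compress finished at midnight on 1 January 2024.
last = datetime(2024, 1, 1)
due_high_volume = compress_due(last, datetime(2024, 1, 2), high_edit_volume=True)
due_low_volume = compress_due(last, datetime(2024, 1, 2), high_edit_volume=False)
```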
After compressing a geodatabase
You should update the statistics on your geodatabase after you have run a compress operation. For information on updating statistics, see Updating statistics on a geodatabase using Analyze and the topic for your database management system (DBMS).
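A post-compress statistics update might be scripted along the following lines. This sketch assumes arcpy is available and uses the Analyze geoprocessing tool (arcpy.Analyze_management), which takes a dataset and a semicolon-delimited list of components. The default component list assumes versioned datasets and may need adjusting for your data, the dataset path in the usage comment is hypothetical, and the optional analyze parameter is an illustrative addition for running the helper without ArcGIS.

```python
def update_statistics(datasets, components="BUSINESS;ADDS;DELETES", analyze=None):
    """Update statistics on each dataset after a compress operation.

    datasets: iterable of dataset paths (hypothetical example below).
    components: tables to analyze; the default is an assumption that
        the datasets are registered as versioned.
    analyze: optional callable used in place of the arcpy tool;
        this parameter is an assumption added for illustration.
    """
    if analyze is None:
        # Imported lazily so the module loads without ArcGIS installed.
        import arcpy
        analyze = arcpy.Analyze_management
    analyzed = []
    for dataset in datasets:
        analyze(dataset, components)
        analyzed.append(dataset)
    return analyzed


# Example call (hypothetical path; requires ArcGIS):
# update_statistics([r"C:\connections\gdb_admin.sde\gdb.owner.parcels"])
```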