The archive process
Enabling archiving on a versioned dataset creates and populates the archive class with the current data present in the DEFAULT version. The archive class uses the gdb_from_date and the gdb_to_date to maintain the time the change was archived.
Representing time
It is important to understand how ArcGIS represents time when change is recorded. History can be recorded as either valid time or transaction time. Valid time is the actual moment at which a change occurred in the real world and is typically recorded by the user who is applying the change. Transaction time is the time an event was recorded in the database. Transaction times are generated automatically by the system.
ArcGIS uses transaction time, which is based on the current server time, to record change to the data when changes are saved or posted to the DEFAULT version. Transaction time and the time the event occurred in the real world are rarely the same time. Time will lapse between an event happening in the real world and when the event is recorded in the database. For example, a parcel is sold on May 14, 2006; however, the change is not recorded to the data until June 5, 2006. The transaction time of June 5, 2006, is recorded in the archive class for this change.
When the edit occurs, ArcGIS will archive the transaction to the archive class. The difference between the time of the real-world event and the transaction time may seem insignificant, but it becomes more apparent when queries are performed against the archived information. Backlogs in editing and updating data are not uncommon in production systems, and they result in the time difference and lag between valid and transaction time.
The difference between valid and transaction time is also an issue in situations where history is recorded in a multiuser environment with many different users or departments editing the database. The sequence in which changes are performed and logged in the database may not be the same order in which those changes occurred in the real world.
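The ordering problem can be illustrated with a small sketch. The parcel sale dates come from the example above; the second change, the dictionary keys, and the variable names are hypothetical and are there only to show that sorting by transaction time can differ from sorting by valid time.

```python
from datetime import date

# Each entry carries the real-world (valid) date and the date the edit
# was saved to the database (transaction date).
changes = [
    # From the example above: sold May 14, recorded June 5.
    {"change": "parcel sold", "valid": date(2006, 5, 14), "transaction": date(2006, 6, 5)},
    # Hypothetical second edit made by another department.
    {"change": "parcel subdivided", "valid": date(2006, 5, 20), "transaction": date(2006, 5, 22)},
]

# Order in which the events happened in the real world.
by_valid = [c["change"] for c in sorted(changes, key=lambda c: c["valid"])]

# Order in which the archive would record them.
by_transaction = [c["change"] for c in sorted(changes, key=lambda c: c["transaction"])]

# Lag in days between the real-world event and its recording.
lag_days = [(c["transaction"] - c["valid"]).days for c in changes]
```

In this sketch the sale happened first but is recorded last, so a query against the archive for a moment between May 22 and June 5 would see the subdivision but not the sale.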
Enabling archiving
Upon enabling archiving, all rows representing the DEFAULT version for the given class are copied to the archive class with the same timestamp. The gdb_from_date attribute for all rows is time-stamped with the date and time of the enable archiving operation. The gdb_to_date attribute for all rows is time-stamped with 12/31/9999. Any row whose gdb_to_date is 12/31/9999 is the current representation of the object in the DEFAULT version. When edits are saved or posted to the DEFAULT version, the geodatabase automatically archives the changes to the archive class. This means:
- Features created in the DEFAULT version are represented in the archive class as rows with the attribute value for the gdb_from_date attribute set to the timestamp of the archive operation and the gdb_to_date attribute set to 12/31/9999.
- Features updated in the DEFAULT version update the associated row in the archive class by setting its gdb_to_date attribute to the timestamp of the archive operation, and insert a new row with the gdb_from_date attribute set to that same timestamp and the gdb_to_date attribute set to 12/31/9999.
- Features deleted in the DEFAULT version update the associated row in the archive class by setting the gdb_to_date attribute value equal to the timestamp of the archive operation.
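The three rules above can be sketched as a small in-memory model. This is not how the geodatabase implements archiving; the function names, the `oid` and `area` fields, and all timestamps are hypothetical, and a plain list of dictionaries stands in for the archive class.

```python
from datetime import datetime

# Placeholder for the 12/31/9999 value that marks the current row.
CURRENT = datetime(9999, 12, 31)

def archive_insert(archive, oid, attrs, ts):
    """Feature created: add a row that is open-ended (current)."""
    archive.append({"oid": oid, **attrs, "gdb_from_date": ts, "gdb_to_date": CURRENT})

def archive_delete(archive, oid, ts):
    """Feature deleted: close the current row at the archive timestamp."""
    for row in archive:
        if row["oid"] == oid and row["gdb_to_date"] == CURRENT:
            row["gdb_to_date"] = ts

def archive_update(archive, oid, attrs, ts):
    """Feature updated: close the current row, then add the new representation."""
    archive_delete(archive, oid, ts)
    archive_insert(archive, oid, attrs, ts)

# Hypothetical history of one parcel: created, updated, then deleted.
archive = []
archive_insert(archive, 117, {"area": 500}, datetime(2005, 7, 9, 14, 0, 0))
archive_update(archive, 117, {"area": 650}, datetime(2005, 7, 14, 9, 30, 0))
archive_delete(archive, 117, datetime(2005, 8, 1, 9, 0, 0))
```

After these three operations the model holds two rows for the parcel, neither of which is current: the first is closed at the update timestamp, and the second starts at that same timestamp and is closed at the delete timestamp.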
Updating the archive table is performed within a single database transaction. If any errors are encountered during the transaction, the entire archive operation is rolled back, and the save or posting operation is therefore not completed. Once the error has been rectified, perform the save or post operation again.
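The all-or-nothing behavior can be sketched with Python's built-in sqlite3 module standing in for the enterprise database. The table name `parcels_h`, its columns, and the simulated failure are assumptions made for illustration; the sketch only shows that when one statement in the transaction fails, the earlier update is rolled back as well.

```python
import sqlite3

CURRENT = "9999-12-31 23:59:59"

conn = sqlite3.connect(":memory:")
# Hypothetical archive table; NOT NULL lets us provoke an error below.
conn.execute(
    "CREATE TABLE parcels_h ("
    "oid INTEGER, gdb_from_date TEXT NOT NULL, gdb_to_date TEXT NOT NULL)"
)
conn.execute("INSERT INTO parcels_h VALUES (117, '2005-07-09 14:33:43', ?)", (CURRENT,))
conn.commit()

ts = "2005-07-14 09:30:00"  # hypothetical archive timestamp
try:
    with conn:  # commits on success, rolls back if an exception is raised
        # Step 1: close the current row.
        conn.execute(
            "UPDATE parcels_h SET gdb_to_date = ? WHERE oid = 117 AND gdb_to_date = ?",
            (ts, CURRENT),
        )
        # Step 2: insert the new row; the NULL simulates an error mid-transaction.
        conn.execute("INSERT INTO parcels_h VALUES (117, ?, NULL)", (ts,))
except sqlite3.IntegrityError:
    pass  # in practice: fix the cause, then save or post again

rows = conn.execute("SELECT oid, gdb_from_date, gdb_to_date FROM parcels_h").fetchall()
```

Because the insert failed, the update from step 1 is undone too, and the table still contains only the original current row.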
For each archive operation, the DEFAULT historical marker is updated with the timestamp of the archive operation. This ensures that when you choose the DEFAULT historical marker while working with a historical version, the current representation of the archive class is equivalent to the versioned class's representation in the transactional DEFAULT version.
Accessing the archive class can actually consume fewer database resources than working with the equivalent versioned class.
Application developers interested in the event that captures the moment of the archive operation can refer to the OnArchiveUpdated event on the IVersionEvents2 interface in the Software Developer Kit.
Queries on historical versions run against the archive class.
Queries on transactional versions still run against the base and delta tables.
Adding a feature
Consider parcel number 116 in a cadastral database and its corresponding row in the archive class. The gdb_from_date shows the time and date of creation, while the gdb_to_date shows 12/31/9999, because the feature has not been modified or deleted since archiving was enabled.
When a feature is inserted (parcel 117) and the edits are posted to the DEFAULT version, a row is inserted in the archive class with the gdb_from_date set to the timestamp of this post operation. The gdb_to_date attribute in the new row shows 12/31/9999 because this feature has yet to be updated or deleted.
Updating a feature
When a feature is updated, the gdb_to_date of its current row is set to the timestamp of the archive operation, and a new row is inserted to show the current representation of the feature. The gdb_from_date in this new row is set to the time of the archive operation, while the gdb_to_date shows 12/31/9999, since it has yet to be modified or deleted.
The following diagram shows two parcels, 116 and 117, with their corresponding gdb_from_date and gdb_to_date attributes in the archive class prior to performing the update operation.
If the parcel boundary for parcel 117 is extended, and these edits are posted to the DEFAULT version, the gdb_to_date is updated with the timestamp of the archive operation, and a new row is created. The gdb_from_date attribute in this new row is set with the time and date of the archive operation.
For example, queries for moments prior to the update (7/14/2005 5:34:22 PM) show parcel 117 as it existed prior to the update. Queries for moments before 7/9/2005 2:33:43 PM will not show parcel 117 because it had not been created. Any query for a moment after the update (7/14/2005 3:45:23 AM) will show parcel 117 in its current representation with the extended boundary.
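A moment query of this kind can be sketched as a filter on the two date attributes. The sketch assumes a row is visible at a moment when gdb_from_date is at or before that moment and gdb_to_date is after it; that comparison, the `rows_at` helper, and the `boundary` field are illustrative assumptions rather than documented behavior. The creation timestamp is the one given above, and 7/14/2005 5:34:22 PM is taken as the update timestamp.

```python
from datetime import datetime

CURRENT = datetime(9999, 12, 31, 23, 59, 59)   # stands in for 12/31/9999
created = datetime(2005, 7, 9, 14, 33, 43)     # 7/9/2005 2:33:43 PM
updated = datetime(2005, 7, 14, 17, 34, 22)    # assumed update timestamp

# Archive rows for parcel 117 after the boundary update.
archive = [
    {"parcel": 117, "boundary": "original", "gdb_from_date": created, "gdb_to_date": updated},
    {"parcel": 117, "boundary": "extended", "gdb_from_date": updated, "gdb_to_date": CURRENT},
]

def rows_at(archive, moment):
    """Return the rows that represent the data at the given moment."""
    return [r for r in archive if r["gdb_from_date"] <= moment < r["gdb_to_date"]]

before_creation = rows_at(archive, datetime(2005, 7, 1))
before_update = rows_at(archive, datetime(2005, 7, 10))
after_update = rows_at(archive, datetime(2005, 7, 20))
```

Under these assumptions, the first query returns no rows, the second returns the row with the original boundary, and the third returns the row with the extended boundary.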
Deleting a feature
When a feature is deleted, the gdb_to_date is updated with the timestamp of the archive operation. The following diagram shows parcels 116 and 117 with their corresponding gdb_from_date and gdb_to_date attributes in the archive class.
If parcel 117 is now deleted, and these edits are posted to the DEFAULT version, the gdb_to_date attribute is updated with the timestamp of the archive operation.
Technical note on archiving
The following scenario can create a time gap in the archive class:
An editor is directly editing the DEFAULT version and deletes an object in an edit session.
The editor then saves the edits, which updates the gdb_to_date attribute of the archive class with the timestamp of the deletion of that object.
If the same object is updated in a child version and reconciled with the DEFAULT version, there will be a conflict.
If, during the conflict resolution process, the editor chooses to replace the conflict with the updated representation of the row, the row will be restored in the DEFAULT version when the version is posted. The archive operation inserts a new row into the archive class and sets the gdb_from_date attribute to the timestamp and gdb_to_date to 12/31/9999.
Therefore, when the editor looks at the object’s lineage through time, the dates will contain a gap between the gdb_to_date and gdb_from_date when the object did not exist in the DEFAULT version.
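Such a gap can be found by comparing consecutive rows in an object's lineage. The following is a minimal sketch, assuming the rows for one object have already been read from the archive class into dictionaries; the `lineage_gaps` helper and the timestamps are hypothetical.

```python
from datetime import datetime

CURRENT = datetime(9999, 12, 31, 23, 59, 59)  # stands in for 12/31/9999

def lineage_gaps(rows):
    """Return (gap_start, gap_end) pairs where the object had no archive row."""
    ordered = sorted(rows, key=lambda r: r["gdb_from_date"])
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        # A later row that starts after the earlier row ended leaves a gap.
        if nxt["gdb_from_date"] > prev["gdb_to_date"]:
            gaps.append((prev["gdb_to_date"], nxt["gdb_from_date"]))
    return gaps

# Hypothetical lineage of one object in the scenario described above.
lineage = [
    # Row closed when the object was deleted directly in DEFAULT.
    {"gdb_from_date": datetime(2005, 7, 9, 14, 33, 43), "gdb_to_date": datetime(2005, 7, 20, 10, 0, 0)},
    # Row inserted when the child version restored the object on post.
    {"gdb_from_date": datetime(2005, 7, 25, 16, 0, 0), "gdb_to_date": CURRENT},
]

gaps = lineage_gaps(lineage)
```

Here the single reported gap runs from the deletion timestamp to the post timestamp, which is the period when the object did not exist in the DEFAULT version.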