Data maintenance strategies
Transactions against geographic data can vary widely in duration and complexity. The geodatabase supports two data maintenance strategies—maintenance with and without versions—which balance the needs of users and applications to perform short and long transactions on data that is simple or complex.
Each strategy can be applied on a feature-class-by-feature-class or table-by-table basis, so it's possible to make use of both of them in the same geodatabase.
The way you edit data in each of these strategies is similar—you edit within an edit session and work with many of the same tools. What differs is how the underlying data sources are maintained. There are also some differences in which data you can edit and the type of workflows you can perform. This topic explains these differences.
Data maintenance without versions
This strategy does not involve working with multiple versions—it simply makes use of the underlying DBMS transaction model. Nonversioned edits are equivalent to standard database transactions.
To edit data, you enable nonversioned editing from the Editor Options dialog box, start an edit session, and perform the required operations, such as adding, deleting, or moving features and updating attributes. Your first edit in the edit session begins the transaction. When you save, the individual edit operations you've performed are committed to the database as a single transaction. After you save, the next edit you make begins a new transaction. You can save as few or as many operations at a time as required during the edit session. However, frequent saving is recommended, because an open transaction can lock the data you are editing and block other users from accessing or editing it. Once saved, the changes are available to all other users and applications accessing the data.
If you do not want to commit your edits to the database, stop editing without saving. All the edits in that transaction—all edits since your last save or, if you have not yet saved, all edits since the edit session began—will be rolled back and will not be committed to the database.
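The save and discard behavior described above maps onto an ordinary commit and rollback. The following is a minimal sketch of that mapping, not ArcGIS code: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the DBMS, and the parcels table and its columns are hypothetical.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the sketch
# controls transactions explicitly with BEGIN / COMMIT / ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("INSERT INTO parcels VALUES (1, 'Smith'), (2, 'Jones')")


def owners(connection):
    """Return the owner column in id order, as other users would see it."""
    rows = connection.execute("SELECT owner FROM parcels ORDER BY id")
    return [row[0] for row in rows]


# The first edit in the edit session begins the transaction.
conn.execute("BEGIN")
conn.execute("UPDATE parcels SET owner = 'Lee' WHERE id = 1")
conn.execute("INSERT INTO parcels VALUES (3, 'Garcia')")
# "Save": both edit operations commit as a single transaction.
conn.execute("COMMIT")
after_save = owners(conn)

# The next edit begins a new transaction.
conn.execute("BEGIN")
conn.execute("DELETE FROM parcels WHERE id = 2")
# "Stop editing without saving": everything since the last save is rolled back.
conn.execute("ROLLBACK")
after_discard = owners(conn)

print(after_save)
print(after_discard)
```

In this sketch the delete made after the save never reaches the table, while the two edits made before the save persist together.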
As you edit, any unique indexes, constraints, and triggers defined on the data with the DBMS apply. All the same locking behavior applies as if you were performing transactions on the data with the DBMS directly. Therefore, there is the potential for users or applications that access or modify the same data to block one another. When using nonversioned editing in a multiuser editing environment, you should understand how isolation levels and locking work in your DBMS and, if necessary, set the correct isolation level in the DBMS before you start working with ArcGIS.
This strategy is appropriate for simple features for which you don't require the ability to manage history or multiple representations of the data with versions. Since this strategy doesn't require versions, it's also useful if you require both GIS and non-GIS applications to share access to a common database.
Potential applications
- Easily integrate geographic data into existing applications by allowing third-party applications (those not created with ESRI software) to read and modify the same data accessed by ArcGIS applications.
- Manage projects with simple workflows and edits. If transactions are always simple and of short duration, you can modify the data directly without the need to merge changes and periodically manage additional tables required for versions.
Limitations
- You can edit simple data only—points, lines, polygons, annotation, and relationships. You can't edit feature classes that participate in a topology, geometric network, or terrain.
- Since you edit the data source directly, you cannot undo or redo an individual edit if you make a mistake. The only way to undo edits is to abort all edits made since your last save by stopping the edit session without saving.
- Each transaction must be started and completed within a single edit session. An edit session can last only as long as you stay connected, so in practice a transaction cannot extend beyond a few hours, for example, past the point where you log off at the end of the day.
- There is no conflict detection with nonversioned editing. If one user updates a feature and saves, then another user updates the same feature and saves, the last update made overwrites the first.
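The last limitation can be illustrated with a short sketch. It is not ArcGIS code: it assumes SQLite as a stand-in DBMS, a hypothetical parcels table, and two hypothetical editors who each read a feature and later write the whole row back. Whether an untouched attribute is also overwritten depends on how the client writes rows; this sketch assumes full-row updates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT, zoning TEXT)"
)
conn.execute("INSERT INTO parcels VALUES (1, 'Smith', 'R1')")
conn.commit()


def read_feature(connection, fid):
    """Load one row into a dict, as an editor would load a feature."""
    row = connection.execute(
        "SELECT owner, zoning FROM parcels WHERE id = ?", (fid,)
    ).fetchone()
    return {"owner": row[0], "zoning": row[1]}


def save_feature(connection, fid, feature):
    """Write the whole row back and commit; no check for intervening edits."""
    connection.execute(
        "UPDATE parcels SET owner = ?, zoning = ? WHERE id = ?",
        (feature["owner"], feature["zoning"], fid),
    )
    connection.commit()


# Both editors load the same feature before either one saves.
editor_a = read_feature(conn, 1)
editor_b = read_feature(conn, 1)

editor_a["owner"] = "Lee"
save_feature(conn, 1, editor_a)

editor_b["zoning"] = "C2"
save_feature(conn, 1, editor_b)  # writes back the stale owner value

final = read_feature(conn, 1)
print(final)
```

Neither save fails or warns, and the first editor's change to the owner is silently lost.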
For more information, see A quick tour of working with nonversioned data.
Data maintenance with versions
The geodatabase extends the standard DBMS transaction by allowing multiple concurrent states of the database, known as versions, to exist at the same time. Each version can represent ongoing work, such as a design or a group of work orders, work that can span multiple connections to the database and extend over a period of weeks or months if necessary. Versions allow you to manage past, present, and proposed changes to the data, all in the same geodatabase.
To manage past changes, you save any changes to the data in separate archive tables. You can keep these changes for as long as required, allowing users to see how the database looked at some previous point in time. This capability is referred to as archiving and is built into ArcGIS; no development is required. When you enable this capability, changes to the DEFAULT version, which is normally used as the published version of the database, are archived automatically.
To manage current changes, editors can modify their own private version of the geodatabase so that other users cannot see incomplete work. When you edit a version of the data, you don't apply locks. This maximizes concurrency: other users can read and edit the same data you're modifying, and you don't block one another from accessing the database. Once editors have finished their changes, they can integrate them into the published version.
To manage proposed changes, you can develop a scenario or perform a what-if analysis within a version of the database. The scenario can be managed as a single unit of change, spanning multiple edit sessions and days, weeks, or months. You can freely add proposed features, perform geographic analysis, and produce maps—all without affecting the database other users are accessing. Once the changes are complete and have been approved, you can integrate them into the rest of the geodatabase.
There is no limit to the number of versions a geodatabase can have. Versions can be arranged in various configurations and support a wide variety of workflows. However, for the sake of simplicity and geodatabase management considerations, a recommended best practice is to either maintain a flat version tree or have multiple editors concurrently editing the DEFAULT version.
To support versioning capabilities, ArcGIS doesn't duplicate data. Instead, it leaves each feature class and table in its original format but records any changes in tables known as delta tables. Delta tables consist of an adds table for inserts and updates and a deletes table for deletes. Each time you insert, update, or delete a record in any version, rows are added to one or both of these tables. When you query or display a feature class or table in a version, ArcGIS assembles the relevant rows from the delta tables and the original table to present a seamless view of the data.
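The idea of assembling a view from a base table plus delta tables can be sketched as follows. This is a deliberately simplified model, not the actual geodatabase schema: it assumes SQLite as the DBMS, a single linear sequence of numbered states rather than a version tree, and hypothetical table and column names (parcels_base, parcels_adds, parcels_deletes, state_id, row_state). In this model an update is recorded as one row in each delta table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcels_base (id INTEGER, owner TEXT);
-- Adds table: inserted rows and the new image of updated rows.
CREATE TABLE parcels_adds (id INTEGER, owner TEXT, state_id INTEGER);
-- Deletes table: which row image was removed (row_state 0 = base row)
-- and the state in which it was removed.
CREATE TABLE parcels_deletes (id INTEGER, row_state INTEGER, state_id INTEGER);

INSERT INTO parcels_base VALUES (1, 'Smith'), (2, 'Jones');

-- State 1: update parcel 1 (retire the base row, add the new image).
INSERT INTO parcels_deletes VALUES (1, 0, 1);
INSERT INTO parcels_adds VALUES (1, 'Lee', 1);
-- State 2: delete parcel 2.
INSERT INTO parcels_deletes VALUES (2, 0, 2);
-- State 3: insert parcel 3.
INSERT INTO parcels_adds VALUES (3, 'Garcia', 3);
""")

VIEW_SQL = """
SELECT id, owner FROM parcels_base AS b
WHERE NOT EXISTS (
    SELECT 1 FROM parcels_deletes AS d
    WHERE d.id = b.id AND d.row_state = 0 AND d.state_id <= :s)
UNION ALL
SELECT id, owner FROM parcels_adds AS a
WHERE a.state_id <= :s AND NOT EXISTS (
    SELECT 1 FROM parcels_deletes AS d
    WHERE d.id = a.id AND d.row_state = a.state_id AND d.state_id <= :s)
ORDER BY id
"""


def rows_at(state):
    """Assemble the table as it appears at the given state."""
    return conn.execute(VIEW_SQL, {"s": state}).fetchall()


original = rows_at(0)
after_update = rows_at(1)
latest = rows_at(3)
base_rows = conn.execute("SELECT id, owner FROM parcels_base ORDER BY id").fetchall()

print(original, after_update, latest, base_rows)
```

The base table is never modified; each state is reconstructed at query time, which is also why growing delta tables can slow down display and queries.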
Versioned tables require periodic maintenance by a database administrator. As a geodatabase is edited over time, delta tables increase in size, affecting display and query performance. To maintain performance, the database administrator can periodically compress a versioned database, an operation that removes redundant information from the delta tables. Versioned databases should be compressed whenever a period of high database activity has ended, for example, at the end of a shift or after loading new data. The compression process can be run while other users are connected and using the database.
ArcGIS can manage the underlying delta tables that support versions in one of two ways:
- By saving all changes, regardless of what version, to the delta tables
- By saving all non-DEFAULT version edits to the delta tables but saving all DEFAULT version edits to the base tables
The first way is designed to support ArcGIS applications exclusively. The second way is useful if you need to maintain the data with both ArcGIS and third-party applications.
Maintaining data exclusively with ArcGIS applications
In an environment where you maintain data exclusively with ArcGIS applications, the best way to manage versions is to save all changes in the delta tables. This allows you to take maximum advantage of the capabilities of the geodatabase including archiving, replication, and the ability to edit geometric networks and topologies.
To enable this behavior on a feature class or table, you register the data as versioned without the option to move edits to the base table. Whenever you save changes to a dataset registered this way, changes are saved in the delta tables. With this approach, direct access to the original tables is not possible—users always access a version of the data.
The following example illustrates a configuration in which data is completely managed through the use of versions. Editors require the use of ArcGIS or another ESRI application to make changes to the data. Non-GIS applications can access the published version or any other version, provided they are adapted with multiversioned views.
Because of the delta and other tables required to support versions, applications written to directly access data in the DBMS without the use of software libraries from ESRI do not have the inherent ability to read versions. ArcGIS provides multiversioned views that allow these applications to read a given version with SQL. Multiversioned views can access both tables and feature classes. Access to the geometric attributes of feature classes using a multiversioned view requires the use of SQL geometry types, which are fully supported by ArcGIS.
This approach provides these benefits:
- Undo or redo changes as you're editing.
- Detect, reconcile, and resolve editing conflicts. Conflicts can arise because versioned editing does not apply locks, and ArcGIS provides tools to handle them.
- Archive changes automatically, and query how the database looked at a particular point in time.
- Edit features in a geometric network or topology.
- Work on synchronized copies of a geodatabase simultaneously from two or more geographically dispersed offices. The offices need to periodically send their changes to one another so that each has an up-to-date view of the geodatabase. ArcGIS refers to this capability as geodatabase replication.
- Edit a portion of the geodatabase on a laptop or handheld device in the field while disconnected from the network. When mobile users return to the office, they integrate their changes into the geodatabase. ArcGIS also refers to this capability as geodatabase replication.
Potential applications include
- Projects requiring the storage and query of archived data.
- Projects requiring a what-if analysis: Create a new design in a separate version. If the design is approved, you can merge it in with the rest of the database. If it is not approved, you can discard it.
- Projects with specific quality assurance requirements: Collect changes to data, such as bulk imports, in a version isolated from other database users. Test and approve the changes before merging them with the published version of the database.
- Projects that divide work into functional or geographic units: For example, a project to design and construct a new shopping mall might have distinct construction phases subdivided into east and west sections or subdivided by construction activities, such as building, utility installations, or landscaping. Each unit of work is undertaken in a separate version; as each version is completed, it is posted to the published version of the database.
- Projects that evolve through a prescribed or regulated group of stages, whereby each stage requires engineering, administrative, or legal approval before it can be considered complete: Workflows for these projects can manage each stage as a separate version, such as initial design or proposed version, an approved version, and a version for the construction phase. As a project advances through the various milestones, each stage is reviewed and approved, then superseded by the next version until the last stage is reached and completed.
- Projects with regulatory audit requirements: Keep a persistent archive to support queries on changes.
- Projects with geographically distant offices that need to work on the same geodatabase simultaneously: They require the ability to periodically synchronize their copies of the data.
- Projects that require maintenance crews in the field to update data with mobile computers.
- Projects that require software developers to test SQL statements and application logic on their own private version of the database.
Versions provide many benefits but have some limitations:
- Third-party applications must be adapted with multiversioned views before they can read data.
- There are restrictions on the use of active DBMS behavior such as unique constraints and triggers when working with versioned data. This is because inserts and updates create new rows in the delta tables instead of modifying rows in the base table.
Maintaining data with ArcGIS and other applications
In a heterogeneous computing environment where you have a number of different departmental applications accessing the same database, you may require the ability to support both ArcGIS and third-party applications. An example of this is if you have one department that maintains the geographic data in the database with ArcGIS and another department that maintains customer records in the same database with a custom application. The custom application needs to apply DBMS constraints and triggers as transactions are made and may not recognize versioned tables. At the same time, the other department needs to edit the geographic data in its own isolated version, not sharing the departmental edits until they are complete and approved.
With these requirements in mind, ArcGIS allows you to perform versioned editing on a feature class or table while retaining the ability to share edits with other applications. To enable this capability on a feature class or table, you register the data as versioned with the option to move edits to the base table. This option is available in the registration dialog box.
When you edit data registered this way, versions work the same way as described in the previous approach; changes are saved in the delta tables. The exception to this is the DEFAULT version. Whenever you save edits to the DEFAULT version, either by editing it directly or merging in changes from another version, the edits are saved in the base tables. The edits do not remain in delta tables as is the case when the Move edits to base option is unchecked.
This allows all applications to work on the same database.
- Applications written without software from ESRI can continue to access and modify data with standard transactions, even if it's being edited in the DEFAULT or another version at the same time.
- Whenever ArcGIS or an application written with ArcObjects saves changes to the DEFAULT version or merges changes into the DEFAULT version, any unique indexes, constraints, and triggers defined on the data with the DBMS apply.
- When one application modifies the data, the changes become immediately available to any other application accessing the data. Since changes to the DEFAULT version are not stored in delta tables, there is no need to adapt third-party applications with multiversioned views so that they can read these tables.
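The routing of edits under the move edits to base option can be sketched in a few lines. This is an illustration only, not ArcGIS code: it assumes SQLite as the DBMS, and the customers_base and customers_adds tables, the save_edit helper, and the design_1 version name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers_base (id INTEGER PRIMARY KEY, account TEXT UNIQUE);
CREATE TABLE customers_adds (id INTEGER, account TEXT, version TEXT);
INSERT INTO customers_base VALUES (1, 'A-100');
""")


def save_edit(version, fid, account):
    """Route an insert: DEFAULT edits go to the base table, others to the adds table."""
    if version == "DEFAULT":
        conn.execute("INSERT INTO customers_base VALUES (?, ?)", (fid, account))
    else:
        conn.execute(
            "INSERT INTO customers_adds VALUES (?, ?, ?)", (fid, account, version)
        )
    conn.commit()


save_edit("design_1", 2, "A-200")  # stays isolated in the delta table
save_edit("DEFAULT", 3, "A-300")   # lands in the base table

# A third-party application reading the base table with plain SQL sees
# the DEFAULT edit but not the edit made in the other version.
base_accounts = [
    row[0] for row in conn.execute("SELECT account FROM customers_base ORDER BY id")
]

# The unique constraint defined in the DBMS applies to DEFAULT edits.
try:
    save_edit("DEFAULT", 4, "A-100")
    constraint_applied = False
except sqlite3.IntegrityError:
    conn.rollback()
    constraint_applied = True

print(base_accounts, constraint_applied)
```

The duplicate account number is rejected by the DBMS itself, which is the behavior a custom application relying on constraints and triggers would expect.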
Potential applications
- Projects requiring data editing by ArcGIS and other applications—Set up a workflow where the other applications access and modify the nonspatial data in the database with standard DBMS transactions, while other users work on specific versions of the same data, performing transactions of relatively long duration that are isolated from all other database users until they post to the DEFAULT version.
- The same potential applications as when you're maintaining data exclusively with ArcGIS applications:
  - Projects requiring a what-if analysis
  - Projects with specific quality assurance requirements
  - Projects that divide work into functional or geographic units
  - Projects that evolve through a prescribed or regulated group of stages
  - Projects that require maintenance crews to update data with mobile computers when they're in the field
  - Projects that require software developers to test SQL statements and application logic
Limitations
When you register a dataset as Versioned with the Move edits to base option, you are limited in how you can work with versions.
- You can edit simple data only—points, lines, polygons, annotation, and relationships. You cannot edit a feature class in a topology, geometric network, or terrain.
- You cannot archive changes for the dataset.
- You cannot replicate the dataset.
- When you edit the DEFAULT version or post a version to the DEFAULT version, you do not have the ability to resolve conflicts, so it is possible to overwrite another user's edits.
For more information about versions and the capabilities they provide, see Understanding versioning and Version scenarios.