How to build effective data models for field data collection
The ArcGIS Mobile editing framework, combined with the power of the geodatabase data and transaction model, caters to a wide variety of field editing solutions and workflows. It is important that you carefully consider what is and is not supported within the geodatabase information and transactional models when determining how best to support field editing applications with ArcGIS Mobile.
Operational and basemap layers
When designing maps that will be edited in the field using ArcGIS Mobile, it is important to define which map layers will be synchronized with a GIS server (operational layers) and which layers will provide supportive background information (basemap layers). If there are layers in your map document that simply make the map look nicer and do not help field-workers complete their tasks, you should consider removing them before publishing the map. Try to keep your map simple. Avoid using definition queries, selection layers, joined tables, and other more advanced ArcMap behavior. Consider the constraints of the device and the conditions in which a mobile map will be used. You might already have mobile applications that connect directly to a map service or use a cache extracted from a mobile service.
Geodatabase support
ArcGIS Mobile supports editing of data within both file-based and multiuser relational geodatabases. Your mobile map can contain layers that reference other data sources, but if you want to capture new features or update existing features, the layer's data source must be a feature class stored within a geodatabase.
These are ArcGIS Mobile feature class requirements:
- Stored inside a file or multiuser ArcSDE geodatabase
- Must contain a GlobalID column
Each field that you add to a feature class in the geodatabase has certain properties that define its behavior when editing or querying against it. The ArcGIS Mobile applications fully leverage those properties throughout the user experience.
- Subtypes—Using the concept of subtypes, it is possible to subclass the content stored within a single feature class. A subtype has a collection of subtype codes, and for each subtype code, you can assign specific domains and default values for other fields in the feature class. Within the ArcGIS Mobile field applications, each subtype code is displayed in a pick list that will in turn set default values and determine other pick list values for other fields.
- Domains—Both coded value domains and range domains are supported by ArcGIS Mobile. Each domain value appears in a pick list when collecting or updating attribute values, and default values are noted with an asterisk (*). If you do not specify a default value for a field with a domain, ArcGIS Mobile will mark that field as required and prompt you to enter a valid value before your edits will be accepted. Range domains provide helpful tips in the application indicating the range of valid values you can enter and, like coded value domains, will mark fields as required if no default value is specified.
- Raster—Fields of type raster are recognized as a special field type that can be used to store photographs. You can either browse for photos and add them to your raster field or use the embedded camera in your mobile device directly.
- Dates—Date fields do not have a concept of a default value; however, the ArcGIS Mobile applications make it simple for you to set the current date and time when editing.
- Phone numbers—If you store phone numbers in a text field and format them so that a Windows Mobile device with an embedded cellular chip recognizes them, you can start the phone and call a number directly from within the mobile application.
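The domain behavior described in the list above can be illustrated with a short sketch. This is plain Python, not the ArcGIS Mobile API: the `Field` class and the `required_fields` and `is_valid` functions are hypothetical names, and the sketch assumes that range domains include both of their end values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Field:
    """Hypothetical, simplified description of a geodatabase field."""
    name: str
    coded_values: Optional[Dict[int, str]] = None      # coded value domain: code -> label
    value_range: Optional[Tuple[float, float]] = None  # range domain: (min, max)
    default: Optional[object] = None

    @property
    def has_domain(self) -> bool:
        return self.coded_values is not None or self.value_range is not None


def required_fields(fields: List[Field]) -> List[str]:
    """A field with a domain but no default value is treated as required."""
    return [f.name for f in fields if f.has_domain and f.default is None]


def is_valid(field: Field, value) -> bool:
    """Check a value against the field's domain, if it has one."""
    if field.coded_values is not None:
        return value in field.coded_values
    if field.value_range is not None:
        low, high = field.value_range
        return low <= value <= high  # assumption: both ends are inclusive
    return True  # no domain, so nothing to check in this sketch
```

Running a check like this against your schema before publishing can help you spot fields that editors will be forced to fill in because a domain was assigned without a default value.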
Feature classes that are z and/or m enabled can be edited, but z- and/or m-values will not be maintained. For newly collected features, z- and m-values will not be set.
ArcGIS Mobile will not allow you to create a new map layer or alter the schema of attribute fields on a mobile device. The map schema must already exist within your geodatabase, and this implies that you must create an appropriate database design to accommodate any planned data capture or editing. If you anticipate the need to capture features that do not exist inside your current data model, or if you need to capture unstructured information (field notes, for example), it is recommended that you create an additional feature class inside your geodatabase schema to handle the storage of this information.
It is extremely important that you do not alter the data or map schema after you create a mobile cache. Every interaction between the mobile cache and the map or geodatabase requires that the schema be checked for consistency. This consistency check is called a checksum. If the map or geodatabase is altered after the cache has been created, you will receive a checksum error when attempting to synchronize changes.
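The idea behind a schema checksum can be sketched in a few lines of Python. This is only an illustration of the concept. It does not reproduce the algorithm that ArcGIS Mobile actually uses, and the `schema_checksum` and `can_synchronize` functions, as well as the dictionary layout of the schema, are assumptions made for the example.

```python
import hashlib
import json


def schema_checksum(schema: dict) -> str:
    """Hash a schema description so that any change produces a different value."""
    # sort_keys makes the result independent of dictionary ordering
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def can_synchronize(cache_checksum: str, server_schema: dict) -> bool:
    """Allow synchronization only if the schema still matches the cached checksum."""
    return cache_checksum == schema_checksum(server_schema)
```

In this sketch, the checksum is computed once when the cache is created and compared each time the cache contacts the server. Adding, removing, or retyping a field changes the hash, so the comparison fails.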
Geodatabase design considerations
An important step in building your field solution is to design the geodatabase information model that will best support your field workflows. The ArcGIS Mobile field applications and SDK take full advantage of both the intelligence you build into your data model and how you display that content using feature layers in ArcMap.
If you are planning to repurpose an existing data model for field use, you can do one of the following:
- Leverage your existing information model in the field using geodatabase replication.
- Transform your existing information model using an extract, transform, and load (ETL) process.
The best way to discover the approach you need to take is to consider how field-workers describe the information that they need to capture/update in the field. The way that you describe information that is collected in the field can be significantly different than the way that you model it in your enterprise geodatabase. Quite often, spatial information that is collected in the field is captured in pieces. For example, when collecting information contained within a city park, field-workers will start at one corner and work their way across the park. They may collect part of a lake shoreline, then stop and collect a park bench, then resume collecting the lake shoreline. In the enterprise data model, lakes are represented as polygon features; however, they are captured as shoreline line features in the field. It may therefore be more beneficial to model lakes as shorelines in a separate geodatabase designed for capture of data in the field.
You can define processes that transform your enterprise data model into a field data model. These processes are called ETL. For this example, you could define a process that would transform and stitch together shoreline features into lake polygons when synchronizing changes between your mobile geodatabase and the enterprise geodatabase.
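The stitching step in the lake example can be sketched as follows. This is a minimal illustration in plain Python rather than a geoprocessing model. The `stitch_ring` function is a hypothetical name, and the sketch assumes that the segments belong to a single lake and that their endpoints match exactly (no snapping tolerance is applied).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def stitch_ring(segments: List[List[Point]]) -> List[Point]:
    """Chain shoreline segments end to end into one closed ring.

    Segments may arrive in any order and in either direction.
    """
    if not segments:
        raise ValueError("no segments to stitch")
    remaining = [list(s) for s in segments]
    ring = remaining.pop(0)
    while remaining:
        tail = ring[-1]
        for i, seg in enumerate(remaining):
            if seg[0] == tail:            # segment continues from the tail
                ring.extend(seg[1:])
                break
            if seg[-1] == tail:           # segment was captured in reverse
                ring.extend(reversed(seg[:-1]))
                break
        else:
            raise ValueError(f"gap in shoreline after {tail}")
        remaining.pop(i)
    if ring[0] != ring[-1]:
        raise ValueError("shoreline does not close into a polygon")
    return ring
```

The returned list of coordinates starts and ends at the same point, so it can be used as the outer ring of a lake polygon. A production process would also need to handle snapping tolerances, multiple lakes, and islands.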
Some additional geodatabase design questions you need to consider might include these:
- How much of my existing data model is needed in the field (datasets, fields, extents, and so forth)?
- Are there remote offices that need to manage field edits?
- Are new datasets/layers required to support new information that is captured in the field?
- At what frequency do updates between the field and the enterprise need to be made available?
- Do I need to transform datasets and field definitions before I take them to the field?
The following sections describe geodatabase replication and ETL as two separate models for managing field editing applications using multiuser geodatabases.
Geodatabase replication model
You can simply publish the contents of your production geodatabase for use in the field and use versioning to isolate the edits that come in from the field from other edits that are made in the office. However, there are a number of challenges with this approach. For example, if you need to be able to synchronize updates from the field, your production geodatabase will need to be accessible outside your company firewall. For most organizations, this is not possible. A better approach would be to use geodatabase replication and isolate the information that is captured in the field from the information that is continually updated in the office.
Using geodatabase replication, you can create a separate enterprise geodatabase instance that stores edits made in the field and periodically synchronizes updates with the parent geodatabase from which you created the replica. With this approach, you only need to replicate the information that you take to the field (not the entire information model), and you can isolate mobile Web services from your master production geodatabase. Examples where replication can be very useful include a distributed system in which remote offices have field-workers but no connectivity to the main office, and a vehicle-mounted laptop that contains a replicated geodatabase, with field edits synchronized when the device is docked in the vehicle. In either example, you might also want to store metadata for each field editing task that does not belong in the enterprise geodatabase (for example, GPS measurement data required for differential correction of collected data).
ETL model
Quite often, the way that you represent spatial information in an enterprise database is different than how you create and update it in the field. Modeling lake polygons as shoreline line features so that they can be captured in pieces is one example. Another is to join normalized data tables or attribute fields stored in the enterprise geodatabase into one table or attribute in the field geodatabase. Another example is the way that street attribute information is stored. Often the proper street name is stored in multiple attributes (number, prefix, name, suffix, and so on). In ArcMap, you would use an expression to label the street name. If you want to display the street name on a mobile device, you need to join those attributes together into one field that you can label against.
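The street name transformation can be sketched as a small function that an ETL step might apply to each row. This is plain Python for illustration only. The field names in `NAME_FIELDS` and the `street_label` function are hypothetical, so substitute the names used in your own schema.

```python
from typing import Mapping, Sequence

# Hypothetical enterprise field names, listed in label order.
NAME_FIELDS = ("NUMBER", "PREFIX", "NAME", "SUFFIX")


def street_label(row: Mapping[str, object],
                 fields: Sequence[str] = NAME_FIELDS) -> str:
    """Join the non-empty name parts into one label string."""
    parts = []
    for field in fields:
        value = row.get(field)
        text = "" if value is None else str(value).strip()
        if text:  # skip missing, null, and blank parts
            parts.append(text)
    return " ".join(parts)
```

The result would be written to a single text field in the field geodatabase, which the mobile map can then label against directly.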
You can use geoprocessing models to manage the ETL processes between the mobile and enterprise geodatabases. You can also use the ArcGIS Data Interoperability extension to visually design these transformation processes. It is important to note that the processes you define might not be bidirectional. You will need to define one set of geoprocessing models or custom spatial ETL tools for transforming your data model to a mobile device and a second set of models to reassemble field data to fit the enterprise schema.
Mobile transaction model
Once you have defined the geodatabase data model that best suits your field editing tasks, you then need to define an appropriate transaction model to manage updates coming in from the field. In part, this is defined by the data model decisions you have made, but it is also defined by the number of field editors you need to support and how isolated you want their edits to be.
Some of the transaction model considerations that you need to make include the following:
- Do I want field edits to update the default version? If not, do I want one version for all field edits, or do I want a version for each field editor?
- Can I update the geodatabase without using versioning at all?
- Do I need to archive the edits made in the field?
- How will I manage conflicts between field updates?
Before building the transactional geodatabase or publishing mobile map services, you also need to understand how a specific transaction model works from the geodatabase perspective and how the mobile editing application synchronizes changes.
The following discussion outlines some of the key functionality you should consider when designing your geodatabase for field use.
Editing a nonversioned geodatabase
If you have only a few field editors who perform simple editing tasks (updating attribute fields, for example), and there is little to no chance that they will update the same feature in the field, a nonversioned transaction model may best fit your needs. One potential drawback of this approach is that direct update is the only option available to the field editor. If updates need to be synchronized before they are complete, synchronization becomes problematic. With versioning, you have more flexibility in how your field-workers can synchronize updates.
Editing a versioned geodatabase
Using a versioned transaction model, you can isolate field edits and add postprocessing and quality assurance checking before reconciling updates. Depending on how you want to isolate field edits, you could create a single version to store all field edits or you could create a version for each field editor. If you create a version for each editor, you will need to publish a mobile map service for each editor. Once the editors have completed their data capture or maintenance and synchronized their updates with the geodatabase, you need ArcGIS Desktop in the office to reconcile the version edits with the parent version.
ArcGIS Mobile client editing framework
ArcGIS Mobile applications do not have a concept of start, stop, and save edits as in other ArcGIS applications. Each edit that is made is stored within the mobile service cache on the device until the time that you decide to synchronize updates with the server. You can choose to cancel an edit that you have made within the application, which will roll back all updates made in the field and restore the original state of the feature prior to editing, but you cannot undo one edit to a feature at a time.
Updating the feature layer does not synchronize your changes with the geodatabase directly.
Posting changes
The updates that you perform in the field are stored locally in the mobile service cache on the mobile device. This is important because your field-workers may not be connected in the field at all or they may need to shut down and charge their devices, and it is important that updates are not lost. When connectivity with the server is established, you can then synchronize updates stored in the cache with the server.
When posting changes from the mobile device, only deltas are sent to the server. For example, if you change an attribute on a feature, only the change to that specific field is recorded rather than marking the entire row as edited. This is done so that when synchronizing changes, only the information that has actually changed is sent back to the server. When synchronizing updates in the field, bandwidth and storage must be conserved whenever possible.
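The delta approach can be illustrated with a short sketch. This is plain Python and does not reflect the actual format that ArcGIS Mobile sends to the server. The `field_deltas` and `build_update` functions and the layout of the update record are hypothetical, and keying each record by GlobalID is an assumption made for the example.

```python
from typing import Any, Dict, Optional


def field_deltas(original: Dict[str, Any], edited: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the attributes whose values differ from the original."""
    return {
        name: value
        for name, value in edited.items()
        if name not in original or original[name] != value
    }


def build_update(global_id: str,
                 original: Dict[str, Any],
                 edited: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Build a delta record for one feature, or None when nothing changed."""
    changes = field_deltas(original, edited)
    if not changes:
        return None  # nothing to post for this feature
    # Assumption: the feature is identified by its GlobalID value.
    return {"GlobalID": global_id, "changes": changes}
```

For a feature with many attributes where only one value was edited, the record contains that single field rather than the whole row, which keeps the amount of data posted over a slow connection small.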
Depending on the number of edits you expect and the type of server connection you have (GPRS, for example), you might want to enable posting features only when the application and device are back in the office to ensure a stable high-speed connection.