Migrating to the file geodatabase
Of the various types of geodatabases, file geodatabases are most similar to personal geodatabases in that they are designed to be edited by a single user and do not support geodatabase versioning. You also work with them the same way, whether displaying, querying, editing, or processing data or developing applications. There are a few important differences, though. For example, personal geodatabases have a 2 GB storage limit, while file geodatabases have no limits, and Structured Query Language (SQL) syntax differs slightly between the two. If you are used to working with personal geodatabases and would like to migrate to a file geodatabase, this topic points out these differences and shows you how to get started.
Creating a new file geodatabase
To create a new, empty file geodatabase, right-click a file system folder in the Catalog tree, point to New, then click File geodatabase. See Creating a file geodatabase for more information.
Migrating existing data
- The easiest way to copy data from a personal geodatabase into a file geodatabase is to use the Catalog tree Copy and Paste commands. Copy/Paste is flexible because you can choose exactly what you want to copy. You can select everything in the personal geodatabase or just particular items, such as a set of feature datasets, that you want to migrate. For example, to copy a feature dataset from a personal geodatabase into a file geodatabase, create a new empty file geodatabase in the Catalog tree. Select the items in the personal geodatabase you want, right-click the selection, click Copy, then right-click the file geodatabase and click Paste. For more information, see Copying feature datasets, classes, and tables to another geodatabase.
Copy/Paste can migrate any type of data in the geodatabase except attribute domains that are not referenced by any feature class or table. If you have such domains and want to migrate them, use the Export to XML Workspace Document method discussed next.
- To copy an entire geodatabase, use the Export > XML Workspace Document command to export the entire database to an XML file. You can then create a new, empty file geodatabase and use Import > XML Workspace Document to import the data from the XML file into the file geodatabase. This method is also flexible in that you can choose which datasets to export in the Export wizard. For more information, see the following topics: Exporting feature datasets, classes, and tables to an export file and Importing feature datasets, classes, and tables from an XML workspace document.
- If you're migrating low-precision geodatabase data, the Copy/Paste and Export to XML Workspace Document methods automatically convert the data to high precision, setting the resolution to approximately 0.1 millimeters. This is a good default and works well in almost all cases. However, if you want the data to be stored at a different resolution, use the Upgrade Spatial Reference tool before migrating data with Copy/Paste or Export to XML Workspace Document. Upgrade Spatial Reference converts the data to high precision, allowing you to choose the resolution. Another way to exercise control over the resolution is to migrate with the Import/Export geoprocessing tools. For more information, see Migrating to high precision.
- To move shapefiles, coverages, or data in another format into a file geodatabase, use the same method that you would use to move the data into a personal geodatabase. Select the dataset in the Catalog tree, right-click, then choose the Export > To Geodatabase command; use the To Geodatabase (multiple) command to export multiple datasets at once. You can also find these tools in ArcToolbox under Conversion > To Geodatabase. For more information, see the following topics: Importing features, Importing tables, and Importing raster datasets.
File geodatabases have configuration keywords that customize the storage of an individual dataset. You can specify a keyword when you copy and paste or import data, although the default is usually adequate. For more information, see Configuration keywords for file geodatabases.
Creating new datasets
You create empty feature datasets, feature classes, raster catalogs, raster datasets, and tables in a file geodatabase the same way you create them in a personal geodatabase: right-click the geodatabase or feature dataset, point to New, then click the item you want to create. For more information, see the help topic on creating each of these dataset types.
Once you've created an empty feature class or table, you load data into it from the Catalog tree. For more information, see About loading data into existing feature classes and tables and Importing raster datasets.
Unlike personal geodatabases, whenever you create a new file geodatabase feature class, raster catalog, raster dataset, or table, either through the Catalog tree or a geoprocessing tool, you can optionally specify a configuration keyword. The configuration keyword customizes how the data is stored and accessed. For a description of the keywords available, see Configuration keywords for file geodatabases.
Editing, displaying, and querying data
Once in a file geodatabase, a dataset looks the same in ArcCatalog and ArcMap as in a personal geodatabase. Also, with the exception of spatial indexes and SQL queries, which are discussed next, you work with datasets the same way. All commands and tools that accept personal geodatabase datasets as input also accept file geodatabase datasets.
The spatial index of a personal geodatabase feature class uses a single grid size that cannot be modified. The spatial index of a file geodatabase feature class uses up to three grid sizes, which you can modify at any time. ArcGIS automatically rebuilds the spatial index at the end of certain update operations to ensure the index and its grid sizes are optimal. However, in some rare cases, you may need to manually recalculate the index. For more information, see Setting spatial indexes.
The SQL WHERE clause syntax you use to query file geodatabases is the same syntax you use on coverages, shapefiles, and other file-based data sources, with some additional capabilities such as support for subqueries. As a result, the WHERE clause syntax differs from the syntax used for personal geodatabases. The dialog boxes for creating SQL expressions in ArcGIS help you use the correct WHERE clause syntax for the data you're querying, as they list the field names and values with the appropriate delimiters. They also select the relevant keywords and operators for you. However, if you have a WHERE clause defined for a layer in a personal geodatabase, it may not work on the same layer once you've moved its source data into a file geodatabase. Likely reasons for a WHERE clause not working are the following:
- For personal geodatabases, field names are enclosed in square brackets, whereas for file geodatabases, they are enclosed in double quotes.
- The wildcards you use on personal geodatabases are * for any number of characters and ? for one character. File geodatabases use % and _, respectively.
- String searches in personal geodatabases are case insensitive, whereas in file geodatabases they are case sensitive.
- Personal geodatabases use UCASE and LCASE to convert string case, whereas file geodatabases use UPPER and LOWER.
- Dates and times in personal geodatabases are delimited using #, whereas in file geodatabases they are preceded by the word date.
| WHERE clause syntax for a personal geodatabase | Equivalent syntax for a file geodatabase |
| --- | --- |
| [STATE_NAME] = 'California' | "STATE_NAME" = 'California' |
| [OWNER_NAME] LIKE '?atherine smith' | "OWNER_NAME" LIKE '_atherine smith' |
| [STATE_NAME] = 'california' (when a case-insensitive search is required) | LOWER("STATE_NAME") = 'california' |
| UCASE([LAST_NAME]) = 'JONES' | UPPER("LAST_NAME") = 'JONES' |
| [DATE_OF_BIRTH] = #06-13-2001 19:30:00# | "DATE_OF_BIRTH" = date '2001-06-13 19:30:00' |
Another reason a WHERE clause may not work is that file geodatabases support fewer operators and functions than are supported by personal geodatabases, and file geodatabases provide limited support for subqueries. However, this is unlikely to be a reason the WHERE clause does not work. File geodatabases support the majority of WHERE clause capabilities you would likely need.
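For simple clauses, the delimiter, wildcard, function, and date differences listed above can be applied mechanically. The following Python sketch shows one way to do so. The function name personal_to_file_where is hypothetical and is not part of ArcGIS; the sketch assumes dates are written as #MM-DD-YYYY HH:MM:SS#, and it does not protect square brackets that appear inside string literals.

```python
import re
from datetime import datetime


def personal_to_file_where(clause):
    """Rewrite common personal geodatabase WHERE syntax for a file geodatabase.

    Illustrative helper only (hypothetical name, not an ArcGIS function).
    """
    # Dates: #06-13-2001 19:30:00# -> date '2001-06-13 19:30:00'
    def convert_date(match):
        parsed = datetime.strptime(match.group(1), "%m-%d-%Y %H:%M:%S")
        return "date '" + parsed.strftime("%Y-%m-%d %H:%M:%S") + "'"

    clause = re.sub(r"#(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})#", convert_date, clause)

    # Field delimiters: [FIELD] -> "FIELD"
    clause = re.sub(r"\[(\w+)\]", r'"\1"', clause)

    # Case conversion functions: UCASE/LCASE -> UPPER/LOWER
    clause = re.sub(r"\bUCASE\s*\(", "UPPER(", clause, flags=re.IGNORECASE)
    clause = re.sub(r"\bLCASE\s*\(", "LOWER(", clause, flags=re.IGNORECASE)

    # Wildcards: only inside the string literal that follows LIKE
    def convert_like(match):
        pattern = match.group(2).replace("*", "%").replace("?", "_")
        return match.group(1) + "'" + pattern + "'"

    clause = re.sub(r"(\bLIKE\s+)'([^']*)'", convert_like, clause, flags=re.IGNORECASE)
    return clause


if __name__ == "__main__":
    print(personal_to_file_where("[OWNER_NAME] LIKE '?atherine smith'"))
```

The sketch does not address case sensitivity. A clause that relied on case-insensitive matching in a personal geodatabase still needs to be wrapped in LOWER or UPPER by hand, because only the author of the query knows whether case matters.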
There is little that differs in how you use geoprocessing tools on file geodatabases compared to personal geodatabases:
- File geodatabases are supported in all geoprocessing tools.
- To create a file geodatabase, use the Create File GDB tool.
- Whenever you use a tool that creates a new feature class, raster catalog, raster dataset, or table in a file geodatabase, you can optionally specify a configuration keyword to customize how the data is stored. The option is available on some tools. If the option is not available, you can specify it in the Environments dialog box.
Compressing vector data
Unlike personal and ArcSDE geodatabases, file geodatabases allow you to optionally store vector data in a compressed, read-only format to reduce storage requirements. The compressed data is a direct access format. You do not have to decompress data when you access it: ArcGIS and ArcReader read it directly. The data looks the same as uncompressed data, and you perform all read-only operations the same way, whether through a command in the Catalog tree, geoprocessing, or ArcObjects. For more information, see About compressing file geodatabase data.
If you have an application written in ArcObjects and want to switch the data it accesses from a personal to a file geodatabase, consider the following:
- Update the workspace factory to get the application working on the new data source. Change the workspace factory from AccessWorkspaceFactory to FileGDBWorkspaceFactory, and change the geodatabase extension from .mdb to .gdb.
- If your application uses SQL, the syntax will likely need updating so it works on a file geodatabase:
  - As discussed previously in this topic, SQL WHERE clause syntax differs between file and personal geodatabases. If your application uses QueryFilter or QueryDef, refer to the WHERE clause discussion above for the likely changes you'll need to make.
  - File geodatabases do not support all the features and functions available for personal geodatabases. At ArcGIS 9.2, the most commonly used features not supported by file geodatabases are DISTINCT, GROUP BY, and ORDER BY. In addition, the set functions AVG, COUNT, MIN, MAX, and SUM are not supported outside subqueries. Support for some of these is likely to be added in future releases.
  - QueryDef join support for file geodatabases is limited in that the subfields can contain simple column names only; aliases, expressions, and functions are not supported. The FROM clause can contain simple table names only.
  - ExecuteSQL support for INSERT and UPDATE in file geodatabases is limited to simple statements containing literal values only. Compound expressions such as SET RENTAL_PRICE = (RENTAL_PRICE - 1.00) are not supported.
  - If your application contains subqueries, they may not work with file geodatabases since support for them is limited. For more information, see SQL reference.
- To maximize data transfer performance, consider using load-only mode whenever a large number of records are being loaded. For more information, see "Performance tips" below.
Apart from these differences, ArcObjects works the same on file geodatabases as on personal geodatabases.
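The workspace factory and extension change described above amounts to a small lookup. The following Python sketch illustrates that mapping only. The helper names workspace_factory_for and migrated_path are hypothetical, the factory class names are the ones given in this topic, and the code does not call ArcObjects.

```python
import os

# Factory class names as given in the text, keyed by geodatabase extension.
FACTORY_BY_EXTENSION = {
    ".mdb": "AccessWorkspaceFactory",   # personal geodatabase
    ".gdb": "FileGDBWorkspaceFactory",  # file geodatabase
}


def workspace_factory_for(path):
    """Return the workspace factory class name that matches a geodatabase path."""
    # A file geodatabase is a folder, so tolerate a trailing separator.
    extension = os.path.splitext(path.rstrip("/\\"))[1].lower()
    if extension not in FACTORY_BY_EXTENSION:
        raise ValueError("Not a personal or file geodatabase path: " + path)
    return FACTORY_BY_EXTENSION[extension]


def migrated_path(personal_path):
    """Return the path of the file geodatabase that replaces a personal one."""
    root, extension = os.path.splitext(personal_path)
    if extension.lower() != ".mdb":
        raise ValueError("Expected a .mdb path: " + personal_path)
    return root + ".gdb"
```

Keeping the choice of factory in one place like this means the rest of an application does not need to know which geodatabase type it is opening.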
Performance tips
The following can help improve and maintain file geodatabase performance:
- The default resolution when you create a new feature class is 0.1 millimeters or its equivalent in the units of the dataset's coordinate system. The default value for x,y resolution works in almost all cases. If your data is not this accurate, you can optionally set a larger x,y resolution when you create the feature class. Storing coordinates with a larger x,y resolution decreases storage requirements and improves performance. This is not unique to file geodatabases; it also applies to ArcSDE geodatabases. For more information on x,y resolution, see Feature class basics.
- As with any other data source, create only those attribute indexes you really need, since each index you add slightly slows edits to the feature class or table. Each time you edit an indexed attribute, ArcGIS must also update the index. If you need to frequently edit a field, avoid creating an index for it if you can.
- If you plan to add, edit, or delete many features or records in a large dataset, whether through an edit session, a geoprocessing tool, or the Catalog tree, you might be able to save time by deleting the spatial index and any affected attribute indexes before you start, then re-add the indexes after you have completed the changes.
Whenever you add, edit, or delete a feature or record, ArcGIS needs to update the indexes. If you are making the changes on a small dataset or making the changes to just a few records, such as 10 records out of 1 million, the time it takes for ArcGIS to update the indexes after each incremental change will not be an issue. However, if you are making the changes to a large number of records, such as 300,000 out of 1 million, updating the indexes for the many incremental changes may take much longer than if you delete the indexes before you start, then add the indexes after you have completed the changes. Deciding whether to drop indexes in other cases involves trade-offs and may not be obvious.
Similarly, developers writing loaders or converters with ArcObjects should consider using load-only mode whenever a large number of records are being loaded. Load-only mode suspends the updating of all attribute and spatial indexes until features have been imported. Once all features have been imported, the indexes for all records, both existing and new, update automatically. You set this mode through the IFeatureClassLoad interface.
- As with personal geodatabases, if you frequently add and delete data, you should compact your file geodatabase on a regular basis. You should also compact a file geodatabase after any large-scale change. Compacting improves performance by rearranging how the data is stored on disk. For more information, see Compacting file and personal geodatabases.
- As when working with any other type of file stored in the file system, keep your computer in a well-maintained and tuned state. If you are running Windows, run the disk defragmenter on an occasional basis to maintain overall file system performance. The disk defragmenter is a tool that comes with the Windows operating system; for more information, see your operating system's online help.
- Spatial index grid sizes that are too small or too large can result in increased load times and poor spatial search performance. ArcGIS automatically rebuilds the spatial index at the end of most operations, ensuring the grid sizes are optimal. You rarely need to manually recalculate the index. However, there are situations when you're adding many features in an edit session that may require a manual recalculation. For more information, see Setting spatial indexes.
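Two of the tips above, dropping indexes before a bulk change and using load-only mode, rest on the same idea: rebuilding an index once can be cheaper than updating it after every record. The following Python sketch is a toy analogy under stated assumptions, not a description of how ArcGIS implements its indexes. A sorted list stands in for an attribute index, and both function names are hypothetical.

```python
import bisect


def load_with_incremental_index(existing_keys, new_keys):
    """Keep the stand-in index (a sorted list) up to date after every record."""
    index = sorted(existing_keys)
    for key in new_keys:
        bisect.insort(index, key)  # one index update per loaded record
    return index


def load_with_deferred_index(existing_keys, new_keys):
    """Load every record first, then rebuild the stand-in index once."""
    records = list(existing_keys)
    records.extend(new_keys)  # no index maintenance during the load
    return sorted(records)    # single rebuild over existing and new records


if __name__ == "__main__":
    existing = [5, 1, 9]
    incoming = [7, 3, 8, 2]
    # Both strategies end with the same index; only the amount of work differs.
    print(load_with_incremental_index(existing, incoming))
    print(load_with_deferred_index(existing, incoming))
```

For a handful of records the two approaches are indistinguishable, which matches the guidance above that deferring index updates pays off only when a large share of the dataset changes.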