How raster data is stored in a geodatabase
Store raster data in the geodatabase when you want to manage rasters, add behavior, and control the schema; when you want to manage a well-defined set of raster datasets as part of your DBMS; and when you require a single architecture for managing all your content. There are three main types of geodatabases: ArcSDE, personal, and file.
The functional behavior of each geodatabase is basically the same; however, there are some exceptions for specific tools or procedures. For information about the differences in behavior for a tool or procedure, refer to the specific tool or procedure within this help system.
One of the few differences that exist with file geodatabases is in regard to SQL queries on raster catalogs. For more information about this, see Migrating to the file geodatabase.
Storing raster data in a file geodatabase
The storage model of the file geodatabase is a hybrid of the ArcSDE geodatabase and the personal geodatabase: managed raster data follows the storage model of the ArcSDE geodatabase, and unmanaged raster data follows the storage model of the personal geodatabase. File geodatabases are also similar to personal geodatabases in that they are designed to be edited by a single user and do not support versioning. They reside in your file system directory, so they do not require a password for access. File geodatabases and ArcSDE geodatabases share the same basic storage schema.
A file geodatabase has several advantages over a personal geodatabase. Like ArcSDE, the file geodatabase stores data in blocks. This provides more efficient access to data, especially during the mosaic operation. When mosaicking data in a file geodatabase, only overlapping blocks are updated. If an overlapping block does not exist, a new block is inserted. Partial blocks are padded with NoData pixels. In addition, the file geodatabase (and ArcSDE) storage model employs partial pyramid updates, which saves time. Also, because the data structures of the file geodatabase and ArcSDE are the same, fast copy technology is used to copy and paste data between the file geodatabase and an ArcSDE geodatabase.
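The block-level mosaic behavior described above can be sketched in a few lines of Python. This is only an illustration, not the geodatabase's actual implementation: the function names and the set-of-indices representation are hypothetical, and the 128-pixel block size is the default mentioned elsewhere in this topic. NoData padding of partial blocks is not modeled.

```python
BLOCK = 128  # default block dimension stated in this topic

def blocks_touched(col_off, row_off, width, height, block=BLOCK):
    """Return the (block_row, block_col) indices overlapped by a pixel window
    whose upper-left pixel is at (col_off, row_off)."""
    first_col = col_off // block
    first_row = row_off // block
    last_col = (col_off + width - 1) // block
    last_row = (row_off + height - 1) // block
    return {(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)}

def plan_mosaic(existing, col_off, row_off, width, height, block=BLOCK):
    """Split the touched blocks into (updated in place, newly inserted)."""
    touched = blocks_touched(col_off, row_off, width, height, block)
    return touched & existing, touched - existing

# A 256 x 256 raster already occupies a 2 x 2 grid of blocks.
existing = {(0, 0), (0, 1), (1, 0), (1, 1)}
# Mosaic a 200 x 100 pixel window starting at column 200, row 50.
updated, inserted = plan_mosaic(existing, 200, 50, 200, 100)
```

In this example, two existing blocks are updated and four new blocks are inserted; the other two existing blocks are left alone.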
The file geodatabase also accepts configuration keywords, but unlike ArcSDE, the configuration keywords have a standard predefined value. For more information about configuration keywords, see Configuration keywords for file geodatabases.
By default, the file geodatabase has a 1 terabyte (TB) limit per dataset. Each dataset in the file geodatabase can be up to 1 TB in size, but you can have many 1 TB datasets inside any given file geodatabase. You can increase this limit for each dataset by using the MAX_FILE_SIZE_256TB configuration keyword.
The basic file geodatabase raster schema has five constituent tables arranged in a hierarchy: the business table is at the top of the hierarchy, while four other subordinate tables store the raster metadata and pixel data. The business table also contains a feature column, which maintains the envelope of the raster. The feature column joins to a feature table that actually stores the feature envelopes. The raster blocks table, the largest, stores the actual pixel information and pyramids. All these tables are stored in a native file format that is hidden and therefore are not directly accessible. The raster blocks table stores the pixel data as a BLOB column, one row per block by band and pyramid level.
The bands are tiled into blocks of pixels according to a user-defined dimension (the default is 128 by 128 pixels). Tiling the raster band data enables efficient storage and retrieval of the raster data. The pyramid information is stored according to a declining resolution. The height of the pyramid is defined by the number of levels, which is either specified by the application or determined automatically by the system.
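As a rough illustration of tiling and declining pyramid resolution, the sketch below computes the pixel dimensions and block grid for each level. It assumes that each pyramid level halves the resolution of the level beneath it, which this topic does not state, and the function names are hypothetical.

```python
def pyramid_shapes(width, height, levels):
    """Return [(width, height), ...] for the base raster followed by each
    pyramid level, assuming every level halves the resolution (rounding up)."""
    shapes = [(width, height)]
    for _ in range(levels):
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        shapes.append((width, height))
    return shapes

def block_grid(width, height, block=128):
    """Number of block columns and rows needed to cover width x height pixels."""
    return (-(-width // block), -(-height // block))  # ceiling division

shapes = pyramid_shapes(1000, 600, 3)
grids = [block_grid(w, h) for w, h in shapes]
```

For a 1000 by 600 pixel band with three pyramid levels, the block grids shrink from 8 by 5 at full resolution to a single block at the top of the pyramid.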
A raster catalog is stored as multiple rows in the business table, whereas a raster dataset is maintained as a single row. The table schema of the raster dataset and the raster catalog are the same. Each row of a raster catalog actually stores a raster dataset. The extent of each raster dataset within a raster catalog is maintained in the feature column of the raster catalog's business table.
A mosaic dataset is stored as a collection of up to nine tables. The catalog, boundary, log, and raster type tables are created when a mosaic dataset is created. The levels, overviews, color correction, seamline, and stereo tables are created on demand. For instance, the levels table is created when the mosaic dataset's cell sizes are calculated.
A raster field added to a table or feature class is called a raster attribute. Raster attributes and raster catalogs have the same schema. Each record in the table or feature class that has a raster field will have an attribute value for the column of type RASTER that joins it to its associated raster schema tables.
Mosaic datasets, unmanaged raster catalogs, and unmanaged raster attributes do not store the raster data in the constituent raster tables. Instead, each value of the business table raster column references the images stored on disk. Deleting a row from an unmanaged raster catalog or a mosaic dataset results in the removal of the reference to the image file; the image file itself remains intact.
Storing raster data in an ArcSDE geodatabase
When raster data is stored within the ArcSDE geodatabase, it offers an enterprise level of functionality, such as security, multiuser access, and data sharing. The following are three main reasons to store your raster data in ArcSDE:
- It will not be updated very regularly (such as every two or three years or longer).
- It will be accessed in read-only use cases (such as using it as basemap data under vector data).
- There are hundreds of users (or more) that will access it as a basemap.
Because of its storage structure, the raster data is said to be managed, or fully controlled, by the geodatabase. ArcSDE geodatabases always store all the raster information (pixels, spatial reference, any associated table, and other metadata) for raster datasets, raster catalogs, and raster attributes within the associated relational database (for example, Oracle, SQL Server, DB2, or Informix). This means that all input raster information is loaded into the database and can be thought of as a format conversion.
When storing a raster dataset in ArcSDE, there can be as many as seven constituent tables. The main table is the business table, which will have, at a minimum, a raster column and a rowid column. In the case of a raster attribute, that raster column could be the only other column besides the required rowid column in a business table.
For all other raster models, including the mosaic dataset, raster dataset, and raster catalog, the business table includes a geometry column that holds the raster footprint. A geometry column will also be present in the business table if a raster attribute is added to a feature class. The geometry column can have two associated tables. One is the feature table that stores the actual geometry data. The feature table, also referred to as the F table, is present if the geometry storage type is ESRI binary. The feature table is absent if an object relational storage type such as an ESRI or IBM ST_GEOMETRY type or an Oracle SDO_GEOMETRY type is employed. The geometry column will also have a spatial index table, also referred to as an S table, associated with it unless an RTREE index is employed. Informix, PostgreSQL, and Oracle Spatial all employ an RTREE index.
Another table that is always present and is associated with the raster column is the raster blocks table. It stores the raster data for all raster models except the mosaic dataset. In the case of the mosaic dataset, this table remains empty, since the raster data for a mosaic dataset is not stored in the DBMS; instead, the raster data is referenced from an image file.
If the raster blocks table stores raster data, it will be the largest of all tables in an ArcSDE geodatabase and, depending on the size of the raster, could require special storage handling such as a dedicated DBTUNE configuration. The Oracle SDO_GEORASTER raster storage type has a raster blocks table, but it has no other raster tables associated with it.
The raster auxiliary table stores optional raster band metadata including raster statistics, coordinate transformation, and a color map. For mosaic datasets, the raster auxiliary table also holds the function raster.
The raster column also has an associated raster table and raster bands table if the default ESRI binary raster storage type is used. However, if the optional object relational ST_RASTER storage type is used, these two tables are absent.
The feature table holds the footprint for the raster dataset, which is the same as when you have a feature class—one table stores the geometry and the other stores the spatial index information. For each raster dataset, there is one row in the feature table that stores the envelope.
The raster storage tables include the following:
- Business table—Stores attributes as well as raster and geometry columns
- Raster auxiliary table (AUX)—Stores optional metadata, such as raster statistics, a color map, or coordinate transformation information
- Raster blocks table (BLK for ESRI types and RDT for Oracle SDO_GEORASTER)—Stores the pixels of each block in a raster band. The blocks table is the largest table and the one that stores the actual pixel information and pyramids.
- Raster band table (BND)—Stores band information
- Raster table (RAS)—Stores a record for each raster dataset
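The rules above for which tables accompany a raster column can be summarized as a small lookup. The sketch below is one reading of this section, not an ArcSDE API: the function name, parameter names, and storage-type strings are hypothetical, and it assumes the auxiliary table is present for both ESRI storage types.

```python
def constituent_tables(geometry_storage="esri_binary",
                       raster_storage="esri_binary",
                       rtree_index=False,
                       has_geometry_column=True):
    """List the tables expected for a raster column under the given storage types."""
    tables = ["business"]
    if has_geometry_column:
        # The F table exists only for the ESRI binary geometry storage type.
        if geometry_storage == "esri_binary":
            tables.append("feature (F)")
        # The S table exists unless an RTREE index is employed.
        if not rtree_index:
            tables.append("spatial index (S)")
    if raster_storage == "sdo_georaster":
        # SDO_GEORASTER has a blocks table but no other raster tables.
        tables.append("raster blocks (RDT)")
        return tables
    tables.append("raster blocks (BLK)")
    tables.append("raster auxiliary (AUX)")
    # RAS and BND are present only for the ESRI binary raster storage type.
    if raster_storage == "esri_binary":
        tables += ["raster (RAS)", "raster bands (BND)"]
    return tables

default_tables = constituent_tables()
object_relational = constituent_tables("st_geometry", "st_raster")
```

With the default ESRI binary types, the list reaches the maximum of seven tables mentioned earlier in this section.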
ArcSDE evenly tiles the bands into blocks of pixels according to a user-defined dimension (the default is 128 by 128). Tiling the raster band data enables efficient storage and retrieval of the raster data. The pyramid information is stored according to a declining resolution. The height of the pyramid is determined by the number of levels specified by the application or user.
The raster blocks table stores one row per block (tile) per band in a raster dataset and per pyramid level. For example, a three-band raster divided into 12 blocks with no pyramids built will have 36 rows in the BLK table—12 separate blocks for each of the bands. The column containing the pixel data for the block is a binary large object (BLOB).
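The row count follows directly from bands multiplied by blocks, summed over the base level and any pyramid levels. The sketch below reproduces the 36-row example; the function name is hypothetical, and the pyramid block counts in the second call are made-up values for illustration only.

```python
def blk_row_count(bands, blocks_per_level):
    """Rows in the blocks table: one per block, per band, per level.
    blocks_per_level lists the block count of the base level first,
    followed by the block count of each pyramid level."""
    return bands * sum(blocks_per_level)

no_pyramids = blk_row_count(3, [12])          # the example from the text
with_pyramids = blk_row_count(3, [12, 4, 1])  # hypothetical pyramid levels
```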
The mosaic dataset and raster catalog are stored in ArcSDE as multiple rows in the business table, whereas a raster dataset is only a single row in the business table. The table schema is the same as that of a raster dataset. The only difference is that the feature table will have many rows, where each row represents the extent of a raster dataset in the raster catalog. Additionally, a mosaic dataset can contain a pointer to a raster dataset stored outside the ArcSDE geodatabase.
When storing a raster dataset as an attribute, the storage architecture for the raster dataset is the same as for raster catalogs. Each record in the business table will have an attribute value for the column of type RASTER. This attribute will be used to relate the business table to the supporting raster tables. When you load the image, it will be converted to an ArcSDE raster format, and the pixels will be stored in the raster blocks table.
Learn about storing raster datasets and raster catalogs in a geodatabase in DB2
Learn about storing raster datasets and raster catalogs in a geodatabase in Informix
Learn about storing raster datasets and raster catalogs in a geodatabase in Oracle
Learn about storing raster datasets and raster catalogs in a geodatabase in SQL Server
Storing raster data in a personal geodatabase
In a personal geodatabase, the raster dataset is converted to an Imagine (.img) file and stored inside an image database (IDB) folder. The IDB folder is located in the directory next to the personal geodatabase. When you delete a raster dataset, the raster in the IDB folder is permanently deleted.
When storing a mosaic dataset or raster catalog in a personal geodatabase, the mosaic dataset or raster catalog is a table that points to the stored raster datasets it contains. In a mosaic dataset, the raster datasets are stored as unmanaged, whereas in a raster catalog, the raster datasets can be stored as either managed or unmanaged. If they are managed, the entry in the raster catalog table points to the location in the IDB folder where the raster dataset is stored. The IDB folder is organized so that each raster it contains can be matched to a row in a raster catalog. In the unmanaged case, the mosaic dataset or raster catalog contains the path to the location where the raster datasets are stored. Each row in the raster catalog business table points to the stored raster dataset. Operations on a mosaic dataset or unmanaged raster catalog do not affect the stored raster files; therefore, if you delete raster datasets from a mosaic dataset or unmanaged raster catalog, they are removed only from the mosaic dataset or raster catalog and not from the disk.
When storing a raster dataset as an attribute, the raster is either stored as an IMG file in a system-defined location (managed) or left where it is in the file system (unmanaged). The storage is similar to that of a raster catalog.
Comparing raster storage in file, personal, and ArcSDE geodatabases
Raster storage characteristic | File geodatabase | Personal geodatabase | ArcSDE geodatabase |
---|---|---|---|
Size limit | 1 TB for each raster dataset or raster catalog | 2 gigabytes (GB) per geodatabase (This is a table size limit, not a limit on the raster dataset size.) | Unlimited; limit dependent on DBMS limits |
Raster dataset file format | File geodatabase raster dataset | ERDAS IMAGINE, JPEG, or JPEG 2000 | ArcSDE raster dataset |
Storage | Stored in the file system | Stored in Microsoft Access | Stored in an RDBMS |
Compression | LZ77, JPEG, JPEG 2000, or None | LZ77, JPEG, JPEG 2000, or None | LZ77, JPEG, JPEG 2000, or None |
Pyramids | Supports partial pyramiding | Rebuilds entire pyramid | Supports partial pyramiding |
Mosaicking | Allows you to append to a raster dataset when mosaicking | Rewrites a new dataset every time you mosaic to a raster dataset | Allows you to append to a raster dataset when mosaicking |
Updating | Allows incremental updating | Allows incremental updating | |
Number of users | Single user and small workgroups; some readers and one writer | Single user and small workgroups; some readers and one writer | Multiuser; many users and many writers |
Managed versus unmanaged raster data
There are two ways of storing the raster data within a geodatabase, either managed or unmanaged. Geodatabases always store raster datasets as a managed source.
Raster catalogs and rasters as attributes can use the managed or unmanaged sources.
Mosaic datasets are always stored as unmanaged.
In an ArcSDE geodatabase, raster datasets and raster catalogs are always stored as managed.
Managed file geodatabase raster data is stored on disk in a proprietary tiled format that is directly compatible with how ArcSDE stores raster data (including ArcSDE raster compression types). This makes managed raster storage in a file geodatabase an excellent choice for transferring data between ArcSDE instances that do not have network connectivity between them. This functionality replaces previous data transfer workflows such as SDE export/import and database export/import (for example, transportable table spaces and detached databases).
Managed personal geodatabase raster data is stored on disk as one of three common file-based raster formats that ArcGIS can write: either ERDAS IMAGINE, JPEG, or JPEG 2000. The raster file format used to store the data is chosen internally by ArcGIS and is based on the type of compression utilized; therefore, if you choose a JPEG compression, the file is stored using the JPEG format, whereas if you choose no compression or LZ77, the raster is stored using the ERDAS IMAGINE format. The input raster datasets are converted from their original format and stored in a special folder (IDB), which resides next to the personal geodatabase .mdb file. The personal geodatabase manages these raster files based on the user's actions. (None of the pixel information is stored within the underlying Microsoft Access database.)
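The relationship between compression choice and on-disk format can be written as a simple lookup. This is only a sketch of the rule as described above: the function name is hypothetical, ArcGIS makes this choice internally, and the JPEG 2000 entry is an assumption by analogy with JPEG, since the text states the mapping explicitly only for JPEG, LZ77, and no compression.

```python
def managed_personal_gdb_format(compression):
    """Return the file format used for a managed raster in a personal
    geodatabase, given the compression setting."""
    key = compression.strip().upper().replace(" ", "")
    formats = {
        "NONE": "ERDAS IMAGINE",
        "LZ77": "ERDAS IMAGINE",
        "JPEG": "JPEG",
        "JPEG2000": "JPEG 2000",  # assumed by analogy; not stated explicitly
    }
    if key not in formats:
        raise ValueError(f"unsupported compression: {compression!r}")
    return formats[key]

lz77_format = managed_personal_gdb_format("LZ77")
jpeg_format = managed_personal_gdb_format("jpeg")
```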
Unmanaged file geodatabase and personal geodatabase raster implementations simply point to the existing raster files on disk that ArcGIS can read. In these scenarios, the geodatabase does not manage the raster files but, rather, only manages the tables that reference the raster files. Unmanaged file and personal geodatabase mosaic datasets, raster catalogs, and rasters as attributes are the quickest to build because no raster format conversion or copying of pixel data occurs.
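The unmanaged model amounts to a table of references to files. The toy class below is a hypothetical stand-in for such a table, not a geodatabase API, and it shows that deleting a row removes only the reference and leaves the file on disk.

```python
import os
import tempfile

class UnmanagedCatalog:
    """Hypothetical stand-in for an unmanaged raster catalog: rows hold paths only."""

    def __init__(self):
        self.rows = {}
        self._next_id = 1

    def add(self, path):
        # No pixel data is copied or converted; only the path is recorded.
        row_id = self._next_id
        self.rows[row_id] = path
        self._next_id += 1
        return row_id

    def delete(self, row_id):
        # Only the reference is dropped; the file on disk is not touched.
        del self.rows[row_id]

with tempfile.TemporaryDirectory() as folder:
    image = os.path.join(folder, "scene.img")
    with open(image, "wb") as f:
        f.write(b"\x00" * 16)  # placeholder bytes, not a real raster
    catalog = UnmanagedCatalog()
    row = catalog.add(image)
    catalog.delete(row)
    file_survives = os.path.exists(image)
    rows_left = len(catalog.rows)
```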
Compression, pyramids, and tile size
There are other storage structures to consider when storing data in a geodatabase, including compression, pyramids, and tile size.
All three types of geodatabases can store raster data using one of three compression techniques: LZ77 (lossless), JPEG (lossy), or JPEG 2000 (lossy). Lossless compression means the values of pixels in the raster dataset are not changed, whereas lossy compression results in altered pixel values. The amount of compression depends on the type of pixel data; the more homogeneous the image, the higher the compression ratio. You should store data that will be used for analysis, not just display, using a lossless compression. The primary benefit of compressing your data is that it requires less storage space; the amount of savings depends on the method of compression and the redundancy in the data. An added benefit is improved performance, because fewer packets of data are transferred. For example, when accessing raster data over a network with low bandwidth, compression can improve performance because the amount of information to be transferred is reduced significantly. This makes it possible to store large, seamless raster datasets and raster catalogs (as large as several terabytes) and serve them quickly to a client for display.
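The effect of homogeneity on a lossless compression ratio can be demonstrated with Python's standard zlib module. This is a stand-in, not ArcGIS's compressor: zlib uses DEFLATE, which combines LZ77 with Huffman coding, and the two 128 by 128 single-byte "tiles" here are synthetic.

```python
import random
import zlib

def compression_ratio(pixels: bytes) -> float:
    """Uncompressed size divided by compressed size."""
    return len(pixels) / len(zlib.compress(pixels))

tile_pixels = 128 * 128

# A perfectly homogeneous tile: every pixel has the same value.
homogeneous = bytes([42]) * tile_pixels

# A noisy tile: pseudo-random values with a fixed seed for repeatability.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(tile_pixels))

homogeneous_ratio = compression_ratio(homogeneous)
noisy_ratio = compression_ratio(noisy)

# Lossless: decompressing returns exactly the original pixel values.
round_trip_ok = zlib.decompress(zlib.compress(noisy)) == noisy
```

The homogeneous tile compresses by a large factor, while the noisy tile barely compresses at all; in both cases the original values are recovered exactly.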
Learn more about raster compression
Pyramids are reduced-resolution representations of your dataset that are stored alongside the data. It is always recommended that you build pyramids. Pyramids can speed up the display of raster data because ArcGIS only needs to process the extent and resolution required for the display instead of resampling the entire dataset. As you zoom in from the full extent, pyramids with finer resolution are used to display the image.
Pyramids are created by resampling the original data into several different layers, each at a progressively coarser resolution (larger cell size). The resampling method tells the server how to resample the data to build the pyramids. Nearest neighbor should be used for discrete (nominal) data or raster datasets with color maps, such as land-use or pseudo color images. Bilinear interpolation or cubic convolution should be used for continuous data, such as satellite imagery or aerial photography. Prototyping the most appropriate resampling technique for your particular data is highly recommended. Remember that pyramid resampling only affects the display, not the original data.
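To see why nearest neighbor suits discrete data, the sketch below builds one pyramid level from a tiny land-use grid in two ways. It is deliberately simplified: the averaging function is a crude stand-in for bilinear interpolation, the grid values are invented class codes, and both functions assume even grid dimensions.

```python
def downsample_nearest(grid):
    """Keep the upper-left pixel of every 2 x 2 window."""
    return [row[::2] for row in grid[::2]]

def downsample_mean(grid):
    """Average every 2 x 2 window (a simple stand-in for bilinear)."""
    return [[(grid[r][c] + grid[r][c + 1]
              + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
             for c in range(0, len(grid[0]), 2)]
            for r in range(0, len(grid), 2)]

# Invented land-use class codes (nominal data).
land_use = [
    [1, 1, 5, 5],
    [1, 5, 5, 5],
    [2, 2, 7, 7],
    [2, 2, 7, 2],
]

nearest = downsample_nearest(land_use)
averaged = downsample_mean(land_use)
```

Nearest neighbor returns only class codes that exist in the source. Averaging produces 5.75 in one cell, which is not a class at all, and 2.0 in another cell whose source pixels were classes 1 and 5, which silently reports the wrong class.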
Learn more about raster pyramids
When working with file and ArcSDE geodatabases, ArcGIS allows you to choose the number of levels and the resampling technique, fine-tuning the pyramids in a way that optimizes the display performance of your application. When you update part of a raster dataset in a file geodatabase or ArcSDE geodatabase, you only need to update the part of the pyramid that contains the changes. As a result, you can complete the update in a fraction of the time it takes for other implementations because you do not have to rewrite the entire raster dataset or all its pyramids. Additionally, during an update, other users can continue to access the raster dataset with only a small performance drop.
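A partial pyramid update can be pictured as mapping each changed base block to the pyramid blocks above it. The sketch below assumes that each level halves the resolution and keeps the same block size, so one block at level k covers 2^k by 2^k base blocks. Both assumptions and the function name are illustrative rather than taken from this topic.

```python
def pyramid_blocks_to_rebuild(changed_blocks, levels):
    """Map changed base-level (block_row, block_col) indices to the pyramid
    blocks that cover them, keyed by pyramid level (1 = first reduced level)."""
    plan = {}
    for level in range(1, levels + 1):
        # Shifting right by `level` divides the block index by 2**level.
        plan[level] = {(r >> level, c >> level) for r, c in changed_blocks}
    return plan

# Three neighbouring base blocks were modified by an update.
changed = {(4, 6), (4, 7), (5, 6)}
plan = pyramid_blocks_to_rebuild(changed, 3)
```

Here only one block per pyramid level needs to be rebuilt, regardless of how large the full raster is.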
In the ArcSDE geodatabase, raster data is stored in a structure where the data is tiled, indexed, pyramided, and most often compressed. Because of tiling, indexing, and pyramiding, each time the raster data is queried, only the tiles necessary to satisfy the extent and resolution of the query are returned instead of the whole dataset. The tile size controls the number of pixels you want to store in each database memory block. This is specified as a number of pixels in x and y. The default tile size is 128 by 128 pixels, and most applications do not warrant deviating from these default values. In the ArcSDE geodatabase, the tiles of raster data are compressed before storing them in the geodatabase.
Importing rasters
Raster data can be imported into a geodatabase through the user interface in several ways. You can right-click a geodatabase and use the Import shortcut menu. Data can also be loaded into a raster dataset or raster catalog in a geodatabase with the Load Data command found in ArcCatalog. Several geoprocessing tools can be used to load or import data into the geodatabase; for example, the Copy Raster tool can be used to import raster datasets, the Workspace To Raster Dataset tool loads and mosaics all the raster datasets stored within the specified workspace into one raster dataset, and the Workspace To Raster Catalog tool loads all the raster datasets stored in the same workspace into an existing raster catalog. The Add Rasters To Mosaic Dataset tool only adds pointers to the source data in a mosaic dataset and does not move or load the raster data to the location of the mosaic dataset.