An overview of geodatabase design
Geodatabase design is based on a common set of fundamental GIS design steps, so it's important to have a basic understanding of these GIS design goals and methods. This section provides an overview.
GIS design involves organizing geographic information into a series of data themes: layers that can be integrated using geographic location. Geodatabase design therefore begins by identifying the data themes to be used, then specifying the contents and representations of each thematic layer.
This involves defining
- How the geographic features are to be represented for each theme (for example, as points, lines, polygons, or rasters) along with their tabular attributes
- How the data will be organized into datasets, such as feature classes, attributes, raster datasets, and so forth
- What additional spatial and database elements will be needed to enforce integrity rules, to implement rich GIS behavior (such as topologies, networks, and raster catalogs), and to define spatial and attribute relationships between datasets
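The three design decisions listed above can be recorded in a simple design-time structure before any geodatabase is created. The following is a minimal Python sketch, not an ArcGIS API: the `ThemeSpec` class, its fields, and the sample themes are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Representation(Enum):
    """Geographic representations named in the design steps above."""
    POINT = "point"
    LINE = "line"
    POLYGON = "polygon"
    RASTER = "raster"


@dataclass
class ThemeSpec:
    """Design-time description of one data theme (hypothetical structure)."""
    name: str
    representation: Representation
    dataset: str  # name of the feature class or raster dataset that holds the theme
    attributes: list[str] = field(default_factory=list)
    integrity_rules: list[str] = field(default_factory=list)


# Illustrative themes only; the names and rules are assumptions.
design = [
    ThemeSpec("Wells", Representation.POINT, "well_locations", ["well_id", "depth_m"]),
    ThemeSpec("Soils", Representation.POLYGON, "soil_types", ["soil_code"],
              ["polygons must not overlap"]),
    ThemeSpec("Elevation", Representation.RASTER, "dem"),
]


def themes_by_representation(specs, representation):
    """Return the names of all themes that use the given representation."""
    return [s.name for s in specs if s.representation is representation]
```

Writing the design down this way makes it easy to review which themes share a representation and which integrity rules still need to be specified.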
Representation
Each GIS database design begins with a decision as to what the geographic representations will be for each dataset. Individual geographic entities can be represented as
- Feature classes (sets of points, lines, and polygons)
- Imagery and rasters
- Continuous surfaces that can be represented using features (such as contours), rasters (digital elevation models [DEM]), or triangulated irregular networks (TINs) using terrain datasets
- Attribute tables for descriptive data
Data themes
Geographic representations are organized in a series of data themes (sometimes referred to as thematic layers), a key concept in GIS. A data theme is a collection of common geographic elements such as a road network, a collection of parcel boundaries, soil types, an elevation surface, satellite imagery for a certain date, well locations, and so on.
The concept of a thematic layer was one of the early notions in GIS. Practitioners thought about how the geographic information in maps could be partitioned into logical information layers—as more than a random collection of individual objects (such as a road, a bridge, a hill, a house, a peninsula). These early GIS users organized information in thematic layers that described the distribution of a phenomenon and how it should be portrayed across a geographic extent. These layers also provided a protocol (capture rules) for collecting the representations (as feature sets, raster layers, attribute tables, and so on).
Thematic layers are one of the main organizing principles for GIS database design.
Each GIS will contain multiple themes for a common geographic area. The collection of themes acts as layers in a stack. Each theme can be managed as an information set independent of other themes, and each has its own representations (points, lines, polygons, surfaces, rasters, and so on). Because the various independent themes are spatially referenced, they overlay one another and can be combined in a common map display. In addition, GIS analysis operations, such as overlay, can combine information from different themes.
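To make the idea of overlay concrete, here is a small, self-contained Python sketch that attaches a soil type to each well by testing which soil polygon contains it. It assumes both themes share one coordinate system, uses a plain ray-casting test rather than any GIS library, and all data and function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray cast to the right of (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Two independent themes over the same area (illustrative data).
soils = [
    {"soil_type": "clay", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"soil_type": "sand", "ring": [(10, 0), (20, 0), (20, 10), (10, 10)]},
]
wells = [
    {"well_id": "W1", "x": 3, "y": 4},
    {"well_id": "W2", "x": 15, "y": 5},
    {"well_id": "W3", "x": 30, "y": 5},
]


def overlay_points_on_polygons(points, polygons, field):
    """Copy `field` from the first containing polygon onto each point."""
    result = []
    for p in points:
        match = None
        for poly in polygons:
            if point_in_polygon(p["x"], p["y"], poly["ring"]):
                match = poly[field]
                break
        result.append({**p, field: match})
    return result


joined = overlay_points_on_polygons(wells, soils, "soil_type")
```

A well that falls outside every soil polygon receives `None`, which is one simple way to flag gaps in coverage between themes. Points lying exactly on a polygon boundary are not handled specially in this sketch.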
GIS datasets are collections of representations for a data theme
Geographic data collections can be represented as feature classes and raster-based datasets in a GIS database.
Many themes are represented by a single collection of homogeneous features, such as a feature class of soil type polygons or a point feature class of well locations. Other themes, such as a transportation framework, are represented by multiple datasets (such as a set of spatially related feature classes for streets, intersections, bridges, highway ramps, and so on).
Raster datasets are used to represent continuous surfaces, such as elevation, slope, and aspect, as well as to hold satellite imagery, aerial photography, and other gridded datasets (such as land cover and vegetation types).
Both the intended use and existing data sources influence spatial representations in a GIS. When designing a GIS database, users have a set of applications in mind and understand what questions will be asked of the GIS. Defining these uses helps to determine the content specification for each theme and how each is to be represented geographically. For example, there are numerous alternatives for representing surface elevation: as contour lines and spot height locations (such as hilltops and peaks), as a continuous terrain surface (a TIN), or as shaded relief. Any or all of these may be relevant for a particular GIS database design. The intended uses of the data will help to determine which of these representations are required.
Frequently, the geographic representations will be predetermined to some degree by the available data sources for the theme. If a preexisting data source was collected at a particular scale and representation, it will often be necessary to adapt your design to use it.
Individual GIS datasets often are collected in concert with other data layers
While each GIS dataset can be used independently of other GIS data, it is often quite important to collect datasets in concert with other information layers so that the fundamental spatial behavior and spatial relationships are maintained and consistent between the related GIS data layers. Here are a few examples that help to illustrate this concept:
- Hydrologic information about watersheds and drainage basins should be collected in unison with the drainage network. Drainage lines should fit within the basins. All these layers should fit into the surface representation of the terrain.
- Various data layers in a parcel fabric should be collected in concert with other cadastral layers and with underlying survey information so that parcel features fit into the survey control framework. Numerous other feature sets, such as rights-of-way, easements, and zoning classes, are compiled so that they fit onto the parcel fabric.
- Elevation, landform, soil type, slope, vegetation, surficial geology, and other terrain properties are typically compiled in unison to characterize environmental resource units. Understanding the science behind the spatial relationships among them helps to build a consistent, logical database in which features from each data layer agree with one another.
- Topographic basemap information is compiled in an integrated manner. Hydrography, transportation, structures, administrative boundaries, and other topographic map layers are compiled in unison. These cartographic representations in the map display are built in an integrated manner to communicate clearly and accurately and draw attention to key map locations.
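Consistency between layers that were collected together can be checked programmatically. As a rough illustration of the first example above (drainage lines should fit within their basins), the following Python sketch compares bounding envelopes. This is only a coarse, necessary condition and not a full containment test; the data layout and function names are hypothetical.

```python
def envelope(coords):
    """Return (xmin, ymin, xmax, ymax) for a list of (x, y) coordinates."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def envelope_contains(outer, inner):
    """True if the inner envelope lies entirely within the outer envelope."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


# Illustrative data: one basin polygon and two drainage lines assigned to it.
basins = {"B1": [(0, 0), (10, 0), (10, 10), (0, 10)]}
drainage_lines = [
    {"line_id": "D1", "basin": "B1", "path": [(2, 9), (4, 5), (5, 0)]},
    {"line_id": "D2", "basin": "B1", "path": [(8, 8), (12, 6)]},
]


def lines_outside_basin(lines, basins):
    """List IDs of lines whose envelope extends beyond their basin's envelope."""
    problems = []
    for line in lines:
        basin_env = envelope(basins[line["basin"]])
        if not envelope_contains(basin_env, envelope(line["path"])):
            problems.append(line["line_id"])
    return problems
```

Here `D2` is reported because it extends past the eastern edge of basin `B1`. A line that passes this check can still leave an irregularly shaped basin, so a production design would rely on true geometric containment or topology rules instead.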
In each of these cases, a data model defines a collection of related data themes that fit into an overall information framework. Each framework is essentially a collection of related data themes that are best captured in unison with each other. The data capture guidelines follow sound scientific principles about their spatial behavior and relationships. Each theme plays an important part in the holistic characterization of a particular landscape. For example:
- Terrain landscape. Topographic maps, elevation, drainage network, transportation network, map features, cross-country movement, and so forth
- Urban landscape. Buildings, critical infrastructure, and so forth
- Imagery landscape. Satellite and aerial, local, regional, and national assets, and so forth
- Human landscape. Demographics (population characteristics), cultural centers, citizens, administrative districts and zones, and so forth
- Workforce landscape. Mobile workforce tracking, service centers, traffic conditions, warehouses, and so forth
- Sensor landscape. Camera locations, devices, and so forth
- Operations and plans landscape. Zones of control, planned movements, response, and so forth
This concept of collecting integrated data themes in unison is one of the key design principles used in each of the ArcGIS Data Models.