Geodatabase storage is based on relational principles
A database management system (DBMS) can be considered open because the simplicity and flexibility of the generic relational data model enable it to support a broad array of applications.
The geodatabase storage model is based on DBMS principles, leveraging a series of simple yet essential relational database concepts. The DBMS (and the file system for file geodatabases) provides a simple, formal data model for storing and working with information in tables.
Key concepts include the following:
- Data is organized into tables.
- Tables contain rows.
- All rows in a table have the same columns.
- Each column has a type, such as integer, decimal number, character, date, and so on.
- Relationships are used to associate rows from one table with rows in another table. This is based on a common column in each table.
- Relational integrity rules exist for tables. For example, each row always shares the same columns, a domain lists the valid values or value ranges for a column, and so on.
For ArcSDE geodatabases that are stored in relational databases, a number of additional DBMS capabilities also apply:
- Structured query language (SQL), a series of relational functions and operators, is available to operate on the tables and their data elements.
- The SQL operators are designed to work with the generic relational data types, such as integers, decimal numbers, dates, and characters.
For example, a feature class is stored as a DBMS table. Each row represents a feature. The columns in each row represent various characteristics or properties of the feature, and one of the columns holds the feature geometry (for example, point, line, or polygon coordinates). In the following example, the shape field holds a polygon shape for each parcel row in the feature class table.
Various column types in the DBMS are used to hold the shape field in the table. These are typically either a binary large object (BLOB) type or an extended spatial type that is supported in some DBMSs. For example, at least one spatial column type for storing features is available for all DBMSs that ArcSDE supports—Oracle, IBM DB2, Informix, PostgreSQL, and SQL Server. This adds support for accessing features using SQL, which adheres to the International Organization for Standardization (ISO) and Open Geospatial Consortium, Inc. (OGC) standards for spatial types.
SQL operates on the rows, columns, and types in tables. The column types (the numbers, characters, dates, BLOBs, spatial types, and so on) are objects in the SQL algebra. The DBMS manages these simple data types and tables, while additional application logic implements more complex object behavior and integrity constraints.
Adding spatial types and SQL support for spatial attributes to a DBMS, however, is not enough on its own to support GIS.
Implementing higher-level objects and behavior in relational DBMSs
Developers wanting to implement higher-level objects with behavior and logic write application code to do so. For example, an organization may implement an employees table as follows:
Last Name |
First Name |
Hire Date |
Salary |
---|---|---|---|
Brown |
Ben |
10-10-2001 |
$10,000.50 |
Jones |
Betty |
06-14-1998 |
$22,000.00 |
Smith |
Jason |
08-23-1999 |
$44,000.75 |
The Employees table is a simple relational data table containing rows and columns. The data in each column adheres to a particular data type, such as character, date, and number. DBMSs work with information at this level.
However, simply adding this information to a DBMS table does not turn the DBMS into a payroll or employee management system. Adding a Dollars column that holds numbers with two decimal places does not turn a DBMS into an accounting system. Higher-level application logic is needed.
Examples of logic that could be implemented to support employment activities are hiring, implementing a pay raise, employee resignations, promotions, and managing benefits. The business objects being modeled for the employees and their names, salaries, and hire dates are not implemented as relational objects. More sophisticated and focused application logic is required to implement behavior and integrity on these business objects.
Similar business objects are universally applied in GIS. For example, topologies, networks, linear referencing systems, raster catalogs, annotations, terrains, map layers, and so forth, are all examples of advanced objects used to implement GIS behavior on top of the simple spatial representations stored in the DBMS.
As with other DBMS applications, tables with spatial column types are not enough on their own for GIS applications. Both sets of objects (the simple DBMS relational column types and the geodatabase application objects such as topologies) are necessary for building geographic information systems.
Where does the application logic belong?
Countless database management system (DBMS) implementations have demonstrated overwhelmingly that the use of a separate application tier that works on the rows and column types in tables is appropriate for advanced applications. For example, all the widely adopted customer information systems (CIS), enterprise resource planning (ERP) systems, and accounting packages implement advanced application logic in the application tier, which enables more openness and extensibility, higher performance, richer toolsets, and increased flexibility.
Users interact with and perform transactions within these systems through the application logic for the vast majority of operations and only use structured query language (SQL) for focused (and appropriate) activities.
Separating the application logic above the data tier also allows the same logic to be applied to DBMSs, files, extensible markup language (XML), and other data storage alternatives. This enables this architecture to be more open. For example, the geodatabase application logic in ArcGIS is also used to read and work with all geographic data sources such as computer-aided drafting (CAD) data, shapefiles, MapInfo data, Intergraph GeoMedia files, and Geography Markup Language (GML) files.
Other ways to persist this higher-level logic include as stored procedures and database triggers in the DBMS or as extended types in the DBMS.