Working with geodatabases in Python
Validating table names
Geodatabases use various relational database management systems (RDBMS) to maintain the many tables that comprise a geodatabase. All tables in a geodatabase must have a unique name, so a mechanism for checking whether a table name is unique is essential when creating data in a geodatabase. Ensuring that a new table name is unique is key to removing the potential for accidentally overwriting data if the script doesn't ensure that the new output name is unique.
The Exists() function can check to see whether the table name is unique for a given workspace.
Using the ValidateTableName() function, a script can determine whether a specific name is valid and unique to a specific workspace. The ValidateTableName() function does not determine that the specified name is unique to the specified workspace. It only returns a valid table name for that workspace.
Function |
Explanation |
---|---|
ValidateTableName(name, {workspace}) |
Takes a table name and a workspace path and returns a valid table name for the workspace |
Specifying the workspace as a parameter allows ArcPy to check all the existing table names and determine whether there are naming restrictions imposed by the output workspace. If the output workspace is an RDBMS, it may have reserved words that may not be used in a table name. It may also have invalid characters that cannot be used in a table or field name. All invalid characters are replaced with an underscore (_). ValidateTableName() returns a string representing a valid table name that may be the same as the input name if the input name is valid. The example below guarantees that the new output feature class created by the Copy Features tool has a unique name that is valid in any geodatabase:
# Move all shapefiles from a folder into a geodatabase # import arcpy from arcpy import env # Set the workspace. List all of the shapefiles # env.workspace = "D:/St_Johns" fcs = arcpy.ListFeatureClasses("*") # Set the workspace to SDE for ValidateTableName # env.workspace = "Database Connections/Bluestar.sde" # For each feature class name # for fc in fcs: # Validate the output name so it is valid # outfc = arcpy.ValidateTableName(fc) # Copy the features from the workspace to a geodatabase # arcpy.CopyFeatures_management(fc, outfc)
Validating field names
Each database can have naming restrictions for field names in a table. Objects such as feature classes or relationship classes are stored as tables in an RDBMS, so these restrictions affect more than just stand-alone tables. These restrictions may or may not be common among various database systems, so scripts should check all new field names to ensure that a tool does not fail during execution.
Function |
Explanation |
---|---|
ValidateFieldName(name, {workspace}) |
Takes a string (field name) and a workspace path and returns a valid field name based on name restrictions in the output geodatabase |
The example below ensures that a field is added, regardless of the input name, using the ValidateFieldName function:
# Create a new numeric field containing the ratio of polygon area to # polygon perimeter. Two arguments, a feature class and field name, # are expected. # Define a pair of simple exceptions for error handling # class ShapeError(Exception): pass class FieldError(Exception): pass import arcpy import os try: # Get the input feature class and make sure it contains polygons # input = arcpy.GetParameterAsText(0) desc = arcpy.Describe(input) if desc.shapeType.lower() != "polygon": raise ShapeError # Get the new field name and validate it # fieldname = arcpy.GetParameterAsText(1) fieldname = arcpy.ValidateFieldName(fieldname, os.path.dirname(input)) # Make sure shape_length and shape_area fields exist # if len(arcpy.ListFields(input,"Shape_area")) > 0 and \ len(arcpy.ListFields(input,"Shape_length")) > 0: # Add the new field and calculate the value # arcpy.AddField_management(input, fieldname, "double") arcpy.CalculateField_management(input,fieldname, "[Shape_area] / [Shape_length]") else: raise FieldError except ShapeError: print "Input does not contain polygons" except FieldError: print "Input does not contain shape area and length fields" except: print arcpy.GetMessages(2)
Whenever a script is updating a dataset, such as a feature class or table, be careful to avoid situations in which the dataset is locked. If you have opened a personal or file geodatabase in ArcCatalog, a script will not be able to update any of the geodatabase's contents until it is deselected and the folder is refreshed or ArcCatalog is closed. This includes script tools.
Parsing table and field names
Scripts should use the ParseTableName() and ParseFieldName() functions to split the fully qualified names for a dataset or for a column in a table into its components (database, owner, table, and column). Scripts that need to be RDBMS independent should not assume that the period (.) is the delimiter used to separate the components of a fully qualified dataset name. ArcGIS applications always use fully qualified names, so if your script is being used as the source of a script tool, any feature class name, for example, must be parsed if the script needs to determine the name of the feature class without the user and database names. In the example below, the script is checking the input feature class for specific user names. The script could use the parsing functions of Python to split the qualified name, but it would then assume a specific syntax, which may change if another database type is used. ParseTableName() returns a single string with the database name, owner name, and table name separated by commas. ParseFieldName() returns the table name and the field name, also separated by commas. A script can then reliably parse the returned string, since these functions always return the same formatted string. A string in Python has a number of native methods for manipulation. In this example, the split method is used to create a Python list, using the comma as the delimiter.
Function |
Explanation |
---|---|
ParseFieldName(name, {workspace}) |
Parses a fully qualified field name into its components (database, owner name, table name, field name) depending on the workspace |
ParseTableName(name, {workspace}) |
Parses a table name into its components (database, owner, table) depending on the workspace |
# Append input feature class to another feature class and update field # import arcpy from arcpy import env import sys # Get the name of the input feature class and parse it. # env.workspace = os.path.dirname(arcpy.GetParameterAsText(0)) # Split ParseTableName's returned value to determine database, owner, and # featureclass # fullname = arcpy.ParseTableName(os.path.basename(arcpy.GetParameterAsText(0))) database, owner, featureclass = fullname.split(",") # Validate the name of the feature class that will be appended to and set # the workspace using the administrator's connection # env.workspace = "Database Connections/Trans_admin.sde" appendFC = arcpy.ValidateTableName("common", "roads") try: if owner == "TJohnson": arcpy.CalculateField_management(fullname, "AppendedBy", owner) arcpy.Append_management(fullname, appendFC) elif owner == "RSmith": arcpy.CalculateField_management(fullname, "AppendedBy", owner) arcpy.Append_management(fullname, appendFC) else: arcpy.AddError("Unknown user of input feature class") except: arcpy.AddError(arcpy.GetMessages(2))
A Python list is an ordered collection of any type of object, such as strings or numbers. Lists are zero based and can be used to effectively handle arrays of values when the order of the list's contents is known.
Scripts that are used as the source of script tools can make some assumptions about their input argument values. Paths to data are always fully qualified by the geoprocessing framework. Scripts that may be run outside an ArcGIS application should not make the same assumption.