The geoprocessing framework
The geoprocessing framework is the set of windows and dialog boxes you use to manage and execute tools. This document focuses on the high-level concepts and ideas behind this framework rather than the mechanics of using it.
The core idea behind geoprocessing is to allow you to quickly and easily turn your ideas into new software that can be executed, managed, modified, documented, and shared with the ArcGIS user community. Software, in this case, means something that instructs ArcGIS to do what you want. A geoprocessing model, for example, is new software built by you with an easy-to-use visual programming language called ModelBuilder.
The main theme of this section is that geoprocessing is a way for you to create new, useful software. Viewing geoprocessing this way should give you a broader and deeper understanding of how and why to use it.
To create new software of any kind, two essential elements are needed:
- A formal language that operates on the data captured within the system.
- A framework for creating, managing, and executing software based on this language. This includes things such as editors, browsers, and documentation tools.
Geoprocessing's language is its collection of tools. The geoprocessing framework is a small collection of built-in user interfaces for organizing and managing existing tools and creating new tools. The basic components of the framework are shown in A quick tour of geoprocessing and consist of the following:
- The Search window to find and execute tools and the Catalog window to browse to toolboxes to manage or execute their tools
- The tool dialog box for interactively filling out tool parameters and executing the tool
- The Python window for executing a tool by typing in its parameters
- The ModelBuilder window for chaining together sequences of tools
- Methods for creating scripts and adding them to toolboxes
Geoprocessing models and ModelBuilder
The tool dialog box lets you execute a single tool. You can think of this as executing a single instruction in a programming language. While single tool execution is certainly practical, the system would not be very useful unless you could string together multiple tools, feeding the output of one into another, just like a programming language.
In the geoprocessing framework, the ModelBuilder window is how you quickly and easily turn your ideas into software by chaining together elements of the geoprocessing language (the tools) into a sequence. It's important to realize that models are software, since they instruct the computer to do something. The programming language is visual—what you see in ModelBuilder—rather than text-based like a traditional programming language.
The most important thing to note here is that models are tools. They behave exactly like all other tools in the system. You can execute them from their dialog box or in the Python window. Since models are tools, you can embed models within models. In fact, several of the system tools provided with ArcGIS are models.
Models can be as complex as you dare. You can use any system or custom tools in a model, including other models you've written (since models are just tools). You can also use loops and conditions to control the logical flow of a model.
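The idea that a model chains tools together, and is itself a tool that can be embedded in another model, can be sketched in plain Python. This is only an analogy, not ModelBuilder or ArcPy: the helper `make_model` and the three stand-in tools are hypothetical names invented for illustration, and the "datasets" are simple lists of (x, y) points.

```python
from functools import reduce


def make_model(*tools):
    """Chain tools so each one's output feeds the next.

    The returned function takes a dataset and returns a result,
    so it can be used anywhere a single tool can be used.
    """
    def model(dataset):
        return reduce(lambda data, tool: tool(data), tools, dataset)
    return model


# Hypothetical stand-in tools that operate on a list of (x, y) points.
def shift_east(points):
    """Move every point 100 units east."""
    return [(x + 100, y) for x, y in points]


def clip_to_window(points):
    """Keep only the points inside a fixed rectangular window."""
    return [(x, y) for x, y in points if 100 <= x <= 200 and 0 <= y <= 50]


def count_features(points):
    """Report how many points remain."""
    return len(points)


# A model built from two tools.
shift_and_clip = make_model(shift_east, clip_to_window)

# Because a model is a tool, it can be embedded in another model.
shift_clip_count = make_model(shift_and_clip, count_features)

sample = [(0, 10), (50, 60), (150, 20)]
clipped = shift_and_clip(sample)
total = shift_clip_count(sample)
```

Here `shift_clip_count` treats `shift_and_clip` exactly as it treats `count_features`, which mirrors how a model can appear as one element inside a larger model.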
Models can be extremely simple and still be productive. You can create a model that contains a single tool but embeds some of its parameters. For example, the Buffer tool takes six parameters, but for your current set of tasks, you know that three of these parameters will always be the same. Rather than filling out these parameters each time you execute the Buffer tool, you can quickly create a model and set these three parameters, save it as the MyBuffer tool, and use its dialog box rather than the Buffer dialog box. You might only use MyBuffer a few times before deleting it, but it's no loss because it was quick and easy to create and productive for you to do so.
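The MyBuffer idea, a tool with some parameters fixed in advance, has a close analogue in ordinary Python. The sketch below is not the real Buffer tool: `buffer_tool` is a hypothetical six-parameter stand-in whose parameter names and values are assumptions, and it only records what it was given so that the example runs without ArcGIS.

```python
from functools import partial


def buffer_tool(in_features, out_features, distance,
                side="FULL", end_type="ROUND", dissolve="NONE"):
    """Hypothetical six-parameter stand-in for a Buffer-like tool.

    It returns its parameters instead of processing any data.
    """
    return {
        "in_features": in_features,
        "out_features": out_features,
        "distance": distance,
        "side": side,
        "end_type": end_type,
        "dissolve": dissolve,
    }


# "MyBuffer": the same tool with three parameters fixed in advance.
my_buffer = partial(buffer_tool, distance="100 Meters", side="FULL", dissolve="ALL")

# Only the three remaining parameters are supplied at run time.
result = my_buffer("roads", "roads_buffer", end_type="FLAT")
```

As with the model described above, the wrapper costs almost nothing to create, so it is worthwhile even if it is used only a few times.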
Scripting
You can also use a scripting language to create new, useful software. A program written in a scripting language is a script. In the world of software programming, languages can be divided into two basic categories: system languages and scripting languages. System languages, such as C++ and the .NET languages (for example, C#), are used to create applications from scratch, using low-level primitives and the raw resources of the computer. Scripting languages, such as Python and Perl, are used to glue applications together, using built-in higher-level functions of the computer and masking the nuts and bolts a system language programmer must deal with. Compared to system languages, scripting languages are easier to learn and use; a basic understanding of programming is all that's needed to be productive.
In the geoprocessing framework, scripts are analogous to models in that they can be used to create new tools. Models are created with a visual programming language (ModelBuilder), and scripts are created with a text-based language and text editors.
Just like models, scripts are tools. You can introduce a script to a custom toolbox using a step-by-step wizard, and it becomes just another tool that you can use in a model or in another script. Several of the system tools are scripts. Technically, you can write a script and not introduce it to a toolbox, in which case it's not a tool but only a stand-alone script on disk.
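One way to keep a script usable both as a toolbox tool and as a stand-alone script is to separate parameter handling from the work itself. The following is a minimal sketch, assuming the script receives its parameters as an ordered argument list in both cases. The three parameter names and both function names are hypothetical, and the placeholder function only describes the work where a real script would call geoprocessing tools.

```python
def parse_parameters(argv):
    """Turn a raw argument list into named parameters.

    argv[0] is the script path; the remaining items are the parameters,
    in order. The parameter names used here are hypothetical.
    """
    if len(argv) != 4:
        raise ValueError(
            "expected: script.py <in_features> <out_features> <distance>"
        )
    _, in_features, out_features, distance = argv
    return {
        "in_features": in_features,
        "out_features": out_features,
        "distance": float(distance),
    }


def describe_run(params):
    """Placeholder for the real work of the script."""
    return "buffer {in_features} by {distance} into {out_features}".format(**params)


# Demo with an explicit argument list, shaped like sys.argv.
demo_args = ["my_script.py", "parcels", "parcels_buffer", "250"]
message = describe_run(parse_parameters(demo_args))
```

In a real script, `sys.argv` would take the place of `demo_args`.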
There are several reasons why you'd want to use scripting:
- At some point, you may have the need for more advanced programming logic, such as conditional execution and advanced error handling; more advanced data structures, such as dictionaries and lists; or more functionality, such as string, math, and file manipulation functions. Many scripting languages have been extended with third-party libraries for things such as advanced math and statistics, Web automation, database queries, and advanced system utilities.
- There are some low-level geoprocessing functions available only in scripts. Cursors, for example, let you loop through records on a table, reading or writing rows and inserting new rows. There are functions to access the properties of ArcGIS data, such as the extent of a feature class or the sundry properties of individual fields on a table.
- Scripts are great for wrapping other software—the gluing together of applications. For example, you might have a model that outputs a simple text file of parcel owners and addresses affected by a zoning ordinance change, and you want to launch another program that reads this text file and generates official notification letters for the owners of the affected parcels. You could use a script to wrap this letter generation program, introduce this script to a toolbox, and use it directly in the model.
- Scripts can be executed outside ArcGIS. That is, you can execute the script directly from the operating system prompt. (You still need to have the ArcPy site-package installed on the machine since you need access to the geoprocessing tools.)
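The wrapping scenario in the list above (a text file of parcel owners handed to a separate letter generation program) might look like the following sketch. Everything specific here is an assumption made for illustration: the tab-separated file layout, the function names, and the letter program itself, which is faked with a one-line Python command so that the example runs anywhere. Only standard-library calls are used; no ArcPy calls appear.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def write_owner_file(path, owners):
    """Write one 'owner<TAB>address' line per affected parcel."""
    with open(path, "w", encoding="utf-8") as handle:
        for owner, address in owners:
            handle.write(f"{owner}\t{address}\n")


def run_letter_generator(command, owner_file):
    """Launch the external program, passing the owner file as its last argument.

    check=True raises CalledProcessError if the program exits with an error.
    """
    completed = subprocess.run(
        command + [str(owner_file)],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Stand-in for the real letter program: counts the lines in the file it is given.
FAKE_GENERATOR = [
    sys.executable,
    "-c",
    "import sys; print(sum(1 for _ in open(sys.argv[1])), 'letters generated')",
]

with tempfile.TemporaryDirectory() as folder:
    owner_file = Path(folder) / "owners.txt"
    write_owner_file(
        owner_file,
        [("A. Rivera", "12 Elm St"), ("B. Chen", "40 Oak Ave")],
    )
    output = run_letter_generator(FAKE_GENERATOR, owner_file)
```

Replacing `FAKE_GENERATOR` with the command line of the actual letter program is the only change the wrapper itself would need; the script could then be added to a toolbox and used in the model.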
A framework for creating and managing software
The geoprocessing framework was built to let you quickly and easily turn your ideas into new software that can be managed by the system and shared among users.
Geoprocessing is a language consisting of operators, or tools, that operate on the data within ArcGIS (tables, feature classes, rasters, TINs, and so on), and perform tasks that are necessary for manipulating and analyzing geographic information across a wide range of disciplines.
You can quickly and easily create new software in the form of models and scripts. These new tools perform tasks that are not part of the standard ArcGIS package. For example, there is no menu, button, or programming object anywhere in ArcGIS that performs the simple Project and Clip model shown in What is geoprocessing.
Tools are managed by the geoprocessing framework, which means you don't have to manage them yourself. This is a subtle but important point.
- All tools, whether they are system tools or custom (user-written) tools, can be accessed from their toolbox. Imagine a different situation where models, scripts, and system and custom tools were each accessed by different interfaces and methods—it would be a nightmare to use and manage. In geoprocessing, all things are created and managed equally, whether they are component tools, model tools, or script tools.
- Tools are all documented the same way. Once you create a tool, you can document your tool in the Catalog window so it can be cataloged and searched by the system. Compare this to the alternative of leaving documentation standards and management as an exercise for the user.
- Tools have the same user interface: the dialog box. These dialog boxes are automatically created based on the tool parameters. You don't have to do any user interface programming. Consider the alternative where user interface design and programming are left to the tool author.
Tools can be easily shared. A toolbox with all its tools and toolsets is either contained in a file on disk with a .tbx extension or within a geodatabase. Anyone with access to the file or geodatabase can run its tools.
The salient point is that your tools become full-fledged members of the geoprocessing framework where they have consistent documentation, user interface, methods of access, and methods of sharing.
Geoprocessing and ArcObjects
ArcObjects is the extensive library of low-level programming objects delivered as part of the ArcGIS Software Development Kit (SDK). Developers use ArcObjects to build new applications or extend the existing functionality of ArcGIS applications. (For the record, most system tools and the entire geoprocessing framework were all built using ArcObjects.) Like geoprocessing, the ArcObjects SDK can be used to create new software.
The ArcObjects SDK and geoprocessing are complementary; neither makes the other obsolete. As a general statement, ArcObjects is used to extend ArcGIS with new behavior, while geoprocessing is designed to automate tasks. You use ArcObjects to do things like add new user interfaces, add custom behavior to feature classes, or create a special cartographic renderer. Geoprocessing is used to create software (models and scripts) that automates tasks within the confines of a well-behaved framework.
ArcObjects is meant to be used with a system programming language, where the programmer needs to access low-level primitives to implement complex logic and algorithms. This is why ArcObjects contains thousands of different objects and requests—to allow the programmer the fine degree of control they require. Because ArcObjects is used in concert with a system programming language, it requires a good deal of programming knowledge—much more than geoprocessing, with its models and scripts.
Conversely, geoprocessing is a universal capability that can be used and deployed by all GIS users to automate their work, build repeatable and well-defined methods and procedures, and model important geographic processes.