How different input data formats are handled
Vector data
Geostatistical Wizard, the ESDA tools, and the Geostatistical Analyst geoprocessing tools can accept points, lines, and polygons as input data.
- Point data is read as x,y coordinates and an attribute value at each location. When two or more points have the same x,y coordinates, their attribute values can be treated in different ways. Geostatistical Wizard prompts you to choose an option. For geoprocessing tools, set the option with the coincident points environment setting. See How coincident data is handled for more information.
- Line data is used by computing the x,y coordinates of each line's centroid and reading the data in as points. Lines are also used as barriers by the interpolation methods that account for them (diffusion interpolation and kernel interpolation with barriers).
- Polygon data is used by computing the x,y coordinates of each polygon's centroid and reading the data in as points. Polygons are not altered when they are used for polygon declustering (see Adjusting for preferential sampling by declustering the data), in conjunction with some of the kriging methods, or to represent barriers.
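The reduction of vector features to points can be sketched in plain Python. This is only an illustration, not the code Geostatistical Analyst runs: the function names are hypothetical, the line centroid is assumed to be length-weighted, the polygon centroid is assumed to be area-weighted, and the "mean", "min", and "max" choices stand in for whichever coincident-point options the software offers.

```python
from collections import defaultdict
from statistics import mean


def line_centroid(vertices):
    """Length-weighted centroid of a polyline given as [(x, y), ...] (assumed method)."""
    total = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        cx += length * (x0 + x1) / 2  # segment midpoint weighted by segment length
        cy += length * (y0 + y1) / 2
        total += length
    return (cx / total, cy / total)


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring (shoelace formula)."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # close the ring if the caller left it open
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (3 * twice_area), cy / (3 * twice_area))


def resolve_coincident(points, how="mean"):
    """Collapse (x, y, value) records that share the same x,y into one record."""
    reducers = {"mean": mean, "min": min, "max": max}  # illustrative options only
    groups = defaultdict(list)
    for x, y, value in points:
        groups[(x, y)].append(value)
    return [(x, y, reducers[how](values)) for (x, y), values in groups.items()]


# A 2 x 2 square reduces to its center, and two samples at (0, 0) are averaged.
square_center = polygon_centroid([(0, 0), (2, 0), (2, 2), (0, 2)])
merged = resolve_coincident([(0, 0, 1.0), (0, 0, 3.0), (1, 1, 5.0)])
```

Whichever rule is chosen for coincident points, it changes the values that enter the model, so it is worth recording the choice alongside the results.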
Raster data
Raster data can also be used as input to Geostatistical Wizard and the Geostatistical Analyst geoprocessing tools. In these cases, the cell centers are assigned x,y coordinates and the data is read in the same way as a point file. The ESDA tools do not allow raster data as input.
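As a rough sketch of how a raster becomes point input, the function below assigns each cell the coordinates of its center. It assumes a north-up raster with square cells; the names `cell_centers`, `x_min`, `y_max`, and `cell_size` are hypothetical and are not part of any Geostatistical Analyst API.

```python
def cell_centers(grid, x_min, y_max, cell_size):
    """Return (x, y, value) for every cell; grid[0] is the northernmost row."""
    points = []
    for row_index, row in enumerate(grid):
        # Rows run north to south, so y decreases as the row index grows.
        y = y_max - (row_index + 0.5) * cell_size
        for col_index, value in enumerate(row):
            x = x_min + (col_index + 0.5) * cell_size
            points.append((x, y, value))
    return points


# A 2 x 2 raster whose upper-left corner is at (100, 200), with 10-unit cells.
grid = [[1.0, 2.0],
        [3.0, 4.0]]
points = cell_centers(grid, x_min=100.0, y_max=200.0, cell_size=10.0)
```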
Handling missing values
Feature datasets with missing values should be treated with care. If missing values are represented by a code (for example, -99), the code is read as a measured value, and every statistical analysis based on that data will be wrong unless you follow one of the two options below:
- Import the data into a geodatabase and recode all the missing values as <Null> values. Null values will be ignored in all the computations done within Geostatistical Analyst (the ESDA tools, wizard, and geoprocessing tools).
- Write a definition query to exclude the missing values from the dataset. Note that a definition query excludes entire rows from the attribute table, so you may need copies of the layer if you want to exclude different rows based on different attributes. For example, suppose two attributes were supposed to be measured at each point, but one measurement is missing at a few points. One layer would have a definition query that excludes missing values of attribute one, and a copy of the layer would have a definition query that excludes missing values of attribute two. For more information on using definition queries, refer to Building a query expression.
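The effect of both options can be illustrated with a small, self-contained Python sketch. It works on plain dictionaries rather than a geodatabase, and the field names (`ozone`, `pm25`), the -99 code, and the helper functions are all hypothetical.

```python
from statistics import mean

MISSING_CODE = -99  # example code from the text; use whatever your data uses


def recode_missing(rows, fields, missing_code=MISSING_CODE):
    """Option 1: replace the missing-value code with None (a stand-in for <Null>)."""
    return [
        {key: (None if key in fields and value == missing_code else value)
         for key, value in row.items()}
        for row in rows
    ]


def exclude_missing(rows, field):
    """Option 2: drop whole rows where `field` is missing, as a definition query would."""
    return [row for row in rows if row[field] is not None]


rows = [
    {"id": 1, "ozone": 40.0, "pm25": 12.0},
    {"id": 2, "ozone": -99, "pm25": 15.0},   # ozone not measured
    {"id": 3, "ozone": 50.0, "pm25": -99},   # pm25 not measured
]

# Treating the code as data drags the mean far from the real measurements.
naive_mean = mean(row["ozone"] for row in rows)

clean = recode_missing(rows, fields=["ozone", "pm25"])

# One filtered copy per attribute, because each filter removes entire rows.
ozone_layer = exclude_missing(clean, "ozone")
pm25_layer = exclude_missing(clean, "pm25")
ozone_mean = mean(row["ozone"] for row in ozone_layer)
```

Here the two filtered copies keep different rows (ids 1 and 3 for `ozone`, ids 1 and 2 for `pm25`), which is why a single filtered layer cannot serve both attributes.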
Raster cells that have NoData values are ignored when they are read by Geostatistical Wizard and the Geostatistical Analyst geoprocessing tools.