Search neighborhoods
You can assume that as locations get farther from the prediction location, the measured values have less spatial autocorrelation with the prediction location. As these points have little or no effect on the predicted value, they can be eliminated from the calculation of that particular prediction point by defining a search neighborhood. It is also possible that distant locations may have a detrimental influence on the predicted value if they are located in an area that has different characteristics than those of the prediction location. A third reason to use search neighborhoods is computational speed. If you have 2,000 data locations, the matrix may be too large to invert in a reasonable time, and generating each predicted value becomes impractical. The smaller the search neighborhood, the faster the predicted values can be generated. As a result, it is common practice to limit the number of points used in a prediction by specifying a search neighborhood.
The specified shape of the neighborhood restricts how far and where to look for the measured values to be used in the prediction. Additional parameters restrict the locations that are used within the search neighborhood. The search neighborhood can be altered by changing its size and shape or by changing the number of neighbors it includes.
The shape of the neighborhood is influenced by the input data and the surface that you are trying to create. If there are no directional influences in the spatial autocorrelation of your data (see Accounting for directional influences for more information), you will want to use points equally in all directions, and the shape of the search neighborhood is a circle. However, if there is directional autocorrelation or a trend in the data, you may want the shape of your neighborhood to be an ellipse oriented with the major axis parallel to the direction of long-range autocorrelation (the direction in which the data values are most similar).
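The choice between a circular and an elliptical neighborhood comes down to a point-in-ellipse test. The following is a minimal sketch of that test, not the software's implementation. The function name `in_search_ellipse` and the angle convention (counterclockwise from the +x axis) are assumptions made for illustration; Geostatistical Analyst may measure the angle differently.

```python
import math

def in_search_ellipse(px, py, cx, cy, major, minor, angle_deg=0.0):
    """Return True if (px, py) lies inside an ellipse centered on (cx, cy).

    major, minor: semiaxis lengths. angle_deg: direction of the major
    semiaxis, counterclockwise from the +x axis (assumed convention).
    A circle is the special case major == minor.
    """
    theta = math.radians(angle_deg)
    dx, dy = px - cx, py - cy
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / major) ** 2 + (v / minor) ** 2 <= 1.0
```

For example, with `major=4` and `minor=2`, the point (3, 0) is inside the ellipse when the major semiaxis points along x, but outside it once the ellipse is rotated by 90 degrees.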
The search neighborhood can be specified in the Geostatistical Wizard, as shown in the following example:
- Neighborhood type: Standard
- Maximum neighbors = 4
- Minimum neighbors = 2
- Sector type (search strategy): Circle with four quadrants with 45° offset; radius = 182955.6
- Coordinates of test point (x = -2084032, y = 89604.57)
- Predicted value = 0.08593987
The Weights section lists the weights that are used to estimate the value at the location marked by the crosshair on the preview surface. The data points with the largest weights are highlighted in red.
Once a neighborhood shape is specified, you can restrict which locations within the shape should be used. You can define the maximum and minimum number of neighbors to include and divide the neighborhood into sectors to ensure that you include values from all directions. If you divide the neighborhood into sectors, the specified maximum and minimum number of neighbors is applied to each sector.
There are several different sector types that can be used:
- One sector
- Ellipse with four sectors
- Ellipse with four sectors and a 45-degree offset (selected)
- Eight sectors
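The way sectors limit the selection can be sketched as follows. This is an illustration under stated assumptions, not the software's algorithm: the function `select_by_sector` is hypothetical, it uses a circular neighborhood for simplicity, it measures angles counterclockwise from the +x axis, and it applies only the maximum number of neighbors per sector. How the minimum is enforced when a sector has too few points is not described here, so it is left out.

```python
import math
from collections import defaultdict

def select_by_sector(points, center, radius, n_sectors=4,
                     offset_deg=0.0, max_per_sector=4):
    """Keep the nearest max_per_sector points in each angular sector.

    points: iterable of (x, y) tuples. Returns {sector_index: [points]}.
    """
    cx, cy = center
    width = 360.0 / n_sectors
    sectors = defaultdict(list)
    for x, y in points:
        dist = math.hypot(x - cx, y - cy)
        if dist > radius:
            continue  # outside the search neighborhood
        angle = math.degrees(math.atan2(y - cy, x - cx))
        # Shift by the offset, wrap to [0, 360), and bin into a sector.
        idx = int(((angle - offset_deg) % 360.0) // width) % n_sectors
        sectors[idx].append((dist, (x, y)))
    chosen = {}
    for idx, members in sectors.items():
        members.sort()  # nearest first
        chosen[idx] = [pt for _, pt in members[:max_per_sector]]
    return chosen
```

With four sectors and a maximum of two neighbors per sector, three points lying to the northeast of the prediction location contribute only their two nearest members, so a dense cluster in one direction cannot crowd out the other directions.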
Kriging uses the data configuration specified by the search neighborhood, in conjunction with the fitted semivariogram model, to determine weights for the measured locations. Using the weights and the measured values, a prediction can be made for the prediction location. This process is performed for each location within the study area to create a continuous surface. Other interpolation methods follow the same process, but the weights are determined using techniques that do not involve a semivariogram model.
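To make the weight calculation concrete, the following sketch solves a small ordinary kriging system for the points in a neighborhood. Several things here are assumptions for illustration: the choice of ordinary kriging, the spherical semivariogram and its parameter values, and the helper names `spherical`, `solve`, `kriging_weights`, and `predict`. It is not the implementation used by Geostatistical Analyst.

```python
import math

def spherical(h, nugget=0.0, partial_sill=1.0, rng=10.0):
    """Spherical semivariogram; the parameter values are illustrative."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def kriging_weights(neighbors, target, gamma=spherical):
    """Ordinary kriging weights for the neighbors of one target point."""
    n = len(neighbors)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Semivariances between neighbors, bordered by the unbiasedness
    # constraint (weights sum to 1) and its Lagrange multiplier.
    a = [[gamma(dist(p, q)) for q in neighbors] + [1.0] for p in neighbors]
    a.append([1.0] * n + [0.0])
    b = [gamma(dist(p, target)) for p in neighbors] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier

def predict(neighbors, values, target):
    """Weighted sum of the neighbor values."""
    weights = kriging_weights(neighbors, target)
    return sum(w * z for w, z in zip(weights, values))
```

The system has one row and column per neighbor plus one for the constraint, which is why the size of the search neighborhood controls how much work each prediction takes.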
The Smooth Interpolation option creates three ellipses. The central ellipse uses the Major semiaxis and Minor semiaxis values. The inner ellipse uses these semiaxis values multiplied by 1 minus the value for Smoothing factor, whereas the outer ellipse uses the semiaxis values multiplied by 1 plus the smoothing factor. All the points within these three ellipses are used in the interpolation. Points inside the smallest ellipse have weights assigned to them in the same way as for standard interpolation (for example, if the method being used is inverse distance weighted interpolation, the points within the smallest ellipse are weighted based on their distance from the prediction location). The points that fall between the smallest ellipse and the largest ellipse get weights as described for the points falling inside the smallest ellipse, but then the weights are multiplied by a sigmoidal value that decreases from 1 (for points located just outside the smallest ellipse) to 0 (for points located just inside the largest ellipse). Data points outside the largest ellipse have zero weight in the interpolation. An example of this is shown below:
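The taper between the inner and outer ellipses can be sketched numerically. The exact sigmoidal function used by the software is not given here, so the cosine taper below is an assumption chosen only because it falls smoothly from 1 to 0. The names `normalized_distance` and `smooth_multiplier` are hypothetical, as is the angle convention (counterclockwise from the +x axis).

```python
import math

def normalized_distance(px, py, cx, cy, major, minor, angle_deg=0.0):
    """Elliptical distance scaled so the central ellipse boundary is 1.0."""
    theta = math.radians(angle_deg)
    dx, dy = px - cx, py - cy
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return math.hypot(u / major, v / minor)

def smooth_multiplier(d, smoothing_factor):
    """Factor applied to a point's standard weight.

    d: normalized distance from normalized_distance().
    The inner ellipse is at 1 - smoothing_factor and the outer ellipse
    at 1 + smoothing_factor. The cosine taper is an assumed stand-in
    for the software's sigmoidal function.
    """
    inner = 1.0 - smoothing_factor
    outer = 1.0 + smoothing_factor
    if d <= inner:
        return 1.0  # weighted as in standard interpolation
    if d >= outer:
        return 0.0  # outside the largest ellipse
    t = (d - inner) / (outer - inner)
    return 0.5 * (1.0 + math.cos(math.pi * t))
```

With a smoothing factor of 0.2, a point at normalized distance 0.5 keeps its full weight, a point on the central ellipse (distance 1.0) keeps about half of it under this assumed taper, and a point at distance 1.3 contributes nothing.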
In Geostatistical Analyst, the weights for all nonkriging models are defined by a priori analytic functions based on the distance from the prediction location. Most kriging models predict a value using the weighted sum of the values of the nearby locations. Kriging uses the semivariogram to define the weights that determine the contribution of each data point to the prediction of new values at unsampled locations. Because of this, the default search neighborhood used in kriging is constructed using the major and minor ranges of the semivariogram model.
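Inverse distance weighting is one example of weights defined by an a priori function of distance. The sketch below normalizes inverse-power weights so they sum to 1. The function name `idw_weights` and the default power of 2 are assumptions for illustration, not the software's defaults.

```python
def idw_weights(distances, power=2.0):
    """Normalized inverse distance weights for a list of distances."""
    # A data point at the prediction location takes all the weight;
    # several coincident points share it equally.
    if any(d == 0 for d in distances):
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    raw = [1.0 / d ** power for d in distances]
    total = sum(raw)
    return [r / total for r in raw]
```

Unlike the kriging weights, these depend only on the distances to the prediction location and not on a semivariogram or on how the data points are arranged relative to each other.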
A continuous surface is expected when the input data is continuous, such as temperature observations. However, all interpolators with a local search neighborhood generate predictions (and prediction standard errors) that can be substantially different for nearby locations if the local neighborhoods are different. To see a graphical representation of why this occurs, see Smooth interpolation.
A model using the smooth interpolation option cannot predict values when the search neighborhood does not contain any data points, so there may be areas of the map that are left blank.