How Create Spatially Balanced Points works
There are many considerations to take into account when designing a sampling network. Several designs are described in An introduction to sampling monitoring networks. Spatially balanced designs, in particular, are constructed to improve the efficiency of estimated values by maximizing spatial independence among sample locations (Theobald et al. 2007). They also provide more information per sample unit, because the samples are spread evenly across the population. Note that these comments refer to statistical efficiency, which is only one of several criteria that could be applied to a sampling design. A different criterion is optimal semivariogram estimation, which usually requires samples to be taken at varying distances from one another; clustered samples are often used to determine the nugget value more accurately (see Warrick and Myers 1987 for an optimization algorithm designed with semivariogram fitting criteria in mind).
The Create Spatially Balanced Points tool implements the algorithm proposed by Theobald et al. (2007), which builds in part on the method developed by Stevens and Olsen (2004). The method has two main components:
- The Reverse Randomized Quadrant-Recursive Raster (RRQRR) algorithm is used to map 2D space into a 1D space in which successive samples constitute a spatially balanced sampling design.
- Unequal inclusion probabilities are used to handle variations in sampling intensity. Inclusion probabilities are relative values (between 0 and 1, inclusive) that specify the probability that a location (raster cell) will be selected relative to other locations.
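The two ideas above can be illustrated with a short Python sketch. It is a simplified illustration, not the tool's implementation: the function names are hypothetical, a single random quadrant permutation per level stands in for the hierarchical randomization described by Theobald et al. (2007), and cells are thinned by comparing a random number with the inclusion probability. Each cell gets a base-4 quadrant address, the digits are read in reverse so that the coarsest quadrant varies fastest, and the cells are sorted on that reversed address.

```python
import random


def quadrant_digits(row, col, levels):
    """Base-4 quadrant address of a cell, coarsest level first."""
    digits = []
    for level in range(levels - 1, -1, -1):
        row_bit = (row >> level) & 1
        col_bit = (col >> level) & 1
        digits.append(2 * row_bit + col_bit)
    return digits


def balanced_order(probabilities, seed=1):
    """Return (row, col) cells in a simplified RRQRR-like 1D order.

    probabilities: list of rows; None marks cells outside the sample frame.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(probabilities), len(probabilities[0])
    levels = max(1, (max(n_rows, n_cols) - 1).bit_length())
    # Assumption: one random relabeling of the four quadrants per level.
    perms = [rng.sample(range(4), 4) for _ in range(levels)]
    keyed = []
    for row in range(n_rows):
        for col in range(n_cols):
            p = probabilities[row][col]
            if p is None or p <= 0:
                continue  # outside the study area
            if rng.random() >= p:
                continue  # thinned out by its inclusion probability
            digits = quadrant_digits(row, col, levels)
            shuffled = [perms[i][d] for i, d in enumerate(digits)]
            # Reversed address: the coarsest digit is least significant, so
            # neighbors in the 1D order fall in different large quadrants.
            key = sum(d * 4 ** i for i, d in enumerate(shuffled))
            keyed.append((key, (row, col)))
    keyed.sort()
    return [cell for _, cell in keyed]


grid = [[1.0] * 4 for _ in range(4)]
order = balanced_order(grid, seed=7)
first_four = order[:4]  # one cell from each top-level quadrant
```

In this sketch, taking the first n cells of the ordering yields the sample. Because the coarsest quadrant digit changes fastest, any run of consecutive cells is spread across the grid rather than clustered.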
The input to the tool is a raster that simultaneously defines the following:
- The maximum enclosing rectangle for the analysis
- The inclusion probabilities (locations in the study area have nonnull, greater than 0 inclusion probabilities)
- The sample frame (study area)
- The finest resolution at which the sample locations will be generated
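As a rough illustration of how one raster can carry all four pieces of information, the following Python sketch reads them from a small grid. The representation is an assumption made for the example: the raster is a list of rows, `None` stands for NoData, and `describe_frame`, `cell_size`, `origin_x`, and `origin_y` are hypothetical names rather than part of the tool.

```python
def describe_frame(raster, cell_size, origin_x=0.0, origin_y=0.0):
    """Summarize what an inclusion-probability raster defines."""
    n_rows, n_cols = len(raster), len(raster[0])
    # Maximum enclosing rectangle: the full raster extent.
    extent = (
        origin_x,
        origin_y,
        origin_x + n_cols * cell_size,
        origin_y + n_rows * cell_size,
    )
    # Sample frame: cells that are not NoData and have probability > 0.
    frame = {
        (row, col): raster[row][col]
        for row in range(n_rows)
        for col in range(n_cols)
        if raster[row][col] is not None and raster[row][col] > 0
    }
    return {
        "extent": extent,
        "inclusion_probabilities": frame,
        "frame_cells": sorted(frame),
        "resolution": cell_size,  # finest spacing of candidate locations
    }


example = [
    [None, 0.5],
    [1.0, 0.0],
]
summary = describe_frame(example, cell_size=10.0)
```

Here the NoData cell and the cell with probability 0 both fall outside the sample frame, while still contributing to the enclosing rectangle.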
The resulting spatially balanced design has the following properties:
- Low variance in the area of the Voronoi polygons generated from the sample sites (in other words, each sample point represents roughly the same proportion of the total study area).
- Flexibility, so that changes in time, accessibility to sample sites, budget, and so forth, can be used to update the sample locations. This requires that the randomization used by the algorithm be controlled and repeatable, which is achieved by setting the seed value for the random number generator. A seed value of 0 produces new, unrepeatable output each time the tool is run. A fixed seed value greater than 0 produces repeatable results and can be used to increase or decrease the number of sample points without compromising the spatial balance of the design.
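The role of the seed can be sketched in a few lines of Python. This is only an analogy under stated assumptions: a seeded shuffle stands in for the tool's ordering (a plain shuffle is not spatially balanced), and `ordered_candidates` and `draw` are hypothetical helper names. The point is that a fixed seed reproduces the same ordered list, so a smaller sample is a prefix of a larger one drawn with the same seed.

```python
import random


def ordered_candidates(cells, seed):
    """Return the cells in a random order controlled by the seed."""
    # Mirrors the convention above: 0 means "do not fix the seed".
    rng = random.Random(seed) if seed > 0 else random.Random()
    order = list(cells)
    rng.shuffle(order)
    return order


def draw(cells, n, seed):
    """Take the first n cells of the seeded ordering."""
    return ordered_candidates(cells, seed)[:n]


cells = [(row, col) for row in range(20) for col in range(20)]
small = draw(cells, 5, seed=42)
large = draw(cells, 10, seed=42)
# With the same seed, the smaller sample is nested inside the larger one.
nested = large[:5] == small
```

Under this reading, adding sample points later keeps every location already visited, and dropping points removes only the last ones in the ordering.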
For best results, Theobald et al. (2007) recommend that the number of samples be less than 1 percent of all the possible sample locations in the study area.
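A quick way to apply this guideline is to count the candidate cells and take the largest whole number that stays below 1 percent of that count. The Python sketch below assumes the raster is held as a list of rows with `None` for NoData; `recommended_max_samples` is a hypothetical helper, not a tool parameter.

```python
def recommended_max_samples(raster):
    """Largest sample count strictly below 1 percent of candidate cells."""
    candidates = sum(
        1 for row in raster for p in row if p is not None and p > 0
    )
    # Integer arithmetic: the largest n that satisfies n * 100 < candidates.
    return max(0, (candidates - 1) // 100)


# 50 x 40 raster in which the first 10 rows are NoData: 1,600 candidate cells.
raster = [[None] * 40 for _ in range(10)] + [[1.0] * 40 for _ in range(40)]
limit = recommended_max_samples(raster)
```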
References:
- Stevens, D.L., and A.R. Olsen. 2004. "Spatially balanced sampling of natural resources." Journal of the American Statistical Association 99 (465): 262–278.
- Theobald, D.M., D.L. Stevens, Jr., D. White, N.S. Urquhart, A.R. Olsen, and J.B. Norman. 2007. "Using GIS to Generate Spatially Balanced Random Survey Designs for Natural Resource Applications." Environmental Management 40: 134–146.
- Warrick, A.W., and D.E. Myers. 1987. "Optimization of sampling locations for variogram calculations." Water Resources Research 23 (3): 496–500.