# How Generate Spatial Weights Matrix works

Spatial statistics is not simply the application of traditional (nonspatial) statistical methods to data that happens to be spatial (that is, data with x- and y-coordinates). Spatial statistics integrate space and spatial relationships (area, distance, length, and so on) directly into their mathematics. For many spatial statistics, these spatial relationships are specified formally through a spatial weights matrix file or table.

A spatial weights matrix is a representation of the spatial structure of your data. It is a quantification of the spatial relationships that exist among the features in your dataset (or, at least, a quantification of the way you conceptualize those relationships). Because the spatial weights matrix imposes a structure on your data, you should select a conceptualization that best reflects how features actually interact with each other (giving thought, of course, to what it is you are trying to measure). If you are measuring clustering of a particular species of seed-propagating tree in a forest, for example, some form of inverse distance is probably most appropriate. However, if you are assessing the geographic distribution of a region's commuters, travel time or travel cost might be a better choice.

While physically implemented in a variety of ways, conceptually, the spatial weights matrix is an NxN table (N is the number of features in the dataset). There is one row for every feature and one column for every feature. The cell value for any given row/column combination is the weight that quantifies the spatial relationship between those row and column features.
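As a rough illustration of the conceptual NxN table, the sketch below builds a dense matrix for three made-up point features. The names `points`, `dense_weights`, and `inverse_distance` are hypothetical, and excluding a feature from its own neighbors is a simplifying assumption of this example, not a statement about how the tool behaves.

```python
import math

# Hypothetical point features: (x, y) coordinates in arbitrary units.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

def inverse_distance(a, b):
    """Weight that shrinks as the distance between two points grows."""
    return 1.0 / math.dist(a, b)

def dense_weights(features, weight_fn):
    """Return an N x N list of lists.

    Cell [i][j] holds the weight relating feature i (row)
    to feature j (column).
    """
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: a feature is not its own neighbor
                matrix[i][j] = weight_fn(features[i], features[j])
    return matrix

W = dense_weights(points, inverse_distance)
for row in W:
    print([round(w, 3) for w in row])
```

With N features the table always has N x N cells, which is why the physical implementation usually differs from this conceptual form.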

At the most basic level, there are two strategies for creating weights to quantify the relationships among data features: binary weighting and variable weighting. For binary strategies (fixed distance, K nearest neighbors, or contiguity), a feature is either a neighbor (1) or it is not (0). For variable weighting strategies (inverse distance or zone of indifference), neighboring features have a varying amount of impact (or influence), and weights are computed to reflect that variation.
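The difference between the two strategies can be sketched with a pair of small functions. This is a minimal illustration under stated assumptions: the function names, the `threshold` and `power` parameters, and the choice to give coincident points a weight of 0 are all hypothetical and are not taken from the tool.

```python
import math

def fixed_distance_weight(a, b, threshold):
    """Binary strategy: 1 if b lies within the threshold of a, else 0."""
    return 1.0 if math.dist(a, b) <= threshold else 0.0

def inverse_distance_weight(a, b, threshold, power=1.0):
    """Variable strategy: nearer neighbors receive larger weights."""
    d = math.dist(a, b)
    if d == 0.0 or d > threshold:
        # Assumption: coincident points and points beyond the
        # threshold get no weight in this sketch.
        return 0.0
    return 1.0 / d ** power

target = (0.0, 0.0)
others = [(1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]

binary = [fixed_distance_weight(target, p, threshold=3.0) for p in others]
variable = [inverse_distance_weight(target, p, threshold=3.0) for p in others]

print(binary)    # neighbor / neighbor / not a neighbor
print(variable)  # influence falls off with distance
```

Both strategies agree on which features are neighbors here; they differ only in whether every neighbor counts equally.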

Based on your parameter specifications, the Generate Spatial Weights Matrix tool creates a Spatial Weights Matrix (.swm) file in *little endian* binary format. The spatial relationship values in that file are stored using sparse matrix techniques to minimize disk space, computer memory, and the number of required calculations. These relationship values are utilized in the mathematics of several spatial statistics tools including Spatial Autocorrelation (Global Moran's I), Hot Spot Analysis (Getis-Ord Gi*), and Cluster and Outlier Analysis (Anselin Local Moran's I). While the Spatial Weights Matrix file can conceivably store NxN spatial relationships, in most cases, each feature should only be related to a handful of others. The sparse methodology takes advantage of this by only storing nonzero relationships.
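The idea of storing only nonzero relationships can be sketched with a dictionary of dictionaries. This is only an illustration of the sparse principle; it does not reproduce the .swm binary layout, and `sparse_weights` and the sample `features` are hypothetical.

```python
import math

def sparse_weights(features, threshold):
    """Map each feature index to {neighbor index: weight}.

    Pairs farther apart than the threshold have a weight of zero
    and are simply not stored.
    """
    sparse = {}
    for i, a in enumerate(features):
        row = {}
        for j, b in enumerate(features):
            if i != j and math.dist(a, b) <= threshold:
                row[j] = 1.0  # binary weight, for simplicity
        sparse[i] = row
    return sparse

# Ten hypothetical features spaced one unit apart along a line.
features = [(float(i), 0.0) for i in range(10)]
sparse = sparse_weights(features, threshold=1.5)

stored = sum(len(row) for row in sparse.values())
total_cells = len(features) ** 2
print(f"{stored} stored weights out of {total_cells} possible cells")
```

Note that this sketch still compares every pair of features while building the structure; only the storage is sparse.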

It is possible to run out of memory when you are using a .swm file. This generally occurs when you select a Conceptualization of Spatial Relationships or Distance Band/Threshold Distance that gives features a very large number of neighbors, negating the sparse nature of the .swm file. You generally do not want to create a spatial weights matrix where every feature has thousands of neighbors. You want all features to have at least one neighbor and almost all features to have at least eight neighbors. You can ensure that each feature has a specified minimum number of neighbors by specifying that minimum value for the Number of Neighbors parameter.
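One way to picture a minimum neighbor count is to fall back to the nearest features whenever the distance threshold finds too few. The sketch below is an assumption about how such a rule could work, not the tool's actual implementation; `neighbors_with_minimum` and its parameters are hypothetical names.

```python
import math

def neighbors_with_minimum(features, threshold, min_neighbors):
    """Return {feature index: [neighbor indices]}.

    Neighbors are the features within the threshold. If fewer than
    min_neighbors are found, the nearest min_neighbors are used instead.
    """
    result = {}
    for i, a in enumerate(features):
        # Rank every other feature from nearest to farthest.
        ranked = sorted(
            (math.dist(a, b), j) for j, b in enumerate(features) if j != i
        )
        within = [j for d, j in ranked if d <= threshold]
        if len(within) < min_neighbors:
            within = [j for _, j in ranked[:min_neighbors]]
        result[i] = within
    return result

# Three clustered features and one isolated feature (hypothetical data).
features = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
nbrs = neighbors_with_minimum(features, threshold=1.5, min_neighbors=2)

counts = [len(v) for v in nbrs.values()]
print("fewest neighbors:", min(counts), "most neighbors:", max(counts))
```

Checking the smallest and largest neighbor counts, as the last lines do, is a quick way to spot a threshold that leaves features isolated or that gives them far too many neighbors.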

## Additional resources

Mitchell, Andy. *The ESRI Guide to GIS Analysis, Volume 2.* ESRI Press, 2005.

Getis, Arthur, and Jared Aldstadt. Constructing the spatial weights matrix using a local statistic. *Geographical Analysis,* 36(2): 90–104, 2004.