# How GWR works

Geographically Weighted Regression (GWR) is one of several spatial regression techniques increasingly used in geography and other disciplines. GWR provides a local model of the variable or process you are trying to understand or predict by fitting a regression equation to every feature in the dataset. GWR constructs these separate equations by incorporating the dependent and explanatory variables of features falling within the bandwidth of each target feature. The shape and size of the bandwidth depend on user input for the Kernel type, Bandwidth method, Distance, and Number of neighbors parameters.

## Implementation notes and tips

In global regression models, such as OLS, results are unreliable when two or more variables exhibit multicollinearity, that is, when they are redundant or together tell the same "story". Because GWR builds a local regression equation for each feature in the dataset, the same problem can arise within each neighborhood: when the values for a particular explanatory variable cluster spatially, you will very likely have problems with local multicollinearity. The condition number in the Output feature class indicates when results are unstable due to local multicollinearity. As a rule of thumb, do not trust results for features with a condition number larger than 30, equal to Null, or, for shapefiles, equal to -1.7976931348623158e+308.

Severe model design errors often indicate a problem with global or local multicollinearity. To determine where the problem is, run the model using OLS and examine the VIF value for each explanatory variable. If some of the VIF values are large (above 7.5, for example), global multicollinearity is preventing GWR from solving. More likely, however, local multicollinearity is the problem. Try creating a thematic map for each explanatory variable. If the map reveals spatial clustering of identical values, consider removing those variables from the model or combining those variables with other explanatory variables to increase value variation. If, for example, you are modeling home values and have variables for both bedrooms and bathrooms, you may want to combine these to increase value variation or represent them as bathroom/bedroom square footage. Avoid using spatial regime dummy/binary variables, spatially clustering categorical/nominal variables, or variables with very few possible values when constructing GWR models.
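The VIF check described above can be reproduced outside the tool. The following is a minimal sketch, assuming the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one explanatory variable on all the others. The function name `vif` and the synthetic bedroom, bathroom, and lot size data are hypothetical and used only for illustration; this is not the OLS tool's own code.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n rows, k columns)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

# Hypothetical data: bathrooms track bedrooms closely, lot size does not.
rng = np.random.default_rng(0)
bedrooms = rng.integers(1, 6, size=200).astype(float)
bathrooms = bedrooms + rng.normal(0.0, 0.3, size=200)
lot_size = rng.normal(size=200)

vifs = vif(np.column_stack([bedrooms, bathrooms, lot_size]))
redundant = vifs > 7.5  # the example threshold mentioned in the text
```

In this made-up dataset the first two variables would be flagged as redundant, which is the situation where combining them into a single variable may help.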

Problems with local multicollinearity can also prevent the AIC and CV Bandwidth methods from resolving an optimal distance or number of neighbors. Try specifying a particular distance or a specific neighbor count, then examine the condition numbers in the Output feature class to see which features are associated with local multicollinearity problems (condition numbers larger than 30). You may want to remove these problem features temporarily while you find an optimal distance or number of neighbors. Keep in mind that results associated with condition numbers larger than 30 are not reliable.

Condition numbers indicate how sensitive a linear equation solution is to small changes in matrix coefficients. Results for individual features whose condition number is greater than 30 are excluded from the variance of the parameter estimates; this affects standard error diagnostics, global sigma, and standardized residuals.
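To make the idea concrete, here is a small sketch of how a condition number could be computed for one local neighborhood. The source does not state how the tool computes its condition number, so everything here is an assumption: the function name `local_condition_number` is hypothetical, the design matrix is weighted by the square root of the kernel weights, and columns are scaled to unit length before taking the ratio of the largest to smallest singular value.

```python
import numpy as np

def local_condition_number(X, weights):
    """Condition number of the weighted local design matrix (assumed recipe)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Add an intercept column and apply sqrt(weights) row by row.
    A = np.column_stack([np.ones(len(X)), X]) * np.sqrt(w)[:, None]
    # Scale columns to unit length so measurement units do not dominate.
    A = A / np.linalg.norm(A, axis=0)
    return np.linalg.cond(A)  # ratio of largest to smallest singular value

rng = np.random.default_rng(1)
w = np.ones(50)                    # hypothetical equal weights
varied = rng.normal(size=(50, 2))  # both variables vary locally
clustered = varied.copy()
clustered[:, 1] = 3.0              # second variable is identical everywhere

cond_ok = local_condition_number(varied, w)
cond_bad = local_condition_number(clustered, w)
```

A variable that takes a single value throughout a neighborhood is indistinguishable from the intercept, so `cond_bad` is far above 30 while `cond_ok` stays small. This mirrors the advice to look for spatial clustering of identical values.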

You can change this condition number threshold by editing the following registry value:

[HKEY_CURRENT_USER\Software\ESRI\GeoStatisticalExtension\DefaultParams\GWR]

"ConditionNumberThreshold"="40"

Parameter estimates and predicted values for GWR are computed using the following spatial weighting function: exp(-d^2/b^2), where d is the distance between the target feature and a neighboring feature and b is the bandwidth. This weighting function may differ among GWR software implementations. Consequently, results from the ESRI GWR tool may not exactly match results from other GWR software packages.
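The weighting function above can be sketched in a few lines, together with a weighted least squares fit for a single target feature. This is an illustrative sketch only, not the tool's implementation: the names `gaussian_weights` and `local_fit` are hypothetical, Euclidean distance is assumed, and the data are synthetic.

```python
import numpy as np

def gaussian_weights(d, b):
    """Spatial weight exp(-d^2 / b^2) for distance d and bandwidth b."""
    d = np.asarray(d, dtype=float)
    return np.exp(-(d ** 2) / (b ** 2))

def local_fit(coords, X, y, target, b):
    """Weighted least squares coefficients (intercept first) at one target."""
    d = np.linalg.norm(coords - target, axis=1)  # assumed Euclidean distance
    sw = np.sqrt(gaussian_weights(d, b))
    A = np.column_stack([np.ones(len(X)), X])
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary lstsq.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Synthetic example: y depends on x exactly, so every local fit agrees.
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 100.0, size=(30, 2))
x = rng.normal(size=30)
y = 2.0 + 3.0 * x

coef = local_fit(coords, x[:, None], y, coords[0], b=25.0)
```

A feature at distance 0 receives weight 1, and a feature exactly one bandwidth away receives weight exp(-1), roughly 0.37. Repeating `local_fit` with each feature as the target yields one set of coefficients per feature, which is the local model described at the top of this topic.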