Using Geographically Weighted Regression to Analyze Spatial Data
Methods: Geographically weighted regression (GWR) is an approach to analyzing spatial data that does not assume the relationship between variables is constant across a study area. Rather than the single set of global coefficients produced by ordinary least squares (OLS) regression, GWR estimates local coefficients, with accompanying significance tests, for each variable at each location. In other words, GWR produces a separate set of regression parameters for each observation, in this case each census tract.
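To make the idea of local estimation concrete, the following is a minimal sketch of GWR as one distance-weighted least-squares fit per observation. It is an illustration only, not the software or settings used in this study: the Gaussian kernel, the fixed bandwidth, the function name gwr_fit, and the synthetic data are all assumptions.

```python
import numpy as np

def gwr_fit(coords, X, y, bandwidth):
    """Return one row of coefficients (intercept first) per observation.

    Assumes a Gaussian kernel with a fixed bandwidth; real GWR software
    typically also offers adaptive kernels and bandwidth selection.
    """
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])          # prepend intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)   # distance to location i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # nearer points weigh more
        XtW = Xd.T * w                                   # X'W without forming W
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)    # local weighted least squares
    return betas

# Hypothetical data: the slope of y on x drifts from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(100, 2))
x = rng.normal(size=100)
y = 0.5 + (1.0 + 0.2 * coords[:, 0]) * x + rng.normal(scale=0.1, size=100)

betas = gwr_fit(coords, x[:, None], y, bandwidth=2.0)
print("local slope range:", betas[:, 1].min(), betas[:, 1].max())
```

A global OLS fit would report a single slope for these data, whereas the local fits recover a slope that changes across the study area, which is the kind of heterogeneity GWR is designed to expose.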
The problem under study has concerned health professionals for some time: the siting of tobacco outlets in low-income and minority (African American and Hispanic) neighborhoods. We replicated previous studies that had used non-spatial statistics to investigate this issue and had found a significant relationship among minority status, median household income, and density of tobacco outlets.
Results: The geographically weighted regression produced an R² of .68 with an Akaike Information Criterion (AIC) of 201.419. The local R² values ranged from .29 to .70. The geographic variation in these values demonstrates how the explanatory power of the model differs across the census tracts. For example, while the global OLS R² was .584, the local R² values in some of the census tracts with the highest percentages of African Americans were as low as .294.
Both percent African American and median household income were negatively related to the dependent variable in all census tracts. However, the negative relationship was significant in only 42.5% of the tracts for median household income, compared with 93.75% of the tracts for percent African American.
We analyzed the residuals for spatial autocorrelation and found none. We next applied the Benjamini-Hochberg correction to the local significance tests for every variable except percent Hispanic, because the procedure is unnecessary when no test is significant before correction. After the correction, percent African American remained significantly negatively related in 76% of the census tracts and median household income in 16% of the tracts. Percent commercial was significantly positively related in 59% of the census tracts. The most powerful predictors of tobacco outlet density were percent African American and population density, the latter significant in all but one tract (99%).
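For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch follows. It assumes the correction is applied to a vector of p-values at a false discovery rate of 0.05; how the tests were grouped in this study (for example, per variable across tracts) is not stated here, and the function name and example p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean array marking which tests remain significant."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                           # indices of p-values, smallest first
    thresholds = alpha * np.arange(1, m + 1) / m    # rank-dependent cutoffs (k/m) * alpha
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])           # largest rank meeting its cutoff
        reject[order[:k + 1]] = True                # reject that test and all smaller p-values
    return reject

# Hypothetical p-values: four are below 0.05 before correction.
example_p = [0.012, 0.045, 0.021, 0.004, 0.5]
print(benjamini_hochberg(example_p))
```

In this made-up example the test with p = 0.045 is significant at the conventional 0.05 level but does not survive the correction, which mirrors the way the share of significant tracts can fall once multiple testing is taken into account.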
Thus, contrary to previous studies, this research, using geographically weighted regression, found no relationship between tobacco outlet density and percent Hispanic, and found negative relationships with two variables: percent African American and median household income. Significant positive relationships were found with population density and land use.
Implications for practice or policy: Researchers in the previous studies made policy recommendations based on findings obtained using non-spatial statistics. Researchers should take care when making public health recommendations about interventions to prevent tobacco use, or to address any other public health issue, without first examining more closely the local heterogeneity of the spatial features involved.