* This problem is described in the file named "Moran's I in Stata.pdf".
* The user-written Stata command spatgsa (and its companion spatwmat, used
* below) must be installed if it is not already on your computer.
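* A minimal sketch of that installation check. "which" returns a nonzero
* code when a command cannot be found, and "findit" then lists the packages
* that provide it. Choosing and installing a package from the search results
* is left to the user.
capture which spatgsa
if _rc != 0 {
    display as error "spatgsa not found; searching for an installable package"
    findit spatgsa
}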
use http://www.ats.ucla.edu/stat/stata/faq/ozone.dta, clear
summarize lat lon
* Based on the minimum and maximum values of these variables, we can calculate
* the greatest Euclidean distance we might measure between two points in this
* dataset (the diagonal of the bounding box of the coordinates).
display sqrt((34.69012 - 33.6275)^2 + (-116.2339 - -118.5347)^2)
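* The same bound can be computed from the results that summarize stores,
* rather than typed by hand. A sketch; the local macro names latrange and
* lonrange are illustrative.
quietly summarize lat
local latrange = r(max) - r(min)
quietly summarize lon
local lonrange = r(max) - r(min)
display sqrt(`latrange'^2 + `lonrange'^2)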
* Knowing this maximum distance between two points in our data, we can
* generate a matrix based on the distances between points. In the spatwmat
* command, we name the weights matrix to be generated, indicate which of our
* variables are the x- and y-coordinate variables, and provide a range of
* distance values that are of interest in the band option. All of the
* distances are of interest in this example, so we create a band with an
* upper bound greater than our largest possible distance. If we did not
* care about distances greater than 2, we could indicate this in the band
* option.
spatwmat, name(ozoneweights) xcoord(lon) ycoord(lat) band(0 3)
* As described in the output, the command above generated a matrix with 32
* rows and 32 columns because our data include 32 locations. Each
* off-diagonal entry (i,j) in the matrix is equal to 1/(distance between
* point i and point j). Thus, the matrix entries for pairs of points that are
* close together are higher than for pairs of points that are far apart. If
* you wish to look at the matrix, you can display it with the "matrix list"
* command. With our matrix of weights, we can now calculate Moran's I and do
* hypothesis testing. The null hypothesis is that there is no spatial
* autocorrelation among the ozone measurements; the alternative hypothesis
* is that there is some spatial autocorrelation among them.
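* The weights matrix can be checked before it is used. A sketch, assuming
* spatwmat stored ozoneweights as an ordinary Stata matrix whose rows follow
* the observation order; the name wcorner is illustrative.
display rowsof(ozoneweights) " x " colsof(ozoneweights)
matrix wcorner = ozoneweights[1..5, 1..5]
matrix list wcorner
* If the entries are inverse distances as described, these two values should
* agree (up to rounding) for points 1 and 2:
display ozoneweights[1,2]
display 1/sqrt((lon[1] - lon[2])^2 + (lat[1] - lat[2])^2)
* With the matrix checked, the test itself follows.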
spatgsa av8top, weights(ozoneweights) moran
* Based on these results, we can reject the null hypothesis that there is
* zero global spatial autocorrelation present in the variable av8top.
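* For reference, Moran's I can be reproduced from its usual definition,
*   I = (n / S0) * (z'Wz) / (z'z),
* where z holds the deviations of av8top from its mean, W is the weights
* matrix, and S0 is the sum of all weights. A sketch, assuming av8top has no
* missing values and the rows of ozoneweights follow the observation order;
* the names av8top_dev, Z, ONES, NUM, DEN, and S0 are illustrative. The value
* displayed should be close to the statistic reported by spatgsa.
quietly summarize av8top
generate double av8top_dev = av8top - r(mean)
mkmat av8top_dev, matrix(Z)
matrix ONES = J(rowsof(ozoneweights), 1, 1)
matrix NUM = Z' * ozoneweights * Z
matrix DEN = Z' * Z
matrix S0 = ONES' * ozoneweights * ONES
display (rowsof(ozoneweights) / S0[1,1]) * NUM[1,1] / DEN[1,1]
drop av8top_dev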
* Variations
* BINARY MATRIX: If there exists some threshold distance d such that pairs
* with distances less than d are neighbors and pairs with distances greater
* than d are not, you can create a binary neighbors matrix with the
* spatwmat command (indicating "bin" and setting band to have an upper bound
* of d) and use this weights matrix for calculating Moran's I. We could do this
* for d = 1:
spatwmat, name(ozoneweights) xcoord(lon) ycoord(lat) band(0 1) bin
* Using this binary weight matrix we can again calculate Moran's I to test
* for the presence of spatial autocorrelation among the defined "neighbors."
* In this example, the binary formulation of distance yields a similar result.
* We can reject the null hypothesis that there is zero spatial autocorrelation
* present in the variable av8top at alpha = 0.05.
spatgsa av8top, weights(ozoneweights) moran
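* With a binary matrix, each row sum is the number of neighbors a location
* has within distance d. A sketch for checking that no location is left
* without neighbors at d = 1, assuming the binary matrix holds unstandardized
* 0/1 entries; the names ONES1 and NBRS are illustrative.
matrix ONES1 = J(colsof(ozoneweights), 1, 1)
matrix NBRS = ozoneweights * ONES1
matrix list NBRS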
* USING AN EXISTING MATRIX: If you have calculated a weights matrix according
* to a metric other than those available in spatwmat and wish to use it in
* calculating Moran's I, spatwmat allows you to read in a Stata dataset of
* the required dimensions and format it as a weights matrix that can be
* used by spatgsa. If altweights.dta is a dataset with 32 columns and 32
* rows, it could be converted to a weights matrix "aweights" to be used in
* analyzing av8top:
* spatwmat using "C:\altweights.dta", name(aweights)
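* A sketch of how such a dataset might be produced, here using inverse
* squared distance as the alternative metric. The names altw, w, and
* altweights_example.dta are illustrative, and it is an assumption that a
* dataset of this shape (one variable per column, one observation per row,
* in the original observation order) is what spatwmat expects. Coincident
* points would produce missing weights.
mata:
xy = st_data(., ("lon", "lat"))
n = rows(xy)
W = J(n, n, 0)
for (i = 1; i <= n; i++) {
    for (j = 1; j <= n; j++) {
        if (i != j) {
            W[i, j] = 1 / ((xy[i, 1] - xy[j, 1])^2 + (xy[i, 2] - xy[j, 2])^2)
        }
    }
}
st_matrix("altw", W)
end
* Store the matrix as a dataset with one variable per column (w1, w2, ...),
* then return to the ozone data.
preserve
drop _all
svmat double altw, names(w)
save altweights_example.dta, replace
restore
* The saved file could then be read in the same way as above:
* spatwmat using altweights_example.dta, name(aweights)
* spatgsa av8top, weights(aweights) moran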