Chapter 14 - Query, measurement, and transformation

14.1 Introduction: what is spatial analysis?

This chapter begins a section of the book that reviews the scientific core of GIS: the analysis of data beyond its visualization. The first paragraph of this section is worth thinking about: the skills and habits of thought you learn in GIS can be applied to photographic, astronomical, and image analysis as well as to "mesoscale" human-oriented investigations.

The John Snow example is a perennial favorite of anyone trying to establish the historical lineage of GIS as well as of medical geographers pointing to the first explicit use of a spatially referenced database in epidemiology.

The class assignments have already taken you fairly deeply into GIS analysis, although GTKAGIS postpones analysis to Chapter 9. In any case it is important that you relate the sections of this chapter to operations we are doing in the assignments and exercises; this is one of the advantages of working through these two texts in parallel. 

For your reference, here is an outline of the upcoming analytical topics of this chapter and the next:

14.3 MEASUREMENTS
  Distance & length
  Shape
  Slope and aspect

14.4 TRANSFORMATIONS
  Buffering
  Point in polygon
  Overlay
  Interpolation
    Thiessen
    Inverse Distance Weighting (IDW)
    Kriging
    Density estimation

15.2 DESCRIPTIVE SUMMARIES
  Centers
  Dispersion
  Nearest-neighbor
  Autocorrelation
  Fragmentation

14.2 Queries

You should be familiar with each of the catalog, map, and table views of data, but consider how they give different views of and access to the constituent databases, layers, and rows and columns. We have also looked at histograms, and you should be aware that ArcMap > Tools > Graphs > Create gives you access to a variety of statistical visualizations, with some of the "dynamic link" features shown in Figure 14.8. A query is usually a "Where is...?" question.

14.3 Measurements

It is always helpful to think of the DATA to KNOWLEDGE hierarchy that GIS analysis affords. This logic can be applied even to so simple a "toy" example as Figure 14.9 if you think of transforming four data points into an informational mapped polygon whose area we can then know.
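
If you would like to see the four-points-to-area step in code, here is a minimal sketch. It uses the "shoelace" formula, which is one common way to get a polygon's area from its vertex coordinates; I am assuming it is equivalent to the construction in Figure 14.9 rather than quoting the book. The function name and the coordinates are hypothetical.

```python
def polygon_area(vertices):
    """Return the area of a simple polygon given as (x, y) pairs in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    # The sign depends on vertex order (clockwise or not), so take abs().
    return abs(total) / 2.0

# Hypothetical coordinates, NOT taken from Figure 14.9.
corners = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(corners))  # 12.0
```

Four coordinate pairs (data) become a polygon (information) whose area we now know.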

Study the equation in Section 14.3.1 for a moment and appreciate not only its power but that it can be applied to data in 3D and indeed any integral dimension. In fact it is used as a metric to explain "distance" between observations in multivariate space. I raise this point not only as a mathematical insight but also to reinforce that, although we are preoccupied with the data of geography, we are also learning about the geography of data (think about that!).
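
To make the "any dimension" point concrete, here is a small sketch. I am assuming the equation in Section 14.3.1 is the familiar Pythagorean (straight-line) distance; the function name and the sample values are my own.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with any number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two map coordinates (2D).
print(euclidean_distance((0, 0), (3, 4)))        # 5.0
# The same function in 3D.
print(euclidean_distance((0, 0, 0), (2, 3, 6)))  # 7.0
# Hypothetical observations described by four attributes each: "distance" in data space.
print(euclidean_distance((1.0, 0.5, 2.0, 3.0), (1.0, 0.5, 2.0, 3.0)))  # 0.0
```

Nothing in the function knows whether the coordinates are eastings and northings or attribute values, which is exactly the "geography of data" idea.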

The section also introduces the term "polyline," which ESRI uses as well. To inspect such an object, look at the flight_path data from GTKAGIS Chapter 3, which is ___ lines linking ___ cities. If the number of lines equals the number of points, we have a "cycle" and Earhart would have made it!
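
Here is a hedged sketch of the lines-versus-points bookkeeping. The function names and coordinates are hypothetical, and the points are treated as plain planar coordinates, which would not be appropriate for real flight distances on the globe.

```python
import math

def segment_count(points, closed=False):
    """Number of line segments in a polyline; a closed one returns to its start."""
    n = len(points)
    if n < 2:
        return 0
    return n if closed else n - 1

def polyline_length(points, closed=False):
    """Sum of straight segment lengths (planar coordinates assumed)."""
    pts = list(points)
    if closed and len(pts) > 1:
        pts.append(pts[0])  # add the segment back to the first point
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

# Hypothetical stops, not the flight_path data.
stops = [(0, 0), (3, 0), (3, 4)]
print(segment_count(stops), polyline_length(stops))                            # 2 7.0
print(segment_count(stops, closed=True), polyline_length(stops, closed=True))  # 3 12.0
```

An open path always has one fewer segment than it has points; only when the path closes do the two counts match.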

The word "metric" is becoming quite popular and is often used to refer to anything measured, but its more rigorous definition is some kind of constructed variable that represents a phenomenon. In my environmental work there are frequent references to "landscape metrics" such as patch size and shape: fractal dimension, diversity, heterogeneity, etc. I'm sure your own area of work/study has similar concepts. Vice President Al Gore liked to invent them for measuring government efficiency (and no, he didn't invent the al-gor-ithm, but look up the etymology).

The subject of shape is a rich area of fractal research. I had a George Mason University graduate student write an MS thesis on the fractal analysis of political districts (see Box 14.3). The simplest and most compact shape is a circle (which is why bubbles are spheres), but for political districts hexagons would be the next most desirable shape - if people were spread uniformly on the land!
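
One way to put a number on "compact" is to compare a shape's area with the area of a circle that has the same perimeter. The sketch below uses the ratio 4πA/P², which equals 1 for a circle and falls toward 0 for contorted shapes. This is only one of several compactness indices and may be scaled differently from the measure in the book; the function names are my own.

```python
import math

def compactness(area, perimeter):
    """Ratio 4*pi*A / P**2: 1.0 for a circle, smaller for less compact shapes."""
    return 4.0 * math.pi * area / perimeter ** 2

def regular_polygon(n_sides, side=1.0):
    """Return (area, perimeter) of a regular polygon with the given side length."""
    area = n_sides * side ** 2 / (4.0 * math.tan(math.pi / n_sides))
    return area, n_sides * side

for name, n in [("triangle", 3), ("square", 4), ("hexagon", 6)]:
    print(name, round(compactness(*regular_polygon(n)), 3))

# A circle of radius 1, for comparison.
print("circle", round(compactness(math.pi, 2.0 * math.pi), 3))
```

Of the three regular shapes that tile the plane, the hexagon scores closest to the circle, which is the point made above about idealized districts.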

In GIS modeling, slope and aspect are commonly used as criteria for location (see GTKAGIS Chapter 20), and we've also seen a related concept in the shaded relief data for the Horn of Africa in Chapter 5.
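
For a feel of what slope and aspect mean numerically, here is a simplified sketch that uses only the four nearest neighbors of a raster cell. GIS packages generally use a larger window and their own conventions, so treat this as an illustration under my assumptions (x increases to the east, y to the north, aspect reported as the compass bearing of the downhill direction), not as what any particular tool computes. The function name and elevations are hypothetical.

```python
import math

def slope_aspect(z_north, z_south, z_east, z_west, cell_size):
    """Return (slope_degrees, aspect_degrees) from a cell's four neighbors.

    Aspect is the compass bearing of the downhill direction
    (0 = north, 90 = east); it is None on flat ground.
    """
    # Central differences: elevation change per unit distance toward east and north.
    dz_dx = (z_east - z_west) / (2.0 * cell_size)
    dz_dy = (z_north - z_south) / (2.0 * cell_size)
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None
    # Downhill is opposite the gradient; atan2(east, north) gives a compass bearing.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Hypothetical elevations in meters on a 10 m grid: the ground rises toward the north.
print(slope_aspect(z_north=110.0, z_south=90.0, z_east=100.0, z_west=100.0, cell_size=10.0))
```

Ground that rises toward the north at one meter per meter comes out as a slope of about 45 degrees with a south-facing aspect of 180.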

14.4 Transformations

In the material world we and other animals are always making judgments about spatial relationships (is that apple attached to this tree?), but it has taken a lot of difficult theoretical, algorithmic, and programming work to implement these operations in GIS. Much of this work has resulted in hundreds of various kinds of transformations, some of which I've illustrated in a table referenced earlier.

Among the simplest transformations (but not always easy to implement for complex data) is buffering, which is used extensively in GTKAGIS Chapter 12. Note that almost any operation on vector data can also be done in the raster domain. In fact, many of the operations in ArcGIS > SpatialAnalyst take advantage of the speed and flexibility of raster transformations, as simply illustrated in Figure 14.18 (which you should compare to Figure 6.9).
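
To see why buffering is so straightforward in the raster domain, consider this toy sketch: every cell whose center lies within a given distance of a "source" cell is switched on. It is a brute-force illustration with hypothetical names and data, not a description of how Spatial Analyst works internally.

```python
import math

def raster_buffer(grid, radius, cell_size=1.0):
    """Return a 0/1 grid marking cells within `radius` of any nonzero cell."""
    rows, cols = len(grid), len(grid[0])
    sources = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c]]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for sr, sc in sources:
                # Distance between cell centers, converted to map units.
                if math.hypot(r - sr, c - sc) * cell_size <= radius:
                    out[r][c] = 1
                    break
    return out

# Hypothetical 5 x 5 raster with a single feature cell in the middle.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
for row in raster_buffer(grid, radius=1.5):
    print(row)
```

There is no geometry to construct here, only a distance test per cell, which is why the raster version is easy to write even for complicated features.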

The first example in Section 14.4.2, counting disease events among the population at risk in a region, is the foundation of epidemiology and results in measures of prevalence and incidence. If the pumps in John Snow's London had exclusive service areas he could have used a GIS to make a choropleth of incidence and easily focus on the source of the contaminated water.
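
The counting step can be sketched with the classic "ray casting" test for whether a point falls inside a polygon. This is a minimal version that ignores edge cases such as points lying exactly on a boundary; the district outline, case locations, and population figure are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going east from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside

def count_in_polygon(points, polygon):
    """Number of points that fall inside the polygon."""
    return sum(1 for x, y in points if point_in_polygon(x, y, polygon))

# Hypothetical district, case locations, and population at risk.
district = [(0, 0), (4, 0), (4, 4), (0, 4)]
cases = [(1, 1), (3, 3), (5, 5)]
population = 1000
n_cases = count_in_polygon(cases, district)
print(n_cases, n_cases / population)  # 2 0.002
```

Dividing the count by the population at risk gives the kind of rate you would shade in a choropleth.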

To test your understanding of the polygon overlay problem (and to refresh your memory of set theory), classify the 10 regions created in Figure 14.20 into 4 kinds. These operations cannot be done on the attribute data alone, but only when the features are topologically defined and related in space. And keeping Shrek from walking through Donkey requires gigabytes of RAM.

All of these operations are highly scale-dependent. If a database has many small features, an overlay can produce a very large number of polygons, and the count grows quickly with each layer added. If it is generalized, the operation will be much faster, although less precise. This is illustrated in Table 14.1, which is not easy to understand unless you draw a vertical line between columns 6 and 7, separating the 5 individual layers from the 3 examples of overlay. To see the kind of data being referred to, see Figures 12.8 and 15.13.

Spatial interpolation transforms points into surfaces and objects into fields. Our Assignment #2 uses density estimation (Section 14.4.4.4). Each of the other sub-sub-sub sections is worth examining, at least for the graphics, but Kriging (Section 14.4.4.3) is only for the adventurous. To browse some excellent examples of this, look at 
C:\Program Files\ArcGIS\Documentation\Geostatistical_Analyst_Tutorial.pdf
from which Figures 14.23 and 30 are taken. This is a very technical field that nevertheless can result in quite elegant - and sometimes misleading - transformations.
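
For the less adventurous, the two simplest interpolators in the outline fit in a few lines. The sketch below is my own illustration with hypothetical sample values: the Thiessen estimate simply takes the value of the nearest sample, while Inverse Distance Weighting averages all samples with weights of 1/distance^power (a power of 2 is a common default, and an assumption here).

```python
import math

def thiessen(x, y, samples):
    """Value of the nearest sample; samples are (x, y, value) tuples."""
    nearest = min(samples, key=lambda s: math.hypot(x - s[0], y - s[1]))
    return nearest[2]

def idw(x, y, samples, power=2.0):
    """Inverse Distance Weighted estimate at (x, y)."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical measurements at two locations.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(thiessen(0.5, 0.0, samples))  # 10.0
print(idw(1.0, 0.0, samples))       # 15.0
print(idw(0.5, 0.0, samples))
```

Thiessen produces a stepped surface that jumps at the polygon boundaries, while IDW produces a smooth one that is pulled toward nearby samples. Neither says anything about how uncertain the estimate is, which is part of what Kriging adds.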