Chapter 8 - Geographic data modeling

This chapter presents the technical core of GIS. You may already be familiar with spreadsheets and databases - even of the relational type - but GIS takes data into the spatial domain, and, because "everything is spatial," the potential for deeper understanding has been enriched.

NOTE: This is a deep chapter, so I've tried to highlight those sections and exhibits that I think give you the essential ideas we'll need to know for our explanations. Skim over sections I've not mentioned so that you can refer back to them later if necessary.

This chapter is an overview of:

8.1 Introduction

At its simplest level science - including GIScience - is a process that circularly links DATA and MODELS: we collect data to test (deduce the truth of) models and we formulate models in order to determine what data to collect.

I like to think of the REAL WORLD as the infinitely complex subject of our analysis about which we collect DATA based on MODELS of how we think the world works. Although I can no longer find the source, in my PhD thesis is the following quote: 
"Models are to be used but not believed."
--Henri Theil
So think of the following system, which builds up from "reality":

RESULTS
COMPUTER MODEL
CONCEPTUAL MODEL
 T H E   R E A L   W O R L D

with GIS in the middle. Figure 8.2 may help with this. We are continually checking our results with what's going on in the world, hopefully rebuilding - or merely tweaking - the model to be more reliable.

Anyway, you've already had quite a bit of experience with data models; consider how we modeled the pattern of African national density with about half a megabyte of data (countries.shp and associated files) representing lines and polygons. It's a heroic simplification, yet adequate for our purposes, and adding more data to the African map gave the problem an even greater degree of realism...but it's still not "reality."

8.2 GIS data models

Figure 8.3 is a picture of modeling. Look how the relatively simple street network in B is abstracted from the hundreds of structures visible in the same area in A. Even clearer is the distinction between the map/model in Figure 8.16 and a real neighborhood of houses, pipes, etc. The book is full of such models - flip through the pages and find a picture or graphic that represents an abstraction of the real world.

Rasters

This section begins with a review of early CAD and computer cartography, but now that we all have digital cameras, perhaps the simplest data model to understand is rasters (Figure 8.3). I once asked Mike Goodchild (the book's author on the right in Figure 1.15!) whether GIS had yet shown us new things in the world, as had the microscope (cells) or the telescope (galaxies) and he cited the USGS digital elevation model (DEM) of the US, which revealed in stark clarity the "skin" of the nation. Box 8.1 discusses methods of compressing these data, but an even more profound modeling achievement is the transformation process:

REAL WORLD  →  SURVEY DATA  →  CONTOUR LINES  →  DIGITIZED POLYLINES  →  RASTER ELEVATIONS

that ultimately corresponds much more closely to what we would see if the continent were stripped of its clouds, vegetation, etc. What you see in the figure is a "shaded relief" model used to give you the feeling that the Sun is shining - impossibly - from the northwest. You also saw this in GTKAGIS Chapter 5.
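To make "shaded relief" concrete, here is a minimal sketch of how a brightness value could be computed for each cell of an elevation raster. It is not the method used to produce the USGS image; it simply takes the dot product of each cell's surface normal with a Sun direction. The function name hillshade, the 45-degree Sun altitude, and the tiny 3x3 grids are my own assumptions for illustration; only the northwest azimuth (315 degrees) comes from the text.

```python
import math

def hillshade(dem, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Return brightness values in [0, 1] for the interior cells of dem.

    dem is a list of rows; row 0 is the northern edge, column 0 the western edge.
    Edge cells stay None because they lack neighbours on every side.
    """
    az = math.radians(azimuth_deg)    # measured clockwise from north
    alt = math.radians(altitude_deg)  # height of the Sun above the horizon
    # Unit vector pointing from the ground toward the Sun (x = east, y = north, z = up).
    sun = (math.sin(az) * math.cos(alt), math.cos(az) * math.cos(alt), math.sin(alt))

    rows, cols = len(dem), len(dem[0])
    shade = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences; y (north) increases as the row index decreases.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)
            # Upward-pointing normal of the surface z = f(x, y).
            nx, ny, nz = -dz_dx, -dz_dy, 1.0
            norm = math.sqrt(nx * nx + ny * ny + nz * nz)
            brightness = (nx * sun[0] + ny * sun[1] + nz * sun[2]) / norm
            shade[r][c] = max(0.0, brightness)  # slopes facing away are fully dark
    return shade

# Hypothetical 3x3 elevation grids (metres) with 1 m cells.
faces_nw = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]  # rises toward the southeast, so it faces northwest
faces_se = [[4, 3, 2], [3, 2, 1], [2, 1, 0]]  # rises toward the northwest, so it faces southeast
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]

for name, grid in (("faces NW", faces_nw), ("flat", flat), ("faces SE", faces_se)):
    print(name, round(hillshade(grid)[1][1], 2))
```

With these made-up grids the centre cell comes out at roughly 0.99 for the slope facing the Sun, 0.71 for flat ground, and 0 for the slope facing away, which is the light-and-shadow pattern that makes relief readable.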

Task: In Figure 8.6, estimate the increase in scale of the zoomed window versus the original raster.

Task: If you have the bandwidth, download one of the higher-resolution images from the US DEM website (link above) and zoom in on a region you know. First you'll see the area in greater detail, but eventually you can see the individual gray-scale pixels that make up the data.

Features

More sophisticated is the so-called vector data model that abstracts the world into geometric objects (the book as well as ArcGIS calls them features) explicitly located in space. Figure 8.7 is a toy example that clearly shows how three kinds of objects are created in 2-dimensional (xy) space. Then the points are connected to make lines and polylines (multiple lines) and the lines (if they form a cycle) can enclose polygons. Though it's unlikely you will ever be confused about this, make a mental bookmark of this figure.
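If you like to see ideas as code, the point-line-polygon build-up can be sketched in a few lines of Python. This is only an illustration, not how ArcGIS stores features: the function names are mine, and apart from the point (2, 4) the coordinates are invented. It needs Python 3.8 or later for math.dist.

```python
import math

# A point is an (x, y) pair; a polyline is an ordered list of points;
# a polygon is a polyline whose last point repeats the first (a closed ring).
points = [(2, 4), (6, 4), (6, 1), (2, 1)]  # only (2, 4) comes from the text

polyline = [points[0], points[1], points[2]]
polygon = points + [points[0]]  # repeat the first point to close the ring

def length(line):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))

def is_closed(line):
    """A ring needs at least three distinct corners plus the repeated start."""
    return len(line) >= 4 and line[0] == line[-1]

def area(ring):
    """Unsigned area of a simple closed ring (shoelace formula)."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice) / 2

print(length(polyline))                         # length of the two-segment polyline
print(is_closed(polyline), is_closed(polygon))
print(area(polygon))                            # area of the 4 x 3 rectangle
```

Notice that nothing new is stored when we move from points to lines to polygons; each level is just the level below, arranged in order.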
 
Task: Add gridlines to the first frame to actually see that the points are located at e.g. (x1, y1) = (2, 4) and so forth. Just like Algebra I!

NOTE: You might wonder why GIS is stuck with 0-, 1-, 2- or 3-D objects (as in CAD and movie animations). That's in part what fractals are about...

Once the features are modeled they have to be able to relate:
  • to one another (which highways in Figure 8.11 intersect the Grande Raccordo Anulare?)
  • or to other kinds of features (what street is the hydrant feature on in Figure 8.4?)

This problem goes beyond geometry to use another branch of mathematics: topology. And once the features relate to one another in a data model we can inquire (query) how they interact with one another by asking the software the above questions - precisely, reliably and in great numbers - rather than just looking at the map. Other examples of these queries are listed in Section 13.2.1.
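Here is a rough sketch of what asking the software "which highways intersect the ring road?" might look like. A true topological data model would store these connections explicitly; this sketch instead recomputes them from coordinates with a standard orientation test for crossing segments. The square "ring" and the three highways are hypothetical and bear no relation to the real roads in Figure 8.11.

```python
def orient(a, b, c):
    """Positive if a->b->c turns left, negative if right, zero if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def on_segment(a, b, p):
    """True if p (already known to be collinear with a-b) lies between a and b."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))

def segments_intersect(p1, p2, q1, q2):
    d1, d2 = orient(q1, q2, p1), orient(q1, q2, p2)
    d3, d4 = orient(p1, p2, q1), orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # each segment straddles the line through the other
    # Touching or overlapping cases: an endpoint lies on the other segment.
    return ((d1 == 0 and on_segment(q1, q2, p1))
            or (d2 == 0 and on_segment(q1, q2, p2))
            or (d3 == 0 and on_segment(p1, p2, q1))
            or (d4 == 0 and on_segment(p1, p2, q2)))

def polylines_intersect(a, b):
    """True if any segment of polyline a meets any segment of polyline b."""
    return any(segments_intersect(a1, a2, b1, b2)
               for a1, a2 in zip(a, a[1:])
               for b1, b2 in zip(b, b[1:]))

# Hypothetical data: a square ring road and three highways.
ring = [(-2, -2), (2, -2), (2, 2), (-2, 2), (-2, -2)]
highways = {
    "A": [(0, 0), (5, 0)],  # leaves the centre and crosses the ring
    "B": [(3, 3), (6, 6)],  # stays entirely outside the ring
    "C": [(0, 0), (0, 1)],  # stays entirely inside the ring
}
crossing = sorted(name for name, line in highways.items()
                  if polylines_intersect(line, ring))
print(crossing)
```

The point is less the arithmetic than the fact that the same question can be asked of thousands of features at once, with the same answer every time.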

    Figures 8.8-8.10 neatly summarize the core GIS vector data model, but may be confusing at first read. Don't worry about the details because fully understanding the model isn't essential to your work. But try to get a feel for the deconstruction of reality that is necessary, and if you have an analytical bent, by all means try to figure them out! Suffice it to say that GIS is the elaboration of geometric entities, topologically related in space and linked to databases that keep track of their attributes.

    Next, the TIN model can take us into the third (vertical) dimension, as illustrated in Figure 8.12, which shows a "wire frame," then symbolized (colored), and finally draped with an image of Death Valley. But this is actually only 2½D; computer graphics and more sophisticated GIS data models allow us to model volumes in 3D. Imagine another frame in Figure 8.7 in which polygons become the faces of volumes - assemble enough of them and you get an alien spaceship! Task: Look at Death Valley in Google Earth and see if you can find the area shown in the figure.
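One way to see why a TIN is "only 2½D" is that every (x, y) location gets exactly one elevation, read off the flat triangle that contains it. The sketch below does this with barycentric weights. The two-triangle TIN and its elevations are invented for illustration, and the function names are my own rather than anything from the book or from ArcGIS.

```python
def interpolate_z(triangle, x, y):
    """Elevation at (x, y) on the plane through three (x, y, z) vertices.

    Returns None when (x, y) falls outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: the three vertices are collinear")
    # Barycentric weights: each is 1 at its own vertex and 0 along the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3

def tin_elevation(tin, x, y):
    """Search the triangles in order and return the first elevation found."""
    for triangle in tin:
        z = interpolate_z(triangle, x, y)
        if z is not None:
            return z
    return None  # (x, y) lies outside the whole TIN

# Hypothetical TIN: a 10 x 10 square split along one diagonal, elevations in metres.
tin = [
    ((0, 0, 0), (10, 0, 100), (0, 10, 50)),
    ((10, 0, 100), (10, 10, 20), (0, 10, 50)),
]
print(tin_elevation(tin, 2, 2))    # inside the first triangle
print(tin_elevation(tin, 8, 8))    # inside the second triangle
print(tin_elevation(tin, 20, 20))  # outside the TIN, so None
```

A real TIN would use a spatial index instead of checking every triangle in turn, but the one-z-per-location idea is the same. A cave or an overhang, where one (x, y) has several elevations, cannot be represented this way, which is why true 3D needs volumes.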

    Underlying all GIS data structures are simple geometric, algebraic, and topological models that have been well developed for hundreds of years. For example, the network of Figure 8.13 is a graph composed of vertices (points), edges (lines), and faces (polygons) whose numbers must conform to Euler's formula:
        vertices - edges + faces = 2
    This is not a natural law or a statistical approximation; it simply cannot be violated. Check it for the TIN, and don't forget to count the outside as a face! The fact that GIS data must conform to such simple rules helps in verifying that a data structure is correctly specified.
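Euler's formula is easy to check by machine. The sketch below counts vertices, edges, and faces for a small, made-up triangulation (not the one in Figure 8.13) and then shows how a data error breaks the count. It assumes the triangles form a single connected patch with no holes, which is the situation the formula covers; the function name and vertex labels are hypothetical.

```python
def count_vef(triangles):
    """Return (V, E, F) for a triangulation given as triples of vertex ids.

    F is the number of triangles plus one for the unbounded outside face.
    """
    vertices = {v for triangle in triangles for v in triangle}
    # frozenset makes edge (a, b) identical to edge (b, a), so shared edges count once.
    edges = {frozenset(pair)
             for a, b, c in triangles
             for pair in ((a, b), (b, c), (c, a))}
    faces = len(triangles) + 1
    return len(vertices), len(edges), faces

def satisfies_euler(triangles):
    v, e, f = count_vef(triangles)
    return v - e + f == 2

# Hypothetical TIN: a square with a centre point, split into four triangles.
tin = [("nw", "ne", "mid"), ("ne", "se", "mid"),
       ("se", "sw", "mid"), ("sw", "nw", "mid")]
print(count_vef(tin), satisfies_euler(tin))

# A data error: the first triangle was accidentally stored twice.
broken = tin + [tin[0]]
print(count_vef(broken), satisfies_euler(broken))
```

The clean TIN gives 5 - 8 + 5 = 2. The duplicated triangle adds a face without adding any vertices or edges, so the sum becomes 3 and the check fails, flagging the structure as incorrectly specified.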

    The object model of Section 8.2.4 is somewhat beyond our needs. Although I've used it in my R programming, you won't need to understand much about it. Suffice it to say that to the geeks "everything is an object." But study Figure 8.18 to get a feel for how the paradigm, once mastered and used, might tame a very complicated problem.
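For the curious, here is a toy illustration of "everything is an object." The classes Feature, Pipe, and Hydrant and their attributes are hypothetical; they are not the classes of Figure 8.18, just a sketch of how an object model bundles attributes, behavior, and relationships together.

```python
import math
from dataclasses import dataclass

@dataclass
class Feature:
    """Base class: every feature has an identifier and can describe itself."""
    feature_id: int

    def describe(self):
        return f"{type(self).__name__} #{self.feature_id}"

@dataclass
class Pipe(Feature):
    """A pipe inherits the identifier and adds its own attributes and behaviour."""
    diameter_mm: float
    length_m: float

    def volume_m3(self):
        radius_m = self.diameter_mm / 2000  # millimetres of diameter to metres of radius
        return math.pi * radius_m ** 2 * self.length_m

@dataclass
class Hydrant(Feature):
    """A hydrant holds a reference to the pipe it is attached to (a relationship)."""
    attached_pipe: Pipe

    def describe(self):
        return f"{super().describe()} on {self.attached_pipe.describe()}"

main = Pipe(feature_id=1, diameter_mm=200, length_m=50)
hydrant = Hydrant(feature_id=2, attached_pipe=main)
print(hydrant.describe())
print(round(main.volume_m3(), 3))
```

The payoff of the paradigm is that the hydrant "knows" which pipe it belongs to, so questions about the network can be answered by following object references rather than by searching tables.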

    8.3 Example of a water-facility object data model

    Personally I'd like you to recognize the sophisticated "network" (more precisely graph) data model, which relates to my own research area. Although it's shown in a static form in e.g. Figure 8.17 you see it every time you use a web GIS to get directions. How do you think Google gets you from the hotel to the beach except by navigating a network (graph!) of links and nodes?
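I don't know what Google actually runs, but the textbook way to find a route through a graph of nodes and links is Dijkstra's shortest-path algorithm, sketched here as a stand-in. The place names and travel times are invented for illustration.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on {node: {neighbour: cost}} with non-negative costs.

    Returns (total_cost, [nodes along the route]), or (inf, []) if unreachable.
    """
    best = {start: 0.0}      # cheapest known cost to reach each node
    previous = {}            # the node we arrived from on that cheapest route
    queue = [(0.0, start)]   # priority queue ordered by cost so far
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return float("inf"), []
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    return best[goal], route[::-1]

# Hypothetical street network; link weights are travel times in minutes.
roads = {
    "hotel": {"market": 4, "bridge": 10},
    "market": {"hotel": 4, "bridge": 3, "beach": 12},
    "bridge": {"hotel": 10, "market": 3, "beach": 5},
    "beach": {"market": 12, "bridge": 5},
}
minutes, route = shortest_path(roads, "hotel", "beach")
print(minutes, route)
```

In this toy network the quickest trip takes 12 minutes via the market and the bridge, even though the hotel has a direct link to the bridge. That is exactly the kind of answer you cannot reliably get by eyeballing a map.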

    8.4 Geographic data modeling in practice

    Environmental modeling presents a tremendous challenge to GIScience because it asks us to represent the various Earth spheres (actually "shells" of litho, hydro, bio, anthro, etc.) that are constantly changing and interacting at multiple scales. Think for a moment of an environmental topic that interests you and ponder how you might model it. This exercise might lead to an interesting final project - or even a career!