Geometry vs Geography; or How We Measure Area and Distance in a Spatial Database

This is part of a series of posts I’m doing on Census Bureau Demographic Data for Developers based on my experience creating a database of Census Data as part of my summer internship at Patch.com.  

Here’s the bottom line: if you’re measuring over relatively short distances, or comparing areas rather than returning an actual area measurement, use a geometry. If you’re measuring over relatively long distances, especially across the International Date Line, use a geography. Read on for a discussion of why.

I think everyone has a basic grasp of the problems of representing a 3-dimensional sphere (such as the Earth) on a 2-dimensional map. Geographical features are distorted, and on many common projections the distortion grows as you move farther from the Equator.

The same is true for data shown (or projected) in a GIS program such as Esri’s ArcGIS or QGIS. Different projections try to limit this error by being accurate for a particular region of the globe. Census data is typically distributed unprojected, as latitude and longitude referenced to the North American Datum of 1983 (NAD83), which is a datum rather than a projection.

I don’t pretend to be an expert in map projections or the process of transforming one map projection into another, but it’s important to point out that there are differences depending on whether you treat your data as a geometry, on a flat 2-dimensional plane, or as a geography, on the curved 3-dimensional surface of the Earth.

In 2 dimensions, calculations of area and distance are extremely straightforward. The points exist on a Cartesian graph and the calculations are simple. This means computationally these functions are cheap and can be run over a large number of features very quickly.

Distance between two points on a Cartesian plane
(http://mathsfirst.massey.ac.nz/Algebra/PythagorasTheorem/images/papp1.gif)
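As a minimal sketch of why the planar case is cheap, the Pythagorean distance in the figure fits in one line of Python. The function name `planar_distance` is my own, not something from PostGIS, and the result is in whatever units the coordinates use.

```python
import math


def planar_distance(x1, y1, x2, y2):
    """Straight-line distance between two points on a flat Cartesian plane.

    The result is in the same units as the input coordinates.
    """
    # math.hypot computes sqrt(dx**2 + dy**2)
    return math.hypot(x2 - x1, y2 - y1)


# Classic 3-4-5 right triangle
print(planar_distance(0.0, 0.0, 3.0, 4.0))
```

There is one subtraction per axis and one square root, which is why this kind of calculation scales so well over a large number of features.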

But with geographies, the calculations are much more complex. What was simple geometry becomes trigonometry, because you’re calculating an arc length along a curved surface rather than the length of a straight line.

Distance calculation between two points on the surface of a sphere
(http://mathforum.org/mathimages/imgUpload/thumb/Spherical.png/200px-Spherical.png)
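To give a feel for the extra work, here is a rough sketch of an arc-length calculation using the haversine formula. It assumes a perfectly spherical Earth with a mean radius of about 6,371 km, which is a simplification I’m making for illustration. A spatial database may use a more precise spheroid model, so treat the numbers as approximate. The function and constant names are my own.

```python
import math

# Assumption: spherical Earth with a mean radius in meters
EARTH_RADIUS_M = 6371008.8


def great_circle_distance(lon1, lat1, lon2, lat2):
    """Arc length in meters between two lon/lat points (decimal degrees).

    Uses the haversine formula on a sphere.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# A quarter of the way around the Equator
print(great_circle_distance(0.0, 0.0, 90.0, 0.0))
```

Compared with the planar case, every pair of points now costs several trigonometric calls, which is where the extra computation comes from.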

So why use a geography at all? For measuring long distances, calculations using a geography will be much more accurate, particularly if you’re crossing the International Date Line.

For example, if you use a geometry to represent the locations of Heathrow Airport in London and SFO in San Francisco, the straight-line distance on the map is quite different from the route airplanes actually fly. Thankfully, flight planners take the curvature of the Earth into account to plot the shortest route.

Something similar happens if you plot the distance between San Francisco and Beijing. Remember, a 2-dimensional projection has no concept of how the Earth curves, so it will measure the distance from SFO eastward across the US, then over Europe and Asia, the exact opposite of the way you’d want to calculate it in real life. That’s because the Date Line (or at least the 180th meridian it notionally traces) forms the east and west edges of the projection.
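The Date Line problem can be sketched with longitudes alone. The snippet below compares the longitude gap a flat calculation sees with the gap you get when the globe is allowed to wrap around. The function names are hypothetical, and the two longitudes are rough values for San Francisco and Beijing that I’ve rounded for illustration.

```python
def naive_lon_gap(lon1, lon2):
    """Longitude difference in degrees as a flat map sees it (no wrapping)."""
    return abs(lon2 - lon1)


def wrapped_lon_gap(lon1, lon2):
    """Longitude difference in degrees, taking the shorter way around the globe."""
    gap = abs(lon2 - lon1) % 360.0
    return min(gap, 360.0 - gap)


# Approximate longitudes (assumed, rounded): San Francisco and Beijing
sfo_lon, pek_lon = -122.4, 116.6

print(naive_lon_gap(sfo_lon, pek_lon))    # the long way, across the Atlantic
print(wrapped_lon_gap(sfo_lon, pek_lon))  # the short way, across the Pacific
```

The flat version always measures across the middle of the map, so two points on either side of the 180th meridian look almost a full world apart.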

This is also important when you’re calculating areas. If you calculate the area of a geometry, you get the result in the units of that geometry’s spatial reference. In the case of census data, that will generally be square decimal degrees, which aren’t really useful. If you cast the geometry to a geography (::geography), you’ll get the result in square meters, a far more understandable measure.
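To illustrate why square degrees are hard to interpret, here is a small sketch comparing the two kinds of area for a cell bounded by meridians and parallels. It assumes a spherical Earth and uses the closed-form area of such a cell, so it is only an approximation of what a spatial database would return for a geography. All names are my own.

```python
import math

# Assumption: spherical Earth with a mean radius in meters
EARTH_RADIUS_M = 6371008.8


def cell_area_sq_degrees(lon_min, lat_min, lon_max, lat_max):
    """Planar area of a lon/lat box, in square degrees."""
    return (lon_max - lon_min) * (lat_max - lat_min)


def cell_area_sq_meters(lon_min, lat_min, lon_max, lat_max):
    """Area in square meters of a lon/lat box on a sphere.

    The box is bounded by two meridians and two parallels.
    """
    dlon = math.radians(lon_max - lon_min)
    band = math.sin(math.radians(lat_max)) - math.sin(math.radians(lat_min))
    return EARTH_RADIUS_M ** 2 * dlon * band


# Two one-degree cells: one at the Equator, one at 60 degrees north
for box in [(0.0, 0.0, 1.0, 1.0), (0.0, 60.0, 1.0, 61.0)]:
    print(cell_area_sq_degrees(*box), cell_area_sq_meters(*box))
```

Both cells are exactly one square degree, but the cell at 60 degrees north covers roughly half the ground of the one at the Equator. Square degrees hide that difference; square meters do not.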

For comparing areas of objects in the same spatial reference, I prefer to do the comparison in its native units (called SRID units) and only do the conversion when I want to return an intelligible area. This preserves precision in the numbers I’m using.

To repeat the statement I made at the beginning: if you’re measuring over relatively short distances, or comparing areas rather than returning an actual area measurement, use a geometry. If you’re measuring over relatively long distances, especially across the International Date Line, use a geography. If you need any more clarity on this, please check out the great explanation from OpenGeo.org or this great GIS Stack Exchange post.


