Cartograms

This page is for another statistics class project, this time using cartograms.
Using density-equalized map distortion to depict statistical data.

Intro

Cartograms are maps that use distortions to emphasize trends or characteristics. They can produce some very interesting and informative results when used to display statistical data.

In 2010, I was taking a graduate-level course in statistics (Biostat/CSSS/Stat 529 - Sample Survey Techniques) at the University of Washington. For my final project in the class, I decided to investigate cartograms. I did a study using cartograms to map the distribution of Walmarts, McDonalds, and Starbucks in the US. The left side of this page describes this paper.

After writing this paper, I got sort of carried away making some posters of interesting statistics. These are shown on the right side of the page.

A paper I wrote that has background info on cartograms:


Supercenters, Hamburgers, and Coffee: Using density-equalizing cartograms to display the distribution of Walmarts, McDonalds, and Starbucks in the US

A final project for my statistics class,
by Steph Abegg, May 2010



Intro (from paper)

Which state has the most McDonalds? Is Starbucks a west-coast phenomenon? Is Walmart taking over?

Inspired by these questions, the following paper not only presents a unique and playful way of mapping American commercialism, but also uses the analysis as an opportunity to conduct a comprehensive investigation of density-equalizing maps, or "cartograms." First, I discuss the development and methodology of cartograms, including a relatively recent method proposed by Gastner and Newman in 2004. Then, I use Gastner and Newman's method to create elegant cartograms depicting the distribution of Walmarts, McDonalds, and Starbucks in the fifty United States. I examine some of the interesting trends evidenced by these cartograms, and finally I conclude with a discussion of the advantages and disadvantages of cartograms in comparison with other common methods of graphical data presentation.



Background (from paper)

For statistical data involving estimates for geographic areas, it is natural to want to display the data on a map. Maps are a valuable means of graphical display, allowing visualization of trends and patterns that other forms of presentation cannot reveal.

However, there are challenges associated with using cartography to analyze or present statistical data. Apart from needing additional geographic information and the increased complexity of the data presentation, a significant hurdle is that population density is extremely variable. Smaller populations tend to be found in larger geographic regions (such as rural areas), which can lead to misleading visual impressions. Perhaps the best way to visualize data that is affected by spatial characteristics is to actually use spatial characteristics to distinguish the trends on display. This is the idea that inspired cartograms.

Cartograms are maps in which the sizes of geographic regions appear in proportion to their population or some other analogous property. These density-equalizing maps are useful for the representation of census results, election returns, disease incidence, and vital statistics such as the distribution of Walmarts, McDonalds, and Starbucks across the country. Several methods for making cartograms have been proposed over the years, but most of these ideas are inordinately complex or suffer from a lack of readability due to the distortion needed to scale regions and have them still fit together.
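To make the "area in proportion to population" idea concrete, here is a minimal Python sketch of the target each region is aiming for. The function name and the two regions "A" and "B" are made up for illustration; they are not data from the paper. The sketch only computes how much each region would need to grow or shrink. It does not solve the hard part, which is making the resized regions still fit together.

```python
import math


def cartogram_scale_factors(areas, values):
    """Return a linear scale factor per region so that area becomes proportional to value.

    areas  -- dict of region name -> true geographic area
    values -- dict of region name -> quantity to display (e.g. population or store count)
    """
    total_area = sum(areas.values())
    total_value = sum(values.values())
    factors = {}
    for name, area in areas.items():
        # Each region's share of the map should equal its share of the quantity.
        target_area = total_area * values[name] / total_value
        # Area scales with the square of linear size, hence the square root.
        factors[name] = math.sqrt(target_area / area)
    return factors


# Hypothetical example: two equally sized regions, one with three times the count.
factors = cartogram_scale_factors({"A": 100.0, "B": 100.0}, {"A": 3.0, "B": 1.0})
```

In this made-up example, region A would need 1.5 times its true area and region B half of its true area, so the total map area stays the same.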

In 2004, Gastner and Newman at the University of Michigan proposed a new method of making cartograms. Their method is not only faster and conceptually simpler than previous methods, but also produces useful and easily readable cartograms. Using the relatively straightforward principles of linear diffusion, a cartogram is created from a given population density (or some other analogous property) by allowing the population to "flow away" from high-density areas into low-density ones until the density is equalized everywhere. Areas shrink, grow, and distort while staying connected, producing a cartogram that is in fact unique for a given dataset. (For the formulas and Fourier transforms, Gastner and Newman's 2004 paper is a good reference.) The degree to which the data is binned is important to achieving the desired balance of distortion and recognizability: a very fine level of binning causes substantial local distortions, while a coarser level of binning produces a cartogram whose features are easier to recognize but that gives a less accurate impression of the true population distribution.
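The "flow away until equalized" idea can be illustrated with a one-dimensional toy in Python. This is only a sketch under simplifying assumptions, not Gastner and Newman's implementation: their method works on a two-dimensional map, whereas this uses a row of equal-width cells, a simple explicit finite-difference step, and closed ends so that no population leaves the strip. The function names, the `rate` parameter, and the sample densities are all made up for illustration.

```python
def diffuse(density, steps, rate=0.25):
    """Let population flow between neighbouring cells of a 1D strip.

    Explicit finite-difference diffusion with closed (no-flux) ends.
    The rate must stay at or below 0.5 for this scheme to be stable.
    """
    rho = list(density)
    for _ in range(steps):
        # Flow across each interior cell boundary, from the denser cell to the sparser one.
        flux = [rate * (rho[i] - rho[i + 1]) for i in range(len(rho) - 1)]
        for i, f in enumerate(flux):
            rho[i] -= f
            rho[i + 1] += f
    return rho


def equalized_boundaries(density, length=1.0):
    """Where the cell boundaries end up once the density is uniform (1D only).

    In one dimension with closed ends, the population to the left of each
    boundary is conserved, so each boundary lands at its cumulative
    population fraction times the strip length.
    """
    total = sum(density)
    positions = [0.0]
    running = 0.0
    for value in density:
        running += value
        positions.append(length * running / total)
    return positions


# Hypothetical densities: one crowded cell next to three sparse ones.
final_density = diffuse([8.0, 1.0, 1.0, 2.0], steps=2000)
boundaries = equalized_boundaries([1.0, 1.0, 6.0])
```

After enough steps every cell in the first example approaches the mean density of 3, with the total unchanged. In the second example the dense third cell, which started with one third of the strip, ends up occupying three quarters of it, which is the one-dimensional analogue of a populous state swelling on the cartogram.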

The rest of this paper focuses on my use of Gastner and Newman's method to develop cartograms displaying the distribution of Walmarts, McDonalds, and Starbucks in the fifty United States. The Appendix provides annotated references to the datasets and software I used in this study, and includes some examples of the code I wrote to analyze and display the density-equalized cartograms in R.



Interpretations, Data Analysis, Advantages/Disadvantages, Conclusions, References

Download paper!

My paper.
Link to pdf of report.


Some posters using cartograms:

After writing the paper on the left, I got sort of carried away making some posters of interesting statistics. I've provided links to these posters below, which use cartograms to show:
  1. The per area and per capita distribution of STARBUCKS in the US, which builds upon the data discussed in the paper.
  2. The statewide frequencies of six major types of NATURAL HAZARDS: earthquakes, floods, tornadoes, wild fires, lightning, and hurricanes.
  3. The various LAND COVERS of the US as categorized by the National Land Cover Dataset: water, wetland, perennial ice, forest, shrubland, planted land, grassland, barren land, and developed land.
  4. The distribution of ETHNIC GROUPS in the US: Whites, Blacks, Hispanics, Asians, Refugees, Natives, and European-born.
  5. The growth of the US POPULATION in 10-year increments from 1900 to 2010.
  6. I also later did a study that involved plotting MOUNTAINEERING ACCIDENTS AND FATALITIES on cartograms.
  7. And still later I did a study on MT. RAINIER CLIMBING STATISTICS and used a cartogram to plot the home states of climbing parties.
  8. And then I created a poster (requested as a teaching tool) on the distribution of ELECTRIC POWER ENERGY SOURCES in the United States.

1.
A poster using cartograms to map the per area and per capita distribution of Starbucks in the US. This poster goes beyond the discussions in my paper. It shows some interesting trends, such as how Starbucks truly is a west coast phenomenon!
 


Also a graph of Starbucks growth. Should have bought stock in 1992....


2.
Below is a poster using cartograms to map the frequencies of various natural hazards (earthquakes, floods, tornadoes, wild fires, lightning, and hurricanes) in the Lower 48. A new wave of hazard risk mapping!



3.
A poster using cartograms to map land cover in the Lower 48. I will always choose to live where the glaciers (i.e. perennial ice) are.....



4.
A poster using cartograms to map the distribution of the major ethnic groups in the US. The distributions of most ethnic groups deviate distinctly from the distribution of the overall US population.



5.
A poster using cartograms to map the growth of the US population over the last century. The poster shows how the populations of individual states have grown at different rates, and in particular how the US population has gradually pushed westward since 1900.



6.
I later did a study that used cartograms to show how mountaineering accidents are concentrated in mountainous states such as Washington, California, Montana, Wyoming, and Colorado.
 


7.
Still later I did a study on Mt. Rainier climbing statistics, and used a cartogram to plot the home states of climbing parties on Rainier.
 


8.
I created a poster (requested as a teaching tool) that uses cartograms to illustrate the distribution of energy sources—such as coal, petroleum, hydroelectricity, etc.—across the United States.
 