Making a US State Choropleth

Following on from yesterday's post, I wanted to discuss how to make a simple choropleth map.

Download and extract data

To find some data to use, I googled "US state population" and found this census page with some data for each state's population and the change in population over several years.

To get the CSV data into shape, I downloaded it, converted it to json with csvkit (use pip install csvkit to install it) and filtered it with jq.

$ wget https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/national/totals/nst-est2019-popchg2010_2019.csv
# convert to json, filter out a few data points, and shape the data
$ cat nst-est2019-popchg2010_2019.csv | csvjson | \
    jq 'map(select(.NAME | test("United|Region") | not)) |
    map({(.NAME): .POPESTIMATE2019}) | add' > population.json
$ head population.json
{
  "Alabama": 4903185,
  "Alaska": 731545,
  "Arizona": 7278717,
  "Arkansas": 3017804,
  "California": 39512223,
  "Colorado": 5758736,
  "Connecticut": 3565287,
  "Delaware": 973764,
  "District of Columbia": 705749,

I enjoy crafting those arcane-looking jq commands, but you could just as easily, and more readably, filter and shape your data within your javascript or with a short python script.
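As a rough sketch of the javascript alternative, the function below does the same filtering and shaping as the jq command. It assumes you already have the rows as an array of objects keyed by the CSV column names (the shape csvjson produces); `shapePopulation` and `sampleRows` are names made up for this example, and the summary-row values are placeholders.

```javascript
// rows: array of objects, one per CSV row, keyed by column name
function shapePopulation(rows) {
  const population = {};
  for (const row of rows) {
    // skip the national and regional summary rows, as the jq filter does
    if (/United|Region/.test(row.NAME)) continue;
    population[row.NAME] = row.POPESTIMATE2019;
  }
  return population;
}

// Sample rows with only the two columns we use. The state values come from
// the output above; the summary-row values are placeholders, not real data.
const sampleRows = [
  { NAME: "United States", POPESTIMATE2019: 0 },
  { NAME: "Northeast Region", POPESTIMATE2019: 0 },
  { NAME: "Alabama", POPESTIMATE2019: 4903185 },
  { NAME: "Alaska", POPESTIMATE2019: 731545 },
];

console.log(shapePopulation(sampleRows));
// { Alabama: 4903185, Alaska: 731545 }
```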

Anyway, that gave me a JSON file containing a single object that maps each state's name to its population, which we can go ahead and use to make a choropleth.

Using the data

Starting from the map code in the previous post, I'll highlight the differences.

// mapData is a topojson object
// populationData is an object of {state name: population}
function map(mapData, populationData) {
  const width = 975,
    height = 610,

    // d3.extent returns an array of the smallest and largest values in our
    // data:
    //
    // > d3.extent(Object.values(populationData))
    // Array [ 578759, 39512223 ]
    extent = d3.extent(Object.values(populationData)),

    // scale is a logarithmic scale that maps values in our extent into the
    // range [0,1], which is what `interpolateGreys` expects
    //
    // > scale = d3.scaleLog().domain(extent)
    // > scale(700_000)
    // 0.04503258343433408
    // > scale(20_000_000)
    // 0.8387874633947101
    scale = d3.scaleLog().domain(extent),

    // colorScale is a function that takes a value and maps it to a color,
    // using the scale we just defined
    //
    // > colorScale = d => d3.interpolateGreys(scale(d))
    // > colorScale(700_000)
    // "rgb(250, 250, 250)"
    // > colorScale(20_000_000)
    // "rgb(50, 50, 50)"
    colorScale = d => d3.interpolateGreys(scale(d));

    // d3 has many color scales available:
    // https://github.com/d3/d3-scale-chromatic

  // Snip: create map and nation boundaries as before

  // Instead of filling each state path with a constant color, this time
  // we vary the color based on how many people live in the state
  const state = svg
    .append("g")
    .attr("stroke", "#444")
    .selectAll("path")
    // map each state object to a path in the SVG
    .data(topojson.feature(mapData, mapData.objects.states).features)
    .join("path")
    // fill the path with a color based on the color scale above
    .attr("fill", (d) => colorScale(populationData[d.properties.name]))
    .attr("vector-effect", "non-scaling-stroke")
    .attr("d", d3.geoPath());
}

window.addEventListener("DOMContentLoaded", async (event) => {
  map(
    await d3.json(`https://cdn.jsdelivr.net/npm/us-atlas@3/states-albers-10m.json`),
    await d3.json(`https://cdn.billmill.org/static/blog/us_choro/population.json`)
  )
});
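To see what the log scale in that code is doing, here is a hand-rolled sketch of the same mapping that runs without d3. It should behave like d3.scaleLog().domain(extent) with the default [0, 1] range, but it skips everything else d3 handles for you (custom ranges, clamping, ticks). `logScale` and `demoScale` are names made up for this example.

```javascript
// Map a value in [min, max] to [0, 1] by its position on a log axis.
function logScale([min, max]) {
  const logMin = Math.log(min);
  const logMax = Math.log(max);
  // how far log(value) sits between log(min) and log(max)
  return (value) => (Math.log(value) - logMin) / (logMax - logMin);
}

// the extent from the code above: smallest and largest populations
const demoScale = logScale([578759, 39512223]);

console.log(demoScale(578759));     // 0
console.log(demoScale(39512223));   // 1
console.log(demoScale(700_000));    // roughly 0.045
console.log(demoScale(20_000_000)); // roughly 0.839
```

Because the scale is logarithmic, a state with 20 million people lands at about 0.84 rather than the 0.5 or so that a linear scale would give it, which keeps California from washing out the rest of the map.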

A bit on colors

Our data is continuous, so it makes sense to use a single-hue color scheme. d3 provides plenty of these, so for example we could change the map to blue by using interpolateBlues instead of interpolateGreys:

Or reds with interpolateReds:

And it's inadvisable, but we could make a rainbow map with d3.interpolateSinebow:
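All of these variants differ only in which interpolator gets composed with the scale, so one way to try them out is to make the interpolator a parameter. This is a minimal sketch, not the code behind the maps in this post: `makeColorScale` is a made-up helper, and the stand-in scale and interpolator exist only so the example runs without d3.

```javascript
// Build a color function from any scale (value -> [0, 1]) and any
// interpolator ([0, 1] -> color string).
function makeColorScale(scale, interpolator) {
  return (value) => interpolator(scale(value));
}

// With d3 loaded and `scale` defined as in the map code, you would write:
//   const blues   = makeColorScale(scale, d3.interpolateBlues);
//   const reds    = makeColorScale(scale, d3.interpolateReds);
//   const sinebow = makeColorScale(scale, d3.interpolateSinebow);

// Stand-ins so the sketch runs on its own
const percentScale = (v) => v / 100;
const greyInterpolator = (t) => {
  const c = Math.round(255 * (1 - t));
  return `rgb(${c}, ${c}, ${c})`;
};

const demoColor = makeColorScale(percentScale, greyInterpolator);
console.log(demoColor(0));   // rgb(255, 255, 255)
console.log(demoColor(100)); // rgb(0, 0, 0)
```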

In general, your map should have as many colors as there are classes in the data. Try not to use more colors than necessary, even if the result looks a little more fun, because extra colors will hinder people's ability to understand the map you've made.
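If your data does fall into a handful of classes, a sketch of "one color per class" might look like the following. The cut points and the four greys are illustrative choices, not values used anywhere in this post, and `classColor` is a made-up helper; d3 ships d3.scaleThreshold for this kind of mapping.

```javascript
// Pick one color per class. thresholds holds n cut points and colors
// holds n + 1 entries, one for each class the cut points create.
function classColor(value, thresholds, colors) {
  // count how many cut points the value has reached
  const classIndex = thresholds.filter((t) => value >= t).length;
  return colors[classIndex];
}

// hypothetical classes: under 1M, 1M to 5M, 5M to 10M, 10M and over
const cutPoints = [1_000_000, 5_000_000, 10_000_000];
const fourGreys = ["#f7f7f7", "#cccccc", "#969696", "#525252"];

console.log(classColor(731545, cutPoints, fourGreys));   // #f7f7f7
console.log(classColor(4903185, cutPoints, fourGreys));  // #cccccc
console.log(classColor(39512223, cutPoints, fourGreys)); // #525252
```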