DOI: 10.14714/CP94.1538
© by the author(s). This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0.
Georgianna Strode, Florida State University | gstrode@fsu.edu
John Derek Morgan, University of West Florida | jmorgan3@uwf.edu
Benjamin Thornton, Florida State University | bwt13b@my.fsu.edu
Victor Mesev, Florida State University | vmesev@fsu.edu
Evan Rau, Florida State University | evanrau@gmail.com
Sean Shortes, Florida State University | seanshortes@gmail.com
Nathan Johnson, Florida State University | mrchocoborider@gmail.com
Trumbo’s (1981) ideas on bivariate choropleth design have been underexplored and underutilized. He noted that effective map design (including color selection) is directly informed by the intended goal or use of the map (i.e., what questions might the map answer), and he identified three common spatial relationships that can be displayed by a bivariate choropleth map: inverse relationships, a range of one variable within another, and direct relationships. Each is best suited to answering different map readers’ questions. Trumbo also suggested sample color palettes to focus the map reader’s attention on pertinent data. In consultation with Trumbo, we extended his ideas, first by creating focal models that illustrate his three spatial relationships. We then constructed sample maps to examine each of the focal models, and finally compared each model by mapping the same two data sets (of obesity and inactivity). We investigated the visual differences in each of the resulting maps, and asked spatial questions regarding the relationships between obesity and inactivity. Our work validates Trumbo’s ideas on bivariate choropleth map design, and we hope our focal models guide cartographers towards making color choices by linking their map purpose to the appropriate focal model.
KEYWORDS: bivariate choropleth map; sequential color scheme; diverging color scheme; color selection
“Confusion and clutter are failures of design, not attributes of information. And so the point is to find design strategies that reveal detail and complexity — rather than to fault the data for an excess of complication. Or, worse, to fault viewers for a lack of understanding.”
Edward Tufte, Envisioning Information, 1990
While univariate maps, which represent one thematic variable at a time, are both standard and widespread (Jin and Guo 2009; Brewer 1994), bivariate maps have the potential to reveal spatial relationships and patterns between two variables on a single map more effectively than by using two side-by-side univariate maps (Carstensen 1986). Bivariate maps can make use of a variety of symbol strategies, such as color combinations (as we explore in this paper), shaded proportional symbols, shaded cartograms, split symbols, shaded isolines, or star plots (Friendly 2008; Kimball and Kostelnick 2017). Though they can represent any pairing of thematic variables, they are typically employed to examine the relationships between socioeconomic variables, such as elderly populations and ethnic minorities, or levels of educational attainment and household income. A well-constructed bivariate map displays both the distributions of its individual datasets and their degree of interaction, and, at least for some visual variable combinations, variation in one thematic variable does not impair the ability to read the other (Ware 2009).
The alternative to bivariate mapping is to compare two univariate maps side by side, which is often inconvenient and inefficient (Leonowicz 2006; Wainer and Francolini 1980; Bernard et al. 2015). In order to keep information cost to a minimum, the type of data classification, symbol choices, and scale must be comparable across all maps in the series (Wickens, Gordon, and Liu 2004). Yet even when mapmakers use consistent visual communication techniques, human judgment can be shaped by the user’s experience in map reading, familiarity with the subject matter, and the limitations of human visual perception. We adhere to the hypothesis that given the difficulties of creating comparable univariate maps, pursuing methods to visualize multiple phenomena on a single map is worthwhile (Leonowicz 2006).
Frequently, bivariate maps produce ambiguous representations that convey information poorly (Dunn 1989). A common method for creating bivariate maps is to superimpose two choropleth maps. This technique, called the overlay, crisscross, merge, or crossing, combines two maps along with their color schemes to produce a new map with a new set of colors (Monmonier 1989). An early example of overlays is the US Census Bureau’s 1970s Urban Atlas map series, which contained sixteen-color, mass-produced maps showing various pairings of data variables. Although the maps were aesthetically pleasing, they were strongly criticized as difficult to interpret (Fienberg 1979; Wainer and Francolini 1980; Dunn 1989), as cartographers had no literature to guide their interpretations and relied on their own judgment and that of colleagues (Halliday 1987). More recently, techniques for drawing bivariate choropleth overlays have improved, and studies have demonstrated that effective color schemes can greatly enhance their readability (Eyton 1984; Robertson and O’Callaghan 1986; MacEachren et al. 1999; Rheingans 1997; Leonowicz 2006). Nevertheless, they still lack sophistication for the interrogation of data relationships.
Bivariate choropleth mapping is a relatively recent cartographic method, dating to 1974, when the US Census Bureau saw value in combining two data sets into one map. A suggestion was made for “two choropleth maps of different variables to be ‘crossed’” or overlaid (Meyer, Broome, and Schweitzer 1975, 102). This melding became colloquially known as the overlay. The overlay is constructed by combining two univariate choropleth maps and their color schemes through transparency, by overprinting, or by manual color adjustments (Stevens 2015). The overlay scheme is considered a method of convenience and is not necessarily the best choice (Trumbo 1981). Although the overlay scheme works well in some instances, critics of bivariate choropleth mapping note that the overlapping color schemes can become muddied and indistinct, especially the farther they are from the legend’s x- and y-axes. The information cost for a map can be unnecessarily high when a less-than-ideal bivariate choropleth mapping method, such as the overlay, is utilized. This can force map readers to refer to the legend more often than necessary.
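The overlay construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the multiply blend is a rough stand-in for overprinting, the endpoint colors are hypothetical, and the function names (lerp_color, ramp, multiply, overlay_legend) are ours rather than part of any GIS package.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples (0-255 channels)."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def ramp(start, end, n):
    """Return n evenly spaced colors from start to end (n >= 2)."""
    return [lerp_color(start, end, i / (n - 1)) for i in range(n)]


def multiply(c1, c2):
    """Multiply blend: an assumed, simplified model of overprinting two inks."""
    return tuple(round(a * b / 255) for a, b in zip(c1, c2))


def overlay_legend(x_ramp, y_ramp):
    """legend[row][col] blends y_ramp[row] (y class) with x_ramp[col] (x class)."""
    return [[multiply(xc, yc) for xc in x_ramp] for yc in y_ramp]


# Hypothetical endpoint colors, chosen for illustration only.
x_ramp = ramp((255, 255, 255), (220, 50, 50), 3)   # white to red
y_ramp = ramp((255, 255, 255), (50, 80, 220), 3)   # white to blue
legend = overlay_legend(x_ramp, y_ramp)

for row in reversed(legend):   # print the high-y row first, as in a legend
    print(row)
```

Printing the grid makes the criticism concrete: cells far from both axes are products of two already dark colors, so they crowd together at the dark end of the range.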
In the mid-1970s, the Census Bureau developed an automated mapping system and began creating a series of bivariate maps for mass distribution. The overlay scheme was used to merge yellow-to-red on the x-axis and yellow-to-blue on the y-axis. The resulting color scheme was a confusing mixture of colors. “Considerable practice” was required to discriminate among the colors in the legend and to organize data relationships (Fienberg 1979, 176). A study by Wainer and Francolini (1980) found difficulties in map comprehension, and they concluded that the Census maps did not meet Bertin’s (1973) elementary level of inquiry, as it was difficult to answer a question such as “what is the median family income at this spot?” Tufte (2001) termed the maps “puzzle graphics,” because users had to run phrases through their minds to aid comprehension. Olson’s (1981) study on users of the bivariate maps concluded that participants could recognize order, but judging correlation between two variables was more difficult than using two side-by-side univariate maps.
Figure 1 offers an example of the difficulties of the overlay method. It shows the percentages of education and income. Unfortunately, this scheme produces ambiguous mixtures across the blue-to-violet range that are difficult to distinguish visually and could require users to view the legend too often.
Both univariate and bivariate choropleth maps can answer a variety of questions. For example, assume a thematic variable pair of income and education level. Univariate maps are limited to proximity queries about a single data variable, and restrict interrogation to questions such as: Do low income tracts occur mainly in the center of the city? Are tracts with highly educated populations mainly in the suburban areas with an especially heavy concentration in the northern suburbs? (Trumbo 1981, 223). In contrast, bivariate choropleth maps answer more complex questions involving thematic variable pairs: Are all of the districts with both low income and low education near the center of the city? Are there tracts with high education and low incomes? What is the range of educational attainment among high earners? Is there a positive association between the two variables? Or what spatial patterns are formed by exceptional cases? (Trumbo 1981, 223).
For Bruce Trumbo, effective design (including color selection) is directly related to the map’s purpose, as expressed in these sorts of questions (Trumbo 1981). “My personal motivation . . . was to improve on the bivariate color scheme adopted by the U.S. Bureau of Census. In my view, there are many ways to do that, including ones I suggested” (personal communication, July 2, 2019). However, a review of the literature suggests that the creation of bivariate maps does not necessarily begin with a conscious articulation of each map’s purpose (Nyerges 1991; Caquard 2015). Trumbo’s 1981 paper identified principles to facilitate the comprehension of statistical maps and identified several color schemes appropriate for different map purposes. Our paper focuses on map purposes, or types of questions that a bivariate choropleth map can answer, as identified by Trumbo, rather than upon his specific color scheme suggestions.
Trumbo identified three common data relationships that together cover many of the questions that a reader might ask of a map. These are seen in Table 1: inverse relationships, a range of one variable within another, and direct relationships. Each corresponds to a different map purpose and facilitates its own range of questions. Trumbo also suggested sample color palettes to focus the attention of the user on data pertinent to the map’s purpose (Trumbo 1981). Each question uses a statistical emphasis to place focus on specific types of interactions between variables. One method, the inverse relationship, explores the highs and lows of the two variables. A sample question is “where are high foreclosures within low-income areas?” The corresponding bivariate choropleth map design should highlight the areas with high and low values for both inputs while de-emphasizing the areas where values are normal or average. A second method focuses on a range of values within a specific category. A sample question is “what is the range of foreclosures within areas of high income?” The corresponding bivariate choropleth map would first identify the various categories, and second, show the progression of values within each category. A third method shows the relationship between the two variables. A sample question is “what is the relationship between income and foreclosures?” The corresponding bivariate choropleth map displays the correlation between the variables.
Table 1. Three types of questions a bivariate choropleth map can answer based on Trumbo’s data relationships.
Each of these three types of questions places emphasis on different interactions between data variables, and each can be supported by the thoughtful use of color. To that end, Trumbo articulated four principles of color usage to distinguish similarity and dissimilarity. Principle I is order: for data ordered quantitatively, the colors should preserve the order, i.e., they should show a progression of hue, saturation, and/or lightness (whether individually or in combination) that can be detected by the human eye. Principle II is separation, in which important differences in values are represented by colors that are equally different. Colors are spaced comfortably apart, or separated, from others. Principle III refers to rows and columns: in a bivariate legend, each row or column represents a univariate dataset and should be in a visual sequence, with colors that are distinct enough that the corners of the bivariate legend stand out. Lastly, Principle IV relates to the legend diagonal. If the interaction between variables is critical, then the principal diagonal is the focal point and it should be visually separated from colors on either side. Both univariate maps and bivariate choropleth maps use Principles I and II, but Principles III and IV are formulated specifically for the latter.
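Principles I and III lend themselves to a simple automated check. The sketch below assumes that sRGB relative luminance is an acceptable proxy for perceived order (Trumbo also allows order by hue or saturation, which this check ignores), and the 3×3 legend colors are hypothetical values picked for illustration.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color with 0-255 channels."""
    def linearize(c):
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_ordered(colors):
    """True if luminance strictly decreases (light to dark) along the sequence."""
    lum = [relative_luminance(c) for c in colors]
    return all(a > b for a, b in zip(lum, lum[1:]))


def rows_and_columns_ordered(legend):
    """Check that every row and every column of a legend grid is ordered."""
    rows_ok = all(is_ordered(row) for row in legend)
    cols_ok = all(is_ordered(col) for col in zip(*legend))
    return rows_ok and cols_ok


# Hypothetical 3x3 legend: legend[row][col], row 0 = low y, column 0 = low x.
legend = [
    [(232, 232, 232), (172, 228, 228), (90, 200, 200)],
    [(223, 176, 214), (165, 173, 211), (86, 152, 185)],
    [(190, 100, 172), (140, 98, 170), (59, 73, 148)],
]
print(rows_and_columns_ordered(legend))
```

A legend that fails this check has at least one row or column in which a higher class is drawn lighter than a lower one, which is the kind of reversal Principle I warns against.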
Trumbo’s four principles provide theoretical insights into how color can communicate data changes and relationships using bivariate choropleth maps. The first two principles involving color sequences have been successfully applied within GIS largely due to efforts of Brewer (1994; 2005) and Brewer et al. (2015). The gradient formation of Principle I is represented by Brewer’s sequential color scheme, and the mutually distinguishable colors of Principle II are represented by the diverging color scheme (Brewer 2005). Principles III and IV combine multiple color schemes in various ways to direct focus to specific portions of the data to address the map’s original question. Principles III and IV have not been widely transferred to GIS color schemes. Univariate map makers often start with a clear purpose for a map and correspondingly design their maps to communicate a specific idea or theme. In this sense univariate maps may be seen as more often confirmatory in purpose and for public consumption (MacEachren 1995; Tukey 1980). However, bivariate map makers construct more complex maps (multiple variables), which are often exploratory in purpose and not necessarily for public consumption (Tyner 2010).
Focal models are illustrations that show how to translate Trumbo’s three spatial relationships into specific color choices that adhere to his four principles of color usage. Figure 2, for example, illustrates three color schemes and how they highlight specific data. The sequential and diverging color schemes from Brewer (1994) align with Trumbo’s Principles I (order) and II (separation). Sequential schemes order sequences from low to high to illustrate linear progressions of values (Brewer 2005; Slocum et al. 2005). Sequential schemes are characterized by words such as “range,” “progression,” and “gradient.” They minimize low values while emphasizing high values. Diverging color schemes show separation by emphasizing the extremes with complementary colors and de-emphasizing the central areas. They minimize statistical descriptions such as “average,” “common,” “normal,” “regular”, “expected,” and “unchanged” while emphasizing high and low values, changes, distinct values, differences, and standard deviations. In effect, diverging schemes are two complementary sequential color schemes joined at the center, designed to highlight the extremes and minimize the expected values. Qualitative schemes represent differences in data categories and do not use ordered colors because mathematical operations are meaningless on nominal data (Brewer 1994).
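The statement that a diverging scheme is two sequential schemes joined at the center can be shown directly. In this minimal sketch, linear interpolation in RGB is an assumption made for brevity (published schemes are tuned by eye or in perceptual color spaces), and the endpoint colors are hypothetical.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples (0-255 channels)."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def sequential(light, dark, n):
    """n colors running from light (low values) to dark (high values)."""
    return [lerp_color(light, dark, i / (n - 1)) for i in range(n)]


def diverging(low, mid, high, n):
    """Two sequential ramps joined at a shared neutral midpoint.

    n must be odd so that the midpoint gets a class of its own.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number of at least 3")
    half = n // 2
    lower = sequential(low, mid, half + 1)    # low extreme to neutral
    upper = sequential(mid, high, half + 1)   # neutral to high extreme
    return lower + upper[1:]                  # drop the duplicated midpoint


# Hypothetical colors for illustration only.
print(sequential((255, 255, 255), (8, 48, 107), 5))
print(diverging((230, 97, 1), (247, 247, 247), (94, 60, 153), 5))
```

The sequential ramp gives its strongest color to the high end only, while the diverging ramp gives strong colors to both ends and the weakest color to the middle class.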
Trumbo’s original color scheme examples were designed for the Ostwald and Hickethier solid cube color models that are no longer used today. Both color models are “uniform color spaces” as they are known within the field of color science (Wyszecki and Stiles 1967). Uniform color spaces are based upon a geometric shape such as a square or cone and use a consistent, measurable metric, making perceptive color differences more intuitive. Validation of color similarity and difference can be achieved through examination of the color metrics (Robertson and O’Callaghan 1986; Bujack et al. 2017). Robertson and O’Callaghan demonstrated that Trumbo’s principles are achieved through uniform color models. Principles I (order) and II (separation), used for both univariate and bivariate maps, are easily achieved through the consistent measurable metrics of the model. For bivariate maps, Principles III (rows and columns) and IV (diagonal) can also be satisfied through differing schemes. The diagonal is achieved through a square model that highlights the diagonal, and rows and columns are successful using a conical model that preserves linearity within a group (Robertson and O’Callaghan 1986).
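To illustrate how color metrics can be examined, the sketch below measures separation (Principle II) as Euclidean distance in CIELAB. CIELAB is swapped in here as a commonly used, approximately uniform color space; it is not the Ostwald or Hickethier model, and the D65 white point and sRGB conversion constants are assumptions about the display.

```python
import math


def srgb_to_lab(rgb):
    """Convert an sRGB color (0-255 channels) to CIELAB, assuming a D65 white."""
    def linearize(c):
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear sRGB to CIE XYZ
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    xn, yn, zn = 0.95047, 1.0, 1.08883   # D65 reference white

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e(c1, c2):
    """CIE76 color difference: Euclidean distance in CIELAB."""
    return math.dist(srgb_to_lab(c1), srgb_to_lab(c2))


def min_separation(colors):
    """Smallest pairwise difference in a palette of at least two colors."""
    return min(delta_e(a, b)
               for i, a in enumerate(colors)
               for b in colors[i + 1:])


# Hypothetical palette for illustration only.
palette = [(232, 232, 232), (172, 228, 228), (90, 200, 200)]
print(round(min_separation(palette), 1))
```

A palette whose smallest pairwise difference is very low contains at least two classes that readers are likely to confuse, whatever the legend says.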
Trumbo (1981) proposed his principles for bivariate choropleth mapping prior to mapmaking software being generally available outside of governmental agencies and labs (Waters 2018). He had limited opportunities to test his theories before publication (Trumbo, personal communication, July 2, 2019). Uniform color models are independent of any particular display device, and for reproducibility on different display devices, it may be necessary to adjust the hue, saturation, and lightness, a process known as “device modeling” (Robertson and O’Callaghan 1986). For example, Ware (2009) notes differences between types of monitors where some colors have more luminescence than others, which can alter the visual output. As Trumbo writes,
there are more difficulties rendering these schemes on screen and in color printing than I had supposed. Color settings on a monitor have to be just right in order to show strong association to best effect (with my schemes), and the usual precision of color printing in popular magazines is not sufficient to represent some color schemes to good effect. (personal communication, July 2, 2019)
Color settings on monitors have advanced significantly, allowing for a greater and broader use of color in cartography (Monmonier 2006). Modern computer graphics use one of several RGB color spaces defined by the primary colors of red, green, and blue in the X, Y, and Z axes. Color scheme parameters such as color separation, the logical arrangement of colors, and a zero-saturation diagonal can be automated by symbology tools available today in most GIS software.
Bertin’s (1973) Sémiologie Graphique introduced an ambitious and comprehensive theory to formalize graphic language. His concept of visual variables (e.g., size, shape, value, orientation, color, texture) serves as a guide for map symbology in cartography textbooks. In addition to his contribution of visual variables, Bertin described three levels of reading possible in map comprehension: the elementary, intermediate, and superior levels.
Robertson and O’Callaghan (1986) concluded through their color evaluations that Trumbo’s principles and suggested color schemes supported Bertin’s vision of the levels of comprehension that maps should support. The definitions of the intermediate and superior levels require multiple thematic variables, thus suggesting that Bertin may have envisioned bivariate and multivariate maps as readily achievable. The focal model that supports seeing the range of one variable within a category of another variable successfully addresses the elementary level because a perceived variable can be translated into a quantitative value. The model focusing on direct relationships can support the intermediate (local distribution) and superior levels (global distribution) of map comprehension by revealing the relationships between two variables (Robertson and O’Callaghan 1986).
Bertin also introduced the concept of “selectivity” of graphic variables, which is related to the idea of selective attention (MacEachren 1995, 36): the ability of a map reader to focus on one visual variable while simultaneously ignoring others. The concept of selective attention was first developed in the field of psychology, and has since been applied by cartographers to further the understanding of thematic maps (Carswell and Wickens 1990; Nelson 2000; Elmer 2013). Selective attention concepts of configurality, asymmetry, integrality, and separability describe the perceptual aspects of the combination of two visual variables.
Table 2 shows the relationships between selective attention concepts, using combinations of visual variables identified by Elmer (2013) and Trumbo’s equivalent concepts using color. The focal models are based on Trumbo’s color concepts, and will be addressed in detail in the next section. This chart serves to situate Trumbo’s ideas on color within the cartographic literature regarding selective attention concepts. Configural relationships show highs and lows of data values as does Trumbo’s corners method. The asymmetrical concept and the range method reveal ranges of data values within categories with unique interaction effects. The integral and diagonal methods both show the emerging dimension where two variables interact while inhibiting reading individual variables.
Table 2. Relationships between Elmer’s (2013) discussion of selective attention for bivariate maps and Trumbo’s color concepts. Charts adapted from Elmer (2013).
Trumbo’s original color scheme does not address the selective attention concept of separability, in which a reader can attend to one visual variable with minimal interference from others. However, this can be achieved by combining color with other visual variables, such as the example in Figure 3, which combines color with symbol size. The corners method uses color to show how the population in each area meets its wastewater and drinking water needs using private infrastructure, public infrastructure, or a combination of the two. By aggregating water infrastructure and population data at the property parcel scale to a 1km grid, we can use a graduated symbology, where population data are clearly separable from water infrastructure information. Applications of this map include planning new sewer lines and, for disaster management planning, identifying areas that would lack the electricity needed to operate private wells.
Figure 3. Bivariate map showing population using public and private water infrastructures for drinking and wastewater in Orlando, USA. Water infrastructure data from the Florida Department of Health, 2016. Data were gridded to 1km cells, with larger squares representing higher population and smaller squares showing lower population.
The reprise of Trumbo’s work that we present here began by consulting with Trumbo, and both extends and demonstrates his 1981 ideas. We created focal models that illustrate how Trumbo’s three spatial relationships above can be translated into bivariate color schemes that adhere to his four principles of color usage. We then constructed three sample maps to illustrate each of the focal models. Most sample maps are standard choropleth maps, but two used 1km grids. Finally, we created three maps using the same socioeconomic data variables (obesity and inactivity) for each of the models to determine if the final maps were visibly different and if each can be used to successfully answer questions about relationships between obesity and inactivity. In doing so we explored the following questions: Do the focal models produce different maps using the same data? Is there a noticeable relationship between the stated map purpose and the resulting emphasized data? Can the focal models guide color choices by associating their map purpose with the appropriate color scheme? Can the focal models be converted into reproducible symbolization guidelines suitable for a GIS?
To clarify the relationships between the four principles suggested by Trumbo and the three types of questions, Figure 4 presents an overview of our ideas. The diagram shows Trumbo’s four principles of order, separation, rows and columns, and diagonal as foundational knowledge. The principles are shown in relation to our three focal models, each of which derives from one of the three types of questions that the cartographer is trying to answer. This diagram can also be used as a decision-making flow chart for cartographers. Each focal model draws attention to the data that appropriately address the purpose of the map (Table 3).
Figure 4. Relationship between Trumbo’s four principles and the three focal models for bivariate choropleth map.
Below, we’ll examine each focal model by constructing a diagram of a bivariate choropleth map legend, and we’ll illustrate how its design conforms to Trumbo’s four principles of effective color choice for bivariate choropleth maps. These diagrams illustrate the areas where the reader’s focus is directed, and the combination of sequential, diverging, and qualitative schemes that produced the data emphasis. For each model, we’ll also look at an example map and legend.
The corners model, shown in Figure 5, deals with the exploratory questions of low/high of x and low/high of y. It uses multiple complementary diverging color schemes designed to highlight the distinct corners while minimizing the interior. The corners model succeeds because the complementary diverging color schemes draw attention to the extreme areas. It is meant to address questions such as: Where are the areas of high income and low education? Where are the areas of low population density and high crime? Where are the areas of high public transportation and high food deserts? Perdue (2013) successfully used the corners model to highlight differences between population density and crowding in urban environments.
The map and legend in Figure 6 offer an example of the corners model, highlighting areas where the rates of home ownership or of education are either high or low. The legend uses multiple diverging color schemes around the exterior walls to emphasize the contrast between the four corners and minimize the interior areas. The effect is to draw attention to geographic areas where extreme percentages of those with advanced education and home ownership are very similar or very dissimilar. The geographic areas with less extreme values of the two population categories are minimized through representation with lighter colors. The goal of this map and legend is to allow quick and easy identification of areas with very low or very high values of either variable without distraction from areas with intermediate values.
Figure 6. Corners model highlighting areas of either high or low rates of education and home ownership for Leon County, Florida. Data from 2010 US Census.
Trumbo (1981) called for a 4×4 grid, but we demonstrate a 3×3 grid in Figure 6. In critiquing this palette, Trumbo asked about the 4×4, as it offers more gradation (personal communication, August 2016). Trumbo’s sample color palettes used the Ostwald and Hickethier uniform color models that could achieve his goals (Robertson and O’Callaghan 1986). In constructing a sample corners palette for our demonstrations, we were unable to blend 16 colors with sufficient distinctness and order. We are unsure if the problem is a change in color model or our limited color blending skills. Thus, we have chosen to use colors similar to Brewer’s (1994) “diverging/diverging” 3×3 scheme to overcome issues with color distinctiveness. Further, an anonymous reviewer of this article aptly noted that the 3×3 grid colors can be adjusted to achieve different purposes. For example, if it is important to emphasize high and low values, the interior five colors could be lightened so that the corner colors are more prominent. If it is important to emphasize certain highs and lows (e.g., emphasis on low/high, low/low, high/high, or high/low), the two corners representing these values can be left saturated and the other two corners can be muted slightly to call less attention.
There are limitations to the corners model. It may require map readers to consult the legend more often than other models due to the lack of color gradation that reminds readers of data values. Additionally, it is important to select colors that are appropriate to the data type (e.g., qualitative vs. quantitative) to avoid misinterpretation. Finally, it is worth noting Eyton’s (1984) and Dunn’s (1989) designs for highlighting high and low data values as shown by Figure 7. Both of these models are alternatives to Trumbo’s corner and diagonal methods. These models perform the tasks of a corner model by highlighting high and low data values by the use of pure colors in the upper-left and lower-right cells with easily distinguishable colors in the other two corner cells. These models perform the task of Trumbo’s diagonal model (discussed later) with the lower-left and upper-right cells using a progression of light-to-dark shades of the same color hue to show the data relationships along the diagonal. Eyton (1984) demonstrates eight classes of data and Dunn (1989) uses five. Data along the diagonal can be represented using a single color or multiple colors to show gradation. Both models can represent correlated data as well as outliers.
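The adjustment suggested above, lightening the five non-corner cells of a 3×3 legend, can be sketched together with the class lookup that assigns a legend color to each enumeration unit. The legend colors, class breaks, and function names below are hypothetical and are not the palette used in Figure 6.

```python
from bisect import bisect_right


def classify(value, breaks):
    """Class index for a value, given ascending break values.

    With two breaks the result is 0 (low), 1 (middle), or 2 (high).
    """
    return bisect_right(breaks, value)


def lighten(rgb, amount):
    """Move a color toward white: 0 leaves it unchanged, 1 gives white."""
    return tuple(round(c + (255 - c) * amount) for c in rgb)


def emphasize_corners(legend, amount=0.5):
    """Return a copy of the legend with every non-corner cell lightened."""
    last_row, last_col = len(legend) - 1, len(legend[0]) - 1
    result = []
    for r, row in enumerate(legend):
        new_row = []
        for c, color in enumerate(row):
            is_corner = r in (0, last_row) and c in (0, last_col)
            new_row.append(color if is_corner else lighten(color, amount))
        result.append(new_row)
    return result


def cell_color(x, y, x_breaks, y_breaks, legend):
    """Legend color for one unit: rows index y classes, columns index x classes."""
    return legend[classify(y, y_breaks)][classify(x, x_breaks)]


# Hypothetical legend and breaks (percent), for illustration only.
legend = [
    [(1, 133, 113), (128, 205, 193), (94, 60, 153)],
    [(166, 219, 160), (200, 200, 200), (194, 165, 207)],
    [(230, 97, 1), (253, 184, 99), (208, 28, 139)],
]
muted = emphasize_corners(legend, amount=0.6)
print(cell_color(12.0, 71.0, [20.0, 40.0], [30.0, 60.0], muted))
```

Muting only two of the four corners, as the reviewer also suggested, would follow the same pattern with a narrower corner test.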
Figure 7. (a) Adapted from Eyton (1984) and (b) Dunn (1989). Both diagrams are alternatives to the corners and the diagonal focal models.
The range model, shown in Figure 8, illustrates the range of y within the low/high of x. The primary focal axis consists of a qualitative color scheme that provides an organizing structure to separate data into visually distinct categories, resembling ribbons. The secondary axes are sequential interior schemes that show the progression of values from low to high within each category. Users can select a category of interest along the x-axis, and then see the range of values distributed throughout. Our diverging color schemes vary somewhat from Brewer’s (1994). While she describes hers as having steps of lightness, ours involve colors that fall along gradients that diverge from a quantitative midpoint, but do not necessarily require a change in lightness.
Figure 8. Two versions of the range model, with either diverging or qualitative colors along the x-axis.
The range model addresses questions such as: What are the ranges of education among those with high incomes? About how many votes were cast in areas with strong Obama support? What are the income levels in areas of high foreclosures? The map and legend in Figure 9 show both the degree of support for political candidates as well as a general idea of the number of votes cast. For example, populated areas with strong Obama support are shaded dark blue and rural areas with strong Romney support are shaded light pink. Rural areas with relatively equal voting for each candidate are shaded light purple. Instead of choropleth enumeration units, the map uses a 1km grid system. The colors along the x-axis create categories showing the percentage of support for each candidate. Within each category, a sequential scheme along the y-axis indicates the number of votes within each region.
Figure 9. An example of the range model, showing the number of votes cast in areas with a given level of support for a candidate. Data are 2012 precinct-level election results reported to the Florida Division of Elections, disaggregated to census blocks (spatial interpolation weighted for population 18 years and older), then disaggregated to a 1km grid.
Trumbo (1981) used specific color codes to achieve color gradations. However, we were unable to successfully select colors that achieved both order and separation in a visually appealing manner. Instead, we found that the value-by-alpha symbolization of Roth, Woodruff, and Johnson (2010) worked very well.
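The general idea behind value-by-alpha, in which one variable sets the hue and the other sets the opacity, can be sketched as follows. This is a simplified reading rather than the procedure of Roth, Woodruff, and Johnson (2010): compositing over a white background, the minimum alpha of 0.15, and the category colors are all assumptions made for illustration.

```python
def composite_over_white(rgb, alpha):
    """Alpha-composite an opaque color over a white background."""
    return tuple(round(alpha * c + (1 - alpha) * 255) for c in rgb)


def value_by_alpha(category, value, vmax, palette, min_alpha=0.15):
    """Hue comes from the category (x-axis); opacity from the value (y-axis)."""
    share = max(0.0, min(1.0, value / vmax))          # clamp to 0..1
    alpha = min_alpha + (1 - min_alpha) * share       # keep low values visible
    return composite_over_white(palette[category], alpha)


# Hypothetical category colors for illustration only.
palette = {
    "candidate_a": (33, 102, 172),
    "even": (118, 42, 131),
    "candidate_b": (178, 24, 43),
}
print(value_by_alpha("candidate_a", 9000, 10000, palette))  # many votes
print(value_by_alpha("candidate_a", 500, 10000, palette))   # few votes
```

Because each category keeps its own hue at every opacity, the x-axis categories remain identifiable while the y-axis value controls how strongly each cell stands out.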
It is worth noting that the range model is the only focal model that can map qualitative data. Categorical colors can represent the different types of data in the x-direction while quantitative values are mapped in the y-direction. An example of a qualitative map could be one of predominant agricultural production by county. For each county, the main agricultural product (e.g., corn, wheat, soy) could be represented along the x-axis through columns, while the total amount harvested (quantitative) could be represented along the y-axis using a transparency gradient. Another example map could show income and education, where income distributions are along the y-axis and categories of education (e.g., no high school, high school, associate degree, bachelor’s, etc.) comprise the columns along the x-axis. Brewer (1994) provides variations on the range model. One, a bivariate example, uses a qualitative scheme for presentation of nominal data. The second, a binary model, presents the range (y-axis) of values within a categorical variable (x-axis) while maintaining a binary status (e.g., members vs. non-members).
The diagonal model answers exploratory questions about the relationship between x and y, as shown in Figure 10. It consists of a sequential color scheme at a 45-degree angle, to show progression of correlated data, and a diverging color scheme on the opposite diagonal, which highlights the differences between areas on either side of the main diagonal. These two color schemes, at right angles to each other, succeed in dividing the data into three categories (Carstensen 1984). Correlated data are shown on the diagonal sequence while non-correlated data are shown in complementary colors at their respective corners. This is similar to the overlay method used by the 1970s Census maps; both attempt to show data correlation along the diagonal.
The diagonal model can answer questions such as: What is the relationship between income and education? Are tobacco sales and food deserts correlated? Is there a relationship between population density and public transportation? The example map and legend in Figure 11 show the relationship between elderly and minority populations using the diagonal model. The complementary colors in the diverging color scheme produce a white-gray-black sequence along the diagonal. The resulting effect divides the data into three categories: correlated data appear in a grayscale sequence, while non-correlated data are shown using gradients of complementary colors. In the example map, there are few grayscale areas, as the data themes feature little correlation. However, the complementary blue and orange colors show clearly whether an area’s population is predominantly elderly or minority. The internal progression of color preserves the degree of population differences (e.g., areas with high elderly populations are a darker orange than areas with low elderly populations).
Figure 11 clearly shows the locations where one of the two variables has a much higher percentage than the other. The data do not need to be correlated to use this model, as the diagonal model shows data to be above, on, or below the diagonal. Contrast Figure 11 with Figure 1, which used the overlay method. Recall that Figure 7 presents an alternative version of the corners and diagonal models.
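The three-way division made by the diagonal model (above, on, or below the diagonal) can be sketched as a simple comparison of class indices. This is an illustrative sketch only: it assumes both variables are classified into the same number of classes, that class 0 is the lowest, and that the x variable runs horizontally and the y variable vertically in the legend. The function name and zone labels are hypothetical.

```python
def diagonal_zone(x_class, y_class):
    """Place a legend cell relative to the main diagonal of an n-by-n grid.

    x_class and y_class are 0-based class indices (0 = lowest class).
    Returns the zone name and the signed offset from the diagonal.
    """
    offset = y_class - x_class  # 0 on the diagonal; sign gives the side
    if offset == 0:
        zone = "on diagonal"      # same class on both variables
    elif offset > 0:
        zone = "above diagonal"   # y variable ranks higher than x variable
    else:
        zone = "below diagonal"   # x variable ranks higher than y variable
    return zone, offset


# List every cell of a 3x3 legend, top row first.
for y in reversed(range(3)):
    print([diagonal_zone(x, y)[0] for x in range(3)])
```

Cells on the diagonal would take the sequential (grayscale) colors, while the sign and size of the offset would select the side and intensity of the diverging scheme.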
In order to compare the focal models in detail, we created three maps, one per model, of the same two datasets: the percentages of obese and of physically inactive persons per county, from the US County Health Rankings & Roadmaps Program (University of Wisconsin Population Health Institute 2016). Figures 12–14 illustrate the results.
In the corners model (Figure 12), the diverging complementary colors in each corner separate high values from low values. Since this model focuses on high and low values, a 3×3 grid is sufficient, with intermediary shades working only to separate the corners. This map can answer the following questions: Where are areas of high obesity and high inactivity? Where are areas of high obesity and low inactivity? Where are areas of low obesity and low inactivity? Where are areas of low obesity and high inactivity?
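To make the four corner questions concrete, the sketch below assigns each county to a cell of a 3×3 grid and reports whether that cell is one of the four corners. It is a minimal sketch under stated assumptions: the class breaks and county values are invented for illustration and are not those behind Figure 12, and the function names are hypothetical.

```python
from bisect import bisect_right


def classify(value, breaks):
    """Return the 0-based class of value given ascending class breaks.

    A value equal to a break is placed in the higher class.
    """
    return bisect_right(breaks, value)


def corner_label(x_class, y_class, n_classes=3):
    """Name the corner a cell occupies, or None for intermediate cells."""
    low, high = 0, n_classes - 1
    if x_class not in (low, high) or y_class not in (low, high):
        return None
    x_part = "high" if x_class == high else "low"
    y_part = "high" if y_class == high else "low"
    return f"{x_part} obesity, {y_part} inactivity"


# Hypothetical class breaks in percent; not the breaks used for Figure 12.
OBESITY_BREAKS = [28.0, 32.0]
INACTIVITY_BREAKS = [24.0, 29.0]

# Hypothetical county values: (percent obese, percent inactive).
counties = {"A": (35.1, 31.0), "B": (25.0, 20.5), "C": (30.0, 26.0)}
for name, (obese, inactive) in counties.items():
    cell = (classify(obese, OBESITY_BREAKS),
            classify(inactive, INACTIVITY_BREAKS))
    print(name, cell, corner_label(*cell))
```

Only the corner cells receive a label here, mirroring the model's emphasis: the intermediate cells exist mainly to separate the corners.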
The range model (Figure 13) shows a diverging color sequence along the obesity axis, with sequential colors representing the range of inactivity in each obesity category. This map could be used to answer the following questions: Where are the areas of highest inactivity and highest obesity? What is the range of inactivity in the counties with the highest obesity? Note that in this map, counties with the highest inactivity, regardless of obesity category, have a high saturation. This shifts the readers’ focus strictly to the activity levels rather than how activity relates to obesity, which should be considered when categorizing data and designing a color scheme.
The diagonal model (Figure 14) shows both obesity and inactivity with sequential colors. The diagonal, representing a correlation between the two, follows a grayscale sequence, while the more saturated blue and orange corners represent non-correlation. The map could answer the following questions: Where are obesity and inactivity positively correlated? Where do obesity and inactivity differ? The most attention is drawn to the black areas of the map, which represent areas with high, correlated values of obesity and inactivity.
To illustrate how changing the color scheme can affect the outcome of the data interpretation, we inverted the “inactivity” dataset to an “activity” dataset for Figures 15–16, and switched the axes while keeping the color scheme the same. Now, the range model in Figure 15 illustrates activity levels on the x-axis with diverging colors, while a sequential scheme shows obesity along the y-axis. In contrast to Figure 13, this reclassification of inactivity to activity provides a more intuitive understanding of how activity levels may relate to obesity levels. Rising activity would seem to indicate lower obesity, rather than lower inactivity indicating lower obesity. This allows readers to answer questions such as: Where are the areas of lowest activity and highest obesity? What is the range of activity in the counties with higher percent obesity?
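The inversion and axis switch can be expressed as a small transformation on values and legend cells. The sketch below is illustrative only: it assumes that percent active is simply 100 minus percent inactive, that class breaks are mirrored along with the values so that class indices reverse exactly, and that a legend cell is written as (x class, y class). The function names are hypothetical.

```python
def invert_percent(inactive_pct):
    """Convert percent inactive to percent active, assuming they sum to 100."""
    return 100.0 - inactive_pct


def invert_class(class_index, n_classes):
    """Mirror a 0-based class index so the highest class becomes the lowest."""
    return n_classes - 1 - class_index


def to_activity_cell(cell, n_classes):
    """Map an (obesity, inactivity) cell to an (activity, obesity) cell.

    The inactivity class is mirrored into an activity class, and the
    two axes are exchanged so that activity is on x and obesity on y.
    """
    obesity_class, inactivity_class = cell
    return (invert_class(inactivity_class, n_classes), obesity_class)


print(invert_percent(27.5))            # percent active for 27.5% inactive
print(to_activity_cell((2, 0), 3))     # high obesity, low inactivity
```

Because the color scheme stays attached to the axes rather than to the variables, the same county can land in a differently colored cell after the transformation, which is why the resulting maps read differently.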
Figure 16 shows that a reclassification from “inactivity” to “activity,” though beneficial for the range model map, is detrimental for the diagonal model map, whose color scheme now represents an entirely different purpose. The new diagonal emphasizes any correlation between increased activity and increased obesity, which is both a very different perspective on the datasets than before, and a correlation that is largely absent from the map. However, the map can answer questions such as: Where are high activity levels and high obesity levels? Where are low activity levels and high obesity levels? Is there a positive correlation between increased activity and increasing obesity? Is there a lack of correlation between increased activity and increasing obesity?
To be effective, maps must be designed for their intended use. Bivariate and univariate maps share many of the same design challenges, but the multidimensional aspect of the former adds extra complexity. To address this, Trumbo (1981) proposed three types of bivariate choropleth maps, each with unique purposes and goals. Our work, under the guidance of Trumbo, extends his ideas by producing focal models, sample color palettes, and sample maps (Table 4). The three resulting focal models offer mapmakers bivariate choropleth options other than the single-purpose overlay scheme, and they lead to design choices that support a map’s intended purpose by highlighting the appropriate data. The focal model diagrams, and our sample materials, provide guidelines for the production of bivariate choropleth maps in a GIS production environment. We also presented a methodology to relate typical map user questions to the appropriate focal model, which can be used for cartographic decision making.
There is more work to be done to improve the design of bivariate choropleths, and future work could explore color palette selection, the role of color saturation, classification choices, the value of statistics in revealing data relationships, and mapping data uncertainty. We also urge researchers to consider external factors that could contribute to the clarity of these maps, including the size of geographic areas, display devices, and the user’s knowledge and experience with the data.
In this paper, we have provided a framework for linking types of bivariate maps with focal models, and thereby operationalized Trumbo’s (1981) principles of bivariate map design. Using this framework, GIS practitioners have a practical outline for how to create bivariate choropleth maps.
The authors would like to thank Bruce Trumbo for revisiting his original ideas and providing insight on our interpretations of his work. Amy Griffin, Daniel P. Huffman, and three anonymous reviewers provided valuable comments and suggestions. Our thanks to Brittany Gress for her support with graphics.
Bernard, Jürgen, Martin Steiger, Sebastian Mittelstädt, Simon Thum, Daniel Keim, and Jörn Kohlhammer. 2015. “A Survey and Task-Based Quality Assessment of Static 2D Colormaps.” Visualization and Data Analysis 2015, edited by David L. Kao, Ming C. Hao, Mark A. Livingston, and Thomas Wischgoll. Bellingham, WA: SPIE. https://doi.org/10.1117/12.2079841.
Bertin, Jacques. 1973. Semiologie Graphique, Second Edition. The Hague: Mouton-Gautier.
Brewer, Cynthia A. 1994. “Color Use Guidelines for Mapping and Visualization.” In Modern Cartography. Vol. 2: Visualization in Modern Cartography, edited by Alan M. MacEachren and D. R. Fraser Taylor, 123–147. New York: Elsevier Science, Inc. https://doi.org/10.1016/B978-0-08-042415-6.50014-4.
———. 2005. Designing Better Maps: A Guide for GIS Users. Redlands, CA: Esri Press.
Brewer, Cynthia A., Mark Harrower, Ben Sheesley, Andy Woodruff, and David Heyman. 2015. “ColorBrewer: Color Advice for Cartography.” Accessed December 22, 2015. http://www.ColorBrewer.org.
Bujack, Roxana, Terece L. Turton, Francesca Samsel, Colin Ware, David H. Rogers, and James Ahrens. 2017. “The Good, the Bad, and the Ugly: A Theoretical Framework for the Assessment of Continuous Colormaps.” IEEE Transactions on Visualization and Computer Graphics 24 (1): 923–933. https://doi.org/10.1109/TVCG.2017.2743978.
Caquard, Sébastien. 2015. “Cartography III: A Post-representational Perspective on Cognitive Cartography.” Progress in Human Geography 39 (2): 225–235. https://doi.org/10.1177/0309132514527039.
Carstensen, Lawrence W. 1984. “Perceptions of Variable Similarity on Bivariate Choropleth Maps.” The Cartographic Journal 21 (1): 23–29. https://doi.org/10.1179/caj.1984.21.1.23.
———. 1986. “Bivariate Choropleth Mapping: The Effects of Axis Scaling.” The American Cartographer 13 (1): 27–42. https://doi.org/10.1559/152304086783900158.
Carswell, C. Melody, and Christopher D. Wickens. 1990. “The Perceptual Interaction of Graphical Attributes: Configurality, Stimulus Homogeneity, and Object Integration.” Perception & Psychophysics 47: 157–168. https://doi.org/10.3758/BF03205980.
Dunn, Richard. 1989. “A Dynamic Approach to Two-variable Color Mapping.” The American Statistician 43 (4): 245–252. https://doi.org/10.1080/00031305.1989.10475669.
Elmer, Martin E. 2013. “Symbol Considerations for Bivariate Thematic Maps.” Proceedings of 26th International Cartographic Conference, Dresden, Germany. https://icaci.org/files/documents/ICC_proceedings/ICC2013/_extendedAbstract/278_proceeding.pdf.
Eyton, J. Ronald. 1984. “Map Supplement: Complementary-color, Two-variable Maps.” Annals of the Association of American Geographers 74 (3): 477–490. https://doi.org/10.1111/j.1467-8306.1984.tb01469.x.
Fienberg, Stephen E. 1979. “Graphical Methods in Statistics.” The American Statistician 33 (4): 165–178. https://doi.org/10.1080/00031305.1979.10482688.
Friendly, Michael. 2008. “The Golden Age of Statistical Graphics.” Statistical Science 23 (4): 502–535. https://doi.org/10.1214/08-sts268.
Halliday, Sandra M. 1987. Two-variable Choropleth Maps: An Investigation of Four Alternate Designs. Master’s Thesis, Memorial University of Newfoundland. http://research.library.mun.ca/id/eprint/890.
Jin, Hai, and Diansheng Guo. 2009. “Understanding Climate Change Patterns with Multivariate Geovisualization.” In 2009 IEEE International Conference on Data Mining Workshops, edited by Yucel Saygin, Jeffrey Xu Yu, Hillol Kargupta, Wei Wang, Sanjay Ranka, Philip S. Yu, and Xindong Wu, 217–222. Washington: IEEE. https://doi.org/10.1109/ICDMW.2009.91.
Kimball, Miles A., and Charles Kostelnick, eds. 2017. Visible Numbers: Essays on the History of Statistical Graphics. London: Routledge.
Leonowicz, Anna. 2006. “Two-variable Choropleth Maps as a Useful Tool for Visualization of Geographical Relationship.” Geografija 42 (1): 33–37. https://publications.lsmuni.lt/object/elaba:6210410.
MacEachren, Alan M. 1995. How Maps Work: Representation, Visualization, and Design. New York: Guilford Press.
MacEachren, Alan M., Monica Wachowicz, Robert Edsall, Daniel Haug, and Raymon Masters. 1999. “Constructing Knowledge from Multivariate Spatiotemporal Data: Integrating Geographical Visualization with Knowledge Discovery in Database Methods.” International Journal of Geographical Information Science 13 (4): 311–334. https://doi.org/10.1080/136588199241229.
Meyer, Morton A., Frederick R. Broome, and Richard H. Schweitzer Jr. 1975. “Color Statistical Mapping by the U.S. Bureau of the Census.” The American Cartographer 2 (2): 101–117. https://doi.org/10.1559/152304075784313250.
Monmonier, Mark. 1989. “Interpolated Generalization: Cartographic Theory for Expert Guided Feature Displacement.” Cartographica 26 (1): 43–64. https://doi.org/10.3138/V700-H680-077Q-6503.
———. 2006. “Cartography: Uncertainty, Interventions, and Dynamic Display.” Progress in Human Geography 30 (3): 373–381. https://doi.org/10.1191/0309132506ph612pr.
Nelson, Elisabeth S. 2000. “Designing Effective Bivariate Symbols: The Influence of Perceptual Grouping Processes.” Cartography and Geographic Information Science 27 (4): 261–278. https://doi.org/10.1559/152304000783547786.
Nyerges, Timothy L. 1991. “Analytical Map Use.” Cartography and Geographic Information Systems 18 (1): 11–22. https://doi.org/10.1559/152304091783805635.
Olson, Judy M. 1981. “Spectrally Encoded Two-variable Maps.” Annals of the Association of American Geographers 71 (2): 259–276. https://doi.org/10.1111/j.1467-8306.1981.tb01352.x.
Perdue, Nicholas A. 2013. “The Vertical Space Problem: Rethinking Population Visualizations in Contemporary Cities.” Cartographic Perspectives 74: 9–28. https://doi.org/10.14714/cp74.83.
Rheingans, Penny. 1997. “Dynamic Color Mapping of Bivariate Qualitative Data.” Proceedings: Visualization ’97: 159–166. New York: ACM. https://doi.org/10.1109/VISUAL.1997.663874.
Robertson, Philip K., and John F. O’Callaghan. 1986. “The Generation of Color Sequences for Univariate and Bivariate Mapping.” IEEE Computer Graphics and Applications 6 (2): 24–32. https://doi.org/10.1109/MCG.1986.276688.
Roth, Robert E., Andrew W. Woodruff, and Zachary F. Johnson. 2010. “Value-by-alpha Maps: An Alternative Technique to the Cartogram.” The Cartographic Journal 47 (2): 130–140. https://doi.org/10.1179/000870409X12488753453372.
Slocum, Terry A., Robert B. McMaster, Fritz C. Kessler, and Hugh H. Howard. 2005. Thematic Cartography and Geographic Visualization, Second Edition. Upper Saddle River, NJ: Pearson Prentice Hall.
Stevens, Joshua. 2015. Bivariate Choropleth Maps: A How-to Guide. Accessed June 24, 2015. http://www.joshuastevens.net/cartography/make-a-bivariate-choropleth-map.
Trumbo, Bruce E. 1981. “A Theory for Coloring Bivariate Statistical Maps.” The American Statistician 35 (4): 220–226. https://doi.org/10.1080/00031305.1981.10479360.
Tufte, Edward R. 1990. Envisioning Information. Cheshire, CT: Graphics Press.
———. 2001. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
Tukey, John W. 1980. “We Need Both Exploratory and Confirmatory.” The American Statistician 34 (1): 23–25. https://doi.org/10.1080/00031305.1980.10482706.
Tyner, Judith A. 2010. Principles of Map Design. New York: Guilford Press.
University of Wisconsin Population Health Institute. 2016. “Rankings Data and Documentation.” County Health Rankings and Roadmaps. Accessed December 18, 2019. https://www.countyhealthrankings.org/explore-health-rankings/rankings-data-documentation.
Wainer, Howard, and Carl M. Francolini. 1980. “An Empirical Inquiry Concerning Human Understanding of Two-variable Color Maps.” The American Statistician 34 (2): 81–93. https://doi.org/10.1080/00031305.1980.10483006.
Waters, Nigel. 2018. “GIS: History.” International Encyclopedia of Geography, edited by Douglas Richardson, Noel Castree, Michael F. Goodchild, Audrey Kobayashi, Weidong Liu, and Richard A. Marston. https://doi.org/10.1002/9781118786352.wbieg0841.pub2.
Ware, Colin. 2009. “Quantitative Texton Sequences for Legible Bivariate Maps.” IEEE Transactions on Visualization and Computer Graphics 15 (6): 1523–1529. https://doi.org/10.1109/TVCG.2009.175.
Wickens, Christopher D., John Lee, Yili Liu, and Sally Gordon Becker. 2004. An Introduction to Human Factors Engineering, Second Edition. Upper Saddle River, NJ: Prentice Hall.
Wyszecki, Günter, and W. S. Stiles. 1967. Color Science, Second Edition. New York: John Wiley & Sons.