The choice of the appropriate type of map depends on the aim of the representation and the nature of the indicator to be mapped.

Dot-density maps

In a dot-density map, each case that occurred in an area is represented as a dot on the map. Dot-density maps are best at representing the burden of disease by geographical area, expressed as a count of cases. For rare diseases, dot maps are effective at detecting clustered cases. Dot-density maps are not indicated for representing rates or other composite indicators. They do not require classifying values into ranges; therefore, no information is lost in mapping.

Very few surveillance systems accurately record the exact address of residence of cases, which would allow precise positioning of case dots. Usually, the smallest geographical area available in the surveillance data is used for mapping, and dots are randomly located within these administrative boundaries. Using randomly located dots on large geographical areas can be particularly misleading, as cases seem to have occurred homogeneously throughout the area, even in locations known to be sparsely populated. A note should be placed on the map to indicate that dots are randomly located within area boundaries. The dots should be small enough that they do not overlap when they are numerous. When the disease is frequent, a dot can be assigned to a number of cases, e.g. 1 dot = 10 cases. An example of a dot-density map is presented in figure 1 of the chapter Analysis by place characteristics.
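The random placement of dots within an administrative boundary can be sketched as follows. This is a minimal illustration and not the method of any particular mapping package: the function names, the rectangular example district and the choice to drop incomplete groups of cases are all assumptions made for the example.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_dots(polygon, n_cases, cases_per_dot=1, seed=None):
    """Return one randomly located dot per `cases_per_dot` cases.
    Hypothetical helper: incomplete groups of cases are dropped."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    n_dots = n_cases // cases_per_dot
    dots = []
    while len(dots) < n_dots:
        # Draw a candidate in the bounding box, keep it only if inside the area
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots

# Illustrative district boundary (made-up coordinates) with 57 cases, 1 dot = 10 cases
district = [(0, 0), (4, 0), (4, 3), (0, 3)]
dots = random_dots(district, n_cases=57, cases_per_dot=10, seed=1)
```

Because the positions carry no information beyond the area they fall in, a different seed gives an equally valid map, which is why the map should state that dots are randomly located.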

During outbreak investigations, dot maps can be used to represent the cases that occurred during the outbreak, plotted by place of exposure. If the outbreak occurs in a closed environment, such as a cruise ship, a prison or a nursing home, cases can be plotted on a floor plan of the area (figure 1).

Figure 1: Distribution of residents by clinical status, Nursing home X, Delaware, USA, 1992.

Choropleth maps

In choropleth maps, geographical areas, usually defined by administrative boundaries, are filled with a color or pattern to reflect the magnitude of the indicator to be mapped. Choropleth maps are popular because of their visual and intuitive appeal. However, careful thought should be given to designing the map, as the level of aggregation, the mode of classification, the number of categories and the choice of colors may hide meaningful spatial patterns or create artifactual ones.

Choice of the level of geographical aggregation

Whenever crude rates, age- and/or sex-specific rates or standardized rates are plotted, numerator and denominator data need to be aggregated by geographical area. Administrative boundaries are generally used since denominators are usually available in sufficient detail at that level. However, the administrative level should be chosen so that areas do not report too few cases, which would result in wide variations in rates. It is not advisable to calculate rates when fewer than 20 cases contribute to the numerator. Therefore, a balance needs to be struck between the size of the geographical area and the period during which cases are accumulated, in order to avoid such small figures.
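As a minimal sketch of this rule, the following computes a crude rate per area and leaves the rate undefined when the numerator falls below 20 cases. The function name, the district names and the figures are invented for illustration; only the threshold of 20 comes from the text.

```python
MIN_CASES = 20  # threshold stated in the text

def area_rates(cases, population, per=100_000, min_cases=MIN_CASES):
    """Return {area: crude rate per `per` population}.
    The rate is None when fewer than `min_cases` cases contribute to the numerator."""
    rates = {}
    for area, n in cases.items():
        if n < min_cases:
            rates[area] = None  # too few cases: rate would be unstable
        else:
            rates[area] = n / population[area] * per
    return rates

# Made-up example data
cases = {"District A": 45, "District B": 12}
population = {"District A": 150_000, "District B": 30_000}
rates = area_rates(cases, population)
# District A gets a rate of about 30 per 100,000; District B is left as None
```

When many areas come back as None, the text's advice applies: aggregate to a larger administrative level or accumulate cases over a longer period.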

Classification

Assigning a color or a pattern to a geographical area requires that data be organized in categories. Four to eight categories are generally used. Too few categories result in loss of information, while too many make the map difficult to read. In addition, there should not be too many categories when the map includes relatively few geographical units.

Most mapping software offers several approaches to classifying data. Selecting the appropriate classification method requires inspecting the distribution of values across geographical areas (figure 2). The most commonly used classification methods include:

Equal count classification

This type of classification attempts to assign an equal number of observations to each category. Quartiles of the distribution are used for four categories, quintiles for five. While this method is popular and commonly used, it is based solely on ranks, so it may place areas with close values in different categories and areas with dissimilar values in the same category. It is best used for evenly or normally distributed data.
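A rank-based sketch of equal count classification is given below. The function name and the example rates are assumptions for illustration, and real mapping software may handle ties and remainders differently.

```python
def equal_count_classes(values, n_classes=4):
    """Assign each value a class index 0..n_classes-1 so that classes hold
    (nearly) equal numbers of observations. Tied values are split by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes

# Made-up rates for eight areas
rates = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0, 9.1, 30.0]
print(equal_count_classes(rates, 4))  # prints [0, 0, 1, 1, 2, 2, 3, 3]
```

The output shows the drawback described above: 1.1 and 1.2 land in different classes, while 1.2 and 5.0 share one.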

Equal interval classification

This classification divides the range of the data into intervals of equal width. It is therefore indicated when the values of the geographical areas are evenly distributed, which results in a straight line on the distribution plot (figure 2) and an equal number of areas in each category. It should not be used when the distribution is skewed or includes outliers, since some categories may then not be represented on the map (as in figure 3, map 2).
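The following sketch shows equal interval classification and how a single outlier can leave categories empty. The function name and the example values are assumptions for illustration.

```python
def equal_interval_classes(values, n_classes=4):
    """Assign each value a class index 0..n_classes-1 using intervals of
    equal width between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    classes = []
    for v in values:
        if width == 0:
            classes.append(0)  # all values identical: single class
        else:
            # The maximum value would fall in class n_classes, so clip it
            classes.append(min(int((v - lo) / width), n_classes - 1))
    return classes

# Made-up rates with one outlier (40)
rates = [1, 2, 2, 3, 3, 4, 40]
print(equal_interval_classes(rates, 4))  # prints [0, 0, 0, 0, 0, 0, 3]
```

Here classes 1 and 2 contain no area at all, so two of the four legend categories would never appear on the map.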

Equal area classification

Equal area classification attempts to classify data so that each class covers an equal area on the map. Caution should be used with this method when population density varies greatly across areas: large, sparsely populated areas may fall into two different categories because of their size, even though their values are very similar.
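One possible way to implement this idea, offered only as a sketch, is to sort areas by value and cut classes on the cumulative share of map surface. The function name, the cut rule and the example figures are assumptions; mapping software may implement equal area classification differently.

```python
def equal_area_classes(values, areas, n_classes=4):
    """Assign class indices 0..n_classes-1 so that each class covers a
    roughly equal share of the total surface. `areas` holds the surface
    of each geographical unit, in any consistent unit."""
    total = sum(areas)
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    cumulative = 0.0
    for i in order:
        # The class depends on the share of surface accumulated before this unit
        classes[i] = min(int(cumulative / total * n_classes), n_classes - 1)
        cumulative += areas[i]
    return classes

# Made-up example: two large units with similar rates, two small units with high rates
rates = [1.0, 1.1, 5.0, 6.0]
surfaces = [50, 40, 5, 5]
print(equal_area_classes(rates, surfaces, 2))  # prints [0, 1, 1, 1]
```

The two large units, with rates of 1.0 and 1.1, end up in different classes, which is the pitfall described above.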

Mean and standard deviation classification

This classification method assigns categories on both sides of the mean using standard deviations (e.g. -1, -2, -3 standard deviations below the mean, +1, +2, +3 standard deviations above the mean). This is a statistical classification method, best indicated if the values follow a normal distribution. It shows how area values differ from the average value. It is not indicated when the distribution of values is skewed, as some categories may then not be represented on the map (as in figure 3, map 4).
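The sketch below is one reading of this method: each value is labeled with a signed band, where +1 means less than one standard deviation above the mean, +2 between one and two, and +3 two or more, with negative labels below the mean. The function name, the band convention and the use of the population standard deviation are assumptions made for the example.

```python
import statistics

def std_dev_classes(values):
    """Label each value with a signed band (-3..-1, +1..+3) according to
    its distance from the mean in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumed)
    classes = []
    for v in values:
        if sd == 0:
            classes.append(1)  # no spread: every value sits at the mean
            continue
        z = (v - mean) / sd
        band = min(int(abs(z)) + 1, 3)  # bands beyond 3 are merged into 3
        classes.append(band if z >= 0 else -band)
    return classes

# Made-up values with mean 5 and standard deviation 2
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev_classes(values))  # prints [-2, -1, -1, -1, 1, 1, 2, 3]
```

With skewed data, most areas would crowd into bands -1 and +1 on one side and a long tail on the other, leaving some bands unused.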

Natural breaks classification

This method attempts to group values by minimizing the variance within categories and maximizing the variance between categories. It tends to group similar values together to best represent the distribution.
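The idea of minimizing within-category variance can be sketched with a small dynamic program that finds the contiguous grouping of sorted values with the smallest total within-group sum of squared deviations. This is an illustrative implementation written for this text, not the algorithm of any specific mapping package; the function name and example values are assumptions, and it requires at least as many values as classes.

```python
def natural_breaks(values, n_classes):
    """Split sorted values into n_classes contiguous groups minimizing the
    total within-group sum of squared deviations from the group mean."""
    data = sorted(values)
    n = len(data)
    # Prefix sums of values and squared values
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations of data[i:j] (requires j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: smallest cost of splitting data[:j] into k groups
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i
    # Walk back through the split table to recover the groups
    groups = []
    j = n
    for k in range(n_classes, 0, -1):
        i = split[k][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]

# Made-up rates for eight areas
rates = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0, 9.1, 30.0]
print(natural_breaks(rates, 4))
# prints [[1.0, 1.1, 1.2], [5.0, 5.1], [9.0, 9.1], [30.0]]
```

Unlike a rank-based split of the same values, similar rates stay together and the outlier gets its own class.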

In conclusion, whenever values are evenly distributed, equal count or equal interval classifications can be used. When values are normally distributed, mean and standard deviation classification should be used. In other instances, natural breaks should be preferred, as this method makes no assumptions about the shape of the distribution. Figure 3 shows an example of the effect of the classification method selected.

Figure 2: Distribution of values and ranges of the classification methods, brucellosis notification in Lebanon, by district, 52 weeks rate, as of week 15 of 2003

1: quartiles, 2: equal interval, 3: equal area, 4: mean and standard deviation, 5: natural breaks

Figure 3: Distribution of brucellosis notification in Lebanon, by district, 52 weeks rate, as of week 15 of 2003

Grey shades and colors

Choropleth maps use analogy to represent the data. Caution should be used in assigning shades and colors to categories. Risks of transmission, expressed as crude, specific or standardized rates, are quantitative in nature. Therefore, a light-dark contrast should be preferred to represent them. A gradient of grey (black and white display) or of color (color display) such as yellow or red is indicated.
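A light-to-dark gradient for the categories can be generated by interpolating a grey level across the classes, as in this minimal sketch. The function name and the default lightest and darkest grey levels are arbitrary assumptions chosen for the example.

```python
def grey_ramp(n_classes, lightest=230, darkest=40):
    """Return n_classes hex colors from light grey (lowest class)
    to dark grey (highest class). Levels are 0-255 grey values."""
    colors = []
    for k in range(n_classes):
        # Position of class k along the ramp, from 0.0 to 1.0
        t = k / (n_classes - 1) if n_classes > 1 else 0.0
        level = round(lightest + t * (darkest - lightest))
        colors.append(f"#{level:02x}{level:02x}{level:02x}")
    return colors

colors = grey_ramp(5)
# colors[0] is the lightest shade (lowest rates), colors[-1] the darkest (highest rates)
```

Because the ramp varies only in lightness, the ordering of categories survives printing or photocopying in black and white.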

The cold-warm contrast ranging from cyan to orange can be used effectively to represent values below and above a mean, as exemplified in figure 4. While the cold-warm contrast is very effective in representing the opposition between high and low values, it does not appropriately represent the range of values when printed or duplicated in black and white. In addition, it may pose problems for people with color blindness.

Figure 4: Distribution of brucellosis in Lebanon, by district, 52 weeks rate, as of week 15 of 2003

Isopleth maps

Figure 5: Incidence of flu-like illness in France, 2002

Source: INSERM Unit U 444, Epidemiology and information sciences

Isopleth maps (figure 5) do not require aggregating values by geographical area but instead use the data themselves to define geographical areas with similar values. They are indicated to represent continuous data that vary progressively over geographical locations, e.g. temperature or elevation. In surveillance, isopleth maps are indicated to represent incidence when surveillance is based on a sample of sentinel sites. The boundaries of the shaded or colored areas are derived from the data using statistical methods such as kriging [1].
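Kriging itself is too involved for a short example. To illustrate the general principle of estimating a continuous surface from sentinel sites, the sketch below uses inverse distance weighting, a simpler deterministic interpolation that is not kriging and is not the method used for figure 5. The function name, site coordinates and incidence values are invented for the example.

```python
import math

def idw(x, y, sites, power=2):
    """Inverse-distance-weighted estimate at (x, y).
    `sites` is a list of (site_x, site_y, value) tuples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in sites:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a site: use its observed value
        w = 1.0 / d ** power  # closer sites weigh more
        num += w * value
        den += w
    return num / den

# Made-up sentinel sites: (x, y, incidence)
sites = [(0, 0, 10.0), (2, 0, 30.0), (0, 2, 20.0)]

# Estimate incidence on a coarse grid; contouring these values gives the isopleths
grid = [[idw(gx, gy, sites) for gx in range(3)] for gy in range(3)]
```

Each grid value lies between the lowest and highest observed site values, and areas of similar estimated incidence can then be shaded together regardless of administrative boundaries.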

Mapping place and time

Displaying the place and time characteristics of the distribution of a disease is a very effective way to grasp the dynamics of disease transmission. This is best achieved by plotting successive maps next to each other, as exemplified in figure 6.

Figure 6: Distribution of polio cases by district, Albania, April to September 1996

Additional information can be found in the "Handbook on geographic information system and digital mapping" from UNSTAT [2].

References


1. Carrat F, Valleron A-J. Epidemiologic Mapping Using the "Kriging" Method: Application to an Influenza-Like Illness Epidemic in France. Am. J. Epidemiol., 1992; 135:1293-1300.
2. http://unstats.un.org/unsd/publication/SeriesF/SeriesF_79E.pdf
3. See http://www.who.int/docstore/wer/pdf/2000/wer7538.pdf for detailed information about meningitis thresholds in Africa.