Frequency distribution table

As a first step in exploring the data, frequency distributions of variables are commonly summarised in tables. Frequency distribution tables can be used for both categorical and numerical variables; continuous variables should first be grouped into class intervals. A frequency distribution can show the actual number of observations falling within a given class (absolute frequency), the ratio of the absolute frequency to the total number of observations (relative frequency), or that ratio expressed as a percentage (percentage frequency).

A table is a data set organised in rows and columns. The simplest table has two columns: the first lists the categories into which the data are grouped, and the second shows the number of events or individuals falling in each category (absolute frequency). A third column may show the percentage of the total that each category represents (percentage frequency).

Table 1. Number of cases of disease X by age groups, among residents of sample-city, 2009

Age group (years)    Number of cases    Percentage (%)
0-9                  422                8.3
10-19                783                15.5
20-29                565                11.2
30-39                904                17.9
40-49                237                4.7
50-59                676                13.3
60-69                898                17.7
70-79                239                4.7
80 or more           120                2.4
Unknown              220                4.3
Total                5064               100.0
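
The three kinds of frequency described above can be derived from the case counts alone. The following is a minimal sketch, assuming the counts are held in a plain dictionary; the function name `frequency_table` is illustrative and not part of any library.

```python
# Absolute frequencies (number of cases) copied from Table 1.
counts = {
    "0-9": 422, "10-19": 783, "20-29": 565, "30-39": 904, "40-49": 237,
    "50-59": 676, "60-69": 898, "70-79": 239, "80 or more": 120,
    "Unknown": 220,
}


def frequency_table(counts):
    """Return rows of (category, absolute, relative, percentage) and the total."""
    total = sum(counts.values())
    rows = []
    for category, n in counts.items():
        relative = n / total  # relative frequency: share of all observations
        percentage = round(100 * relative, 1)  # percentage frequency, 1 decimal
        rows.append((category, n, relative, percentage))
    return rows, total


rows, total = frequency_table(counts)
for category, n, relative, percentage in rows:
    print(f"{category:<12}{n:>6}{relative:>8.3f}{percentage:>7.1f}")
print(f"{'Total':<12}{total:>6}{1:>8.3f}{100:>7.1f}")
```

Each percentage is computed from the unrounded ratio and rounded only for display, so the rounded values may not add up to exactly 100.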

Some guidance on how to create a good table:

  • The title should be concise and include the "what (or who), when and where" content of the table (i.e. description by person, place and time). The title should be preceded by a table number (e.g. Table II).
  • Each row and column should be clearly and concisely labelled. Units of measurement should be indicated (e.g. years, meters, cases/1000, etc.). Categories should be mutually exclusive.
  • Row and column totals should always be shown, as should missing or unknown information and any exclusions (as a separate row or in footnotes).
  • All abbreviations and codes should be explained in a footnote (e.g. OR = odds ratio).
  • The source of the data should be mentioned in a footnote unless these are original data.
  • Lines to separate columns are not needed. They are easily replaced by proper alignment and justification of columns. Horizontal lines should be kept to a strict minimum.
  • Any table and its attached footnotes should be self-explanatory. No additional text should be needed to understand the table.

The above table shows case counts and percentages of the total according to one variable (age group). Data can also be broken down by a second variable, or by several others. This can be illustrated as follows.

Table 2. Number of cases of disease X by age groups, sex and X/Y characteristic, among residents of sample-city, 2009

                               Number of cases
Age group (years)    Sex       X characteristic    Y characteristic    Total
0-9                  Male
                     Female
                     Total
10-19                Male
                     Female
                     Total
20-29                Male
                     Female
                     Total
(...)

Cumulative frequency distribution table

The cumulative frequency (CF) is the running total of the frequencies; the CF that corresponds to a given value is the sum of all the frequencies up to and including that value. A cumulative frequency distribution table adds columns giving the cumulative frequency and the cumulative percentage. For more information, see also frequency polygons.

Table 3. Number and cumulative frequency of cases of disease X by age groups, among residents of sample-city, 2009

Age group (years)    Number of cases    Percentage (%)    Cumulative frequency    Cumulative percentage (%)
0-9                  422                8.3               422                     8.3
10-19                783                15.5              1205                    23.8
20-29                565                11.2              1770                    35.0
30-39                904                17.9              2674                    52.8
40-49                237                4.7               2911                    57.5
50-59                676                13.3              3587                    70.8
60-69                898                17.7              4485                    88.6
70-79                239                4.7               4724                    93.3
80 or more           120                2.4               4844                    95.7
Unknown              220                4.3               5064                    100.0
Total                5064               100.0             5064                    100.0
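
The running total described above can be sketched in a few lines. This example assumes the Table 3 case counts are held in a list in table order and uses `itertools.accumulate` from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

age_groups = ["0-9", "10-19", "20-29", "30-39", "40-49",
              "50-59", "60-69", "70-79", "80 or more", "Unknown"]
cases = [422, 783, 565, 904, 237, 676, 898, 239, 120, 220]

total = sum(cases)
cumulative = list(accumulate(cases))  # running total of the frequencies

# The cumulative percentage is computed from the cumulative count rather than
# by adding up already-rounded percentages, so rounding error does not build up.
cumulative_pct = [round(100 * cf / total, 1) for cf in cumulative]

for group, n, cf, cpct in zip(age_groups, cases, cumulative, cumulative_pct):
    print(f"{group:<12}{n:>6}{cf:>7}{cpct:>7.1f}")
```

The last cumulative frequency always equals the total number of observations, which gives a quick check on the arithmetic.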

Contingency table

Contingency tables (also known as cross tabulations or cross tabs) are used to present and analyse the relation between two or more categorical variables in a matrix format. Cohort studies and case-control studies are classical methods used by epidemiologists to identify an association between an exposure and a disease. The crude results of such studies are frequently presented as 2-by-2 contingency tables. They can be illustrated as follows.

Table 4. General structure of a contingency table

             Disease present    Disease absent    Total
Exposed      a                  b                 a+b
Unexposed    c                  d                 c+d
Total        a+c                b+d               a+b+c+d

Cohort study

Table 5. Cases of disease X according to consumption of food X, among customers of restaurant Y, 29 February 2009

Consumption of food X    Total    Cases    Risk (%)    Risk ratio (RR)
Yes                      100      40       40.0        2.0
No                       50       10       20.0        Reference group
Total                    150      50       33.3
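
The risk in each exposure group is the number of cases divided by the group total, and the risk ratio divides the risk in the exposed by the risk in the reference (unexposed) group. A minimal sketch using the counts from Table 5 follows; the function names are illustrative.

```python
def risk(cases, total):
    """Risk (attack rate): proportion of a group that became a case."""
    return cases / total


def risk_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Risk in the exposed divided by risk in the unexposed (reference group)."""
    return risk(cases_exposed, total_exposed) / risk(cases_unexposed, total_unexposed)


# Counts from Table 5: 40 cases among the 100 customers who ate food X,
# 10 cases among the 50 who did not.
risk_yes = risk(40, 100)
risk_no = risk(10, 50)
rr = risk_ratio(40, 100, 10, 50)

print(f"Risk if exposed:   {100 * risk_yes:.1f} %")
print(f"Risk if unexposed: {100 * risk_no:.1f} %")
print(f"Risk ratio:        {rr:.1f}")
```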

Case-control study

Table 6. Cases of disease X and controls according to consumption of food X, among customers of restaurant Y, 29 February 2009

Consumption of food X    Cases    Controls    Odds ratio (OR)
Yes                      80       30          9.3
No                       20       70          Reference group
Total                    100      100
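
In a case-control study the odds ratio is the cross-product of the 2-by-2 table: exposed cases times unexposed controls, divided by exposed controls times unexposed cases. A minimal sketch using the counts from Table 6 follows; the function name is illustrative.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Cross-product ratio of a 2-by-2 table of cases and controls by exposure."""
    return (cases_exposed * controls_unexposed) / (controls_exposed * cases_unexposed)


# Counts from Table 6: 80 of 100 cases and 30 of 100 controls ate food X.
or_food_x = odds_ratio(
    cases_exposed=80, cases_unexposed=20,
    controls_exposed=30, controls_unexposed=70,
)
print(f"Odds ratio: {or_food_x:.1f}")
```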

Dummy table

Although epidemiologists cannot analyse data before they are collected, they usually prepare their analysis by designing dummy tables (empty shells) which will later hold the results. This is an important part of any plan of analysis. It helps ensure that the data to be collected will fit the study design, the hypothesis tested and the way questions are asked.

Table 7. Dummy table for food specific attack rates in a cohort study - cases of gastroenteritis according to consumption of specific food items and beverages, among customers of restaurant X, date

                     Have eaten (Exposed)         Did not eat (Non-exposed)
Food item            Cases    Total    Risk %     Cases    Total    Risk %     Risk ratio    95 % CI
Potato salad
Fruit salad
Tiramisu
Roasted chicken
Milk
Beer
(...)


Table 8. Dummy table for food-specific odds ratios in a case-control study - cases of gastroenteritis and controls according to consumption of specific food items and beverages, among customers of restaurant X, date

                     Cases exposed          Controls exposed
Food item            n      N      (%)      n      N      (%)      Odds ratio    (95 % CI)
Potato salad
Fruit salad
Tiramisu
Roasted chicken
Milk
Beer
(...)
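
A dummy table can also be prepared as an empty data structure before data collection and filled in once counts are available. The sketch below builds a shell for the cohort layout of Table 7. The column keys, the function names and the counts passed to `fill_row` are hypothetical, and the 95 % confidence interval uses the common log (Katz) approximation, which is an assumption here and not a method prescribed by the text.

```python
import math

FOOD_ITEMS = ["Potato salad", "Fruit salad", "Tiramisu",
              "Roasted chicken", "Milk", "Beer"]
COLUMNS = ["cases_exp", "total_exp", "risk_exp_pct",
           "cases_unexp", "total_unexp", "risk_unexp_pct",
           "risk_ratio", "ci_low", "ci_high"]


def make_shell(food_items):
    """Build the empty dummy table: one row per food item, every cell None."""
    return {item: dict.fromkeys(COLUMNS) for item in food_items}


def fill_row(row, cases_exp, total_exp, cases_unexp, total_unexp):
    """Fill one row from counts; assumes both case counts are above zero."""
    risk_exp = cases_exp / total_exp
    risk_unexp = cases_unexp / total_unexp
    rr = risk_exp / risk_unexp
    # Standard error of ln(RR), log (Katz) approximation.
    se = math.sqrt(1 / cases_exp - 1 / total_exp
                   + 1 / cases_unexp - 1 / total_unexp)
    row.update(
        cases_exp=cases_exp, total_exp=total_exp,
        risk_exp_pct=100 * risk_exp,
        cases_unexp=cases_unexp, total_unexp=total_unexp,
        risk_unexp_pct=100 * risk_unexp,
        risk_ratio=rr,
        ci_low=math.exp(math.log(rr) - 1.96 * se),
        ci_high=math.exp(math.log(rr) + 1.96 * se),
    )
    return row


shell = make_shell(FOOD_ITEMS)  # the dummy table, ready before data collection

# Hypothetical counts, for illustration only: 40 cases among 100 who ate the
# item and 10 cases among 50 who did not.
fill_row(shell["Potato salad"], 40, 100, 10, 50)

for item, row in shell.items():
    cells = ["" if v is None else f"{v:.1f}" for v in row.values()]
    print(f"{item:<16}" + "".join(f"{c:>8}" for c in cells))
```

Rows that have not been filled print as blank cells, which mirrors the empty shell used at the planning stage.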