As a first step in exploring data, the frequency distributions of variables are commonly summarised in tables. Frequency distribution tables can be used for both categorical and numerical variables; continuous variables must first be grouped into class intervals. A frequency distribution can show the number of observations that fall within a given class (absolute frequency), the ratio of the absolute frequency to the total number of observations (relative frequency), or that ratio expressed as a percentage (percentage frequency).
A table is a data set organised in rows and columns. The simplest table has two columns: the first lists the categories into which the data are grouped, and the second shows the number of events or individuals falling in each category (absolute frequency). A third column may show the percentage of the total that each category represents (percentage frequency).
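The three kinds of frequency can be computed from a simple list of observations. The sketch below is a minimal illustration using only the Python standard library; the age groups, the ten observations and the helper name `frequency_table` are all assumptions made for the example and are not taken from the tables in this text.

```python
from collections import Counter

# Hypothetical age groups for ten cases (illustrative data only).
age_groups = ["0-4", "5-14", "5-14", "15-44", "15-44",
              "15-44", "15-44", "45-64", "45-64", "65+"]

# Fixed display order so that rows follow the natural order of the classes.
CLASS_ORDER = ["0-4", "5-14", "15-44", "45-64", "65+"]


def frequency_table(values, order):
    """Return one row per class: (class, absolute, relative, percentage)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    for label in order:
        absolute = counts.get(label, 0)                  # absolute frequency
        relative = absolute / total if total else 0.0    # relative frequency
        rows.append((label, absolute, relative, 100 * relative))
    return rows


table = frequency_table(age_groups, CLASS_ORDER)

# Print the table with a total row.
print(f"{'Age group':<10}{'n':>4}{'Rel.':>8}{'%':>8}")
for label, absolute, relative, percentage in table:
    print(f"{label:<10}{absolute:>4}{relative:>8.2f}{percentage:>8.1f}")
total_n = sum(row[1] for row in table)
print(f"{'Total':<10}{total_n:>4}{1.0:>8.2f}{100.0:>8.1f}")
```

Listing the classes explicitly in `CLASS_ORDER` keeps empty classes in the table with a count of zero, which a plain tally of the observed values would silently drop.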
Table 1. Number of cases of disease X by age groups, among residents of sample-city, 2009
Some guiding text on how to create a good table:
The above table shows case counts, with percentages of the total, by a single variable (age group). The data could be broken down further by a second variable, or by several others. This can be illustrated as follows:
Table 2. Number of cases of disease X by age groups, sex and X/Y characteristic, among residents of sample-city, 2009
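As a rough sketch of how such a breakdown might be tabulated, the example below counts cases by age group and sex from a line list. The line list, the class labels and the choice to express percentages relative to the total number of cases are assumptions for illustration; a third variable could be handled the same way by extending the tuple used as the key.

```python
from collections import Counter

# Hypothetical line list: one (age_group, sex) pair per case (illustrative only).
cases = [
    ("0-14", "F"), ("0-14", "M"), ("0-14", "M"),
    ("15-44", "F"), ("15-44", "F"), ("15-44", "M"),
    ("45+", "F"), ("45+", "M"),
]

AGE_ORDER = ["0-14", "15-44", "45+"]
SEX_ORDER = ["F", "M"]

# Count every (age group, sex) combination in one pass.
cell_counts = Counter(cases)
total = len(cases)

# Header: one "n (%)" column per sex, then the row total.
headers = [f"{sex} n (%)" for sex in SEX_ORDER]
print(f"{'Age group':<10}" + "".join(f"{h:>14}" for h in headers) + f"{'Total':>8}")

for age in AGE_ORDER:
    row_total = 0
    cells = ""
    for sex in SEX_ORDER:
        n = cell_counts[(age, sex)]          # missing combinations count as 0
        row_total += n
        cell = f"{n} ({100 * n / total:.1f})"
        cells += f"{cell:>14}"
    print(f"{age:<10}{cells}{row_total:>8}")
```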
The cumulative frequency (CF) is the running total of the frequencies: the CF that corresponds to a given value is the sum of all the frequencies up to and including that value. A cumulative frequency distribution table adds columns that give the cumulative frequency and the cumulative percentage. For more information, see also frequency polygons.
Table 3. Number and cumulative frequency of cases of disease X by age groups, among residents of sample-city, 2009
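The running total described above can be sketched with `itertools.accumulate` from the Python standard library. The age groups and frequencies below are hypothetical values chosen for illustration, not the contents of Table 3.

```python
from itertools import accumulate

# Hypothetical absolute frequencies per age group, listed in class order.
age_groups = ["0-4", "5-14", "15-44", "45-64", "65+"]
frequencies = [1, 2, 4, 2, 1]

total = sum(frequencies)

# Running total: each entry is the sum of all frequencies up to and including it.
cumulative = list(accumulate(frequencies))

# Cumulative percentage: cumulative frequency as a share of all observations.
cumulative_pct = [100 * cf / total for cf in cumulative]

print(f"{'Age group':<10}{'n':>4}{'CF':>5}{'Cum. %':>8}")
for group, n, cf, pct in zip(age_groups, frequencies, cumulative, cumulative_pct):
    print(f"{group:<10}{n:>4}{cf:>5}{pct:>8.1f}")
```

The last cumulative frequency always equals the total number of observations, and the last cumulative percentage is 100, which makes a quick check on the table.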
Contingency tables (also known as cross tabulations or cross tabs) are used to present and analyse the relation between two or more categorical variables in a matrix format. Cohort studies and case-control studies are classical methods used by epidemiologists to identify an association between an exposure and a disease. The crude results of such studies are frequently presented as 2-by-2 contingency tables. They can be illustrated as follows.
Table 5. Cases of disease X according to consumption of food X, among customers of restaurant Y, 29 February 2009
Consumption of food X
Table 6. Cases of disease X and controls according to consumption of food X, among customers of restaurant Y, 29 February 2009
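A 2-by-2 table of this kind can be built by counting the four combinations of exposure and outcome. The sketch below assumes a cohort-style layout with exposure in the rows and illness in the columns; the records, the counts and the function name `two_by_two` are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

# Hypothetical cohort records: (ate_food_x, became_ill) per customer.
records = (
    [(True, True)] * 6 + [(True, False)] * 4 +
    [(False, True)] * 2 + [(False, False)] * 8
)


def two_by_two(pairs):
    """Return the cell counts a, b, c, d (exposure in rows, outcome in columns)."""
    counts = Counter(pairs)
    a = counts[(True, True)]      # exposed, ill
    b = counts[(True, False)]     # exposed, not ill
    c = counts[(False, True)]     # not exposed, ill
    d = counts[(False, False)]    # not exposed, not ill
    return a, b, c, d


a, b, c, d = two_by_two(records)

# Print the table with row and column totals.
print(f"{'':<16}{'Ill':>6}{'Not ill':>9}{'Total':>7}")
print(f"{'Ate food X':<16}{a:>6}{b:>9}{a + b:>7}")
print(f"{'Did not eat':<16}{c:>6}{d:>9}{c + d:>7}")
print(f"{'Total':<16}{a + c:>6}{b + d:>9}{a + b + c + d:>7}")
```

For a case-control study the same counting applies, with the outcome columns read as cases and controls rather than ill and not ill.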
Although epidemiologists cannot analyse data before they are collected, they usually prepare their analysis by designing dummy tables (empty shells) that will later appear in the results. This is an important part of any plan of analysis. It helps to ensure that the responses obtained will fit the study design, the hypothesis tested and the way the questions are asked.
Table 7. Dummy table for food specific attack rates (AR) in a cohort study - cases of gastroenteritis according to consumption of specific food items and beverages, among customers of restaurant X, date
Column groups: Have eaten (Exposed); Did not eat (Non-exposed)
Table 8. Dummy table for food specific attack rates in a case-control study - cases of gastroenteritis according to consumption of specific food items and beverages, among customers of restaurant X, date
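A dummy table can also be prepared programmatically before any data exist. The following is a minimal sketch of an empty shell for a cohort study; the food items, the column names and the helper functions `make_dummy_table` and `render` are hypothetical, and the column layout is an assumption rather than a reproduction of Table 7.

```python
# Hypothetical food items; a real shell would list the items on the questionnaire.
FOOD_ITEMS = ["Food A", "Food B", "Beverage C"]

# Assumed column layout for a cohort-study shell: totals, cases and attack
# rate (AR) among those who ate each item and among those who did not.
COLUMNS = ["Exp. total", "Exp. cases", "Exp. AR%",
           "Non-exp. total", "Non-exp. cases", "Non-exp. AR%"]


def make_dummy_table(items, columns):
    """Return an empty shell: one dict per item, with every cell set to None."""
    return {item: {col: None for col in columns} for item in items}


def render(shell, columns, width=16):
    """Render the shell as text lines; empty cells are shown as blanks."""
    lines = [f"{'Item':<12}" + "".join(f"{col:>{width}}" for col in columns)]
    for item, cells in shell.items():
        row = ""
        for col in columns:
            value = cells[col]
            text = "" if value is None else str(value)
            row += f"{text:>{width}}"
        lines.append(f"{item:<12}{row}")
    return lines


shell = make_dummy_table(FOOD_ITEMS, COLUMNS)
for line in render(shell, COLUMNS):
    print(line)
```

Once data are collected, each `None` can be replaced with the observed value. A shell for a case-control study would use different columns, for example the numbers of cases and controls exposed to each item.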