Effect Modification and Confounding

Last modified at 9/15/2011 10:25 PM by Arnold Bosman

An epidemiological study can be conducted to investigate the cause of a disease in a certain population, attempting to quantify an observed association between exposures and disease outcomes [1]. Determining the effect of an exposure on disease occurrence within a population ideally requires comparing disease occurrence in an 'exposed group' with disease occurrence in that same group in the absence of the exposure. As this is impossible (once people are exposed, we can no longer study how disease would have occurred among those same individuals without the exposure), we usually compare the incidence in an exposed group with that in a similar, yet unexposed, group. If the incidence among the unexposed is the same as the incidence the exposed would have had without the exposure, then the straightforward comparison is justified. If not, then the comparison is confounded and bias is introduced.


If life were truly simple, then to measure the association between exposure and outcome (expressed as relative risk, odds ratio, vaccine effectiveness, etc.) it would be enough to measure the distribution of the exposure and the outcome of interest in a population and present these variables in a single two-by-two table.
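As a minimal sketch of how these measures come out of a single two-by-two table, the Python snippet below computes a relative risk and an odds ratio. The cell counts and the function names are hypothetical, chosen only for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a: exposed, diseased      b: exposed, not diseased
    c: unexposed, diseased    d: unexposed, not diseased
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio of the two-by-two table."""
    return (a * d) / (b * c)


# Hypothetical counts, not taken from any real study.
a, b, c, d = 30, 70, 10, 90

rr = relative_risk(a, b, c, d)  # (30/100) / (10/100)
odds = odds_ratio(a, b, c, d)   # (30*90) / (70*10)

print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

With these counts the relative risk is 3.0 and the odds ratio is about 3.86. The two measures diverge here because the disease is not rare in this illustrative table.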

However, life is always more complex; there are 'third variables' that can distort (confound) or modify the effect in our study. In some studies there may be many of these third variables: those that distort the association are called confounders, and those that modify it are called effect modifiers.


Serious problems can arise if confounding and effect modification are not considered at all stages: designing a study, analysing the data, interpreting the findings [2].


If a factor is known to be associated with both the exposure and the outcome in a study, such a 'third variable' is considered to be a confounder. Unless we correct for this confounding variable, our measure of association (e.g. RR or OR) will be distorted, leading to over- or underestimation of the true effect. In some instances, confounding may even reverse the apparent direction of the effect.
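The distortion can be illustrated with a small, entirely hypothetical cohort. In the sketch below the third variable is associated with both the exposure (most exposed people carry it) and the outcome (risk is 0.4 when it is present and 0.1 when it is absent). Within each level of the third variable the exposure has no effect, yet the crude comparison suggests one. All counts and names are assumptions made for illustration.

```python
def relative_risk(a, b, c, d):
    """a, b: exposed diseased / not diseased; c, d: the same for the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical cohort counts (a, b, c, d) for each level of the third variable.
strata = {
    "confounder present": (40, 60, 8, 12),   # risk 0.4 in exposed and unexposed
    "confounder absent": (2, 18, 10, 90),    # risk 0.1 in exposed and unexposed
}

# Relative risk within each level of the third variable.
stratum_rr = {name: relative_risk(*cells) for name, cells in strata.items()}

# Crude table: ignore the third variable and add the strata cell by cell.
crude_cells = [sum(cells[i] for cells in strata.values()) for i in range(4)]
crude_rr = relative_risk(*crude_cells)

for name, value in stratum_rr.items():
    print(f"{name}: RR = {value:.2f}")
print(f"crude: RR = {crude_rr:.2f}")
```

Here both stratum-specific relative risks are 1.0 while the crude relative risk is about 2.33, an overestimate produced purely by the uneven distribution of the third variable.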

There are two main ways to account for confounding variables in the analysis:

Stratification of data 

An association may be seen between age at first birth and carcinoma of the breast. There is also a perceived association between the number of children a woman bears and carcinoma of the breast; however, women who have their first child earlier are more likely to have larger families. Therefore, the data must be separated so that, for example, only women who have had one child are compared with each other, and the risks calculated according to age at first birth. Note that not all the data collected will be used if this is the plan of analysis.

Unless the association between the exposure and the disease outcome varies markedly between the strata, the evidence from the different strata can be combined into a single summary of the association: one RR value or one OR value. Some strata may include more individuals than others and will therefore give a more precise estimate of the association. The average of the associations observed across all strata is therefore weighted towards the most precise; the most widely used weighting scheme is that proposed by Mantel and Haenszel.
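A minimal sketch of the Mantel-Haenszel summary odds ratio is given below. For each stratum i with cells a, b, c, d and total n, it sums a*d/n in the numerator and b*c/n in the denominator. The counts are hypothetical and the function name is chosen here for illustration only.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata defined by the levels of one confounding variable.
strata = [
    (10, 40, 20, 160),  # stratum-specific OR = (10*160) / (40*20) = 2.0
    (80, 40, 20, 20),   # stratum-specific OR = (80*20) / (40*20) = 2.0
]

# Crude OR from the collapsed table, for comparison.
a, b, c, d = (sum(table[i] for table in strata) for i in range(4))
crude_or = (a * d) / (b * c)

summary_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {summary_or:.2f}")
```

In this example the crude odds ratio is about 5.06, while the weighted summary recovers the value of 2.0 that is common to both strata.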

Multivariable analysis 

Logistic regression can be used to adjust simultaneously for the effects of more than one confounding variable. Similar methods can be used for data from cohort studies.
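The idea of simultaneous adjustment can be written as a model. The notation below is assumed here rather than taken from the source: D is disease status, E the exposure, and C_1, ..., C_k the confounding variables.

```latex
\operatorname{logit} P(D = 1 \mid E, C_1, \dots, C_k)
  = \beta_0 + \beta_1 E + \gamma_1 C_1 + \dots + \gamma_k C_k,
\qquad
\mathrm{OR}_{\text{adjusted}} = e^{\beta_1}
```

Under this model, the exponential of the exposure coefficient is the odds ratio for the exposure with the confounding variables held fixed.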

Effect Modification      

Using an adjusted/weighted odds ratio implies that the observed association between exposure and disease is really the same in each of the strata, once the strata are defined by the levels of the confounding variables. However, this is not always the case, and where it is not, it makes no sense to present a single summary of the association. If the effect of the exposure on the disease differs according to the level of the third variable, then we say that this variable is actually an effect modifier. In this event, it is appropriate to present separate measures of association (RR or OR) for each level of the effect modifier. Interaction and "heterogeneity between strata" are frequently used as though synonymous with effect modification, though they do differ.
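As a hedged illustration, the sketch below reports a separate odds ratio for each of two strata and then compares them on the log scale. The comparison is a Woolf-type check of homogeneity for exactly two strata (one degree of freedom), which is a technique assumed here and not one named in the source. The counts and names are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Cross-product ratio of a two-by-two table."""
    return (a * d) / (b * c)


def log_or_variance(a, b, c, d):
    """Woolf's approximate variance of the log odds ratio."""
    return 1 / a + 1 / b + 1 / c + 1 / d


# Hypothetical tables (a, b, c, d) for two levels of the third variable.
level_a = (40, 60, 10, 90)
level_b = (20, 80, 20, 80)

or_a = odds_ratio(*level_a)  # 3600 / 600
or_b = odds_ratio(*level_b)  # 1600 / 1600

# Squared difference of the log odds ratios, scaled by its variance.
difference = math.log(or_a) - math.log(or_b)
chi_square = difference ** 2 / (log_or_variance(*level_a) + log_or_variance(*level_b))

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"OR at level A = {or_a:.1f}, OR at level B = {or_b:.1f}")
print(f"homogeneity chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

With these counts the odds ratios are 6.0 and 1.0, and the small p-value suggests that a single pooled estimate would hide a real difference between the levels, so the stratum-specific values are the ones to report.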


1. Richard Farmer, Ross Lawrenson, David Miller. Epidemiology and Public Health Medicine, fifth edition. John Wiley & Sons Ltd, 2008.

2. Raj Bhopal. Concepts of Epidemiology: an integrated introduction to the ideas, theories, principles and methods of epidemiology. Oxford University Press, 2007.