Selection bias may occur in cohort studies if the exposed and unexposed groups are not truly comparable, e.g. when an occupational cohort is compared with the general population.
Selection biases in cohort studies include: healthy worker effect, diagnostic bias, non-response bias and loss to follow-up.
The healthy worker effect (HWE) is a selection bias that leads to underestimation of the mortality/morbidity related to occupational exposures. It arises because the workforce is healthier than the general population (which includes people who are too sick to work), so a direct comparison of the workforce with the general population is biased. It is a particular problem in studies of occupational cohorts.
Paradoxically, the healthy worker effect often leads to lower observed mortality/morbidity rates in subjects exposed to workplace toxins than in the general population. Any excess risk associated with an occupation will therefore tend to be underestimated by a comparison with the general population, leading to an underestimate of the relative risk (RR) for the occupational exposure and disease.
The following table illustrates the incidence rate of disease X in an exposed group of workers compared with the incidence rate in the general population (see the 'Total' row in the table).
In this hypothetical example, the incidence rate observed among exposed workers is 1 case per 100 person-years compared with 1.4 cases per 100 person-years in the general population, suggesting that exposed workers have a lower rate of illness than the general population. The general population, however, is composed of two groups: people who are healthy enough to work (workers), and people who do not work, many of them because of ill-health (non-workers). The group that is too sick to work is included among the non-workers in the table, which gives non-workers a higher incidence than the remainder of the general population, i.e. current workers.
In the above example, the incidence rate among workers in the general population is the same as that among exposed workers at our study site. However, because non-workers in the general population have a rate five times that of workers, the overall rate in the general population is greater than that of exposed workers.
As a consequence, any study comparing rates of disease X between exposed workers and the general population would give a biased estimate (with the exposed workers having a substantially lower rate of disease X than the general population), due to the 'healthy worker effect' selection bias.
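The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the rates are those quoted in the text (read as cases per 100 person-years), while the function and variable names are invented here, and the share of person-time contributed by non-workers is not stated in the text but derived on the assumption that the overall rate is a person-time-weighted average of the worker and non-worker rates.

```python
# Rates in cases per 100 person-years, as quoted in the hypothetical example.
RATE_EXPOSED_WORKERS = 1.0                    # exposed workers at the study site
RATE_GENERAL_WORKERS = 1.0                    # workers in the general population
RATE_NON_WORKERS = 5.0 * RATE_GENERAL_WORKERS # "five times as great as workers"
RATE_GENERAL_TOTAL = 1.4                      # general population overall


def non_worker_share(rate_workers, rate_non_workers, rate_total):
    """Share p of person-time from non-workers, assuming
    rate_total = (1 - p) * rate_workers + p * rate_non_workers."""
    return (rate_total - rate_workers) / (rate_non_workers - rate_workers)


def rate_ratio(rate_index, rate_reference):
    """Incidence rate ratio of the index group relative to the reference group."""
    return rate_index / rate_reference


share = non_worker_share(RATE_GENERAL_WORKERS, RATE_NON_WORKERS, RATE_GENERAL_TOTAL)
biased_ratio = rate_ratio(RATE_EXPOSED_WORKERS, RATE_GENERAL_TOTAL)
like_for_like_ratio = rate_ratio(RATE_EXPOSED_WORKERS, RATE_GENERAL_WORKERS)

print(f"Implied non-worker share of person-time: {share:.2f}")            # 0.10
print(f"Exposed workers vs general population:   {biased_ratio:.2f}")     # 0.71
print(f"Exposed workers vs working population:   {like_for_like_ratio:.2f}")  # 1.00
```

Under these assumptions, a non-worker share of only about 10% of person-time is enough to make the exposed workers appear to have roughly 29% less disease than the general population, even though their rate is identical to that of other workers.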
Two components of HWE bias have been suggested:
Factors that determine the size of the HWE bias have been identified for mortality studies (some of which may also affect this bias in morbidity studies), and include:
Efforts should be made to avoid bias from the HWE.
Diagnostic bias can also occur in cohort studies if the diagnosis depends on knowledge of the exposure status.
Example: in a cohort study of risk factors for mesothelioma, identification of mesothelioma rests on a difficult histological diagnosis, so histopathologists may be more likely to diagnose a biopsy as mesothelioma if a history of asbestos exposure is reported.
In a cohort study, non-response matters only if it is associated with both the exposure and the outcome/disease (see also non-response bias in case-control studies). Efforts should be made to prevent non-response bias.
Example: the table below illustrates the results of a hypothetical cohort study where the following scenarios occur:
Loss to follow-up bias reflects differences in completeness of follow-up between the comparison (exposure) groups, i.e. exposed and unexposed. It is a problem for cohort studies because the length of time a cohort needs to be followed can make it difficult to follow all subjects until the end of the study, e.g. because people move or contact is lost. If subjects are lost at random (in both exposure groups), this does not create loss to follow-up bias (we simply have a smaller study population on which to base our RR calculation, and wider confidence intervals).
Loss to follow-up bias occurs if loss to follow-up is associated with both exposure and outcome, e.g. if exposed cases are more likely to be lost. It behaves similarly to non-response bias in cohort studies. Differences in loss to follow-up between exposure groups can lead to bias because the people who are lost may be more (or less) likely to have developed the outcome of interest.
Example: in a cohort study looking at smoking as a risk factor for development of lung cancer, loss to follow-up bias occurs if smokers who have lung cancer are more likely to be lost to follow-up (e.g. if they are more likely to die from lung cancer) than non-smokers with lung cancer.
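The difference between random and differential loss to follow-up can be checked with simple arithmetic. The sketch below is illustrative only: the cohort sizes, risks and loss fractions are invented for the purpose (they do not come from any study or from the text above), and the function names are hypothetical.

```python
def remaining(cases, non_cases, loss_cases, loss_non_cases):
    """Return (cases, total subjects) still under follow-up after losing
    the given fractions of cases and non-cases."""
    kept_cases = cases * (1 - loss_cases)
    kept_non_cases = non_cases * (1 - loss_non_cases)
    return kept_cases, kept_cases + kept_non_cases


def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Assumed full cohort: 1,000 exposed (200 cases), 1,000 unexposed (100 cases).
EXPOSED = (200, 800)    # (cases, non-cases)
UNEXPOSED = (100, 900)

rr_true = risk_ratio(200, 1000, 100, 1000)

# Scenario A: 30% of every subgroup is lost, independent of exposure and outcome.
exp_a = remaining(*EXPOSED, loss_cases=0.3, loss_non_cases=0.3)
unexp_a = remaining(*UNEXPOSED, loss_cases=0.3, loss_non_cases=0.3)
rr_random_loss = risk_ratio(*exp_a, *unexp_a)

# Scenario B: half of the exposed cases are lost; nobody else is lost.
exp_b = remaining(*EXPOSED, loss_cases=0.5, loss_non_cases=0.0)
unexp_b = remaining(*UNEXPOSED, loss_cases=0.0, loss_non_cases=0.0)
rr_differential_loss = risk_ratio(*exp_b, *unexp_b)

print(f"True RR:                   {rr_true:.2f}")               # 2.00
print(f"RR with random loss:       {rr_random_loss:.2f}")        # 2.00
print(f"RR with differential loss: {rr_differential_loss:.2f}")  # 1.11
```

With these assumed numbers, random loss leaves the RR at 2.0 (only the sample size shrinks), whereas losing exposed cases preferentially pulls the observed RR down towards 1.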
1. Bailey L, Vardulaki K, Langham J, Chandramohan D. Introduction to Epidemiology. Black N, Raine R, editors. London: Open University Press in collaboration with LSHTM; 2006.
2. Rothman KJ. Epidemiology - An Introduction. New York: Oxford University Press; 2002.
3. Le Moual N, Kauffmann F, Eisen EA, Kennedy SM. The healthy worker effect in asthma: work may cause asthma, but asthma may also influence work. Am J Respir Crit Care Med. 2008 Jan 1; 177(1):4-10. Epub 2007 Sep 13.
4. Baillargeon J. Characteristics of the healthy worker effect. Occup Med. 2001 Apr-Jun;16(2):359-66.
5. Giesecke J. Modern Infectious Disease Epidemiology. 2nd ed. London: Arnold; 2002.