Field Epidemiology Manual Wiki

Molecular epidemiology

Last modified at 4/21/2016 7:41 PM by Vladimir Prikazsky

1. Molecular epidemiology of infectious diseases.

Different definitions of molecular epidemiology can be found in the literature. In the field of infectious diseases, molecular epidemiology has been defined as the use of molecular typing methods for infectious agents in order to study the distribution, dynamics, and determinants of health and disease in human populations [1,2]. In particular, molecular epidemiology of infectious diseases combines traditional epidemiological methods with analysis of genome polymorphisms of pathogens over time, place and person across human populations and relevant reservoirs, to study host–pathogen interactions and infer hypotheses about host-to-host or source-to-host transmission [3,4,5].

Molecular typing methods are used to study the genomic organisation and evolution of pathogens, to identify patterns of infection and sources of transmission, as a standard component of epidemiological surveillance of infectious diseases, and to support outbreak investigations. Of particular interest is the application of these typing methods to study markers associated with pathogenicity and antibiotic resistance (see Roles for microbial typing in HCAI prevention and control and Antimicrobial Stewardship). Notably, the results of studies using molecular typing methods (such as laboratory-based surveillance) do not substitute for a comprehensive epidemiological investigation (such as patient-based surveillance); laboratory studies and epidemiological studies should be analyzed in parallel, and their results integrated as complementary components of the overall investigation [6].

Traditional phenotypic typing systems, based on microbiological, biochemical, serological, and physiological characteristics, have been widely used (see Phenotypic typing). Genotyping methods, which examine the relatedness of isolates at the molecular level, have improved the ability to differentiate among bacterial types or subtypes and are now used all over the world (see Overview of molecular typing methods).

2. Comparative and library typing systems.

Usually, typing methods can be described as comparative or library typing systems:

  1. in the first approach (comparative), mainly used for outbreak investigation, a set of outbreak-related and unrelated isolates is tested to identify outbreak-related strains and to distinguish epidemic from endemic or sporadic isolates. In this approach, comparison between outbreak-related isolates and isolates collected at other times, past or future, is not relevant. Generally, comparative systems produce meaningful results only in a local context, where they separate closely related isolates from those with significantly different genomic backgrounds;

  2. by contrast, library typing methods (or definitive typing), which use more stable genotypic markers, are useful to compare strains from a current outbreak with previously circulating strains in order to monitor clonal spread and distribution in different populations over extended periods of time. Library typing methods can be used in different laboratories at different times, generating data that can be aggregated in a single database and compared in detail at any time, in long-term retrospective and prospective multicenter studies as well as in epidemiological surveillance studies. Library typing methods therefore need to be robust and sufficiently standardized; for this reason, various international networks have developed databases of molecular typing data in order to standardize library typing methods (see Table 1).

Comparative and library typing are not intrinsic properties of a method but alternative ways of using it. For example, PFGE may be used as a comparative typing system in outbreak investigations and as a library typing system in surveillance of infectious diseases [3,4,7].

3. Overview of molecular typing methods.

In recent years, different molecular typing methods have been developed and used all over the world. The selection of an appropriate molecular typing method depends essentially on the question to be answered (particularly in epidemiological surveillance of infectious diseases, including healthcare-associated infections, and in outbreak investigation), on the level at which typing is being used, on the epidemiological context, and on the time and geographical scale of its use. Typing methods need to be evaluated and validated with respect to a number of criteria (see Criteria for assessing microbial typing systems). Guidelines and general criteria have been proposed for interpreting the results obtained [3,4,8].
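Discriminatory power, one of the properties reported for each method in Table 1, is commonly quantified with Simpson's index of diversity in the form proposed by Hunter and Gaston: D = 1 - [1 / (N(N-1))] * sum over types j of n_j(n_j - 1), where N is the number of unrelated isolates tested and n_j is the number of isolates assigned to type j. The formula is not given in this article and is supplied here as an assumption about how this criterion is usually computed; the function name and the example type labels are hypothetical. A minimal Python sketch:

```python
from collections import Counter


def discriminatory_index(types):
    """Simpson's index of diversity (Hunter-Gaston form) for a list of type labels.

    Returns the probability that two isolates drawn at random, without
    replacement, from the tested set are assigned to different types.
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are needed")
    counts = Counter(types)  # number of isolates per type
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Hypothetical typing results for four unrelated isolates.
example_types = ["A", "A", "B", "C"]
print(round(discriminatory_index(example_types), 3))
```

A value of 1 means that every isolate received its own type, and 0 means that all isolates were indistinguishable. The index depends on the collection of isolates tested, so it is only comparable between methods when they are applied to the same set.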

Based on recently published reviews [8,9], the Table below reports the characteristics, advantages, and limits of the main molecular typing methods currently used in outbreak investigations and in epidemiological surveillance studies. Some useful links to websites and/or online databases for bacterial typing are also reported. In addition, the STROBE statement has been extended to improve the reporting of molecular epidemiology for infectious diseases (STROME-ID) [2].

Table 1. Main molecular typing methods.

| Molecular typing method | Characteristics | Advantages | Limits | Useful links/database |
| --- | --- | --- | --- | --- |
| Pulsed-field gel electrophoresis (PFGE) | Whole genome restriction polymorphism | Excellent discriminatory power; High intra- and inter-laboratory reproducibility; High epidemiological concordance; Moderate cost | Limited ease of use; Not rapid; Limited portability; Moderate interpretation; Low resolution for similar fragments size | |
| Amplified fragment length polymorphism (AFLP) | Selective PCR amplification of a subset of restriction fragments | Excellent discriminatory power; High reproducibility | Limited ease of use; Not rapid; High cost | |
| Random Amplification of Polymorphic DNA (RAPD) | PCR amplification of random segments of genomic DNA with single primer of arbitrary nucleotide sequence | High rapidity; Ease of use; Low cost | Low discriminatory power; Low intra-laboratory reproducibility | |
| Repetitive-element polymerase chain reaction (rep-PCR) | PCR amplification of non-coding intergenic repetitive sequences | High rapidity; High discriminatory power; Ease of use; Low cost | Low inter-laboratory reproducibility (improved by semi-automated commercial systems) | |
| Variable-Number Tandem Repeat (VNTR) typing and Multilocus VNTR analysis | PCR amplification of polymorphisms of genomic variable number tandem repeat elements | Excellent reproducibility; High discriminatory power; Ease of use; High rapidity; Moderate cost | Moderate inter-laboratory reproducibility | |
| Single Locus Sequence Typing (SLST) | Sequencing of single target gene | High discriminatory power for some species (e.g. spa-typing for S. aureus); Ease of use; High rapidity; Moderate cost | Potential misclassification of particular types, due to recombination and/or homoplasy | |
| Multilocus sequence typing (MLST) | Sequencing of allelic variants of 7 housekeeping genes | Excellent reproducibility; Standard nomenclature; High discriminatory power (not for all species) | Limited ease of use; Not rapid; Limited accessibility; High cost | |
| Comparative genomic hybridisation (CGH): microarrays | Labelled cDNA/RNA, hybridized with specific probes | High throughput technique; Simultaneous genotyping and profiling | Poor accessibility; The intra- and inter-laboratory reproducibility of microarray data needs to be established prior to the application; High cost | |
| Whole Genome - Next generation Sequencing (WG-NGS) | Sequencing of multiple, overlapped regions | High throughput technique | Limited ease of use; Limited accessibility | |
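The MLST entry of Table 1 can be illustrated with a small sketch. Each isolate is reduced to an allelic profile (one allele number per housekeeping locus), and the profile is mapped to a sequence type (ST) through a shared lookup table, which is what gives the method a standard nomenclature and makes it suitable for library typing. The locus names, allele numbers, and ST labels below are invented placeholders and are not taken from any real MLST database.

```python
# Hypothetical locus names; a real scheme defines seven species-specific
# housekeeping genes.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

# Hypothetical profile-to-ST table standing in for a curated reference database.
PROFILE_TO_ST = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 1, 1, 1, 1, 2): "ST-B",
}


def allelic_profile(alleles):
    """Return the allele numbers of an isolate in the fixed locus order."""
    return tuple(alleles[locus] for locus in LOCI)


def assign_sequence_type(alleles, table=PROFILE_TO_ST):
    """Look up the ST of an isolate; profiles not in the table are 'novel'."""
    return table.get(allelic_profile(alleles), "novel")


def shared_alleles(alleles_a, alleles_b):
    """Count the loci at which two isolates carry the same allele number."""
    return sum(alleles_a[locus] == alleles_b[locus] for locus in LOCI)


# Toy isolates, each described by one allele number per locus.
isolate_1 = dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1)))
isolate_2 = dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 2)))
isolate_3 = dict(zip(LOCI, (3, 1, 1, 1, 1, 1, 2)))

print(assign_sequence_type(isolate_1))
print(assign_sequence_type(isolate_3))
print(shared_alleles(isolate_1, isolate_2))
```

Because the lookup table is independent of any single laboratory, profiles generated at different sites and times can be compared directly. The count of shared alleles gives a simple indication of how closely two profiles are related.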


4. Next Generation Sequencing (NGS).

In recent years, Next Generation Sequencing (NGS) technologies (also called second-generation sequencing or high-throughput sequencing) have revolutionized molecular typing by making it possible to obtain complete or nearly complete genome sequences (often approximately 90% of the entire genome) of thousands of strains (Whole Genome Sequencing, WGS).


WGS has already been used for the accurate identification of bacterial isolates, for the characterization of strains in large outbreaks at national and international levels, and to reveal the global genetic diversity of pathogens. It is expected that in the near future WGS will replace currently used typing methods. In particular, WGS can compare different genomes at single-nucleotide resolution, which allows a precise characterisation of cross-transmission episodes and outbreaks. In addition, WGS can be useful for defining phenotypic characteristics, such as the virulence or antibiotic resistance of a particular pathogen.
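Single-nucleotide resolution can be made concrete with a pairwise SNP distance, that is, the number of aligned positions at which two genomes differ. The following is a minimal Python sketch, assuming that the sequences have already been aligned to equal length and that positions carrying an ambiguous base or a gap are ignored. The isolate names and sequences are toy data, and no distance threshold for defining an outbreak is implied.

```python
from itertools import combinations

VALID_BASES = frozenset("ACGT")


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous,
    different base. Positions with N, gaps, or other symbols are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment):
    """Return {(name_a, name_b): distance} for every pair of isolates."""
    return {
        (name_a, name_b): snp_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Toy alignment of three hypothetical isolates.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACCTACNTAT",
}

for pair, distance in pairwise_snp_distances(alignment).items():
    print(pair, distance)
```

A distance matrix of this kind is one possible starting point for clustering isolates or building a phylogeny. In practice, read mapping, variant calling, and quality filtering precede this step and strongly influence the result.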


The main advantage of NGS over traditional Sanger sequencing is the ability to generate millions of reads in a single run at comparatively low cost. However, WGS is still too laborious and time-consuming to provide useful data in routine surveillance and in small research and clinical laboratories.

Notably, the development of NGS technologies has been accompanied by the generation of huge amounts of data, creating the need for web-based bioinformatics platforms for rapid data processing and analysis. NGS will not become widely available to health professionals, and its results cannot be applied in everyday clinical practice, until several bioinformatics challenges have been solved. The integration of genomic and epidemiological databases with NGS data will be the next frontier in bacterial epidemiology, empowering stakeholders in public health decisions [7,8].



  1. Hall A. What is molecular epidemiology? Trop Med Int Health 1996; 1: 407–08.

  2. Field N, Cohen T, Struelens MJ, et al. Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases (STROME-ID): an extension of the STROBE statement. Lancet Infect Dis 2014; 14: 341–52.

  3. Struelens MJ, De Gheldre Y, Deplano A. Comparative and library epidemiological typing systems: outbreak investigations versus surveillance systems. Infect Control Hosp Epidemiol 1998;19(8):565-9.

  4. Van Belkum A, Tassios PT, Dijkshoorn L, Haeggman S, Cookson B, Fry NK, et al. Guidelines for the validation and application of typing methods for use in bacterial epidemiology. Clin Microbiol Infect. 2007;13 Suppl 3:1-46.

  5. Struelens MJ, Brisse S. From molecular to genomic epidemiology: transforming surveillance and control of infectious diseases. Euro Surveill 2013;18(4):20386.

  6. Agodi A, Voulgari E, Barchitta M, Quattrocchi A, Bellocchi P, Poulou A, et al. Spread of a carbapenem- and colistin-resistant Acinetobacter baumannii ST2 clonal strain causing outbreaks in two Sicilian hospitals. J Hosp Infect. 2014; 86(4): 260-6. doi: 10.1016/j.jhin.2014.02.001.

  7. Boccia S, Pasquarella C, Colotto M, Barchitta M, Quattrocchi A, Agodi A, and the Public Health Genomics and GISIO Working Groups of the Italian Society of Hygiene, Preventive Medicine and Public Health (SItI). Molecular epidemiology tools in the management of healthcare-associated infections: towards the definition of recommendations. Epidemiol Prev. 2015; 39 (5): 21-26.
     Carriço JA, Sabat AJ, Friedrich AW, Ramirez M, on behalf of the ESCMID Study Group for Epidemiological Markers (ESGEM). Bioinformatics in bacterial molecular epidemiology and public health: databases, tools and the next-generation sequencing revolution. Euro Surveill. 2013;18(4):pii=20382.

  8. Sabat AJ, Budimir A, Nashev D, Sá-Leão R, van Dijl JM, Laurent F, et al, on behalf of the ESCMID Study Group of Epidemiological Markers (ESGEM). Overview of molecular typing methods for outbreak detection and epidemiological surveillance. Euro Surveill. 2013;18(4):20380.

  9. Ranjbar R, Karami A, Farshad S, Giammanco GM, Mammina C. Typing methods used in the molecular epidemiology of microbial pathogens: a how-to guide. New Microbiol 2014; 37:1-15.

Original contribution from:

Antonella Agodi, Dept of Medical and Surgical Sciences and Advanced Technologies “GF Ingrassia”, University of Catania, Catania, Italy.