Once the sources from which data are to be collected have been determined, a range of issues needs to be decided regarding the data to be collected and the principles of operation of the system. These issues may be classified under the following headings:

  • Data quality
  • Data governance
  • Operating procedures

The concept of setting quality standards for data, and measuring data against those standards, is well established for readily measurable aspects of data quality, particularly the validity, timeliness and completeness of data items. Simple quality standards can be set, and performance measured against them, such as maximum acceptable levels of missing items in particular data fields, and the mean, median or maximum acceptable time between event detection and report to the surveillance system.
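As an illustration, the two simple standards described above (missing items and reporting delay) could be checked along the following lines. This is a minimal sketch rather than a prescribed method: the field names (`onset_date`, `detected`, `reported`), the 5% threshold for missing items, and the sample records are all assumptions made for the example.

```python
from datetime import date
from statistics import mean, median

# Hypothetical case reports; None marks a missing data item.
reports = [
    {"onset_date": date(2024, 1, 2), "detected": date(2024, 1, 3), "reported": date(2024, 1, 4)},
    {"onset_date": None,             "detected": date(2024, 1, 5), "reported": date(2024, 1, 9)},
    {"onset_date": date(2024, 1, 6), "detected": date(2024, 1, 7), "reported": date(2024, 1, 9)},
    {"onset_date": date(2024, 1, 8), "detected": date(2024, 1, 8), "reported": date(2024, 1, 9)},
]

def missing_fraction(records, field):
    """Share of records in which `field` is missing (a completeness measure)."""
    return sum(r.get(field) is None for r in records) / len(records)

def reporting_delays(records):
    """Days between event detection and report (a timeliness measure)."""
    return [(r["reported"] - r["detected"]).days for r in records]

MAX_MISSING = 0.05  # assumed standard: at most 5% missing items per field

missing = missing_fraction(reports, "onset_date")
delays = reporting_delays(reports)

print(f"onset_date missing: {missing:.0%} (standard met: {missing <= MAX_MISSING})")
print(f"delay in days: mean {mean(delays)}, median {median(delays)}, max {max(delays)}")
```

In practice the thresholds would be taken from the agreed quality standards for the system, and the check would be run separately for each data field and reporting source.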

Quality criteria for data and information, and the systems that are used to process and deliver those data and information, are frequently expressed in terms of the following dimensions, each stated here as a quality target:

  • Information should be sufficiently complete to be fit for purpose
  • Information should be available when it is needed
  • Information should be sufficiently free from error to be fit for purpose
  • Information should be contextually appropriate
  • Provenance, objectivity, believability
  • Information should be formatted to satisfy users' needs

Aspects that are harder to measure but perhaps more important to overall information quality, such as accuracy and relevance, have sometimes been neglected in the setting of standards.

One approach to addressing the question of how to ensure that case reports are relevant and reliable is to develop case definitions for reporting of surveillance data. This is more likely to be feasible if the case definitions are developed as part of the overall process of developing a new surveillance system, when there is the opportunity to assess the acceptability and resource requirements for the reporting mechanisms, training of staff, and supporting investigations (e.g. to ensure that all suspected cases have appropriate laboratory investigations undertaken) required to ensure compliance with the adoption of the case definitions. As with any case definitions, those used for surveillance need to be defined so that they achieve the desired level of sensitivity and specificity in terms of case ascertainment. Case definitions used for surveillance will often need to be more sensitive (and less specific) for case ascertainment than those used in analytical epidemiological studies, since the purpose of surveillance is frequently to provide an early warning of possible emergence of disease outbreaks or rising trends, which can then be assessed through further epidemiological study.
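The trade-off between sensitivity and specificity described above can be made concrete with a small calculation. The sketch below compares a broad and a strict case definition against a reference standard; the definition labels and all of the counts are invented for illustration and do not come from any real surveillance system.

```python
def sensitivity(tp, fn):
    """Proportion of true cases that the case definition captures."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of non-cases that the case definition correctly excludes."""
    return tn / (tn + fp)

# Invented counts: tp/fn are true cases captured/missed by the definition,
# tn/fp are non-cases correctly excluded/wrongly included.
definitions = {
    "broad (suspected case)":  {"tp": 95, "fn": 5,  "tn": 700, "fp": 200},
    "strict (confirmed case)": {"tp": 70, "fn": 30, "tn": 890, "fp": 10},
}

results = {}
for name, c in definitions.items():
    results[name] = (sensitivity(c["tp"], c["fn"]), specificity(c["tn"], c["fp"]))
    sens, spec = results[name]
    print(f"{name}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With these assumed counts the broad definition misses few true cases but admits many non-cases, which is the pattern that suits early warning, while the strict definition shows the reverse.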

Retrospective implementation of case definitions, particularly where those definitions are based on specific laboratory investigations or the collection of specific exposure or risk factor data, can pose significant problems in terms of the cost and acceptability to reporters and surveillance system operators. It may, however, be possible to categorise data collected through pre-existing surveillance systems against case definitions that have been developed at a later stage, even if it is not possible in the short to medium term to adapt the systems to report according to a particular level of case definition.
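Categorising existing records against later case definitions might look something like the sketch below. The tiered levels ("confirmed", "probable", "possible"), the field names and the rules are hypothetical; a real scheme would follow the published case definitions for the disease concerned.

```python
from collections import Counter

def classify(record):
    """Assign the highest case-definition level that the available data support."""
    if record.get("lab_confirmed") is True:
        return "confirmed"
    if record.get("clinical_criteria_met") and record.get("epi_link"):
        return "probable"
    if record.get("clinical_criteria_met"):
        return "possible"
    # Legacy records may lack the items needed to apply any level.
    return "unclassifiable"

# Hypothetical records from a pre-existing system; absent keys are uncollected items.
legacy_records = [
    {"id": 1, "lab_confirmed": True, "clinical_criteria_met": True},
    {"id": 2, "clinical_criteria_met": True, "epi_link": True},
    {"id": 3, "clinical_criteria_met": True},
    {"id": 4},
]

levels = [classify(r) for r in legacy_records]
print(Counter(levels))
```

Keeping an explicit "unclassifiable" category makes visible how much of the historical data cannot be mapped to the new definitions, rather than silently treating those records as non-cases.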

Data collected should be relevant and sufficient to meet surveillance objectives, and should be restricted to the items that are required to meet those objectives. Additional, non-essential data items, which are often collected on a 'nice to know' basis rather than because they are justified in terms of meeting explicit objectives, place additional burdens on data providers and on the supporting information systems and, if the data set being reported is person identifying, may breach data protection restrictions.
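One way to enforce this restriction in an electronic system is to filter each incoming report against an agreed list of data items. The sketch below assumes a hypothetical set of required items and a hypothetical report; neither is taken from a real reporting scheme.

```python
# Assumed list of items justified by the surveillance objectives.
REQUIRED_ITEMS = {"report_id", "organism", "specimen_date", "age_group", "district"}

def minimise(record, required=REQUIRED_ITEMS):
    """Keep only the required items and list the names of any items dropped."""
    kept = {k: v for k, v in record.items() if k in required}
    dropped = sorted(set(record) - required)
    return kept, dropped

incoming = {
    "report_id": "R-001",
    "organism": "Salmonella",
    "specimen_date": "2024-01-09",
    "age_group": "15-44",
    "district": "North",
    "patient_name": "(not needed)",  # 'nice to know' items sent by the reporter
    "occupation": "(not needed)",
}

kept, dropped = minimise(incoming)
print("kept:", sorted(kept))
print("dropped:", dropped)
```

Logging the names of dropped items, without their values, gives system operators a way to see which non-essential items reporters are still submitting.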

Anonymised data should be used unless there is a need to identify individuals or there is no other reliable method of achieving record linkage between different data sets. In the case of infectious disease surveillance, it is often necessary to collect person identifying information so that cases can be contacted rapidly for follow up, contact tracing and/or outbreak investigations. When person identifying data are used they should be kept secure and disclosed only on a strict 'need to know' basis, and in accordance with data protection laws.
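Where record linkage is the only reason for holding identifiers, one possible alternative is to link on a key derived from the identifiers rather than on the identifiers themselves. The sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library; the identifier format, the normalisation rule and the secret are assumptions for illustration. Such keys are pseudonymised rather than anonymous, so whether this approach is acceptable depends on the applicable data protection law.

```python
import hashlib
import hmac

def linkage_key(secret, identifier, date_of_birth):
    """Derive a pseudonymous linkage key from identifiers with HMAC-SHA256."""
    # Normalise so that formatting differences do not prevent a match.
    normalised = f"{identifier.replace(' ', '').upper()}|{date_of_birth}"
    return hmac.new(secret, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder only: a real key would be generated and stored securely.
SECRET = b"replace-with-a-securely-stored-key"

# The same hypothetical person recorded with different formatting in two data sets.
lab = {"key": linkage_key(SECRET, "ab 123 456", "1980-05-17"), "organism": "Salmonella"}
clinic = {"key": linkage_key(SECRET, "AB123456", "1980-05-17"), "admitted": True}

linked = lab["key"] == clinic["key"]
print("records link:", linked)
```

Anyone holding the secret could regenerate keys from known identifiers, so the secret needs the same protection as the identifiers it replaces.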

Surveillance systems used to capture, analyse, and disseminate information should be operated to agreed standards. The development and adoption of standard operating protocols and case definitions provide a mechanism for ensuring that surveillance systems operate in a consistent and explicit manner over time and place. The scope of operating protocols should include:

  • a statement of purpose
  • case definitions or definitions of hazards and exposures
  • laboratory investigation protocols (where appropriate)
  • sources of data (e.g. the type of clinical service from which the data are to be captured, and the 'sampling' approach: universal, random or convenience sample, sentinel)
  • the data items to be collected (including the level of person identifier required)
  • the outputs
  • the roles and responsibilities of those involved in the surveillance process, and the custodian or owner of the system

Publication of these protocols helps to make the purpose and governance arrangements for surveillance systems explicit to data providers, data subjects and the recipients of surveillance outputs.

Many of these suggested components of an operating protocol have been covered earlier in this chapter. One component that still requires mention is guidance on what and when to report. To some extent this guidance can be provided through the publication of case definitions for reporting, but such definitions do not exist for many surveillance systems, particularly those that cover a wide range of infections (such as laboratory reporting schemes that capture data on all organisms identified by reporting laboratories). Even when case definitions do exist, guidance may be required on whether cases should be reported only when they meet certain criteria (e.g. those for a confirmed case) or when all exposure and risk factor data are available, or whether preliminary reports should be made on the basis of suspected case identification and/or when only partial exposure or risk factor data are available (in which case, clear mechanisms need to be defined for how more detailed information should be reported at a later date). Protocols should also cover how frequently data should be reported and through what mechanism and, where electronic reporting systems are used, the form and structure in which the data should be reported.

Surveillance systems should also be subject to regular audit against their objectives and periodic evaluation.