Using scan statistics for congenital anomalies surveillance: the EUROCAT methodology

  • PERINATAL EPIDEMIOLOGY
European Journal of Epidemiology

Abstract

Scan statistics have been used extensively to identify temporal clusters of health events. We describe the temporal cluster detection methodology adopted by the EUROCAT (European Surveillance of Congenital Anomalies) monitoring system. Since 2001, EUROCAT has implemented a variable window width scan statistic for detecting unusual temporal aggregations of congenital anomaly cases. The scan windows are defined by numbers of cases rather than by time. The methodology is embedded in the EUROCAT Central Database for annual application to centrally held registry data. The methodology was incrementally adapted to improve its utility and to address statistical issues. Simulation exercises were used to determine the power of the methodology to identify periods of raised risk (lasting 1–18 months). To operationalize the scan methodology, a number of adaptations were needed, including: estimating date of conception as the unit of time; deciding the maximum length (in time) and recency of clusters of interest; reporting multiple and overlapping significant clusters; replacing the Monte Carlo simulation with a lookup table to reduce computation time; and placing a threshold on underlying population change and estimating the false positive rate by simulation. Exploration of power found that raised risk periods lasting 1 month are unlikely to be detected except when the relative risk and case counts are high. The variable window width scan statistic is a useful tool for the surveillance of congenital anomalies. Numerous adaptations have improved the utility of the original methodology in the context of temporal cluster detection in congenital anomalies.

References

  1. Naus J, Wallenstein S. Temporal surveillance using scan statistics. Stat Med. 2006;25(2):311–24.

  2. Dolk H, Loane M, Garne E. The prevalence of congenital anomalies in Europe. Adv Exp Med Biol. 2010;686:349–64.

  3. Khoshnood B, Greenlees R, Loane M, Dolk H. Paper 2: EUROCAT public health indicators for congenital anomalies in Europe. Birth Defects Res A Clin Mol Teratol. 2011;91(Suppl 1):S16–22.

  4. Boyd PA, Haeusler M, Barisic I, Loane M, Garne E, Dolk H. Paper 1: the EUROCAT network–organization and processes. Birth Defects Res A Clin Mol Teratol. 2011;91(Suppl 1):S2–15.

  5. Loane M, Dolk H, Kelly A, Teljeur C, Greenlees R, Densem J. Paper 4: EUROCAT statistical monitoring: identification and investigation of 10 year trends of congenital anomalies in Europe. Birth Defects Res A Clin Mol Teratol. 2011;91(Suppl 1):S31–43.

  6. Dolk H, Loane M, Teljeur C, Densem J, Greenlees R, McCullough N, et al. Detection and investigation of temporal clusters of congenital anomaly in Europe: seven years of experience of the EUROCAT surveillance system. Epidemiology. 2015 [Epub ahead of print, PMID: 25840712].

  7. Centers for Disease Control and Prevention. Guidelines for investigating clusters of health events. Morbidity and Mortality Weekly Report. 1990;39(RR-11):1–16.

  8. Sego LH, Woodall WH, Reynolds MR. A comparison of surveillance methods for small incidence rates. Stat Med. 2008;27(8):1225–47.

  9. Quataert PK, Armstrong B, Berghold A, Bianchi F, Kelly A, Marchi M, Martuzzi M, Rosano A. Methodological problems and the role of statistics in cluster response studies: a framework. Eur J Epidemiol. 1999;15(9):821–31.

  10. Tango T. Tests for temporal clustering. In: Gail M, Krickeberg K, Samet J, Tsiatis A, Wong W, editors. Statistical methods for disease clustering. New York: Springer; 2010. p. 49–70.

  11. Nagarwalla N. A scan statistic with a variable window. Stat Med. 1996;15(7–9):845–50.

  12. Naus JI. The distribution of the size of the maximum cluster of points on a line. J Am Stat Assoc. 1965;60(310):532–8.

  13. Glaz J, Zhang Z. Multiple window discrete scan statistics. J Appl Stat. 2004;31(8):967–80.

  14. McCullough N, Loane M, Greenlees R, Dolk H. EUROCAT statistical monitoring report - 2009. Newtownabbey, Northern Ireland: EUROCAT Central Registry; 2012.

  15. Kulldorff M. SaTScan user guide for version 9.3. 2014. Boston, USA.

  16. Weinstock MA. A generalised scan statistic test for the detection of clusters. Int J Epidemiol. 1981;10(3):289–93.

  17. Unkel S, Farrington CP, Garthwaite PH, Robertson C, Andrews N. Statistical methods for the prospective detection of infectious disease outbreaks: a review. J R Stat Soc Ser A Stat Soc. 2012;175(1):49–82.

  18. Fricker RD. Some methodological issues in biosurveillance. Stat Med. 2011;30(5):403–15.

  19. Tango T. A test for spatial disease clustering adjusted for multiple testing. Stat Med. 2000;19(2):191–204.

  20. Cucala L. A hypothesis-free multiple scan statistic with variable window. Biom J. 2008;50(2):299–310.

Acknowledgments

We wish to thank Joan Morris and Fabrizio Bianchi for their helpful comments on an earlier draft of this manuscript. This work was supported by the EU Commission Public Health Programme, EUROCAT Joint Action 2011–2013 (Grant No. 2010 22 04).

Conflict of interest

None.

Author information

Corresponding author

Correspondence to Helen Dolk.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 28 kb)

Supplementary material 2 (DOCX 59 kb)

Appendix: The scan statistic defined by Nagarwalla [11]

An analysis requires the dates of a set of cases and the specification of the start and end dates for the analysis period. For a given subset of n cases occurring between $date_{start}$ and $date_{end}$, it is possible to calculate the test statistic lambda ($\lambda$) as follows:

$$\lambda = \left( \frac{n}{r} \right)^{n} \left( {\frac{r - n}{r}} \right)^{r - n} \left( \frac{1}{d} \right)^{n} \left( {\frac{1}{1 - d}} \right)^{r - n}$$

where r is the total number of cases, and

$$d = \frac{{date_{last} - date_{first} }}{{date_{end} - date_{start} }}$$

The dates $date_{first}$ and $date_{last}$ are the dates of the first and last cases in the scanning window, respectively. Typically a minimum scanning window of 5 cases is used. Lambda is calculated for every possible subset of n consecutive cases for n = 5, …, r.
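As a minimal sketch of the calculation described above, the following Python code evaluates the statistic for every window of consecutive cases and returns the largest value. It is an illustration under stated assumptions, not the EUROCAT implementation: case dates are assumed to be supplied as numbers (for example, days since the start of the analysis period), the function names `log_lambda` and `scan_max` are hypothetical, the statistic is computed on the log scale to avoid numerical overflow, and the complement term is written as (r − n)/r.

```python
import math

def log_lambda(n, r, d):
    """Log of lambda for a window holding n of r cases over a fraction d of the period."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly between 0 and 1")
    # Cases inside the window: (n / r)^n * (1 / d)^n
    value = n * math.log(n / (r * d))
    if n < r:
        # Cases outside the window: ((r - n) / r)^(r - n) * (1 / (1 - d))^(r - n)
        value += (r - n) * math.log((r - n) / (r * (1.0 - d)))
    return value

def scan_max(times, start, end, min_cases=5):
    """Return (largest log-lambda, first case time, last case time) over all windows."""
    times = sorted(times)
    r = len(times)
    span = end - start
    best = (-math.inf, None, None)
    for i in range(r):
        for j in range(i + min_cases - 1, r):
            n = j - i + 1
            d = (times[j] - times[i]) / span
            if not 0.0 < d < 1.0:
                continue  # zero-width or full-period windows are skipped in this sketch
            ll = log_lambda(n, r, d)
            if ll > best[0]:
                best = (ll, times[i], times[j])
    return best

# Hypothetical data: ten cases over one year, five of them on consecutive days.
cases = [10, 50, 100, 101, 102, 103, 104, 200, 300, 350]
print(scan_max(cases, start=0, end=365))
```

Because the logarithm is monotonic, ranking windows by log-lambda gives the same ordering as ranking them by lambda itself.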

The lambda for each subset of cases is calculated, and the largest value is recorded along with the start and end dates of the corresponding window. Next, simulated datasets are generated, each consisting of r uniform random numbers in the interval [0,1]. For each simulation, the lambda for each subset is calculated and the largest lambda is recorded. After, for example, 999 iterations, the simulated maxima are ordered from smallest to largest and the 95th percentile is noted as $\lambda_{sig}$. If the largest lambda from the real dataset is greater than $\lambda_{sig}$, the corresponding cases are designated a significant cluster.
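The simulation step can be sketched in Python as follows. This is an illustrative sketch rather than the EUROCAT code: the function names, the fixed random seed, and the exact rule used to pick the empirical 95th percentile are assumptions, and the statistic is again evaluated on the log scale with the complement term written as (r − n)/r.

```python
import math
import random

def max_log_lambda(times, min_cases=5):
    """Largest log-lambda over windows of consecutive cases; times lie in [0, 1]."""
    times = sorted(times)
    r = len(times)
    best = -math.inf
    for i in range(r):
        for j in range(i + min_cases - 1, r):
            n, d = j - i + 1, times[j] - times[i]
            if not 0.0 < d < 1.0:
                continue  # skip degenerate windows in this sketch
            ll = n * math.log(n / (r * d))
            if n < r:
                ll += (r - n) * math.log((r - n) / (r * (1.0 - d)))
            best = max(best, ll)
    return best

def critical_value(r, iterations=999, level=0.95, min_cases=5, seed=0):
    """Empirical percentile of the maximum log-lambda under uniformly distributed cases."""
    rng = random.Random(seed)  # seeded only so that the sketch is reproducible
    sims = sorted(
        max_log_lambda([rng.random() for _ in range(r)], min_cases)
        for _ in range(iterations)
    )
    # Position of the requested percentile among the ordered simulated maxima.
    idx = min(iterations - 1, math.ceil(level * iterations) - 1)
    return sims[idx]

# Hypothetical example: 20 cases, 10 of them packed into about 1% of the period.
observed = [0.5 + 0.001 * k for k in range(10)] + [0.05 + 0.1 * k for k in range(10)]
threshold = critical_value(r=len(observed))
print(max_log_lambda(observed) > threshold)
```

The threshold depends only on r and the minimum window size, not on the observed dates, which is why simulated thresholds of this kind can be computed once and reused.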

The method assumes a uniform distribution of cases and hence no seasonal variation.

Multiple cases in the observed data can occur on the same day. Where the number of cases in a single day is equal to or greater than the minimum window size, the d value for that window will equal zero and hence lambda will equal infinity. A nominal value equivalent to an hour is added to each case that occurs on the same day, thereby distributing multiple cases. An upper limit of 24 cases within a day is a reasonable assumption in our context. The alternative of distributing same-day cases uniformly across 24 h would, in theory, be preferred.
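One possible reading of the hourly offset is sketched below in Python, on the assumption that the k-th case recorded on a given day is shifted by k hours (the first case keeps midnight). The function name `spread_same_day` is hypothetical, and the rule of raising an error beyond 24 same-day cases is an illustrative choice reflecting the upper limit mentioned above.

```python
from collections import defaultdict
from datetime import date, datetime, time, timedelta

def spread_same_day(case_dates):
    """Give same-day cases distinct timestamps by adding one hour per repeated case."""
    seen = defaultdict(int)  # number of cases already placed on each day
    stamped = []
    for day in sorted(case_dates):
        k = seen[day]
        if k >= 24:
            raise ValueError(f"more than 24 cases on {day}")
        stamped.append(datetime.combine(day, time.min) + timedelta(hours=k))
        seen[day] += 1
    return stamped

# Three cases on the same day become 00:00, 01:00 and 02:00.
for stamp in spread_same_day([date(2010, 3, 1)] * 3 + [date(2010, 3, 2)]):
    print(stamp.isoformat())
```

With distinct timestamps, a window made up entirely of same-day cases has a small but non-zero d, so lambda stays finite.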

It should be noted that different statistical terminology was adopted in EUROCAT to avoid confusion with variable names in the computer code. Hence our notation uses ‘r’ in place of ‘N’. For consistency, we have used the same terminology in the code and in this article.

Cite this article

Teljeur, C., Kelly, A., Loane, M. et al. Using scan statistics for congenital anomalies surveillance: the EUROCAT methodology. Eur J Epidemiol 30, 1165–1173 (2015). https://doi.org/10.1007/s10654-015-0044-3
