
Misclassification errors from postal code-based geocoding to assign census geography in Nova Scotia, Canada

  • Quantitative Research
Canadian Journal of Public Health

Abstract

OBJECTIVES: Postal codes are often the only geographic identifiers available in Canadian health data sources. To conduct geographic analyses, postal codes are routinely geocoded to census geography so that they can be linked to ecological data. Despite the common use of this method, the extent of geographic misclassification error is poorly understood. We estimated misclassification errors in the geocoding of postal codes to assign census geography in Nova Scotia, Canada.

METHODS: We compared postal code-geocoded locations of buildings in Nova Scotia with their actual locations, examining differences in counts and match rates at two census administrative area levels: dissemination areas (DAs) and census subdivisions (CSDs). Actual locations were taken from data collected by the provincial government, which record the latitude and longitude of buildings. Variation in misclassification by rurality, using Statistics Canada’s classification, was also assessed.

RESULTS: Outside two urban areas (Halifax Metro and Sydney) which had <10% differences in counts, many DAs had >30% differences. Match rates showed similar patterns, with the vast majority of non-urban DAs having <40% match rates. Even in major urban areas, 10% of DAs had large misclassification errors. Misclassification errors at the CSD level were still too great to estimate counts or rates without further area aggregation.

CONCLUSION: Routine use of postal code geocoding should be replaced with geocoding of location information using additional identifiers such as civic addresses or latitude and longitude. If data holders did this in-house before providing data to researchers, the accuracy and capacity of geographic analysis would be enhanced while protecting confidentiality.
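The two measures named in METHODS, differences in counts and match rates, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: the count difference is taken as the percent difference between geocoded and actual building counts in an area, and the match rate as the percentage of buildings actually located in an area that postal code geocoding assigns to that same area. The function name and the area labels are hypothetical.

```python
from collections import Counter


def misclassification_by_area(buildings):
    """Summarise geocoding error per census area.

    `buildings` is an iterable of (actual_area, geocoded_area) pairs,
    one per building. Both definitions below are assumptions made for
    illustration, not the formulas used in the article.
    """
    actual_counts = Counter()
    geocoded_counts = Counter()
    matched = Counter()

    for actual, geocoded in buildings:
        actual_counts[actual] += 1
        geocoded_counts[geocoded] += 1
        if actual == geocoded:
            matched[actual] += 1

    results = {}
    # Only areas that actually contain at least one building are reported,
    # because both measures use the actual count as the denominator.
    for area, n_actual in actual_counts.items():
        n_geocoded = geocoded_counts.get(area, 0)
        results[area] = {
            "count_diff_pct": 100.0 * (n_geocoded - n_actual) / n_actual,
            "match_rate_pct": 100.0 * matched.get(area, 0) / n_actual,
        }
    return results


# Hypothetical example: four buildings are actually in DA1, one in DA2,
# but geocoding places two of the DA1 buildings in DA2.
example = [
    ("DA1", "DA1"),
    ("DA1", "DA1"),
    ("DA1", "DA2"),
    ("DA1", "DA2"),
    ("DA2", "DA2"),
]
summary = misclassification_by_area(example)
```

In this made-up example DA2 has a 100% match rate yet its count is inflated threefold, which shows why the two measures are reported separately: an area can retain all of its own buildings and still absorb misassigned buildings from its neighbours.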

Author information

Corresponding author

Correspondence to Mikiko Terashima PhD.

Additional information

Conflict of Interest: None to declare

About this article

Cite this article

Terashima, M., Kephart, G. Misclassification errors from postal code-based geocoding to assign census geography in Nova Scotia, Canada. Can J Public Health 107, e424–e430 (2016). https://doi.org/10.17269/CJPH.107.5459

  • DOI: https://doi.org/10.17269/CJPH.107.5459
