Assessing spatial and attribute errors in large national datasets for population distribution models: a case study of Philadelphia county schools

GeoJournal

Abstract

Geospatial technologies and digital data have developed and spread rapidly alongside increasing computing efficiency and Internet availability. The ability to store and transmit large datasets has encouraged the development of national infrastructure datasets in geospatial formats. Numerous agencies use national datasets for analysis and modeling because these datasets are standardized and considered to be of acceptable accuracy for national-scale applications. At Oak Ridge National Laboratory, a population model has been developed that incorporates national schools data as one of its inputs. This paper evaluates spatial and attribute inaccuracies present within two national school datasets: Tele Atlas North America and the National Center for Education Statistics (NCES).

Schools are an important component of the population model because they are spatially dense clusters of vulnerable populations. It is therefore essential to validate the quality of school input data. Schools were also chosen because a validated schools dataset had been produced in geospatial format for Philadelphia County, enabling a comparison between a local dataset and the national datasets.

Analyses found that the national datasets are neither standardized nor complete, containing 76 to 90 percent of existing schools. Delays in updating annual enrollment values resulted in 89 percent inaccuracy for 2003. Spatial rectification was required for 87 percent of NCES points; 58 percent of those errors were attributed to the geocoding process. Lastly, combining the two national datasets yielded a more useful and accurate dataset than either one alone.
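The kinds of measures reported above (completeness against a validated local dataset, the share of points needing spatial rectification, and the gain from combining datasets) can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It assumes schools can be matched by a shared identifier, uses a hypothetical distance tolerance, and runs on made-up toy records rather than the Philadelphia data.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def completeness(national, reference):
    """Percent of reference schools that appear in the national dataset."""
    found = sum(1 for school_id in reference if school_id in national)
    return 100.0 * found / len(reference)


def needs_rectification(national, reference, tolerance_m):
    """Sorted IDs of matched schools lying farther than tolerance_m from the reference point."""
    return sorted(
        sid
        for sid, (lat, lon) in national.items()
        if sid in reference and haversine_m(lat, lon, *reference[sid]) > tolerance_m
    )


# Made-up toy records (ID -> (lat, lon)); NOT values from the study.
reference = {
    "A": (39.9500, -75.1600),
    "B": (39.9600, -75.1700),
    "C": (40.0000, -75.1000),
    "D": (39.9900, -75.2000),
}
dataset_1 = {"A": (39.9500, -75.1600), "B": (39.9650, -75.1700)}  # B is displaced northward
dataset_2 = {"B": (39.9600, -75.1700), "C": (40.0000, -75.1000)}

# Combine the two national datasets; where both hold a school, dataset_2's point wins.
combined = {**dataset_1, **dataset_2}

print(completeness(dataset_1, reference))
print(completeness(combined, reference))
print(needs_rectification(dataset_1, reference, tolerance_m=100.0))
```

In practice, national and local records rarely share a clean identifier, so a real comparison would likely need name or address matching before any of these measures could be computed.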



Author information

Corresponding author

Correspondence to Lauren Patterson.

About this article

Cite this article

Patterson, L., Urban, M., Myers, A. et al. Assessing spatial and attribute errors in large national datasets for population distribution models: a case study of Philadelphia county schools. GeoJournal 69, 93–102 (2007). https://doi.org/10.1007/s10708-007-9099-3

  • DOI: https://doi.org/10.1007/s10708-007-9099-3
