Spatio-temporal change of support modeling with R

Raim, Andrew M.; Holan, Scott H.; Bradley, Jonathan R.; Wikle, Christopher K.

doi:10.1007/s00180-020-01029-4

Original paper
Published: 23 September 2020

Computational Statistics, volume 36, pages 749–780 (2021)

Andrew M. Raim (ORCID: orcid.org/0000-0002-4440-2330)¹, Scott H. Holan²,³, Jonathan R. Bradley⁴ & Christopher K. Wikle²

Abstract

Spatio-temporal change of support methods are designed for statistical analysis on spatial and temporal domains that may differ from those of the observed data. Previous work introduced a parsimonious class of Bayesian hierarchical spatio-temporal models, which we refer to as STCOS, for the case of Gaussian outcomes. Application of STCOS methodology from this literature requires a level of proficiency with spatio-temporal methods and statistical computing that may be a hurdle for potential users. The present work seeks to bridge this gap by guiding readers through STCOS computations. We focus on the R computing environment because of its popularity, free availability, and high-quality contributed packages. The stcos package is introduced to facilitate computations for the STCOS model. A motivating application is the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau that measures key socioeconomic and demographic variables for various populations in the United States. The STCOS methodology offers a principled approach to compute model-based estimates and associated measures of uncertainty for ACS variables on customized geographies and/or time periods. We present a detailed case study with ACS data as a guide for change of support analysis in R, and as a foundation which can be customized to other applications.

Notes

https://census.gov/data/developers/data-sets/acs-1year/notes-on-acs-estimate-and-annotation-values.html.

References

Banerjee S, Carlin BP, Gelfand AE (2014) Hierarchical modeling and analysis for spatial data, 2nd edn. Chapman and Hall, London
Bates D, Maechler M (2019) Matrix: sparse and dense matrix classes and methods. https://CRAN.R-project.org/package=Matrix. R package version 1.2-18
Battersby SE, Finn MP, Usery EL, Yamamoto KH (2014) Implications of Web Mercator and its use in online mapping. Cartogr Int J Geogr Inf Geovis 49(2):85–101. https://doi.org/10.3138/carto.49.2.2313
Bivand RS, Pebesma E, Gómez-Rubio V (2013) Applied spatial data analysis with R, 2nd edn. Springer, Berlin
Bradley JR, Holan SH, Wikle CK (2015a) Multivariate spatio-temporal models for high-dimensional areal data with application to longitudinal employer-household dynamics. Ann Appl Stat 9(4):1761–1791. https://doi.org/10.1214/15-AOAS862
Bradley JR, Wikle CK, Holan SH (2015b) Spatio-temporal change of support with application to American Community Survey multi-year period estimates. Stat 4(1):255–270. https://doi.org/10.1002/sta4.94
Breakstone CD, Anderson TS (2019) Census data API user guide. https://www.census.gov/data/developers/guidance/api-user-guide.html. Version 1.6
Brunsdon C (2014) pycno: Pycnophylactic Interpolation. https://CRAN.R-project.org/package=pycno. R package version 1.2
Carpenter B, Gelman A, Hoffman M, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017) Stan: a probabilistic programming language. J Stat Softw 76(1):1–32. https://doi.org/10.18637/jss.v076.i01
Cortes RX, Rey S, Knaap E (2019) pysal/tobler: Tobler initial release. https://doi.org/10.5281/zenodo.3386577
Cressie N, Wikle CK (2011) Statistics for spatio-temporal data. Wiley, New York
de Valpine P, Turek D, Paciorek CJ, Anderson-Bergman C, Lang DT, Bodik R (2017) Programming with models: writing statistical algorithms for general model structures with NIMBLE. J Comput Graph Stat 26(2):403–413. https://doi.org/10.1080/10618600.2016.1172487
Depaoli S, Clifton JP, Cobb PR (2016) Just another Gibbs sampler (JAGS): flexible software for MCMC implementation. J Educ Behav Stat 41(6):628–649. https://doi.org/10.3102/1076998616664876
Eddelbuettel D (2013) Seamless R and C++ integration with Rcpp. Springer, Berlin
Eddelbuettel D, Sanderson C (2014) RcppArmadillo: accelerating R with high-performance C++ linear algebra. Comput Stat Data Anal 71:1054–1063. https://doi.org/10.1016/j.csda.2013.02.005
Eicher CL, Brewer CA (2001) Dasymetric mapping and areal interpolation: implementation and evaluation. Cartogr Geogr Inf Sci 28(2):125–138. https://doi.org/10.1559/152304001782173727
ESRI (1998) ESRI shapefile technical description. https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf. Accessed 27 May 2020
Fuentes M, Song H-R, Ghosh SK, Holland DM, Davis JM (2006) Spatial association between speciated fine particles and mortality. Biometrics 62(3):855–863. https://doi.org/10.1111/j.1541-0420.2006.00526.x
Gotway CA, Young LJ (2002) Combining incompatible spatial data. J Am Stat Assoc 97(458):632–648. https://doi.org/10.1198/016214502760047140
Higham NJ (1988) Computing a nearest symmetric positive semidefinite matrix. Linear Algebra Appl 103:103–118. https://doi.org/10.1016/0024-3795(88)90223-6
Lam NS-N (1983) Spatial interpolation methods: a review. Am Cartogr 10(2):129–150. https://doi.org/10.1559/152304083783914958
Lunn D, Spiegelhalter D, Thomas A, Best N (2009) The BUGS project: evolution, critique and future directions. Stat Med 28(25):3049–3067. https://doi.org/10.1002/sim.3680
Mileu N, Queirós M (2018) Development of a QGIS plugin to dasymetric mapping. In: Free and open source software for geospatial (FOSS4G) conference proceedings, vol 18, no 9. https://doi.org/10.7275/3628-0a51
National Academy of Sciences (2015) Realizing the potential of the American Community Survey: challenges, tradeoffs, and opportunities. National Academies Press, Washington, DC. https://doi.org/10.17226/21653
Nguyen H, Cressie N, Braverman A (2012) Spatial statistical data fusion for remote sensing applications. J Am Stat Assoc 107(499):1004–1018. https://doi.org/10.1080/01621459.2012.694717
Nychka D, Saltzman N (1998) Design of air quality monitoring networks. Lecture notes in statistics. Springer, pp 51–76. https://doi.org/10.1007/978-1-4612-2226-2_4
Nychka D, Furrer R, Paige J, Sain S (2017) Fields: tools for spatial data. University Corporation for Atmospheric Research, Boulder, CO, USA. https://github.com/NCAR/Fields. R package version 10.3. Accessed 27 May 2020
Ooms J (2014) The jsonlite package: a practical and consistent mapping between JSON data and R objects. arXiv:1403.2805
Pebesma E (2018) Simple features for R: standardized support for spatial vector data. R J 10(1):439–446. https://doi.org/10.32614/RJ-2018-009
Plummer M, Best N, Cowles K, Vines K (2006) CODA: convergence diagnosis and output analysis for MCMC. R News 6(1):7–11
Prener CG, Revord CK (2019) Areal: an R package for areal weighted interpolation. J Open Source Softw 4(37):1221. https://doi.org/10.21105/joss.01221
Qiu F, Zhang C, Zhou Y (2012) The development of an areal interpolation ArcGIS extension and a comparative study. GISci Remote Sens 49(5):644–663. https://doi.org/10.2747/1548-1603.49.5.644
R Core Team (2020) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Raim AM, Holan SH, Bradley JR, Wikle CK (2017) A model selection study for spatio-temporal change of support. In: JSM Proceedings, government statistics section. American Statistical Association, Alexandria, pp 1524–1540
Rode M, Arhonditsis G, Balin D, Kebede T, Krysanova V, Van Griensven A, Van der Zee SE (2010) New challenges in integrated water quality modelling. Hydrol Process 24(24):3447–3461. https://doi.org/10.1002/hyp.7766
Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc Ser B (Stat Methodol) 64(4):583–639. https://doi.org/10.1111/1467-9868.00353
Stan Development Team (2020) RStan: the R interface to Stan. http://mc-stan.org/. R package version 2.19.3. Accessed 27 May 2020
Tobler WR (1979) Smooth pycnophylactic interpolation for geographical regions. J Am Stat Assoc 74(367):519–530. https://doi.org/10.1080/01621459.1979.10481647
U.S. Census Bureau (2016) American Community Survey data suppression. https://www.census.gov/programs-surveys/acs/technical-documentation/data-suppression.html. Accessed 2 Sept 2019
U.S. Census Bureau (2018) Understanding and using American Community Survey data: What all data users need to know. https://www.census.gov/programs-surveys/acs/guidance/handbooks/general.html. Accessed 2 Sept 2019
Walker K (2018) tigris: Load Census TIGER/Line Shapefiles. https://CRAN.R-project.org/package=tigris. R package version 0.7. Accessed 27 May 2020
Waller LA, Gotway CA (2004) Applied spatial statistics for public health data. Wiley-Interscience, New York
Weinberg DH, Abowd JM, Belli RF, Cressie N, Folch DC, Holan SH, Levenstein MC, Olson KM, Reiter JP, Shapiro MD, Smyth JD, Soh L-K, Spencer BD, Spielman SE, Vilhuber L, Wikle CK (2018) Effects of a government-academic partnership: Has the NSF-Census Bureau Research Network helped improve the US statistical system? J Surv Stat Methodol. https://doi.org/10.1093/jssam/smy023
Wickham H (2016) ggplot2: elegant graphics for data analysis. Springer, New York
Wickham H, François R, Henry L, Müller K (2020) dplyr: a grammar of data manipulation. https://CRAN.R-project.org/package=dplyr. R package version 0.8.5. Accessed 27 May 2020
Wikle CK, Berliner LM (2005) Combining information across spatial scales. Technometrics 47(1):80–91. https://doi.org/10.1198/004017004000000572

Acknowledgements

This research was partially supported by the U.S. National Science Foundation (NSF) and the U.S. Census Bureau under NSF grant SES-1132031, funded through the NSF-Census Research Network (NCRN) program, and NSF Awards SES-1853096 and SES-1853099. This article is released to inform interested parties of ongoing research and to encourage discussion. The views expressed on statistical issues are those of the authors and not the NSF or U.S. Census Bureau. The authors thank Taylor Bowen and Toni Messina from the Office of Information Technology/GIS, City of Columbia, Missouri for supplying the shapefile used in the case study and for useful discussion.

Author information

Authors and Affiliations

Center for Statistical Research and Methodology, U.S. Census Bureau, 4600 Silver Hill Road, Washington, DC, 20233, USA
Andrew M. Raim
Department of Statistics, University of Missouri, Columbia, MO, USA
Scott H. Holan & Christopher K. Wikle
Office of the Associate Director for Research and Methodology, Washington, DC, USA
Scott H. Holan
Department of Statistics, Florida State University, Tallahassee, FL, USA
Jonathan R. Bradley

Corresponding author

Correspondence to Andrew M. Raim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 177 KB)

Supplementary material 2 (R 1 KB)

Supplementary material 3 (R 8 KB)

Supplementary material 4 (R 2 KB)

Supplementary material 5 (R 4 KB)

Supplementary material 6 (R 2 KB)

Supplementary material 7 (stan 0 KB)

Supplementary material 8 (R 2 KB)

Supplementary material 9 (R 4 KB)

Supplementary material 10 (R 8 KB)

Supplementary material 11 (R 3 KB)

Supplementary material 12 (R 4 KB)

Supplementary material 13 (stan 1 KB)

Supplementary material 14 (R 2 KB)

Supplementary material 15 (R 0 KB)

Supplementary material 16 (txt 5 KB)

Computational details and proofs

We will make use of the following well-known property in several places.

Property 1

If ${\varvec{A}} \in \mathbb {R}^{m \times k}$, ${\varvec{B}} \in \mathbb {R}^{k \times l}$, ${\varvec{C}} \in \mathbb {R}^{l \times n}$, then $\text {vec}({\varvec{A}} {\varvec{B}} {\varvec{C}}) = ({\varvec{C}}^\top \otimes {\varvec{A}}) \text {vec}({\varvec{B}})$.
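Property 1 is easy to confirm numerically on small matrices. The following is a minimal sketch in plain Python, not code from the article or the stcos package; the helper names (`matmul`, `transpose`, `vec`, `kron`) and the example matrices are illustrative assumptions. Here `vec` stacks columns, which is the convention the identity requires.

```python
# Numerical check of Property 1: vec(A B C) = (C^T kron A) vec(B).
# Matrices are lists of rows; all helper names are illustrative only.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def vec(X):
    """Stack the columns of X into one vector (column-major order)."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

def kron(X, Y):
    """Kronecker product: block (i, j) of the result is X[i][j] * Y."""
    p, q = len(Y), len(Y[0])
    return [[X[i // p][j // q] * Y[i % p][j % q]
             for j in range(len(X[0]) * q)]
            for i in range(len(X) * p)]

A = [[1, 2], [3, 4], [5, 6]]       # m x k = 3 x 2
B = [[1, 0, 2], [-1, 3, 1]]        # k x l = 2 x 3
C = [[2, 1], [0, 1], [1, -1]]      # l x n = 3 x 2

lhs = vec(matmul(matmul(A, B), C))             # vec(A B C), length m * n
K = kron(transpose(C), A)                      # (n * m) x (l * k) matrix
vB = vec(B)
rhs = [sum(K[i][j] * vB[j] for j in range(len(vB))) for i in range(len(K))]

print(lhs == rhs)  # prints True
```

Integer entries keep the comparison exact, so no floating-point tolerance is needed.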

The following proposition gives the explicit solution to the minimization problem stated in (3.10). Bradley et al. (2015a) consider a similar problem featuring a more general objective function but assuming that the columns of ${\varvec{S}}$ are orthonormal. Higham (1988) gives a general discussion of problems involving Frobenius and 2-norm distance minimization.

Proposition 1

(Frobenius Norm Minimization) Suppose ${\varvec{S}} \in \mathbb {R}^{n \times r}$ has rank $r$ and ${\varvec{\varSigma }} \in \mathbb {R}^{n \times n}$ is positive definite. The minimizer ${\varvec{X}} \in \mathbb {R}^{r \times r}$ of $\Vert {\varvec{\varSigma }} - {\varvec{S}} {\varvec{X}} {\varvec{S}}^\top \Vert _\text {F}$ is ${\varvec{X}} = ({\varvec{S}}^\top {\varvec{S}})^{-1} {\varvec{S}}^\top {\varvec{\varSigma }} {\varvec{S}} ({\varvec{S}}^\top {\varvec{S}})^{-1}$.

Proof

Using Property 1, we have

$$\begin{aligned} \Vert {\varvec{\varSigma }} - {\varvec{S}} {\varvec{X}} {\varvec{S}}^\top \Vert _{\text {F}}^2&= \text {vec}\left[ {\varvec{\varSigma }} - {\varvec{S}} {\varvec{X}} {\varvec{S}}^\top \right] ^\top \text {vec}\left[ {\varvec{\varSigma }} - {\varvec{S}} {\varvec{X}} {\varvec{S}}^\top \right] \nonumber \\&= \left[ \text {vec}({\varvec{\varSigma }}) - \text {vec}({\varvec{S}} {\varvec{X}} {\varvec{S}}^\top ) \right] ^\top \left[ \text {vec}({\varvec{\varSigma }}) - \text {vec}({\varvec{S}} {\varvec{X}} {\varvec{S}}^\top ) \right] \nonumber \\&= \left[ \text {vec}({\varvec{\varSigma }}) - ({\varvec{S}} \otimes {\varvec{S}}) \text {vec}({\varvec{X}}) \right] ^\top \left[ \text {vec}({\varvec{\varSigma }}) - ({\varvec{S}} \otimes {\varvec{S}}) \text {vec}({\varvec{X}}) \right] \nonumber \\&= \Vert \text {vec}({\varvec{\varSigma }}) - ({\varvec{S}} \otimes {\varvec{S}}) \text {vec}({\varvec{X}}) \Vert _2^2, \end{aligned}$$

(A.1)

where the norm on the last line is the usual 2-norm on $\mathbb {R}^{n^2}$. We recognize the expression in (A.1) as a standard least squares minimization whose solution is

$$\begin{aligned} \text {vec}({\varvec{X}})&= [({\varvec{S}} \otimes {\varvec{S}})^\top ({\varvec{S}} \otimes {\varvec{S}})]^{-1} ({\varvec{S}} \otimes {\varvec{S}})^\top \text {vec}({\varvec{\varSigma }}) \\&= [({\varvec{S}}^\top \otimes {\varvec{S}}^\top ) ({\varvec{S}} \otimes {\varvec{S}})]^{-1} ({\varvec{S}}^\top \otimes {\varvec{S}}^\top ) \text {vec}({\varvec{\varSigma }}) \\&= [ {\varvec{S}}^\top {\varvec{S}} \otimes {\varvec{S}}^\top {\varvec{S}} ]^{-1} \text {vec}({\varvec{S}}^\top {\varvec{\varSigma }} {\varvec{S}}) \\&= [({\varvec{S}}^\top {\varvec{S}})^{-1} \otimes ({\varvec{S}}^\top {\varvec{S}})^{-1}] \text {vec}({\varvec{S}}^\top {\varvec{\varSigma }} {\varvec{S}}) \\&= \text {vec}\left[ ({\varvec{S}}^\top {\varvec{S}})^{-1} {\varvec{S}}^\top {\varvec{\varSigma }} {\varvec{S}} ({\varvec{S}}^\top {\varvec{S}})^{-1} \right] . \end{aligned}$$

Therefore, the minimizer is ${\varvec{X}} = ({\varvec{S}}^\top {\varvec{S}})^{-1} {\varvec{S}}^\top {\varvec{\varSigma }} {\varvec{S}} ({\varvec{S}}^\top {\varvec{S}})^{-1}$, as desired. $\square $
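As a sanity check on Proposition 1, the closed-form minimizer can be evaluated on a small example. The sketch below is illustrative only and is not taken from the article: the matrices `S` and `Sigma`, the helper names, and the restriction to $r = 2$ (so that a closed-form 2 x 2 inverse suffices) are all assumptions. It verifies the least squares normal equations ${\varvec{S}}^\top ({\varvec{\varSigma }} - {\varvec{S}} {\varvec{X}} {\varvec{S}}^\top ) {\varvec{S}} = {\varvec{0}}$ in exact rational arithmetic, and that perturbing ${\varvec{X}}$ increases the objective.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def residual(Sigma, S, X):
    """Sigma - S X S^T."""
    SXSt = matmul(matmul(S, X), transpose(S))
    n = len(Sigma)
    return [[Sigma[i][j] - SXSt[i][j] for j in range(n)] for i in range(n)]

def frob2(M):
    """Squared Frobenius norm."""
    return sum(x * x for row in M for x in row)

# Illustrative inputs: S is 3 x 2 with rank 2, Sigma is positive definite.
S = [[F(1), F(0)], [F(1), F(1)], [F(0), F(2)]]
Sigma = [[F(2), F(1), F(0)], [F(1), F(2), F(1)], [F(0), F(1), F(2)]]

St = transpose(S)
G = inv2(matmul(St, S))                                   # (S^T S)^{-1}
X_hat = matmul(matmul(G, matmul(matmul(St, Sigma), S)), G)

# Normal equations: S^T (Sigma - S X_hat S^T) S should be the zero matrix.
normal_eq = matmul(matmul(St, residual(Sigma, S, X_hat)), S)
print(all(v == 0 for row in normal_eq for v in row))      # prints True

# Any perturbation of X_hat should give a strictly larger objective.
X_pert = [[X_hat[0][0] + F(1, 10), X_hat[0][1]],
          [X_hat[1][0], X_hat[1][1]]]
print(frob2(residual(Sigma, S, X_pert)) > frob2(residual(Sigma, S, X_hat)))
```

Exact fractions make the normal equations hold to the last digit; with floating-point input the same check would need a small tolerance.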

Remark 1

(MLE Computation) To compute the MLE for the STCOS model, we first note that the likelihood, excluding the parameter model, is

$$\begin{aligned} f({\varvec{Z}} \mid {\varvec{\mu }}_B, \sigma _K^2, \sigma _\xi ^2)&= \int \phi ({\varvec{Z}} \mid {\varvec{H}} {\varvec{\mu }}_B + {\varvec{S}} {\varvec{\eta }}, \sigma _\xi ^2 {\varvec{I}} + {\varvec{V}}) \cdot \phi ({\varvec{\eta }} \mid {\varvec{0}}, \sigma _K^2 {\varvec{K}}) d {\varvec{\eta }} \\&= \phi ({\varvec{Z}} \mid {\varvec{H}} {\varvec{\mu }}_B, {\varvec{\varDelta }}) \\&= (2 \pi )^{-N/2} |{\varvec{\varDelta }}|^{-1/2} \exp \left\{ -\frac{1}{2} ({\varvec{Z}} - {\varvec{H}} {\varvec{\mu }}_B)^\top {\varvec{\varDelta }}^{-1} ({\varvec{Z}} - {\varvec{H}} {\varvec{\mu }}_B) \right\} , \end{aligned}$$

where ${\varvec{\varDelta }} = \sigma _\xi ^2 {\varvec{I}} + {\varvec{V}} + \sigma _K^2 {\varvec{S}} {\varvec{K}} {\varvec{S}}^\top $. Given $\sigma _K^2$ and $\sigma _\xi ^2$, the likelihood is maximized by the weighted least squares estimator $\hat{{\varvec{\mu }}}_B = ({\varvec{H}}^\top {\varvec{\varDelta }}^{-1} {\varvec{H}})^{-1} {\varvec{H}}^\top {\varvec{\varDelta }}^{-1} {\varvec{Z}}$. To estimate the unknown $\sigma _K^2$ and $\sigma _\xi ^2$, we carry out numerical maximization on the partially maximized log-likelihood

$$\begin{aligned} \ell (\sigma _K^2, \sigma _\xi ^2) = -\frac{N}{2} \log (2 \pi ) -\frac{1}{2} \log |{\varvec{\varDelta }}| -\frac{1}{2} ({\varvec{Z}} - {\varvec{H}} \hat{{\varvec{\mu }}}_B)^\top {\varvec{\varDelta }}^{-1} ({\varvec{Z}} - {\varvec{H}} \hat{{\varvec{\mu }}}_B). \end{aligned}$$

To enforce the constraints that $\sigma _K^2 > 0$ and $\sigma _\xi ^2 > 0$, we optimize over $(\vartheta _1, \vartheta _2) \in \mathbb {R}^2$ and take $\sigma _K^2 = \exp (\vartheta _1)$, $\sigma _\xi ^2 = \exp (\vartheta _2)$.
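The computation in Remark 1 can be sketched as follows. This is a hedged, standard-library Python illustration, not the stcos implementation: the function names, the toy inputs, and the hand-written Cholesky routines are assumptions. The remark only says "numerical maximization", so the coarse grid search over $(\vartheta _1, \vartheta _2)$ is a simple stand-in for whatever optimizer is actually used.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def cholesky(M):
    """Lower-triangular L with M = L L^T, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward, substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def profile_loglik(theta, Z, H, S, K, V):
    """Return (ell, mu_hat) with sigma_K^2 = exp(theta[0]), sigma_xi^2 = exp(theta[1])."""
    s2K, s2xi = math.exp(theta[0]), math.exp(theta[1])
    N, p = len(Z), len(H[0])
    St = [list(c) for c in zip(*S)]
    SKS = matmul(matmul(S, K), St)
    # Delta = sigma_xi^2 I + V + sigma_K^2 S K S^T
    Delta = [[s2xi * (i == j) + V[i][j] + s2K * SKS[i][j] for j in range(N)]
             for i in range(N)]
    L = cholesky(Delta)
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(N))
    # Weighted least squares: mu_hat = (H^T D^-1 H)^-1 H^T D^-1 Z
    DiH = [chol_solve(L, [H[i][a] for i in range(N)]) for a in range(p)]
    DiZ = chol_solve(L, Z)
    A = [[sum(H[i][a] * DiH[b][i] for i in range(N)) for b in range(p)]
         for a in range(p)]
    rhs = [sum(H[i][a] * DiZ[i] for i in range(N)) for a in range(p)]
    mu = chol_solve(cholesky(A), rhs)
    resid = [Z[i] - sum(H[i][a] * mu[a] for a in range(p)) for i in range(N)]
    quad = sum(r * w for r, w in zip(resid, chol_solve(L, resid)))
    ell = -0.5 * N * math.log(2 * math.pi) - 0.5 * logdet - 0.5 * quad
    return ell, mu

# Toy inputs (hypothetical, for illustration only).
Z = [1.0, 3.0, 2.0, 4.0]
H = [[1.0], [1.0], [1.0], [1.0]]
S = [[1.0], [1.0], [-1.0], [-1.0]]
K = [[1.0]]
V = [[0.1 if i == j and i % 2 == 0 else 0.2 if i == j else 0.0
      for j in range(4)] for i in range(4)]

# Unconstrained search over (theta_1, theta_2); exp() keeps variances positive.
grid = [-4.0 + 0.5 * g for g in range(13)]
best_ell, t1, t2 = max((profile_loglik((a, b), Z, H, S, K, V)[0], a, b)
                       for a in grid for b in grid)
print(best_ell, math.exp(t1), math.exp(t2))
```

Working on the log scale means the search never has to handle boundary constraints: every point of $\mathbb {R}^2$ maps to a valid pair of variances. A gradient-based or simplex optimizer could replace the grid without changing `profile_loglik`.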

About this article

Cite this article

Raim, A.M., Holan, S.H., Bradley, J.R. et al. Spatio-temporal change of support modeling with R. Comput Stat 36, 749–780 (2021). https://doi.org/10.1007/s00180-020-01029-4

Received: 02 January 2020
Accepted: 26 August 2020
Published: 23 September 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s00180-020-01029-4
