Abstract

Spatial statistics is concerned with the analysis of data that have spatial locations associated with them, and those locations are used to model statistical dependence between the data. The spatial data are treated as a single realization from a probability model that encodes the dependence through both fixed effects and random effects, where randomness is manifest in the underlying spatial process and in the noisy, incomplete measurement process. The focus of this review article is on the use of basis functions to provide an extremely flexible and computationally efficient way to model spatial processes that are possibly highly nonstationary. Several examples of basis-function models are provided to illustrate how they are used in Gaussian, non-Gaussian, multivariate, and spatio-temporal settings, with applications in geophysics. Our aim is to emphasize the versatility of these spatial-statistical models and to demonstrate that they are now center-stage in a number of application domains. The review concludes with a discussion and illustration of software currently available to fit spatial-basis-function models and implement spatial-statistical prediction.
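As a rough illustration of why basis-function models are computationally efficient, the sketch below writes a spatial process as Y(s) = phi(s)' eta, with r basis functions and random coefficients eta ~ N(0, K), observed with independent Gaussian noise of variance sigma2. It is a minimal sketch under stated assumptions, not the software discussed in the review: the one-dimensional domain, the Gaussian radial basis functions, the known K and sigma2, and the names `gaussian_basis` and `basis_predict` are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_basis(locs, centers, scale):
    """Evaluate r Gaussian radial basis functions at n one-dimensional locations.

    Returns an n x r matrix whose (i, j) entry is phi_j(s_i).
    """
    d = locs[:, None] - centers[None, :]
    return np.exp(-0.5 * (d / scale) ** 2)

def basis_predict(locs_obs, z, locs_pred, centers, scale, K, sigma2):
    """Predict Y(s) = phi(s)' eta from noisy data Z = Phi eta + eps.

    Assumes eta ~ N(0, K) and eps ~ N(0, sigma2 * I), with K and sigma2 known.
    Only r x r linear systems are solved, so the cost grows linearly in the
    number of observations n rather than cubically.
    """
    Phi = gaussian_basis(locs_obs, centers, scale)      # n x r
    Phi_pred = gaussian_basis(locs_pred, centers, scale)  # m x r
    # Posterior precision of eta given the data (r x r).
    Q = np.linalg.inv(K) + Phi.T @ Phi / sigma2
    # Posterior mean of eta, then map it back to the prediction locations.
    eta_hat = np.linalg.solve(Q, Phi.T @ z / sigma2)
    return Phi_pred @ eta_hat

# Illustrative usage with simulated data (all values are made up).
rng = np.random.default_rng(0)
locs_obs = np.linspace(0.0, 1.0, 50)
centers = np.linspace(0.0, 1.0, 5)
z = np.sin(2 * np.pi * locs_obs) + 0.1 * rng.standard_normal(50)
locs_pred = np.linspace(0.0, 1.0, 7)
pred = basis_predict(locs_obs, z, locs_pred, centers, 0.2, np.eye(5), 0.01)
```

By a standard matrix-inversion identity, this predictor agrees with the usual kriging predictor computed from the n x n covariance matrix Phi K Phi' + sigma2 I, but it never forms or inverts that matrix.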

/content/journals/10.1146/annurev-statistics-040120-020733
2022-03-07
2024-04-20

  • Article Type: Review Article