Supporting SLEUTH – Enhancing a cellular automaton with support vector machines for urban growth modeling
Introduction
In recent years, urbanization has become one of the most striking change processes in the coupled human-environment system of Central Europe (Antrop, 2004, Siedentop, 2006). The quantitative and qualitative measurement, prediction, and evaluation of land-use dynamics – especially urban sprawl – have come to play a central role in land-system science (Brown et al., 2004, Lambin and Geist, 2006, Verburg, 2006). The creation of appropriate and adjusted models is challenged by the complexity of urban systems with their manifold self-organization processes, non-linear relationships, and emergent properties as well as feedback loops between different spatio-temporal scales and compartments. In this context, an increasing number of studies testing the use of artificial intelligence techniques for urban simulation have emerged recently (Batty, 2005, Benenson and Torrens, 2004, Schwarz and Haase, 2007, Steven et al., 2002, Wu and Silva, 2010).
Among these popular techniques are spatially explicit cellular automata (CA) like SLEUTH (Clarke, Hoppen, & Gaydos, 1997). SLEUTH is an acronym for its suite of input data (Slope, Land use, Exclusion, Urban, Transportation, and Hillshade) and is a purely growth-oriented model. As a bottom-up approach it is not dependent on intensive preliminary studies regarding the general causes of urban growth in a study area or the location-specific driving forces (Clarke et al., 1997). Based on the principles of neighborhood effects and spatial autocorrelation, the simulation rules are relatively simple. However, due to its ability to capture the complex emergence of urban patterns, SLEUTH has been applied in several urban growth studies throughout the world (Chaudhuri and Clarke, 2013, Clarke et al., 1997, Rafiee et al., 2009, Silva and Clarke, 2005, Wu et al., 2008). Although SLEUTH can simulate urban growth quite accurately, the modeling process is strongly influenced by stochastic decisions, resulting in variable growth patterns (Chaudhuri & Clarke, 2013). Nor does SLEUTH provide information regarding important human and ecological forces which determine the local suitability of urban growth.
The machine learning concept called support vector machines (SVM) (Cortes & Vapnik, 1995) is able to overcome these disadvantages. SVM are used in a variety of applications for solving classification problems (Drucker et al., 1999, Guo et al., 2005, Mountrakis et al., 2011, Waske et al., 2010) and for regression challenges (Gestel et al., 2001, Verplancke et al., 2008). In SVM, explanatory variables are used to calculate probabilities for specified conditions (e.g. a raster cell belonging to a specific class). Conceptually, input vectors are projected into a higher-dimensional feature space in which an optimal hyperplane can be constructed for separating the input data into two or more classes. By using a specific feature selection, significant features can be identified and separated from those which are not of interest. In this manner, insights may be gained into those characteristic features which determine the separation process.
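As a brief illustration of the hyperplane idea described above, the standard textbook form of the SVM decision function with a Gaussian radial basis function (RBF) kernel can be written as follows. The symbols (training vectors x_i, class labels y_i in {-1, +1}, multipliers alpha_i, offset b, kernel width gamma) are generic notation assumed here and are not necessarily the notation used later in the paper.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}_i, \mathbf{x}) = \exp\!\left( -\gamma \, \lVert \mathbf{x}_i - \mathbf{x} \rVert^{2} \right)
```

Only training vectors with a nonzero multiplier (the support vectors) contribute to the sum, so the separating hyperplane in the feature space is determined by a subset of the training data.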
Xie (2006) implemented SVM for general land-use change objectives and tested different modification possibilities. Huang, Xie, and Tay (2010) extended the results of SVM for rural–urban simulation applications. In these studies, however, a spatially explicit application for the dynamic modeling of urban growth was not performed, nor was a feature selection conducted to analyze the different impacts of the possible driving factors. Yang, Li, and Shi (2008) and Okwuashi, McConchie, Nwilo, and Eyo (2009) combined SVM with CA-based approaches for modeling urban growth in a spatially explicit way. Yang et al. (2008) applied their model to the city of Shenzhen; Okwuashi et al. (2009) modeled the Lekki area of Lagos. Both of these studies derived nonlinear transition rules for CA simulation of urban land-use dynamics, but did not exploit the complementary advantages of SVM and CA. In both of these cases, urban growth was erroneously directed. As a result, the predicted growth of beaches and aquacultures was artificially limited (Yang et al., 2008) or urban edge growth was overestimated (Okwuashi et al., 2009). In addition, these two studies examined only common proximity and market access distance variables, while other important growth stimuli such as demographic changes or fluctuations in regional job markets were not considered.
This previous work has led us to the principal objective of this research: to combine the simulation strengths of SLEUTH and SVM. We focus on this objective by using an SVM-based probability map where geophysical and socio-economic forces drive the local suitability of urban growth (Mountrakis et al., 2011, Waske et al., 2010). We then compare the results with a combination of SLEUTH and binomial logistic regression (BLR – Lesschen et al., 2005, Verburg et al., 1999). Two primary research questions of this study are:
1. Can the performance of SLEUTH be enhanced by using an SVM-based probability map?
2. How well do SVM perform in comparison to the widely used BLR in an urban growth application?
The paper is structured as follows: Following this Introduction, Section 2 describes our study area as well as the compilation of the various input data sets. In Section 3 we detail the applied models and the chosen methods for assessing their accuracy. Section 4 presents an analysis of the selected driving forces, along with the derived probability maps and discusses the validation results. Finally, Section 5 provides a short conclusion and offers a perspective for future research.
Section snippets
Study area
The study area covers the mid-western part of the German federal state of North-Rhine Westphalia (NRW). The population density is 523 people per km² (2012). Approximately 50% of the state’s area is in agricultural use, and 25% is forested. Approximately 20% of the total area is built-up land – residential and commercial areas as well as transport infrastructure (IT.NRW, 2013). In addition to historical roots in agriculture and forestry, NRW has a considerable industrial heritage. For more than
Methods
We have described the land-use data and forces driving urban growth that constitute the principal inputs into the process of training the BLR and the SVM models. Urban growth probability maps generated by these models were then used to enhance the performance of the CA SLEUTH. Fig. 2 presents the modeling workflow, data, and methods used in this study.
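To make the coupling of a probability map with a cellular automaton more concrete, the following is a minimal toy sketch in Python. It is not SLEUTH's actual rule set and not the authors' implementation: the function name `growth_step`, the Moore-neighborhood rule, and the `min_neighbors` threshold are assumptions made purely for illustration. A non-urban cell with enough urban neighbors becomes urban with a probability taken from a suitability map, such as one produced by BLR or SVM.

```python
import random


def growth_step(urban, suitability, min_neighbors=1, rng=None):
    """Apply one synchronous toy growth step on a 2D grid.

    urban:        list of lists with 0 (non-urban) or 1 (urban)
    suitability:  list of lists with values in [0, 1], e.g. a probability map
    min_neighbors: hypothetical threshold of urban cells in the Moore neighborhood
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    rows, cols = len(urban), len(urban[0])
    new = [row[:] for row in urban]  # copy, so updates do not affect neighbor counts
    for r in range(rows):
        for c in range(cols):
            if urban[r][c] == 1:
                continue  # already urban; this toy model is growth-only
            # Count urban cells among the (up to) eight surrounding cells.
            neighbors = sum(
                urban[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            # Stochastic transition weighted by the local suitability value.
            if neighbors >= min_neighbors and rng.random() < suitability[r][c]:
                new[r][c] = 1
    return new


urban = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
suit = [[1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0]]
result = growth_step(urban, suit)
print(result)
```

In this example only the left column has a nonzero suitability, so growth from the central urban cell is directed exclusively to that column. This shows how a probability map can steer an otherwise purely neighborhood-driven automaton.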
A short overview of the theoretical and mathematical background of BLR and SVM will now be presented. Following this background material, the
Probability maps of urban growth
The driving forces discussed in Section 2.2.2 (Table 1) form the feature space (Eq. (2)) and build the base for training the 1984–2001 BLR and SVM urban growth models. The principal focus of our study is on the resulting probability maps of urban growth where every cell exhibits a continuous value indicating its probability of being urbanized. The ROC was used for assessing the performance of the models. Fig. 5 presents the ROC curve of both models. The curve of the SVM model clearly reaches a
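The ROC assessment mentioned above is commonly summarized by the area under the curve (AUC). The sketch below computes the AUC in its rank-based form, that is, the share of (urbanized, non-urbanized) cell pairs in which the urbanized cell received the higher probability. The function name `roc_auc` and the sample values are illustrative assumptions and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = changed, 0 = unchanged).

    Computed as the fraction of positive/negative pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical observed change (1/0) and modeled probabilities for four cells.
observed = [0, 0, 1, 1]
probability = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(observed, probability)
print(auc)
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to a perfect one. The pairwise loop is quadratic in the number of cells, so for full raster maps a sorting-based variant would be preferable.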
Conclusions and outlook
It was the aim of this study to enhance SLEUTH by using SVM and to assess its performance in comparison with a BLR-based model. This is the first study linking the SLEUTH urban CA with the machine learning approach of SVM. We have assessed several aspects of the accuracy of the spatially explicit urban growth models: their certainty (cut-off value), their probability performance (ROC) in comparison with random operators (Cohen’s Kappa), the quantity (κhisto) and the allocation ability of urban
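Cohen's Kappa, named above as the comparison against random agreement, can be sketched in a few lines of Python. This is the standard two-map form of the statistic. The function name `cohens_kappa` and the sample cell values are assumptions for illustration only, and the κhisto variant mentioned above is not covered here.

```python
def cohens_kappa(observed, predicted):
    """Return Cohen's Kappa for two equally long sequences of class labels."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    categories = set(observed) | set(predicted)
    # Observed agreement: share of cells with identical class in both maps.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Expected agreement under independence, from the class proportions.
    p_e = sum(
        (observed.count(k) / n) * (predicted.count(k) / n) for k in categories
    )
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical reference and simulated maps, flattened to four cells.
reference = [1, 1, 0, 0]
simulated = [1, 0, 0, 0]
kappa = cohens_kappa(reference, simulated)
print(kappa)
```

A value of 0 indicates agreement no better than chance, and 1 indicates perfect agreement between the simulated and the reference map.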
Acknowledgments
This study was carried out in the Remote Sensing Research Group (RSRG, Department of Geography, University of Bonn) with the support of Prof. Dr. Gunter Menz (RSRG) and Dr. Kerstin Voß (University of Education Heidelberg, Department of Geography). For the provision of the land-use and land-cover data we would like to thank the project “Visualisierung der Landnutzung und des Flächenverbrauchs in Nordrhein-Westfalen auf der Basis von Satellitenbildern” funded by the Ministry for Climate
Glossary
- AUC: Area Under Curve
- ATKIS: Amtliches Topographisch-Kartographisches Informationssystem
- BLR: Binomial Logistic Regression
- CA: Cellular Automaton
- MC: Monte Carlo iterations
- MRV: Multiple Resolution Validation
- NRW: North-Rhine Westphalia
- RBF: Gaussian Radial Basis Function kernel
- ROC: Receiver Operating Characteristic
- SLEUTH: Slope, Land use, Exclusion, Urban, Transportation, Hillshade
- SVM: Support Vector Machines
- UGM: Urban Growth Model
- UGMr: Urban Growth Model with reduced input data sets
- UGMr-AR: Urban Growth Model with reduced input
References (78)
- Landscape change and the urbanization process in Europe. Landscape and Urban Planning (2004)
- A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment (1991)
- et al. Spatial ecosystem modelling using parallel processors. Ecological Modelling (1991)
- et al. Support vector machines for predicting distribution of Sudden Oak Death in California. Ecological Modelling (2005)
- et al. Support vector machines in remote sensing: A review. ISPRS Journal of Photogrammetry and Remote Sensing (2011)
- et al. Optimal feature selection for support vector machines. Pattern Recognition (2010)
- et al. Useful techniques of validation for spatially explicit land-change models. Ecological Modelling (2004)
- et al. Land-cover change model validation by an ROC method for the Ipswich watershed, Massachusetts, USA. Agriculture, Ecosystems & Environment (2001)
- et al. Simulating urban growth in Mashad City, Iran through the SLEUTH model (UGM). Cities (2009)
- et al. Predicting land-use change. Agriculture, Ecosystems & Environment (2001)
- A spatial explicit allocation procedure for modelling the pattern of land use change based upon actual land use. Ecological Modelling
- Cellular automata for simulating land use changes based on support vector machines. Computers & Geosciences
- Testing popular visualization techniques for representing model uncertainty. Cartography and Geographic Information Science
- Practical Statistics for Medical Research
- Cities and Complexity: Understanding Cities with Cellular Automata, Agent-Based Models, and Fractals
- Geosimulation: Automata-Based Modeling of Urban Phenomena
- A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery
- The SLEUTH land use change model: A review. International Journal of Environmental Resources Research
- A self-modifying cellular automaton model of historical urbanization in the San Francisco Bay area. Environment and Planning B
- Support-vector networks. Machine Learning
- Support vector machines for spam categorization. IEEE Transactions on Neural Networks
- The Measurement of Interrater Agreement
- Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Transactions on Neural Networks
- Entwicklung eines fernerkundungsgestützten Modellverbundes zur Simulation des urban-ruralen Landnutzungswandels in Nordrhein-Westfalen
- The computer and the geographer. Transactions of the Institute of British Geographers
- Support vector machines for urban growth modeling. GeoInformatica
- On the mean accuracy of statistical pattern recognizers. IEEE Transactions on Information Theory
- Mobilität und Verkehrsverhalten der Ausbildungs- und Berufspendlerinnen und -pendler
- The measurement of observer agreement for categorical data. Biometrics