Supporting SLEUTH – Enhancing a cellular automaton with support vector machines for urban growth modeling

https://doi.org/10.1016/j.compenvurbsys.2014.05.001

Highlights

  • Creation of probability maps for urban growth derived by support vector machines and binary logistic regression.

  • Coupling of a modified version of SLEUTH with binary logistic regression and with support vector machines.

  • Simulation of urban growth between 1975 and 2005 in the southern part of North-Rhine Westphalia.

  • Examination of a possible enhancement of SLEUTH by using various validation techniques with different foci of analysis.

Abstract

In recent years, urbanization has been one of the most striking change processes in the socioecological system of Central Europe. Cellular automata (CA) are a popular and robust approach for the spatially explicit simulation of land-use and land-cover changes. The CA SLEUTH simulates urban growth using four simple but effective growth rules. Although SLEUTH performs very well, the modeling process is still strongly influenced by stochastic decisions, resulting in a variable pattern. Moreover, it gives no information about the human and ecological forces driving the local suitability of urban growth. Hence, the objective of this research is to combine the simulation skills of CA with the machine learning approach called support vector machines (SVM). The basic idea of SVM is to project input vectors into a higher-dimensional feature space, in which an optimal hyperplane can be constructed for separating the data into two or more classes. By using a forward feature selection, important features can be identified and separated from unimportant ones. The anchor point for coupling both methods is the exclusion layer of SLEUTH, which is replaced by an SVM-based probability map of urban growth. As a kind of litmus test, we compare the approach with the combination of CA and binomial logistic regression (BLR), a frequently used technique in urban growth studies. The integrated models are applied to an area in the federal state of North Rhine-Westphalia involving a highly urbanized region along the Rhine valley (Cologne, Düsseldorf) and a rural, hilly region (Bergisches Land) with a dispersed settlement pattern. Various geophysical and socio-economic driving forces are included and comparatively evaluated. The validation shows that the quantity and allocation performance of SLEUTH clearly improve when SLEUTH is coupled with a BLR- or SVM-based probability map. The combination enables the dynamic simulation of different growth types on the one hand and the analysis of various geophysical and socio-economic driving forces on the other. The SVM approach needs fewer variables than the BLR model, and SVM-based probabilities exhibit a higher certainty than those derived by BLR.

Introduction

In recent years, urbanization has become one of the most striking change processes in the coupled human-environment system of Central Europe (Antrop, 2004, Siedentop, 2006). The quantitative and qualitative measurement, prediction, and evaluation of land-use dynamics – especially urban sprawl – have come to play a central role in land-system science (Brown et al., 2004, Lambin and Geist, 2006, Verburg, 2006). The creation of appropriate and adjusted models is challenged by the complexity of urban systems with their manifold self-organization processes, non-linear relationships, and emergent properties as well as feedback loops between different spatio-temporal scales and compartments. In this context, an increasing number of studies testing the use of artificial intelligence techniques for urban simulation have emerged recently (Batty, 2005, Benenson and Torrens, 2004, Schwarz and Haase, 2007, Steven et al., 2002, Wu and Silva, 2010).

Among these popular techniques are spatially explicit cellular automata (CA) like SLEUTH (Clarke, Hoppen, & Gaydos, 1997). SLEUTH is an acronym for its suite of input data (Slope, Land use, Exclusion, Urban, Transportation, and Hillshade) and is a purely growth-oriented model. As a bottom-up approach it is not dependent on intensive preliminary studies regarding the general causes of urban growth in a study area or the location-specific driving forces (Clarke et al., 1997). Based on the principles of neighborhood effects and spatial autocorrelation, the simulation rules are relatively simple. However, due to its ability to capture the complex emergence of urban patterns, SLEUTH has been applied in several urban growth studies throughout the world (Chaudhuri and Clarke, 2013, Clarke et al., 1997, Rafiee et al., 2009, Silva and Clarke, 2005, Wu et al., 2008). Although SLEUTH can simulate urban growth quite accurately, the modeling process is strongly influenced by stochastic decisions, resulting in variable growth patterns (Chaudhuri & Clarke, 2013). Nor does SLEUTH provide information regarding important human and ecological forces which determine the local suitability of urban growth.

The machine learning concept called support vector machines (SVM) (Cortes & Vapnik, 1995) is able to overcome these disadvantages. SVM are used in a variety of applications for solving classification problems (Drucker et al., 1999, Guo et al., 2005, Mountrakis et al., 2011, Waske et al., 2010) and regression problems (Gestel et al., 2001, Verplancke et al., 2008). In SVM, explanatory variables are used to calculate probabilities for specified conditions (e.g. a raster cell belonging to a specific class). Conceptually, input vectors are projected into a higher-dimensional feature space in which an optimal hyperplane can be constructed for separating the input data into two or more classes. By using a specific feature selection, significant features can be identified and separated from those which are not of interest. In this manner, insights may be gained into those characteristic features which determine the separation process.
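The idea of separating classes with a hyperplane in a higher-dimensional feature space can be illustrated with a small sketch. It is not the model used in this study: the explicit feature map, the toy data, and all function names below are assumptions chosen for illustration. A real SVM would typically use a kernel (the glossary lists a Gaussian RBF kernel) and would find the maximum-margin hyperplane by optimization rather than by inspection.

```python
from itertools import product


def feature_map(x1, x2):
    """Map a 2-D input to a 3-D feature space (illustrative choice, not from the paper)."""
    return (x1, x2, x1 * x2)


def linearly_separates(weights, bias, samples, targets):
    """Return True if sign(w . x + b) agrees with every target label (+1 or -1)."""
    for x, y in zip(samples, targets):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        if score * y <= 0:
            return False
    return True


# XOR-like toy data: label +1 when both coordinates share a sign, else -1.
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [1, -1, -1, 1]

# In the original 2-D space, none of the candidate hyperplanes on a small
# grid separates the two classes (XOR data is not linearly separable).
grid = [-1, 0, 1]
separable_2d = any(
    linearly_separates((w1, w2), b, points, labels)
    for w1, w2, b in product(grid, repeat=3)
)

# After the mapping, the hyperplane "third feature = 0" separates them.
mapped = [feature_map(*p) for p in points]
separable_3d = linearly_separates((0, 0, 1), 0, mapped, labels)

print(separable_2d, separable_3d)
```

The added third coordinate is what makes the classes separable; a kernel achieves the same effect implicitly, without computing the mapped vectors.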

Xie (2006) implemented SVM for general land-use change objectives and tested different modification possibilities. Huang, Xie, and Tay (2010) extended the results of SVM for rural–urban simulation applications. A spatially explicit application for the dynamic modeling of urban growth was not performed, nor was a feature selection conducted to analyze the different impacts of the possible driving factors. Yang, Li, and Shi (2008) and Okwuashi, McConchie, Nwilo, and Eyo (2009) combined SVM with CA-based approaches for modeling urban growth in a spatially explicit way. Yang et al. (2008) applied their model to the city of Shenzhen; Okwuashi et al. (2009) modeled the Lekki area of Lagos. Both studies derived nonlinear transition rules for CA simulation of urban land-use dynamics but did not exploit the complementary advantages of SVM and CA. In both cases, urban growth was erroneously directed: the predicted growth of beaches and aquacultures was artificially limited (Yang et al., 2008), or urban edge growth was overestimated (Okwuashi et al., 2009). In addition, these two studies examined only common proximity and market access distance variables, while other important growth stimuli such as demographic changes or fluctuations in regional job markets were not considered.

This previous work has led us to the principal objective of this research: to combine the simulation strengths of SLEUTH and SVM. We focus on this objective by using an SVM-based probability map where geophysical and socio-economic forces drive the local suitability of urban growth (Mountrakis et al., 2011, Waske et al., 2010). We then compare the results with a combination of SLEUTH and binomial logistic regression (BLR – Lesschen et al., 2005, Verburg et al., 1999). Two primary research questions of this study are:

  • 1. Can the performance of SLEUTH be enhanced by using an SVM-based probability map?

  • 2. How well do SVM perform in comparison to the widely used BLR in an urban growth application?

The paper is structured as follows: Following this Introduction, Section 2 describes our study area as well as the compilation of the various input data sets. In Section 3 we detail the applied models and the chosen methods for assessing their accuracy. Section 4 presents an analysis of the selected driving forces, along with the derived probability maps and discusses the validation results. Finally, Section 5 provides a short conclusion and offers a perspective for future research.

Section snippets

Study area

The study area covers the mid-western part of the German federal state of North-Rhine Westphalia (NRW). The population density is 523 people per km² (2012). Approximately 50% of the state’s area is in agricultural use, and 25% is forested. Approximately 20% of the total area is built-up land – residential and commercial areas as well as transport infrastructure (IT NRW., 2013). In addition to historical roots in agriculture and forestry, NRW has a considerable industrial heritage. For more than

Methods

We have described the land-use data and forces driving urban growth that constitute the principal inputs into the process of training the BLR and the SVM models. Urban growth probability maps generated by these models were then used to enhance the performance of the CA SLEUTH. Fig. 2 presents the modeling workflow, data, and methods used in this study.
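How a probability map might stand in for the SLEUTH exclusion layer can be sketched as follows. The snippet above does not state the mapping the authors used, so this is an assumption: it treats exclusion values as integers from 0 (fully available) to 100 (fully excluded) and inverts the growth probability linearly. The function name, the optional mask, and the sample grids are hypothetical.

```python
def probability_to_exclusion(prob_map, hard_excluded=None):
    """Convert urban growth probabilities (0..1) into exclusion values (0..100).

    Assumption: high growth probability means low resistance, so the value is
    100 * (1 - p). Cells flagged in `hard_excluded` (e.g. water) are set to 100.
    """
    exclusion = []
    for r, row in enumerate(prob_map):
        out_row = []
        for c, p in enumerate(row):
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range at ({r}, {c}): {p}")
            value = round(100 * (1.0 - p))
            if hard_excluded is not None and hard_excluded[r][c]:
                value = 100  # never allow growth here, whatever the model says
            out_row.append(value)
        exclusion.append(out_row)
    return exclusion


# Hypothetical 2 x 2 probability map and a mask marking one water cell.
probabilities = [[0.9, 0.5], [0.1, 0.0]]
water_mask = [[False, False], [False, True]]
exclusion_layer = probability_to_exclusion(probabilities, water_mask)
print(exclusion_layer)
```

Keeping a separate hard mask is a design choice in this sketch: it lets legally or physically unavailable cells stay excluded even where the statistical model assigns them a high probability.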

A short overview of the theoretical and mathematical background of BLR and SVM will now be presented. Following this background material, the

Probability maps of urban growth

The driving forces discussed in Section 2.2.2 (Table 1) form the feature space (Eq. (2)) and build the base for training the 1984–2001 BLR and SVM urban growth models. The principal focus of our study is on the resulting probability maps of urban growth where every cell exhibits a continuous value indicating its probability of being urbanized. The ROC was used for assessing the performance of the models. Fig. 5 presents the ROC curve of both models. The curve of the SVM model clearly reaches a
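The ROC-based assessment mentioned here is usually summarized by the area under the curve (AUC, see Glossary). A minimal sketch, assuming the pairwise-ranking definition of the AUC and made-up cell values rather than the study's data: the AUC equals the share of (urbanized, non-urbanized) cell pairs in which the urbanized cell received the higher probability, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `scores` are predicted probabilities, `labels` are 1 (urbanized) or 0 (not).
    Quadratic in the number of cells, so only suitable for small samples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical probabilities for five cells and their observed change (1 = urbanized).
cell_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
cell_labels = [1, 1, 0, 1, 0]
auc = roc_auc(cell_scores, cell_labels)
print(round(auc, 3))
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect one, which is why the curves of two models can be compared by this single number.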

Conclusions and outlook

It was the aim of this study to enhance SLEUTH by using SVM and to assess its performance in comparison with a BLR-based model. This is the first study linking the SLEUTH urban CA with the machine learning approach of SVM. We have assessed several aspects of the accuracy of the spatially explicit urban growth models: their certainty (cut-off value), their probability performance (ROC) in comparison with random operators (Cohen’s Kappa), the quantity (κhisto) and the allocation ability of urban
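Of the accuracy measures listed, Cohen’s Kappa is the simplest to illustrate. The sketch below computes the standard Kappa for two binary maps flattened to lists; it does not implement κhisto or the other variants used in the study, and the sample maps are invented for illustration.

```python
def cohens_kappa(observed, simulated):
    """Standard Cohen's Kappa for two equally long binary (0/1) sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(observed)
    # Observed agreement: share of cells with identical class in both maps.
    p_o = sum(o == s for o, s in zip(observed, simulated)) / n
    # Expected agreement by chance, from the class proportions of each map.
    obs_urban = sum(observed) / n
    sim_urban = sum(simulated) / n
    p_e = obs_urban * sim_urban + (1 - obs_urban) * (1 - sim_urban)
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical reference map and simulation result (1 = urban, 0 = non-urban).
observed_map = [1, 1, 0, 0, 1, 0, 0, 0]
simulated_map = [1, 0, 0, 0, 1, 1, 0, 0]
kappa = cohens_kappa(observed_map, simulated_map)
print(round(kappa, 3))
```

Here six of eight cells agree (0.75), but because both maps are mostly non-urban, chance alone would yield an agreement of about 0.53, so the Kappa is considerably lower than the raw agreement.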

Acknowledgments

This study was carried out in the Remote Sensing Research Group (RSRG, Department of Geography, University of Bonn) with the support of Prof. Dr. Gunter Menz (RSRG) and Dr. Kerstin Voß (University of Education Heidelberg, Department of Geography). For the provision of the land-use and land-cover data we would like to thank the project “Visualisierung der Landnutzung und des Flächenverbrauchs in Nordrhein-Westfalen auf der Basis von Satellitenbildern” funded by the Ministry for Climate

Glossary

AUC
Area Under Curve
ATKIS
Amtliches Topographisch-Kartographisches Informationssystem
BLR
Binomial Logistic Regression
CA
Cellular Automaton
MC
Monte Carlo iterations
MRV
Multiple Resolution Validation
NRW
North-Rhine Westphalia
RBF
Gaussian Radial Basis Function kernel
ROC
Receiver Operating Characteristic
SLEUTH
Slope, Land use, Exclusion, Urban, Transportation, Hillshade
SVM
Support Vector Machines
UGM
Urban Growth Model
UGMr
Urban Growth Model with reduced input data sets
UGMr-AR
Urban Growth Model with reduced input

References (78)

  • P.H. Verburg et al.

    A spatial explicit allocation procedure for modelling the pattern of land use change based upon actual land use

    Ecological Modelling

    (1999)
  • Q. Yang et al.

    Cellular automata for simulating land use changes based on support vector machines

    Computers & Geosciences

    (2008)
  • J.C.J.H. Aerts et al.

    Testing popular visualization techniques for representing model uncertainty

    Cartography and Geographic Information Science

    (2003)
  • D.G. Altman

    Practical Statistics for Medical Research

    (1990)
  • M. Batty

    Cities and Complexity: Understanding Cities with Cellular Automata, Agent-Based Models, and Fractals

    (2005)
  • BBSR (2012). Trends der Siedlungsflächenentwicklung. Status quo und Projektion 2030. BBSR-Analysen KOMPAKT. Bonn:...
  • I. Benenson et al.

    Geosimulation: Automata-Based Modeling of Urban Phenomena

    (2004)
  • BMVBS/BBSR (2009). Einflussfaktoren der Neuinanspruchnahme von Flächen. Forschungen 139. Bonn: Bundesministerium für...
  • Brown, D.G., Walker, R., Manson, S., & Seto, K. (2004). Modeling Land Use and Land Cover Change. In: Gutman, G.,...
  • C.J.C. Burges

    A tutorial on support vector machines for pattern recognition

    Data Mining and Knowledge Discovery

    (1998)
  • G. Chaudhuri et al.

    The SLEUTH land use change model: A review

    International Journal of Environmental Resources Research

    (2013)
  • K.C. Clarke et al.

    A self-modifying cellular automaton model of historical urbanization in the San Francisco Bay area

    Environment and Planning B

    (1997)
  • C. Cortes et al.

    Support-vector networks

    Machine Learning

    (1995)
  • H. Drucker et al.

    Support vector machines for spam categorization

    IEEE Transactions on Neural Networks

    (1999)
  • J.L. Fleiss et al.

    The Measurement of Interrater Agreement

    (2004)
  • T.V. Gestel et al.

    Financial time series prediction using least squares support vector machines within the evidence framework

    IEEE Transactions on Neural Networks

    (2001)
  • R. Goetzke

    Entwicklung eines fernerkundungsgestützten Modellverbundes zur Simulation des urban-ruralen Landnutzungswandels in Nordrhein-Westfalen

    (2012)
  • Goetzke, R., Over, M., & Braun, M. (2006). A method to map land-use change and urban growth in North Rhine-Westphalia...
  • T. Hägerstrand

    The computer and the geographer

    Transactions of the Institute of British Geographers

    (1967)
  • Herold, M., Menz, G., Clarke, K.C. (2001): Remote Sensing and Urban Growth Models. Demands and Perspectives. In:...
  • Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2010). A Practical Guide to Support Vector Classification. Taipei: Department...
  • B. Huang et al.

    Support vector machines for urban growth modeling

    GeoInformatica

    (2010)
  • G.F. Hughes

    On the mean accuracy of statistical pattern recognizers

    IEEE Transactions on Information Theory

    (1968)
  • A. Hullmann et al.

    Mobilität und Verkehrsverhalten der Ausbildungs- und Berufspendlerinnen und -pendler

    (2002)
  • IT NRW. (2013). Landesdatenbank NRW. Landesdatenbank...
  • Judex, M. (2008). Modellierung der Landnutzungsdynamik in Zentralbenin mit dem XULU-Framework....
  • LAG21 (Ed.). (2008). Flächenmanagement als partizipativer Prozess einer nachhaltigen Stadtentwicklung – Dokumentation....
  • Lambin, E.F., & Geist, H.J. (2006). Introduction: Local processes with global impacts. In: Lambin, E.F. & Geist, H.J....
  • J.R. Landis et al.

    The measurement of observer agreement for categorical data

    Biometrics

    (1977)