1 Introduction

The low cost of sensors for monitoring airborne pollutants, together with the abundance of environmental data, has led to a dramatic increase in the amount of pollution data available for analysis (Bellinger et al. 2017). Air quality prediction is vital because of its role in safeguarding human health, and ground-level O3 prediction is therefore also crucial (Samsuri Abdullah et al. 2020; Geetha and Prasika 2018). Clean air is essential for human health and well-being, and air pollution is a real hazard to both people and the planet (Mohd Napi et al. 2020). Indeed, ground-level O3 is a global air pollution issue and a dangerous pollutant. In non-rural zones, ozone levels vary widely with meteorological conditions and gas emissions (Abdul Aziz et al. 2019). Tropospheric ozone remains an almost inescapable air pollutant that affects human well-being, sustainability and the planet as a whole. Indeed, high concentrations of O3 in the tropospheric layer pose a prospective danger to human health, crops, flora and fauna (Arsić et al. 2020).

Ozone, as illustrated in Flowchart 1, is crucial because it plays an important role in the thermal balance of the atmosphere. In urban areas, premature deaths and chronic diseases among vulnerable community members are caused by hazardous air quality (Abdullah et al. 2019; World Health Organization 2018). Owing to this, the artificial neural network (ANN) has become a field of interest for prediction purposes, especially for meteorological data and volatile organic compounds (Cabaneros et al. 2019). Recently, ANNs have performed well in various short- and long-term prediction applications (Lightstone et al. 2017; Rahimi 2017).

Flowchart 1. Illustration of the production and interaction of O3

Owing to this, to help minimize the related consequences, short-term prediction is highly recommended, since it is repeated frequently over short intervals and allows public authorities to raise the population's awareness. Recently, the support vector machine (SVM), alongside the artificial neural network, both derived from soft computing, has become widely used for air quality prediction based on meteorological and gas emission data obtained from air monitoring networks (Al-Abri and S. 2016).

Research papers that focus on the use of "machine learning" or "artificial intelligence" in ozone prediction are checked and evaluated throughout, following the modelling protocol recommended by Wu et al. (2014). In addition, the performance of machine learning models for ozone prediction is examined, as are papers concerning tropospheric and stratospheric ozone concentrations in industrial and urban areas (Wang et al. 2018c). Although the multilayer perceptron (MLP) network is widely used in various prediction and forecasting applications (Cabaneros et al. 2017; Muslim et al. 2020), other model formulations are also examined. It is expected that, through this paper, readers will clearly understand the variety of terminologies and concepts in ozone prediction approaches. For detailed discussion of the different subjects mentioned in the papers and textbooks, see Bishop (2014), Kotzias et al. (2009) and Swamy (2018). All abbreviations used are listed in Table 1.

Table 1 List of abbreviations

The development and types of ANN models are outlined briefly. A detailed discussion of every single stage is provided elsewhere (see Galelli et al. 2014; Humphrey et al. 2017; Jiang et al. 2004). Furthermore, the taxonomies of current selections at the different stages of prediction development were based on previous studies. The outcome of the papers reviewed is that the concentration of O3 reflects the meteorological conditions: pollution due to high ozone concentration is affected by a variety of factors and is a typical result of meteorological conditions such as temperature and relative humidity.

2 Literature Review

In this paper, the articles covered were selected from the following international journals: Springer Nature, Atmospheric Environment, Environment International, Journal of Cleaner Production, American Geophysical Union, Atmosphere (MDPI), Environmental Research, Environmental Science and Technology, Science of the Total Environment, Atmospheric Chemistry and Physics, Environmental Pollution, Frontiers in Earth Science, The International Ozone Association, International Journal of Environmental Research and Public Health and International Journal of Automation and Computing. Also included were international journals such as International Journal of Modelling on Simulation, Journal of Intelligent and Fuzzy Systems, Neural Networks, Air Quality, Atmosphere, and Health, Journal of Hazardous Materials, Water Resources, Journal of the Air and Waste Management Association, ScienceDirect, Chemosphere, Ecological Modelling, Atmospheric Measurement Techniques, Geoscientific Model Development, Journal of Scientific and Industrial Research, Environmental Modelling and Software, World Health Organization, Atmospheric Pollution Research, Neural Networks in a Softcomputing Framework, Fresenius Environmental Bulletin, Integrative Biology, Journal of Advanced Science and Technology, Advances in Neural Information Processing Systems, Advances in Space Research, Aerosol and Air Quality Research, Aerosol Science and Technology, Agricultural Water Management, Algorithms, American Chemical Society, American Journal of Epidemiology, American Statistical Association, Applied Acoustics, Applied Sciences, Applied Soft Computing, Applied Water Science, Arabian Journal of Geosciences, Artificial Intelligence Review, Atmosphere, Atmospheric and Oceanic Optics, Atmospheric Research, Chemical Engineering Research and Design, Chemometrics and Intelligent Laboratory Systems, Chinese Journal of Environmental Engineering, Computational Intelligence and Neuroscience, Computer Aided Chemical Engineering, Decision Support Systems, Ecological Processes and Environmental Modelling and Assessment.

The research papers were accessed through EZproxy library catalogues by Universiti Tenaga Nasional, ScienceDirect, IEEE Xplore and Scopus. Search terms included "Ozone Prediction", "Ozone Concentration Forecasting", "Artificial Neural Networks" and "Machine Learning and ozone concentration". For each database used in the review process, the keyword search procedure was repeated until no new citations appeared. Additionally, the reference lists of the papers reviewed were tracked to obtain further references. The articles selected for this review were published from 2015 to June 2020. Articles related to stratospheric and tropospheric ozone concentration are included, whereas articles with undesirable performance or results similar to other approaches were not covered. Table 2 states the characteristics of the covered papers, such as the authors, study area and data examined.

The main contribution of this review is to present recent machine learning techniques, including SVM, ANN, decision tree and hybrid models, for predicting ozone concentration. A variety of papers from different journals and scientific platforms has been reviewed.

Details regarding the publication year of the papers reviewed are shown in Fig. 1a, while the distribution of articles by related air pollutant parameter is presented in Fig. 1b. The findings show that oxides of nitrogen (e.g. NO2 and NO), PM10, PM2.5 and O3 are the most frequently tested parameters in the papers reviewed. O3 concentration prediction was tested in 63 of the 156 papers, whereas PM concentration prediction, at roughly half that rate, was studied in 28 papers. NO and CO were studied in 20 and 10 papers, respectively. Other variables were discussed with lower research intensity.

Fig. 1. a Distribution of research papers by publication year. b Distribution of air pollutants involved in this paper

The distribution of research papers by study area is shown in Fig. 2. A growing number of papers since 2015 address the prediction of ozone concentration with machine learning (ML) algorithms. This increase reflects the difficulty of accessing ML algorithms in the past (Cabaneros et al. 2019); recently, the availability of ML algorithms as faster and more helpful computing tools has earned research attention.

Fig. 2. Distribution of research papers by study area

The aim of this paper is to investigate all aspects affecting the prediction of O3 and the accuracy of models. This is done through a clear discussion that integrates previous studies in the same or similar fields. The research studies involved in this review are organized as shown in Table 2.

Table 2 Details of papers reviewed

3 Discussion

3.1 Theoretic Approaches

The prediction process proceeds with the help of past and present data of the required variable and then forecasts the future trend using logical reasoning and scientific methods (Yang et al. 2020). Historical data contain full knowledge of the different traits and historical behaviours of the data system. Time series prediction analysis plays an important role in catastrophe prevention and rational decision-making in various areas, so various approaches are required to efficiently predict data with different traits (Wang et al. 2018a).

Air pollutant concentration prediction is essential for issuing air quality warnings, which have practical significance for the community (Bai et al. 2018). Owing to this, several types of approaches have been established for ozone time series prediction analysis.

3.1.1 Information Theoretic Approach

Based on Kingman and Kullback (1970), information theories encompass subject fields that create an interdisciplinary framework for exploring different dynamic systems, their nonlinear relationships and their behaviours (Mayer et al. 2014). Understanding these characteristics is necessary for atmospheric and environmental systems characterized by a high degree of complexity and nonlinearity (Chattopadhyay et al. 2020). Thus, a study carried out by Chattopadhyay et al. (2020) on the dependence of tropospheric ozone on several of its precursors during the summer monsoon explored the intrinsic multicollinearity via Bartlett's sphericity test. Additionally, PCA was performed to obtain the most influential precursors of ozone by identifying the principal components and maximizing factor loadings. This study investigated the intrinsic uncertainty and identified the normal distribution. However, further study is needed for the post-monsoon and winter seasons, the periods when pollution reaches alarming levels.

The detrended fluctuation analysis of power law correlations in column ozone over Antarctica was carried out by Varotsos (2005a) using a dataset from 1979 to 2003. The study illustrated the role of planetary waves in the spatiotemporal scaling of the Antarctic O3 hole and found that the intrinsic dynamics of column ozone at the edge of Antarctica have changed since 1996, so the earlier scaling behaviour no longer holds for the Antarctic ozone hole (Varotsos 2005a). Similarly, in line with detrended fluctuation analysis for the global ozone layer, Varotsos (2005b) utilized modern computational techniques. This analysis was performed on globally and zonally averaged column ozone data from satellite-borne (1979–2003) and ground-based (1964–2004) instruments to identify long-term correlations in the column ozone time series. The results showed that column ozone fluctuations exhibit persistent long-range power law correlations at all time lags.

3.1.2 Fuzzy Set Theoretic Approaches

In 1993, Song and Chissom proposed the definition of fuzzy time series (FTS) based on fuzzy sets (Song and Chissom 1993). Recently, fuzzy time series has been utilized for the prediction of air pollutants. In a study carried out by Wang et al. (2017), fuzzy set theory was used to build the air quality index: the paper deploys the trapezoidal function to quantify the negative effects of individual pollutants as membership degrees. However, the limitation of the research is the low number of air pollutants examined, while ignored environmental variables might influence the AQI level. Furthermore, pollutant concentrations are considered with respect to time, whereas a variety of factors correlated with prediction efficiency has not been considered.
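To make the fuzzy set idea concrete, the sketch below implements a trapezoidal membership function of the kind such AQI formulations deploy. It is a minimal Python illustration; the breakpoints are hypothetical and are not those used by Wang et al. (2017).

```python
def trapezoidal_membership(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c],
    falls on [c, d] and is 0 outside [a, d]."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

# Hypothetical breakpoints (ppb) for a "moderate" ozone band
for o3 in (40, 55, 70, 85, 100):
    print(o3, trapezoidal_membership(o3, a=40, b=60, c=80, d=100))
```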

Another fuzzy approach was developed by Wojtylak (2012); this approach utilized fuzzy time series models to predict pollutants such as O3, CO, NO, SO2, PM2.5 and PM. A positive aspect of the study is that it encompasses all the data, including chaotic, uncertain and imprecise data that could not otherwise be utilized. Although the experimental results were promising, the uncertainty and stability analysis, which is vital for the proposed model, is still missing; not performing it leads to insufficient results.

In line with this demand for stability and uncertainty analysis, Yang et al. (2020) implemented both analyses to evaluate the robustness of their model. This novel combined prediction system for air pollution relies on fuzzy theory and aggregation weight optimization. The proposed system can take more information into account while maintaining the diversity of its component models. For the prediction of PM2.5 and PM10, it outweighed other models, such as the backpropagation artificial neural network (BPNN), extreme learning machine (ELM) and double exponential, in the generalization, stability and accuracy that underpin a robust air quality early-warning system in practice.

3.1.3 Probabilistic Set Theoretic Approaches

Probabilistic approaches to air quality prediction have been highly preferred as a way to enhance ozone prediction capability (Vautard et al. 2009). To support air quality forecasters in the USA, a novel approach to probabilistic forecasting of surface ozone was proposed (Balashov et al. 2017). This study investigates a surface ozone prediction approach that relies on standard meteorological variables and statistical methods. It aims to generate probabilistic MDA8 ozone predictions; REGiS weighs and combines the developed regression models according to the weather patterns forecast by an NWP model. The outcomes showed that the model performs best when trained and adjusted individually for a single air quality monitoring station and its corresponding meteorological site. However, this approach is not designed to deal with sudden local emission changes or events such as biomass fires.

The National Center for Atmospheric Research has carried out research to enhance air quality prediction with an analog ensemble (Delle Monache et al. 2020). The analog ensemble estimates the probability distribution of the true state given a current deterministic numerical forecast and an archive of previous analogous forecasts paired with their observations. The resulting probabilistic predictions are statistically sharp, reliable and consistent, quantifying the uncertainty of the underlying forecast. However, a wide range of datasets is required, since analog-based approaches perform poorly when dealing with wildfire smoke events.
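The analog ensemble idea can be summarized in a few lines: given today's deterministic forecast, retrieve the k most similar past forecasts and use their paired observations as ensemble members. The Python sketch below is a simplified illustration on synthetic data using plain Euclidean distance; the operational metric and predictor weighting of Delle Monache et al. (2020) differ in detail.

```python
import numpy as np

def analog_ensemble(past_forecasts, past_observations, current_forecast, k=10):
    """Return the k observations whose paired past forecasts are closest
    (Euclidean distance) to the current deterministic forecast."""
    dist = np.linalg.norm(past_forecasts - current_forecast, axis=1)
    members = past_observations[np.argsort(dist)[:k]]
    return members, members.mean(), members.std()

rng = np.random.default_rng(0)
F = rng.normal(size=(500, 3))            # archived forecasts (3 predictors)
obs = F @ np.array([1.0, -0.5, 0.3]) + 0.1 * rng.normal(size=500)
members, mean, spread = analog_ensemble(F, obs, current_forecast=F[0], k=15)
print(round(mean, 3), round(spread, 3))  # probabilistic summary of the ensemble
```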

A probabilistic prediction study for extreme NO2 pollution episodes was conducted by Aznarte (2017), using meteorological measurements and NO2 concentrations to construct models for predicting extreme NO2 concentrations. The experimental results demonstrated the reliability, accuracy and sharpness of the predictions, which outweigh point-forecasting alternatives; an investigation of the relative importance of the independent variables was also included. The study presented an uncomplicated, comprehensible way of computing the probability of exceeding thresholds, maximizing the advantages of probabilistic forecasts. Nevertheless, it lacks longer forecasting horizons and needs to include spatiotemporal considerations as well as other numerical prediction covariates.

3.2 Overview of Prediction Modelling

Based on the procedure described in Flowchart 2, models are generated automatically from data in ML. Data are the origin of ML approaches (Wang et al. 2019). The relationship between the inputs and outputs of an ML model can be described by the data, which may also be used in unsupervised techniques (Khan and Kumar 2019). Similarly, datasets encode the specification that needs to be incorporated into the ML model. Moreover, model selection should be implemented in order to choose an effective ML model. Data processing, data partitioning, model selection, feature engineering, training and evaluation all take place in ML, while also empowering governance, repeatability and collaboration (Wang et al. 2019). In this study, we focus on different ML models that are extensions of the well-known SVM (Kumar et al. 2019; Tanaskuli et al. 2020).

Fig. 3. Optimal hyperplane in SVM

3.3 Support Vector Machine

SVM is a supervised learning approach in which no assumption is made about the underlying distribution of the data (Mountrakis et al. 2011). It is a reputable machine learning technique that is widely applied to regression and classification problems (Faris et al. 2018), and its performance depends on the determination of its parameters (Beltrami and da Silva 2020). In regression forecasting, the support vector machine created by Vapnik (1979) is massively utilized as a comprehensive mathematical model. It provides reliable classification of high-dimensional big data by reducing it to a small number of data points (support vectors), allowing differential subgroups to be obtained in a short time (Ozer et al. 2020). Although generated for classification purposes, support vector regression (SVR) approaches can be successfully implemented for regression issues. SVR has been utilized to forecast cloud cover, visibility and solar radiation, and recently it has been used extensively to predict air pollutant concentrations such as O3 (Quej et al. 2017). According to Vapnik (2000), SVR provides an upper bound on the generalization error, following the structural risk minimization principle. SVR has shown superior accuracy compared to a number of statistical forecasting methods (Su et al. 2020). The SVR algorithm approximates the regression function using the equations (Eqs. (1), (2) and (3)) stated below

$$ y= w\phi (x)+b $$
(1)

where w is the weight vector, ϕ(x) is the high-dimensional feature space mapping, x is the input, y is the output and b is the bias coefficient.

These parameters can be estimated by minimizing the regularized risk function below.

$$ R=\frac{{\left\Vert w\right\Vert}^2}{2}+C{\sum}_{i=1}^N{L}_{\epsilon}\left({x}_i,{y}_i,f\ \right) $$
(2)

where

$$ {L}_{\varepsilon}\left({x}_i,{y}_i,f\right)=\left\{\begin{array}{c}\left|{y}_i-f\left({x}_i\right)\right|-\varepsilon, \mid {y}_i-f\left({x}_i\right)\mid \ge \varepsilon \\ {}0,\mathrm{otherwise}\end{array}\right. $$
(3)
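As a concrete illustration of Eqs. (1)–(3), the following minimal scikit-learn sketch fits an ε-insensitive SVR on synthetic data; here C plays the role of the regularization constant in Eq. (2) and epsilon the tolerance ε in Eq. (3). It is a toy example, not a reproduction of any reviewed study.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(300, 1))            # single synthetic predictor
y = np.sin(X).ravel() + 0.1 * rng.normal(size=300)

# C weights the epsilon-insensitive loss term of Eq. (2); epsilon is the
# tolerance band of Eq. (3) inside which residuals incur no loss.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
model.fit(X, y)
print("support vectors:", model[-1].support_vectors_.shape[0])
```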

Currently, in the big data era, ML users face a variety of challenges related to the volume, variety and velocity of data when applying a support vector machine in an environmental scheme. Varied datasets constructed on a daily basis are growing rapidly across scientific and engineering fields, including text categorization, medical imaging, computational biology, genomics and banking. Although abundant data implies more opportunities for extracting and uncovering useful knowledge, which may appear beneficial at first glance, the prohibitive time and storage requirements of SVM training become limiting (Liu et al. 2017b; Qiu et al. 2016).

In the training procedure, the training dataset is divided into two classes in order to determine a hyperplane, as shown in Fig. 3. Its location is identified by a (normally small) subset of vectors extracted from the training set (T), named support vectors (SVs). Identifying which vectors are selected as SVs boosts the interpretability of SVM decisions (Mountrakis et al. 2011). Although the hyperplane separates the data linearly, SVMs can deal with nonlinear problems by using kernel functions to map the data into spaces where they become linearly separable.

Flowchart 2. Life cycle of ML

An important obstacle of SVMs lies in their O(t²) memory and O(t³) time complexity, where t is the cardinality of T. This drawback has attracted researchers' attention; the developed approaches therefore target either improving the training procedure or obtaining reduced SVM training sets from which support vectors are likely to be identified (Nalepa and Kawulok 2019). This paper summarizes the accomplishments in this area.

3.3.1 Model Selection for SVMs

Model selection for support vector machines is intrinsically the problem of identifying the SVM hyperparameters, including the kernel and its parameters; it is thus a computationally expensive task (Ding et al. 2015). SVMs not only achieve high prediction rates in a large number of real applications; their efficiency and classification accuracy also depend on feature subset selection and parameter settings (Faris et al. 2018). Automated model selection is therefore essential, because unsuitably tuned parameters can impair SVM performance. A crucial obstacle in SVM modelling is determining optimal values for its hyperparameters, given their importance for the SVM's efficiency and effectiveness; this task is known as the SVM model selection problem (Kalita and Singh 2020). As shown in Flowchart 3, the SVM modelling stages run from inserting the input variables to the final decision on model selection.

Flowchart 3. SVM modelling process

For determining the parameters of an SVM, grid search (GS) is the simplest, best-known and most suitable algorithm; on the other hand, its time consumption limits GS for large-scale cases (Beltrami and da Silva 2020). Swamy (2018) suggested a sophisticated covariance-matrix-based adaptation strategy to deal with parameterized kernel spaces for kernel selection. This strategy outperforms a standard grid search for determining the hyperparameters (grid search is clearly not applicable for a wide range of parameters). Overall, the support vector machine is influenced by the determination of the regularization parameters and the kernel function (Liu et al. 2017b). Among the available choices, the Gaussian function, known as the radial basis function (RBF) kernel, is the most commonly applied (Beltrami and da Silva 2020). Based on the study of Aladeemy et al. (2017), the RBF kernel can achieve most decision boundary shapes. The RBF kernel is expressed by Eq. (4) or (5)

$$ K\left({x}_i,{x}_j\right)={e}^{-\frac{{\left|\left|{x}_i-{x}_j\right|\right|}^2}{2{\sigma}^2}} $$
(4)
$$ K\left({x}_i,{x}_j\right)={e}^{-\gamma {\left|\left|{x}_i-{x}_j\right|\right|}^2} $$
(5)
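A minimal grid search over the regularization constant C and the RBF parameter γ of Eq. (5) can be written with scikit-learn as below; the grid values and synthetic data are illustrative, not recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

param_grid = {
    "C": [0.1, 1, 10, 100],          # regularization constant of Eq. (2)
    "gamma": [1e-3, 1e-2, 1e-1, 1],  # RBF width parameter of Eq. (5)
}
# Exhaustively evaluate every (C, gamma) pair with 5-fold cross-validation
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```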

To minimize feature redundancy and enhance prediction accuracy, Zhang et al. (2020) applied the minimum redundancy maximum relevance (MRMR) technique. This approach performed well in feature reduction, an efficient way to enhance the performance of the model selection framework.

The genetic algorithm (GA) performs well in solving problems of maximizing a fitness value (Roy et al. 2020), which explains the incorporation of genetic algorithms into the model selection criterion by Lessmann et al. (2006). To improve the GA's computational efficiency, Iram et al. (2018) proposed a genetic technique based on advanced selection and mutation schemes. The efficiency of this GA optimizer is due to the suggested operators, known as enhanced selection and log-scaled mutation GA (ESALOGA), which improve diversity, precision and consistency. In the hybrid GA, the optimizer is integrated with a gradient descent approach (Roy et al. 2020). GA-based model selection was utilized for smooth twin parametric-margin SVMs by Wang et al. (2013). The GA-based numerical moment matching approach proposed by Jahani et al. (2020) serves as the main uncertainty estimation approach for a wide range of datasets, using assessors' data and key features.

In the latest algorithm proposed by Faris et al. (2018), the multiverse optimizer (MVO) is implemented to select optimal features and optimize the SVM parameters simultaneously. For better accuracy and dimensionality reduction, a new variation of the cohort intelligence (CI) algorithm, suggested by Aladeemy et al. (2017), is applied. The twin support vector machine (TSVM) is developed from the standard support vector machine and is often reported to outperform it. However, while the wavelet TSVM enlarges the kernel function (KF) selection framework and improves generalization, it does not perform well on the parameter selection problem (Wang et al. 2020c).

Another option is grid search, which is popular and considered the easiest approach for choosing SVM parameters. The novelty of Beltrami and da Silva (2020) is applying the quadtree technique to the otherwise time-consuming GS in order to speed up the process and deal with large-scale cases. However, methods that analyze class separability outperform the grid search algorithm. Thus, Yin and Yin (2016) proposed a developed index named the expected square distance ratio (ESDR), which performs well as a class separability criterion. Also, the adaptive fusion of multiple kernel functions proposed by Wang et al. (2020b) achieves robust SVM generalization ability and increases forecast accuracy. Zhang and Song (2015) observed that different types of kernels might be similarly efficient for specific data and suggested a multilabel kernel recommendation approach based on the characteristics of the data. A motivating model-transferring method for model adaptation, named the heterogeneous max-margin classifier adaptation approach (HMCA), was proposed by Mozafari and Jamzad (2016).

With the aim of speeding up the model selection process of SVMs, parallel algorithms have been suggested (Devos et al. 2014). Using parallel algorithms based on grid search, and by removing the points that have an extremely small chance of becoming support vectors, the computational time has been reduced. To accelerate the process further, Fayed and Atiya (2019) proposed a faster technique using an SVM-based model with a particle swarm optimization algorithm.

Researchers pay more attention to constructing new kernels than to developing the current kernel functions (Zhang and Song 2015). For example, Gruca et al. (2014) proposed SVM techniques with kernels constructed by neuro-fuzzy systems. It should be borne in mind that identifying the targeted SVM model ought to be combined with methods for training SVMs on large datasets (specifically, for decreasing the cardinality of SVM training sets), since the best-performing kernel may depend on the outcome of the training set selection technique.

3.4 Decision Tree

The decision tree (DT) is a machine learning technique that is massively utilized in recognition, survival analysis, regression and classification (Li et al. 2019). Owing to its unique benefits, the DT has become one of the most used and most reliable techniques for prediction purposes. By training multiple trees and aggregating their predictions, the forecasting performance of an individual DT is enhanced (Rokach 2016). The decision tree methodology applies a data mining approach to construct classification systems, enhancing forecast techniques for a target variable based on multiple covariates (Song and Lu 2015). Generally, feature selection reduces the number of features while maintaining equal, or sometimes better, learning performance (Rao et al. 2019). Decision trees classify instances using the feature values of those instances; each node in a decision tree represents a feature of the instance to be classified (Pandya 2015).

Being non-parametric, the DT can deal efficiently with large, complex datasets without imposing a complex parametric structure, and it performs very well in prediction using historical data (Song and Lu 2015). The decision tree training process strongly needs to be parallelized to meet big data requirements (Meng et al. 2016); due to the high communication cost, Meng et al. (2016) suggested the parallel voting decision tree.
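As a brief illustration of the technique, the sketch below fits a depth-limited regression tree on synthetic data with scikit-learn; the parameter values are illustrative only.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=6, noise=5.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# max_depth limits tree growth; a shallow tree trades accuracy for readability
tree = DecisionTreeRegressor(max_depth=4, min_samples_leaf=10, random_state=1)
tree.fit(X_tr, y_tr)
print("R2 on held-out data:", round(tree.score(X_te, y_te), 3))
```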

3.4.1 Different Algorithms of Decision Tree for Regression Model

Decision Tree Learning

Decision tree learning uses a decision tree as a predictive model that maps observations about an item to conclusions about the item's target value (Somvanshi et al. 2017). The decision tree is preferred for its reputation of being easier to apply and explain than other quantitative data-driven approaches. As a regression model, a robust and simple decision tree learning algorithm was suggested by Liu et al. (2017a) to forecast the prices of copper and other metals. To identify abnormal damage and shortcomings, Abdallah et al. (2018) applied a decision tree learning algorithm to wind turbine telemetry.

A regression decision tree can be applied to regression cases involving a continuous target attribute. This type of tree may involve at least four different node representations: internal nodes can incorporate oblique or univariate tests, while the leaves can be joined with multivariate regression models or uncomplicated constant forecasts (Czajkowski and Kretowski 2016). The fundamental conception is to combine linear regression and decision trees to predict a numerical target attribute.

Nowadays, great research attention has focused on the construction of decision tree ensembles, motivated by the widely used Bayesian additive regression trees (BART) framework as a generative probabilistic model (Linero 2018). These techniques are executed by means of an efficient recursive partitioning model; the most efficient split at every tree node is typically chosen using a least squares error criterion (Rokach 2016). Taylor et al. (2018) report that regression tree models perform best without variable transformation when imputing missing values constrained by non-random missingness. Based on a leave-one-location-out cross-validation (LOLO CV) procedure, gradient boosting achieved the highest accuracy compared to ten other machine learning models, with the lowest root mean square error (RMSE) (Watson et al. 2019). Due to its high R2 accuracy, regression tree modelling may be preferred in practical terms for egg weight prediction over multiple linear regression and ridge regression techniques (Orhan et al. 2016).

Random Forest

Machine learning algorithms can handle interactions and nonlinearities without requiring a specific regression model to be specified (Pavlov 2019). The most preferred and widely utilized ML technique in the non-streaming (batch) setting nowadays is the random forest. This popularity is attributable to its low demands for hyperparameter tuning and input preparation combined with high performance (Zimmerman et al. 2018). It works by creating many DTs at training time and outputting the average forecast (regression) of the single trees (Xi et al. 2015).

For modelling regression relationships between air pollutant concentrations, random forests are widely applied as an advanced data mining approach (Kamińska 2018). In the challenging case of evolving data streams, however, the random forest (RF) approach does not perform better than the boosting or bagging techniques (Gomes et al. 2017). To minimize the overfitting risk, the random forest uses bootstrap aggregation of multiple regression trees; to achieve high-accuracy forecasts, it combines the forecasts extracted from many trees (Amaral et al. 2013). Using random forests, Shah et al. (2014) suggested multiple imputation, which can be helpful for epidemiology datasets. An evolved technique known as random forest spatiotemporal kriging (RF-STK) was developed to predict daily NO2 concentration, with good predictive results (Zhan et al. 2018b). Kakkar and Jain (2016) proposed a framework using attribute selection to improve defect prediction, reducing the total number of attributes utilized by an average of 6-fold for every dataset.
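The bootstrap-and-average principle described above reduces, in code, to a few lines; the scikit-learn sketch below is a generic illustration on synthetic data rather than a reproduction of any reviewed study.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=600, n_features=8, noise=8.0, random_state=2)

# Each of the 200 trees is grown on a bootstrap sample of the data;
# predictions are the average over all trees
rf = RandomForestRegressor(n_estimators=200, max_features="sqrt", random_state=2)
print("CV R2:", cross_val_score(rf, X, y, cv=5).mean().round(3))
```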

Gradient Boosting

The gradient boosting decision tree (GBDT) is a machine learning technique for regression and classification cases that generates a prediction model as an ensemble of prediction models (Aartsen et al. 2015). Its principle is to combine a set of weak classifiers to form a strong one. The primary obstacle is that evaluating the information gain of all possible split points for each feature requires scanning every single data instance, which is consequently time-consuming. To overcome this limitation, Ke et al. (2017) suggest two approaches: exclusive feature bundling (EFB) and gradient-based one-side sampling (GOSS). In the context of ozone concentration prediction, Jumin et al. (2020) applied a boosted decision tree that outperformed neural network techniques and linear regression for all stations examined. GBDT is a well-known approach with a fair number of efficient implementations, such as Extreme Gradient Boosting (XGBoost) and pGBRT (Rao et al. 2019).

The XGBoost package is an effective and sophisticated implementation of the gradient boosting framework (Melville 2014). XGBoost is a highly effective and widely utilized machine learning approach, massively applied by analysts to obtain desirable results on various machine learning challenges. The algorithm runs more than ten times faster than existing popular solutions on a single machine, and it scales to billions of examples in memory-limited or distributed settings (Agarwal et al. 1994). The system achieves higher forecast accuracy compared with other studies (Liu et al. 2020c). The package includes a tree learning algorithm and an effective linear model solver, and it provides different objective functions, including regression, classification and ranking. Since the system is designed to be modifiable, users can easily define new objectives based on their interests.
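Assuming the xgboost package is installed, a minimal regression sketch of the boosted tree workflow looks as follows; the hyperparameter values are illustrative.

```python
# pip install xgboost scikit-learn
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=800, n_features=10, noise=10.0, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=3)

# Trees are added sequentially; each new tree fits the gradient of the loss,
# and learning_rate shrinks every tree's contribution
model = XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=4,
                     subsample=0.8, random_state=3)
model.fit(X_tr, y_tr)
print("R2 on held-out data:", round(model.score(X_te, y_te), 3))
```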

3.4.2 Attribute Selection for Regression

Dimensionality reduction and feature selection are crucial topics in building effective regression and classification models to enhance decision-making with data-based learning techniques. A suitable feature subset supports regression model performance, particularly in high-dimensional feature spaces (Zhang et al. 2018b). However, the comprehensibility of trees can be strongly affected by bias in split attribute selection, as conventional heuristic approaches are biased toward multivalued attributes. In the same context, to avoid the traditional heuristic attribute bias, a feature selection approach for nodes based on the decision tree model conception needs to be taken into account (Sun and Hu 2017). Among different types of machine learning, such as linear regressions (LRs), SVR, ridge regression, lasso regression, RF, ANNs and k-nearest neighbor (KNN), gradient boosted trees (GBTs) have the best performance in terms of correlation coefficient (R2), cross-validation (CV) and mean square error (MSE) values (Watson et al. 2019). A particle swarm optimization (PSO)-based feature selection combination was suggested by Niyonsaba and Jang (2015); the results showed that RF with 50 PSO particles achieved the best accuracy, 99.8%, compared to other models.

In a typical regression tree (RT), every leaf holds a fixed value, normally the mean value of the target attribute; a model tree can be seen as an extension of the ordinary RT. In general, feature selection reduces the number of features while keeping the same or even better learning performance (Rao et al. 2019).

3.5 Artificial Neural Network

The ANN is a black-box computational approach consisting of interconnected nodes arranged in network structures (Akrami et al. 2013). ANNs possess the capability and efficiency to deal with nonlinear cases and have eventually become the most widely implemented approach in recent years (Farzad and El-Shafie 2017). The structure consists of a single input layer, as many hidden layers as needed and an output layer. The ANN is widely recognized for its ability to forecast nonlinear variables (Kumar et al. 2020); the idea mimics human brain neurons transferring data to the following connected nodes. To form nonlinear relationships between the precursors of ozone concentration, Gavrila (2017) suggested a multilayer neural network model with error backpropagation, which showed the ability to forecast ozone on a short-term basis. To build a robust algorithm using ANN approaches, researchers face several shortcomings: finding the best input combinations, a proper transfer function and continuous time series data without missing values (AlOmar et al. 2020).

Using ANN and BNN models, Solaiman et al. (2008) investigated three emergent data-driven approaches to address the complexity of the nonlinear relationships between meteorological variables and ozone. The three dynamic neural network methods have different structures: Bayesian neural network models, recurrent neural networks and time-lagged feed-forward networks. The outcomes illustrate that all three models are suitable prediction tools that outperform the regularly utilized MLP and can be applied for short-period O3 concentration prediction; nevertheless, their inability to identify the underlying physical processes is the primary limitation when applied to pollutant prediction.

To predict pollutant concentrations of PM10, PM2.5, O3 and NO2, Agarwal et al. (2020) constructed an ANN model that achieved coefficients of determination (R2) of 0.88, 0.86, 0.87 and 0.79, respectively. However, over 4 subsequent days, prediction accuracy dropped to a low value, with an R2 for O3 of 0.48. Hooyberghs et al. (2005) describe the development of a neural network tool to forecast daily average PM10 concentrations with good accuracy, up to 0.8 for R2. The wavelet adaptive neuro-fuzzy inference system (ANFIS) model proposed by Bhardwaj and Pruthi (2020), an integrated intelligent scheme incorporating the learning potency of ANNs, is less resource-intensive and more effective than existing models in predicting nitrogen dioxide (Esfandani and Nematzadeh 2016).

In another wide-area study, implemented by Goulier et al. (2020), the concentrations of ten air pollutants in a street canyon in Münster were predicted from three predictor categories (sound, time, traffic) using an artificial neural network. The results showed high accuracy for NO2 and O3, reflecting the reliability of the models' forecasting ability when integrated with all three input variable groups. With the aid of random grid search for hyperparameters, adopted as an optimizer in an ANN model predicting O3, Liu et al. (2019) achieved high prediction accuracy, with an R2 of 0.9896. A GA-integrated artificial neural network also performed very well compared to other models in predicting air overpressure (Jahed Armaghani et al. 2018).

3.5.1 Feed-Forward Neural Networks

The most parsimonious model suggested by Offenberg et al. (2017), a feed-forward neural network selected by AICC, uses 11 inputs, a single hidden layer of 4 tanh activation function nodes and a single linear output function. An artificial neural network algorithm based on a feed-forward backpropagation network was developed by Hafeez et al. (2020) to model and predict ozone production. Interestingly, this study indicated that the proposed ANN model achieved an R2 of 0.9965.
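The parsimonious 11-4-1 architecture described above can be sketched with scikit-learn's MLPRegressor, whose output is linear for regression; the synthetic data and training settings below are illustrative assumptions, not a reproduction of Offenberg et al. (2017).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 11))                    # 11 input variables
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=400)

# One hidden layer of 4 tanh nodes and a linear output node
net = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(4,), activation="tanh",
                 max_iter=5000, random_state=4),
)
net.fit(X, y)
print("training R2:", round(net.score(X, y), 3))
```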

3.5.2 Recurrent or Feedback Neural Networks

The recurrent neural network (RNN) works on the principle of keeping the outputs of a layer and feeding them back into the input to help forecast the layer's outcome (Chung et al. 2015). The first layer is shaped like a feed-forward neural network, computing the sum of the products of the features and the weights. As the RNN progresses from one time step to the next, each neuron retains part of the knowledge obtained from the previous time step while the new computation is performed (Jin et al. 2017). The RNN can be viewed as an MLP network upgraded with feedback loops in its structure (Mo et al. 2020), which implies that every neuron acts as a memory cell while performing computations.
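The recurrence described above amounts to a single update rule: the new hidden state combines the current input with the state carried over from the previous time step. The NumPy sketch below illustrates one Elman-style RNN step with hypothetical dimensions (e.g. three meteorological inputs over a 24-step hourly sequence); the weights are random and untrained.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One Elman RNN step: the new hidden state mixes the current input
    with the memory carried over from the previous time step."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(5)
n_in, n_hid = 3, 8                       # e.g. 3 meteorological inputs
W_xh = rng.normal(scale=0.3, size=(n_in, n_hid))
W_hh = rng.normal(scale=0.3, size=(n_hid, n_hid))
b_h = np.zeros(n_hid)

h = np.zeros(n_hid)                      # initial memory
for x_t in rng.normal(size=(24, n_in)):  # a 24-step (hourly) sequence
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
print(h.shape)                           # final hidden state
```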

An ANN with backpropagation (BP), using a sigmoid activation function and a middle layer, and its hybrid with a genetic algorithm (BP-GA) have been utilized to enhance the proposed method's performance (Esfandani and Nematzadeh 2016). To forecast ozone using only meteorological parameters as input, Biancofiore et al. (2015) implemented a neural structure technique that proved much more efficient than the multiple linear regression algorithm. For long-period ozone prediction, the complementary ensemble empirical mode decomposition with cycle reservoir with jumps and multiple linear regression (CEEMD + CRJ + MLR) shows outstanding performance, with an R2 of 0.9763 (Mo et al. 2020).

3.5.3 Multilayer Perceptron

MLP networks are a combination of interconnected nodes, or neurons, organized into input, output and hidden layers. The number of nodes in the input layer is based on the number of input variables, while the output layer contains a single node resembling the target variable (Cabaneros et al. 2017). In the context of ozone concentration prediction, Capilla (2016) studied two efficient methods: multiple linear regression and neural network models. The comparison between multiple linear regression and the multilayer perceptron was based on RMSE, mean absolute percentage error (MAPE) and R2 for short-term ozone prediction; the results showed that the MLP achieved higher accuracy than MLR. The MLP also performs better than SWAT in predicting sediment yield, based on the determination coefficient R2 (Singh et al. 2012). For ozone prediction, the MLP shows better evaluation results, with a 9% improvement in correlation coefficient compared to RBF (Kovač-Andrić et al. 2016). After comparing types of feature selection, for instance PCA, stepwise regression and classification and regression trees (CART) within the MLP, MLP-PCA showed slightly better performance in predicting NO2 (Cabaneros et al. 2017). Messikh et al. (2017) employed the MLP in modelling phenol removal, where, interestingly, the model reached an R2 of 0.99.

In a research study carried out by Chattopadhyay and Bandyopadhyay (2007) on ozone prediction in Switzerland using artificial neural networks with backpropagation, models were built as one-hidden-layer and two-hidden-layer perceptrons with sigmoid activation functions and a learning rate of 0.9 at peak ozone concentration. The results showed that both techniques are very promising, while the two-hidden-layer perceptron performed better in predicting mean monthly total ozone concentrations. However, performance varies between models, implying that increasing nonlinearity in the model contributes much to ozone prediction and that the dataset has varying degrees of complexity.

In line with this attempt, a comparative study of two ML models, nonlinear regression and the MLP, was performed for the prediction of tropospheric O3 (Chattopadhyay et al. 2019). These models utilized PM10, SO2, NOx and temperature as independent variables to predict the dependent variable O3. The study showed that the MLP with a tan-hyperbolic activation function achieved a markedly higher correlation between the predicted and actual datasets. The results highlighted that the MLP with gradient descent has similar efficiency in predicting ozone concentration in the tropospheric layer. However, the coefficients of determination (R2) of 0.435, 0.343 and 0.272 are low for the MLP1, MLP2 and regression models, respectively.

Using an artificial neural network based on principal component analysis, Chattopadhyay and Chattopadhyay (2012) built a model to predict monthly total O3 concentration. The model was constructed by extracting predictor variables via PCA and feeding them to the ANN. Rainfall and cloud cover were found to be good predictors of monthly total ozone. Furthermore, the ANN shows better potential for predicting monthly total ozone during the pre-monsoon and winter seasons than during and after the monsoon. Nevertheless, the number of data points needs to be increased, and more specific research, such as prediction from daily and hourly datasets, is required. Similarly, Chattopadhyay et al. (2012) applied PCA to the data to remove multicollinearity before training an ANN model. This attempt showed that the proposed ANN model generates very good predictions; however, the predictions for two zones do not reach a significant degree of accuracy, with WI values of 0.301 and 0.254 (Chattopadhyay et al. 2012).
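The PCA-then-ANN workflow used in these studies can be sketched as a simple pipeline: standardize, project onto principal components to remove multicollinearity, then train an ANN on the component scores. The scikit-learn sketch below uses synthetic data and illustrative settings, not the data or configuration of the cited studies.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 9))            # correlated predictors would go here
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=500)

# Standardize, keep components explaining 95% of the variance, then feed
# the decorrelated scores to an ANN, mirroring the PCA-then-ANN workflow
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=6),
)
model.fit(X, y)
print("components kept:", model.named_steps["pca"].n_components_)
```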

3.6 Hybrid Model

Hybrid models combine various individual forecast algorithms, an approach that tackles many shortcomings of predictive algorithms at the cost of a more complicated final solution. The primary target is to select and fairly integrate a series of predictive models in a way that enhances final forecast accuracy (Rozinajová et al. 2018). The accuracy gain is obtained by combining the strengths of individual forecast models while reducing their defects. The essential targets in constructing hybrid models are to enhance their robustness and to increase accuracy and generalization ability (Wu and Shahidehpour 2014). Recently, hybrid models have been widely utilized in air pollution prediction. Ensembles of decision trees are best known for achieving high-quality forecasts in non-parametric regression cases (Linero 2018). Combining RF algorithms with accurately controlled advanced multipollutant sensor packages, such as real-time affordable multipollutant (RAMP) monitors, represents an auspicious technique for tackling the poor performance of low-cost air quality sensors (Zimmerman et al. 2018).

A single ANN model and a single econometric model presented lower prediction accuracy than the hybrid model proposed by Kim and Won (2018). The overall prediction ability of the hybrid ARIMA-SVM model is likewise improved over that of the SVM model alone (Nie et al. 2012). A highly efficient hybrid model was suggested by Kai et al. (2017), which combined various individual models to tackle their limitations; the outputs show that this hybrid model outweighs the baseline individual models. Ismail et al. (2011) showed that SOM-LSSVM performs better than a single LSSVM. Owing to this, the integration of various models can outperform a single model (Zhang et al. 2018a). Yuan et al. (2016) combined autoregressive integrated moving average (ARIMA) and GM models using specific weights to predict energy consumption and found that the prediction performance of this model is higher than that of the single ARIMA and GM models.
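A common way to build such an ARIMA-ML hybrid is to let ARIMA capture the linear structure while a nonlinear learner models the residuals; the sketch below pairs statsmodels' ARIMA with an SVR on lagged residuals. It is a generic illustration under simplifying assumptions (synthetic series, arbitrary order and lag count), not the specific formulation of Nie et al. (2012) or Kim and Won (2018).

```python
# pip install statsmodels scikit-learn
import numpy as np
from sklearn.svm import SVR
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)
t = np.arange(500)
y = 10 + 0.9 * np.sin(2 * np.pi * t / 24) + 0.3 * rng.normal(size=500)

# Stage 1: ARIMA captures the linear/seasonal structure
linear = ARIMA(y, order=(2, 0, 1)).fit()
resid = linear.resid

# Stage 2: SVR models the nonlinear remainder from lagged residuals
lags = 3
X = np.column_stack([resid[i:len(resid) - lags + i] for i in range(lags)])
target = resid[lags:]
svr = SVR(kernel="rbf", C=10.0).fit(X, target)

# Hybrid in-sample fit = linear fit + nonlinear residual correction
hybrid = linear.fittedvalues[lags:] + svr.predict(X)
print("hybrid RMSE:", np.sqrt(np.mean((y[lags:] - hybrid) ** 2)).round(3))
```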

Rubal and Kumar (2018) proposed combining advanced differential evolution techniques with RF algorithms rather than relying on existing stand-alone approaches. Their study showed that the proposed approach outweighs single models built with an independent classifier, a multilabel classifier or a Bayesian network algorithm. Similarly, in forecasting carbon monoxide, the study of Masih (2018) showed that SVM and ANN perform poorly compared to ensemble classifiers such as RF and bagging, implying that hybrid models are more robust.

Rahmati et al. (2020) made the first attempt to distinguish the source areas of dust with the aid of hybrid ML models, intelligent systems consisting of ANFIS combined with meta-heuristic optimization models: the cultural algorithm (CA), differential evolution (DE) and the bat algorithm (BA). The outcome is that the hybrid ANFIS-DE model deserves further investigation as a cost-effective, auspicious approach for effectively locating dust source areas.

Recently, a novel clustering-based ensemble model (CEeSNN) for air pollution prediction based on evolving spiking neural networks (eSNN), suggested by Maciąg et al. (2019), performed better than other models such as the singleton NeuCube model, an MLP network and the ARIMA model. However, based on the RMSE, MAE and MAPE evaluation indexes, Box-Jenkins ARIMA approaches outweigh the neural networks (Elman recurrent neural network, backpropagation neural network and radial basis function neural network). Based on combined experimental methods (Xi et al. 2015), the outcomes show that hybrid models are better than single models.

4 Conclusion

Air pollution is critical and important because of its environmental and social impacts. The majority of the papers reviewed were carried out in China, the USA, India, Malaysia and Australia, which means ML algorithms are becoming an increasingly common tool in environmental health. In this paper, a review of concentration prediction for different air pollutants, especially ozone, has been conducted. The essential target is to investigate recent approaches to predicting ozone concentration using ML algorithms. The review covers four areas of interest based on ML: ANN, SVM, decision tree and hybrid models. It concluded that:

  • A variety of theoretic approaches has been mentioned and discussed in terms of their methodology and effectiveness: the information theoretic, fuzzy set theoretic and probabilistic set theoretic approaches. The evaluated performance of these approaches varies with the procedure and theory applied, as well as with the complexity of the datasets utilized and their duration profiles.

  • The SVM is an ML approach for pattern recognition whose performance depends on the determination of its parameters (Beltrami and da Silva 2020). Generally, based on previous studies, it is clear that the SVM is highly influenced by the choice of kernel function and the selection of the regularization parameters (αs) (Liu et al. 2017b). The SVM, based on Tanaskuli et al. (2020), was preferred for predicting ozone concentration because of the high performance and accuracy achieved.

  • The decision tree algorithm is an ML model for which it is hard to justify which variant should be applied; identifying the best decision tree in advance is often impossible. Generally, heterogeneous node representation is needed throughout the tree for various problems. Owing to this, Czajkowski and Kretowski (2016) proposed an extension of the GMT and GRT systems called the mixed global model tree (mGMT), a specialized evolutionary algorithm (EA), to better understand the process underlying representation selection. This state-of-the-art technique can test models in the leaves or internal nodes and search for an optimal tree structure.

  • The ANN algorithm progresses in a cumulative process, transferring from neuron to neuron across the three categories of layers: input layer, hidden layers and output layer. To build a robust algorithm using ANN approaches, researchers are exposed to several shortcomings: finding the best input combinations, a proper transfer function and continuous time series data without missing values. Based on the reviewed papers, the MLP has been widely used in various fields and in air pollutant concentration prediction because of the high accuracy achieved.

  • Hybrid models combine various individual forecast algorithms, an approach that tackles many shortcomings of predictive algorithms at the cost of a more complicated final solution. The essential targets in constructing hybrid models are to enhance their robustness and to increase accuracy and generalization ability. Based on the papers reviewed, hybrid models clearly outweigh single models in accuracy, computation time and other factors, which is why they are auspicious algorithms in data mining.

ML techniques such as DT, SVM, ANN and hybrid models play a crucial role in all artificial intelligence applications. This paper summarizes the accomplishments in this area, which continue to grow with advances in ML algorithms for ozone concentration prediction. Given this development, several fruitful algorithms can be anticipated in the future.