Abstract

Traffic prediction is critical to the development of smart cities and countries because it improves urban planning and traffic management. The task is challenging because traffic is multifactorial and random in nature. This study presents a method based on ensemble learning that predicts urban traffic congestion from weather criteria. We use the neural architecture search (NAS) algorithm, which applies heuristic methods to produce a model suited to the input data. The dataset contains 400 samples, each describing a day’s weather through six features (absolute humidity, dew point, visibility, wind speed, cloud height, and temperature), with urban traffic congestion as the target in the final column. We combined the resulting model with linear regression and compared the results with other regression models; the proposed method was more accurate than the alternatives. It achieved an MSE of 0.00002, whereas the SVR, random forest, and MLP methods achieved 0.01033, 0.00003, and 0.0011, respectively. In terms of MAE, the proposed method reached 0.0039, while the other methods obtained 0.0850, 0.0045, and 0.027, respectively. The proposed model therefore has a smaller error than the other methods.

1. Introduction

Predicting and reducing traffic congestion is a critical priority for cities around the world. Recently, research projects have paid much attention to improving forecasting methods [1]. Traffic forecasting is critical to the development of smart cities and countries because it improves urban planning and traffic management [2]. It is an interdisciplinary research field that draws on mathematics, computer science, and engineering. The prediction is challenging because traffic is multifactorial and random in nature. This complexity arises from the constraints on the physical traffic infrastructure, including road network capacity, traffic regulations, driver behavior, days of the week, weather, and accidents [3]. Earlier studies of urban traffic were based on the location of each person within a metropolitan area. Today, the availability of traffic data has greatly expanded the development of predictive models. Many studies have examined driver behavior, travel mode, road conditions, and the importance of weather data. For example, in a study [4], rainfall intensity was associated with a 4–9% reduction in traffic. Traffic congestion has also been found to have a significant relationship with temperature. However, studies in this area ignore important environmental data sources and therefore cannot evaluate the traffic network accurately [5]. The effects of weather conditions on traffic are undeniable, yet little research has been done to quantify the impact of rain and snow. In addition, existing studies have not distinguished between urban and rural freeways. It can therefore be said that including nontraffic input data sets has a significant effect on predicting ground traffic parameters. Studies that do so have provided accurate predictions [6, 7].

In recent years, methods based on deep learning have received much attention. Deep learning (DL) has advanced fields such as speech processing and natural language processing (NLP). The deep learning approach is also efficient for short-term traffic forecasting [7, 8]. Interestingly, social media data help DL approaches predict traffic. Social media such as Twitter and Instagram are increasingly used for communication, news reports, and promotional events, and they have to handle high-speed data and timely dissemination. Twitter data, for example, owing to the large number of users, have been used for various data mining purposes such as the stock exchange [9] and traffic forecasting [10, 11]. Climate variables are an influential factor in road traffic. The climate variables (the x variables; the y variables are the traffic congestion variables) include absolute humidity, dew point, visibility, wind speed, cloud height, and temperature. Adverse values of these variables lead to reduced visibility, reduced friction, and road blockage, and thus to more accidents. In the last ten years, 22% of road accidents have been due to weather conditions, especially rainfall, which shows the importance of these variables [12]. It should be borne in mind that rain is only one of the parameters of difficult weather conditions. We review studies on traffic forecasting that consider weather conditions and use deep learning.

In this paper, we present a method based on ensemble learning that predicts urban traffic congestion from weather criteria. The first component is a deep learning model: we use the neural architecture search algorithm, which applies heuristic methods to create a model suited to the input data. In addition to the deep learning model, we use a linear regression model. We combine the two models with a weighted average, and the final prediction is the output of the ensemble model. The results are presented in the following sections.

2. Literature Review

An LSTM model was investigated in the study [13], which considered temperature and rainfall in addition to traffic characteristics. The results of this study show a more accurate prediction. Despite all the advances in this field, even the most accurate prediction models are not robust in unusual situations such as accidents, crashes, and sporting events [7]. Recently, some studies have used social media data to help predict traffic around such events. For example, the study [14] presents a linear regression-based method for predicting road traffic. This model uses the California Performance Measurement System (PeMS) in the United States. A similar model [15] uses social media data to predict short-term traffic. It uses Twitter data to predict incoming traffic congestion before the start of sports games. This method has been evaluated using four models: ARIMA, neural network, support vector regression, and k-nearest neighbor (k-NN). Short-term traffic forecasting techniques are mainly classified into two groups: parametric and nonparametric approaches.

Parametric models assume a fixed functional form for the input data and summarize the data through the parameters of that function. For example, the autoregressive integrated moving average (ARIMA) model is one of the parametric prediction models [16]. Nonparametric models, by contrast, are trained directly on the data; that is, they select a function that fits the data set. The k-nearest neighbor (k-NN) method is one of the simple nonparametric methods [15]. Many studies have examined it as a method of traffic forecasting [17, 18]. Regarding forecasting based on social media data, a study [18] shows that including Twitter data in addition to rainfall and weather data reduced the MAE from 8 to 5.5 and produced a more accurate forecasting model.

In general, many techniques and models have been proposed for time series prediction, such as ARIMA [1], SVM [3], and deep nonlinear algorithms [19]. Neural networks such as RNNs [5], LSTMs [13], Kernel Extreme Learning [20], and CNNs [13] are used for financial [9, 21] and traffic forecasts [14, 15]. Previous studies have mainly examined three deep learning methods: convolutional neural networks (CNNs) [2], deep belief networks (DBNs) [22], and SAEs [23]. Recent research has produced new algorithms, which provide an opportunity for further investigation. In recent years, studies have shown that rainy or stormy weather significantly affects drivers’ behavior, travel demand, safety, and so on [24, 25]. Mashros et al. [26] found that the driving speed decreases by about 14% when it rains. Another study [27] used regression analysis and found that topical rain reduces operating speed by 1.1 mph and that storm conditions reduce it by 4 to 8 mph. Another report shows that adverse weather conditions can affect traffic. Many other studies show that rainfall intensity reduces traffic speed [28, 29]. Identifying the impact of weather on peak and off-peak traffic times provides policymakers with invaluable information. A summary of the results of studies that have measured weather conditions and rainfall is given in Table 1. Almost all studies have stated that light rain has little effect on traffic, i.e., rain intensity is an essential parameter in traffic forecasting. Temperature also has little impact on traffic [30]. Research by Smith et al. [31] has shown the importance of rainfall intensity values at moderate speeds. Like most other articles, they categorized the rain as none, light, and heavy. They used the Scheffé method for statistical comparisons. However, a study by Neter [32] explained that this method might be unsuitable for climatic conditions because it assumes equal variances.

Deep learning models have demonstrated their ability to predict traffic. In recent years, new traffic forecasting models based on deep learning and on combinations of existing methods have been proposed. For example, the bidirectional LSTM-based model in study [36] and the deep LSTM in study [37] are designed to predict traffic. Sequence-to-sequence models are also used to predict traffic sequences [38, 39]. Some other studies describe multistream deep learning models [40, 41] as more accurate prediction methods because these models use auxiliary data such as accidents [37] and weather conditions [42]. Some prediction models [43] use spatial feature extraction to obtain spatial relationships in traffic networks, but the images become noisy. Recent studies [44, 45] attempt to use a three-dimensional convolutional network to extract data. Lin et al. [46] proposed geographical and temporal criteria to extract features from traffic data, followed by the random forest method to rank the relevance of factors. Additionally, generative adversarial networks are used to produce new event examples. Five types of tests are used to check whether the suggested framework can handle the limited sample size, unbalanced samples, and timeliness issues of incident detection systems. In other research, a stacked autoencoder is used to identify temporal and geographical correlations of traffic flow and to detect events. Similarly, the sample selection approach increased the real-time capability of detection [47]. The studies that used deep learning are summarized in Table 2. There are also optimization methods based on metaheuristic algorithms such as Harris hawk’s optimization [48], multiswarm whale [49], moth-flame optimizer [50–52], grey wolf [53, 54], fruit fly [55, 56], bacterial foraging optimization [57], boosted binary Harris hawk’s optimizer [58], ant colony [59, 60], biogeography-based whale optimization [61], and grasshopper optimizer [62]. Moreover, optimization methods use machine learning in biological studies [63–66].

3. Research Methodology

3.1. AutoML

Automated machine learning (AutoML) is an automated way to find the best data preparation, models, and hyperparameters for a predictive modeling problem. One of the advantages of AutoML is that it can work with small amounts of data and tune the model for better accuracy. The purpose of this method is also to help people with machine learning knowledge build optimal models.

3.2. Neural Architecture Search

NAS is one of the AutoML methods, in which we try to choose the architecture of the neural network so that it is appropriate to our data and increases accuracy. In these methods, the work is grouped into the following components:
(i) Search space, which includes the types of neural networks we can build
(ii) Search strategy, which includes achieving the goal, defining the objective function, and the method of searching
(iii) Performance estimation strategy, which describes how the performance of the neural network models is estimated after they are defined and obtained
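
The three components above can be illustrated with a minimal sketch. It uses plain random search as the search strategy, which is an assumption made for brevity and not necessarily the strategy used in this work. The option values in `SEARCH_SPACE` and the function `toy_estimator` are hypothetical placeholders; a real performance estimator would train each candidate briefly and return its validation error.

```python
import random

# (i) Search space: the kinds of networks we are allowed to build.
# The option values are illustrative assumptions, not the paper's settings.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "units": [8, 16, 32],
    "activation": ["relu", "tanh"],
    "use_normalization": [True, False],
}


def sample_architecture(rng):
    """(ii) Search strategy: here, plain random search draws one candidate."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def search(estimate_performance, num_candidates=30, seed=0):
    """Return the candidate with the lowest estimated error."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("inf")
    for _ in range(num_candidates):
        arch = sample_architecture(rng)
        score = estimate_performance(arch)  # (iii) performance estimation
        if score < best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


def toy_estimator(arch):
    """Hypothetical placeholder score (lower is better); stands in for
    'train briefly, then measure validation error'."""
    penalty = 0.0 if arch["use_normalization"] else 0.1
    return abs(arch["num_layers"] - 2) + abs(arch["units"] - 16) / 16 + penalty


best_arch, best_score = search(toy_estimator)
print(best_arch, best_score)
```

Passing the estimator as a function keeps the search loop independent of how candidates are trained, so the placeholder can be swapped for a real training routine without changing the loop.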

This method has been applied to different problems in different fields. Wang et al., for example, used the NAS method to optimize the deep U-Net network for segmenting medical images and other models in this field. Figure 1 shows the NAS recurrent controller. For each layer, the controller predicts the number of filters, filter height, filter width, stride width, and stride height in sequence; each prediction is made by a softmax and is also given as input to the next step.

This method has not been used much for urban traffic detection or for regression data. Therefore, in this work, we used one of the NAS methods for our deep learning model and optimized this model on our available data to obtain higher accuracy than other deep learning models.

3.3. Ensemble Learning

According to articles and experimental evidence, some linear regression methods are more accurate in regression problems than deep learning methods. Also, with a small amount of data, these methods fit the problem data easily and have few drawbacks. Therefore, we used ensemble learning for this problem, with two experts predicting traffic congestion. In ensemble learning methods, several expert learning algorithms are combined to increase prediction accuracy compared with the prediction of a single expert. Figure 2 shows a diagram of this class of methods.

In classification, the voting method is one of the standard techniques in this subject. However, in regression methods, the average (or the weighted average) approach can be used.
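
As a minimal sketch of the weighted-average approach for regression, the function below combines the predictions of several experts. The function name `weighted_average_ensemble` and the sample numbers are illustrative assumptions.

```python
import numpy as np


def weighted_average_ensemble(predictions, weights):
    """Combine per-expert predictions (one row per expert) into one output."""
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if predictions.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one weight per expert")
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return weights @ predictions


# Two experts predicting three samples (made-up numbers).
expert_a = [0.20, 0.40, 0.60]
expert_b = [0.30, 0.50, 0.90]
combined = weighted_average_ensemble([expert_a, expert_b], [0.5, 0.5])
print(combined)
```

With equal weights this reduces to the plain average; unequal weights let the more reliable expert dominate.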

3.4. The Suggested Method

In this method, we first seek the best deep learning architecture for our data. The NAS module is placed first, and the NAS algorithm selects the best architecture. In the next step, we train this architecture on the training data and build our deep learning model. The resulting model is shown in Figure 3. In the model analysis, we concluded that a complex model was not required because of the small amount of data. The best model returned by the NAS algorithm is a two-layer dense model. ReLU activation functions are used for each layer, and a normalization layer is included to increase accuracy and prevent overfitting.

3.5. Linear Regression Method as the Second Expert

Since linear regression methods have high accuracy in regression problems, the second expert in our ensemble learning module was the linear regression method. We fitted a linear regression model on the training data and used this model for testing. The general workflow for testing and predicting data is shown in Figure 4.
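
The second expert can be sketched with ordinary least squares. The example below uses NumPy's least-squares solver as a stand-in for a library linear regression model, and the data are synthetic; the helper names `fit_linear_regression` and `predict_linear` are hypothetical.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns (coef, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([X, np.ones(len(X))])  # last column = intercept
    solution, *_ = np.linalg.lstsq(design, y, rcond=None)
    return solution[:-1], solution[-1]


def predict_linear(X, coef, intercept):
    """Apply the fitted linear model to new samples."""
    return np.asarray(X, dtype=float) @ coef + intercept


# Synthetic stand-in for six weather features (not the paper's data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(350, 6))
true_coef = np.array([0.5, -0.2, 0.1, 0.0, 0.3, -0.4])
y_train = X_train @ true_coef + 1.5

coef, intercept = fit_linear_regression(X_train, y_train)
X_test = rng.normal(size=(50, 6))
y_pred = predict_linear(X_test, coef, intercept)
```

Because the synthetic target is noise-free, the fitted coefficients recover `true_coef` up to floating-point error; real data would not fit this exactly.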

In the ensemble module, we used the weighted average of the two components, deep learning and linear regression. We tested different weights between zero and one and arrived at weights of 0.7 and 0.3 (Algorithm 1).

Input: Labeled Data
Main Scope:
    Split_train_test ()
    NAS_algorithm ()
    models_count = 0
    While (models_count < 30):
        Best_model = NAS_algorithm ()
        models_count = models_count + 1
    Train_LinearRegression ()
    Make Ensemble Module (deep architecture + linear regression)
    predict (test_set)
Output: Prediction for Test Data
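
The weight selection for the ensemble module can be sketched as a simple grid search over weights between zero and one. This is a sketch under assumptions: the step size, the use of MSE on held-out data as the selection criterion, and the function name `choose_ensemble_weight` are not taken from the paper, and the sample predictions are made up.

```python
import numpy as np


def choose_ensemble_weight(pred_deep, pred_linear, y_true, step=0.1):
    """Try w in {0, step, ..., 1} and keep the one with the lowest MSE.

    The ensemble prediction is w * pred_deep + (1 - w) * pred_linear.
    """
    pred_deep = np.asarray(pred_deep, dtype=float)
    pred_linear = np.asarray(pred_linear, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    candidates = np.round(np.arange(0.0, 1.0 + step / 2, step), 10)
    best_w, best_mse = None, np.inf
    for w in candidates:
        blended = w * pred_deep + (1.0 - w) * pred_linear
        mse = np.mean((blended - y_true) ** 2)
        if mse < best_mse:
            best_w, best_mse = float(w), float(mse)
    return best_w, best_mse


# Made-up held-out targets and two biased experts.
y_val = np.array([0.1, 0.4, 0.7, 1.0])
pred_deep = y_val + 0.2    # over-predicts
pred_linear = y_val - 0.3  # under-predicts
best_w, best_mse = choose_ensemble_weight(pred_deep, pred_linear, y_val)
print(best_w, best_mse)
```

Selecting the weight on held-out data rather than on the test set avoids tuning the ensemble to the samples used for the final evaluation.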

4. Results and Simulations

4.1. Data Collection

Estimating urban traffic is one of the most critical issues in helping municipalities control traffic in large cities. For this reason, it has been studied in many articles, and attempts have been made to solve it with machine learning methods. For this problem, urban traffic flow data described by six variables are available: absolute humidity, dew point, visibility, wind speed, cloud height, and temperature. The last column gives the urban traffic congestion per day under the conditions described by these variables. This work presents a method based on deep learning and regression that estimates urban traffic congestion from the six variables. We designed a deep learning model for the urban traffic control system; the model predicts traffic congestion and can thereby help prevent heavy traffic. We used Python to build the models and final estimates, with libraries for deep networks and regression models including Keras, NumPy, sklearn, and scipy.

4.2. Descriptive Statistics of Data

Traffic flow is recorded by sensors every thirty seconds, and these data are collected over 60 minutes. Our database was collected over thirty days and covers different types of variables. The traffic flow pattern is expressed in terms of these variables and their changes. The variables comprise the available meteorological information: absolute humidity, dew point, visibility, wind speed, cloud height, and temperature. Detailed statistics, including the mean and standard deviation of each variable and of the traffic flow target, are given in Table 3. Traffic flow is expressed as the number of vehicles per 10 seconds (see Table 3).
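
Descriptive statistics of this kind can be computed column by column. The sketch below uses synthetic data; the column names in `FEATURE_NAMES` are hypothetical identifiers for the six weather variables, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

# Hypothetical identifiers for the six weather variables.
FEATURE_NAMES = [
    "absolute_humidity", "dew_point", "visibility",
    "wind_speed", "cloud_height", "temperature",
]


def describe(data, names):
    """Return {name: (mean, sample standard deviation)} for each column."""
    data = np.asarray(data, dtype=float)
    means = data.mean(axis=0)
    stds = data.std(axis=0, ddof=1)  # ddof=1 gives the sample estimate
    return {n: (float(m), float(s)) for n, m, s in zip(names, means, stds)}


# Synthetic stand-in with 400 rows and six columns (not the paper's data).
rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=(400, 6))
summary = describe(data, FEATURE_NAMES)
for name, (mean, std) in summary.items():
    print(f"{name}: mean={mean:.3f}, std={std:.3f}")
```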

4.3. The Implementation of the Proposed Method

In this paper, we assumed that the climatic characteristics are closely related to the traffic flow. We therefore presented a model based on neural networks and regression, which was explained in detail in the previous section. This section shows the architecture obtained by the method and compares our method with well-known regression methods. Figure 5 shows the diagram of meteorological variables.

We used Python 3.6 to implement this project. We used Google Colaboratory for access to a suitable GPU and hardware and wrote our code in a Jupyter Notebook. Because the 400 available samples are few for deep learning, we used only 50 samples for testing. In the next step, we implemented the NAS method. In this method, following our panel data, 30 different models are built from the available search space (number of layers, type of activation function, number of neurons, and presence or absence of a normalization layer), and each model is trained for 30 epochs. During this training, 20% of the 350 training samples are used as validation data to evaluate the models’ performance. The selected model reaches 0.0003 in terms of mean absolute error, which is better than the other methods. Figure 6 gives an overview of the architecture selected by the NAS method for our training data.
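
The split sizes described here work out as follows: 400 samples minus 50 test samples leaves 350 for training, and holding out 20% of those for validation gives 70 validation samples and 280 samples for fitting. The sketch below reproduces that arithmetic; the random shuffling, the fixed seed, and the function name `split_indices` are assumptions, since the text does not state how samples were assigned.

```python
import numpy as np


def split_indices(n_samples=400, n_test=50, val_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    test_idx = order[:n_test]
    train_val_idx = order[n_test:]
    n_val = int(round(val_fraction * len(train_val_idx)))
    val_idx = train_val_idx[:n_val]
    train_idx = train_val_idx[n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 280 70 50
```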

After obtaining the optimal architecture, we trained this architecture with 250 epochs on all training data and improved the model based on training and testing data (see Figure 7). We also plotted the corresponding loss and MSE, which can be seen in the figure. The training process is such that the model reduces the error in the MSE criterion with the progress of the epochs, which shows the correct training process in deep learning.

Also, the training curve becomes parallel to the horizontal axis, which indicates a good fit of the model; the model may overfit if training continues for more epochs. Finally, the loss curves for the training and test data coincide, indicating that the model performs similarly on both sets (see Figure 8).

After combining the deep learning model with the regression model (as explained in the previous section), we evaluated the resulting ensemble on the test data. The loss of our model on these data is almost zero. This can be seen in Figure 9, where all the predicted labels lie close to the main line on which the data fit. In this figure, the blue line is the line fitted to the predicted data; it is close to the 45-degree line, which shows that our targets are predicted with high accuracy. The results are also compared with the traffic flow trend. In Figure 10, the closer the predictions are to the red line, the higher the model’s accuracy, and our model lies almost on the red line. The figure shows that our model predicts almost all test samples with values close to their labels and that the predicted trend matches the trend of the actual labels.

The comparison with the traffic flow trend for the training data is shown in Figure 11. Again, the closer the predictions are to the red line, the more accurate the model is, and our model lies almost entirely on the red line for the training data. The figure shows that our model predicts almost all training samples with values close to their labels and that the predicted trend matches the trend of the actual labels. To compare and evaluate our method, we also evaluated the support vector regression (SVR) method. We trained this model on the same training data that we used for the NAS model.

Furthermore, we validated this model on the same test data that we used for our method. The loss of this model was higher than that of our model. In the figure, the values predicted by the model do not fit the desired line well. The red circles lie away from the 45° line, which should be the fitted line for the test data. The distance from this line to the circles is directly related to the absolute error, whose average was one of the evaluation criteria of this project. Figure 11 shows the diagram for the training data, that is, the model’s estimates on the training data. Here, the red line does not fit well on the blue line, which is the baseline that should be predicted.

This indicates that the SVR model has not been trained well on our training data. The results for the SVR model are also compared with the traffic flow trend. In this figure, the closer the predictions are to the red line, the higher the model’s accuracy. The SVR model gave poor estimates at more points than our model and does not lie close to the red line. In some regions, the predicted values are above or below the original values, so the trend line of the test data is not well aligned with the prediction trend line. Figure 12 shows the diagram for the training data, in which the blue lines do not match the orange lines; this shows the weakness of the SVR model in fitting the training data.

Table 4 compares the MSE obtained by our proposed method with that of the SVR method. Our method outperforms the SVR method on this criterion. The MSE criterion is calculated by the following equation:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

In this equation, the terms in parentheses are the predicted values $\hat{y}_i$ and the actual target values $y_i$ for the $n$ test samples. Our model reaches an MSE of 0.00002, a clear advantage over the SVR method, whose MSE is 0.01033. The random forest regression and MLP methods obtain MSE values of 0.00003 and 0.0011, respectively, so our method also outperforms them on this criterion.
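
The two evaluation criteria used in this work, MSE and MAE, can be computed as in the following sketch. The function names are chosen for illustration and the sample values are made up; they are not the paper's predictions.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between predictions and targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between predictions and targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))


# Made-up targets and predictions: errors are 0, 0.05, -0.05, 0.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.10, 0.25, 0.25, 0.40]
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(mse, mae)
```

Because MSE squares each error, it penalizes large deviations more strongly than MAE, which is why the two criteria can rank models differently.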

5. Conclusion

This study presented a method based on ensemble learning to predict urban traffic congestion from weather criteria. It includes a deep learning model built with the NAS algorithm, which applies heuristic methods to create a model suited to the input data. The dataset contained 400 samples describing the day’s weather through six features (absolute humidity, dew point, visibility, wind speed, cloud height, and temperature), with urban traffic congestion as the target in the final column. We trained the deep learning model obtained through the NAS method on the training data, which make up 87.5% of our data. In the end, the loss curves for the training and test data overlapped, which indicates that the network fits the data well. In addition to the deep learning model, we used a linear regression model, also trained on the training data, and combined the two models with a weighted average; the prediction output was based on the ensemble model. Our analysis was that, given the difficulties of neural networks in regression problems, combining the deep learning model with a linear regression model increases accuracy and reduces the MSE. Linear regression, one of the best and simplest regression models, was therefore used in this work together with the ensemble learning method.

Moreover, we compared the results obtained in the project with other regression models; the proposed method was more accurate than the alternatives. It achieved an MSE of 0.00002, whereas the SVR, random forest, and MLP methods achieved 0.01033, 0.00003, and 0.0011, respectively. In terms of MAE, the proposed method reached 0.0039, while the other methods obtained 0.0850, 0.0045, and 0.027, respectively. The proposed model therefore has smaller errors than the other methods and handles this problem well on these data. In future studies, the NAS method could be modified and optimized for convolutional architectures, for example by representing the data as images and using those images as input. Also, combining deep learning with other machine learning methods is a suitable approach, and several additional experts could be added to the ensemble to make predictions. Collecting more data is another direction for future work, because the data set in this project was small and deep learning models work better with more data.

Data Availability

The data are extracted from reference [70] for traffic flow prediction in the paper.

Disclosure

The funding sources had no involvement in the study design, collection, analysis, or interpretation of data, writing of the manuscript, or in the decision to submit the manuscript for publication.

Conflicts of Interest

The authors declare that they have no conflicts of interest.