1 Introduction

With the development of technology and industry, the environment and air quality have started to attract attention not only from the general public but also from researchers. Against the background of slow global economic growth in particular, governments and economists pay close attention to the annual GDP of large countries, which strongly affects the global economy. As the second largest economy, China has become a center of attention in many fields, especially the economy and environmental protection. Global organizations want China to contribute more to the development of the global economy and to environmental protection, for example by maintaining its GDP growth rate and reducing carbon dioxide emissions.

In fact, China has paid attention to enhancing regional cooperation. In 2001, the Chinese government and five other countries announced the creation of the Shanghai Cooperation Organization (SCO), and the organization was formally established in June 2002. India and Pakistan then joined the SCO as full members in June 2017. China, Russia, and India, which are among the countries with the largest shares of world GDP, are now all SCO members. Furthermore, China and Pakistan cooperate closely on economic and security matters. The GDP of these countries will directly affect the regional economy and even the trend of the world economy.

At the same time, CO2 emissions have become an indicator of the extent of development. However, environmental protection organizations and some developed countries have appealed for reductions in CO2 emissions to stop global warming, which causes many natural hazards and much environmental disruption. The Paris Agreement, an agreement within the United Nations Framework Convention on Climate Change (UNFCCC) dealing with the mitigation of greenhouse gas emissions, was adopted in Paris, France, and opened for signature in 2016. China, Russia, and India, as some of the largest CO2 emitters, also ratified or acceded to the agreement, taking responsibility for reducing CO2 emissions.

In analysing the impact of the environment on the global economy, attention has naturally focused on the larger countries because their environmental outcomes have a material impact on the global economy. A favorite target is China, with its huge population, large land mass, and decades of rapid economic growth that until recently paid insufficient attention to environmental protection. Kan’s (2009) remark that “China’s environmental problems, including outdoor and indoor air pollution, water shortages and pollution, desertification, and soil pollution, have become more pronounced and are subjecting Chinese residents to significant health risks” has been repeated ad infinitum by numerous studies on the country.

As serious as China’s impact on the global environment is, other large countries are also likely to produce significant environmental impacts. In this study, we focus on four large countries, all members of the China-initiated Shanghai Cooperation Organization (SCO), that count among the largest countries in the world. Besides China, SCO members include Russia, which at 17 million km2 (https://www.worldometers.info/geography/largest-countries-in-the-world/) is geographically almost twice China’s size (9.7 million km2), if less populous. Both China and Russia were founding members of the SCO. As populous as China is India, with a population of 1.4 billion set to overtake China’s shortly. India joined the SCO in 2017 at Russia’s invitation, together with Pakistan (population 212.2 million in 2018 according to the World Bank). In addition, Pakistan, as a close neighbor of China, influences the relationships among China, India, and Russia.

Several papers have examined the relationship between GDP and CO2 emissions. For example, Cederborg and Snöbohm (2016) examined the relationship between per capita GDP and per capita CO2 emissions in order to observe the possible influence of economic growth on environmental degradation, and concluded that growing per capita GDP leads to increasing carbon dioxide emissions. Ameyaw (2018) analyzed the relationship between gross domestic product (GDP) and CO2 emissions in five West African countries and applied a bidirectional long short-term memory (BiLSTM) sequential algorithm to predict CO2 emissions. Marjanović et al. (2016) proposed a neural network to forecast GDP based on CO2 emissions. These studies indicate that there is a positive relationship between GDP and CO2 emissions and that CO2 emissions can assist in predicting GDP. Under economic globalization, the trends of GDP and CO2 emissions affect both economic and regional development. Therefore, this study collected CO2 emission and GDP data from China, Russia, India, and Pakistan to build a novel model for forecasting annual GDP with high accuracy, which presents the future economic trend of the corresponding countries.

Although previous studies provided models that predict GDP with acceptable performance, these time series prediction models are conventional neural networks that suffer from a restricted prediction horizon, complex computation, and the parameter dependency problem. Moreover, few studies focus on SCO member countries, build models to predict their economic trends, and explain what role CO2 emissions play in annual GDP. To overcome these problems, this study proposes a novel model to forecast annual GDP. The study has two objectives. The first is to build a novel prediction model that forecasts annual GDP from CO2 and GDP data and that addresses the restricted prediction horizon, complex computation, and parameter dependency. The second is to analyze the trend of annual GDP based on the prediction results for four main members of the SCO. The future GDP of these four countries will not only affect each of the others but will also influence the development of the SCO.

In this study, we focus on four important members of the SCO, namely China, Russia, India, and Pakistan, which affect the global economy and the reduction of CO2 emissions and play a vital role in Asia. Data analysis and statistical tests are applied to the annual GDP and total CO2 emissions of each country to find the relationship between them. The proposed model uses the structure of the extreme learning machine (ELM), which removes the computation needed to search for suitable input weights in the neural network and is reported to make training thousands of times faster than that of conventional neural networks. In addition, an optimization method, Artificial Bee Colony (ABC), is applied to search for suitable ELM parameters automatically, which addresses the parameter dependency problem. Our proposed recurrent multi-step prediction strategy removes the restriction on the prediction horizon and is applied to predict annual GDP ten years into the future for the four target countries. Finally, based on the forecast annual GDP of the four countries, this study analyzes and discusses the future trend of annual GDP in China, Russia, India, and Pakistan, respectively. The contributions of this paper are as follows:

(i) Determine the relationship between annual GDP and total CO2 emissions for four different SCO countries; (ii) Build a novel model for predicting annual GDP based on CO2 emissions and historical GDP features; (iii) Implement an optimization algorithm to find the most suitable parameter in ELM, which solves the parameter dependency problem in ELM; (iv) Use the new recurrent multi-step prediction algorithm for multi-step forecasting, which overcomes the restriction on the prediction horizon in ELM; (v) Use the predicted future annual GDP to analyse economic trends.

This paper is organized as follows: Sect. 2 reviews various forecasting models for time series data. Section 3 introduces the extreme learning machine (ELM) model, the optimization method, and the recurrent multi-step prediction strategy. Section 4 describes the data and shows how the data sets are analyzed and processed. The experimental design and the results of the proposed ELM-ABC model are shown in Sect. 5. Finally, Sect. 6 concludes the paper.

2 Literature Review

For regression problems, statistical methods are generally applied to stationary or non-stationary prediction, such as the autoregressive moving average model (ARMA), the autoregressive integrated moving average model (ARIMA), and the automatic Holt’s linear exponential smoothing model. Many studies use statistical models or classical time series models to predict GDP. For example, Perić (2019) applied clean autoregression models to predict Croatia’s GDP. Li and Huang (2018) used a multiple seasonal ARIMA model in the EViews software to forecast national GDP, transforming the original GDP series into stationary data before building the prediction model. Similarly, Wang and Feng (2018) used a predicted cyclic sequence and a predicted trend sequence as the independent variables of a binary regression model of the GDP sequence based on an ARIMA-ARCH model, which also requires preprocessing of the data set. Such models need the data to be preprocessed when they are not stationary or normally distributed, which complicates their application to real-world data.

In the last decade, machine learning has attracted increasing attention from researchers. Classical machine learning algorithms have been applied in various fields, such as weather forecasting (Haupt et al., 2018), financial data prediction (Liu et al., 2019), wind speed prediction (Khosravi et al., 2018), and water quality prediction (Ahmed et al., 2019). These algorithms play a significant role in time series prediction and overcome drawbacks of classical statistical models. One such time series prediction method is Support Vector Regression (SVR), which has been applied to both stationary and non-stationary data. Ülker et al. (2019) applied SVR to forecast GDP and the employment rate and reported that SVR predicted GDP well in their experiments. Ehteram et al. (2021) proposed a novel hybrid SVM model for predicting CO2 emissions. Despite its forecasting ability, SVR has limitations in time series prediction, including a high computational cost in the training phase and strong parameter dependency.

In 2004, Huang et al. (2004) proposed the Extreme Learning Machine (ELM), a single-hidden-layer feed-forward neural network that belongs to the family of random projection methods because its input weights are selected randomly. Over the following years, a growing number of researchers contributed to the development of ELM and related algorithms. Its training is reported to be thousands of times faster than that of traditional models such as back-propagation (Huang et al., 2006), Support Vector Machine (SVM) in classification (Duoduo & Lei, 2019; Chy & Rahaman, 2019), and SVR in regression (Liu et al., 2018). Another well-known random projection model, the Echo State Network (ESN), was proposed by Jaeger (2001). However, it has more parameters to define than ELM. Several papers have shown that models in the ELM family perform well in time series prediction (Li et al., 2018; Liu et al., 2017; Wang et al., 2019), and Marjanović et al. (2016) showed that the GDP growth rate can be predicted by ELM based on CO2 emissions. Deep learning and multi-layer neural networks have also been employed for GDP prediction. For example, Liu et al. (2022) combined deep learning with a back-propagation neural network to analyze internet financial risk based on trends in the GDP growth rate. A deep learning model with quantum computing has also been used for GDP growth forecasting (Alaminos et al., 2021). Moreover, multi-layer neural networks have achieved good performance in time series prediction (Liu et al., 2021). Ngige (2020) proposed a multi-layer neural network that combined ARIMA with a feed-forward artificial neural network to forecast Kenya’s GDP. These multi-layer models performed well in GDP prediction. However, deep learning and multi-layer neural networks involve complex computation in the training process.

According to the above review, conventional neural networks, such as BP neural networks, SVM, and artificial neural networks, generally face complex computation when dealing with large-scale data. Deep learning and multi-layer neural networks take even more time than these conventional networks. In addition, because of their structure, these algorithms have a restricted prediction horizon: the width of the prediction is limited by the length of the training data. On the other hand, although ELM is reported to train thousands of times faster than conventional neural networks, it has its own limitations in the training phase. The number of hidden neurons needs to be defined by the user, and the algorithm also has a restricted prediction horizon. Therefore, we propose to apply Artificial Bee Colony to find the most suitable number of hidden neurons in ELM and to employ a recurrent multi-step prediction strategy to remove the restriction on the prediction horizon. The next section explains our methodology in detail.

3 Methodology

This section first introduces the Extreme Learning Machine (ELM). Secondly, we present an optimization algorithm called Artificial Bee Colony (ABC), which is used to overcome a drawback of ELM. Section 3.3 presents a novel prediction algorithm, the recurrent multi-step prediction algorithm, which is used for multi-step forecasting after the model has been trained.

3.1 Extreme Learning Machine

ELM, proposed by Huang et al. (2004), is a random projection algorithm. It greatly improves the learning speed of the Single Layer Feedforward Network (SLFN) and reaches a global optimum efficiently. The learning speed of ELM is reported to be thousands of times faster than that of the back-propagation algorithm (Werbos, 1974), with better generalization ability. It is also widely used in various fields (Huang et al., 2011, 2006, 2016; Sattar et al., 2019).

It is assumed that there are N arbitrary observations (X, T), where \(X = (x_{1}, x_{2}, \ldots , x_{N})\) and \(T = (t_{1}, t_{2}, \ldots , t_{N})\). The output function F(x) of a generalized SLFN with activation function \(G(\cdot )\) is as follows:

$$\begin{aligned} F(x) = \beta G(x) = \sum _{i=1}^{k}\beta _{i}G(w_{i}x+b_{i}) = R, \end{aligned}$$
(1)

where \(\beta \) denotes the output weights that connect the hidden layer to the output layer, L is the number of training samples, k is the number of hidden neurons in ELM, w denotes the input weights that connect the input layer to the hidden layer (one weight vector \(w_{i}\) per hidden neuron), \(b = (b_{1}, b_{2}, \ldots , b_{k})\) is the vector of hidden biases with dimension (\(1 \times k\)), and R is the output vector. Common activation functions (G) include the sigmoid, cosine, tangent, and multi-quadric functions. In this study, we employ the sigmoid function in ELM.

If we assume that the above neural network with activation function G fits the training targets with no error, equation 1 can be transformed into equation 2.

$$\begin{aligned} \sum _{i=1}^{k}\beta _{i}G(w_{i}x_{j}+b_{i}) = t_{j}, \quad j = 1, \ldots , L, \end{aligned}$$
(2)

This can be written as:

$$\begin{aligned} \begin{bmatrix} G(w_{1} x_{1}+b_{1}) & \cdots & G(w_{k} x_{1} + b_{k}) \\ \vdots & \ddots & \vdots \\ G(w_{1} x_{L}+ b_{1}) & \cdots & G(w_{k} x_{L} + b_{k}) \\ \end{bmatrix} \begin{bmatrix} \beta _{1}\\ \vdots \\ \beta _{k}\\ \end{bmatrix} = \begin{bmatrix} t_{1}\\ \vdots \\ t_{L}\\ \end{bmatrix} \end{aligned}$$
(3)

Equation 3 can then be shortened to:

$$\begin{aligned} H \beta = T \end{aligned}$$
(4)

where H represents the hidden-layer output matrix (the hidden matrix), which is defined in equation 5.

$$\begin{aligned} H = \begin{bmatrix} G(w_{1} x_{1}+b_{1}) & \cdots & G(w_{k} x_{1} + b_{k}) \\ \vdots & \ddots & \vdots \\ G(w_{1} x_{L}+ b_{1}) & \cdots & G(w_{k} x_{L} + b_{k}) \\ \end{bmatrix} \end{aligned}$$
(5)

In ELM, the input weights (w) and the hidden biases (b) are selected randomly. The output weights are then obtained as the least-squares solution of equation 4, which is given in equation 6.

$$\begin{aligned} \beta = H^{\dagger }T \end{aligned}$$
(6)

where \(H^{\dagger }\) is the generalized inverse of the hidden matrix H, for which the Moore-Penrose pseudoinverse is generally used.

The prediction value (\(\hat{y}\)) can be computed by the following equation:

$$\begin{aligned} \hat{y} = H \beta \end{aligned}$$
(7)
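Equations 4 to 7 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions and not the implementation used in this study: the function names `elm_fit` and `elm_predict`, the uniform range of the random weights, and the toy data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation G used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, k, rng):
    """Fit an ELM with k hidden neurons on inputs X (L x d) and targets T (L,)."""
    d = X.shape[1]
    # Assumption: weights and biases are drawn uniformly from [-1, 1].
    W = rng.uniform(-1.0, 1.0, size=(d, k))  # random input weights, never trained
    b = rng.uniform(-1.0, 1.0, size=k)       # random hidden biases
    H = sigmoid(X @ W + b)                   # hidden matrix H (equation 5)
    beta = np.linalg.pinv(H) @ T             # beta = H^dagger T (equation 6)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Prediction y_hat = H beta (equation 7), with H computed from X."""
    return sigmoid(X @ W + b) @ beta

# Toy data (made up for illustration): a noiseless linear target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
T = X[:, 0] + 2.0 * X[:, 1] - X[:, 2]

W, b, beta = elm_fit(X, T, k=20, rng=rng)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print("training MSE:", train_mse)
```

Only `beta` is learned, through a single pseudoinverse; this is why no iterative weight search is needed.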

3.2 Artificial Bee Colony

However, ELM still has drawbacks, such as the parameter dependency problem: the number of hidden neurons in ELM needs to be defined before training. Therefore, this study employs Artificial Bee Colony (ABC) to find the most suitable number of hidden neurons in ELM.

The ABC optimization algorithm consists of three groups of bees: employed bees, onlookers, and scouts. The number of employed bees in the colony equals the number of food sources. First, employed bees go to their food sources, come back to the hive, and dance to show where they found food. After that, an employed bee whose food source has been abandoned becomes a scout and starts looking for a new food source. Onlookers watch the dances of the employed bees and choose food sources depending on the dances.

Fig. 1 The flow chart of artificial bee colony

In computational terms, ABC first initializes a population within a given range and computes the cost of each member with the cost function; the solution is then updated starting from this initial population. After that, the acceleration coefficient \(\phi \) is defined by equation 8.

$$\begin{aligned} \phi =a R(.) \end{aligned}$$
(8)

where a is generally set to 1 and R(.) represents a number selected randomly from a uniform distribution, given the assumed population size. Based on the acceleration coefficient (\(\phi \)), the positions of the new population of recruited bees are selected. The best cost value and position among the employed bees are then obtained by comparing the initial best cost value with the cost values at the updated positions of the employed bees. Next, the onlooker bees are processed in the same way as the employed bees. In the scout phase, positions are selected directly at random from a uniform distribution and their cost values are calculated with the cost function. Finally, from the best cost values and corresponding positions of the onlooker and scout bees, the global best cost and position of the iteration are determined. Iterations continue until the maximum cycle number is reached. The flow chart is shown in Fig. 1.
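The three phases described above can be sketched in plain Python. This is a simplified version of the standard ABC scheme and may differ from the variant used in this study; the function name `abc_minimize`, the abandonment threshold `limit`, the onlooker weighting 1/(1 + cost), and the toy cost function (which pretends that 37 hidden neurons is optimal) are assumptions made for illustration.

```python
import random

def abc_minimize(cost, lower, upper, n_sources=10, limit=5, n_iter=50, seed=0):
    """Minimise a non-negative cost over the box [lower, upper]."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        return [rng.uniform(lower[d], upper[d]) for d in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources          # failed attempts per food source
    best = {"pos": None, "cost": float("inf")}

    def memorize():
        i = min(range(n_sources), key=costs.__getitem__)
        if costs[i] < best["cost"]:
            best["pos"], best["cost"] = list(sources[i]), costs[i]

    def try_neighbour(i):
        # New position x_i + phi * (x_i - x_k), with phi = a * R(.) and a = 1.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        phi = 1.0 * rng.uniform(-1.0, 1.0)
        cand = list(sources[i])
        cand[d] += phi * (cand[d] - sources[k][d])
        cand[d] = min(max(cand[d], lower[d]), upper[d])
        c = cost(cand)
        if c < costs[i]:              # greedy selection
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    memorize()
    for _ in range(n_iter):
        for i in range(n_sources):                      # employed bees
            try_neighbour(i)
        weights = [1.0 / (1.0 + c) for c in costs]      # cheaper = more attractive
        for _ in range(n_sources):                      # onlooker bees
            try_neighbour(rng.choices(range(n_sources), weights=weights)[0])
        memorize()
        for i in range(n_sources):                      # scout bees
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = cost(sources[i])
                trials[i] = 0
        memorize()
    return best["pos"], best["cost"]

# Hypothetical stand-in for the model error as a function of the hidden-neuron count.
def toy_cost(pos):
    return float((round(pos[0]) - 37) ** 2)

best_pos, best_cost = abc_minimize(toy_cost, lower=[1.0], upper=[200.0])
best_k = round(best_pos[0])
print("selected number of hidden neurons:", best_k)
```

In the setting of this study, `toy_cost` would be replaced by a function that trains an ELM with `round(pos[0])` hidden neurons and returns its validation error.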

3.3 Recurrent Multi-step Prediction Strategy

The recurrent multi-step prediction algorithm is a feedback strategy that, in the testing phase of the trained model, feeds the output of the previous step back into the test input of the current step. In the ELM algorithm, the number of target values determines the size of the prediction horizon, which restricts the generalization of prediction in real-world applications. To extend the prediction horizon, this study applies the recurrent multi-step prediction strategy to ELM. It extends the prediction horizon without changing the number of targets in the training phase.

In this study, the main aim is to predict future annual GDP based on CO2 emissions with D features and historical GDP with D-1 features. After training the ELM model, we plan to predict annual GDP ten years into the future. It is assumed that the testing data is \(B_{1}\) = (\(c_{1}, c_{2}, \ldots , c_{10}, g_{1}, \ldots , g_{9}\)), where \(B_{1}\) is the testing data in the first step, \(c_{1}\) is the CO2 emission in the first step, and \(g_{1}\) represents the historical GDP in the first step. The testing data thus consists of 10 steps of CO2 emissions and 9 steps of annual GDP time series features, and it is used to forecast the next ten steps of annual GDP (from \(g_{D}\) to \(g_{D+9}\)). The testing data in each step p is updated by equation 9.

$$\begin{aligned} B_{p} = \ \begin{bmatrix} c_{1,1+p}&{}\cdots &{} c_{D,1+p} &{} \hat{g}_{1, 1} &{} \cdots &{}\hat{g}_{D-1, p}\\[6pt] \end{bmatrix} \end{aligned}$$
(9)

where p is the prediction step (p = 2, 3, …, D-1) and \(\hat{g}_{p,p}\) represents the predicted value of annual GDP in the p-th step.
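The feedback loop behind equation 9 can be sketched as follows. This is a sketch under assumptions rather than the exact procedure of this study: `recurrent_forecast` and `predict_one` are hypothetical names, the one-step predictor is a toy stand-in for a trained ELM, and the CO2 values for the forecast years are assumed to be supplied by the caller.

```python
def recurrent_forecast(predict_one, co2, gdp_window, steps, D):
    """Forecast `steps` future GDP values by feeding predictions back as inputs.

    predict_one: maps a feature list (D CO2 values + D-1 GDP values) to one GDP value
    co2:         CO2 series covering D + steps - 1 years (assumed to be available)
    gdp_window:  the last D-1 observed GDP values
    """
    window = list(gdp_window)
    forecasts = []
    for p in range(steps):
        features = list(co2[p:p + D]) + window  # test input of step p
        g_hat = predict_one(features)
        forecasts.append(g_hat)
        window = window[1:] + [g_hat]           # drop oldest GDP, append the prediction
    return forecasts

# Toy one-step predictor (hypothetical): GDP grows 5% over the latest GDP feature.
def toy_model(features):
    return features[-1] * 1.05

D = 10
co2 = [1.0] * (D + 10 - 1)      # made-up CO2 values
gdp_window = [100.0] * (D - 1)  # made-up GDP history
forecasts = recurrent_forecast(toy_model, co2, gdp_window, steps=10, D=D)
print(forecasts)
```

The model is trained once for a single-step target; the horizon is extended only at prediction time, so errors in early steps propagate to later ones.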

Fig. 2 The main flow chart of methodology

Overall, based on the methods proposed above, ELM-ABC is used to train the time series model, and the recurrent multi-step prediction strategy is then employed to predict the next p steps of annual GDP based on CO2 emission and historical GDP features. The main flow chart is shown in Fig. 2.

4 Data Analysis and Processing

This section describes the data sets, the relationship between them, and the data processing used to evaluate the proposed prediction model.

4.1 Data Description

With economic growth, the development of industry, regulatory policy, and technology, the environment changes over time. Many studies (Al-Mulali, 2015; Farhani & Rejeb, 2012; Heil & Selden, 1999; Kasperowicz, 2015) have shown that CO2 emissions are related to GDP. While developed countries now pay more attention to balancing the environment with economic growth, many developing or poor countries have developed their economies while destroying the environment. Although China is a developing country, it seeks to be environmentally responsible.

Fig. 3 The time series trend in GDP and CO2 emission from the four different countries

In 2001, in an effort to extend its soft power and connect the Asian heartland, the Chinese government launched the Shanghai Cooperation Organization (SCO), and in 2017 India and Pakistan joined it. From the perspective of environmental protection, in 2008 Beijing hosted the 10th meeting of experts from the ministries and agencies of the SCO member countries responsible for environmental protection, with the aim of preserving and restoring biodiversity in the interests of future generations. In addition, the SCO Energy Club, which was established in 2018, provides a platform for collaborative energy development, giving members the opportunity to reduce carbon emissions. Therefore, in this paper we focus on four main members of the SCO: China, Russia, India, and Pakistan, all large countries with sizable populations. Their time series data sets were collected from the World Bank database (https://data.worldbank.org/country/). The trends of CO2 and GDP in the four countries between 1960 and 2014 are shown in Fig. 3.

As shown in Fig. 3, each country experienced different GDP and CO2 growth trajectories. Annual GDP is measured in US dollars (US$) and CO2 emissions in kilotons (kt), as indicated in the line charts of GDP and CO2 emissions. China’s GDP growth gathered pace after economic liberalization in 1978 and accelerated after the year 2000, with commensurate CO2 output growing apace. After a period of stagnation following the break-up of the Soviet Union, Russia returned to growth after the year 2000. Its CO2 emissions, however, remained relatively stable. Russia’s case represents an exception to the rule, which appears to hold in the other countries studied, that CO2 emissions rise along with, and sometimes outpace, the growth in GDP. This phenomenon was explained by Pao et al. (2011) in terms of Russia’s investment in energy efficiency. Russia’s experience raises the question of whether the ELM technique works well in situations where CO2 emissions are output inelastic. No such phenomenon is found in India, which experienced slow growth until the 1990s, when liberalization boosted its growth, with CO2 emissions rising from the mid-1980s. Pakistan likewise experienced higher economic growth after the year 2000, but saw CO2 emissions rise well before that year.

4.2 Data Analysis

Based on the above time series data, specific data analysis methods can be employed to test the relationship between GDP and CO2 emissions. First, the R Square test and the ANOVA test are used to find out how CO2 emissions affect economic growth. Secondly, the correlation between CO2 and GDP is examined by correlation analysis.

This study uses SPSS 21 for all tests. SPSS provides regression analysis, which is generally used to predict the values of one variable (the dependent variable) from the values of another (the independent variable). In this study, annual GDP (US dollars) is the dependent variable and CO2 emissions is the independent variable. The results of the regression analysis for the four data sets are shown in Tables 1 and 2, which contain the model summary for R Square and the ANOVA test, respectively. Table 1 shows how strongly CO2 emissions affect GDP in each of the four regressions. R is the square root of R Square and is the correlation between the observed and predicted values of the dependent variable. More importantly, R Square is the proportion of variance in the dependent variable (GDP) that can be predicted from the independent variable (CO2 emissions). The adjusted R Square attempts to give a more honest estimate of R Square for the population. Based on the observed R Square values of the four data sets, India has the highest value (0.926), which means that 92.6% of the variance in annual GDP can be predicted from CO2 emissions. China and Pakistan follow with 0.885 and 0.882, respectively. Russia comes last: only 12.8% of the variance in annual GDP can be predicted from CO2 emissions, which means that annual GDP is not well explained by CO2 emissions in Russia. Table 2 mainly answers the question “Do CO2 emissions reliably predict the annual GDP?” The null and alternative hypotheses are as follows:

H0: CO2 emissions do not reliably predict the annual GDP

H1: CO2 emissions reliably predict the annual GDP

We assume a 5% significance level (95% confidence): if the Sig. value in the ANOVA is less than 0.05, we reject H0. According to the Sig. values of the ANOVA test in Table 2, H0 can be rejected for China, Pakistan, and India, which means that the CO2 emissions of these countries reliably predict their respective annual GDP. Only for Russia can H0 not be rejected, which indicates that there is no evidence that CO2 emissions reliably predict the annual GDP in Russia.
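The quantities reported in the model summary and ANOVA tables can be reproduced for a simple one-predictor regression with a few lines of Python. This is a sketch with made-up numbers, not the SPSS output of this study, and `simple_regression_summary` is a hypothetical helper name. The Sig. value itself is the upper-tail probability of the F statistic under an F distribution with (1, n-2) degrees of freedom and is not computed here.

```python
import math

def simple_regression_summary(x, y):
    """R, R Square, adjusted R Square and F for y regressed on one predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r2 = 1.0 - ss_res / ss_tot                      # share of variance explained
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)   # one predictor: n - p - 1 = n - 2
    f_stat = (ss_tot - ss_res) / (ss_res / (n - 2)) # regression MS / residual MS
    return {"R": math.sqrt(r2), "R2": r2, "adj_R2": adj_r2, "F": f_stat}

# Made-up CO2 (x) and GDP (y) values, for illustration only.
co2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gdp = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]
summary = simple_regression_summary(co2, gdp)
print(summary)
```

A large F (and hence a small Sig. value) leads to rejecting H0; a value of R Square near zero, as reported for Russia, gives a small F.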

Table 1 The model summary for R square test about GDP and CO2 emission of different four data sets
Table 2 ANOVA test for GDP and CO2 emission of four different data sets

Next, we carry out a correlation analysis. In SPSS, this study uses the Pearson correlation, a number between -1 and +1 that indicates to what extent two variables are linearly related. A correlation of -1 represents a perfect negative relation, a correlation of 0 indicates no relation between the two variables, and a correlation of 1 means a perfect positive relation. Table 3 shows the relation between CO2 emissions and GDP in the four countries. Two quantities in this table deserve attention: the correlation coefficients and the Sig. value (2-tailed).

First, the correlation coefficients between CO2 and GDP for China, Pakistan, and India are all close to 1 (0.941, 0.939, and 0.962, respectively), which means there is a strong positive relation between CO2 and GDP. In Russia, however, the Pearson correlation coefficient is 0.358, which indicates a weaker correlation between CO2 and GDP. Secondly, we consider the Sig. value (2-tailed) in Table 3.

We assume a significance level of 0.01 (2-tailed). The null and alternative hypotheses are as follows:

H0: there is no relationship between CO2 and GDP

H1: there is a relationship between CO2 and GDP

The decision rule for the Sig. value in the correlation analysis is: \( {\left\{ \begin{array}{ll} \text{Sig. (2-tailed)} \le 0.01, & \text{reject } H0 \\ \text{Sig. (2-tailed)} > 0.01, & \text{cannot reject } H0 \\ \end{array}\right. } \)

According to the Sig. values of the four countries in Table 3, we can conclude that there is evidence that CO2 and GDP are correlated in China, Pakistan, and India. However, given the Sig. value for Russia (0.385), we cannot reject H0, which means there is no evidence of a relationship between CO2 and GDP in Russia.
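The Pearson coefficient and the test statistic behind the Sig. (2-tailed) value can be sketched as follows. The helper names and the sample series are hypothetical; only the four coefficients in `reported_r` are taken from the text above. The 2-tailed Sig. value is obtained from a t distribution with n-2 degrees of freedom and is not computed here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0 'no correlation'; undefined when |r| equals 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Made-up series, for illustration only.
co2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gdp = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
r = pearson_r(co2, gdp)
t = t_statistic(r, len(co2))
print("r =", r, "t =", t)

# With one predictor, the squared Pearson coefficient is the R Square of the
# regression, so the coefficients reported above can be checked against Table 1.
reported_r = {"China": 0.941, "Pakistan": 0.939, "India": 0.962, "Russia": 0.358}
r_squared = {country: round(v * v, 3) for country, v in reported_r.items()}
print(r_squared)
```

The squared coefficients agree with the R Square values quoted for Table 1 to within rounding of the reported coefficients.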

Table 3 Correlation test between GDP and CO2 emission of China

Therefore, according to the above analysis, annual GDP is related to CO2 emissions in China, Pakistan, and India, whereas in Russia CO2 emissions have little relation to annual GDP. Nevertheless, we employ CO2 emissions data as features to forecast annual GDP with our proposed prediction model. To meet the input requirements of the proposed model, this study transforms the original data into a matrix. The next section describes the transformation in detail.

4.3 Data Processing

In this paper, we use CO2 emissions data and historical GDP as features to predict future annual GDP. To train our proposed model and provide enough attributes for the neural network, the CO2 emission and annual GDP time series need to be transformed into a matrix.

Here, it is assumed that the CO2 emission time series is X = {\(x_{1}, x_{2}, \ldots , x_{N}\)}, where N is the number of samples in the time series and \(x_{1}\) is the first sample in X. The GDP time series is G = {\(g_{1}, g_{2}, \ldots , g_{N}\)} (we collected the same number of samples for CO2 and GDP), where \(g_{N}\) is the N-th sample in G. L is the number of attributes of historical annual GDP, and D is the number of attributes of CO2 emission.

Then, the time series data X and G can be transformed by the following equation:

$$\begin{aligned} T = \begin{bmatrix} x_{1,1} & \cdots & x_{1,D} & g_{1,1} & \cdots & g_{1,L}\\ x_{2,2} & \cdots & x_{2,D+1} & g_{2,2} & \cdots & g_{2,L+1}\\ x_{3,3} & \cdots & x_{3,D+2} & g_{3,3} & \cdots & g_{3,L+2}\\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_{j,j} & \cdots & x_{j,D+j-1} & g_{j,j} & \cdots & g_{j,L+j-1}\\ \end{bmatrix} \end{aligned}$$
(10)

where j indexes the rows of the transformed matrix T, j = 1, 2, ..., N-D+1, and the number of columns of T is \((D+L)\). The time series X and G are thus transformed into a matrix T with dimension (\((N-D+1) \times (D+L)\)).

The target values (Y) can be transformed by equation 11 based on the time series data (G).

$$\begin{aligned} Y = \ \begin{bmatrix} g_{L+1} \\ \vdots \\ g_{N} \\ \end{bmatrix} \ \end{aligned}$$
(11)

Therefore, the transformed matrix (T) and target values (Y) can be used to train the proposed prediction neural network and to test its forecasting performance.
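The windowing of equations 10 and 11 can be illustrated with a minimal sketch. It assumes NumPy, reads row j as the CO2 window starting at sample j followed by the GDP window starting at sample j, with the next GDP value after that window as the target, and requires D = L + 1 so that the row counts of T and Y agree. The function name `build_dataset` and the synthetic series are hypothetical and not part of the original study.

```python
import numpy as np

def build_dataset(x, g, D, L):
    """Build feature matrix T and targets Y from two aligned series.

    Assumed reading of Eqs. 10-11: row j (0-based) holds x[j:j+D]
    followed by g[j:j+L], and its target is g[j+L].
    """
    x = np.asarray(x, dtype=float)
    g = np.asarray(g, dtype=float)
    if x.ndim != 1 or x.shape != g.shape:
        raise ValueError("x and g must be 1-D series of equal length")
    if D != L + 1:
        # Otherwise T (N-D+1 rows) and Y (N-L entries) differ in length.
        raise ValueError("this sketch assumes D == L + 1")
    n_rows = len(x) - D + 1
    if n_rows < 1:
        raise ValueError("series is shorter than the CO2 window D")
    T = np.array([np.concatenate((x[j:j + D], g[j:j + L]))
                  for j in range(n_rows)])
    Y = g[L:L + n_rows]
    return T, Y

# Synthetic example with N = 30 samples, D = 10 and L = 9.
x_demo = np.linspace(1.0, 2.0, 30)   # stand-in for CO2 emissions
g_demo = np.linspace(5.0, 9.0, 30)   # stand-in for annual GDP
T_demo, Y_demo = build_dataset(x_demo, g_demo, D=10, L=9)
print(T_demo.shape, Y_demo.shape)    # (21, 19) (21,)
```

With N = 30, D = 10 and L = 9, the result has N-D+1 = 21 rows and D+L = 19 columns, in line with the dimensions stated above.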

5 Experiments

In this section, we use the transformed matrix (T) and target values (Y) to evaluate the proposed model on four data sets: China, Russia, Pakistan and India. To demonstrate the forecasting ability of ELM-ABC, this study compares it with other well-known algorithms, namely ESN, SVR, ELM and KELM.

5.1 Evaluation for Data Sets

The experiments use the CO2 emissions and annual GDP time series (from Sect. 4.1) of four SCO members to test forecasting performance. Firstly, because national strategic plans are typically made for 5-year (short-term) or 10-year (long-term) horizons, we train the model for long-term GDP forecasting with ten years of CO2 emissions and nine years of historical GDP as training features, and the tenth-year GDP as the target value. Accordingly, the number of attributes for CO2 emissions (D) is set to ten and the number of attributes for annual GDP (L) is set to nine. Following Sect. 4.3, the input data and target values are obtained by equations 10 and 11. Secondly, parameters play a significant role in model performance. For a fair comparison, this study searches for the most suitable parameters of the compared models, including SVR, ESN and KELM. Table 4 shows how the parameters of the compared models are searched and set.

Table 4 The parameter setting for compared models

In ESN, the number of reservoir states (R) is searched in the list [1, 5, ..., 200] and the leaky rate (\(\alpha \)), which must lie between 0 and 1, is searched in the list [0.1, 0.2, ..., 1]. The parameters of SVR are set by the method of (Emsia & Coskuner, 2016) for each of the four data sets. The kernel parameter (k) of KELM is set to one. Finally, ELM-ABC is compared with these models. Because each data set contains few samples, it is difficult to split it into separate training and testing sets. This study therefore employs Leave One Out (LOO) cross-validation with Symmetric Mean Absolute Percentage Error (SMAPE) (Armstrong & Forecasting, 1985) and Mean Square Error (MSE) (Makridakis, 1993) to measure the predictive performance of all models. SMAPE is a percentage error for measuring accuracy, defined in equation (12). MSE is the average of the squared differences between actual and predicted values, defined in equation (13). Moreover, this study uses the average SMAPE as the fitness value of ABC for parameter optimization.

$$\begin{aligned} SMAPE= & {} \frac{100 \%}{T}\sum _{t=1}^{T} \frac{\vert \hat{Y_{t}}-Y_{t}\vert }{\left( \vert \hat{Y_{t}} \vert +\vert Y_{t} \vert \right) /2}, \end{aligned}$$
(12)
$$\begin{aligned} MSE= & {} \frac{1}{T} \sum _{t=1}^{T}\left( Y_{t}-\hat{Y_{t}}\right) ^{2}, \end{aligned}$$
(13)

where T is the number of predictions (\(t = 1, 2, \ldots , T\)), \(\hat{Y_{t}}\) is the t-th predicted value, and \(Y_{t}\) is the t-th actual target value. Here T is a count and is distinct from the transformed matrix T of Sect. 4.3.
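As an illustration of equations (12) and (13), the two error measures can be written in a few lines. This is a minimal sketch assuming NumPy; the function names `smape` and `mse` and the sample values are hypothetical, and both functions average over the number of predictions.

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (Eq. 12).

    Assumes no pair (y_true[t], y_pred[t]) is zero in both entries,
    otherwise the denominator vanishes.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_pred) + np.abs(y_true)) / 2.0
    return 100.0 * float(np.mean(np.abs(y_pred - y_true) / denom))

def mse(y_true, y_pred):
    """Mean of the squared errors (Eq. 13)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Two illustrative predictions against two actual values.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(round(smape(actual, predicted), 3))  # 10.025
print(mse(actual, predicted))              # 250.0
```

Because SMAPE is scale-free while MSE depends on the units of the target, the two measures are not directly comparable across data sets with different GDP magnitudes.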

Based on the above discussion and data transformation, the four data sets can be evaluated. Table 5 shows the comparison results on the four data sets based on LOO cross-validation. Because ELM selects its input weights randomly, ELM-ABC is run 100 times and the average is reported as its final result. Furthermore, ELM has a parameter dependency problem (Liu et al., 2018) that affects forecasting performance: the number of hidden neurons influences time series prediction. This is why this study employs ABC to find the most suitable number of hidden neurons in ELM, with the LOO cross-validation SMAPE as the loss function of ABC. The convergence of the forecasting performance for the four countries under ABC is shown in Fig. 4. After 100 iterations of optimization, the number of hidden neurons in ELM for China, Russia, Pakistan and India is determined by ABC.
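The role of the LOO SMAPE as a fitness value can be sketched as follows. This is not the authors' implementation: it assumes a basic single-hidden-layer ELM with sigmoid activation and pseudo-inverse output weights, inputs already scaled to a moderate range, and it replaces ABC with a plain exhaustive search over a hypothetical candidate list, purely to show what quantity the optimizer minimizes. All function names are illustrative.

```python
import numpy as np

def elm_fit(X, y, n_hidden, rng):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # sigmoid hidden output
    beta = np.linalg.pinv(H) @ y                             # Moore-Penrose solution
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

def loo_smape(X, y, n_hidden, seed=0):
    """Average SMAPE (percent) over leave-one-out folds."""
    rng = np.random.default_rng(seed)
    terms = []
    for i in range(len(y)):
        train = np.arange(len(y)) != i          # hold out sample i
        model = elm_fit(X[train], y[train], n_hidden, rng)
        pred = elm_predict(model, X[i:i + 1])[0]
        terms.append(abs(pred - y[i]) / ((abs(pred) + abs(y[i])) / 2.0))
    return 100.0 * float(np.mean(terms))

def pick_hidden_neurons(X, y, candidates, seed=0):
    """Exhaustive stand-in for ABC: keep the candidate with the lowest LOO SMAPE."""
    scores = {h: loo_smape(X, y, h, seed) for h in candidates}
    return min(scores, key=scores.get), scores

# Synthetic, pre-scaled data used only to exercise the functions.
rng_demo = np.random.default_rng(1)
X_demo = rng_demo.uniform(0.0, 1.0, size=(20, 3))
y_demo = X_demo.sum(axis=1) + 1.0
best_h, all_scores = pick_hidden_neurons(X_demo, y_demo, [2, 5, 10])
print(best_h, all_scores)
```

A population-based optimizer such as ABC would explore the same fitness landscape without evaluating every candidate, which matters when the search range for the number of hidden neurons is large.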

Fig. 4 The convergence chart using ABC for four different countries

According to the LOO cross-validation results in MSE and SMAPE, Table 5 shows that ELM-ABC achieves the best predictive performance in forecasting annual GDP among the compared models. Although Russia shows a weaker correlation between CO2 emissions and annual GDP, our proposed model still performs well, with 0.1182 in MSE and 12.70% in SMAPE. China, Pakistan and India also achieve better forecasting performance than the compared models, with SMAPE values of 15.41%, 16.01%, and 13.73%, respectively.

Table 5 The performance in MSE and SMAPE based on LOO cross-validation for different datasets

Fig. 5 Line chart of predicted values based on ELM-ABC and actual values of annual GDP for the four countries

5.2 Prediction and Discussion

Section 5.1 showed, based on LOO cross-validation, that ELM-ABC forecasts annual GDP better than the compared models. In this section, we visualize the predictive performance of ELM-ABC and predict the annual GDP of the next 10 years (2019–2028) for China, Russia, Pakistan and India.

Firstly, the full transformed matrix of each country's data set is used as training data for ELM-ABC. The most suitable number of hidden neurons in ELM is optimized by ABC during training. The same data are then used to test prediction performance. Line charts for the four countries compare the actual observations with the values predicted by ELM-ABC, which makes it easy to see how closely the forecasts match the actual annual GDP. The forecast and actual annual GDP (\(10^{13}\)) for China, Russia, Pakistan and India are shown in Fig. 5. The figure indicates that the values forecast by ELM-ABC closely match the actual annual GDP for all four countries. This provides further evidence, from another angle, that ELM-ABC forecasts annual GDP well.

Fig. 6 The 10-year annual GDP prediction from 2019 to 2028

Secondly, we collected 10 years of CO2 emissions (2009–2018) and 9 years of annual GDP (2010–2018) for the four countries from the website (www.countryeconomy.com) as the testing features for ELM-ABC, in order to predict the next ten years of annual GDP (2019–2028) with a recurrent multi-step forecasting algorithm. Based on the prediction model ELM-ABC and the recurrent multi-step prediction strategy, the ten-year annual GDP (2019–2028) is predicted as shown in Fig. 6, which shows the development tendency over the coming ten years. Annual GDP in China is forecast to grow relatively steadily over the next five years and then enter a period of slow growth between 2024 and 2028. In Pakistan, annual GDP is forecast to grow slowly from 2019 to 2024 and then stay at around 0.4E+12 US dollars until 2028. For Russia, one of the most influential countries, annual GDP is forecast to follow its pattern between 2010 and 2016. Finally, India is forecast to show a generally linear increase in annual GDP, although with fluctuations over the next ten years. It can thus be seen that the forecasts are affected by historical changes in GDP.
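A recurrent (recursive) multi-step strategy of this kind can be sketched in general terms: each one-step prediction is appended to the GDP window and fed back as an input for the next step. The text does not state how the CO2 window is advanced beyond the last observed year, so this sketch assumes the caller supplies the CO2 values for the forecast horizon through a hypothetical `co2_future` argument. The function name, the `predict_one` callable and the toy values are likewise illustrative and not the authors' implementation.

```python
import numpy as np

def recursive_forecast(predict_one, co2_window, gdp_window, co2_future, steps):
    """Roll a one-step model forward for `steps` years.

    predict_one : callable mapping a 1-D feature vector
                  [last D CO2 values, last L GDP values] to the next GDP value
    co2_window  : last D observed CO2 values
    gdp_window  : last L observed GDP values
    co2_future  : assumed CO2 values for the forecast years (hypothetical
                  input; at least steps - 1 values are needed)
    """
    co2 = [float(v) for v in co2_window]
    gdp = [float(v) for v in gdp_window]
    D, L = len(co2), len(gdp)
    if len(co2_future) < steps - 1:
        raise ValueError("need at least steps - 1 assumed CO2 values")
    forecasts = []
    for step in range(steps):
        features = np.array(co2[-D:] + gdp[-L:])
        y_hat = float(predict_one(features))
        forecasts.append(y_hat)
        gdp.append(y_hat)                        # feed the prediction back
        if step < steps - 1:
            co2.append(float(co2_future[step]))  # advance the CO2 window
    return forecasts

# Toy one-step model: next GDP = last GDP value + 1 (illustration only).
toy_model = lambda f: f[-1] + 1.0
out = recursive_forecast(toy_model,
                         co2_window=range(1, 11),   # D = 10
                         gdp_window=range(1, 10),   # L = 9
                         co2_future=[0.0] * 9,
                         steps=10)
print(out)
```

Because every step consumes earlier predictions, errors can accumulate over the horizon, which is one reason long-range forecasts from such a scheme should be read as trends rather than point estimates.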

According to the data analysis in Sect. 4.2, total CO2 emissions in China, India and Pakistan have positive relationships with the respective countries’ annual GDP, whereas Russia shows little correlation between annual GDP and CO2 emissions. In terms of CO2 emissions, the total amount in India appears to have grown steadily from 1960 to 2014. By comparison, CO2 emissions in Pakistan and China plateaued from 2011, and Russia has maintained around 1.5E+6 KT of CO2 emissions. However, all countries experienced different growth rates in their annual GDP during the reference period. Based on these correlations between CO2 and GDP, we apply ELM-ABC and the recurrent multi-step prediction strategy to train the time series prediction model and predict 10-year annual GDP from CO2 emissions and historical annual GDP. The experimental results show that ELM-ABC has the best prediction performance among the compared models and that 10-year future annual GDP levels can be predicted accurately.

The forecasts show that the annual GDP of China and Pakistan will continue to grow but growth will slow after 2025. The annual GDP of India will exhibit unstable growth. The trend for Russia will follow its pattern between 2010 and 2016. GDP prediction plays a vital role in economic analysis and policy changes for each country. Based on the prediction results for the four SCO members, investors can make investment decisions in these countries, and the nations themselves can adjust their policies to try to achieve their target growth rates. For instance, recognizing its slowing growth, China has adopted policies to improve the quality of its economic growth (Green & Stern, 2015).

6 Conclusion

Rising CO2 emissions, a major driver of global warming and climate change, are one of the most important issues in the environmental and economic area. This study uses precisely this variable in a novel ELM-ABC model to predict annual GDP over a ten-year horizon for four SCO member countries. In this model, the relationship between annual GDP and total CO2 emissions is investigated. The results show that CO2 emissions are clearly related to annual GDP in China, India and Pakistan, while the relationship is weak in Russia.

Methodologically, the fact that a single explanatory variable, CO2, in combination with lagged values of the dependent variable, is able to predict future GDP speaks to the explanatory power of CO2 and to the limited information that forecasters need. This compares favorably with the plethora of variables typically needed for such forecasts. Changes in CO2 tend to be less sensitive to business cycles than economic variables, and therefore prove more robust than the latter for long-term forecasts. In terms of the algorithm, our proposed model ELM-ABC overcomes the restriction of the prediction horizon, the computational complexity, and the parameter dependency problem through the recurrent multi-step strategy, the unique structure of ELM, and the ABC optimization method, respectively, yielding an efficient and accurate prediction model. Based on the experimental results, the ELM-ABC methodology is most robust when the relationship between GDP and CO2 emissions is strong. Even in the minority of situations where this relationship does not hold (as in the case of Russia), the proposed model still performs well. Furthermore, our GDP prediction can serve as a tool for forecasting future GDP precisely. The prediction results are useful not only as a basis for investment decisions but also for policy-making.

Although the methodology has proven superior to several alternatives, unexpected events, such as the coronavirus disease (COVID-19) pandemic, generally affect the trend of annual GDP. This epidemic caused factory closures and transport restrictions, which directly reduced overseas trade and factory productivity. Such a situation is known as the concept drift problem. In future work, we will consider the probability of unexpected events and propose a method to address the concept drift problem.