
Multi-Layer Perceptron model with Elastic Grey Wolf Optimization to predict student achievement

Abstract

This study proposes a Grey Wolf Optimization (GWO) variant, the Elastic Grey Wolf Optimization algorithm (EGWO), with shrinking, resilient surrounding, and weighted candidate mechanisms. The proposed EGWO is then used to optimize the weights and biases of a Multi-Layer Perceptron (MLP), yielding the EGWO-MLP model for predicting student achievement. The EGWO-MLP prediction model is trained and verified on the thirty attributes of the Student Performance dataset from the University of California, Irvine (UCI) Machine Learning Repository, which include family features and personal characteristics. For the Mathematics (Mat.) subject, the EGWO-MLP model outperforms the compared swarm-based models in prediction accuracy, and its small standard deviation indicates a stable ability to predict student achievement. For the Portuguese (Por.) subject, the EGWO-MLP outperforms three compared models during training and takes first place during testing. The results show that the EGWO-MLP model makes fewer test errors, indicating that EGWO can effectively feed back weights and biases thanks to its strong exploration and local stagnation avoidance, and that the EGWO-MLP model is feasible for predicting student achievement. The study can serve as a reference for improving school teaching programs and enhancing teachers' teaching quality and students' learning effect.

Introduction

In a narrow sense, education refers to school education organized by specialized institutions; in a broad sense, it refers to the social practice activities that affect people's physical and mental development [1]. Within given school conditions and professional settings, education aims to cultivate cognitive development in a planned and organized way, teach people with existing experience and knowledge [2, 3], explain various phenomena, problems, or behaviors [4], and improve their practical ability. It is fundamental to recognizing and treating things with relatively mature, rational thinking.

In contemporary society, the rapid development of information technology has deeply affected the reform of education and teaching [5]. Especially in the era of big data, various data mining methods are being applied to the education industry, providing new ideas for finding better educational and teaching methods [6, 7]. With the continuous progress of education, more and more advanced techniques are used in the education industry to help decision-makers carry out educational analysis [8] and to improve educational outcomes by continuously developing the existing education system. Predicting student performance and achievement can urge students to improve learning efficiency [9, 10], encourage teachers to boost teaching quality [11, 12], and help refine teaching to achieve the best effect [13]. Student achievement is not only an important indicator of talent-training quality but also an essential part of big educational data [14]. Accordingly, student achievement prediction is a leading research direction.

Student achievement data are characterized by a relatively unified data type, a large volume, and relatively easy access. In-depth mining and analysis of student achievement from multiple angles with appropriate data mining technology has become a research hotspot of teaching reform [14, 15]. The above descriptions demonstrate that student achievement prediction is a non-linear, high-dimensional problem [16]. Traditional techniques struggle with such high-dimensional, complex problems, so obtaining new methods or theories to guide teaching is a significant task [17].

Artificial neural networks (ANNs) are algorithmic mathematical models that imitate the behavioral characteristics of animal neural networks for distributed, parallel information processing. Such a network processes information by adjusting the interconnections among a large number of internal nodes, thereby obtaining the ability of self-learning and self-adaptation [18, 19]. ANNs are widely used in complex practical applications, such as predicting performance, modeling, grouping students according to their personal characteristics, and providing personalized learning support [20]. The Multi-Layer Perceptron (MLP) is the most straightforward neural network; it can contain one or more hidden layers and has been applied in many fields [21, 22].

Swarm intelligence optimization algorithms are inspired by the group communication and predation behaviors observed in nature. Experts and scholars have proposed many novel and efficient swarm intelligence optimization algorithms, such as the Ebola Optimization Search Algorithm (EOSA) [23] and the multi-objective artificial vultures optimization algorithm (MOAVOA) [24]. These algorithms offer good parallelism and autonomous exploration, provide new ideas and methods for solving complex problems, and have attracted increasing attention among researchers. They have been widely used for solving practical problems [25, 26] and for optimizing the weights and biases of the MLP [27]. The Grey Wolf Optimization (GWO) algorithm is a swarm intelligence optimization algorithm [28]. Based on the wolf pack hierarchy, it simulates wolves' behaviors of surrounding, following, and hunting prey, and it has a strong ability to solve high-dimensional complex problems [29, 30]. Since it was put forward, it has attracted the attention of many scholars and been applied to various fields. In recent years, combining it with neural networks to solve practical problems has gained a high international profile [31, 32].

According to the No Free Lunch (NFL) theorem, no single algorithm can effectively solve all problems [33]. When solving different problems, an algorithm may encounter unfavorable situations such as local stagnation, premature convergence, and an unbalanced trade-off between exploration and exploitation. Therefore, to avoid these problems, this paper proposes a GWO variant called the Elastic Grey Wolf Optimization algorithm (EGWO) to train the Multi-Layer Perceptron by optimizing its weights and biases.

The architectural idea of this paper is shown in Fig 1.

Literature review

Many scholars have studied student performance to make full use of educational big data resources, promoting the steady development of teaching reform.

Scholars apply various methods to predict student performance. Woo et al. study the relationship between indoor conditions and students' classroom performance and use a hierarchical multiple regression model to determine the crucial predictors of student performance [34]. Based on different centrality dimensions, Vignery predicts student performance through Principal Component Analysis (PCA), Exponential Random Graph Models (ERGM), agglomerative hierarchical clustering, and multilevel modeling; the results show that geodesic k-path and closeness centrality positively impact Grade Point Average (GPA) [35]. To help teachers offer timely help that improves students' academic performance, Khan et al. classify student performance before the beginning of the class and propose a novel classification method that provides an additional confidence measure and improves the acceptability of predictions [36]. As a significant force in the school, principals also impact student performance. Wu et al. conduct a multivariate meta-meta-analysis to investigate the relationship between principal leadership and student achievement [37]. Based on the DEA model and the Bootstrap method, Masci et al. measure the impact of school (district) size, management practices, and principals' characteristics on student groups through standardized reading and mathematics test scores [38]; the results show that the composition of school subjects mainly affects students' reading efficiency, while management practices mainly affect their efficiency in mathematics.

Scholars choose various characteristics to predict student performance. Many higher education institutions try to understand student factors to improve the quality of education. Students' semantic trajectories have been examined with dynamic time warping, hierarchical clustering, and variance analysis; the results show that the semantic trajectory is a relevant factor affecting student performance [39]. Silva et al. analyze family characteristics, values, beliefs, expectations and family support, self-efficacy, goal progress, and academic achievement [40]. The results show that families affect academic performance through academic self-efficacy and views on the progress of educational goals, whereas the information provided by self-efficacy has less impact and is more related to the support of material resources. Sarfraz et al. study the factors affecting the performance of business school students during the COVID-19 pandemic: assessing students' views and preferences, the impact of a blended learning (BL) setting on students' academic performance, and the relationship between the unified theory of acceptance and use of technology (UTAUT) and students' academic performance [41]. Javadizadeh et al. study the impact of class structure, teaching style, and class environment on students' classroom performance, drawing on self-determination theory to hypothesize relationships among the SCARF elements (status, certainty, autonomy, relatedness, fairness), students' intrinsic motivation, and classroom performance [42]. This study has important guiding significance for improving students' enthusiasm.

Some scholars choose deep learning techniques to predict student grades, as summarized below. To understand the factors influencing college students, Rivas et al. determine the critical factors of student performance from the number of visits to available resources, based on a tree model and different types of ANNs [43]. Li et al. propose a Multi-View Hypergraph Neural Network (MVHGNN) for predicting students' academic performance, which uses hypergraphs to construct high-order relations among students, and introduce a Cascade Attention Transformer (CAT) module to mine the weights of different behaviors through the self-attention mechanism; experiments on real campus student behavioral datasets show that MVHGNN outperforms state-of-the-art methods [44]. Bertolini et al. use bootstrapping to examine performance variability among five data mining methods (DMMs) and four filter-based feature selection techniques for forecasting course grades for 3225 students enrolled in an undergraduate biology class [45]. Wu et al. propose a novel knowledge tracing model based on an exercise session graph, named session graph-based knowledge tracing (SGKT): a session graph models the students' answering process, and a relationship graph models the relationship between exercises and skills. Experiments on three publicly available datasets demonstrate that the model outperforms several existing baselines [46]. Pallathadka et al. analyze the ability of machine learning methods such as Naive Bayes, ID3, C4.5, and SVM to predict students' performance in future tests, evaluated by criteria such as accuracy and error rate on the UCI student performance dataset [47]. Tomasevic et al. address student exam performance prediction, i.e., discovering students at "high risk" of dropping out of a course, by providing a comprehensive analysis and comparison of state-of-the-art supervised machine learning techniques for predicting future achievements such as final exam scores [48].

Although scholars have adopted different methods to study the factors affecting student performance and achievement [49–51] and to predict final performance [52, 53], few studies have examined the impact of students' personal characteristics and family factors on their achievement [54, 55].

Method

Ethics statement

In this study, the standard University of California, Irvine (UCI) Machine Learning Repository dataset (https://archive.ics.uci.edu/ml/datasets/Student+Performance) is used in the simulation experiments. UCI datasets are commonly used as benchmark datasets in papers and studies, and the original data are provided on the official website. The Student Performance Data Set introduced in the experimental dataset and environment section was obtained from the official website; it was collected from two Portuguese secondary schools through reports and questionnaires on the performance of students in secondary education. The study does not involve human participants, human specimens or tissue, vertebrate animals or cephalopods, vertebrate embryos or tissues, or field research.

Grey Wolf Optimizer

The Grey Wolf Optimizer (GWO) is a swarm intelligence algorithm that mimics the hunting behavior of wolves [28]. The superior performance of this algorithm benefits from the wolf pack hierarchy mechanism. Among the wolves, the α, β, and δ wolves are the three leaders; the rest, named ω wolves, occupy the lowest class and attack the prey. The GWO algorithm incorporates two primary operations: (1) surrounding the prey and (2) hunting the prey. The whole process of the algorithm is shown in Algorithm 1.

Operation 1: Surrounding the prey. Eqs (1) and (2) express the surrounding mechanism:

\[ D = |C \cdot X_p(t) - X(t)| \tag{1} \]
\[ X(t+1) = X_p(t) - A \cdot D \tag{2} \]

where t is the current iteration number, X_p(t) is the location of the prey, X(t) and X(t+1) are the wolf locations at the tth and (t+1)th iterations, and D is the distance between the prey and the wolf. A and C are calculated by Eqs (3) and (4):

\[ A = 2a \cdot r_1 - a \tag{3} \]
\[ C = 2 r_2 \tag{4} \]

where a decreases linearly from 2 to 0 over the iterations, and r_1 and r_2 are random values in the range [0, 1].

Operation 2: Hunting the prey. The α wolf leads the whole process: the α, β, and δ wolves estimate the prey's location, the ω wolves update randomly around it, and the final position given by Eqs (5) to (11) lies at a random point within the circle defined by the three leaders (a Python sketch follows Algorithm 1 below):

\[ D_\alpha = |C_1 \cdot X_\alpha - X| \tag{5} \]
\[ D_\beta = |C_2 \cdot X_\beta - X| \tag{6} \]
\[ D_\delta = |C_3 \cdot X_\delta - X| \tag{7} \]
\[ X_1 = X_\alpha - A_1 \cdot D_\alpha \tag{8} \]
\[ X_2 = X_\beta - A_2 \cdot D_\beta \tag{9} \]
\[ X_3 = X_\delta - A_3 \cdot D_\delta \tag{10} \]
\[ X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{11} \]

Algorithm 1: The basic Grey Wolf Optimization Algorithm

Step 1: Initialize the grey wolf population: Xi (i = 1,2,…,n);

Step 2: Initialize the parameters: a, A and C;

Step 3: Calculate the fitness value of each wolf:

X_α ← best of f(X_i); X_β ← second best of f(X_i); X_δ ← third best of f(X_i);

Step 4: While t < tmax

For each wolf

  Update the current position based on Eqs (1) to (11);

End For

End While

Step 5: Return Xα.
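For a concrete reference point, Algorithm 1 can be sketched in a few lines of Python (a minimal NumPy illustration of Eqs (1) to (11), not the authors' MATLAB implementation; the bounds lb and ub and the population settings are placeholders):

```python
import numpy as np

def gwo(fitness, dim, n_wolves=10, max_iter=300, lb=-10.0, ub=10.0):
    """Minimal sketch of the standard GWO loop (Algorithm 1, Eqs (1) to (11))."""
    X = np.random.uniform(lb, ub, (n_wolves, dim))       # Step 1: initialize wolves
    for t in range(max_iter):
        # Step 3: rank wolves; the three best lead as alpha, beta, delta
        order = np.argsort([fitness(x) for x in X])
        alpha, beta, delta = (X[order[k]].copy() for k in range(3))
        a = 2.0 * (1.0 - t / max_iter)                   # a decreases linearly from 2 to 0
        for i in range(n_wolves):
            candidates = []
            for leader in (alpha, beta, delta):
                A = 2.0 * a * np.random.rand(dim) - a    # Eq (3)
                C = 2.0 * np.random.rand(dim)            # Eq (4)
                D = np.abs(C * leader - X[i])            # Eqs (5) to (7)
                candidates.append(leader - A * D)        # Eqs (8) to (10)
            X[i] = np.clip(sum(candidates) / 3.0, lb, ub)  # Eq (11): average of the three
    return X[np.argmin([fitness(x) for x in X])]         # Step 5: return the alpha wolf
```

For example, gwo(lambda x: np.sum(x ** 2), dim=5) returns the best position found for a toy sphere function.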

Variants of the GWO

Since the GWO was proposed, it has attracted wide attention. In the past two years (2021 and 2022), scholars have proposed a variety of variants, including the following:

  • Ensemble Grey Wolf Optimizer (EGWO): Yu et al. propose a variant Ensemble GWO (EGWO) with two strategies to boost the performance of GWO, which is validated by the IEEE CEC 2019 and image segmentation in the real world. The results show that the proposed EGWO algorithm is reliable and effective [56].
  • Random Walk Grey Wolf Optimizer (RWGWO): Deep et al. propose a Random Walk Grey Wolf Optimizer based on a dispersion factor (RWGWO) for the feature selection problem and examine it on eighteen chronic disease datasets. The experimental results show that RWGWO is an effective GWO variant [57].
  • Diversity enhanced Strategy based Grey Wolf Optimizer (DSGWO): To address poor population diversity and weak global search capability, Jiang et al. propose the Diversity enhanced Strategy based Grey Wolf Optimizer (DSGWO), which combines group-stage competition and exploration-exploitation balance mechanisms to improve the performance of the GWO algorithm. DSGWO is validated on IEEE CEC 2014 and two engineering design problems, and the results prove that it has strong exploration and exploitation ability [58].
  • Adult-Pup Teaching–Learning based Interactive Grey Wolf Optimization (AP-TLB-IGWO): Banerjee et al. propose the adult-pup teaching–learning based interactive grey wolf optimization (AP-TLB-IGWO) algorithm to mitigate the challenges of the basic GWO and better represent the existing system. Its performance is tested on the IEEE CEC 2014 and CEC 2017 suites, yielding significantly promising results in comparison with current techniques [59].
  • Hybrid grey wolf optimization (HGWO): Hoballah et al. propose a Hybrid grey wolf optimization (HGWO) variant with two internal loops based on particle swarm optimization (PSO) and genetic algorithm (GA) techniques. HGWO is tested on a 66-bus three-area test system for cost minimization following the outage of the largest generator in each area [60].
  • Randomized Balanced Grey Wolf Optimizer (RBGWO): Adhikary et al. propose a new variant of GWO termed Randomized Balanced Grey Wolf Optimizer (RBGWO), which outperforms some meta-heuristic algorithms. The proposed algorithm is applied to constrained and unconstrained real life problems. The results produced by the proposed variant are of better quality compared to those of others [61].
  • Mutation-driven Modified Grey wolf (MDM-GWO): Singh et al. propose a new variant of the GWO called Mutation-driven Modified Grey wolf (MDM-GWO). The variant’s performance is tested by 23 well-known standard benchmark problems and four real world engineering design problems. The numerical results, statistical tests, convergence and diversity curves, and comparisons among several algorithms show the superiority of the proposed MDM-GWO [62].
  • Binary Grey Wolf Optimization (BGWO): To solve the Virtual Network Function (VNF) deployment problem, Shahjalal et al. propose an artificial intelligence (AI) driven meta-heuristic Binary Grey Wolf Optimization (BGWO) algorithm. The results show the method can minimize VNF deployment costs and maximize users' QoE [63].
  • Gaze cues learning-based grey wolf optimizer (GGWO): Nadimi-Shahraki et al. propose Gaze cues learning-based grey wolf optimizer (GGWO) with the two mechanisms of the neighbor gaze cues learning (NGCL) and random gaze cues learning (RGCL) inspired by the gaze cueing behavior in wolves. Four real engineering design problems and two optimal power flow (OPF) problems for the IEEE 30-bus and IEEE 118-bus are optimized to verify the applicability of the GGWO in practice. And the results show that the GGWO algorithm is able to provide competitive and superior results to the compared algorithms [64].
  • Improved grey wolf optimizer (IGWO): Wang et al. propose an improved grey wolf optimizer (IGWO) to optimize their model. The experimental results show the superiority of the proposed method over other algorithms for solving the welding shop inverse scheduling problem (WSISIP) [65].
  • Advanced Grey Wolf Optimization algorithm (AGWO): Meng et al. propose a variant named Advanced Grey Wolf Optimization algorithm (AGWO) with elastic, circling, and attacking mechanisms, which optimizes the weights and biases of the MLP. The performance of AGWO is investigated on seven classification and three function-approximation datasets, and the results show that it is superior to some other heuristic algorithms in local optimum avoidance and computational accuracy [19].

Multi-Layer Perceptron (MLP)

The Multi-Layer Perceptron (MLP) is an artificial neural network with a feed-forward structure that maps a set of input vectors to a set of output vectors [66, 67]. An MLP (as shown in Fig 2) can be regarded as a directed graph composed of multiple node layers, each fully connected to the next. Each node is a neuron (or processing unit) with a non-linear activation function. The basic structure consists of three layers: an input layer, a middle hidden layer, and an output layer. The products of the inputs and weights are fed to a summation node together with the neuron bias. The primary calculation proceeds as follows:

(1) The weighted sum of the inputs is calculated by Eq (12):

\[ s_j = \sum_{i=1}^{n} W_{ij} X_i - \theta_j \tag{12} \]

where n is the number of input nodes, W_{ij} is the weight linking the ith input node to the jth hidden node, and X_i is the ith input.

(2) The output of each hidden node is calculated by Eq (13):

\[ S_j = \mathrm{sigmoid}(s_j) = \frac{1}{1 + e^{-s_j}} \tag{13} \]

(3) The weighted sum at each output node and the final output are calculated by Eqs (14) and (15):

\[ o_k = \sum_{j=1}^{h} w_{jk} S_j - \theta'_k \tag{14} \]
\[ O_k = \mathrm{sigmoid}(o_k) = \frac{1}{1 + e^{-o_k}} \tag{15} \]

where h is the number of hidden nodes, w_{jk} is the weight connecting the jth hidden node to the kth output node, and θ_j and θ'_k are the biases of the jth hidden node and the kth output node.
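Eqs (12) to (15) translate directly into code. The following sketch (a single hidden layer with sigmoid activations; the array shapes are illustrative) shows the forward pass that the trained weights and biases ultimately serve:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(X, W_ih, theta_h, W_ho, theta_o):
    """Forward pass of a one-hidden-layer MLP following Eqs (12) to (15).

    X: (n,) inputs; W_ih: (n, h) input-to-hidden weights; theta_h: (h,) hidden
    biases; W_ho: (h, m) hidden-to-output weights; theta_o: (m,) output biases.
    """
    s = X @ W_ih - theta_h    # Eq (12): weighted input sum minus hidden bias
    S = sigmoid(s)            # Eq (13): hidden node outputs
    o = S @ W_ho - theta_o    # Eq (14): weighted hidden sum minus output bias
    return sigmoid(o)         # Eq (15): final outputs
```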

Elastic Grey Wolf Optimization algorithm (EGWO)

To improve the performance of the basic GWO, shrinking, resilient surrounding, and weighted candidate mechanisms are introduced to remedy its local stagnation and premature convergence deficiencies, yielding the proposed Elastic Grey Wolf Optimization algorithm (EGWO).

Shrinking mechanism. The Cauchy distribution is a continuous probability distribution whose mathematical expectation does not exist; a random variable that satisfies its probability density function obeys the Cauchy distribution. To prevent wolf positions from falling into local stagnation, the parameters A and C of Eqs (3) and (4) are updated through a Cauchy-inspired shrinking mechanism: the random constants r_1 and r_2 are updated by Eqs (16) and (17), and the parameter a is calculated by Eq (18). (16) (17) (18)
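As background for this mechanism, standard-Cauchy random numbers are easy to draw by inverse-transform sampling. The sketch below shows the distribution only; the exact forms of Eqs (16) to (18), which map such draws onto r_1, r_2, and a, are specific to EGWO and are not reproduced here:

```python
import numpy as np

u = np.random.rand(5)
cauchy = np.tan(np.pi * (u - 0.5))       # inverse-CDF draw from the standard Cauchy
builtin = np.random.standard_cauchy(5)   # NumPy's equivalent built-in sampler
```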

Resilient surrounding mechanism. The core of the GWO algorithm is the movement and location of the three leading wolves (α, β, and δ). The original location-updating strategy cannot effectively solve high-dimensional and complex problems, and local stagnation and premature convergence make it difficult to find the best solution for such complex real-world problems. The resilient surrounding mechanism is therefore introduced to overcome this weakness: the positions of the α, β, and δ wolves are updated by Eqs (19) to (22). (19) (20) (21) (22)

Weighted candidate mechanism. In the basic GWO, the α wolf alone leads the hunt, which can cause local stagnation. The weighted candidate mechanism is introduced to avoid this problem. First, the weighted coefficients are computed by Eq (23) to adjust the step direction and length. Second, candidate wolves prepare to hunt the prey, and their positions are updated by Eqs (24) to (27). The whole process of the algorithm is shown in Algorithm 2. (23) (24) (25) (26) (27)

Algorithm 2: Elastic Grey Wolf Optimization algorithm (EGWO)

Step 1: Initialize the population of N wolves: X_i (i = 1, 2, …, N);

Step 2: Evaluate population in the objective function;

Step 3: While (t < tmax)

  Set top three wolves as α, β and δ wolves:

X_α ← best of f(X_i); X_β ← second best of f(X_i); X_δ ← third best of f(X_i);

Step 4: For every wolf

  Set r1 and r2 by Eqs (16) and (17);

  Initialize a according to the Eq (18);

  Update X_α, X_β and X_δ according to Eqs (20) to (22);

  Update the weights w_1 to w_3 by Eq (23);

  Generate the candidate wolves by Eqs (24) to (27);

  Update the wolf population;

End For

 t ← t+1

End While

Step 5: Return Xα.

EGWO-MLP: Student achievement prediction model

The EGWO-MLP prediction model aims to predict student achievement and then determine the essential variables affecting educational success or failure. Its hidden-layer structure and dynamic adjustment of the weight parameters make it well suited to predicting students' final achievement. The prediction of student achievement is illustrated in Fig 3.

Fig 3. Framework diagram of performance prediction model.

https://doi.org/10.1371/journal.pone.0276943.g003

The model construction comprises input, operation, and output. The process includes normalization, determination of the input, output, and hidden units, setting of training parameters, creation of the network model, calling of the activation function, and so on. The output is the predicted outcome. If the test sample's output meets the expectation of the training sample, learning ends; otherwise, the model learns again and adjusts the thresholds until the termination conditions are met. The whole process of EGWO training the MLP is shown in Fig 4, and the weight and bias assignments are presented in Fig 5 (a code sketch of this encoding follows Fig 5 below).

Fig 4. Process of Elastic Grey Wolf Optimization algorithm (EGWO) training MLP.

https://doi.org/10.1371/journal.pone.0276943.g004

Fig 5. Weights and biases assignments of the Multi-Layer Perceptron (MLP) with two layers.

https://doi.org/10.1371/journal.pone.0276943.g005
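As Fig 5 suggests, each wolf is a flat vector holding every weight and bias of the MLP, and its fitness is the average MSE of Eq (29) below. A hedged sketch of that encoding (layer sizes per the parameter-setting section; the function names are illustrative, not the authors' code):

```python
import numpy as np

SIZES = [30, 61, 2, 1]  # inputs, two hidden layers, one output (see parameter settings)

def decode(vector, sizes=SIZES):
    """Unpack a flat wolf position into per-layer (W, theta) pairs, as in Fig 5."""
    params, pos = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = vector[pos:pos + n_in * n_out].reshape(n_in, n_out)
        pos += n_in * n_out
        theta = vector[pos:pos + n_out]
        pos += n_out
        params.append((W, theta))
    return params

def forward(x, params):
    """Propagate one sample through every layer with sigmoid activations."""
    for W, theta in params:
        x = 1.0 / (1.0 + np.exp(-(x @ W - theta)))
    return x

def fitness(vector, X_train, y_train):
    """Average MSE over the training samples: the quantity EGWO minimizes."""
    preds = np.array([forward(x, decode(vector)) for x in X_train])
    return float(np.mean((preds.ravel() - y_train) ** 2))
```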

Experimental dataset and environment

Student data

In this study, we analyze real-world data from two Portuguese secondary schools to train and verify the prediction model, using the Student Performance dataset from the University of California, Irvine (UCI) Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Student+Performance). This dataset was collected through reports and questionnaires on students' performance in secondary education at two Portuguese schools; the two schools' proportions of the dataset are shown in Fig 6. The data attributes include student achievement, demographic, social, and school-related characteristics, and two datasets are provided, covering performance in two subjects: Mathematics (Mat.) and Portuguese (Por.) [68]. The dataset contains thirty attributes, shown in Table 1: the second column of Table 1 gives the name of each attribute and the third column describes it. The thirty attributes are the inputs of the prediction model. In this paper, 80% of the data is used for training and 20% for testing (a loading sketch follows Table 1 below).

Table 1. Attribute information for student performance data set.

https://doi.org/10.1371/journal.pone.0276943.t001
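For orientation, the dataset is distributed on the UCI page above as two semicolon-separated CSV files (student-mat.csv and student-por.csv). A hedged loading sketch in Python (pandas and scikit-learn stand in for the MATLAB pipeline actually used; the 80/20 split follows this paper):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("student-mat.csv", sep=";")   # or student-por.csv for Por.

# Keep the thirty non-grade attributes of Table 1 as inputs; predict the final grade G3.
X = pd.get_dummies(data.drop(columns=["G1", "G2", "G3"]))
y = data["G3"]

# 80% training data and 20% testing data, as set in this paper
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```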

Prediction model (EGWO-MLP) parameters setting

The EGWO-MLP prediction model contains an input layer, two hidden layers, and an output layer. The input layer takes the 30 attributes of the student performance UCI dataset, including family features and personal characteristics, as input nodes. The number of hidden layers is set to 2: the first hidden layer has (2 × number of inputs + 1) nodes, and the second has two. Thus, in the prediction model of this paper, the 30 factors that affect students are the inputs, the two hidden layers G1 and G2 supply the hidden nodes, and the student's grade is the output of the model.
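Under this configuration, the dimensionality of the search space that each wolf must encode follows directly (a quick check, assuming the single grade-output node described above):

```python
inputs, h1, h2, outputs = 30, 2 * 30 + 1, 2, 1   # 30 inputs, 61 and 2 hidden nodes
weights = inputs * h1 + h1 * h2 + h2 * outputs   # 1830 + 122 + 2 = 1954
biases = h1 + h2 + outputs                       # 61 + 2 + 1 = 64
print(weights + biases)                          # 2018 dimensions per wolf position
```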

Experimental environment setting

The experimental environment is MATLAB, an advanced technical computing language and interactive environment integrating numerical analysis, data visualization, matrix calculation, and non-linear dynamic modeling. The experiments are coded in MATLAB R2015b under the Windows 10 operating system, and all simulations run on a computer with an Intel Core(TM) i3-6100 CPU @ 3.70 GHz and 8 GB of memory. Each configuration is run twenty times to assess predictive performance. The population size and maximum number of iterations are 10 and 300, respectively, and the compared algorithms' parameter settings are shown in Table 2.

Criteria for evaluating performance

The training error is the error between the value predicted by the model and the actual value on the training set. The Mean Square Error (MSE) serves as the training error. MSE measures the difference between the actual value and the value produced by the training algorithm [69, 70] and is widely used as a criterion [71]. MSE is computed by Eq (28):

\[ \mathrm{MSE} = \sum_{i=1}^{m} (o_i^k - d_i^k)^2 \tag{28} \]

where m is the number of outputs, o_i^k is the output of the ith output unit when the kth training sample is used, and d_i^k is the actual (desired) value for the kth training sample. To ensure the fairness and effectiveness of the experiment, the average MSE (\(\overline{\mathrm{MSE}}\)) over all training samples is computed by Eq (29):

\[ \overline{\mathrm{MSE}} = \frac{1}{s} \sum_{k=1}^{s} \sum_{i=1}^{m} (o_i^k - d_i^k)^2 \tag{29} \]

where s is the number of training samples. The training of an MLP involves many variables and functions; the fitness of a wolf for the EGWO algorithm is taken as the average MSE, as calculated by Eq (30):

\[ F(\mathbf{x}) = \overline{\mathrm{MSE}} \tag{30} \]

The test error is the average error of the model on the test set, which measures the model’s generalization ability. In practice, the test error should be as small as possible.

Experimental results and discussion

To further analyze the variables that affect student achievement, SPSS software is used to analyze the dataset and obtain the variable importance; the results are shown in Fig 7. The selected 30 variables differ in their importance to the final output. According to the analysis of the UCI dataset, the proportion of girls in the survey reaches 53%, as shown in Fig 8(a). As seen in Fig 8(b), most students' home addresses are in cities. As shown in Fig 8(c), relatively more of the surveyed students receive family support for education. The time students spend studying also determines their degree of knowledge acquisition: according to Fig 9(a), most students study for less than two hours every week. Guardians have a direct impact on students' living environment and, in turn, their learning environment; Fig 9(b) shows that the guardians of most students are mothers, accounting for 69%. The job distribution of students' parents is displayed in Fig 10: "other" jobs account for a large proportion, and service workers also account for a large share, reaching 26% for fathers and 26% for mothers. Variables of differing importance, such as sex, home address type, family educational support, study time, guardian, and parents' jobs, are selected to discuss the results.

Fig 8. Students’ own various characteristics.

(a) Proportion of students’ sex, (b) Proportion of students’ home address type, (c) Proportion of students’ receiving family education support.

https://doi.org/10.1371/journal.pone.0276943.g008

Fig 9. Study time and family guardian of the students.

(a) Proportion of students’ study time, (b) Proportion of students’ guardian.

https://doi.org/10.1371/journal.pone.0276943.g009

Fig 10. Job of the students’ parents.

(a) Job proportion of students' mothers, (b) Job proportion of students' fathers.

https://doi.org/10.1371/journal.pone.0276943.g010

Based on the above analysis, the EGWO-MLP student achievement prediction model takes 30 factors, including those discussed above, as inputs to forecast student achievement. To verify the optimization superiority of the EGWO algorithm, it is compared with other algorithms, including the Advanced Grey Wolf Optimization algorithm (AGWO) [19], Particle Swarm Optimization (PSO) [72], the Genetic Algorithm (GA) [73], the Bat Algorithm (BA) [74], Differential Evolution (DE) [75], and the Sine Cosine Algorithm (SCA) [76]. The resulting models are named AGWO-MLP, PSO-MLP, GA-MLP, BA-MLP, DE-MLP, and SCA-MLP. The EGWO-MLP model is therefore compared with the basic GWO-MLP, AGWO-MLP, PSO-MLP, GA-MLP, BA-MLP, DE-MLP, and SCA-MLP models. At the end of this process, the evaluated test set covers the entire dataset, with the various algorithms producing prediction results for the same MLP model. In the statistical test results, "Do Not Test" occurs for a comparison when no significant difference is found between the two rank sums that enclose that comparison.
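RM ANOVA on ranks is closely related to the Friedman test; a hedged sketch of such a rank-based comparison over the twenty runs (synthetic placeholder errors, with SciPy standing in for the statistics package actually used) is:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
# Placeholder: twenty test errors per model, one per independent run
egwo, gwo, pso = (rng.normal(loc=m, scale=0.01, size=20) for m in (0.10, 0.12, 0.13))

stat, p = friedmanchisquare(egwo, gwo, pso)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")  # p < 0.05: the models' ranks differ significantly
```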

Discussion 1: Mathematics (Mat.)

For the Mathematics (Mat.) subject achievement prediction, Tables 3 and 4 show that the EGWO-MLP model obtains the best results among the swarm intelligence optimization algorithms, although it is worse than the evolutionary algorithms. According to the standard deviation, however, EGWO-MLP attains the smallest standard deviation (std) in the training error rate. Thanks to the strong exploration and local stagnation avoidance of EGWO, it can feed back weights and biases to predict student achievement more effectively than the basic GWO-MLP. To further verify the performance of the EGWO-MLP model and its difference from the other algorithm-MLP models, we apply the RM ANOVA on ranks test; the results in Table 5 demonstrate that EGWO-MLP is superior to the AGWO-MLP, GWO-MLP, and GA-MLP models. In general, EGWO-MLP has a significant advantage in predicting student achievement.

Table 3. The training error of the Mathematics (Mat.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t003

Table 4. The test error of the Mathematics (Mat.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t004

Table 5. Results for RM ANOVA on RANKS of the Mathematics (Mat.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t005

Discussion 2: Portuguese (Por.)

For the Portuguese (Por.) subject, Tables 6 and 7 show that EGWO-MLP outperforms most models based on swarm intelligence optimization algorithms during training. However, owing to the distinctive evolutionary characteristics of evolutionary algorithms, it is difficult for EGWO-MLP to surpass their optimization models; for example, the differential mutation strategy of DE enhances exploration and avoids local stagnation during optimization. During testing, the compared models struggle to achieve stable optimization, whereas EGWO-MLP obtains the lowest test error and standard deviation. The statistical test results in Table 8 show that EGWO-MLP outperforms most of the compared models and has strong stability.

Table 6. The training error for the Portuguese (Por.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t006

Table 7. The test error for the Portuguese (Por.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t007

Table 8. Results for RM ANOVA on RANKS of the Portuguese (Por.) subject achievement prediction.

https://doi.org/10.1371/journal.pone.0276943.t008

To sum up, this section selects two subjects (Mathematics (Mat.) and Portuguese (Por.)) to train and test the model. The experimental results show that EGWO-MLP is better than the compared swarm intelligence optimization models. Owing to their distinctive evolution strategies, the models trained by typical evolutionary algorithms are difficult to outperform during training. However, the testing process shows that EGWO-MLP is more stable and effective and can outperform the compared models. The shrinking, resilient surrounding, and weighted candidate mechanisms determine the wolf positions by updating the step direction and length, which helps accelerate convergence, avoid local stagnation, and balance exploration and exploitation. These advantages ensure that EGWO can effectively optimize the weights and biases of the MLP, enabling the EGWO-MLP model to solve high-dimensional complex problems and analyze large amounts of data. To prove the experiment's validity, the results are statistically analyzed, demonstrating that the EGWO-MLP model is effective in dealing with student achievement prediction. The analysis of the EGWO-MLP model shows that the selected thirty inputs are conducive to predicting student achievement; their importance weights are adapted through EGWO, avoiding the effect of subjective weight assignment on the final prediction results.

Suggestions

Through the model construction in the above sections and the analysis and discussion of the experimental results, the following implications for promoting student performance and teaching effectiveness emerge:

Firstly, with the rapid development of modern information, communication, and computer technology, the scope, depth, and scale of database applications are expanding, and big data mining and analysis can benefit educational institutions at all levels. Currently, the data stored in schools' management systems is used at a relatively primary stage: generally, only simple queries and statistical tables are provided, while a large amount of information affecting students' learning remains untapped. Data mining and analysis through the EGWO-MLP model can make full use of the available data to reveal the correlations between student performance and family, individual, and school factors; more importantly, it can provide a basis for school decision-makers and help them more comprehensively monitor and regulate the factors affecting teaching quality to ensure the quality of education.

Secondly, the EGWO-MLP model is generated by mining the main factors affecting student achievement in a subject. Through the analysis of learners' characteristics, we can understand the learning environment, cognitive factors, and learning ability of different individuals. On this basis, teachers can provide personalized teaching content and methods according to group differences and learners' characteristics, giving them a basis for adjusting teaching strategies to students' aptitudes. This method can be applied to other subjects so that students maintain a good learning state and improve their overall learning effect.

Thirdly, the EGWO-MLP model proposed in this paper can predict and warn students, which helps teachers manage and support them. At the same time, the prediction results can enhance students' self-learning awareness and improve teachers' teaching quality. The current study can help teachers predict student achievement, reflect on teaching performance, and provide technical analysis strategies and management recommendations for high-quality teacher training. It can also prevent teachers from selectively ignoring students with a poor foundation, which would make the teaching process unfair.

Fourthly, during the COVID-19 epidemic, online teaching became more widespread, making it difficult to ensure students' performance and effectively guide their study or training. Meanwhile, online learning achievement prediction relies mainly on structured data, which makes it difficult to mine learners' states, emotions, and other information deeply and accurately, affecting prediction accuracy. Therefore, inspired by this paper, swarm intelligence technology can be combined with neural networks to improve prediction accuracy and education quality.

Conclusion and future work

Under the COVID-19 pandemic, changes in curriculum arrangements and teachers' teaching methods have made student academic achievement and performance a focus of education. To effectively manage students' learning factors and efficiently guide students in their learning, a direct and effective way to predict student performance is needed. The improvement of student performance and ability is a critical issue in education. The analysis of the effect of family features and personal characteristics on student achievement and performance shows that student achievement prediction is a non-linear, high-dimensional, and complex practical problem. Following the NFL theorem, existing prediction methods fail to predict student achievement and performance well because they cannot fully cover the influencing factors. To predict student performance more accurately, this paper builds a prediction model based on the MLP.

The MLP is a method for solving high-dimensional complex problems and has been applied to predicting student achievement in previous research and educational practice. Since MLP training tends to fall into local stagnation, it is challenging to obtain the optimal solution during the optimization process and thus achieve good optimization results. Therefore, a swarm intelligence optimization algorithm is introduced to optimize the weights and biases of the MLP.

To solve the above problems and obtain better experimental results, this paper proposes the Elastic Grey Wolf Optimization algorithm (EGWO), a variant of the grey wolf optimization algorithm. EGWO is integrated with the MLP to optimize its weights and biases and predict student achievement effectively. The contributions can be summarized as follows:

  • Shrinking and resilient surrounding mechanisms compute the positions of the α, β, and δ wolves to enhance exploration.
  • Due to the introduction of the weighted candidate mechanism, the hunting operation is not limited to the α wolf. The occurrence of candidate wolves avoids local stagnation.
  • The proposed EGWO algorithm optimizes the weights and biases of the MLP to obtain accurate prediction values.
  • The EGWO-MLP model is proposed to predict student achievement with reduced test error. It can fully mine the data and make full use of the extracted information for prediction.

To verify the superiority of the proposed EGWO-MLP model, its achievement predictions for Mathematics (Mat.) and Portuguese (Por.) are compared with those of the AGWO-MLP, PSO-MLP, GA-MLP, BA-MLP, DE-MLP, and SCA-MLP models. The evaluation criteria include the training error (MSE) and the test error. The experimental results show that the EGWO-MLP model has lower test error and standard deviation, demonstrating that it can effectively predict student achievement. Corresponding suggestions and countermeasures are put forward through the analysis of the experimental results.

The experiments show that artificial neural networks have clear advantages in student performance prediction, which can support the effective management and cultivation of students. However, some limitations influence the prediction accuracy, such as the amount of data and the feature attributes available in the UCI dataset. In the future, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks could be selected to predict student performance along a timeline. At the same time, more effective swarm intelligence techniques could be chosen to optimize the neural network structure and adjust parameters to improve prediction accuracy.

References

1. Kader A A. Locus of control, self-efficacy, and student performance in an introductory economics course. International Review of Economics Education, 2022, 39: 100234.
2. Wright N A, Arora P. A for effort: Incomplete information and college students' academic performance. Economics of Education Review, 2022, 88: 102238.
3. Fernandez-Perez V, Martin-Rojas R. Emotional competencies as drivers of management students' academic performance: The moderating effects of cooperative learning. The International Journal of Management Education, 2022, 20(1): 100600.
4. Hutain J, Michinov N. Improving student engagement during in-person classes by using functionalities of a digital learning environment. Computers & Education, 2022, 183: 104496.
5. Demir E K. The role of social capital for teacher professional learning and student achievement: A systematic literature review. Educational Research Review, 2021, 33: 100391.
6. Aldowah H, Al-Samarraie H, Fauzy W M. Educational data mining and learning analytics for 21st century higher education: A review and synthesis. Telematics and Informatics, 2019, 37: 13–49.
7. Şen B, Uçar E, Delen D. Predicting and analyzing secondary education placement-test scores: A data mining approach. Expert Systems with Applications, 2012, 39(10): 9468–9476.
8. Myachin A. Analysis of global data education and patent activity using new methods of pattern analysis. Procedia Computer Science, 2014, 31: 468–473.
9. Gao L, Zhao Z, Li C, et al. Deep cognitive diagnosis model for predicting students' performance. Future Generation Computer Systems, 2022, 126: 252–262.
10. Waheed H, Hassan S U, Aljohani N R, et al. Predicting academic performance of students from VLE big data using deep learning models. Computers in Human Behavior, 2020, 104: 106189.
11. Daumiller M, Fasching M S, Steuer G, et al. From teachers' personal achievement goals to students' perceptions of classroom goal structures: Via student-oriented goals and specific instructional practices. Teaching and Teacher Education, 2022, 111: 103617.
12. Hwang N Y, Kisida B, Koedel C. A familiar face: Student-teacher rematches and student achievement. Economics of Education Review, 2021, 85: 102194.
13. Igboji J O, Umoke M J, Obande-Ogbuinya N E, et al. Perception of Head Teachers and Education Secretaries on Home Grown School Feeding Program in Nigeria. SAGE Open, 2022, 12(2): 21582440221095029.
14. Gore J M, Miller A, Fray L, et al. Improving student achievement through professional development: Results from a randomised controlled trial of Quality Teaching Rounds. Teaching and Teacher Education, 2021, 101: 103297.
15. Chen D, Ning B, Bos W. Relationship between Principal Leadership Style and Student Achievement: A Comparative Study between Germany and China. SAGE Open, 2022, 12(2): 21582440221094601.
16. Poortman C L, Schildkamp K. Solving student achievement problems with a data use intervention for teachers. Teaching and Teacher Education, 2016, 60: 425–433.
17. Liu C, Wang H, Du Y, Yuan Z. A Predictive Model for Student Achievement Using Spiking Neural Networks Based on Educational Data. Applied Sciences, 2022, 12(8): 3841.
18. Asteris P G, Mokos V G. Concrete compressive strength using artificial neural networks. Neural Computing and Applications, 2020, 32(15): 11807–11826.
19. Meng X, Jiang J, Wang H. AGWO: Advanced GWO in multi-layer perception optimization. Expert Systems with Applications, 2021, 173: 114676.
20. Baashar Y, Alkawsi G, Mustafa A, et al. Toward predicting student's academic performance using artificial neural networks (ANNs). Applied Sciences, 2022, 12(3): 1289.
21. Li X D, Wang J S, Hao W K, et al. Multi-layer perceptron classification method of medical data based on biogeography-based optimization algorithm with probability distributions. Applied Soft Computing, 2022, 121: 108766.
22. Jiang J, Meng X, Liu Y, et al. An Enhanced TSA-MLP Model for Identifying Credit Default Problems. SAGE Open, 2022, 12(2): 21582440221094586.
23. Oyelade O N, Ezugwu A E-S, Mohamed T I A, Abualigah L. Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 2022, 10: 16150–16177.
24. Khodadadi N, Gharehchopogh F S, Mirjalili S. MOAVOA: a new multi-objective artificial vultures optimization algorithm. Neural Computing and Applications, 2022, 1–39.
25. Mavrovouniotis M, Li C, Yang S. A survey of swarm intelligence for dynamic optimization: Algorithms and applications. Swarm and Evolutionary Computation, 2017, 33: 1–17.
26. Jaafari A, Panahi M, Mafi-Gholami D, et al. Swarm intelligence optimization of the group method of data handling using the cuckoo search and whale optimization algorithms to model and predict landslides. Applied Soft Computing, 2022, 116: 108254.
27. Moayedi H, Mehrabi M, Mosallanezhad M, et al. Modification of landslide susceptibility mapping using optimized PSO-ANN technique. Engineering with Computers, 2019, 35(3): 967–984.
28. Mirjalili S, Mirjalili S M, Lewis A. Grey wolf optimizer. Advances in Engineering Software, 2014, 69: 46–61.
29. Faris H, Aljarah I, Al-Betar M A, et al. Grey wolf optimizer: a review of recent variants and applications. Neural Computing and Applications, 2018, 30(2): 413–435.
30. Nadimi-Shahraki M H, Taghian S, Mirjalili S. An improved grey wolf optimizer for solving engineering problems. Expert Systems with Applications, 2021, 166: 113917.
31. Mirjalili S. How effective is the Grey Wolf optimizer in training multi-layer perceptrons. Applied Intelligence, 2015, 43(1): 150–161.
32. Mosavi A, Samadianfard S, Darbandi S, et al. Predicting soil electrical conductivity using multi-layer perceptron integrated with grey wolf optimizer. Journal of Geochemical Exploration, 2021, 220: 106639.
33. Moniz N, Monteiro H. No Free Lunch in imbalanced learning. Knowledge-Based Systems, 2021, 227: 107222.
34. Woo J, Rajagopalan P, Andamon M M. An evaluation of measured indoor conditions and student performance using d2 Test of Attention. Building and Environment, 2022, 214: 108940.
35. Vignery K. From networked students centrality to student networks density: What really matters for student performance? Social Networks, 2022, 70: 166–186.
36. Khan A, Ghosh S K, et al. Random wheel: An algorithm for early classification of student performance with confidence. Engineering Applications of Artificial Intelligence, 2021, 102: 104270.
37. Wu H, Shen J. The association between principal leadership and student achievement: A multivariate meta-meta-analysis. Educational Research Review, 2021: 100423.
38. Masci C, De Witte K, Agasisti T. The influence of school size, principal characteristics and school management practices on educational performance: An efficiency analysis of Italian students attending middle schools. Socio-Economic Planning Sciences, 2018, 61: 52–69.
39. Lim H, Kim S, Chung K M, et al. Is college students' trajectory associated with academic performance? Computers & Education, 2022, 178: 104397.
40. Silva A D, Vautero J, Usssene C. The influence of family on academic performance of Mozambican university students. International Journal of Educational Development, 2021, 87: 102476.
41. Sarfraz M, Khawaja K F, Ivascu L. Factors affecting business school students' performance during the COVID-19 pandemic: A moderated and mediated model. The International Journal of Management Education, 2022, 20(2): 100630.
42. Javadizadeh B, Aplin-Houtz M, Casile M. Using SCARF as a motivational tool to enhance students' class performance. The International Journal of Management Education, 2022, 20(1): 100594.
43. Rivas A, Gonzalez-Briones A, Hernandez G, et al. Artificial neural network analysis of the academic performance of students in virtual learning environments. Neurocomputing, 2021, 423: 713–720.
44. Li M, Zhang Y, Li X, Cai L, Yin B. Multi-view hypergraph neural networks for student academic performance prediction. Engineering Applications of Artificial Intelligence, 2022, 114: 105174.
45. Bertolini R, Finch S J, Nehm R H. Quantifying variability in predictions of student performance: Examining the impact of bootstrap resampling in data pipelines. Computers and Education: Artificial Intelligence, 2022, 3: 100067.
46. Wu Z, Huang L, Huang Q, Huang C, Tang Y. SGKT: Session graph-based knowledge tracing for student performance prediction. Expert Systems with Applications, 2022, 206: 117681.
47. Pallathadka H, Wenda A, Ramirez-Asís E, Asís-López M, Flores-Albornoz J, Phasinam K. Classification and prediction of student performance data using various machine learning algorithms. Materials Today: Proceedings, 2021.
48. Tomasevic N, Gvozdenovic N, Vranes S. An overview and comparison of supervised data mining techniques for student exam performance prediction. Computers & Education, 2020, 143: 103676.
49. Harwell M R, Zhao Q. An empirical example of capturing the impact of SES on student achievement using path analysis. International Journal of Educational Research, 2021, 105: 101715.
50. Sanfo J B M B. Connecting family, school, gold mining community and primary school students' reading achievements in Burkina Faso – A three-level hierarchical linear model analysis. International Journal of Educational Development, 2021, 84: 102442.
51. Kim Y, Mok S Y, Seidel T. Parental influences on immigrant students' achievement-related motivation and achievement: A meta-analysis. Educational Research Review, 2020, 30: 100327.
52. Tygret J A. The influence of student teachers on student achievement: A case study of teacher perspectives. Teaching and Teacher Education: An International Journal of Research and Studies, 2017, 66(1): 117–126.
53. Kaiser J, Retelsdorf J, Südkamp A, et al. Achievement and engagement: How student characteristics influence teacher judgments. Learning and Instruction, 2013, 28: 73–84.
54. Doyle A, Li L. Family-Focused Early Learning Programing: Access, Opportunities, and Issues in one Canadian Context. SAGE Open, 2021, 11(4): 21582440211046943.
55. Swain J M, Cara O. Changing the home literacy environment through participation in family literacy programmes. Journal of Early Childhood Literacy, 2019, 19(4): 431–458.
56. Yu X, Wu X. Ensemble grey wolf Optimizer and its application for image segmentation. Expert Systems with Applications, 2022, 209: 118267.
57. Deep K, et al. A random walk Grey wolf optimizer based on dispersion factor for feature selection on chronic disease prediction. Expert Systems with Applications, 2022, 206: 117864.
58. Jiang J, Zhao Z, Liu Y, Li W, Wang H. DSGWO: An improved grey wolf optimizer with diversity enhanced strategy based on group-stage competition and balance mechanisms. Knowledge-Based Systems, 2022, 109100.
59. Banerjee N, Mukhopadhyay S. AP-TLB-IGWO: Adult-pup teaching-learning based interactive grey wolf optimizer for numerical optimization. Applied Soft Computing, 2022, 109000.
60. Hoballah A, Azmy A M. Constrained economic dispatch following generation outage for hot spinning reserve allocation using hybrid grey wolf optimizer. Alexandria Engineering Journal, 2022, 169–180.
61. Adhikary J, Acharyya S. Randomized Balanced Grey Wolf Optimizer (RBGWO) for solving real life optimization problems. Applied Soft Computing, 2022, 117: 108429.
62. Singh S, Bansal J C. Mutation-driven grey wolf optimizer with modified search mechanism. Expert Systems with Applications, 2022, 194: 116450.
63. Shahjalal M, Farhana N, Roy P, Razzaque M A, Kaur K, Hassan M M. A Binary Gray Wolf Optimization algorithm for deployment of Virtual Network Functions in 5G hybrid cloud. Computer Communications, 2022, 193: 63–74.
64. Nadimi-Shahraki M H, Taghian S, Mirjalili S, Zamani H, Bahreininejad A. GGWO: Gaze cues learning-based grey wolf optimizer and its applications for solving engineering problems. Journal of Computational Science, 2022, 61: 101636.
65. Wang C, Zhao L, Li X, Li Y. An improved grey wolf optimizer for welding shop inverse scheduling. Computers & Industrial Engineering, 2022, 163: 107809.
66. Turkoglu B, Kaya E. Training multi-layer perceptron with artificial algae algorithm. Engineering Science and Technology, an International Journal, 2020, 23(6): 1342–1350.
67. Gits-Muselli M, Campagne P, Desnos-Ollivier M, et al. Comparison of multilocus sequence typing (MLST) and microsatellite length polymorphism (MLP) for Pneumocystis jirovecii genotyping. Computational and Structural Biotechnology Journal, 2020, 18: 2890–2896. pmid:33163149
68. Cortez P, Silva A M G. Using data mining to predict secondary school student performance. 2008, 5–12.
69. Meda-Campaña J A. On the estimation and control of nonlinear systems with parametric uncertainties and noisy outputs. IEEE Access, 2018, 6: 31968–31973.
70. Hernández G, Zamora E, Sossa H, et al. Hybrid neural networks for big data classification. Neurocomputing, 2020, 390: 327–340.
71. Chiang H S, Chen M Y, Huang Y J. Wavelet-based EEG processing for epilepsy detection using fuzzy entropy and associative petri net. IEEE Access, 2019, 7: 103255–103262.
72. Mirjalili S A, Hashim S Z M, Sardroudi H M. Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm. Applied Mathematics and Computation, 2012, 218(22): 11125–11137.
73. Singh K J, De T. MLP-GA based algorithm to detect application layer DDoS attack. Journal of Information Security and Applications, 2017, 36: 145–153.
74. Ghanem W A H M, Jantan A. A new approach for intrusion detection system based on training multilayer perceptron by using enhanced Bat algorithm. Neural Computing and Applications, 2020, 32(15): 11665–11698.
75. Wang Y, Li H X, Huang T, et al. Differential evolution based on covariance matrix learning and bimodal distribution parameter setting. Applied Soft Computing, 2014, 18: 232–247.
76. Mirjalili S. SCA: a sine cosine algorithm for solving optimization problems. Knowledge-Based Systems, 2016, 96: 120–133.