Predictors of Students’ Mathematics Achievement in Secondary Education*

* Support: Universidade Federal de Minas Gerais.

Abstract

Acknowledging the relevance of mathematics education, as well as the evidence about predictors of achievement in this domain, the present study performed a predictive analysis of students’ mathematics achievement in the National Exam for Secondary Education, employing the Regression Tree Method and a model with 53 predictors. Results indicated that the model explained 29.97% of the variance in mathematics achievement. Certain variables are related to worse achievement in mathematics: a monthly family income equal to or less than two minimum wages, being female, not having attended private schools in primary and secondary education, living in the North, Northeast and Center-West regions of Brazil, and being highly motivated to take the exam in order to obtain a secondary education certificate or a scholarship. The results highlight the role of variables related to the individual, the school and the family as predictors of mathematics achievement.

Keywords:
mathematics; secondary education; prediction; regression tree; National Exam for Secondary Education

Resumo

Considering the relevance of mathematics education, as well as the evidence on predictors related to achievement in this domain, the present study carried out a predictive analysis of the mathematics achievement of those registered for the 2011 National Exam for Secondary Education, employing the Regression Tree approach and a model with 53 predictors. The results indicate that the model explained 29.97% of the variance in mathematics achievement in the test sample. Certain variables are related to worse achievement in mathematics: a family income of up to two minimum wages, being female, not having attended private schools in primary and secondary education, living in the North, Northeast and Center-West regions, and being highly motivated to take the exam in order to obtain certification or a scholarship. The results highlight the role of variables related to the individual, the school and the family as predictors of mathematics achievement.

Palavras-chave:
mathematics; secondary education; prediction; regression tree; National Exam for Secondary Education

Access to education is considered an essential requirement for survival in modern competitive society. It is no coincidence that the economic growth of a nation is associated with the development of its citizens’ cognitive skills (Hanushek & Woessmann, 2011). Competence in mathematics, for example, has been identified in European countries as highly relevant for personal achievement, the full exercise of citizenship, social inclusion and employability in the 21st century. Nevertheless, the decline in the number of students interested in mathematics, science and technology, and the male predominance in these areas, have worried European policymakers, educators and researchers (European Commission, 2011).

In Brazil, students’ low performance in mathematics has been evidenced in large-scale assessments, such as the National Exam for Secondary Education (ENEM), the Basic Education Assessment System (SAEB) and the Programme for International Student Assessment (PISA). According to the results of the SAEB tests administered in 2015, the average mathematics proficiency of Brazilian secondary school students was 267 points, the worst result since 1995, when the historical performance series for this assessment began. This scenario calls for a review of social and educational public policies, as well as innovation in teaching practices and an in-depth discussion of teacher professional development in mathematics and of teaching conditions. To substantiate this review, in turn, one needs to identify which factors (predictors) have affected students’ mathematics performance.

The literature has mainly pointed to individual, family, and school characteristics as the three categories of predictors most closely related to performance in mathematics (Akben-Selcuk, 2017; Karakolidis et al., 2016; Lee & Stankov, 2013; Pangeni, 2014; Thien & Ong, 2015). At the individual and family level, studies highlight the contribution of variables such as gender, socioeconomic status, attendance of early childhood education, and self-concept, self-efficacy and anxiety in relation to mathematics. With regard to the school, the average socioeconomic level of the student body, the school management environment, and the availability of appropriate material resources and school equipment stand out.

Several studies support the argument that these three classes of predictors (individual, family and school) help explain student performance in mathematics (Akben-Selcuk, 2017; Karakolidis et al., 2016; Laros et al., 2010; Lee & Stankov, 2013; Martin & Lazendic, 2018; Pangeni, 2014; Pinto et al., 2016; Pipere & Mierina, 2017; Thien & Ong, 2015). For example, Karakolidis et al. (2016) analyzed data from the Programme for International Student Assessment (PISA) for Greece and found that gender, attendance of early childhood education, self-concept, self-efficacy and anxiety level in mathematics, as well as the socioeconomic status of the student and of the school, are predictors of performance in mathematics. Similarly, Thien and Ong (2015) analyzed PISA data for Singapore and Malaysia and observed that the student’s socioeconomic level, self-efficacy and anxiety level in mathematics predict performance in mathematics in both countries, while the school’s socioeconomic level predicts performance in mathematics only in the case of Malaysian students.

Lee and Stankov (2013) investigated the extent to which academic self-beliefs, motivation, learning strategies, and attitudes towards school predict performance in mathematics, analyzing PISA data from 41 countries. They found evidence that self-efficacy and self-concept are relevant predictors of student performance. Akben-Selcuk (2017), in turn, specifically analyzed the performance of Turkish students on the PISA mathematics test and found that gender, age, socioeconomic status, school resources, intrinsic motivation and personality variables, such as attribution of failure to external causes and openness to problem-solving activities, play a predictive role. The findings of Pangeni (2014) draw attention to the contribution of family and school factors to performance in mathematics. The results revealed that parents’ educational level, number of books in the home, parental support with school tasks, teacher training, number of school days and physical facilities predict the mathematics performance of a sample of 762 secondary education students in Nepal.

In the Brazilian context, the evidence also corroborates the assertion that the three classes of predictors (individual, family and school) are relevant to explain student performance in mathematics. Laros et al. (2010), for example, identified which student and school characteristics are associated with mathematics performance in secondary education. The results pointed to socioeconomic level, cultural resources, parental demands and encouragement, disciplinary climate and collaborative work in school. In an integrative review, Pinto et al. (2016) identified the main factors that influence the mathematics results of Brazilian and Portuguese students on PISA. The socioeconomic context, the educational system (e.g., high retention rates, educational inequality, school dropout) and school characteristics (public or private, school culture and teacher performance) were considered the factors most closely related to performance in mathematics.

In summary, the studies present evidence that a set of predictors is relevant to explain performance in mathematics. Identifying such predictors is neither a trivial nor an irrelevant scientific task, because knowledge about them yields well-founded information about which factors are associated with worse or better performance. Furthermore, data on the predictors permit the design of empirically grounded public policies, capable of acting on well-identified factors. Schwartzman et al. (2017) argue that educational policies and practices have rarely been based on empirical evidence. In developing countries like Brazil, this need is urgent in view of the conditions of social, economic, and educational inequality. Evidence-based policies and practices may have a greater chance of being effective and of achieving the goal of advancing the educational development of the Brazilian people.

In this sense, the purpose of this study was to investigate a large number of predictors of performance in mathematics in secondary education by applying a nonparametric data analysis technique, the regression tree method. The technique was chosen in particular to identify nonlinear relationships between the study variables that are not easily detected by the usual methods of data analysis, such as multiple regression or hierarchical regression analysis. This study goes beyond verifying the extent to which the predictors can explain the variance of performance in mathematics. The proposed predictive analysis focuses on providing a “map” of interactions between the study variables, with a view to deepening the theoretical knowledge about performance in mathematics in secondary education. Accordingly, the outcome variable in this study is the participants’ performance in mathematics on the 2011 National Exam for Secondary Education (ENEM), and the predictors are a wide set of 53 variables from the microdata of that edition of the exam.

Next, we present the reasons for choosing to analyze ENEM data. Given its current scope, the ENEM can be seen as a public policy capable of supporting the improvement of the Brazilian educational system at the level of basic education, particularly in secondary education. The exam was established by Ordinance No. 438 of May 28, 1998 (Ministry of Education [MEC]/National Institute of Educational Studies and Research Anísio Teixeira [INEP], 1998), with the purpose of evaluating the “competencies and skills developed by the examinees throughout elementary and secondary school, essential to academic life, the world of work and the exercise of citizenship” (art. 2). It was structured as an individual assessment of skills development, with interdisciplinarity and the contextualization of knowledge, expressed in the form of problem situations, as structuring axes.

The creation of the University for All program (ProUni), with the consequent granting of scholarships in private higher education institutions based on the ENEM score, and the reformulation introduced in the exam in 2009, associated with the implementation of the Unified Selection System (SiSU), gave ENEM another goal: to serve as the selection process for access to Brazilian higher education (MEC/INEP, 2013). This caused the number of registrants to jump from 157.2 thousand, upon its creation in 1998, to more than 8.6 million in 2016. It is the second-largest higher education entrance exam in the world in terms of number of registrations, behind only the Gaokao, the exam created in China in 1952 to select students for the universities of that country. Thus, studies that analyze students’ performance on exams the size of ENEM offer contributions beyond the Brazilian context, as they serve as a basis for comparison with similar results from other countries.

After the changes introduced in ENEM in 2009, in addition to the traditional writing test, the exam went from 63 to 180 objective questions, equally distributed in four areas of knowledge: (a) Languages, Codes and their Technologies (including writing), (b) Humanities and their Technologies, (c) Natural Sciences and their Technologies, and (d) Mathematics and its Technologies. With regard to Mathematics and its Technologies, the reference matrix of the exam, in force since 2009, relates five cognitive axes, common to all four areas evaluated in the exam, to seven area-specific competencies, 30 skills and objects of knowledge specific to mathematics. Coping with problem situations, understanding phenomena and building arguments feature prominently in the reference matrix. The seven competencies in the area of mathematics refer to contents present in basic education and are organized by thematic blocks: numbers, geometry, algebra, quantities and measures, mathematical modeling, information treatment and knowledge of statistics and probability. The objects of knowledge, in turn, are subdivided into five topics: numerical knowledge; geometric knowledge; statistical knowledge and probability; algebraic knowledge; and algebraic/geometric knowledge (Rabelo, 2013). Overall, mathematics questions are presented in the form of problem situations, which the participant must solve by mobilizing cognitive and conceptual knowledge acquired throughout basic education.

Today, almost all Brazilian federal higher education institutions (HEIs) use the scores students obtained on ENEM as a criterion to select candidates for their undergraduate programs. The exam has become the most influential and relevant assessment in Brazilian society. By replacing the traditional entrance exam in most federal higher education institutions across the country, it has come to be considered a tool to democratize access to university because, with a single selection process and the payment of a single registration fee, candidates can compete for places in undergraduate programs throughout the national territory. The INEP, responsible for operating the ENEM, holds a range of information about registrants and participants in the exam. Some of these data are collected directly at registration, while data on test performance are collected and recorded after the test has been completed and scored. This entire set of information composes the so-called ENEM microdata, which store important data about the participants’ performance and about their demographic and socioeconomic characteristics, in addition to indicating whether the registrant has any disability that requires taking the test under special conditions and with differentiated care (MEC/INEP, 2012). INEP provides free access to these microdata, permitting further exploration of the different years in which the ENEM has been administered. Specifically for the 2011 edition, Gomes et al. (2016, 2018) found evidence, using factor analyses, that corroborates the validity of the four domains evaluated in the exam.

The ENEM data are well suited to investigating performance in mathematics because the microdata contain information on a number of variables recognized in the literature as associated with that performance. Examples include gender, socioeconomic status (as measured by monthly individual and family income, parents’ level of education, place of residence, having a paid job, etc.), the type of school in which the candidate attended elementary and secondary education, and the location and operational status of the school and of the applicant, among other aspects. In view of the above, this study presents the results of a predictive analysis that incorporates a model with 53 predictors, all taken from the ENEM 2011 microdata, with the candidates’ performance in mathematics on the ENEM 2011 as the outcome variable.

Method

Participants

The data analysis was based on the data of the ENEM 2011 candidates who took all tests, including the writing test, and who answered the socioeconomic questionnaire. These conditions yielded a sample of 3,670,089 participants. Among the most striking sociodemographic characteristics of this sample, 59.51% were female, 55.23% reported having completed secondary education, and 91.24% reported having completed or being in the process of completing secondary education through regular education, while 86.23% reported being single, 43.51% white, 39.52% brown (pardo) and 11.57% black. In turn, 75.07% reported attending or having attended secondary education in the state school network, while 21.84% were attending or had attended a private school; 97.58% did so in urban areas. In addition, 74.63% of the participants indicated a monthly family income of up to 2 minimum wages, while 56.54% reported no income. Pursuing higher education and obtaining a scholarship were the two reasons that most strongly drove the participants of this sample to take the ENEM. On a scale from 0 to 5, with 5 indicating the strongest motivation, 90.60% selected the maximum score for the motivation to take the ENEM as a way to pursue higher education, while 82.81% selected score 5 for the motivation to take the ENEM as a way to get a scholarship.

Variables of ENEM 2011 Microdata Used in the Study

The predictive variables used in this study come from the data blocks about the candidate, the candidate’s school and the socioeconomic questionnaire in the ENEM 2011 microdata (MEC/INEP, 2012). Not all variables in these blocks were selected, as some contain information on very specific populations. In particular, the variable related to test administration in the prison system was not included because, according to MEC/INEP (2012), the candidates who took the tests in this modality did not complete the socioeconomic questionnaire. We used 53 predictive variables in this study. These are listed in Table 1. The dependent variable is the students’ standardized score in the field of mathematics, expressed on a scale created by INEP and stored in the 2011 microdata. This standardized scale ranges from 0 to 1000 points, with a mean of 500 points and a standard deviation of 100 points (MEC/INEP, 2015).

Table 1
Predictive Variables Used in the Study

Data Collection Procedures

On its website (http://portal.inep.gov.br/web/guest/microdados), INEP publishes files with the ENEM microdata. It also provides a script to convert these files into the .sav format of the SPSS statistical software. After downloading the ENEM 2011 microdata files, we converted them into the .sav format. From these files, we selected only those students who were present on both days of the ENEM test and who answered the socioeconomic questionnaire. We then saved the files as objects in the R software (R Core Team, 2017) and carried out all data analyses in that software. The ENEM 2011 microdata used are in the public domain. Thus, the candidates’ privacy and anonymity were guaranteed.
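The selection rule described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and the field names (`present_day1`, `present_day2`, `answered_questionnaire`) are hypothetical placeholders, not the actual column names of the ENEM 2011 microdata.

```python
# Hypothetical record layout; the real ENEM 2011 microdata use different field names.
records = [
    {"id": 1, "present_day1": True, "present_day2": True, "answered_questionnaire": True},
    {"id": 2, "present_day1": True, "present_day2": False, "answered_questionnaire": True},
    {"id": 3, "present_day1": True, "present_day2": True, "answered_questionnaire": False},
    {"id": 4, "present_day1": True, "present_day2": True, "answered_questionnaire": True},
]


def is_eligible(rec):
    """Keep candidates present on both test days who answered the questionnaire."""
    return rec["present_day1"] and rec["present_day2"] and rec["answered_questionnaire"]


# Apply the inclusion rule and collect the identifiers of the retained cases.
eligible = [rec for rec in records if is_eligible(rec)]
eligible_ids = [rec["id"] for rec in eligible]
print(eligible_ids)
```

In this toy data set, only the candidates who satisfy all three conditions are retained for analysis.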

Data Analysis

We used the regression tree method and the CART (Classification and Regression Trees) algorithm, created by Breiman et al. (1984), to investigate the role of the predictive variables in explaining the outcome variable. The packages rpart (Therneau & Atkinson, 2015) and caret (Kuhn, 2017), both for the R software (R Core Team, 2017), were employed to perform the procedures involved in building the regression tree. Considering that the regression tree method and the CART algorithm are not among the best-known data analysis techniques in psychology, we briefly explain their logic. Further details on this approach are available in Gomes and Almeida (2017).

The CART algorithm is a machine learning technique that operationalizes the tree regression method (James et al., 2013James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer. https://doi.org/10.1007/978-1-4614-7138-7
https://doi.org/10.1007/978-1-4614-7138-...
). Concerning its functioning, it splits the data into pairs of distinct groups as many times as possible. The original sample, not yet divided by the algorithm, is called the root node, and the groups created are called tree nodes (Zhang & Singer, 2010). Each node in the tree can be partitioned, generating a new pair of nodes. The node that gives rise to a new pair is referred to as the parent node, and the nodes that do not generate any pair are referred to as leaves or terminal nodes. As can be seen, the whole vocabulary of the method alludes to a tree, with its root, nodes, and leaves (Lantz, 2015). The predictive models derived from this approach tend to fit the analyzed data too closely, which hampers the generalization of the prediction. The machine learning literature acknowledges this problem and calls it overfitting. This literature recommends randomly separating the data into two parts, a training sample and a test sample, precisely as a way of addressing overfitting (James et al., 2013). Following this recommendation, the microdata were randomly separated into two samples: training (75% of cases) and test (25% of cases). The allocation of 3/4 of the cases to the training sample and 1/4 to the test sample also followed the recommendation of the machine learning literature (James et al., 2013).
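
The random 75/25 separation described above can be illustrated with a minimal sketch. It is not the code used in the study; the function name `split_train_test`, the seed value, and the toy list of 1,000 case identifiers are assumptions made only for this example.

```python
import random


def split_train_test(cases, train_fraction=0.75, seed=2011):
    """Randomly split cases into a training and a test sample.

    The seed is a hypothetical value, fixed only so the sketch is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy data: 1,000 hypothetical case identifiers, not ENEM microdata.
train, test = split_train_test(range(1000))
print(len(train), len(test))  # 750 250
```

Because every case lands in exactly one of the two samples, the model never sees the test cases while the tree is grown.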

The initial strategy of the CART algorithm is to generate a tree with as many nodes as possible. Then, it “prunes” this tree, that is, it eliminates the nodes that worsen the prediction of the model. To identify the nodes that worsened the prediction and could be pruned, 3-fold cross-validation was applied to the training sample, as recommended in the machine learning literature for large samples (James et al., 2013; Lantz, 2015), and the cost-complexity criterion was inspected, whose function is to identify the number of leaves that best explains the variance of the outcome variable. To generate the final, already “pruned” tree, we used the parsimony criterion, that is, only a small number of leaves from the original tree was retained so that the tree could be interpreted easily and yield meaningful information on performance in mathematics (Rokach & Maimon, 2015). The predictive model was built on the training sample. Then, its predictive capacity was analyzed by applying it to predict the outcome variable in the test sample, as recommended in the literature (James et al., 2013; Lantz, 2015). For the analysis of the predictive capacity, we used R² as the reference, that is, the percentage of explained variance of the outcome variable.
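
As an illustration of the R² reference mentioned above, the following sketch computes the proportion of explained variance as one minus the ratio between the residual sum of squares and the total sum of squares. The function name and the four toy scores are assumptions for this example and do not come from the ENEM microdata.

```python
def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`."""
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variability around the overall mean.
    sst = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - sse / sst


# Toy scores (hypothetical): each prediction is the mean of a leaf.
observed = [400, 500, 600, 700]
predicted = [450, 450, 650, 650]
print(f"R² = {r_squared(observed, predicted):.2f}")  # R² = 0.80
```

The ratio `sse / sst` corresponds to the relative error of the model, so R² and the relative error always add up to 1 (or 100%).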

Results and Discussion

We chose to present the results and the discussion together because the regression tree method is not a usual or standard approach to data analysis in psychology and requires more extensive clarification on the interpretation and discussion of its results. Before turning to the final tree, some findings related to its construction need further description. The initial tree, still without “pruning”, had 32,569 leaves, that is, terminal nodes. Using 3-fold cross-validation, we identified that the tree with 2,830 leaves offered the best explanation of the variance in mathematics performance, with a prediction error of 61.58%. Nevertheless, this tree has a huge number of leaves, making it impossible to interpret its results and to generate meaningful knowledge about performance in mathematics. Using the parsimony criterion, a cutoff point that is very interesting from the interpretative point of view was found, involving the selection of only 20 leaves. This tree has a relative error of about 70.08% of the variance in the outcome variable, i.e., it loses about 8.50 percentage points of explained variance in relation to the best predictive model that can be obtained. In return, instead of thousands of leaves, this model has only 20 leaves and can be easily interpreted to produce meaningful results. For that reason, the results of this study derive from this tree. In terms of accuracy, the 20-leaf tree explains 29.92% (100% minus the prediction error of the model) of the variance in the participants’ performance in the mathematics domain of ENEM 2011 in the training sample, as well as 29.97% of the variance in the test sample, indicating a very similar result in both samples.
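
The arithmetic linking the reported errors to the explained variance can be checked with a short sketch. It only restates the percentages given in the text (61.58% and 70.08%); the function name is an assumption introduced for this example.

```python
def explained_variance_pct(relative_error_pct):
    """Explained variance is 100% minus the relative prediction error."""
    return 100.0 - relative_error_pct


best_tree = explained_variance_pct(61.58)      # tree with 2,830 leaves
parsimonious = explained_variance_pct(70.08)   # tree with 20 leaves
loss = best_tree - parsimonious                # cost of choosing parsimony

print(f"{best_tree:.2f} {parsimonious:.2f} {loss:.2f}")  # 38.42 29.92 8.50
```

In other words, moving from 2,830 leaves to 20 leaves costs 8.50 percentage points of explained variance in the training sample.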

Of the 53 predictive variables used by the CART algorithm, only seven were used to construct the 20-leaf tree. On a scale from 0 to 100 points provided by the rpart package, the importance of these variables to predict the outcome variable is as follows: [1] Q4. Monthly Family Income (36 points), [2] Q30. What type of school did you attend in elementary school? (15 points), [3] Q33. What type of school did you attend in secondary education? (14 points), [4] Gender (nine points), [5] Q27. Indicate what prompted you to participate in ENEM: get a scholarship, [6] Unit of the federation of residence (three points), [7] Q26. Indicate what prompted you to participate in ENEM: get secondary education certification or speed up my studies (two points).

Regarding these variables, it is important to note that a considerable part of them belongs to the list of predictors of student performance in mathematics well recognized by the literature in the area (Akben-Selcuk, 2017; Karakolidis et al., 2016; Laros et al., 2010; Pangeni, 2014; Pinto et al., 2016; Thien & Ong, 2015), as is the case of family income, sex, type of school and the region where the student lives. In this sense, the results of the 20-leaf tree corroborate findings from the literature on predictors of performance in mathematics. Figure 1 shows the 20-leaf tree. Due to space limitations, it is not possible to describe all of the nodes presented. Nevertheless, we show below how to read the tree in the figure, so that the reader can follow and understand the information contained in all the nodes and leaves that were generated. Then, we summarize the most important results.

Figure 1.
20-Leaf Tree.

At the top of Figure 1, the root node is identified as coming from the participants in the training sample. The root node is the training sample not yet split by the CART algorithm. As noted earlier, the training sample was used to generate the tree, while the test sample was used to evaluate the predictive capacity of the model generated in the training sample. Just below the phrase “Root Node Training Sample” in Figure 1, the phrase “Q4. Family Income” is displayed. That is one of the predictive variables used in this study. Its position at the top of Figure 1 means that it was the first variable used to split the training sample into two nodes. Just below the phrase “Q4. Family income”, the phrases “up to 2 wages” and “<yes> <no>” are displayed. This information indicates that the ENEM participants in the training sample who reported a family income of up to two minimum wages constituted a node to the left of the reader, while those who indicated a family income of more than two minimum wages constituted a node to the right of the reader.

The node with the participants who reported a family income of up to two minimum wages was split into two new nodes by the gender variable. Note in Figure 1 that, in the upper left (always taking the reader as the reference for the upper, lower, left and right corners), the phrases “Gender”, “Female” and “<yes> <no>” are displayed. These phrases indicate that the node of people who reported a family income of up to two minimum wages was split into two new nodes, one of female and another of male participants. Female participants were allocated to a node further to the left, while male participants were allocated to a node further to the right. The splitting of the nodes continues until only a number followed by a percentage is reported in Figure 1. For example, at the bottom of the figure, on the far left, the number 442 and the percentage 17% are displayed. This pair indicates a terminal node, in which there are no more splits. As mentioned earlier, the terminal nodes are called leaves. These values indicate that the people belonging to this leaf have an average score of 442 points in the mathematics domain of ENEM 2011 and that they represent 17% of the training sample.

In a tree chart, two types of information are very important. Along the tree, one can identify which predictive variables generated the different splits and the subsequent nodes. With this identification, one can verify the role of each of the predictive variables and their importance for the composition of the tree nodes. The leaves, however, contain the most important information. They show how the model predicts certain results in the mathematics domain of ENEM 2011, linking this prediction to the predictive variables. The following is an example of how to read a leaf in Figure 1.

In the lower left corner is the leaf with the values 442 (17%). We have already indicated what these values mean. For the sake of emphasis, this group of ENEM 2011 participants has a mean score of 442 points in the mathematics domain. Among all leaves, this is the group with the worst performance in mathematics. To identify the profile of the participants in this leaf, Figure 1 should be read as follows. Start at the root node, at the top of the figure, and check which splits subsequently produced the node of people with the average score of 442 points. Departing from the root node, it is observed that people who reported a family income of up to two minimum wages were allocated to a node on the left. These people were then separated again. Female participants with a family income of up to two minimum wages were allocated further to the left, in a new node. We need to continue following this sequence, as it leads to the target leaf. Then, this node of female participants with a family income of up to two minimum wages was divided into a pair of nodes using the family income variable. People from this group with a family income of up to 1.5 minimum wage were allocated to a new node, further to the left. This node was then split based on the unit of the federation where they lived. Female participants with a family income of up to 1.5 minimum wage and living in the Midwest, North and Northeast were allocated to a new node further to the left. Finally, this node was split using the variable of the elementary school the participants had attended. The people from this node who had not taken most of their elementary school at a public school or at a private school, or who had not taken all of their elementary school at a private school, were allocated to a new node. This node was not split again and corresponds precisely to the leaf of people with an average of 442 points in the mathematics domain of ENEM 2011.
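
The reading of this path can be sketched as a sequence of yes/no checks applied from the root node down to the leaf. The sketch encodes only the single path described above; the field names (`income_min_wages`, `gender`, `region`, `elementary_in_listed_categories`) and the numeric coding of income are hypothetical and do not reproduce the coding of the ENEM questionnaire.

```python
# Each step is (description, check). A participant reaches the leaf with a
# mean of 442 points only if every check along the path answers "yes".
PATH_TO_LEAF_442 = [
    ("family income up to 2 minimum wages",
     lambda p: p["income_min_wages"] <= 2.0),
    ("female",
     lambda p: p["gender"] == "female"),
    ("family income up to 1.5 minimum wage",
     lambda p: p["income_min_wages"] <= 1.5),
    ("lives in the Midwest, North or Northeast",
     lambda p: p["region"] in {"Midwest", "North", "Northeast"}),
    ("elementary school outside the categories listed in the text",
     lambda p: not p["elementary_in_listed_categories"]),
]


def reaches_leaf_442(participant):
    """Return True if the participant follows the whole path to the leaf."""
    return all(check(participant) for _, check in PATH_TO_LEAF_442)


# Hypothetical participant, invented for illustration only.
example = {
    "income_min_wages": 1.0,
    "gender": "female",
    "region": "North",
    "elementary_in_listed_categories": False,
}
print(reaches_leaf_442(example))  # True
```

Changing any single answer, for instance the region to one outside the three listed, sends the participant to a different node, which is how the remaining leaves of Figure 1 can be read.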

In summary, this leaf indicates that women with a family income of up to 1.5 minimum wage, living in the North, Northeast and Midwest of Brazil, and who did not attend most of their elementary school at a public or private school, or who did not attend all of their elementary school at a private school, performed the worst in the mathematics domain of ENEM 2011. In contrast, the leaf on the far right of Figure 1 presents the group with the highest average performance in the mathematics domain of ENEM 2011. This group has an average performance of 699 points and represents 3% of the training sample. It consists of people who reported a family income above two minimum wages, attended elementary school only at an indigenous or private school or mostly at a private school, are not female, and did not have getting a scholarship as a strong motivation to take the ENEM (all items related to motivation to take the exam were scored on a scale from 0 to 5, with higher scores indicating greater motivation; these people did not mark scores 4 and 5 for the variable Q26. Indicate what prompted you to participate in ENEM: get secondary education certification or speed up my studies).

Summarizing the fundamental results in the leaves of the tree in Figure 1, we can affirm that some variables are related to an increase or decrease in performance in the mathematics domain of ENEM 2011. Being female is related to a major decrease. For example, the people who have the same characteristics as the group with the highest performance, except for being female, have an average score of 654 points, instead of the 699 points of the group with the highest performance, a decline of 45 points, which corresponds to almost half a standard deviation on the scale. This finding supports the results of previous studies, such as Karakolidis et al. (2016) and Hampden-Thompson (2013), in which female students obtained a lower mathematics score than male students. Despite the advances and efforts toward gender equality, stereotypes and cultural and social practices persist in Brazilian society that set limits on what is allowed and even encouraged depending on gender. According to Souza and Fonseca (2010), both in the classroom and in other contexts, frequent statements reinforce male superiority in mathematics, as well as characteristics that are socially and culturally attributed to women (such as docility, sensitivity and kindness) and men (such as rationality, courage and boldness). As a result, gender relations influence numeracy practices.

In addition to gender, reported family income is an important variable. People who reported a family income of up to two minimum wages perform worse, and the decrease in performance intensifies if the reported family income is up to 1.5 minimum wage. Among students with higher incomes, no relevant impact of income on the differentiation of mathematical performance was observed, diverging from the argument by Thien and Ong (2015) that a high socioeconomic level would have a positive impact on academic performance. Nevertheless, caution is needed when comparing these two variables, as the socioeconomic level is not limited to family income. The socioeconomic variable comprises several other factors, such as the educational level and profession of the parents, the number of books at home, and cultural resources. In that sense, comparing results related to socioeconomic level and family income is always an approximation, because the information generated by both may be similar, but it is not identical.

Not having taken most of elementary school at a private or public school indicates an important decrease as well. For example, the worst performing group had an average of 442 points. People with the same characteristics as this group, but who took most of their elementary school at a private or public school, obtained an average performance of 487 points, corresponding to a 45-point increase. In turn, the participants who reported high motivation (scores 4 and 5 on a scale from 0 to 5 points) to take the ENEM to obtain secondary education certification or accelerate their studies performed worse. The leaves with an average performance of 487 points and 522 points indicate this. The people in these two leaves have the same characteristics, except that the group with the lower performance reported high motivation to obtain secondary education certification or accelerate their studies, with a 35-point decrease compared to the other group. Motivation focused on the achievement of external goals instead of learning itself can be a factor that interferes unfavorably with academic performance (Mello & Leme, 2016; Monteiro et al., 2012). It is interesting to note that, as of 2017, the MEC (2017) decided to withdraw from ENEM the possibility of using the exam grade to obtain the secondary education certificate. This study presents evidence that could support that decision.

Living in the North, Northeast and Midwest of Brazil is also related to performing worse in the mathematics domain of ENEM 2011. The leaves with an average performance of 492 points and 528 points indicate this. They represent people with the same characteristics, except that the people who live in these regions perform worse, representing a 36-point decrease. This result may be associated with the socioeconomic development level, especially in the North and Northeast, which is considered lower than that of the South and Southeast (Instituto Brasileiro de Geografia e Estatística [IBGE], 2019), possibly indicating fewer opportunities and conditions for their citizens. It is noteworthy, though, that the residents of the Federal District who reported a family income of up to two minimum wages showed a performance score in the mathematics domain similar to that of residents of the North and Northeast. People in the Federal District who reported a family income above two minimum wages showed a performance score in the mathematics domain similar to that of residents of the South and Southeast. This particularity of the Federal District deserves further investigation in new studies. Finally, not having taken the largest part of secondary education at an indigenous or private school, or not having taken secondary education at a private school only, is also related to a performance decrease. The leaves with an average performance of 522 points and 586 points represent people with the same characteristics, except that the group with 586 points attended most of their secondary education at these schools, indicating a 64-point increase. The latter result invites reflection on the teaching practices adopted in these types of schools and calls for more in-depth studies, including research with qualitative designs.

Conclusion

ENEM was created with the main purpose of evaluating student performance at the end of basic education, specifically at the end of secondary education. It is a differentiated form of evaluation, distinct from most of the selection processes for access to higher education, especially the entrance exams applied at Brazilian higher education institutions. Based on a strong political investment in the years 2008 to 2012, this evaluation process turned into an important Brazilian public policy. Besides inducing curricular restructuring in secondary education and the adoption of pedagogical proposals in line with the development of skills relevant for citizens, ENEM has become a form of selection for higher education institutions, which use the test result as a criterion for admission to their undergraduate courses, either supplementing or replacing (fully or partially) the traditional entrance exam. As a further development, ENEM came to serve as a unified selection tool in the selection processes of federal higher education institutions, democratizing access opportunities and enabling student academic mobility.

Considering the political and academic advances that the large-scale evaluation of secondary education, enabled by ENEM, has brought about in the country, this study investigated a wide set of variables from the 2011 ENEM microdata, seeking to verify whether this broad spectrum of information could provide meaningful information about the performance of ENEM participants in mathematics. In this sense, the use of analyses based on INEP microdata constitutes an effective contribution to quantitative methodological research in education and psychology. INEP is one of the largest producers of microdata related to education: School Census, Higher Education Census, Prova Brasil, SAEB, ANA, ENEM, ENADE, among others. Microdata are the lowest available level of disaggregation of data collected by surveys, evaluations, and examinations; these data do not convey the information by themselves, are not available in other statistical survey products and, therefore, need to be processed to extract the desired information and indicators related to the research objectives. By applying methods to deal with the ENEM microdata, this article contributes to disseminating ways of accessing and understanding the information in the data treatment process and in the statistical calculations.

The predictive analysis carried out in this research made it possible to present a complex “map” of interactions between the variables studied, contributing to the advancement of theoretical knowledge about performance in mathematics in secondary education and, consequently, to public policies related both to access to higher education and to the necessary reformulations in the secondary education curriculum. In terms of scientific contributions, the results corroborate findings from previous research in which variables such as gender, socioeconomic level of the student, region of residence, and type of school explain school performance (Akben-Selcuk, 2017; Karakolidis et al., 2016; Laros et al., 2010; Organization for Economic Co-operation and Development [OECD], 2016; Pangeni, 2014; Pinto et al., 2016; Thien & Ong, 2015). On the other hand, this study provides original and innovative information, such as the identification that a family income above two minimum wages is a “protective” factor of performance in mathematics, as is low motivation to take the exam as a means to obtain certification, obtain scholarships, or accelerate studies. In addition, the study shows that, of the 53 predictive variables used, only seven had predictive importance, so that a wide set of the microdata variables used in this study was irrelevant to understanding performance in mathematics. The predictive model of this study explained about 30% of the variance in mathematics performance, which is only a reasonable result. After all, the model left about 70% of the variance unexplained. This very high unexplained portion can be pointed out as a possible methodological limitation of the study. One may assume that the 30% figure results from the fact that the final tree was selected based on the parsimony criterion, which decreases the predictive power of the model. Nevertheless, even if we had used the cost-complexity criterion to generate the final tree, so as to produce the model with the best predictive power, but with an excessive number of leaves, this model would not explain more than about 40% of the variance in mathematics performance. These results suggest that adding new variables to the ENEM microdata is relevant to improving the prediction of performance in mathematics. The inclusion of new educational variables, such as teaching methodology, teacher professional development and curricular organization, as well as the insertion of psychological variables in the ENEM microdata, such as creativity, self-concept and academic self-efficacy, may permit better predictions of performance in mathematics.
This study reveals the need to invest in the female potential for mathematics, encouraging students from an early age to engage in activities that require the use of mathematical thinking, as well as in the implementation of teaching practices that arouse the intrinsic interest in the area in students from different socio-economic backgrounds.

References

  • Akben-Selcuk, E. (2017). Personality, Motivation, and Math Achievement among Turkish Students: Evidence from PISA Data. Perceptual and Motor Skills, 124, 514-530. http://doi.org/10.1177/0031512516686505
  • Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Chapman & Hall/CRC.
  • European Commission. (2011). Mathematics Education in Europe: Common Challenges and National Policies. Education, Audiovisual and Cultural Executive Agency, Eurydice.
  • Gomes, C. M. A., & Almeida, L. S. (2017). Advocating the Broad Use of the Decision Tree Method in Education. Practical Assessment, Research & Evaluation, 22(10), 1-10. https://doi.org/10.7275/2w3n-0f07
  • Gomes, C. M. A., Golino, H. F., & Peres, A. J. S. (2016). Investigando a Validade Estrutural das Competências do ENEM: Quatro Domínios Correlacionados ou Um Modelo Bifatorial? [Investigating the Structural Validity of the Competencies of ENEM: Four Correlated Domains or A Bifactorial Model?]. Boletim na Medida, 5(10), 33-38.
  • Gomes, C. M. A., Golino, H. F., & Peres, A. J. S. (2018). Análise da fidedignidade composta dos escores do ENEM por meio da análise fatorial de itens [Analysis of the Composite Reliability of the Scores of ENEM via Factor Analysis of Items]. European Journal of Education Studies, 5, 331-344. http://doi.org/10.5281/zenodo.2527904
  • Hampden-Thompson, G. (2013). Family Policy, Family Structure, and Children’s Educational Achievement. Social Science Research, 42, 804-817. http://doi.org/10.1037/0022-0663.95.1.124
  • Hanushek, A. E., & Woessmann, L. (2011). The Cost of Low Educational Achievement in the European Union (Report No. 7). European Expert Network on Economics of Education (EENEE). http://www.eenee.de/dms/EENEE/Policy_Briefs/PolicyBrief1-2011.pdf
  • Instituto Brasileiro de Geografia e Estatística (IBGE) (2019). Síntese de Indicadores Sociais: Uma Análise das Condições de Vida da População Brasileira [Synthesis of Social Indicators: An Analysis of the Living Conditions of the Brazilian Population]. https://biblioteca.ibge.gov.br/visualizacao/livros/liv101678.pdf
  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer. https://doi.org/10.1007/978-1-4614-7138-7
  • Karakolidis, A., Pitsia, V., & Emvalotis, A. (2016). Examining Students’ Achievement in Mathematics: A Multilevel Analysis of the Programme for International Student Assessment (PISA) 2012 Data for Greece. International Journal of Educational Research, 79, 106-115. http://doi.org/10.1016/j.ijer.2016.05.013
  • Kuhn, M. (2017). caret: Classification and Regression Training. https://CRAN.R-project.org/package=caret
  • Lantz, B. (2015). Machine Learning with R. Packt.
  • Laros, J. A., Marciano, J. L. P., & Andrade, J. M. (2010). Fatores que Afetam o Desempenho na Prova de Matemática do SAEB: Um Estudo Multinível [Factors that Affect the Performance on the SAEB Mathematics Test: A Multilevel Study]. Avaliação Psicológica, 9, 173-186.
  • Lee, J., & Stankov, L. (2013). Higher-Order Structure of Noncognitive Constructs and Prediction of PISA 2003 Mathematics Achievement. Learning and Individual Differences, 26, 119-130. http://doi.org/10.1016/j.lindif.2013.05.004
  • Martin, A. J., & Lazendic, G. (2018). Achievement in Large Scale National Numeracy Assessment: An Ecological Study of Motivation and Student, Home, and School Predictors. Journal of Educational Psychology, 110, 565-482. http://doi.org/10.1037/edu0000231
  • Mello, M. B. J. B., & Leme, M. I. S. (2016). Motivação de Alunos dos Cursos Superiores de Tecnologia. Psicologia Escolar e Educacional, 20, 581-590. http://doi.org/10.1590/2175-3539201502031053
  • Ministério da Educação (MEC)/Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (1998). Portaria Normativa nº 438, de 28 de maio de 1998. DOU de 01 jun. 1998, nº 102-E, seção 1, p. 5. http://www.normasbrasil.com.br/norma/?id=181137
  • Ministério da Educação (MEC)/Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (2012). Microdados do ENEM - 2011, Exame Nacional do Ensino Médio: Manual do usuário. MEC/INEP. http://portal.inep.gov.br/web/guest/microdados
  • Ministério da Educação (MEC)/Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (2013). Exame Nacional do Ensino Médio (Enem): Relatório pedagógico 2009-2010. INEP/MEC. http://portal.inep.gov.br/documents/186968/484421/Relatório+Pedagógico+ENEM+2009-2010/70890e24-a78a-44f8-a909-b235f02948f2?version=1.1
  • Ministério da Educação (MEC)/Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (2015). Relatório pedagógico: ENEM 2011-2012. INEP. http://www.publicacoes.inep.gov.br/portal/download/1401
  • Ministério da Educação (MEC) (2017). Portaria nº 468 de 3 de abril de 2017. http://download.inep.gov.br/educacao_basica/enem/legislacao/2017/Portaria_mec_gm_n468_de_03042017_dispoe_sobre_a_realizacao_do_enem.pdf
  • Monteiro, S. C., Almeida, L. S., & Vasconcelos, R. M. C. F. (2012). Abordagens à Aprendizagem, Autorregulação e Motivação: Convergência no Desempenho Acadêmico Excelente [Approaches to Learning, Self-Regulation and Motivation: Their Convergence on Excellent Academic Performance]. Revista Brasileira de Orientação Profissional, 13, 153-162.
  • Organization for Economic Co-operation and Development (OECD) (2016). Results from PISA 2015. http://download.inep.gov.br/acoes_internacionais/pisa/resultados/2015/pisa_2015_brazil.pdf
  • Pangeni, K. P. (2014). Factors Determining Educational Quality: Student Mathematics Achievement in Nepal. International Journal of Educational Development, 34(1), 30-41. http://doi.org/10.1016/j.ijedudev.2013.03.001
  • Pinto, J., Carvalho e Silva, J., & Bixirão Neto, T. (2016). Fatores Influenciadores dos Resultados de Matemática de Estudantes Portugueses e Brasileiros no PISA: Revisão Integrativa [Influencing Factors on the Mathematics Results of Portuguese and Brazilian Students in PISA: Integrative Review]. Ciência e Educação, 22(4), 837-853. http://doi.org/10.1590/1516-731320160040002
  • Pipere, A., & Mierina, I. (2017). Exploring Non-Cognitive Predictors of Mathematics Achievement among 9th Grade Students. Learning and Individual Differences, 59, 65-77. http://doi.org/10.1016/j.lindif.2017.09.005
  • R Core Team (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. http://www.R-project.org
  • Rabelo, M. L. (2013). Avaliação Educacional: Fundamentos, Metodologia e Aplicações no Contexto Brasileiro [Educational Evaluation: Fundamentals, Methodology and Applications in the Brazilian Context]. Rio de Janeiro, RJ: SBM.
  • Rokach, L., & Maimon, O. (2015). Data Mining with Decision Trees: Theory and Applications. World Scientific Publishing.
  • Schwartzman, S., Costin, C., & Coutinho, A. M. J. (2017). Sociologia e Economia da Educação [Sociology and Economy of Education]. Rede Ciência para Educação.
  • Souza, M. C. R. F., & Fonseca, M. C. R. F. (2010). Relações de Gênero, Educação Matemática e Discurso: Enunciados sobre Mulheres, Homens e Matemática [Gender Relations, Mathematical Education and Discourse: Statements on Women, Men and Mathematics]. Autêntica.
  • Therneau, T. M., & Atkinson, E. J. (2015). An Introduction to Recursive Partitioning Using the rpart Routines. https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf
  • Thien, L. M., & Ong, M. Y. (2015). Malaysian and Singaporean Students’ Affective Characteristics and Mathematics Performance: Evidence from PISA 2012. SpringerPlus, 4, 563. http://doi.org/10.1186/s40064-015-1358-z
  • Zhang, H., & Singer, B. H. (2010). Recursive Partitioning and Applications. Springer. https://doi.org/10.1007/978-1-4419-6824-1

Publication Dates

  • Publication in this collection
    08 Jan 2021
  • Date of issue
    2020

History

  • Received
    08 Mar 2018
  • Reviewed
    17 Jan 2019
  • Accepted
    09 Apr 2020
Instituto de Psicologia, Universidade de Brasília, 70910-900 - Brasília - DF - Brazil, Tel./Fax: (061) 274-6455
E-mail: revistaptp@gmail.com