Introduction

Key social and demographic trends have transformed the basic structure of households in industrialized countries over the past 40 years (Morgan, 2003; Waite, 2009; Wilson, 2004). In addition to below-replacement fertility, a rising average rate of divorce, and increased female labor force participation (United Nations, 2017a), international migration and population aging have become key drivers of national population change (United Nations, 2017b). As a result, the traditional nuclear household structure of couples with children (Footnote 1) has become less common, with the global average proportion of nuclear households decreasing from 52.3% in 1990 to 43.2% by 2013 (United Nations, 2018). Simultaneously, less traditional structures have become more prevalent, particularly single-parent households (Footnote 2), and new forms of household structures have emerged, namely, skipped generation (Footnote 3) and immigrant households (Footnote 4).

These changes away from traditional nuclear family structures may have major implications for children’s cognitive development. In nuclear family structures, parents typically have the main responsibility for their children’s developmental growth at home, and there is well-established evidence from development psychology and neuroscience that parents play a pivotal role in shaping the way their children perceive and process information and behave in response to external stimuli (Erikson, 1950; Knudsen et al., 2006). Evidence shows how effort and perseverance attitudes from parents are adopted by children and translated into long-term cognitive performance (Leonard et al., 2017); how adults influence children’s emotional reactions (Wu et al., 2017); how parents’ use of a wide and diverse vocabulary improves their children’s literacy and cognitive competencies (Morgan & Winship, 2015; Weisleder & Fernald, 2013); and how the frequency and use of numbers at home improve children’s mathematical performance (Blevins-Knabe & Musun-Miller, 1996)—all competencies which are interconnected and contribute to facilitating learning at later ages.

At the same time, previous research has shown that children raised in non-traditional family structures tend to experience social, cultural, and economic struggles. Single parents usually face financial stress, have limited resources, and tend to depend on external childcare (Brady & Burroway, 2012; Duncan & Magnuson, 2011; Gornick & Jantti, 2010). Grandparent carers tend to rely on a limited pension, are usually unaware of contemporary generational trends, and are often less educated than their own children (Baker & Mutchler, 2010; Kamnuansilpa & Wongthanavasu, 2005). In immigrant family settings, differences in culture and language may limit or delay children’s development in speech and language learning (Shields & Behrman, 2004). Existing evidence suggests that economic, social, and cultural constraints are likely to undermine the cognitive development of children from non-traditional family structures. The majority of studies tend to focus on the academic achievement comparisons between children from a single non-traditional family structure and children from traditional nuclear families (Barajas, 2011; Carlson & Corcoran, 2001; Ermisch & Francesconi, 2001). However, considerable debate remains about which of these family structures impacts the academic achievement of children the most. Examining the association between different forms of non-traditional family structures and children’s academic performance helps to understand the children who face the most academic disadvantages. Understanding this relationship is crucial to prioritizing suitable and effective policy interventions exclusively for them.

To address this knowledge gap, this paper aims to establish the relationship between various forms of non-traditional household structures and children’s educational outcomes, in which families are classified as two-parent, single-parent, skipped generation, and immigrant households (see Fig. 1) (Footnote 5). Drawing on data from the Taiwan Assessment of Student Achievement (TASA), we evaluate the treatment effect of different family structures on children’s overall school grades using propensity score weighting (PSW). Generally, standard regression approaches are used to evaluate the relationship between particular pairs of family structures and children’s school performance. However, such approaches do not control for confounding factors that influence both children’s educational performance and family structure. Experimental or quasi-experimental study designs are needed to adjust for the bias induced by confounding factors (Varian, 2016). Several previous studies have used propensity score approaches to address the influence of confounding factors, for example as robustness checks, but the propensity scores are usually obtained by conventional logit regression, in which the success of removing the confounding effect depends greatly on the correct specification of the propensity score model. By contrast, the propensity score we use is constructed by a machine learning algorithm that does not require parametric assumptions, which is intended to better address the problem caused by confounding factors in estimating treatment effects. Our PSW approach combines the propensity score with weighted regression to estimate the treatment effect of family structures on children’s academic performance, as suggested by McCaffrey et al. (2004). Specifically, we used students from traditional two-parent families as the control group and non-traditional family structures (i.e., skipped generation households, single-parent households, and immigrant households) as “treatment” categories in our analysis.

Fig. 1 Family structures

In Taiwan, the number of non-traditional household structures has risen over the past 20 years. From 2001 to 2018, the number of single-parent and skipped generation households increased from 520 and 75k to over 830k and 114k, respectively (Gender Equality Committee, Executive Yuan) (Footnote 6). Reflecting an increase in the size of the foreign-born population, the percentage of children from immigrant households in junior school is estimated to have increased from 0.36% in 2003 to 11.04% by 2018 (Ministry of Education) (Footnote 7). While this reveals the rising importance of non-traditional family arrangements in Taiwan’s demographic structure, the societal implications of this change remain poorly understood. Our work contributes to advancing our understanding of how these new family structures impact children’s educational performance.

The remainder of this study is organized as follows. The following section reviews the literature on family factors and highlights the discussion on the impacts of these factors on children’s academic performance. The "Data and statistical method" section introduces the data set. The "Methodology" section introduces the estimation strategy. The description of the data and the empirical results are presented and discussed in the "Results" section, and the "Conclusion" section concludes and discusses the implications of this study.

Literature review

The type of family structure in which children are raised has a significant impact on children’s academic performance (Biblarz & Gottainer, 2000). Most previous studies have consistently found that growing up in non-traditional emerging family structures has negative consequences for children’s education (Sun & Li, 2011). Compared to children who grew up in two-parent families, children who grew up in single-parent families typically had lower academic achievement levels (e.g., Finn & Owings, 1994; Hofferth, 2006; Schiller et al., 2002). Studies on children’s educational outcomes in skipped generation households showed that children in these circumstances tend to have poorer academic performance and achieve lower educational qualifications (Edwards & Mumford, 2005; Sun, 2003). However, the results of studies focusing on the academic performance of children from immigrant households are inconsistent (Levels & Dronkers, 2008).

In addition, evidence indicates that parental involvement and the level of socioeconomic resources are the two main determinants of children’s educational performance. Parenting styles influence children’s academic performance, and parental practices, which vary according to family type, can explain the differences in children’s academic performance deriving from different family structures (Astone & McLanahan, 1991). As for socioeconomic status, research shows that this attribute plays a key role not only in children’s academic performance but also in maintaining romantic relationships and family formation (Conger et al., 2010; Karney & Bradbury, 2005).

Parenting

Parental involvement is the main factor in explaining children’s educational performance. Increased parental attention, time, support, and dedication are key to promoting and facilitating learning (Gonzalez-Pienda et al., 2002). Active parental participation in children’s school activities encourages long-term educational and career aspirations in their children (Barnard, 2004). Kao and Tienda (1995) showed that frequent parental engagement with school activities positively affects children’s academic achievement through mechanisms such as modeling, reinforcement and instruction (Hoover-Dempsey & Sandler, 1995). However, the level of parental influence on children’s educational performance depends on key factors defining the parent–child relationship, particularly how close children feel to their parents (i.e., parent–child attachment); the level of direct supervision that children receive from parents (i.e., the average amount of parental attention); and the style of parenting. For example, an authoritative parenting style, which combines high responsiveness with high demandingness, has been linked with adaptive achievement strategies characterized by low levels of failure expectation, task-irrelevant behavior, and passivity, coupled with high levels of independent problem-solving and critical thinking, which could promote academic achievement (Aunola et al., 2000; Gray & Steinberg, 1999; Grolnick & Ryan, 1989; Pong et al., 2005; Steinberg et al., 1992).

These parental engagement factors tend to vary in systematic ways across family structures, suggesting consistent differences in parental involvement and thus, in children’s educational performance. The average amount of parental attention and support tends to be greater in traditional two-parent households compared to one-parent households as single carers need to balance work and other domestic responsibilities (Amato, 2005). Additionally, reduced grandparent–children attachment is expected in a skipped generation household due to major generational gaps in the use of new technology, fashion, and trends (Chung et al., 2010). For immigrant households, the extent of parental involvement may be reduced. Long working hours reported by immigrants may not fit with school periods and result in extended times with an absence of dedicated caring (Elliott et al., 2017). Immigrants’ limited local language proficiency may also undermine their capacity to assist their children with school content (Kao & Tienda, 1995).

Additionally, the style of parenting has also been found to vary across family structures. Single mothers are often the confidantes of their children (Devall et al., 1986; Weiss, 1979) and tend to use a more flexible parenting style than mothers from two-parent household structures (Dornbusch et al., 1987). However, this style of parenting differs by the child’s gender: single mothers tend to be more permissive with their sons than with their daughters (Dornbusch et al., 1987). Single fathers, in contrast, are consistently more permissive with both sons and daughters, compared to fathers from two-parent households (Dornbusch et al., 1987). Similarly, parents from immigrant households tend to be equally permissive to children of both sexes (Dornbusch et al., 1987). In skipped generation households, parenting practices do not appear to have been investigated.

Socioeconomic resources

The impact of socioeconomic resources on educational outcomes is well established. The availability of material and financial resources determines the quality and access of children to school programs, technology, extracurricular activities, and new experiences which shape their academic success (Morris & Gennetian, 2003). High socioeconomic status (SES) households have these resources and they enable children to access high-quality education and promote high levels of academic performance (Davis-Kean, 2005). Parents from high-SES households also tend to have a degree qualification, have mastered a foreign language, and have a high level of cultural knowledge (Memon et al., 2010)—all of which translate into better educational performance among their children, compared to those from more disadvantaged households (Haveman & Wolfe, 1995).

Also, the positive relationship between high SES and academic performance at an early stage of life tends to persist from childhood through to adolescence and is consistent across ethnic groups (Aikens & Barbarin, 2008; Caro et al., 2009; Kieffer, 2012; Mpofu & Vijver, 2000; Ren & Xin, 2013; Woßmann, 2005). The persistence of this relationship reflects the recursive nature of learning as early learning lays the foundations for later attainment and triggers a self-reinforcing motivation for further learning (Heckman, 2006). Consistently, research has found that children from high-SES households develop academic skills faster than children from lower-SES families (Morgan et al., 2009). Furthermore, high SES leads to significantly higher overall performance scores in mathematics and English (Farooq et al., 2011).

Furthermore, SES is argued to be a key factor in maintaining relationships, whose breakdown could lead to a change in family structure. There is abundant evidence that having higher SES reduces the likelihood of separation of couples and the risk of divorce (Conger et al., 2010; Karney & Bradbury, 2005). A higher income, more family resources, and greater economic gains are positively associated with greater marital stability (Duncan et al., 2007; Orbuch et al., 2002; Popenoe, 2007; S., 2001; Stanley et al., 2006), while insufficient income and financial problems are associated with a lower quality of marriage (Amato et al., 2007; Cutrona et al., 2003; Dakin & Wampler, 2008; Falke & Larson, 2007; Karney et al., 2005; Rauer et al., 2008; Stanley et al., 2006). Moreover, the lack of professional prestige is associated with the occurrence of marital problems (Amato et al., 2007). Couples with lower SES are therefore more likely to see a two-parent family become a single-parent family through divorce. With the change in family formation, grandparents are more likely to assume parental responsibility if parents are unable to raise the children (Smith et al., 1998).

In Taiwan, divorce caused by a family’s financial problems has been recognized as one of the main causes of the formation of skipped generation families (Hsu, 2016; Luo & Chiu, 2013). Furthermore, lower SES has been a common phenomenon in immigrant families in Taiwan for decades. Most immigrant families are made up of a native Taiwanese man and a foreign bride. Taiwanese men with lower SES or living in rural areas have difficulty finding Taiwanese women for marriage and turn to Southeast Asian and mainland Chinese brides. The motivation of these foreign brides to marry Taiwanese men is the desire for a better life than they would have in their motherland. Marriage agencies match this demand for cross-border marriage, which is motivated by factors other than romance, between Taiwanese men and women from Southeast Asia and mainland China. These agencies have even built businesses on an industrial scale and have increasingly dominated Taiwan’s cross-border marriage market since the last decades of the twentieth century.

In summary, SES is not only crucial for the academic performance of children but also for the stability of the family. As a result, SES plays a confounding role in estimating the treatment effect of family structure on children’s academic performance. Therefore, this study not only includes SES measures as covariates in empirical analysis but also addresses the confounding effect of SES.

Data and statistical method

Data source

To evaluate the impact of various family structures on the academic performance of children in Taiwan, this study used a dataset from the TASA which is administered by the National Academy for Educational Research Preparatory Office. The TASA is nationally representative. A two-stage-stratified sampling design was implemented to select the schools and children. In the first stage, the sampling unit consisted of schools stratified by two variables, namely, district size and school size. Then, schools were sampled systematically with probabilities that were proportional to the measured size. In the second stage, the sampling units were classes within chosen schools based on a simple random sampling method (Yip, 2017). The assessment framework, performance standards, scoring rubrics, and sample items or tasks are available online (http://tasa.naer.edu.tw/).
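The first-stage selection described above (systematic sampling with probability proportional to size) can be illustrated with a minimal sketch. This is not the TASA sampling code: the function name, the toy school sizes, and the fixed starting point are hypothetical, and stratification by district and school size is omitted for brevity.

```python
def pps_systematic_sample(sizes, n, start):
    """Select n unit indices with probability proportional to size.

    sizes: measure of size for each unit (e.g., school enrolment).
    start: a number in [0, interval), normally drawn at random.
    """
    total = sum(sizes)
    interval = total / n                      # sampling interval on the size scale
    if not 0 <= start < interval:
        raise ValueError("start must lie in [0, interval)")
    targets = [start + j * interval for j in range(n)]
    chosen, cumulative, t = [], 0.0, 0
    for idx, size in enumerate(sizes):
        cumulative += size
        # A unit is selected once for every target falling in its size range.
        while t < n and targets[t] < cumulative:
            chosen.append(idx)
            t += 1
    return chosen


sizes = [100, 300, 200, 400]                    # hypothetical school sizes, total 1000
print(pps_systematic_sample(sizes, 2, 150.0))   # [1, 3]
```

In this sketch a unit larger than the sampling interval can be selected more than once; real designs usually handle such units separately.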

The TASA assesses the extent to which 4th-, 6th-, 8th-, and 11th-grade children (see Fig. 2) have acquired the key knowledge and skills of the academic learning set out in the Taiwanese national curriculum. The assessment is implemented across primary and secondary schools in all 23 counties of Taiwan. In 4th, 6th, 8th, and 11th grades, the median age of children is 10, 12, 14, and 17 years, respectively, and the TASA focuses on the core school subjects of Chinese, English, mathematics, science and social science. Children in the 4th grade are not assigned to participate in the English and social science tests. Each child is randomly assigned an examination on two school subjects. The 10 possible combinations are as follows: Chinese and English, Chinese and math, Chinese and science, Chinese and social science, English and math, English and science, English and social science, math and science, math and social science, and science and social science. Children are assigned a questionnaire which collects their background information, including gender, number of siblings, family structure, family environment (level of socioeconomic resources, language use, guardians’ educational level), and aspects of children’s lives (attitudes toward learning, their habits and life in and out of school). The questionnaire, however, does not collect information on residential and school location and school and classroom attributes, so the influence of these factors cannot be assessed. Nonetheless, spatial socioeconomic inequalities across schools and neighborhoods in Taiwan are small (Sun & Tsai, 2007). The Program for International Student Assessment results indicate that socioeconomic inequalities between urban schools and non-urban schools in Taiwan are relatively small compared to other OECD Asian countries (OECD, 2013). 
Thus, the influence of residential and school factors on children’s academic performance is expected to be less pronounced than that of individual and household attributes. This is in line with evidence from countries with small interpersonal socioeconomic inequalities, such as Australia (Rowe et al., 2017).

Fig. 2 The structure of the Taiwan Assessment of Student Achievement

Table 1 shows the detailed schedule of the TASA assessment. For this study, we used data on 8th-grade children in 2007, 2010, and 2013. We used data on 8th-grade children for two key reasons. First, this grade is the closest available point of observation to the end of their compulsory education (Footnote 8). Second, in Taiwan, achievement in junior school is an important factor impacting children’s choice of high school or vocational school or not continuing to study (Li & Qiu, 2018) (Footnote 9). Understanding children’s educational performance in 8th grade is, thus, important to guide policy formulation and planning for better educational outcomes later in life by reinforcing the educational offer in earlier years. We do not use 2006 TASA data because they do not provide the details needed to define family structures consistently with later years.

Table 1 Timetable for Taiwan assessment of student achievement

Measure of academic performance

To measure academic performance, we used students’ test scores on the subject assessments from the TASA examination. Test scores are commonly used to assess children’s academic performance in the literature (Davison & Dustova, 2017). We constructed a test score variable by computing the ratio of the number of correctly answered questions to the total number of questions and multiplying it by 100. The score ranges from 0 to 100, with a score of 80 indicating, for example, 40 correctly answered questions out of 50 questions.

Methodology

We used a quasi-experimental design to estimate the causal effect [or average treatment effect (ATE)] of family structures on children’s academic performance (Lumley & Scott, 2017; McCaffrey et al., 2013). It involves a two-stage approach. First, a machine learning algorithm is applied to estimate the propensity score; this is a less restrictive framework that frees us from assuming any specific form of relationship between the treatments and the covariates. Second, the estimated propensity scores are converted to weights by an inverse probability approach, which makes covariate balance among the different treatment groups plausible, and a weighted regression approach is employed to estimate the causal treatment effect of interest.

Several previous studies have relied on propensity scores to tackle confounding when estimating the treatment effect of family structure on the academic performance of children. Most of them combine a propensity score obtained by assuming a specific model form (e.g., logit regression or probit regression) with a matching approach (e.g., one-to-one matching, nearest neighbor matching, and kernel matching) to estimate the treatment effect (e.g., Cid & Stokes, 2013; Cross, 2020; Lu et al., 2021; Santín & Sicilia, 2016). When the propensity score is obtained through logit or probit regression, misspecification of the propensity score model leads to poor performance in eliminating confounding (Austin & Stuart, 2017). Moreover, compared to matching approaches, weighting on the propensity score is more flexible (it allows specific groups to be targeted as a reference, especially when multiple treatments are analyzed) and better preserves data integrity, which can, in turn, increase the precision of causal effect estimates (Desai & Franklin, 2019). Indeed, there is also evidence that the weighting approach performs better in balancing covariates than the matching approach (Li et al., 2018).

Propensity score weighting

Intuitively, PSW approaches are typically used to estimate the causal effect of a treatment (e.g., a medical treatment, policy intervention, or training program). They address the challenge that individuals in the treatment and control groups may not be the product of random assignment, so that these groups may differ systematically. PSW approaches enable the identification of comparable individuals in the treatment and control groups based on observed attributes.

We applied a novel approach to obtain the propensity score, known as Extreme Gradient Boosting (XGBoost), or generalized boosted models (GBM). Unlike traditional approaches (e.g., logit regression or probit regression), GBM incorporates a machine learning algorithm to construct the relationship between covariates and the multiple treatment propensity scores (McCaffrey et al., 2013). GBM fits the relationship using a regression tree approach, which is an automated and adaptive algorithm. This algorithm can be applied to cases involving a large set of pre-treatment covariates to fit a nonlinear surface and predict treatment assignment (McCaffrey et al., 2004).

Misspecification of the propensity score model is often a key issue resulting in poor estimates of the propensity scores and biased estimates of the treatment effect (King & Nielsen, 2019). In particular, GBM helps address this issue by its flexibility. Traditionally, propensity scores are usually estimated with logistic regression which imposes parametric assumptions and, thus, restricts the functional form relating to the underlying distribution of the population. However, GBM approaches the estimation of the propensity scores as a prediction problem based on a non-parametric boosting machine learning algorithm. Boosting outperforms alternative methods, such as logistic model analysis in terms of prediction error (Friedman, 2001; Madigan & Ridgeway, 2004). There are many variants of boosting in the machine learning and statistics literature, including the AdaBoost algorithm (Freund & Schapire, 1997), GBM (Ridgeway, 1999), and the gradient boosting machine (Friedman, 2001). We used GBM because, compared to other implementations of boosting, this method is designed to produce models yielding well-calibrated propensity scores (or probability estimates) (McCaffrey et al., 2004).

Implementing generalized boosted models

To implement GBM, we considered each of our three non-traditional family structures as separate treatment groups (skipped generation households, single-parent households, and immigrant households) and traditional two-parent families as the control group. First, we estimated the propensity score to distinguish children with similar observable attributes from each non-traditional household from those from two-parent families. Formally the propensity score is defined as follows:

$$\mathrm{Pr}\left({K}_{i} = k \mid {x}_{i}\right)\equiv {P}_{k}\left({x}_{i}\right) = \frac{{\mathrm{e}}^{{g}_{k}\left({x}_{i}\right)}}{1+{\sum }_{k = 1}^{3}{\mathrm{e}}^{{g}_{k}\left({x}_{i}\right)}}, \quad k\in \left\{1,2,3\right\},$$
(1)

where \({x}_{i}\) denotes the covariates of child i, and \({P}_{k}\left({x}_{i}\right)\) is the propensity score for child i in non-traditional family structure k, with k equal to 1 for a single-parent, 2 for a skipped generation, and 3 for an immigrant family structure. The propensity score for the control group (i.e., children from a two-parent family) was estimated as follows:

$${P}_{0}\left({x}_{i}\right) = 1-\sum_{k = 1}^{3}{P}_{k}\left({x}_{i}\right) = \frac{1}{1+{\sum }_{k = 1}^{3}{\mathrm{e}}^{{g}_{k}\left({x}_{i}\right)}}.$$
(2)
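To make Eqs. (1) and (2) concrete, the following minimal sketch maps a set of scores \({g}_{k}\left({x}_{i}\right)\) for one child to the four propensity scores. The function name and the example inputs are hypothetical; the subtraction of the largest exponent is a standard numerical-stability device and does not change the result.

```python
import math


def propensity_scores(g):
    """Map scores g = [g_1, g_2, g_3] to [P_0, P_1, P_2, P_3].

    P_0 is the two-parent (control) probability from Eq. (2);
    P_1..P_3 are the treatment probabilities from Eq. (1).
    """
    # Shift all exponents by the largest one (the control group has exponent 0).
    m = max(0.0, *g)
    exp_control = math.exp(-m)
    exp_treat = [math.exp(gk - m) for gk in g]
    denom = exp_control + sum(exp_treat)
    return [exp_control / denom] + [e / denom for e in exp_treat]


scores = propensity_scores([0.0, 0.0, 0.0])
print(scores)  # [0.25, 0.25, 0.25, 0.25]
```

By construction the four values are positive and sum to one, so the control probability never has to be estimated separately.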

To estimate propensity scores for multiple treatments, a log-likelihood function was used based on a joint Bernoulli random trial [Eq. (3)]. A household structure indicator \({z}_{ki}\) was created to identify a child i from family structure k, where \({z}_{ki}\) equals 1 if a child belongs to a given non-traditional family structure (“treatment”) and 0 otherwise.

$$L\left(g\right) = \sum_{i = 1}^{N}\mathrm{ln}\left\{{{P}_{1}\left({x}_{i}\right)}^{{z}_{1i}}\times {{P}_{2}\left({x}_{i}\right)}^{{z}_{2i}}\times {{P}_{3}\left({x}_{i}\right)}^{{z}_{3i}}\times {\left(1-\sum_{k = 1}^{3}{P}_{k}\left({x}_{i}\right)\right)}^{\left(1-{\sum }_{k = 1}^{3}{z}_{ki}\right)}\right\} = \sum_{i = 1}^{N}\left\{\sum_{k = 1}^{3}{z}_{ki}\times {g}_{k}\left({x}_{i}\right)-\mathrm{ln}\left(1+\sum_{k = 1}^{3}{\mathrm{e}}^{{g}_{k}\left({x}_{i}\right)}\right)\right\}.$$
(3)
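The right-hand form of Eq. (3) is straightforward to evaluate directly. The sketch below is a minimal illustration under the assumption that each child's family structure is coded as 0 (two-parent) or k in {1, 2, 3}; the function name and the toy inputs are hypothetical.

```python
import math


def log_likelihood(g_values, labels):
    """Evaluate the right-hand form of Eq. (3).

    g_values[i] holds [g_1(x_i), g_2(x_i), g_3(x_i)];
    labels[i] is 0 for a two-parent household or k in {1, 2, 3} for treatment k.
    """
    total = 0.0
    for g, k in zip(g_values, labels):
        # sum_k z_ki * g_k(x_i): only the observed treatment contributes.
        linear = g[k - 1] if k > 0 else 0.0
        total += linear - math.log(1.0 + sum(math.exp(v) for v in g))
    return total


# With all scores at zero every child has probability 1/4 of each structure,
# so four children contribute -4 * ln(4) in total.
print(log_likelihood([[0.0, 0.0, 0.0]] * 4, [0, 1, 2, 3]))
```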

Based on Eq. (3), the GBM was used to find a set of \({g}_{k}\left({x}_{i}\right)\) that maximizes the likelihood function. The desired estimate of \({g}_{k}\left({x}_{i}\right)\) is given by the following argument,

$${g}^{*}\equiv \mathrm{arg}\underset{g}{\mathrm{max}}\left\{L\left(g\right)\right\}, \quad \mathrm{where}\quad g = \left\{{g}_{1}, {g}_{2}, {g}_{3}\right\}.$$
(4)

The \({g}^{*}\) are used to compute the propensity scores defined in Eqs. (1) and (2).
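As a hedged aside on how a boosting procedure can use Eq. (3): differentiating the log-likelihood with respect to a single score \({g}_{k}\left({x}_{i}\right)\) and substituting Eq. (1) gives a residual-like quantity, namely the gap between the observed indicator and the current propensity score. Gradient boosting methods in general fit each new regression tree to quantities of this kind; whether the implementation used here follows exactly this scheme is an assumption.

$$\frac{\partial L\left(g\right)}{\partial {g}_{k}\left({x}_{i}\right)} = {z}_{ki}-{P}_{k}\left({x}_{i}\right), \quad k\in \left\{1,2,3\right\}.$$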

Estimating the average treatment effect

To estimate the ATE of family structures on children’s academic performance, the resulting propensity scores were used as statistical weights, an approach known as inverse probability of treatment weighting (IPTW). In other words, the weight assigned to an observation that receives treatment k is the reciprocal of the estimated probability of receiving treatment k. A weight of 1/\({P}_{k}\left({x}_{i}\right)\) was applied for children from skipped generation, single-parent, and immigrant households, and a weight of 1/{1 − \({\sum }_{k = 1}^{3}{P}_{k}\left({x}_{i}\right)\)} was given to children from two-parent households, where \({P}_{k}\left({x}_{i}\right)\) is the estimated propensity score for child i receiving treatment k (Footnote 10). The use of PSW is similar to the use of sampling weights in survey data analysis to account for unequal probabilities of inclusion in a study sample. In essence, treated and untreated groups are weighted to have a similar composition in terms of the observed attributes captured in the data (Lee et al., 2010). We applied the propensity score weights to produce a balanced dataset in the first stage and then estimated the ATE of family structures on children’s educational performance in the second stage.
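The weighting rule can be sketched in a few lines. The function name and the example propensity scores are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def iptw_weight(treatment, p_treat):
    """Inverse probability of treatment weight for one child.

    treatment: 0 for two-parent (control), or k in {1, 2, 3}.
    p_treat:   [P_1, P_2, P_3], the estimated treatment propensity scores.
    """
    p_control = 1.0 - sum(p_treat)
    p_received = p_control if treatment == 0 else p_treat[treatment - 1]
    if p_received <= 0.0:
        raise ValueError("propensity of the received treatment must be positive")
    return 1.0 / p_received


p = [0.25, 0.125, 0.125]    # hypothetical scores; control probability is 0.5
print(iptw_weight(0, p))    # 2.0
print(iptw_weight(1, p))    # 4.0
print(iptw_weight(2, p))    # 8.0
```

Children whose observed family structure was unlikely given their covariates receive larger weights, which is what pulls the weighted groups toward a common covariate distribution.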

We estimated three different sets of regression models. The first set of models was used to estimate the overall ATE across our sample; the second was used to estimate discipline-specific ATEs and the third was used to estimate year-specific ATEs. For the first set of models, we used children’s averaged individual test scores to create an overall test score. Averages were computed using data for two subject disciplines as children are only required to complete tests in two different subjects. Therefore, we averaged, for example, the scores for math and social science or math and English. Formally, the model can be represented by

$$\bar{{Y }_{i}} = {\alpha }_{0}+{\beta }_{0}{\mathrm{FS}}_{i}+{\gamma }_{0}{C}_{i}+{\sigma }_{0}{S}_{i}+{\nu }_{i},$$
(5)

where i denotes children; \(\bar{{Y }_{i}}\) represents the overall average test score of an individual i across the two disciplines for which the examination was undertaken; FS represents each of our four family structures: two-parent, single-parent, skipped generation, and immigrant households and captures variations in academic performance due to differences in family structure; C is a vector of variables capturing children’s characteristics, including gender and number of siblings; S is a vector of variables capturing children’s family background, specifically their guardian’s educational level, the language used at home, and the availability of computers and books at home; and \({\nu }_{i}\) corresponds to the residual term.
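As a simplified illustration of the second stage, the sketch below computes a weighted least-squares estimate for one treatment group against the control group. It is a deliberately reduced version of Eq. (5): the covariate vectors C and S are omitted, in which case the coefficient on the treatment indicator equals the difference in weighted mean scores. The function names and the toy data are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_ate(scores, treated, weights):
    """Weighted least-squares slope of score on a 0/1 treatment indicator.

    With only an intercept and one binary regressor, this slope equals the
    difference between the weighted mean scores of the two groups.
    """
    t_scores = [y for y, t in zip(scores, treated) if t == 1]
    t_weights = [w for w, t in zip(weights, treated) if t == 1]
    c_scores = [y for y, t in zip(scores, treated) if t == 0]
    c_weights = [w for w, t in zip(weights, treated) if t == 0]
    return weighted_mean(t_scores, t_weights) - weighted_mean(c_scores, c_weights)


# Hypothetical toy data: averaged test scores, treatment indicator, IPTW weights.
scores = [70.0, 80.0, 60.0, 50.0]
treated = [0, 0, 1, 1]
weights = [1.0, 3.0, 2.0, 2.0]
print(weighted_ate(scores, treated, weights))  # -22.5
```

Adding the covariates back requires a full weighted regression, for which a statistics library would normally be used.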

In our analysis, we sought to identify differences in academic performance arising from differences in family socioeconomic resources and parenting styles. Thus, we included covariates to control for possession of at least one computer and the number of books at home to capture the influence of household socioeconomic resources and isolate the effect of parenting styles, which, we argue, is measured by differences in family structures (Footnote 11). Table 2 lists and describes the variables included in our regression models.

Table 2 Description of variables

The second set estimates discipline-specific ATEs of children’s family structures on their educational performance. We expect some systematic discipline-specific differences in children across different family backgrounds. For instance, we expect differences in student performance across disciplines as having an English-speaking immigrant background may be linked to high scores in English examinations but low scores in Chinese or social science tests. We estimated the following model:

$${Y}_{ij} = {\alpha }_{1}+{\beta }_{1}{\mathrm{FS}}_{ij}+{\gamma }_{1}{C}_{ij}+{\sigma }_{1}{S}_{ij}+{\pi }_{ij}.$$
(6)

Here, we note that test scores and covariates vary by children and discipline j (Chinese, English, math, science, and social science).

The third set of regression models estimates year-specific ATEs of children’s family structures on their educational performance to establish the extent of the persistence of the observed effect. We estimated the following model:

$$\bar{Y }_{it} = {\alpha }_{2}+{\beta }_{2}{\mathrm{FS}}_{it}+{\gamma }_{2}{C}_{it}+{\sigma }_{2}{S}_{it}+{\varepsilon }_{it},$$
(7)

where t denotes year (2007, 2010, and 2013).
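The year-specific specification in Eq. (7) amounts to fitting the same regression separately within each year; the same pattern would apply to the discipline-specific models in Eq. (6) by grouping on discipline instead. The sketch below illustrates this split-and-fit step under simplifying assumptions: a single binary family-structure dummy, one covariate, no propensity score weights, and made-up noise-free records. The function name `gaps_by_group` is hypothetical.

```python
from collections import defaultdict

import numpy as np

# Made-up records: (year, skipped-generation dummy, n_siblings, overall score).
records = [
    (2007, 0, 0, 62.0), (2007, 0, 2, 60.0), (2007, 1, 0, 53.0), (2007, 1, 1, 52.0),
    (2010, 0, 1, 61.0), (2010, 0, 0, 62.0), (2010, 1, 2, 50.0), (2010, 1, 0, 52.0),
    (2013, 0, 0, 63.0), (2013, 0, 1, 62.0), (2013, 1, 1, 50.0), (2013, 1, 2, 49.0),
]


def gaps_by_group(records):
    """Fit one regression per group key and keep the family-structure coefficient."""
    grouped = defaultdict(list)
    for key, fs, siblings, score in records:
        grouped[key].append((fs, siblings, score))

    gaps = {}
    for key, rows in sorted(grouped.items()):
        data = np.asarray(rows, dtype=float)
        # Columns: intercept, family-structure dummy, covariate.
        X = np.column_stack([np.ones(len(data)), data[:, 0], data[:, 1]])
        coef, *_ = np.linalg.lstsq(X, data[:, 2], rcond=None)
        gaps[key] = float(coef[1])
    return gaps


year_gaps = gaps_by_group(records)
for year, gap in year_gaps.items():
    print(year, round(gap, 2))
```

Because each year is fitted on its own subsample, every coefficient, including those on the covariates, is allowed to differ across years.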

Results

This section first presents some descriptive statistics on the composition of family structures before discussing the results derived from our regression analysis.

Descriptive analysis

Table 3 shows the distribution of children across our four family structures for our pooled sample, including data from 2007, 2010, and 2013. It shows that two-parent families are the most common type of family structure. About three-quarters of children in the sample live in two-parent households. This type of family structure is, however, declining, and the proportion of children living in non-traditional family structures is on the rise. Combined, the percentage of children living in skipped generation, single-parent, and immigrant households increased from 22.11% in 2007 to 34.86% by 2013. Children living in single-parent families represent the largest increase (8.20 percentage points), followed by children living in immigrant households (3.68 percentage points). The percentage of children living in skipped generation families also seems to be growing but at a slower rate.
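As a quick consistency check on these figures, and assuming that the reported increases for single-parent and immigrant households are expressed in percentage points (an interpretation of the text rather than a value read from Table 3), the combined change and the implied change for skipped generation households work out as follows:

$$34.86 - 22.11 = 12.75, \qquad 12.75 - (8.20 + 3.68) = 0.87.$$

Under that reading, the share of children in skipped generation households rose by roughly 0.87 percentage points, which is consistent with growth at a slower rate than in the other two groups.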

Table 3 Sample size of the whole sample and by family structures

Table 4 shows the distribution of children in our sample across individual and family characteristics. Overall, there is a balanced sex ratio, and children are drawn largely from Chinese-speaking households (85.70%). Most children have siblings (89.85%). The distribution of male guardians’ and female guardians’ education levels is quite similar. We also observe a significantly lower density in the illiterate and elementary school categories and a higher density in the categories from junior school to graduate school or above. In terms of socioeconomic resources at home, most children have at least one computer (96.19%). Equal proportions of children report having 0–25, 26–100, and 100+ books. Chinese is the main language used at home.

Table 4 Summary statistics of variables by family structures

Differences are, however, observed across household structures. Notably, skipped generation households record the largest share of an only-child setting and use of a non-Chinese language at home and seem to be the most disadvantaged group in terms of computer and book resources. These percentages are also larger for single-parent and immigrant households than for two-parent families. Immigrant households also have the largest share of illiterate guardians, both female and male.

Table 5 shows children’s overall average test scores by family structure, revealing consistently higher scores for children from two-parent households. Children from skipped generation and single-parent families register the lowest scores; scores are remarkably low for the former and display a sharp decline from 50.10 in 2010 to 46.48 in 2013. Even though a considerable percentage of children in immigrant families have parents with no education or only elementary education, their average scores are around 60 across all three years under analysis. However, average test scores for children from immigrant families have been lower than those for children from two-parent households, and the difference has widened over time.

Table 5 Sample means of test scores by year and family structures

Regression results

Next, we analyzed whether family structures are associated with children’s educational performance. We were particularly interested in establishing a causal link; that is, whether children’s family structure has any impact on their educational performance. We hypothesized that non-traditional household structures may affect children’s educational performance negatively due to differences in parenting styles, language, time, and financial resources. We conducted a regression analysis with inverse probability of treatment weighting based on the propensity score to estimate the ATE of belonging to a non-traditional family on the overall test score of children, where the propensity score helps eliminate the imbalance of covariates between children from two-parent and non-traditional household structures. Note that all covariates included in each equation are taken into account in the GBM process when estimating the propensity score for that equation. Therefore, the GBM processes and estimated propensity scores vary according to the model. The improvement in covariate balance and the quality of the propensity scores are verified by examining several balance measures and the distribution of propensity scores.Footnote 12
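The weighting and balance-checking logic described above can be illustrated with a small sketch. This is a simplified illustration rather than the procedure used in this study: it treats family structure as a single binary contrast (non-traditional versus two-parent), takes the propensity scores as given instead of estimating them with GBM, and uses made-up data in which the propensity score equals the treated share within each covariate stratum. All function names are hypothetical.

```python
from statistics import variance


def ate_weights(treated, propensity):
    """IPTW weights for the ATE: 1/e for treated units, 1/(1 - e) for controls."""
    return [1.0 / e if t else 1.0 / (1.0 - e) for t, e in zip(treated, propensity)]


def group_mean(values, treated, weights, group):
    """Weighted mean of `values` within one treatment group."""
    pairs = [(v, w) for v, t, w in zip(values, treated, weights) if t == group]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


def weighted_difference(values, treated, weights):
    """Weighted treated-minus-control difference in means."""
    return group_mean(values, treated, weights, 1) - group_mean(values, treated, weights, 0)


def standardized_mean_difference(values, treated, weights):
    """Weighted mean difference scaled by the pooled unweighted standard deviation."""
    var_t = variance([v for v, t in zip(values, treated) if t == 1])
    var_c = variance([v for v, t in zip(values, treated) if t == 0])
    return weighted_difference(values, treated, weights) / ((var_t + var_c) / 2) ** 0.5


# Made-up data: 1 = non-traditional household, covariate = has a home computer.
treated = [1, 0, 0, 0, 1, 1, 1, 0]
computer = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
propensity = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]  # supplied, not estimated
score = [55.0, 60.0, 60.0, 60.0, 45.0, 45.0, 45.0, 50.0]

unit_weights = [1.0] * len(treated)
iptw = ate_weights(treated, propensity)

naive_gap = weighted_difference(score, treated, unit_weights)
iptw_gap = weighted_difference(score, treated, iptw)
smd_before = standardized_mean_difference(computer, treated, unit_weights)
smd_after = standardized_mean_difference(computer, treated, iptw)

print("naive gap:", naive_gap, "IPTW gap:", iptw_gap)
print("SMD before:", smd_before, "SMD after:", smd_after)
```

In this toy example the covariate is imbalanced before weighting and balanced afterwards, and the weighted score gap is smaller in magnitude than the unweighted one because part of the raw gap reflects the difference in computer ownership.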

Overall impact

Table 6 presents the regression results for the first set of models based on a pooled dataset of the three years under analysis. Dummy fixed effects are included for years but they are not displayed in Table 6. Two models are presented. The key difference between these models is that Model (1) excludes home computer and stored books variables which are included in Model (2) to account for differences in the socioeconomic resources of children. Focusing on family structures, our results reveal that children from two-parent families show the highest educational achievements (i.e., in terms of average overall test scores). Compared to children from two-parent households, those from skipped generation and single-parent households tended to perform much worse with average overall test scores up to 11.26 points lower.

Table 6 Regression coefficients of different family structures

Model (2), however, indicates that differences in academic performance due to variations in family structures are less pronounced after controlling for socioeconomic resources. The score gap between children from two-parent families and non-traditional family structures is reduced. The coefficients fall from −11.26 points to −8.71 points for skipped generation households, from −6.62 points to −5.22 points for single-parent households, and from −1.37 points to −0.98 points for immigrant households. This finding suggests that having an adequate provision of resources can help remedy issues of disadvantage affecting children’s educational performance within non-traditional households. At the same time, it also suggests that differences in socioeconomic resources only account for a small share of the difference in school grades among children.
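The size of this attenuation can be computed directly from the coefficients quoted above. The short sketch below expresses the drop between Model (1) and Model (2) as a percentage of the Model (1) gap. This is a simple descriptive ratio, not a formal mediation or decomposition analysis, and the function name `share_explained` is introduced here only for illustration.

```python
# Family-structure coefficients quoted in the text: (Model 1, Model 2).
coefficients = {
    "skipped_generation": (-11.26, -8.71),
    "single_parent": (-6.62, -5.22),
    "immigrant": (-1.37, -0.98),
}


def share_explained(without_resources, with_resources):
    """Percentage of the Model (1) gap that disappears in Model (2)."""
    return 100.0 * (without_resources - with_resources) / without_resources


attenuation = {
    family: round(share_explained(m1, m2), 1)
    for family, (m1, m2) in coefficients.items()
}
print(attenuation)
```

On these figures, controlling for home computers and books removes roughly 21% to 28% of each gap, so most of the difference between family structures remains after the adjustment.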

The coefficients of variables related to children’s background and personal characteristics cannot be interpreted causally (Westreich & Greenland, 2013), yet they still reveal some relationships between these variables and the outcome. Being male and having siblings are negatively associated with children’s school performance. In addition, guardians’ educational level is positively related to children’s educational performance. Relative to children whose guardians hold graduate school or higher qualifications (base category), children whose guardians have lower educational levels tend to record lower academic scores. Considering children’s background, speaking Chinese at home is positively associated with test scores. Owning at least one computer and a larger number of books is also positively associated with test scores; this is consistent with existing empirical evidence indicating that high-SES households, with sufficient material and financial resources, enable children’s high academic performance (Davis-Kean, 2005).

Variation across disciplines

While, together, these results show that children from non-traditional households tend to be more academically disadvantaged, variations exist across disciplines. Table 7 reveals that children from skipped generation and single-parent households are consistently more disadvantaged across all five disciplines under analysis, particularly in English and math. This may be because grandparents in skipped generation households tend to be out of the workforce, less well educated, and less proficient in the use of new technology; their knowledge of the national school curriculum is likely to be out of date, and they have a reduced ability to absorb new knowledge. They also tend to lack the financial resources to mitigate these deficiencies (Denton & Spencer, 2009). For children from single-parent households, poor academic performance may be explained by a likely reduced capacity of single guardians to balance work and childcare duties (Weiss, 1979). Single guardians are likely to face tighter time constraints than guardians in other family structures.

Table 7 Regression coefficients of different family structures by subject

Children from immigrant households achieve similar scores to those from two-parent families in math and social science, but they perform worse in English and science, scoring on average 2.17 and 1.39 points lower, respectively, than children from two-parent families. The poor performance of children from immigrant families in English seems to arise because a large proportion of immigrant households in Taiwan involve mothers from non-English-speaking countries in South-East Asia. English therefore represents a barrier for children from these families. However, children from immigrant families perform better in Chinese than children from two-parent families.

The association between the guardian’s education level and children’s academic performance is inconsistent across the five subjects. However, the associations of a home computer, the number of books owned, and speaking Chinese at home with children’s academic performance are consistently positive across the five subjects. Being male is negatively associated with performance in language subjects, while we found no evidence of an association between gender and performance in math, science, and social science.

Changes over time

Next, we analyzed the temporal persistence of the impact of family structures on children’s educational performance. Regression models were estimated separately for each year: 2007, 2010, and 2013. As in the models reported in Table 7, we added control variables relating to personal and household characteristics, but we report only the estimates for family structure types. Figure 3 reveals persistent disadvantages for children from skipped generation and single-parent households relative to children from two-parent households. Children from skipped generation households are systematically the most disadvantaged. In 2007, children from skipped generation households registered a score 9.74 points lower than children from two-parent households. The corresponding gap between children from single-parent and two-parent households was about 5 points.

Fig. 3 Test score gaps between family structures and two-parent households by year

Figure 3 also reveals a shift in the educational performance gap between children from immigrant households and those from two-parent families. In 2007, children from immigrant families scored 2.64 points lower than those from two-parent families. However, this situation appears to have improved since 2010, with the scores of children from immigrant families not significantly different from those of children from two-parent families. An increase in subsidy policies for immigrant families might explain this phenomenon. The Foreign Spouse Care and Guidance Fund was established by the Ministry of Interior National Immigration Agency in 2005 to promote care and counseling services for foreign spouses, support the training and development of related human resources, and encourage child care and multicultural promotion programs. The fund has increased year by year, from 171 million Taiwan dollars in 2011 and 214 million Taiwan dollars in 2012 to 336 million Taiwan dollars in 2013 (calculated by the Ministry of Interior National Immigration Agency). In particular, one program, the National New Immigrant Torch Project, has been promoted jointly by the Ministry of the Interior and the Ministry of Education since 2012. This project aims to help immigrants and their children learn extensively.

Conclusion

Rising rates of marital separation, population aging, and the growth of the foreign-born population have triggered a rise in the number of non-traditional households globally. The rapid proliferation of these family structures is eroding the prevalence of the traditional two-parent family structure, with wide-ranging implications for children’s academic performance. However, little progress has been made in understanding the association between various forms of non-traditional family structures and children’s school performance. This study sought to assess empirically the influence of four household structures (two-parent households, skipped generation households, single-parent households, and immigrant households) on cognitive knowledge (measured by test scores), examining variations across disciplines and persistence over time. To this end, we used a quasi-experimental design. To reduce the influence of confounders, we used a machine learning-based PSW approach to compare children from non-traditional family structures with those from two-parent families based on observed characteristics.

First, our findings revealed that, overall, children from skipped generation households are consistently the most academically disadvantaged. Children from two-parent households registered the highest academic scores, while children from single-parent households tended to achieve scores about five points lower and children from immigrant households reported scores 0.9 points lower than children from two-parent households. These findings are consistent with previous studies indicating that children from non-traditional family structures tend to record lower school achievements than children from two-parent households (McLanahan & Sandefur, 2009). This outcome reflects the financial disadvantage, lack of educational resources, and limited parental involvement that constrain the ability of non-traditional families to offer appropriate educational support for their children (Baker & Mutchler, 2010; Gornick & Jantti, 2010; Shields & Behrman, 2004).

Second, children from skipped generation households and single-parent households perform worse than children from two-parent households across all five disciplines of cognitive knowledge in our analysis. By contrast, children from immigrant households tend to achieve similar scores to children from two-parent households, except in English, Chinese, and science: they perform worse in English and science but better in Chinese. These findings are inconsistent with existing research in industrialized Western societies, which indicates that children of immigrants tend to perform worse in reading, math, and science (OECD, 2018). This contrasting finding could be traced to the fact that immigrants in Taiwan originate primarily from mainland China and hence share the same mother tongue. Language barriers, which are commonly a key factor undermining the learning abilities of immigrant children, do not represent an impediment for most immigrants in Taiwan.

Third, the poor academic performance of children from skipped generation households and single-parent households shows remarkable persistence over time. Children from skipped generation households and single-parent households consistently achieve scores up to nine points lower than children from two-parent households. In contrast, children from immigrant households have displayed the same level of performance as those from two-parent households since 2010. This seems to reflect the increasingly selective (e.g., well-educated) immigrant population in Taiwan. The percentage of immigrants who have completed at least a university degree grew from 33.9% in 2015 to over 35.5% in 2018 (Ministry of the Interior).Footnote 13 Holding a higher education qualification is typically associated with higher socioeconomic status which is, in turn, positively linked to higher educational performance and achievement (Azhar et al., 2014).

Implications

Education is a key factor in long-term economic success, and the association between family disruption and lower educational attainment raises the question of whether the sharp increase in family instability over the past two decades will have lasting negative consequences on the educational attainment of the next generation (Astone & McLanahan, 1991). Our research results have significant policy implications. First, our results provide a blueprint for the allocation of government budgets and the prioritization of funding for non-traditional family structures. It is necessary to prioritize the provision of support for skipped generation and single-parent households, as targeted support can help address the lasting negative effects of family structure on educational attainment. Moreover, the total budget allocated to the relevant subsidy packages should be increased, given the increasing number of skipped generation and single-parent households in Taiwan.

Second, our findings suggest that traditional two-parent families serve as the most appropriate family environment for children’s education. As the saying goes, “An ounce of prevention is worth a pound of cure.” Preventing two-parent families from becoming single-parent or other types of households is, therefore, a problem that deserves concern. According to the Ministry of the Interior of Taiwan, the divorce rate in 2021 was 2.3%, higher than the average divorce rate elsewhere in the world (1.8%). Moreover, the number of couples in Taiwan who divorced within 5 years of marriage reached a 10-year high in 2021. The data showed that 34.75% of all divorced couples ended their marriage after being married for less than 5 years. Preventing divorce is, therefore, a crucial issue for Taiwan authorities. Pre-marital education and marriage counseling, which are well established in other countries, are potential measures that Taiwan authorities should consider implementing to reduce divorce. In Singapore, for example, the Marriage Central program provides comprehensive consulting and support resources for life both before and after marriage, helping couples build sustainable relationships. In the USA, various programs are also designed to help parents. For example, Home Visitation Programs and Fatherhood Programs are being implemented to guide parents in their role and strengthen fathers’ awareness of their responsibility to the family.

Limitations and future research

The limitation of this research comes from the complexities of addressing the confounding effects of SES on the relationship between family structure and children’s academic performance. Socioeconomic status operates as a confounding variable due to its dual role in influencing both academic performance and changes in family structure. To mitigate this confounding, the propensity scores are applied in conjunction with an IPTW approach. This framework allows us to elucidate the impact of family structure on children’s academic performance while controlling for potential confounders.

However, a limitation arose during the calculation of the propensity scores. It would be ideal to consider socioeconomic status before the change in family structure, but it is difficult to determine socioeconomic status accurately at that specific time from the data used. Therefore, we assume that socioeconomic status remains consistent before and after the family structure change. Under this assumption, the SES metrics that are currently available act as proxies for SES before the change in family structure.

In addition, our study is built on a cross-sectional analysis, which reveals the persistence of academic disadvantage among children from skipped generation and single-parent households and the improvement in the academic performance of children from immigrant backgrounds. However, this cross-sectional perspective cannot capture the full spectrum of long-term cognitive development and future career outcomes for individuals from distinct family structures. This underscores an underexplored area of research, which calls for more in-depth investigation. As societies increasingly witness the emergence of non-traditional household structures, comprehending the sustained trajectories of children from these families becomes more significant. This understanding is crucial for formulating effective policy interventions that foster inclusive and sustainable economic and social advancement, aligning with the imperative for inclusive progress highlighted by Li and Qiu (2018).

To enhance the depth and robustness of future research, the construction of longitudinal datasets is needed. These datasets should include comprehensive information on SES dynamics, educational trajectories, and labor market performance. They would provide a more holistic exploration of individuals’ development from childhood to adulthood.