Generalised linear model trees with global additive effects

Seibold, Heidi; Hothorn, Torsten; Zeileis, Achim

doi:10.1007/s11634-018-0342-1

Regular Article
Published: 05 October 2018

Advances in Data Analysis and Classification, Volume 13, pages 703–725 (2019)

Abstract

Model-based trees are used to find subgroups in data which differ with respect to model parameters. In some applications it is natural to keep some parameters fixed globally for all observations while asking if and how other parameters vary across subgroups. Existing implementations of model-based trees can only deal with the scenario where all parameters depend on the subgroups. We propose partially additive linear model trees (PALM trees) as an extension of (generalised) linear model trees (LM and GLM trees, respectively), in which the model parameters are specified a priori to be estimated either globally from all observations or locally from the observations within the subgroups determined by the tree. Simulations show that the method has high power for detecting subgroups in the presence of global effects and reliably recovers the true parameters. Furthermore, treatment–subgroup differences are detected in an empirical application of the method to data from a mathematics exam: the PALM tree is able to detect a small subgroup of students that had a disadvantage in an exam with two versions while adjusting for overall ability effects.
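
The distinction between globally and locally estimated parameters can be written compactly. The following is only a sketch in notation assumed here for illustration; the symbols $g$, $\mathbf{x}_{V,i}$, $\mathbf{x}_{F,i}$, $\boldsymbol{\beta}_j$, $\boldsymbol{\gamma}$ and $j(i)$ are not taken from the text above:

$$
g\bigl(\mathbb{E}[y_i]\bigr) \;=\; \mathbf{x}_{V,i}^\top \boldsymbol{\beta}_{j(i)} \;+\; \mathbf{x}_{F,i}^\top \boldsymbol{\gamma}, \qquad i = 1, \dots, n .
$$

Here $g$ is a link function, $j(i)$ is the subgroup (terminal node of the tree) containing observation $i$, $\boldsymbol{\beta}_j$ collects the parameters estimated locally within subgroup $j$, and $\boldsymbol{\gamma}$ collects the parameters estimated globally from all observations. Under this reading, dropping the $\mathbf{x}_{F,i}$ term gives the scenario in which all parameters depend on the subgroups.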

References

Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Wadsworth, Pacific Grove
Chen J, Yu K, Hsing A, Therneau TM (2007) A partially linear tree-based regression model for assessing complex joint gene–gene and gene–environment effects. Genet Epidemiol 31(3):238–251. https://doi.org/10.1002/gepi.20205
Doove LL, Dusseldorp E, Van Deun K, Van Mechelen I (2014) A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions. Adv Data Anal Classif 8(4):403–425. https://doi.org/10.1007/s11634-013-0159-x
Dusseldorp E, Conversano C (2018) Stima: Simultaneous Threshold Interaction Modeling Algorithm. R package version 1.2. https://CRAN.R-project.org/package=stima
Dusseldorp E, Conversano C, Van Os BJ (2010) Combining an additive and tree-based regression model simultaneously: STIMA. J Comput Graph Stat 19(3):514–530. https://doi.org/10.1198/jcgs.2010.06089
Fokkema M, Smits N, Zeileis A, Hothorn T, Kelderman H (2018) Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees. Behav Res Methods 50(5):2016–2034. https://doi.org/10.3758/s13428-017-0971-x
Hajjem A, Bellavance F, Larocque D (2011) Mixed effects regression trees for clustered data. Stat Probab Lett 81(4):451–459. https://doi.org/10.1016/j.spl.2010.12.003
Holloway ST, Laber EB, Linn KA, Zhang B, Davidian M, Tsiatis AA (2015) DynTxRegime: methods for estimating dynamic treatment regimes. R package version 2.1. https://CRAN.R-project.org/package=DynTxRegime
Hothorn T, Zeileis A (2015) partykit: a modular toolkit for recursive partytioning in R. J Mach Learn Res 16:3905–3909
Hothorn T, Hornik K, Zeileis A (2006) Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 15(3):651–674. https://doi.org/10.1198/106186006X133933
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218. https://doi.org/10.1007/BF01908075
Italiano A (2011) Prognostic or predictive? It’s time to get back to definitions! J Clin Oncol 29(35):4718–4718. https://doi.org/10.1200/JCO.2011.38.3729
Lang M, Bischl B, Surmann D (2017) batchtools: tools for R to work on batch systems. J Open Source Softw. https://doi.org/10.21105/joss.00135
Lipkovich I, Dmitrienko A, D’Agostino RB (2016) Tutorial in biostatistics: data-driven subgroup identification and analysis in clinical trials. Stat Med. https://doi.org/10.1002/sim.7064
Loh WY (2002) Regression trees with unbiased variable selection and interaction detection. Stat Sin 12(2):361–386
Mbogning C, Toussile W (2015) GPLTR: generalized partially linear tree-based regression model. R package version 1.2. https://CRAN.R-project.org/package=GPLTR
Milligan GW, Cooper MC (1986) A study of the comparability of external criteria for hierarchical cluster analysis. Multivar Behav Res 21(4):441–458. https://doi.org/10.1207/s15327906mbr2104_5
Seibold H, Zeileis A, Hothorn T (2016) Model-based recursive partitioning for subgroup analyses. Int J Biostat 12(1):45–63. https://doi.org/10.1515/ijb-2015-0032
Seibold H, Hothorn T, Zeileis A (2017) palmtree: partially additive (generalized) linear model trees. R package version 0.9-0. https://CRAN.R-project.org/package=palmtree
Sela RJ, Simonoff JS (2012) RE-EM trees: a data mining approach for longitudinal and clustered data. Mach Learn 86(2):169–207. https://doi.org/10.1007/s10994-011-5258-3
Sies A, Van Mechelen I (2017) Comparing four methods for estimating tree-based treatment regimes. Int J Biostat Online First. https://doi.org/10.1515/ijb-2016-0068
Zeileis A, Hornik K (2007) Generalized M-fluctuation tests for parameter instability. Stat Neerl 61(4):488–508. https://doi.org/10.1111/j.1467-9574.2007.00371.x
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2):492–514. https://doi.org/10.1198/106186008X319331
Zhang B, Tsiatis AA, Davidian M, Zhang M, Laber E (2012) Estimating optimal treatment regimes from a classification perspective. Stat 1(1):103–114. https://doi.org/10.1002/sta.411

Acknowledgements

We thank Andrea Farnham for improving the language. We are thankful to the Swiss National Fund for funding this Project with Grants 205321_163456 and IZSEZ0_177091 and mobility Grant 205321_163456/2.

Author information

Authors and Affiliations

Department of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Hirschengraben 84, 8001, Zurich, Switzerland
Heidi Seibold & Torsten Hothorn
Department of Statistics, Faculty of Economics and Statistics, Universität Innsbruck, Universitätsstr. 15, 6020, Innsbruck, Austria
Achim Zeileis
Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig-Maximilians-Universität München, Marchioninistr. 15, 81377, Munich, Germany
Heidi Seibold

Corresponding author

Correspondence to Heidi Seibold.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 3 KB)

Supplementary material 2 (md 1 KB)

Supplementary material 3 (R 6 KB)

Supplementary material 4 (R 7 KB)

Supplementary material 5 (R 4 KB)

Supplementary material 6 (R 0 KB)

Supplementary material 7 (R 2 KB)

Supplementary material 8 (R 4 KB)

Supplementary material 9 (R 14 KB)

Supplementary material 10 (R 8 KB)

Supplementary material 11 (R 9 KB)

Appendices

Full factorial simulation

The simulation study described in Sect. 3 takes a ceteris paribus approach and varies one simulation variable at a time while keeping the others at a standard value. We ran an additional simulation study in which all variables vary, which leads to $8 \cdot 5 \cdot 2 \cdot 4 \cdot 4 \cdot 4 = 5120$ different scenarios (see Table 1). For each scenario we simulated two data sets and ran all algorithms on each. In the following we show a small selection of interesting graphics based on the simulations. For the full results of the simulation studies we refer to the online material.
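
The size of the full factorial design can be checked by enumerating the grid. The sketch below is only an illustration: the level counts come from the product stated above, while the names `LEVELS_PER_VARIABLE`, `REPETITIONS` and `scenario_grid` are hypothetical, and the mapping of each count to a concrete simulation variable (given in Table 1 of the article) is deliberately left out.

```python
from itertools import product
from math import prod

# Number of levels per simulation variable, in the order of the product
# stated in the text. Which variable each count belongs to is listed in
# Table 1 of the article and is not reproduced here.
LEVELS_PER_VARIABLE = (8, 5, 2, 4, 4, 4)
REPETITIONS = 2  # data sets simulated per scenario, as stated in the text


def scenario_grid(levels):
    """Yield every combination of level indices, one index per variable."""
    return product(*(range(k) for k in levels))


# Enumerate the grid and count the scenarios and simulated data sets.
n_scenarios = sum(1 for _ in scenario_grid(LEVELS_PER_VARIABLE))
n_datasets = n_scenarios * REPETITIONS

print(n_scenarios, n_datasets)
```

Enumerating the grid rather than only multiplying the counts mirrors how such a design would typically be turned into a job list, one entry per scenario and repetition.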

Figure 8 shows the marginal results of the adjusted Rand index (ARI) for $\varDelta_\beta$, the number of predictive factors, the number of observations and quantitative versus qualitative interactions. We average over the other simulation variables and the two repetitions. For ease of visualisation, we restrict the plotted variables to a few levels. Similarly, Figs. 9 and 10 show the marginal results of the proportion of correct treatment assignment and the mean absolute error (MAE) in the estimated treatment effect for the number of predictive factors, $\varDelta_\beta$, the number of observations and quantitative versus qualitative interactions. Figure 11 shows the MAE for $n = 900$ and one prognostic factor, illustrating when LM tree 1 starts to improve (see Sect. 3.4).

Figure 8 shows that PALM tree can handle simple subgroups with one predictive factor even when the number of observations is low, provided the difference in treatment effects is reasonably large. All other algorithms perform worse, with LM tree 2 and STIMA being the strongest competitors in the low-$n$ scenarios. OTR performs reasonably well if qualitative subgroups are present. For $n = 500$ the performance of PALM tree already rises at lower levels of $\varDelta_\beta$. The performance of PALM tree and LM tree 2 is very similar, and STIMA also performs well. By design, OTR ignores any non-qualitative subgroups.
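
For readers unfamiliar with the ARI, the following is a minimal sketch of the adjusted Rand index in the form given by Hubert and Arabie (1985), which compares a true subgroup partition with the partition found by a tree. The function name and the handling of degenerate cases are assumptions made for this illustration and are not taken from the article's supplementary code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # assumption: no pairs to disagree on

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Pair counts within cells, within true groups, within predicted groups.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # index expected by chance
    maximum = (sum_rows + sum_cols) / 2          # largest attainable index
    if maximum == expected:
        return 1.0  # assumption: both partitions are trivial and agree
    return (sum_cells - expected) / (maximum - expected)


# Relabelled but identical partitions agree perfectly.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

An ARI of 1 means the recovered subgroups coincide with the true ones up to relabelling, while values near 0 correspond to agreement no better than chance.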

When quantitative treatment subgroups exist, all methods are good at deciding the correct treatment regime (see Fig. 9), especially when the number of observations is reasonably high ($n = 300$). With $n = 100$, PALM tree, LM tree 2, STIMA and even LM tree 1 still perform very well. OTR is the weakest competitor here. With low numbers of observations ($n = 100$), low treatment effect differences ($\varDelta_\beta = 0.5$) and qualitative differences, the performance of all algorithms is close to random guessing (0.5), irrespective of the number of predictive factors. With higher $\varDelta_\beta$, PALM tree performs reasonably well, followed by LM tree 2, STIMA and OTR (with the order depending on the number of predictive factors). For $n = 300$ and $\varDelta_\beta = 0.5$, STIMA and LM tree 1 perform worst, but STIMA catches up with the other algorithms when $\varDelta_\beta = 1.5$, whereas LM tree 1 stays at the bottom. Section 3.3 discusses these results in the context of the results in the star-like simulation study.
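
To make the "proportion of correct treatment assignment" concrete, here is a small sketch under a stated assumption: an observation counts as correctly assigned when the sign of its estimated treatment effect matches the sign of its true treatment effect, so that the treatment is recommended exactly when it is truly beneficial. This decision rule, the function name and the example values are hypothetical and may differ from the definition used in Sect. 3 of the article.

```python
def proportion_correct_assignment(true_effects, estimated_effects):
    """Share of observations whose recommended treatment matches the truth.

    Assumption: treatment is recommended when the effect is positive, so an
    assignment is correct when estimated and true effect have the same sign.
    """
    if len(true_effects) != len(estimated_effects):
        raise ValueError("both sequences must have the same length")
    if not true_effects:
        raise ValueError("at least one observation is required")
    correct = sum(
        (t > 0) == (e > 0) for t, e in zip(true_effects, estimated_effects)
    )
    return correct / len(true_effects)


# Hypothetical qualitative interaction: the true effect changes sign.
true_effects = [1.5, 1.5, -1.5, -1.5]
estimated_effects = [0.8, -0.2, -1.0, 0.3]
print(proportion_correct_assignment(true_effects, estimated_effects))
```

Under this rule a quantitative interaction, where all true effects share one sign, is easy to get right, whereas a qualitative interaction requires the sign change itself to be detected.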

Figs. 10 and 11 were already partly discussed in Sect. 3.4. Figure 10 shows that across different scenarios the MAE increases with an increasing number of predictive factors. PALM tree is among the best performers everywhere. Compared with the other algorithms, it performs particularly well in low-$n$ qualitative scenarios with $\varDelta_\beta = 1.5$.
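
The MAE of the estimated treatment effect can be illustrated in a few lines. The sketch assumes that the error is averaged over observations, comparing each observation's true treatment effect with the effect estimated for the subgroup it falls into; the function name and the numbers are hypothetical.

```python
def mean_absolute_error(true_effects, estimated_effects):
    """Average absolute difference between true and estimated effects."""
    if len(true_effects) != len(estimated_effects):
        raise ValueError("both sequences must have the same length")
    if not true_effects:
        raise ValueError("at least one observation is required")
    total = sum(abs(t - e) for t, e in zip(true_effects, estimated_effects))
    return total / len(true_effects)


# Hypothetical per-observation treatment effects in two true subgroups.
true_effects = [1.0, 1.0, -0.5, -0.5]
estimated_effects = [0.5, 1.5, -0.5, 0.0]
print(mean_absolute_error(true_effects, estimated_effects))
```

Unlike the proportion of correct assignments, this measure also penalises estimates that have the right sign but the wrong magnitude.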

Computation times

The computation times for all methods except STIMA are very reasonable in these applications. For a summary of computation times in the full factorial design, see Table 3. STIMA reached a maximum of 17.4 h, and almost half of the models took half an hour or longer.

Table 3 Quantiles of computation times per algorithm in seconds

About this article

Cite this article

Seibold, H., Hothorn, T. & Zeileis, A. Generalised linear model trees with global additive effects. Adv Data Anal Classif 13, 703–725 (2019). https://doi.org/10.1007/s11634-018-0342-1

Received: 17 March 2017
Revised: 03 September 2018
Accepted: 04 September 2018
Published: 05 October 2018
Issue Date: 01 September 2019
DOI: https://doi.org/10.1007/s11634-018-0342-1
