Wavelet-based gradient boosting

Statistics and Computing

Abstract

A new data science tool named wavelet-based gradient boosting is proposed and tested. The approach is a special case of componentwise linear least squares gradient boosting, and involves wavelet functions of the original predictors. Wavelet-based gradient boosting takes advantage of the approximate \(\ell _1\) penalization induced by gradient boosting to give appropriate penalized additive fits. The method is readily implemented in R and produces parsimonious and interpretable regression fits and classifiers.

Acknowledgments

We are grateful to Andrew Chernih for his provision of the Sydney residential property price data and to Peter Green for his comments on aspects of this research. Partial support was provided by Australian Research Council Discovery Project DP0877055. Assistance from the University of Technology, Sydney’s Distinguished Visitor programme is gratefully acknowledged.

Author information

Corresponding author

Correspondence to M. P. Wand.

Appendix: Highest-density region grids

We now provide details of the highest-density region (HDR) grids used in Figures 3 and 5.

Let \(\varvec{x}=(x_1,\ldots ,x_n)\) be a generic univariate sample and \({\widehat{p}}\) be a probability density estimate based on \(\varvec{x}\). Then a \(100(1-\tau )\%\) highest-density region estimate is

$$\begin{aligned} {\widehat{R}}_{\tau }=\{x\in {\mathbb R}:{\widehat{p}}(x)\ge {\widehat{p}}_{\tau }\} \end{aligned}$$

where \({\widehat{p}}_{\tau }\) is chosen so that the probability mass of \({\widehat{p}}\) over the set \({\widehat{R}}_{\tau }\) does not exceed \(1-\tau \). See, for example, Samworth and Wand (2010) for a precise mathematical definition of \({\widehat{p}}_{\tau }\).
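The thresholding idea behind \({\widehat{R}}_{\tau }\) can be illustrated numerically. The following is a minimal sketch rather than the definition of Samworth and Wand (2010): it assumes the density is evaluated on an equally spaced grid, approximates probability mass by a Riemann sum, and takes the threshold to be the density height at which the accumulated mass first reaches \(1-\tau \). The function names `hdr_threshold` and `hdr_points`, and the use of a standard normal density in place of \({\widehat{p}}\), are illustrative assumptions.

```python
import math


def hdr_threshold(density, grid, tau):
    """Approximate the HDR density threshold on an equally spaced grid.

    Assumption: mass is approximated by a Riemann sum over `grid`, and the
    threshold is the height at which the accumulated mass first reaches 1 - tau.
    """
    step = grid[1] - grid[0]
    # Visit grid cells from the highest density value to the lowest.
    heights = sorted((density(x) for x in grid), reverse=True)
    total = sum(heights) * step  # normalises away truncation of the grid
    mass = 0.0
    for height in heights:
        mass += height * step / total
        if mass >= 1.0 - tau:
            return height
    return heights[-1]


def hdr_points(density, grid, tau):
    """Grid points whose density is at least the approximate threshold."""
    cutoff = hdr_threshold(density, grid, tau)
    return [x for x in grid if density(x) >= cutoff]


def std_normal(x):
    """Standard normal density, standing in for a density estimate."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


grid = [-6.0 + 0.001 * i for i in range(12001)]
cutoff = hdr_threshold(std_normal, grid, tau=0.2)
region = hdr_points(std_normal, grid, tau=0.2)
# For the standard normal the 80% HDR is the central interval, so the
# endpoints should be close to the 0.1 and 0.9 quantiles (about -1.28 and 1.28).
print(cutoff, region[0], region[-1])
```

For a unimodal density such as this one the region is a single interval; for a multimodal density the same thresholding yields a union of intervals, which is why the region is defined through the density height rather than through quantiles.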

The most commonly used estimator \({\widehat{p}}\) for HDR estimation is the kernel density estimator

$$\begin{aligned} {\widehat{p}}(x)=\frac{1}{nh}\sum _{i=1}^n K\left( \frac{x-x_i}{h}\right) \end{aligned}$$

where \(K\) is a kernel function and \(h>0\) is a bandwidth (see e.g. Wand and Jones 1995). Recently, Samworth and Wand (2010) devised an automatic rule for selection of \(h\) in the HDR estimation context. The R package hdrcde (Hyndman 2010) implements both HDR estimation and the Samworth-Wand bandwidth selector. Figure 8 shows an 80% HDR estimate for the distance to coastline variable in the Sydney residential property prices data. The corresponding HDR grid of size 50 is shown at the base of the plot.
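The kernel density estimator formula above can be sketched in a few lines. This is an illustration only, under stated assumptions: \(K\) is taken to be the standard normal density, the bandwidth is supplied by hand rather than chosen by the Samworth-Wand selector, and the sample is a small made-up example rather than the Sydney property data. The function name `gaussian_kde` is hypothetical and does not refer to the hdrcde package.

```python
import math


def gaussian_kde(sample, h):
    """Return the estimate p_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h).

    Assumption: K is the standard normal density and h > 0 is user-supplied.
    """
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    # 1 / (n h) combined with the Gaussian normalising constant.
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def p_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return p_hat


# Hypothetical toy sample and hand-picked bandwidth, for illustration only.
toy_sample = [0.0, 1.0, 3.0]
p_hat = gaussian_kde(toy_sample, h=0.5)

# Riemann-sum check that the estimate integrates to approximately one.
xs = [-10.0 + 0.01 * i for i in range(2001)]
approx_mass = sum(p_hat(x) for x in xs) * 0.01
print(approx_mass)
```

In practice the bandwidth matters far more than the choice of kernel, which is the motivation for a selector tailored to HDR estimation.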

Fig. 8

An HDR grid for the predictor variable distance to coastline in the Sydney real estate data example. The data are shown as red dots with vertical jittering to enhance visualization. The green curve is a kernel density estimate and the blue bars are the estimated 80% highest-density region. The rug at the base of the plot is the corresponding HDR grid of size 50.

Cite this article

Dubossarsky, E., Friedman, J.H., Ormerod, J.T. et al. Wavelet-based gradient boosting. Stat Comput 26, 93–105 (2016). https://doi.org/10.1007/s11222-014-9474-0

  • DOI: https://doi.org/10.1007/s11222-014-9474-0
