doi:10.1016/S0167-9473(03)00042-2
Copyright © 2003 Elsevier B.V. All rights reserved.
Implementing the Bianco and Yohai estimator for logistic regression
a Department of Applied Economics, Katholieke Universiteit Leuven, Naamsestraat 69, B-3000, Leuven, Belgium
b Department of Mathematics, University of Liège (B37), Grande Traverse 12, B-4000, Liège, Belgium
Received 31 July 2002; revised 19 February 2003. Available online 20 March 2003.
Abstract
A fast and stable algorithm to compute a highly robust estimator for the logistic regression model is proposed. A criterion for the existence of this estimator in finite samples is derived, and the problem of selecting an appropriate loss function is discussed. It is shown that the loss function can be chosen such that the robust estimator exists if and only if the maximum likelihood estimator exists. The advantages of using a weighted version of this estimator are also considered. Simulations and an example give further support for the good performance of the implemented estimators.
Author Keywords: Robust estimation; Influence function; Logistic regression; Maximum likelihood
Fig. 1. φ functions for the ML (left) and BY (right) estimators.
Fig. 2. ψ functions for the ML and BY estimators.
Fig. 3. φ and ψ functions for the BY estimator with the newly proposed ρ function (2.3).
Fig. 4. Influence function for (a) the ML estimator and (b) the BY estimator of the first slope parameter at a logistic regression model with regressors from a standard bivariate normal distribution.
Fig. 5. Simulated sample of size n=100, where p=2, with six extreme bad leverage points added. The explanatory variables (x_i1, x_i2) are indicated by the corresponding value of y_i, and the solid line gives the discriminating hyperplane.
Table 1. Biases and mean-squared errors of the estimators ML, WML, BY, WBY, CUBIF and MALLOWS over 1000 simulations, in an uncontaminated situation (I), with 5% of intermediate contamination (II) and with 5% of extreme contamination (III)

Table 2. Estimated parameters, standard errors (SE), and goodness-of-fit measures for the skin data set, for the ML, WML, BY, WBY, CUBIF and MALLOWS estimators

Table 3. Estimated parameters for the skin data set based on the BY estimator for different values of the tuning constant d
