Investigating the distribution of the value of travel time savings

https://doi.org/10.1016/j.trb.2005.09.007

Abstract

The distribution of the value of travel time savings (VTTS) is investigated by applying various nonparametric techniques to a large dataset originating from a stated choice experiment. The data contain choices between a fast and more expensive alternative and a slow and less expensive alternative. Increasing the implicit price of time leads to an increased share of respondents who decline to pay to save time. But a significant proportion of respondents, 13%, remain willing to pay to save time at the highest price of time in the design. This means that the right tail of the VTTS distribution is not observed and hence the mean VTTS cannot be evaluated without additional assumptions. When socio-economic and situational variables are introduced into a semiparametric model, it becomes possible to accept that the whole VTTS distribution is observed.

Sixteen candidates for parametric VTTS distributions are investigated. Some distributions are simply unable to fit the empirical distribution, while others have very long tails. The mean VTTS is shown to be extremely dependent on the choice of parametric distribution. Requiring that the parametric distribution should be accepted against the nonparametric alternative narrows the field down to five candidates. One of the distributions tested here also has support within the observed range, so that the estimated VTTS is bounded by the data.

Introduction

The value of travel time savings (VTTS) is arguably the single most important number in transport economics. Travel time savings are usually a dominant user benefit and hence constitute a very large share of total benefits in cost-benefit analyses of infrastructure projects (Hensher, 2001a; Mackie et al., 2001). Cost-benefit analyses are in turn a main part of the information provided to decision makers on new projects. It is not only the average VTTS that is important but also its distribution, e.g., when forecasting market share for a tolled road (Hensher and Goodwin, 2004).

The VTTS is usually inferred from experimental data using discrete choice models (Gunn, 2000). Recently the mixed logit has become the model of choice, since it allows for considerable improvements over the logit model in both realism and ability to describe the data (Train, 2003). The mixed logit model works by allowing certain parameters in the logit model to vary randomly in the population according to some parametric distribution. The (hyper)parameters for this mixing distribution can then be estimated.
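As a rough illustration of the mixing idea, the sketch below simulates a binary mixed logit choice probability by averaging ordinary logit probabilities over random draws of a time coefficient. The function name, the lognormal mixing distribution and all parameter values are assumptions made for this example only; they are not the specification used in the paper.

```python
import numpy as np

def simulated_mixed_logit_prob(delta_time, delta_cost, beta_cost, mu, sigma,
                               n_draws=5000, seed=0):
    """Simulated P(choose alternative 1) in a binary mixed logit.

    delta_time, delta_cost: attribute differences (alternative 1 minus 2).
    beta_cost: fixed cost coefficient (assumed negative).
    mu, sigma: hyperparameters of the (assumed) lognormal mixing distribution
               for the magnitude of the time coefficient.
    """
    rng = np.random.default_rng(seed)
    # Random time coefficient: negative, lognormally distributed magnitude
    beta_time = -np.exp(mu + sigma * rng.standard_normal(n_draws))
    # Utility difference for each draw of the random coefficient
    delta_u = beta_cost * delta_cost + beta_time * delta_time
    # Average the logit probability over the draws
    return float(np.mean(1.0 / (1.0 + np.exp(-delta_u))))

# Alternative 1 is 10 minutes faster and 20 money units more expensive
p_fixed = simulated_mixed_logit_prob(-10.0, 20.0, -0.05, np.log(0.1), 0.0)
p_mixed = simulated_mixed_logit_prob(-10.0, 20.0, -0.05, np.log(0.1), 1.0)
p_dearer = simulated_mixed_logit_prob(-10.0, 40.0, -0.05, np.log(0.1), 1.0)
```

With `sigma` set to zero the time coefficient is the same for everyone and the expression reduces to a plain logit probability; in estimation, `mu` and `sigma` would be the hyperparameters to be fitted.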

There remains, however, the problem of deciding which mixing distribution to specify; some common choices are normal, lognormal, beta, Johnson’s SB or triangular (Hess et al., 2005). The choice of mixing distribution can have considerable impact on results (Hensher, 2001b; see also Heckman and Singer, 1984), but little evidence exists to guide this choice.

This paper avoids strong prior distributional assumptions by employing nonparametric and semiparametric techniques to estimate the mean VTTS and study the distribution of the VTTS in a dataset originating from a stated choice experiment. The analysis is facilitated by a particularly simple experimental design where respondents in effect state whether their VTTS is higher or lower than a bid value: the trade-off between time and money set by the design. From this information it is possible to estimate nonparametrically the cumulative distribution of the VTTS in the sample with convincing results.
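A minimal sketch of this idea: if y = 1 is recorded when a respondent's VTTS lies below the bid v, then the share of y = 1 at each bid is a raw estimate of the cumulative distribution at that bid. The function name, the bid values and the lognormal VTTS below are hypothetical and serve only as an illustration; they are not taken from the Danish data.

```python
import numpy as np

def empirical_cdf_by_bid(v, y):
    """Share of y == 1 at each distinct bid, a raw estimate of P(w < v)."""
    v = np.asarray(v, dtype=float)
    y = np.asarray(y, dtype=float)
    bids = np.unique(v)  # sorted distinct bid values
    shares = np.array([y[v == b].mean() for b in bids])
    return bids, shares

# Synthetic example: latent VTTS w is never observed, only y = 1{w < v}
rng = np.random.default_rng(0)
design_bids = np.array([10.0, 25.0, 50.0, 100.0, 200.0])  # hypothetical bids
v = rng.choice(design_bids, size=5000)
w = rng.lognormal(mean=3.5, sigma=1.0, size=5000)          # assumed VTTS
y = (w < v).astype(int)

bids, shares = empirical_cdf_by_bid(v, y)
for b, s in zip(bids, shares):
    print(f"bid {b:6.1f}: share with VTTS below bid = {s:.3f}")
```

In real data the raw shares need not be monotone in the bid, and a design with many distinct bids calls for smoothing across neighbouring bids rather than cell-by-cell averages.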

It is, however, only possible to estimate the cumulative VTTS distribution function up to the maximum bid. With the current data it is found that the maximum bid corresponds to about the 87% quantile of the cumulative VTTS distribution, so the right tail is not observed. But it is necessary to know the whole distribution in order to estimate the mean VTTS. Varying assumptions about the unobserved tail can lead to arbitrarily high estimates of the mean VTTS. The paper shows this by fitting a number of parametric distributions to the data. Some of these distributions are accepted by a specification test against the nonparametric alternative, meaning that these would be suitable for predicting choices over the range of bids.
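One way to see why the unobserved tail matters is the following decomposition. It assumes, for illustration only, that the VTTS w is nonnegative with cumulative distribution function F and a finite mean, and it writes v_max for the maximum bid; this notation is introduced here and is not taken from the paper.

```latex
\begin{align*}
E[w] &= \int_0^{\infty} \bigl(1 - F(v)\bigr)\,dv \\
     &= \underbrace{\int_0^{v_{\max}} \bigl(1 - F(v)\bigr)\,dv}_{\text{identified from the data}}
      + \underbrace{\int_{v_{\max}}^{\infty} \bigl(1 - F(v)\bigr)\,dv}_{\text{not identified}}
\end{align*}
```

With 1 - F(v_max) of about 0.13, the second term is at least zero but has no upper bound implied by the data, so under these assumptions the first term is only a lower bound on the mean.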

The paper goes on to specify a semiparametric model describing individual VTTS as a random component and a systematic component depending on socio-economic and situational variables. With this model it becomes possible to accept that the whole distribution of the random VTTS component is observed. Again, a range of parametric distributions are tested against the nonparametric alternative.

There has been little application of nonparametric and semiparametric estimators in the transport literature. Hensher and Greene (2003) stress the importance of selecting parameter distributions in mixed logit modeling and suggest applying a kernel density estimator to parameter estimates after applying a jackknife procedure to a multinomial logit model. This method allows one to inspect the distribution of parameters visually, although without confidence bands on the estimated densities. Their findings further suggest that a wide range for the variables in a stated choice design is preferable, something which the results here also indicate.

Introductions to nonparametric and semiparametric econometrics are given, e.g., in Yatchew (2003), Pagan and Ullah (1999) and Härdle (1990). Consider the regression model y = f(x) + ε, where the point of interest is to determine the function f. Classical OLS regression assumes that the function f is linear in parameters and estimates these. Nonparametric kernel smoothers avoid such parametric assumptions by instead averaging the y’s in a neighborhood of each x. The average of the y’s then converges to f(x) under weak assumptions. This is a data-hungry procedure, especially as the number of dimensions in x grows. Therefore semiparametric methods have been developed as a hybrid between parametric and nonparametric regression where just some of the relationship is modeled nonparametrically.
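The local averaging described above can be sketched with a Nadaraya-Watson kernel smoother. The Gaussian kernel, the fixed bandwidth and the synthetic sine-curve data are assumptions chosen for this illustration; the text does not state which kernel or bandwidth the paper uses.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Kernel-weighted local average of y_train at each point in x_eval."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # Scaled distance between every evaluation point and every observation
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    # Gaussian kernel weights (the normalising constant cancels in the ratio)
    weights = np.exp(-0.5 * u**2)
    return (weights @ y_train) / weights.sum(axis=1)

# Synthetic data: y = f(x) + noise with f unknown to the estimator
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=2000)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=2000)

grid = np.linspace(0.1, 0.9, 9)
f_hat = nadaraya_watson(x, y, grid, bandwidth=0.05)
```

A smaller bandwidth averages over fewer neighbours and reduces bias at the cost of higher variance, which is one reason the method needs a lot of data when x has several dimensions.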

The results presented here rely on a two-step estimation procedure suggested by Lewbel et al. (2002). In the first step the Klein and Spady (1993) estimator is used to estimate parameters in a linear index binary choice model without assuming a distribution for the error terms. In the second step the distribution of the error term is estimated. Details are given below in the section on semiparametric methodology.
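To give a feel for the first step, the sketch below evaluates a quasi log-likelihood in the spirit of Klein and Spady (1993): the choice probability given the index is replaced by a leave-one-out kernel estimate, so no error distribution has to be assumed. It is a simplified illustration and not the paper's implementation. The fixed bandwidth, the Gaussian kernel, the absence of trimming, the function name and the synthetic data are all assumptions, and the index form log v - βx follows the model y = 1 if w < v with log(w) = βx + u described in the section snippets.

```python
import numpy as np

def quasi_loglik(beta, x, log_v, y, bandwidth=0.2):
    """Klein-Spady-style quasi log-likelihood for y = 1{u < log_v - x @ beta}."""
    index = log_v - x @ beta
    # Gaussian kernel weights between all pairs of index values
    diff = (index[:, None] - index[None, :]) / bandwidth
    k = np.exp(-0.5 * diff**2)
    np.fill_diagonal(k, 0.0)  # leave-one-out: drop each observation's own weight
    # Nonparametric estimate of P(y = 1 | index) at each observation
    g = (k @ y) / np.maximum(k.sum(axis=1), 1e-12)
    g = np.clip(g, 1e-6, 1 - 1e-6)  # keep the logarithms finite
    return float(np.sum(y * np.log(g) + (1 - y) * np.log(1 - g)))

# Synthetic data with one covariate and a normal error (assumed for the demo)
rng = np.random.default_rng(2)
n = 1500
x = rng.normal(size=(n, 1))
log_v = rng.uniform(-2.0, 2.0, size=n)
u = rng.normal(scale=0.5, size=n)
beta_true = np.array([1.0])
y = (u < log_v - x @ beta_true).astype(float)

ll_true = quasi_loglik(beta_true, x, log_v, y)
ll_zero = quasi_loglik(np.zeros(1), x, log_v, y)
```

In an actual estimation the function would be maximised over `beta`, for example by a grid search or a numerical optimiser; here it is only compared at two parameter values.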

The paper is organized as follows. Section 2 sets out the methodology. Section 3 presents a large dataset collected in a recent Danish value of travel time study, which is used in Section 4 to investigate the stochastic distribution of the value of time without using covariates. Section 5 introduces covariates to explain the distribution and estimate the mean VTTS and Section 6 concludes.

Section snippets

Transformation of the data to contingent valuation format

Our data come from a stated choice exercise where respondents are presented with binary choice situations (Burge et al., 2004). The data are transformed into a format similar to contingent valuation data, where we observe the choice y = 1 if a random latent w is smaller than a bid or boundary VTTS labeled v, which is set by experimental design. Alternatives 1 and 2 differ by travel time t_i and cost c_i only; they are otherwise the same. The conventional model for such data is the binary (mixed)

Data

The data are extracted from a recent Danish value of time study undertaken for the Danish Ministry of Transport by the Danish consultancy TetraPlan in joint venture with Rand Europe and Gallup. Stated preference interviews were conducted both via the Internet and as computer-aided personal interviews. Business travel is excluded. The stated preference design is discussed in Burge et al. (2004). Here only data from one experiment are used. From the data, a sample of 2197 interviews is selected

Nonparametric regression

First consider the nonparametric regression of y on v shown in Fig. 4. Before the regression, v is transformed to logs; this affects the regression through the bandwidth such that in effect the bandwidth is larger for large v where data are sparser. In the estimation, data have been rescaled to lie within the unit interval and a

A semiparametric model

We have seen that the data do not allow identification of the whole distribution of w since the range of bids v does not extend over the range of VTTS in w. In this section the model is expanded by the inclusion of various covariates in a semiparametric model combining some parameterization with an additive nonparametric error. As indicated in Section 2, this is achieved by specifying a model where log(w) is split into a linear index plus an independent error, i.e. log(w) = βx + u, where u is

Conclusions

This paper has demonstrated the application of various nonparametric and semiparametric methods to the estimation of the distribution of the value of travel time savings and further to the estimation of the mean of the distribution. It is shown to be possible to estimate the VTTS distribution quite precisely with narrow confidence bands. A number of parametric distributions are accepted for the current data against the nonparametric alternative. However, when using no covariates except for the

References (26)

  • Fosgerau, M., 2005. Unit income elasticity of the value of travel time savings. In: Proceedings of the European...
  • Gunn, H.F. An introduction to the valuation of travel-time savings and losses.
  • Heckman, J., et al., 1984. A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica.