DOI: 10.1145/1348599.1348601
research-article

More bang for their bucks: assessing new features for online advertisers

Published: 12 August 2007

ABSTRACT

Online search systems that display ads continually offer new features that advertisers can use to fine-tune and enhance their ad campaigns. An important question is whether a new feature actually helps advertisers. In an ideal world for statisticians, we would answer this question by running a statistically designed experiment. But that would require randomly choosing a set of advertisers and forcing them to use the feature, which is not realistic. Accordingly, in the real world, new features for advertisers are seldom evaluated with a traditional experimental protocol. Instead, customer service representatives select advertisers who are invited to be among the first to test a new feature (i.e., white-listed), and then each white-listed advertiser chooses whether or not to use the new feature. Neither the customer service representative nor the advertiser chooses at random.

This paper addresses the problem of drawing valid inferences from whitelist trials about the effects of new features on advertiser happiness. We are guided by three principles. First, statistical procedures for whitelist trials are likely to be applied in an automated way, so they should be robust to violations of modeling assumptions. Second, standard analysis tools should be preferred over custom-built ones, both for clarity and for robustness. Standard tools have withstood the test of time and have been thoroughly debugged. Finally, it should be easy to compute reliable confidence intervals for the estimator. We review an estimator that has all these attributes, allowing us to make valid inferences about the effects of a new feature on advertiser happiness. In the example in this paper, the new feature was introduced during the holiday shopping season, thereby further complicating the analysis.

References

  1. O. Ashenfelter and D. Card. Using the longitudinal structure of earnings to estimate the effect of training programs. Review of Economics and Statistics, 67:648--660, 1985.
  2. O. Ashenfelter. Estimating the effect of training programs on earnings. Review of Economics and Statistics, 60(1):47--57, 1978.
  3. R. Fisher. The Design of Experiments. Hafner Publishing Company, 1935.
  4. P. Holland. Statistics and causal inference (with discussion). Journal of the American Statistical Association, 81:945--970, 1986.
  5. D. Horvitz and D. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47:663--685, 1952.
  6. G. Imbens. Nonparametric estimation of average treatment effects under exogeneity: A review. Review of Economics and Statistics, 86(1):4--29, 2004.
  7. J. K. Lunceford and M. Davidian. Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine, 23:2937--2960, 2004.
  8. D. McCaffrey, G. Ridgeway, and A. Morral. Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9(4):403--425, 2004.
  9. J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.
  10. J. Robins and A. Rotnitzky. Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association, 90:122--129, 1995.
  11. J. Robins, A. Rotnitzky, and L. Zhao. Analysis of semiparametric regression models with missing data. Journal of the American Statistical Association, 90:106--121, 1995.
  12. P. Rosenbaum and D. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70:41--55, 1983.
  13. D. Rubin. Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66:688--701, 1974.
  14. A. Smith and C. Elkan. A Bayesian network framework for reject inference. In Proceedings ACM SIGKDD, pages 286--295, 2004.


      Reviews

      James Speybroeck

      This elegantly written paper follows the protocol of all worthwhile research. After a very thorough introduction, Lambert and Pregibon discuss randomized experiments in online search engines, as well as the advantages and disadvantages of using "white-listed" advertisers, that is, selected subsets of advertisers invited to be among the first to test a new feature. The authors address advertiser happiness in white-list studies through metrics such as retention, and through comparisons of pre-feature and post-feature spending behavior.

      The format of white-list trials is presented first, followed by the method of selecting first users and the general framework for the study. Lambert and Pregibon explain their rationale for propensity score matching. They discuss the usual methods of making causal inferences from observational data, namely direct outcome models and inverse propensity weighting, and then present the alternative they adopt: a "doubly robust" estimator.

      In the case study, the new feature was made available to 600 advertisers over a period of 11 weeks, and the estimator was then applied to the white-list trials. The outcomes considered are retention and logSpendRatio. The propensity score model and the direct outcome model are explained, followed by a discussion of the doubly robust estimates. Lambert and Pregibon conclude that the proposed methods can be applied once pre-experiment characteristics of users are extracted from logs.

      The paper ends with a brief summary of other contributions and contributors to the field, and a brief bibliography. There is no doubt that readers with a sophisticated background in mathematics and statistics will benefit significantly from this paper.

      Online Computing Reviews Service
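The "doubly robust" (augmented inverse-propensity-weighted) estimator the review describes combines the two standard approaches it mentions: a propensity model for who adopts the feature and a direct outcome model for the response; the estimate is consistent if either model is correct. Below is a minimal numpy-only sketch on simulated data, not the paper's actual implementation: the single covariate, the adoption and outcome equations, and the true effect of 0.3 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical pre-period covariate (e.g., log of pre-feature spend).
x = rng.normal(size=n)

# Self-selection: advertisers with larger x are more likely to adopt the feature.
true_propensity = 1.0 / (1.0 + np.exp(-(0.5 * x - 0.2)))
t = rng.binomial(1, true_propensity)

# Outcome (think logSpendRatio) with a true treatment effect of 0.3;
# x also drives the outcome, so the naive treated-vs-control gap is biased.
y = 0.3 * t + 0.8 * x + rng.normal(scale=0.5, size=n)

# --- Propensity model: logistic regression fit by gradient ascent ---
X = np.column_stack([np.ones(n), x])
w = np.zeros(2)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.1 * X.T @ (t - p) / n          # gradient of the log-likelihood
e = np.clip(1.0 / (1.0 + np.exp(-X @ w)), 1e-3, 1 - 1e-3)

# --- Direct outcome models: linear regression within each group ---
b1, *_ = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)
b0, *_ = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)
mu1, mu0 = X @ b1, X @ b0                 # predicted outcomes under t=1 and t=0

# --- Doubly robust (AIPW) estimate of the average treatment effect ---
ate = np.mean(t * (y - mu1) / e + mu1) - np.mean((1 - t) * (y - mu0) / (1 - e) + mu0)
naive = y[t == 1].mean() - y[t == 0].mean()
print(f"naive difference: {naive:.3f}, doubly robust estimate: {ate:.3f}")
```

On this simulation the naive difference overstates the effect (adopters already spent more), while the doubly robust estimate recovers a value near the true 0.3.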

      Published in

        ADKDD '07: Proceedings of the 1st International Workshop on Data Mining and Audience Intelligence for Advertising
        August 2007, 75 pages
        ISBN: 9781595938336
        DOI: 10.1145/1348599
        General Chair: Ying Li

        Copyright © 2007 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Acceptance Rates

        Overall acceptance rate: 12 of 21 submissions, 57%
