DOI: 10.1145/2623330.2623676
research-article

Class-distribution regularized consensus maximization for alleviating overfitting in model combination

Published: 24 August 2014

ABSTRACT

In data mining applications such as crowdsourcing and privacy-preserving data mining, one may wish to obtain consolidated predictions from multiple models without access to the features of the data. Moreover, because multiple models usually carry complementary predictive information, model combination can potentially provide more robust and accurate predictions by correcting independent errors from individual models. Various methods have been proposed to combine predictions such that the final predictions are maximally agreed upon by multiple base models. Though this maximum consensus principle has been shown to be successful, simply maximizing consensus can lead to less discriminative predictions and can overfit the inevitable noise introduced by imperfect base models. We argue that proper regularization for model combination approaches is needed to alleviate this overfitting effect. Specifically, we analyze the hypothesis spaces of several model combination methods and identify the trade-off between model consensus and generalization ability. We propose a novel model called Regularized Consensus Maximization (RCM), which is formulated as an optimization problem that combines the maximum consensus and large margin principles. We theoretically show that RCM has a smaller upper bound on generalization error than the version without regularization. Experiments show that the proposed algorithm outperforms a wide spectrum of state-of-the-art model combination methods on 11 tasks.

Published in

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330

Copyright © 2014 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

KDD '14 paper acceptance rate: 151 of 1,036 submissions, 15%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%
