PSVM: a preference-enhanced SVM model using preference data for classification

  • Research Paper
  • Science China Information Sciences

Abstract

Classification is an essential task in data mining, machine learning and pattern recognition. Conventional classification models focus on distinguishing samples from different categories. However, there are also fine-grained differences between data instances within a particular category. These differences form the preference information that is essential for human learning and, in our view, could also be helpful for classification models. In this paper, we propose a preference-enhanced support vector machine (PSVM) that incorporates preference-pair data into SVM as a specific type of supplementary information. Additionally, we propose a two-layer heuristic sampling method to obtain effective preference pairs, and an extended sequential minimal optimization (SMO) algorithm to fit PSVM. To evaluate our model, we use the task of knowledge base acceleration-cumulative citation recommendation (KBA-CCR) on the TREC-KBA-2012 dataset, together with seven other datasets from UCI, StatLib and mldata.org. The experimental results show that the proposed PSVM performs well under the official evaluation metrics.
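
The abstract does not state the PSVM objective, so the following is only a generic illustration of what adding preference pairs to an SVM-style classifier can look like: a linear model trained by full-batch subgradient descent on the usual hinge loss plus a pairwise ranking hinge loss. The objective, the function name `train_preference_svm`, the parameters `C` and `C_pref`, and the toy data are all assumptions made for illustration. The paper's own formulation, its two-layer sampling method and its extended SMO solver are not reproduced here.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def train_preference_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    pairs: Sequence[Tuple[int, int]],
    C: float = 1.0,
    C_pref: float = 1.0,
    lr: float = 0.01,
    epochs: int = 500,
) -> Tuple[Vector, float]:
    """Full-batch subgradient descent on an assumed objective:

        0.5 * ||w||^2
        + C      * sum_i     max(0, 1 - y_i * (w.x_i + b))
        + C_pref * sum_(i,j) max(0, 1 - w.(x_i - x_j))

    Each pair (i, j) means sample i is preferred over sample j.
    """
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 regulariser
        grad_b = 0.0
        # Classification hinge loss on the labelled samples.
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:
                for k in range(dim):
                    grad_w[k] -= C * yi * xi[k]
                grad_b -= C * yi
        # Ranking hinge loss on the preference pairs (the bias cancels).
        for i, j in pairs:
            diff = [a - c for a, c in zip(X[i], X[j])]
            if dot(w, diff) < 1.0:
                for k in range(dim):
                    grad_w[k] -= C_pref * diff[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score; its sign is the predicted class."""
    return dot(w, x) + b


# Hypothetical toy data: two classes, plus within-class preferences.
X = [[2.0, 1.0], [3.0, 2.0], [-2.0, -1.0], [-3.0, -2.0]]
y = [1, 1, -1, -1]
pairs = [(1, 0), (2, 3)]  # sample 1 over sample 0, sample 2 over sample 3

w, b = train_preference_svm(X, y, pairs)
predictions = [1 if decision(w, b, x) > 0 else -1 for x in X]
```

In this sketch the preference term only constrains score differences within a pair, so it can be weighted independently of the classification term through `C_pref`; setting `C_pref=0` reduces it to a plain linear soft-margin SVM trained by subgradient descent.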

Acknowledgments

This work was supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000902), the National Basic Research Program of China (973 Program) (Grant No. 2013CB329600), the National Natural Science Foundation of China (Grant No. 61472040), and the National Natural Science Basic Research Plan in Shaanxi Province of China (Grant No. 2016JM6082).

Author information

Corresponding authors

Correspondence to Dandan Song or Lejian Liao.

Cite this article

Ma, L., Song, D., Liao, L. et al. PSVM: a preference-enhanced SVM model using preference data for classification. Sci. China Inf. Sci. 60, 122103 (2017). https://doi.org/10.1007/s11432-016-9020-4
