DOI: 10.1145/2063576.2063855
poster

Item categorization in the e-commerce domain

Published: 24 October 2011

ABSTRACT

Hierarchical classification is a challenging problem, yet it has broad application in real-world tasks. Item categorization in the e-commerce domain is one such example. In a large-scale industrial setting such as eBay, a vast number of items must be categorized into a large number of leaf categories, on top of which a complex topic hierarchy is defined. Beyond the challenges of scale, item data is extremely sparse, is distributed unevenly across categories, and exhibits heterogeneous characteristics across categories. A common strategy for hierarchical classification is the "gates-and-experts" method, in which a high-level classification is made first (the gates), followed by a low-level distinction (the experts). In this paper, we propose to leverage domain-specific feature generation and modeling techniques to greatly enhance the classification accuracy of the experts. In particular, we derive features that encode rich domain knowledge and linguistic hints, and then adapt an SVM-based model to distinguish several highly confusable category groups that appear as the performance bottleneck of a live system currently deployed at eBay. We use illustrative examples and empirical results to demonstrate the effectiveness of our approach, particularly the merit of well-designed domain-specific features.
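
The two-stage "gates-and-experts" routing described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the item titles, the category names, and the class names `TokenCountClassifier` and `GatesAndExperts` are hypothetical, and a tiny smoothed token-count scorer stands in for the SVM-based model the paper adapts. The sketch only shows how a gate picks a top-level category and then hands the item to that category's expert.

```python
import math
from collections import Counter, defaultdict


class TokenCountClassifier:
    """Toy stand-in for a real classifier (NOT the paper's SVM).

    Scores each label by the sum of add-one-smoothed log token
    frequencies; class priors are ignored for simplicity.
    """

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()

        def score(label):
            label_counts = self.counts[label]
            total = sum(label_counts.values()) + len(self.vocab)
            return sum(math.log((label_counts[t] + 1) / total) for t in tokens)

        # sorted() makes tie-breaking deterministic
        return max(sorted(self.counts), key=score)


class GatesAndExperts:
    """Gate chooses a top-level category; one expert per category chooses the leaf."""

    def fit(self, titles, paths):
        # paths: list of (top_category, leaf_category) pairs
        tops = [top for top, _ in paths]
        self.gate = TokenCountClassifier().fit(titles, tops)
        self.experts = {}
        for top in set(tops):
            rows = [i for i, t in enumerate(tops) if t == top]
            self.experts[top] = TokenCountClassifier().fit(
                [titles[i] for i in rows], [paths[i][1] for i in rows]
            )
        return self

    def predict(self, title):
        top = self.gate.predict(title)            # stage 1: the gate
        leaf = self.experts[top].predict(title)   # stage 2: the expert
        return top, leaf


# Hypothetical training data, invented for this sketch.
TRAIN = [
    ("apple iphone 4 16gb black smartphone", ("Electronics", "Cell Phones")),
    ("samsung galaxy s smartphone unlocked", ("Electronics", "Cell Phones")),
    ("iphone 4 case black silicone cover", ("Electronics", "Phone Accessories")),
    ("usb charger cable for samsung galaxy", ("Electronics", "Phone Accessories")),
    ("mens leather jacket black size l", ("Clothing", "Coats & Jackets")),
    ("womens wool coat size m", ("Clothing", "Coats & Jackets")),
    ("mens running shoes size 10", ("Clothing", "Shoes")),
    ("womens leather boots size 8", ("Clothing", "Shoes")),
]

model = GatesAndExperts().fit([t for t, _ in TRAIN], [p for _, p in TRAIN])
print(model.predict("black silicone case for iphone"))
print(model.predict("womens leather boots"))
```

In this toy data, phones and phone accessories share many title tokens ("iphone", "black", "samsung"), which loosely mirrors the kind of confusable category group the abstract says the experts must separate. Because an error at the gate cannot be corrected by an expert, a real system would also need to measure gate accuracy separately.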


      Published in

      CIKM '11: Proceedings of the 20th ACM International Conference on Information and Knowledge Management
      October 2011
      2712 pages
      ISBN: 9781450307178
      DOI: 10.1145/2063576

      Copyright © 2011 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
