DOI: 10.1145/2783258.2783322
research-article

Fast and Robust Parallel SGD Matrix Factorization

Published: 10 August 2015

ABSTRACT

Matrix factorization is one of the fundamental techniques for analyzing latent relationships between two entities. It is used especially for recommendation because of its high accuracy. Efficient parallel stochastic gradient descent (SGD) matrix factorization algorithms have been developed for large matrices to speed up the convergence of factorization. However, most of them are designed for a shared-memory environment and thus fail to factorize a matrix that is too large to fit in memory, and their performance is also unreliable when the matrix is skewed.
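For readers unfamiliar with the baseline technique, the following is a minimal single-threaded sketch of SGD matrix factorization on (row, column, value) triples. It is not the MLGF-MF algorithm or any of the parallel algorithms mentioned above; the function names `sgd_mf` and `rmse`, the hyperparameters, and the toy data in the usage note are all assumptions chosen for illustration.

```python
import random

def sgd_mf(ratings, n_rows, n_cols, k=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Factorize a sparse matrix given as (row, col, value) triples with plain SGD.

    Returns row factors P (n_rows x k) and column factors Q (n_cols x k) so that
    the dot product of P[i] and Q[j] approximates the observed value at (i, j).
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_cols)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit observed entries in random order
        for i, j, r in samples:
            pred = sum(P[i][f] * Q[j][f] for f in range(k))
            err = r - pred
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                # Gradient step on squared error with L2 regularization.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q

def rmse(ratings, P, Q):
    """Root-mean-square error of the factorization on the given triples."""
    se = 0.0
    for i, j, r in ratings:
        pred = sum(a * b for a, b in zip(P[i], Q[j]))
        se += (r - pred) ** 2
    return (se / len(ratings)) ** 0.5
```

A possible use on a toy 3 x 3 matrix with six observed entries: call `sgd_mf(ratings, 3, 3)` and compare `rmse` before and after training. Each update touches only one row of `P` and one row of `Q`, which is why updates on entries that share no row or column can in principle run in parallel.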

This paper proposes a fast and robust parallel SGD matrix factorization algorithm, called MLGF-MF, which is robust to skewed matrices and runs efficiently on block-storage devices (e.g., SSDs) as well as in shared memory. MLGF-MF uses the Multi-Level Grid File (MLGF) to partition the matrix and minimizes the cost of scheduling parallel SGD updates on the partitioned regions by exploiting partial match query processing. As a result, MLGF-MF produces reliable results efficiently even on skewed matrices. MLGF-MF is designed with asynchronous I/O throughout the algorithm so that the CPU keeps executing without waiting for I/O to complete. MLGF-MF thereby overlaps CPU and I/O processing, which offsets the I/O cost and maximizes CPU utilization. Recent flash SSDs support high-performance parallel I/O and are therefore well suited to asynchronous I/O.
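The idea of overlapping CPU work with I/O can be illustrated with a generic prefetching loop. The sketch below is not the MLGF-MF implementation, and the abstract does not describe how its asynchronous I/O is realized; here a background thread stands in for asynchronous reads, and `load_block`, `process_block`, and `run_overlapped` are hypothetical names with placeholder bodies.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_block(block_id):
    """Hypothetical stand-in for reading one matrix block from storage."""
    time.sleep(0.01)  # simulated I/O latency (assumption)
    return [(block_id, n, float(n)) for n in range(3)]

def process_block(block):
    """Hypothetical stand-in for running SGD updates over one block."""
    return sum(value for _, _, value in block)

def run_overlapped(block_ids):
    """Process blocks in order while the next block loads in the background."""
    results = []
    if not block_ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as io_pool:
        # Start reading the first block before any computation begins.
        pending = io_pool.submit(load_block, block_ids[0])
        for next_id in list(block_ids[1:]) + [None]:
            block = pending.result()  # blocks only if the read is unfinished
            if next_id is not None:
                # Issue the next read now so it proceeds during computation.
                pending = io_pool.submit(load_block, next_id)
            results.append(process_block(block))
    return results
```

With this pattern the compute step for one block runs while the read for the following block is in flight, so the total time approaches the larger of the two costs rather than their sum. A real system on SSDs would more likely issue several reads in parallel through an operating-system asynchronous I/O interface; the single worker thread here is only the simplest way to show the overlap.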

In our extensive evaluations, MLGF-MF significantly outperforms (i.e., converges faster than) the state-of-the-art algorithms in both shared-memory and block-storage environments. In addition, the output of MLGF-MF is significantly more robust to skewed matrices. Our implementation of MLGF-MF is available at http://dm.postech.ac.kr/MLGF-MF as executable files.

    • Published in

      KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2015
      2378 pages
      ISBN:9781450336642
      DOI:10.1145/2783258

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
