research-article

Robust Decentralized Low-Rank Matrix Decomposition

Published: 02 May 2016

Abstract

Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated originates from a large distributed system, such as a network of mobile phones or smart meters, a challenging problem arises due to the strongly conflicting yet essential requirements of efficiency, robustness, and privacy preservation. We argue that although collecting sensitive data in a centralized fashion may be efficient, it is not an option when considering privacy and efficiency at the same time. Thus, we do not allow any sensitive data to leave the nodes of the network. The local information at each node (personal attributes, documents, media ratings, etc.) defines one row in the matrix. This means that all computations have to be performed at the edge of the network. Known parallel methods that respect the locality constraint, such as synchronized parallel gradient search or distributed iterative methods, require synchronized rounds or have inherent issues with load balancing, and thus they are not robust to failure. Our distributed stochastic gradient descent algorithm overcomes these limitations. During the execution, any sensitive information remains local, whereas the global features (e.g., the factor model of movies) converge to the correct value at all nodes. We present a theoretical derivation and a thorough experimental evaluation of our algorithm. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme and realistic failure scenarios. To demonstrate the feasibility of our approach, we present trace-based simulations, real smartphone user behavior analysis, and tests over real movie recommender system data.
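The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm; it is a minimal example assuming a squared-error loss, a fixed learning rate, and uniform random peer selection. Each node keeps its row of the matrix and its own factor vector x private, and only the shared factor matrix Y travels between nodes. All names (Node, simulate, A_DEMO, the hyperparameter values) are hypothetical and chosen for illustration only.

```python
import random


class Node:
    """One network node: holds a private row of A and a private factor x."""

    def __init__(self, row, rank, rng):
        self.row = list(row)  # private data, never leaves the node
        self.x = [rng.uniform(-0.1, 0.1) for _ in range(rank)]  # private factor

    def residual(self, Y):
        """Return row - x @ Y, where Y is a list of `rank` rows of length n."""
        return [
            self.row[j] - sum(self.x[r] * Y[r][j] for r in range(len(Y)))
            for j in range(len(self.row))
        ]

    def local_step(self, Y, lr):
        """One gradient step on 0.5 * ||row - x @ Y||^2.

        Updates the private x and the visiting shared model Y in place.
        """
        err = self.residual(Y)
        # Gradient direction for x, computed before Y is modified.
        step_x = [sum(e * y for e, y in zip(err, Y[r])) for r in range(len(Y))]
        for r in range(len(Y)):
            for j in range(len(err)):
                Y[r][j] += lr * self.x[r] * err[j]
        for r in range(len(Y)):
            self.x[r] += lr * step_x[r]


def simulate(A, rank=2, steps=20000, lr=0.05, seed=0):
    """Let a single shared model Y hop between uniformly random nodes."""
    rng = random.Random(seed)
    nodes = [Node(row, rank, rng) for row in A]
    n = len(A[0])
    Y = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(rank)]
    for _ in range(steps):
        node = rng.choice(nodes)  # the model "arrives" at a random peer
        node.local_step(Y, lr)    # only Y is ever passed between nodes
    return nodes, Y


def mean_squared_error(nodes, Y):
    """Global reconstruction error; computable only in a simulation."""
    total = sum(e * e for node in nodes for e in node.residual(Y))
    return total / (len(nodes) * len(Y[0]))


# Hypothetical rank-2 test matrix: six nodes, four global items.
A_DEMO = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [2, 1, 2, 1],
    [1, 2, 1, 2],
]

if __name__ == "__main__":
    nodes, Y = simulate(A_DEMO)
    print("MSE after training:", mean_squared_error(nodes, Y))
```

No step waits for other nodes, so the sketch needs no synchronized rounds. A real deployment would also have to handle message loss, node churn, and peer sampling, none of which are modeled here.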

            • Published in

              ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 4
              Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
              July 2016
              498 pages
              ISSN: 2157-6904
              EISSN: 2157-6912
              DOI: 10.1145/2906145
              • Editor: Yu Zheng

              Copyright © 2016 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 2 May 2016
              • Accepted: 1 December 2015
              • Revised: 1 November 2015
              • Received: 1 March 2015
              Published in TIST, Volume 7, Issue 4

              Qualifiers

              • research-article
              • Research
              • Refereed
