Robust Decentralized Low-Rank Matrix Decomposition

Authors:
István Hegedűs, University of Szeged, Hungary
Árpád Berta, University of Szeged, Hungary
Levente Kocsis, Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI)
András A. Benczúr, Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI)
Márk Jelasity, University of Szeged, and MTA-SZTE Research Group on AI, Hungary

ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 4, Article No. 62, pp. 1–24. https://doi.org/10.1145/2854157

Published: 02 May 2016

Abstract

Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated originates from a large distributed system, such as a network of mobile phones or smart meters, a challenging problem arises due to the strongly conflicting yet essential requirements of efficiency, robustness, and privacy preservation. We argue that although collecting sensitive data in a centralized fashion may be efficient, it is not an option when considering privacy and efficiency at the same time. Thus, we do not allow any sensitive data to leave the nodes of the network. The local information at each node (personal attributes, documents, media ratings, etc.) defines one row in the matrix. This means that all computations have to be performed at the edge of the network. Known parallel methods that respect the locality constraint, such as synchronized parallel gradient search or distributed iterative methods, require synchronized rounds or have inherent issues with load balancing, and thus they are not robust to failure. Our distributed stochastic gradient descent algorithm overcomes these limitations. During the execution, any sensitive information remains local, whereas the global features (e.g., the factor model of movies) converge to the correct value at all nodes. We present a theoretical derivation and a thorough experimental evaluation of our algorithm. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme and realistic failure scenarios. To demonstrate the feasibility of our approach, we present trace-based simulations, real smartphone user behavior analysis, and tests over real movie recommender system data.
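
The locality constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the update rule, so the code below assumes a plain squared-error factorization A ≈ XY, a single copy of the shared factor Y that visits nodes in random order, and a fixed learning rate. The names `local_sgd_step` and `simulate`, the rank, and all hyperparameters are hypothetical. The point it shows is that node i only ever touches its own row `A[i]` and its private factor `X[i]`, while Y is the only quantity that would travel over the network.

```python
import numpy as np


def local_sgd_step(a_i, x_i, Y, lr):
    """One SGD step at a node that holds only its own row a_i.

    x_i is the node's private row factor and stays local (assumption of
    this sketch); Y is the shared factor that would be sent onward.
    The loss is 0.5 * ||a_i - x_i @ Y||^2.
    """
    err = a_i - x_i @ Y                    # local residual, shape (n,)
    new_x = x_i + lr * (err @ Y.T)         # gradient step on the private factor
    new_Y = Y + lr * np.outer(x_i, err)    # gradient step on the shared factor
    return new_x, new_Y


def simulate(A, rank=2, lr=0.01, steps=20000, seed=0):
    """Simulate a shared factor Y visiting uniformly random nodes.

    Row i of X conceptually lives on node i; the simulation keeps all rows
    in one array only for convenience.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.normal(scale=0.1, size=(m, rank))
    Y = rng.normal(scale=0.1, size=(rank, n))
    for _ in range(steps):
        i = rng.integers(m)                # Y arrives at a random node
        X[i], Y = local_sgd_step(A[i], X[i], Y, lr)
    return X, Y
```

Calling `simulate(A)` on a small, exactly low-rank matrix and comparing `X @ Y` with `A` gives a quick check that the factors fit the data. No synchronized rounds are needed in this sketch because each step depends only on the state of the node currently holding Y.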

Published in
ACM Transactions on Intelligent Systems and Technology Volume 7, Issue 4
Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
July 2016
498 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2906145
Editor: Yu Zheng, Microsoft Research, China
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher: Association for Computing Machinery, New York, NY, United States
Publication History
- Published: 2 May 2016
- Accepted: 1 December 2015
- Revised: 1 November 2015
- Received: 1 March 2015
Author Tags
Data mining
decentralized matrix factorization
decentralized recommender systems
online learning
privacy
singular value decomposition
stochastic gradient descent
Qualifiers
- research-article
- Research
- Refereed