Tuning the Learning Rate for Stochastic Variational Inference

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Stochastic variational inference (SVI) can learn topic models from very large corpora. It optimizes the variational objective with a stochastic natural-gradient algorithm that uses a decreasing learning rate. This rate is crucial to SVI, yet in practice it is often tuned by hand. To address this, we develop a novel algorithm that adaptively tunes the learning rate at each iteration. The proposed algorithm uses the Kullback–Leibler (KL) divergence to measure the similarity between the variational distribution obtained with the noisy (stochastic) update and the one obtained with the full-batch update, and then optimizes the learning rate by minimizing this KL divergence. We apply our algorithm to two representative topic models: latent Dirichlet allocation and the hierarchical Dirichlet process. Experimental results indicate that our algorithm outperforms commonly used learning-rate schedules and converges faster.
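
To make the rate-selection idea concrete, the sketch below is a minimal, hypothetical Python illustration, not the paper's actual algorithm. It assumes Dirichlet variational factors (as for LDA's topic-word distributions), interprets the "batch update" as the full-batch coordinate-ascent target, assumes that target (lam_hat_batch) is available for comparison (in practice it must itself be estimated, e.g., from minibatch statistics), and selects the rate by a simple grid search over the KL divergence.

```python
import numpy as np
from scipy.special import gammaln, digamma

def dirichlet_kl(a, b):
    """KL( Dir(a) || Dir(b) ) for positive parameter vectors a and b."""
    a_sum, b_sum = a.sum(), b.sum()
    return (gammaln(a_sum) - gammaln(a).sum()
            - gammaln(b_sum) + gammaln(b).sum()
            + ((a - b) * (digamma(a) - digamma(a_sum))).sum())

def choose_rate(lam, lam_hat_noisy, lam_hat_batch,
                grid=np.linspace(0.01, 1.0, 100)):
    """Return the learning rate rho for which the noisy SVI update
    (1 - rho) * lam + rho * lam_hat_noisy is closest, in KL divergence,
    to the (hypothetical) full-batch coordinate-ascent target lam_hat_batch."""
    best_rho, best_kl = grid[0], np.inf
    for rho in grid:
        noisy = (1.0 - rho) * lam + rho * lam_hat_noisy
        kl = dirichlet_kl(noisy, lam_hat_batch)
        if kl < best_kl:
            best_rho, best_kl = rho, kl
    return best_rho

# Toy usage on one topic's word-count parameters (all values hypothetical).
rng = np.random.default_rng(0)
lam = np.full(50, 1.0)                                   # current variational parameters
lam_hat_batch = 1.0 + rng.gamma(2.0, 1.0, 50)            # stand-in for the full-batch estimate
lam_hat_noisy = lam_hat_batch + rng.normal(0.0, 0.5, 50).clip(-0.9)  # noisy minibatch estimate
print("chosen learning rate:", choose_rate(lam, lam_hat_noisy, lam_hat_batch))
```

The SVI step itself would then be lam = (1 - rho) * lam + rho * lam_hat_noisy with the chosen rate; a large rate moves the parameters quickly toward the batch target but amplifies minibatch noise, which is the trade-off the KL criterion balances.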

Author information

Corresponding author

Correspondence to Ji-Hong Ouyang.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61170092, 61133011 and 61103091.

About this article

Cite this article

Li, X.-M., Ouyang, J.-H. Tuning the Learning Rate for Stochastic Variational Inference. Journal of Computer Science and Technology, 2016, 31: 428–436. https://doi.org/10.1007/s11390-016-1636-4
