
Paradoxes of probabilistic programming: and how to condition on events of measure zero with infinitesimal probabilities

Published: 04 January 2021

Abstract

Probabilistic programming languages allow programmers to write down conditional probability distributions that represent statistical and machine learning models as programs that use observe statements. These programs are run by accumulating likelihood at each observe statement and using the likelihood to steer random choices and weigh results in inference algorithms such as importance sampling or MCMC. We argue that naive likelihood accumulation does not give desirable semantics and leads to paradoxes when an observe statement is used to condition on a measure-zero event, particularly when the observe statement is executed conditionally on random data. We show that the paradoxes disappear if we explicitly model measure-zero events as a limit of positive-measure events, and that we can execute this type of probabilistic program by accumulating infinitesimal probabilities rather than probability densities. Our extension improves probabilistic programming languages as an executable notation for probability distributions by making them better behaved and more expressive, allowing the programmer to be explicit about which limit is intended when conditioning on an event of measure zero.
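To make the idea concrete, here is a minimal sketch in Python, in the spirit of the paper's proposal but not the paper's artifact. The `Weight` class, `run_model`, and the toy model are illustrative assumptions: each sample's weight is an "infinitesimal number" r·ε^n rather than a bare density. One branch observes a continuous variable at a point (a measure-zero event, contributing density·ε), the other observes a discrete event (contributing an ordinary probability, ε^0). Naive likelihood accumulation would compare the density and the probability directly; tracking the power of ε instead sends all posterior mass to the lowest ε-order.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class Weight:
    """Hypothetical weight r * eps^n with an infinitesimal eps (sketch, not the paper's code)."""
    r: float  # real coefficient
    n: int    # power of eps; n == 0 is an ordinary probability

    def __mul__(self, other: "Weight") -> "Weight":
        # Multiplying weights multiplies coefficients and adds eps-powers.
        return Weight(self.r * other.r, self.n + other.n)

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def run_model():
    """One run of a toy model with an observe statement in each random branch."""
    w = Weight(1.0, 0)
    if random.random() < 0.5:
        x = random.gauss(0.0, 1.0)
        # Observe Normal(x, 1) == 2.0: a measure-zero event, so the weight
        # picks up density * eps rather than a bare density.
        w = w * Weight(normal_pdf(2.0, x, 1.0), 1)
        return "continuous", w
    else:
        # Observe a discrete event with probability 1/4: an ordinary probability.
        w = w * Weight(0.25, 0)
        return "discrete", w

random.seed(0)
samples = [run_model() for _ in range(100_000)]

# In the limit eps -> 0, only weights with the lowest power of eps survive
# normalization, so higher eps-orders get zero posterior mass.
min_n = min(w.n for _, w in samples)
totals = {"continuous": 0.0, "discrete": 0.0}
for branch, w in samples:
    if w.n == min_n:
        totals[branch] += w.r
norm = sum(totals.values())
print({branch: t / norm for branch, t in totals.items()})
# Prints all mass on the "discrete" branch: its eps^0 weight dominates the
# eps^1 weight of the continuous observation, whereas naive density
# accumulation would have mixed the two incomparable quantities.
```

Running the sketch prints `{'continuous': 0.0, 'discrete': 1.0}`: because the discrete observation carries an ε^0 weight and the point observation only an ε^1 weight, the discrete branch receives all posterior mass, which is the kind of explicit limit-taking the abstract argues for.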

