Solomonoff Prediction and Occam’s Razor

Tom F. Sterkenburg

doi:10.1086/687257

Published online by Cambridge University Press: 01 January 2022

Abstract

Algorithmic information theory gives an idealized notion of compressibility that is often presented as an objective measure of simplicity. It is suggested at times that Solomonoff prediction, or algorithmic information theory in a predictive setting, can deliver an argument to justify Occam’s razor. This article explicates the relevant argument and, by converting it into a Bayesian framework, reveals why it has no such justificatory force. The supposed simplicity concept is better perceived as a specific inductive assumption, the assumption of effectiveness. It is this assumption that is the characterizing element of Solomonoff prediction and wherein its philosophical interest lies.

Type: Research Article
Information: Philosophy of Science, Volume 83, Issue 4, October 2016, pp. 459-479

DOI: https://doi.org/10.1086/687257
Copyright: Copyright © The Philosophy of Science Association

Footnotes

† For valuable feedback on several versions and presentations of this article, I am indebted to Peter Grünwald, Jan-Willem Romeijn, the members of the Groningen PCCP seminar, Simon Huttegger, Hannes Leitgeb, Samuel Fletcher, Filippo Massari, Teddy Seidenfeld, and an anonymous referee. This research was supported by NWO Vici project 639.073.904.
