skip to main content
10.1145/3583133.3596327acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Priors for symbolic regression

Published:24 July 2023Publication History

ABSTRACT

When choosing between competing symbolic models for a data set, a human will naturally prefer the "simpler" expression or the one which more closely resembles equations previously seen in a similar context. This suggests a non-uniform prior on functions, which is, however, rarely considered within a symbolic regression (SR) framework. In this paper we develop methods to incorporate detailed prior information on both functions and their parameters into SR. Our prior on the structure of a function is based on a n-gram language model, which is sensitive to the arrangement of operators relative to one another in addition to the frequency of occurrence of each operator. We also develop a formalism based on the Fractional Bayes Factor to treat numerical parameter priors in such a way that models may be fairly compared though the Bayesian evidence, and explicitly compare Bayesian, Minimum Description Length and heuristic methods for model selection. We demonstrate the performance of our priors relative to literature standards on benchmarks and a real-world dataset from the field of cosmology.

Skip Supplemental Material Section

Supplemental Material

References

  1. Deaglan J. Bartlett, Harry Desmond, and Pedro G. Ferreira. 2022. Exhaustive Symbolic Regression. arXiv e-prints, Article arXiv:2211.11461 (Nov. 2022), arXiv:2211.11461 pages. arXiv:2211.11461 [astro-ph.CO] Google ScholarGoogle ScholarCross RefCross Ref
  2. Sören Becker, Michal Klein, Alexander Neitz, Giambattista Parascandolo, and Niki Kilbertus. 2022. Discovering ordinary differential equations that govern time-series. arXiv e-prints, Article arXiv:2211.02830 (Nov. 2022), arXiv:2211.02830 pages. arXiv:2211.02830 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  3. James O. Berger and Luis R. Pericchi. 1996. The Intrinsic Bayes Factor for Model Selection and Prediction. J. Amer. Statist. Assoc. 91, 433 (1996), 109--122. arXiv:https://www.tandfonline.com/doi/pdf/10.1080/01621459.1996.10476668 Google ScholarGoogle ScholarCross RefCross Ref
  4. Jose M. Bernardo. 1979. Reference Posterior Distributions for Bayesian Inference. Journal of the Royal Statistical Society: Series B (Methodological) 41, 2 (1979), 113--128. arXiv:https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.2517-6161.1979.tb01066.x Google ScholarGoogle ScholarCross RefCross Ref
  5. Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurelien Lucchi, and Giambattista Parascandolo. 2021. Neural Symbolic Regression that scales. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 936--945. https://proceedings.mlr.press/v139/biggio21a.htmlGoogle ScholarGoogle Scholar
  6. G. F. Bomarito, P. E. Leser, N. C. M. Strauss, K. M. Garbrecht, and J. D. Hochhalter. 2022. Bayesian Model Selection for Reducing Bloat and Overfitting in Genetic Programming for Symbolic Regression. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (Boston, Massachusetts) (GECCO '22). Association for Computing Machinery, New York, NY, USA, 526--529. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. G. F. Bomarito, P. E. Leser, N. C. M. Strauss, K. M. Garbrecht, and J. D. Hochhalter. 2023. Automated learning of interpretable models with quantified uncertainty. Computer Methods in Applied Mechanics and Engineering 403 (Jan. 2023), 115732. arXiv:2205.01626 [cs.NE] Google ScholarGoogle ScholarCross RefCross Ref
  8. Jure Brence, Ljupčo Todorovski, and Sašo Džeroski. 2021. Probabilistic grammars for equation discovery. Knowledge-Based Systems 224 (2021), 107077. Google ScholarGoogle ScholarCross RefCross Ref
  9. Per A Brodtkorb and John D'Errico. 2015. numdifftools 0.9.11. https://github.com/pbrod/numdifftools.Google ScholarGoogle Scholar
  10. Dillon Brout et al. 2022. The Pantheon+ Analysis: Cosmological Constraints. arXiv e-prints, Article arXiv:2202.04077 (Feb. 2022), arXiv:2202.04077 pages. arXiv:2202.04077 [astro-ph.CO]Google ScholarGoogle Scholar
  11. Qi Chen, Bing Xue, and Mengjie Zhang. 2022. Rademacher Complexity for Enhancing the Generalization of Genetic Programming for Symbolic Regression. IEEE Transactions on Cybernetics 52, 4 (2022), 2382--2395. Google ScholarGoogle Scholar
  12. J.H. Conway, N.J.A. Sloane, E. Bannai, R.E. Borcherds, J. Leech, S.P. Norton, A.M. Odlyzko, R.A. Parker, L. Queen, and B.B. Venkov. 2013. Sphere Packings, Lattices and Groups. Springer New York. https://books.google.fr/books?id=hoTjBwAAQBAJGoogle ScholarGoogle Scholar
  13. T. M. Cover and J. A. Thomas. 1991. Elements of Information Theory (2nd ed.). Wiley.Google ScholarGoogle Scholar
  14. Miles Cranmer. 2020. PySR: Fast & Parallelized Symbolic Regression in Python/Julia. Google ScholarGoogle ScholarCross RefCross Ref
  15. Miles Cranmer, Alvaro Sanchez-Gonzalez, Peter Battaglia, Rui Xu, Kyle Cranmer, David Spergel, and Shirley Ho. 2020. Discovering Symbolic Models from Deep Learning with Inductive Biases. NeurIPS 2020 (2020). arXiv:2006.11287 [cs.LG]Google ScholarGoogle Scholar
  16. Stéphane d'Ascoli, Pierre-Alexandre Kamienny, Guillaume Lample, and François Charton. 2022. Deep Symbolic Regression for Recurrent Sequences. arXiv e-prints, Article arXiv:2201.04600 (Jan. 2022), arXiv:2201.04600 pages. arXiv:2201.04600 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  17. E. David. 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.Google ScholarGoogle Scholar
  18. R.P. Feynman, R.B. Leighton, and M. Sands. 2015. The Feynman Lectures on Physics, Vol. I: The New Millennium Edition: Mainly Mechanics, Radiation, and Heat. Number v. 1. Basic Books. https://books.google.fr/books?id=d76DBQAAQBAJGoogle ScholarGoogle Scholar
  19. R.P. Feynman, R.B. Leighton, M.L. Sands, and M.A. Gottlieb. 2006. The Feynman Lectures on Physics. Number v. 2 in The Feynman Lectures on Physics. Pearson/Addison-Wesley. https://books.google.fr/books?id=AbruAAAAMAAJGoogle ScholarGoogle Scholar
  20. R.P. Feynman, R.B. Leighton, M.L. Sands, and M.A. Gottlieb. 2006. The Feynman Lectures on Physics: Quantum mechanics. Pearson/Addison-Wesley. https://books.google.fr/books?id=_6XvAAAAMAAJGoogle ScholarGoogle Scholar
  21. William A. Gale and Geoffrey Sampson. 1995. Good-turing frequency estimation without tears. Journal of Quantitative Linguistics 2, 3 (1995), 217--237. arXiv:https://doi.org/10.1080/09296179508590051 Google ScholarGoogle ScholarCross RefCross Ref
  22. H. Goldstein, C.P. Poole, and J.L. Safko. 2002. Classical Mechanics. Addison Wesley. https://books.google.fr/books?id=tJCuQgAACAAJGoogle ScholarGoogle Scholar
  23. I. J. GOOD. 1953. THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS. Biometrika 40, 3--4 (12 1953), 237--264. arXiv:https://academic.oup.com/biomet/article-pdf/40/3-4/237/492571/40-3-4-237.pdf Google ScholarGoogle ScholarCross RefCross Ref
  24. P. Grunwald. 2007. The Minimum Description Length Principle. MIT Press.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Peter Grünwald and Teemu Roos. 2019. Minimum Description Length Revisited. arXiv e-prints, Article arXiv:1908.08484 (Aug. 2019), arXiv:1908.08484 pages. arXiv:1908.08484 [stat.ME]Google ScholarGoogle Scholar
  26. Roger Guimerà, Ignasi Reichardt, Antoni Aguilar-Mogas, Francesco A. Massucci, Manuel Miranda, Jordi Pallarès, and Marta Sales-Pardo. 2020. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Science Advances 6, 5 (2020), eaav6971. arXiv:https://www.science.org/doi/pdf/10.1126/sciadv.aav6971 Google ScholarGoogle ScholarCross RefCross Ref
  27. R. Haupt and S. Haupt. 2004. Practical genetic algorithms (2nd ed.). Wyley.Google ScholarGoogle Scholar
  28. Hitoshi Iba, Hugo De Garis, and Taisuke Sato. 1994. Genetic programming using a minimum description length principle. Advances in genetic programming 1 (1994), 265--284.Google ScholarGoogle Scholar
  29. John David Jackson. 1999. Classical electrodynamics (3rd ed. ed.). Wiley, New York, NY. http://cdsweb.cern.ch/record/490457Google ScholarGoogle Scholar
  30. Harold Jeffreys. 1946. An Invariant Form for the Prior Probability in Estimation Problems. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 186, 1007 (1946), 453--461. http://www.jstor.org/stable/97883Google ScholarGoogle ScholarCross RefCross Ref
  31. Pierre-Alexandre Kamienny, Stéphane d'Ascoli, Guillaume Lample, and François Charton. 2022. End-to-end symbolic regression with transformers. arXiv e-prints, Article arXiv:2204.10532 (April 2022), arXiv:2204.10532 pages. arXiv:2204.10532 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  32. Lukas Kammerer, Gabriel Kronberger, Bogdan Burlacu, Stephan M. Winkler, Michael Kommenda, and Michael Affenzeller. 2021. Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication. arXiv e-prints, Article arXiv:2109.13895 (Sept. 2021), arXiv:2109.13895 pages. arXiv:2109.13895 [cs.LG]Google ScholarGoogle Scholar
  33. Slava M. Katz. 1987. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Trans. Acoust. Speech Signal Process. 35 (1987), 400--401.Google ScholarGoogle ScholarCross RefCross Ref
  34. Michael Kommenda, Andreas Beham, Michael Affenzeller, and Gabriel Kronberger. 2021. Complexity Measures for Multi-objective Symbolic Regression. arXiv e-prints, Article arXiv:2109.00238 (Sept. 2021), arXiv:2109.00238 pages. arXiv:2109.00238 [cs.LG]Google ScholarGoogle Scholar
  35. Michael Korns. 2011. Accuracy in Symbolic Regression. 129--151. Google ScholarGoogle ScholarCross RefCross Ref
  36. Jiří Kubalík, Erik Derner, and Robert Babuška. 2023. Neural Networks for Symbolic Regression. arXiv e-prints, Article arXiv:2302.00773 (Feb. 2023), arXiv:2302.00773 pages. arXiv:2302.00773 [cs.NE] Google ScholarGoogle ScholarCross RefCross Ref
  37. Guillaume Lample and François Charton. 2019. Deep Learning for Symbolic Mathematics. arXiv e-prints, Article arXiv:1912.01412 (Dec. 2019), arXiv:1912.01412 pages. arXiv:1912.01412 [cs.SC] Google ScholarGoogle ScholarCross RefCross Ref
  38. Aaron D. Lanterman. 2001. Schwarz, Wallace, and Rissanen: Intertwining Themes in Theories of Model Selection. International Statistical Review 69, 2 (2001), 185--212. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1751-5823.2001.tb00456.x Google ScholarGoogle ScholarCross RefCross Ref
  39. Pablo Lemos, Niall Jeffrey, Miles Cranmer, Shirley Ho, and Peter Battaglia. 2022. Rediscovering orbital mechanics with machine learning. arXiv e-prints, Article arXiv:2202.02306 (Feb. 2022), arXiv:2202.02306 pages. arXiv:2202.02306 [astroph.EP]Google ScholarGoogle Scholar
  40. F. B. Lempers. 1971. Posterior probabilities of alternative linear models. Some theoretical considerations and empirical experiments. [By] F. B. Lempers. Rotterdam University Press [Rotterdam]. 118 p. pages.Google ScholarGoogle Scholar
  41. Christopher D. Manning and Hinrich Schutze. 1999. Foundations of statistical natural language processing. MIT Press, Cambridge, Mass.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Trent McConaghy. 2011. FFX: Fast, Scalable, Deterministic Symbolic Regression Technology. Springer New York, New York, NY, 235--260. Google ScholarGoogle ScholarCross RefCross Ref
  43. James McDermott, David R. White, Sean Luke, Luca Manzoni, Mauro Castelli, Leonardo Vanneschi, Wojciech Jaśkowski, Krzysztof Krawiec, Robin Harper, Kenneth De Jong, and Una May O'Reilly. 2012. Genetic programming needs better benchmarks. In GECCO'12 - Proceedings of the 14th International Conference on Genetic and Evolutionary Computation (GECCO'12 - Proceedings of the 14th International Conference on Genetic and Evolutionary Computation). 791--798. 14th International Conference on Genetic and Evolutionary Computation, GECCO'12 ; Conference date: 07-07-2012 Through 11-07-2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Aaron Meurer et al. 2017. SymPy: symbolic computing in Python. PeerJ Computer Science 3 (Jan. 2017), e103. Google ScholarGoogle ScholarCross RefCross Ref
  45. Anthony O'Hagan. 1995. Fractional Bayes Factors for Model Comparison. Journal of the Royal Statistical Society. Series B (Methodological) 57, 1 (1995), 99--138. http://www.jstor.org/stable/2346088Google ScholarGoogle ScholarCross RefCross Ref
  46. Adam G. Riess, Wenlong Yuan, Lucas M. Macri, Dan Scolnic, Dillon Brout, Stefano Casertano, David O. Jones, Yukei Murakami, Gagandeep S. Anand, Louise Breuval, Thomas G. Brink, Alexei V. Filippenko, Samantha Hoffmann, Saurabh W. Jha, W. D'arcy Kenworthy, John Mackenty, Benjamin E. Stahl, and WeiKang Zheng. 2022. A Comprehensive Measurement of the Local Value of the Hubble Constant with 1 km s−1 Mpc−1 Uncertainty from the Hubble Space Telescope and the SH0ES Team. ApJ 934, 1, Article L7 (July 2022), L7 pages. arXiv:2112.04510 [astro-ph.CO] Google ScholarGoogle ScholarCross RefCross Ref
  47. J. Rissanen. 1978. Modeling by shortest data description. Automatica 14, 5 (1978), 465--471. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Jorma Rissanen. 1983. A Universal Prior for Integers and Estimation by Minimum Description Length. The Annals of Statistics 11, 2 (1983), 416 -- 431. Google ScholarGoogle ScholarCross RefCross Ref
  49. Daniel Rivero and Enrique Fernandez-Blanco. 2019. A New Deterministic Technique for Symbolic Regression. arXiv e-prints, Article arXiv:1908.06754 (Aug. 2019), arXiv:1908.06754 pages. arXiv:1908.06754 [cs.LG]Google ScholarGoogle Scholar
  50. M.D. Schwartz. 2014. Quantum Field Theory and the Standard Model. Cambridge University Press. https://books.google.fr/books?id=HbdEAgAAQBAJGoogle ScholarGoogle Scholar
  51. Gideon Schwarz. 1978. Estimating the Dimension of a Model. Annals of Statistics 6, 2 (July 1978), 461--464.Google ScholarGoogle ScholarCross RefCross Ref
  52. Dan Scolnic, Dillon Brout, Anthony Carr, Adam G. Riess, Tamara M. Davis, Arianna Dwomoh, David O. Jones, Noor Ali, Pranav Charvu, Rebecca Chen, Erik R. Peterson, Brodie Popovic, Benjamin M. Rose, Charlotte Wood, Peter J. Brown, Ken Chambers, David A. Coulter, Kyle G. Dettman, Georgios Dimitriadis, Alexei V. Filippenko, Ryan J. Foley, Saurabh W. Jha, Charles D. Kilpatrick, Robert P. Kirshner, Yen-Chen Pan, Armin Rest, Cesar Rojas-Bravo, Matthew R. Siebert, Benjamin E. Stahl, and WeiKang Zheng. 2021. The Pantheon+ Analysis: The Full Dataset and Light-Curve Release. arXiv e-prints, Article arXiv:2112.03863 (Dec. 2021), arXiv:2112.03863 pages. arXiv:2112.03863 [astro-ph.CO]Google ScholarGoogle Scholar
  53. Parshin Shojaee, Kazem Meidani, Amir Barati Farimani, and Chandan K. Reddy. 2023. Transformer-based Planning for Symbolic Regression. arXiv e-prints, Article arXiv:2303.06833 (March 2023), arXiv:2303.06833 pages. arXiv:2303.06833 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  54. Guido F. Smits and Mark Kotanchek. 2005. Pareto-Front Exploitation in Symbolic Regression. Springer US, Boston, MA, 283--299. Google ScholarGoogle ScholarCross RefCross Ref
  55. R.S. Sutton and A.G. Barto. 2018. Reinforcement Learning, second edition: An Introduction. MIT Press. https://books.google.co.uk/books?id=sWV0DwAAQBAJGoogle ScholarGoogle Scholar
  56. J. Takeuchi. 1997. Characterization of the Bayes estimator and the MDL estimator for exponential families. IEEE Transactions on Information Theory 43, 4 (1997), 1165--1174. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. A. M. Turing. 1950. I.---COMPUTING MACHINERY AND INTELLIGENCE. Mind LIX, 236 (10 1950), 433--460. Google ScholarGoogle ScholarCross RefCross Ref
  58. Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, and Max Tegmark. 2020. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity. arXiv e-prints, Article arXiv:2006.10782 (June 2020), arXiv:2006.10782 pages. arXiv:2006.10782 [cs.LG]Google ScholarGoogle Scholar
  59. Silviu-Marian Udrescu and Max Tegmark. 2020. AI Feynman: A physics-inspired method for symbolic regression. Science Advances 6, 16 (April 2020), eaay2631. arXiv:1905.11481 [physics.comp-ph] Google ScholarGoogle ScholarCross RefCross Ref
  60. Nguyen Quang Uy, Nguyen Xuan Hoai, Michael O'Neill, Robert I. McKay, and Edgar Galván López. 2011. Semantically-based crossover in genetic programming: application to real-valued symbolic regression. Genet. Program. Evolvable Mach. 12, 2 (2011), 91--119. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Mojtaba Valipour, Bowen You, Maysum Panju, and Ali Ghodsi. 2021. SymbolicGPT: A Generative Transformer Model for Symbolic Regression. arXiv e-prints, Article arXiv:2106.14131 (June 2021), arXiv:2106.14131 pages. arXiv:2106.14131 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  62. Martin Vastl, Jonáš Kulhánek, Jiří Kubalík, Erik Derner, and Robert Babuška. 2022. SymFormer: End-to-end symbolic regression using transformer-based architecture. arXiv e-prints, Article arXiv:2205.15764 (May 2022), arXiv:2205.15764 pages. arXiv:2205.15764 [cs.LG] Google ScholarGoogle ScholarCross RefCross Ref
  63. Ekaterina J. Vladislavleva, Guido F. Smits, and Dick den Hertog. 2009. Order of Nonlinearity as a Complexity Measure for Models Generated by Symbolic Regression via Pareto Genetic Programming. IEEE Transactions on Evolutionary Computation 13, 2 (2009), 333--349. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. C. S. Wallace and P. R. Freeman. 1987. Estimation and Inference by Compact Coding. Journal of the Royal Statistical Society. Series B (Methodological) 49, 3 (1987), 240--265. http://www.jstor.org/stable/2985992Google ScholarGoogle ScholarCross RefCross Ref
  65. C. S. Wallace and P. R. Freeman. 1992. Single-Factor Analysis by Minimum Message Length Estimation. Journal of the Royal Statistical Society: Series B (Methodological) 54, 1 (1992), 195--209. arXiv:https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.2517-6161.1992.tb01874.x Google ScholarGoogle ScholarCross RefCross Ref
  66. Steven Weinberg. 1972. Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. John Wiley and Sons, New York.Google ScholarGoogle Scholar
  67. Matthias Werner, Andrej Junginger, Philipp Hennig, and Georg Martius. 2022. Uncertainty in equation learning. In GECCO '22: Genetic and Evolutionary Computation Conference, Companion Volume, Boston, Massachusetts, USA, July 9 -- 13, 2022, Jonathan E. Fieldsend and Markus Wagner (Eds.). ACM, 2298--2305. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Tony Worm and Kenneth Chiu. 2013. Prioritized Grammar Enumeration: Symbolic Regression by Dynamic Programming. In Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation (Amsterdam, The Netherlands) (GECCO '13). Association for Computing Machinery, New York, NY, USA, 1021--1028. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Priors for symbolic regression

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        GECCO '23 Companion: Proceedings of the Companion Conference on Genetic and Evolutionary Computation
        July 2023
        2519 pages
        ISBN:9798400701207
        DOI:10.1145/3583133

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 24 July 2023

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate1,669of4,410submissions,38%

        Upcoming Conference

        GECCO '24
        Genetic and Evolutionary Computation Conference
        July 14 - 18, 2024
        Melbourne , VIC , Australia
      • Article Metrics

        • Downloads (Last 12 months)81
        • Downloads (Last 6 weeks)9

        Other Metrics

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader