Just Accepted

PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments

Authors:

Abhijit Chowdhary, Mathematics Department, North Carolina State University, USA (ORCID: 0000-0002-5092-3503)

Shady E. Ahmed, School of Mechanical and Aerospace Engineering, Oklahoma State University, USA (ORCID: 0000-0001-5548-0265)

Ahmed Attia, Mathematics and Computer Science Division, Argonne National Laboratory, USA (ORCID: 0000-0001-5940-9247)

ACM Transactions on Mathematical Software. Accepted March 2024. https://doi.org/10.1145/3653071

Online AM: 20 March 2024

Abstract

This paper describes PyOED, a highly extensible scientific package for developing and testing model-constrained optimal experimental design (OED) methods for inverse problems. PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers who want to understand the details of OED formulations and approaches, and it lets them experiment with standard and innovative OED technologies on a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields whose formulations overlap significantly. PyOED is therefore continuously being expanded with Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These components are designed to be interchangeable, so that OED methods can be tested in settings of varying complexity. The PyOED core is written entirely in Python and relies on the language's object-oriented capabilities; the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to “enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization.” This paper briefly describes the PyOED layout and philosophy and presents a set of exemplary test cases and tutorials that demonstrate the potential of the package.
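To make the notion of model-constrained OED concrete, the following is a minimal NumPy sketch of a binary sensor-selection problem for a linear-Gaussian Bayesian inverse problem. It is not PyOED code and does not use the PyOED API; the function names (`posterior_covariance`, `a_optimal_design`), the random forward operator `F`, the identity prior covariance, and the A-optimality criterion (trace of the posterior covariance) are all assumptions chosen for illustration.

```python
import itertools
import numpy as np


def posterior_covariance(F, prior_cov, noise_var, design):
    """Posterior covariance of a linear-Gaussian inverse problem.

    `design` is a binary vector; entry i switches candidate sensor i on or off.
    (Hypothetical helper for illustration, not part of PyOED.)
    """
    design = np.asarray(design, dtype=bool)
    F_active = F[design]  # keep only the rows of the active sensors
    prior_prec = np.linalg.inv(prior_cov)
    # Posterior precision = prior precision + data-misfit Hessian
    post_prec = prior_prec + F_active.T @ F_active / noise_var
    return np.linalg.inv(post_prec)


def a_optimal_design(F, prior_cov, noise_var, budget):
    """Brute-force A-optimal design with exactly `budget` active sensors.

    Enumerates all designs, so it is only feasible for tiny problems.
    """
    n_obs = F.shape[0]
    best_design, best_value = None, np.inf
    for active in itertools.combinations(range(n_obs), budget):
        design = np.zeros(n_obs, dtype=bool)
        design[list(active)] = True
        # A-optimality: minimize the trace of the posterior covariance
        value = np.trace(posterior_covariance(F, prior_cov, noise_var, design))
        if value < best_value:
            best_design, best_value = design, value
    return best_design, best_value


# Toy setup (assumed): 6 candidate sensors, 3 unknown parameters
rng = np.random.default_rng(0)
F = rng.standard_normal((6, 3))
prior_cov = np.eye(3)
design, value = a_optimal_design(F, prior_cov, noise_var=0.1, budget=2)
print("active sensors:", np.flatnonzero(design))
print("posterior trace:", value)
```

Exhaustive enumeration is used here only because the toy problem is tiny; swapping the forward operator, the noise model, or the optimality criterion leaves the rest of the sketch unchanged, which is the kind of interchangeability the abstract describes.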


Index Terms

1. Mathematics of computing
  1. Mathematical software


Published in

ACM Transactions on Mathematical Software, Just Accepted
ISSN: 0098-3500
EISSN: 1557-7295

Copyright © 2024 Copyright held by the owner/author(s).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Online AM: 20 March 2024
- Accepted: 4 March 2024
- Revised: 19 December 2023
- Received: 6 March 2023
Author Tags
Optimal Experimental Design
OED
Inverse Problems
Data Assimilation
Mathematical Software