Abstract
It is generally challenging to quantify the fidelity of surrogate models without additional system evaluations. Standard error measures, such as the mean squared error and cross-validation error, often do not adequately capture the fidelity of the model trained using all available sample points. This paper introduces a new model-independent approach to quantify surrogate model fidelity, called Predictive Estimation of Model Fidelity (PEMF). In PEMF, intermediate surrogates are iteratively constructed over heuristic subsets of sample points. The median and the maximum errors estimated over the remaining points are used to determine the respective error distributions at each iteration. The estimated modes of the error distributions are represented as functions of the density of intermediate training points through nonlinear regression, assuming a smooth decreasing trend of errors with increasing sample density. These regression functions are then used to predict the expected median and maximum errors in the final surrogate model (trained using all available sample points). A Monotonic Trend criterion is defined to statistically test whether the regression function is reasonably reliable in predicting the model fidelity; if it is not, a stable implementation of k-fold cross-validation (based on modal error) is used to predict the final surrogate error. To compare the accuracy and robustness of PEMF with those of the popular leave-one-out cross-validation, numerical experiments are performed using Kriging, RBF, and E-RBF models. It is observed that the model fidelities estimated by PEMF are up to two orders of magnitude more accurate and statistically more stable than those based on cross-validation.
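The procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the names `fit_rbf`, `lognormal_mode`, and `pemf_sketch` are hypothetical; a plain Gaussian RBF interpolant stands in for the surrogate; random subsets replace the heuristic subsets; the error distribution at each iteration is assumed lognormal; the regression is assumed exponential in the number of training points; and the Monotonic Trend criterion is reduced to a sign check on the fitted slope, with no cross-validation fallback. Only the median error is tracked; the maximum error would follow the same pattern.

```python
import numpy as np

def fit_rbf(X, y, width=1.0, ridge=1e-8):
    """Fit a Gaussian RBF interpolant (stand-in surrogate); returns a predictor."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / width**2) + ridge * np.eye(len(X))  # small ridge for stability
    w = np.linalg.solve(K, y)

    def predict(Z):
        dz = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-dz / width**2) @ w

    return predict

def lognormal_mode(samples):
    """Mode of a lognormal fitted to positive samples (assumed distribution)."""
    logs = np.log(samples)
    return float(np.exp(logs.mean() - logs.var()))

def pemf_sketch(X, y, train_sizes, n_subsets=20, seed=0):
    """Predict the median error of the surrogate trained on all points.

    Returns (predicted_error, trend_is_decreasing, modes_per_iteration).
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    modes = []
    for m in train_sizes:
        medians = []
        for _ in range(n_subsets):
            idx = rng.permutation(n)           # random subset (assumption)
            train, test = idx[:m], idx[m:]
            pred = fit_rbf(X[train], y[train])(X[test])
            # Median absolute error on the held-out points; tiny offset avoids log(0).
            medians.append(np.median(np.abs(pred - y[test])) + 1e-300)
        modes.append(lognormal_mode(np.array(medians)))
    # Exponential regression (assumption): log(mode) = intercept + slope * m.
    slope, intercept = np.polyfit(train_sizes, np.log(modes), 1)
    predicted = float(np.exp(intercept + slope * n))  # extrapolate to all n points
    return predicted, bool(slope < 0), modes
```

Calling `pemf_sketch(X, y, train_sizes=[10, 15, 20, 25])` on a 30-point sample returns the extrapolated median error at 30 training points, together with a flag indicating whether the fitted trend decreases. When the flag is false, the extrapolation should not be trusted, which is the situation in which the abstract prescribes a fallback to k-fold cross-validation.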
Acknowledgments
Support from the National Science Foundation Awards CMMI-1100948 and CMMI-1437746 is gratefully acknowledged. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors, and do not necessarily reflect the views of the NSF.
The surrogate model codes provided by Dr. Jie Zhang are gratefully acknowledged.
Additional information
Parts of this manuscript have been presented at the 54th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, in April, 2013, at Boston, Massachusetts - Paper Number: AIAA 2013-1751.
Appendix A
A.1 Standard k-fold cross-validation
A.2 Quantifying the mode of median and maximum errors, estimated on additional test points (for performance testing of PEMF)
A.3 Quantifying the mean and maximum error estimated on additional test points (for performance validation of cross-validation)
Cite this article
Mehmani, A., Chowdhury, S. & Messac, A. Predictive quantification of surrogate model fidelity based on modal variations with sample density. Struct Multidisc Optim 52, 353–373 (2015). https://doi.org/10.1007/s00158-015-1234-z
DOI: https://doi.org/10.1007/s00158-015-1234-z