Interpretation of Black-Box Predictive Models

Cherkassky, Vladimir; Dhar, Sauptik

doi:10.1007/978-3-319-21852-6_19

Abstract

Many machine learning applications involve predictive data-analytic modeling using black-box techniques. A common problem in such studies is understanding/interpretation of estimated nonlinear high-dimensional models. Whereas human users naturally favor simple interpretable models, such models may not be practically feasible with modern adaptive methods such as Support Vector Machines (SVMs), Multilayer Perceptron Networks (MLPs), AdaBoost, etc. This chapter provides a brief survey of the current techniques for visualization and interpretation of SVM-based classification models, and then highlights potential problems with such methods. We argue that, under the VC-theoretical framework, model interpretation cannot be achieved via technical analysis of predictive data-analytic models. That is, any meaningful interpretation should incorporate application domain knowledge outside data analysis. We also describe a simple graphical technique for visualization of SVM classification models.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, 55455, USA
Vladimir Cherkassky
Research and Technology Center, Robert Bosch LLC, Palo Alto, CA, 94304, USA
Sauptik Dhar

Corresponding author

Correspondence to Vladimir Cherkassky.

Editor information

Editors and Affiliations

Dept. of Computer Science, Royal Holloway, Univ of London, Egham, Surrey, United Kingdom
Vladimir Vovk
Frederick University, Nicosia, Cyprus
Harris Papadopoulos
Dept. of Computer Science, University of London, Egham, Surrey, United Kingdom
Alexander Gammerman

About this chapter

Cite this chapter

Cherkassky, V., Dhar, S. (2015). Interpretation of Black-Box Predictive Models. In: Vovk, V., Papadopoulos, H., Gammerman, A. (eds) Measures of Complexity. Springer, Cham. https://doi.org/10.1007/978-3-319-21852-6_19

DOI: https://doi.org/10.1007/978-3-319-21852-6_19
Published: 04 September 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21851-9
Online ISBN: 978-3-319-21852-6
eBook Packages: Computer Science (R0)
