Abstract
OBJECTIVE: To illustrate the use of multivariable optimal discriminant analysis (MultiODA).
DESIGN: Data from four previously published studies were reanalyzed using MultiODA. The original analysis was Fisher’s linear discriminant analysis (FLDA) for two studies and logistic regression analysis (LRA) for two studies.
MEASUREMENTS AND MAIN RESULTS: In Study 1, FLDA achieved an overall percentage accuracy in classification (PAC) for the training sample of 69.9%, compared with 73.5% for MultiODA. In Study 2, the LRA model required three attributes to achieve a 76.1% overall PAC for the training sample and a 79.4% overall PAC for the hold-out sample. Using only two attributes, the MultiODA model achieved similar values. In Study 3, the FLDA model achieved an overall PAC of 82.5%, compared with 87.5% for the MultiODA model. In Study 4, MultiODA identified a two-attribute model that achieved a 93.3% overall training PAC, whereas a multivariable LRA model could not be developed.
CONCLUSIONS: MultiODA identified: a superior training model (Study 1); a more parsimonious model that achieved superior overall training and identical hold-out PAC (Study 2); a model that achieved a higher hold-out PAC (Study 3); and a two-attribute model that achieved a relatively high PAC when a multivariable LRA model could not be obtained (Study 4). These findings suggest that MultiODA has the potential to improve the accuracy of predictions made in general internal medicine research.
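The overall PAC figures reported above are the percentage of observations that a model assigns to the correct class. The following is a minimal sketch of that calculation, assuming two-class labels coded 0 and 1; the function name `overall_pac` and the sample labels are hypothetical and are not data from the four studies.

```python
def overall_pac(actual, predicted):
    """Return the overall percentage accuracy in classification (PAC)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    # Count observations whose predicted class matches the actual class.
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)


# Hypothetical class labels, for illustration only.
train_actual = [1, 1, 0, 0, 1, 0, 1, 0]
train_predicted = [1, 0, 0, 0, 1, 0, 1, 1]

print(f"Training PAC: {overall_pac(train_actual, train_predicted):.1f}%")
```

Applying the same function to a hold-out sample, using a model fitted only on the training sample, would give the hold-out PAC.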
Additional information
Supported in part by grants from the National Science Foundation (SES-8822337); the Northwestern Memorial Foundation (Dixon Award); the US Public Health Service (1D28PE-15275-01); the Northwest Health Services Research and Development Field Program (Seattle VA Medical Center); the Seattle/King County Department of Public Health, AIDS Prevention Project; the Health Services Research and Development Field Program (Little Rock); and the National Institute of Mental Health (MH41739-02).
Appreciation is extended to the computer center of the University of Illinois at Chicago, for computer resources used in this research.
Cite this article
Yarnold, P.R., Soltysik, R.C., McCormick, W.C. et al. Application of multivariable optimal discriminant analysis in general internal medicine. J Gen Intern Med 10, 601–606 (1995). https://doi.org/10.1007/BF02602743
Key words
- acceptability of health care
- AIDS
- chronic disease
- classification
- discharge planning
- emergent readmission
- frustrating patients
- geriatrics
- logistic regression
- long-term care
- medical services utilization
- multivariable optimal discriminant analysis
- patient preferences
- severity of illness
- somatization
- statistical prediction model
- validity analysis