The Effect of Prior Probability on Skill in Two-Group Discriminant Analysis

Paik, Haejung

doi:10.1023/A:1004359127048

Published: May 1998

Volume 32, pages 201–211, (1998)

Quality and Quantity

Haejung Paik¹

Abstract

Although the weights in a discriminant function (linear or quadratic) are independent of the group prior probabilities, the performance of the classifier on both training and validation data depends sensitively on these often unknown probabilities. After reviewing some defects of a popular measure of performance when the group sizes are naturally disproportionate, three alternative measures of performance (or association) are considered, and it is shown that their behavior as a function of group prior probability differs from measure to measure. Consequently, the optimum choice of the group prior probability depends on the specific measure of performance. Among the measures considered, only two, the index of mean square contingency and the Heidke Skill Statistic, are found to be well defined in the disparate-group-size situation and are therefore recommended. An empirical data set dealing with delinquency among high school students is used to illustrate the findings.
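
The two recommended measures can be computed from a 2 × 2 classification table, and the role of the prior can be seen in a one-variable linear discriminant rule. The sketch below is only illustrative and rests on several assumptions that the abstract does not confirm: the formulas are the standard textbook forms of the Heidke skill statistic and of the mean square contingency (chi-square divided by N), which may differ in detail from the paper's definitions; the classifier assumes two normal groups with a common variance; and every function name is hypothetical. Fraction correct is included only as a familiar baseline, since the abstract does not name the popular measure it criticizes.

```python
import math

# Table convention (an assumption of this sketch):
#   a = group 1 classified as 1,  b = group 0 classified as 1,
#   c = group 1 classified as 0,  d = group 0 classified as 0.

def fraction_correct(a, b, c, d):
    """Share of all cases classified into their true group."""
    return (a + d) / (a + b + c + d)

def heidke_skill(a, b, c, d):
    """Standard 2x2 Heidke skill: agreement beyond what chance would give."""
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    # nan marks a table whose denominator vanishes in this sketch.
    return 2.0 * (a * d - b * c) / denom if denom else float("nan")

def mean_square_contingency(a, b, c, d):
    """phi^2 = chi^2 / N for a 2x2 table."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # nan marks a table with an empty row or column in this sketch.
    return (a * d - b * c) ** 2 / denom if denom else float("nan")

def lda_assign(x, m0, m1, var, p1):
    """Univariate linear discriminant rule with prior p1 for group 1.

    The weight depends only on the group means and common variance;
    the prior enters through the intercept alone.
    """
    weight = (m1 - m0) / var
    intercept = -(m1 ** 2 - m0 ** 2) / (2.0 * var) + math.log(p1 / (1.0 - p1))
    return 1 if weight * x + intercept > 0 else 0

def contingency_table(xs, labels, m0, m1, var, p1):
    """Count (a, b, c, d) for the rule above applied to labelled data."""
    a = b = c = d = 0
    for x, y in zip(xs, labels):
        pred = lda_assign(x, m0, m1, var, p1)
        if pred == 1 and y == 1:
            a += 1
        elif pred == 1 and y == 0:
            b += 1
        elif pred == 0 and y == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d

if __name__ == "__main__":
    # Made-up table with a rare group: 50 cases in group 1, 950 in group 0.
    table = (30, 70, 20, 880)
    print(fraction_correct(*table))
    print(heidke_skill(*table))
    print(mean_square_contingency(*table))
```

Sweeping `p1` in `contingency_table` over a grid and recomputing each measure is one way to explore how the preferred prior can change with the measure; the table above uses invented counts and does not reproduce any result from the article.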

Author information

Authors and Affiliations

Department of Communication, University of Oklahoma, Norman, OK, 73019, U.S.A.
Haejung Paik

About this article

Cite this article

Paik, H. The Effect of Prior Probability on Skill in Two-Group Discriminant Analysis. Quality & Quantity 32, 201–211 (1998). https://doi.org/10.1023/A:1004359127048
