Abstract
Coefficients of association have been widely employed in cluster analysis. However, their use has been, for the most part, restricted to binary data. This limitation can be overcome by redefining positive and negative matches and mismatches in terms of minimum and maximum values of paired elements of parallel vector arrays. Rewriting the algorithms of coefficients of association with these new components gives the new “quantified” coefficients general utility for binary, ordered multistate, and quantitative data, while retaining their original analytic properties. Quantified coefficients of association avoid several problems of shape and size that are associated with correlation coefficients and measures of Euclidean distance. However, when measuring similarity, quantified coefficients weight each attribute of an object by that attribute's magnitude. A related set of similarity indices termed “mean ratios” is introduced; these indices give each attribute equal weight in all situations. Both quantified coefficients of association and mean ratios are related to a number of measures of similarity introduced to various fields of scientific research during the past 50 years. A review of this literature is included in an attempt to consolidate methodology and simplify nomenclature.
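The redefinition described above can be illustrated with a minimal Python sketch. The abstract gives no explicit formulas, so the component definitions below are assumptions made for illustration: positive matches are taken as the sum of paired minima, mismatches as the excess of each element over that minimum, and negative matches as the shortfall of the paired maximum from an assumed upper bound `upper`. The function names, the `upper` parameter, the handling of pairs that are both zero, and the restriction to non-negative data are all hypothetical choices, not taken from the article.

```python
def quantified_components(x, y, upper=1.0):
    """Return (a, b, c, d) for two equal-length, non-negative vectors.

    Assumed definitions (illustrative, not quoted from the article):
      a = sum of min(x_i, y_i)           positive matches
      b = sum of x_i - min(x_i, y_i)     mismatches where x exceeds y
      c = sum of y_i - min(x_i, y_i)     mismatches where y exceeds x
      d = sum of upper - max(x_i, y_i)   negative matches
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(min(p, q) for p, q in zip(x, y))
    b = sum(p - min(p, q) for p, q in zip(x, y))
    c = sum(q - min(p, q) for p, q in zip(x, y))
    d = sum(upper - max(p, q) for p, q in zip(x, y))
    return a, b, c, d


def quantified_jaccard(x, y):
    """Jaccard form a / (a + b + c), built from the quantified components."""
    a, b, c, _ = quantified_components(x, y)
    total = a + b + c
    # Assumed convention: two all-zero vectors count as identical.
    return a / total if total else 1.0


def quantified_simple_matching(x, y, upper=1.0):
    """Simple matching form (a + d) / (a + b + c + d)."""
    a, b, c, d = quantified_components(x, y, upper)
    return (a + d) / (a + b + c + d)


def mean_ratio(x, y):
    """Average of min/max over attributes, so each attribute has equal weight."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    ratios = []
    for p, q in zip(x, y):
        hi = max(p, q)
        # Assumed convention: a pair of zeros counts as a perfect match.
        ratios.append(min(p, q) / hi if hi > 0 else 1.0)
    return sum(ratios) / len(ratios)


# Binary data: the components reduce to the usual match and mismatch counts.
binary_x, binary_y = [1, 1, 0, 0], [1, 0, 1, 0]
print(quantified_components(binary_x, binary_y))

# Quantitative data: the large attribute dominates the quantified coefficient,
# while the mean ratio treats both attributes equally.
quant_x, quant_y = [10, 1], [5, 1]
print(quantified_jaccard(quant_x, quant_y), mean_ratio(quant_x, quant_y))
```

With the binary vectors, each of the four components equals 1, matching the ordinary 2 × 2 match table. With the quantitative pair, the first attribute contributes most of the sum in the Jaccard form, whereas the mean ratio averages the per-attribute ratios 0.5 and 1.0, which is one way to see the difference in weighting noted above.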
Cite this article
Sepkoski, J.J. Quantified coefficients of association and measurement of similarity. Mathematical Geology 6, 135–152 (1974). https://doi.org/10.1007/BF02080152
DOI: https://doi.org/10.1007/BF02080152