Elsevier

Ecological Complexity

Volume 31, September 2017, Pages 201-205

Short Note
On some properties of the Bray-Curtis dissimilarity and their ecological meaning

https://doi.org/10.1016/j.ecocom.2017.07.003

Highlights

  • We examine some basic properties of the Bray-Curtis (BC) dissimilarity.

  • We first suggest an additive decomposition formula for the BC coefficient.

  • Next, we derive a general dissimilarity formula that includes the BC dissimilarity as a special case.

  • Finally we show that the BC coefficient exhibits a linear response to the transfer of species abundances from an abundant plot to a less abundant plot.

Abstract

In this paper, we examine some basic properties of the Bray-Curtis dissimilarity as compared with other distance and dissimilarity functions applied to ecological abundance data. We argue that the ability of every coefficient to measure species-level contributions is a fundamental requirement. By suggesting an additive decomposition formula for the Bray-Curtis coefficient we derive a general formula of dissimilarity, which includes the Canberra distance and the Bray-Curtis dissimilarity as special cases. A similar general formula is also proposed for the Marczewski-Steinhaus coefficient. Finally, using a modified version of Dalton’s principle of transfers, we show that the Bray-Curtis coefficient and the city-block distance exhibit a linear response to the transfer of species abundances from an abundant plot to a less abundant plot. At the other extreme, the chord and the Hellinger distances show an irregular and non-monotonic behavior.

Introduction

Ecologists routinely use dissimilarity measures between pairs of plots (or species assemblages, communities, sites, quadrats, etc.) to explore community assembly processes. Given two plots U and V, a logical starting point for evaluating any other dissimilarity measure is the Euclidean distance because it corresponds to our everyday feeling about interpoint distances in the visible and easily measurable 3D (physical) world:

$$ED_{UV} = \sqrt{\sum_{j=1}^{S} (x_{Uj} - x_{Vj})^2}$$

where $x_{Uj}$ and $x_{Vj}$ are the abundance values of species j in plots U and V, respectively, S is the total number of species recorded in these two plots: $S = |S_U \cup S_V|$, and $S_U$ is the set of species in plot U.

However, community ecologists have repeatedly argued that this coefficient may provide misleading results for species abundance data containing zeros (e.g. Orlóci, 1972, Orlóci, 1978, Legendre and Gallagher, 2001). As an example, let us consider an artificial community composition matrix composed of four species (S1–S4) in three plots (U–W):

If we use the Euclidean distance to measure dissimilarity, we find that the distance between plots U and W, which share species S1 and S2, is larger than that between plots U and V, which have no species in common: $ED_{UV} = 3.162$; $ED_{UW} = 4.472$; $ED_{VW} = 7.071$. This is ecologically counter-intuitive, because plots U and W contain the same species while plot V hosts a unique set of species. That is, abundance differences completely override a more fundamental issue: agreement in the presence of species. This effect may be more substantial for large data matrices, in which many species have just a few records, leading to sparse matrices filled predominantly with zeros (Legendre and Gallagher, 2001).
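The community matrix for this example is not reproduced in the text above. As a rough check of the arithmetic, the following Python sketch uses hypothetical abundance values. They are an assumption, chosen only because they reproduce the three reported distances and the stated pattern of shared species; the helper name `euclidean` is likewise introduced here for illustration.

```python
import math

# Hypothetical abundances for species S1-S4 (NOT the article's table):
# U and W share S1 and S2, V holds only S3 and S4.
U = [1, 2, 0, 0]
V = [0, 0, 1, 2]
W = [3, 6, 0, 0]

def euclidean(x, y):
    """Euclidean distance between two abundance vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

print(round(euclidean(U, V), 3))  # 3.162
print(round(euclidean(U, W), 3))  # 4.472
print(round(euclidean(V, W), 3))  # 7.071
```

With these values, the two plots that share all their species (U and W) end up farther apart than the two plots that share none (U and V), which is the paradox described above.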

In order to eliminate the problems inherent to the Euclidean distance, ecologists have developed a rich arsenal of alternative coefficients (see Legendre and Legendre, 2012, for a review). These indices incorporate some operation involving data standardization, i.e. modification of data such that each new score depends on other values in the matrix. If the plot vectors (columns in the example above) are first standardized to unit length by dividing each value by the length of the vector according to $x'_{Uj} = x_{Uj} / \sqrt{\sum_{j=1}^{S} x_{Uj}^2}$, and then the Euclidean distance is calculated from the normalized quantities $x'_{Uj}$, we get the chord distance (Orlóci, 1967) given by the formula:

$$CH_{UV} = \sqrt{\sum_{j=1}^{S} \left( \frac{x_{Uj}}{\sqrt{\sum_{j=1}^{S} x_{Uj}^2}} - \frac{x_{Vj}}{\sqrt{\sum_{j=1}^{S} x_{Vj}^2}} \right)^2} = \sqrt{\sum_{j=1}^{S} (x'_{Uj} - x'_{Vj})^2}$$

CH is equivalent to the (Euclidean) length of the chord between two objects (plots) projected onto the surface of a hypersphere of unit radius (Orlóci, 1978, Legendre and Gallagher, 2001). Therefore, in the above example, while the Euclidean distance between U and W is 4.472, their chord distance is zero ($CH_{UW} = 0$), because these plots contain the same species in the same proportions. Since plot V has no species in common with plots U and W, we get the maximum distance between them ($CH_{UV} = CH_{VW} = \sqrt{2} \approx 1.414$). For an ecologist, this index thus captures information on community composition in a much more meaningful way than ED.
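The chord distance can be sketched in a few lines of Python. The plot vectors below are hypothetical values picked to be consistent with the distances quoted in the text (they are not the article's own matrix), and `chord` is an illustrative helper name.

```python
import math

# Hypothetical abundances (assumption): U and W have the same species in the
# same proportions; V shares no species with either.
U = [1, 2, 0, 0]
V = [0, 0, 1, 2]
W = [3, 6, 0, 0]

def chord(x, y):
    """Euclidean distance between the two vectors after scaling each to unit length.

    Assumes neither plot is empty; an all-zero vector would divide by zero.
    """
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return math.sqrt(sum((a / norm_x - b / norm_y) ** 2 for a, b in zip(x, y)))

print(round(chord(U, W), 3))  # 0.0   (same proportions)
print(round(chord(U, V), 3))  # 1.414 (no shared species, i.e. sqrt(2))
print(round(chord(V, W), 3))  # 1.414
```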

Another measure of multivariate plot-to-plot dissimilarity that can be calculated by first transforming the plot vectors in an appropriate way and then taking the Euclidean distance of the transformed vectors is the Hellinger distance (Legendre and Gallagher, 2001). In this case, the raw values $x_{Uj}$ are first transformed by dividing each value by the plot sum and then taking the square root of the resulting values such that $x'_{Uj} = \sqrt{x_{Uj} / \sum_{j=1}^{S} x_{Uj}}$. Then, the Euclidean distance is calculated from the transformed quantities $x'_{Uj}$ as:

$$HD_{UV} = \sqrt{\sum_{j=1}^{S} \left( \sqrt{\frac{x_{Uj}}{\sum_{j=1}^{S} x_{Uj}}} - \sqrt{\frac{x_{Vj}}{\sum_{j=1}^{S} x_{Vj}}} \right)^2} = \sqrt{\sum_{j=1}^{S} (x'_{Uj} - x'_{Vj})^2}$$
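A minimal sketch of the Hellinger transformation and distance follows. The text reports no Hellinger values for its example, so the numbers here come from hypothetical plot vectors (an assumption) and only illustrate the two steps: relativize by the plot total, take square roots, then apply the Euclidean distance. The name `hellinger` is illustrative.

```python
import math

# Hypothetical abundances (assumption, not taken from the article).
U = [1, 2, 0, 0]
V = [0, 0, 1, 2]
W = [3, 6, 0, 0]

def hellinger(x, y):
    """Euclidean distance between square-rooted relative abundances.

    Assumes both plots have a positive total abundance.
    """
    total_x, total_y = sum(x), sum(y)
    return math.sqrt(
        sum((math.sqrt(a / total_x) - math.sqrt(b / total_y)) ** 2
            for a, b in zip(x, y))
    )

print(round(hellinger(U, W), 3))  # 0.0   (identical relative abundances)
print(round(hellinger(U, V), 3))  # 1.414 (no shared species)
```

Like the chord distance, this measure depends only on relative abundances, so two plots with the same proportions are at distance zero whatever their totals.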

Raw data may be transformed in many other ways, however. The formula suggested by Bray and Curtis (1957) relativizes species-wise differences by the total abundance of species in the two plots:

$$BC_{UV} = \frac{\sum_{j=1}^{S} |x_{Uj} - x_{Vj}|}{\sum_{j=1}^{S} (x_{Uj} + x_{Vj})}$$

This index reflects the proportion of the total species abundances in which the two plots differ. For the above example, BC also outperforms ED because the maximum distance is obtained when the plots being compared have no species in common ($BC_{UV} = 1$ and $BC_{VW} = 1$), whereas $BC_{UW} = 0.5$. This latter value shows that, unlike CH, BC takes the value zero only if the two plots being compared are identical.
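The Bray-Curtis values quoted above can be checked with a short Python sketch. The abundances are again hypothetical (an assumption chosen to match the reported results, not the article's table), and `bray_curtis` is an illustrative helper.

```python
# Hypothetical abundances for species S1-S4 (assumption).
U = [1, 2, 0, 0]
V = [0, 0, 1, 2]
W = [3, 6, 0, 0]

def bray_curtis(x, y):
    """Sum of absolute differences divided by the total abundance of both plots.

    Assumes at least one of the two plots is non-empty.
    """
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator

print(bray_curtis(U, V))  # 1.0 (no shared species)
print(bray_curtis(V, W))  # 1.0
print(bray_curtis(U, W))  # 0.5 (same species, different abundances)
print(bray_curtis(U, U))  # 0.0 (identical plots)
```

U and W have identical proportions, yet their Bray-Curtis dissimilarity is 0.5 rather than zero, because the coefficient also responds to differences in absolute abundance.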

These three coefficients illustrate well that, although dissimilarity may appear to be an intuitively simple concept, there is no single, unequivocal way to measure it. The literature of numerical ecology treats many more, even hundreds of, dissimilarity functions (see e.g., Orlóci, 1978, Podani, 2000, Legendre and Legendre, 2012), and selection among them is often arbitrary, dictated by fashion, availability in commercial software or personal preference. The choice of the dissimilarity index best suited to a specific ecological problem is a complex question that does not have a clear and unambiguous answer. However, while these references provide some information to help ecologists decide, the properties of even the best-known indices are not fully understood.

The aim of this paper is thus to review some of the properties of the Bray-Curtis dissimilarity relevant for ecologists. The paper is organized as follows: first, we discuss the relationships of the Bray-Curtis dissimilarity with the Canberra dissimilarity family (sensu Podani, 2000). Next, we show the ability of the Bray-Curtis dissimilarity to conform to a generalization of Dalton’s (1920) principle of transfers to a pair of plots.

An unconventional genealogy of the Bray-Curtis dissimilarity

The Euclidean distance is a special case of a more general parametric family of dissimilarity functions called Minkowski distance:

$$MNK_{UV} = \left( \sum_{j=1}^{S} |x_{Uj} - x_{Vj}|^{\alpha} \right)^{1/\alpha}$$

where $\alpha \geq 1$. For α = 2, we have the Euclidean distance. For α = 1, we obtain the so-called city-block (or Manhattan) distance, which is the sum of absolute differences in species abundances:

$$CB_{UV} = \sum_{j=1}^{S} |x_{Uj} - x_{Vj}|$$
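To make the two special cases concrete, here is a small Python sketch of the Minkowski family. The function name `minkowski` and the two example vectors are assumptions introduced for illustration.

```python
import math

def minkowski(x, y, alpha):
    """Minkowski distance of order alpha (alpha >= 1) between two vectors."""
    if alpha < 1:
        raise ValueError("alpha must be at least 1")
    return sum(abs(a - b) ** alpha for a, b in zip(x, y)) ** (1 / alpha)

# Hypothetical abundance vectors (assumption).
U = [1, 2, 0, 0]
W = [3, 6, 0, 0]

city_block = minkowski(U, W, 1)  # sum of absolute differences: 2 + 4
euclid = minkowski(U, W, 2)      # square root of 2**2 + 4**2

print(city_block)         # 6.0
print(round(euclid, 3))   # 4.472
```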

An advantage of the city-block formula over ED is that species-wise differences are not exaggerated by squaring (Orlóci, 1972). Division by the number of

A modified principle of transfers for a pair of plots

In the previous section we showed that the Bray-Curtis dissimilarity is sensitive to differences in abundance between species, and that abundant species are weighted more than rare species. The aim of this section is now to analyze how BC is influenced by differences in species abundances between plots.
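The exact form of the modified principle of transfers is not given in this excerpt. As a minimal sketch, assume that a transfer moves an amount t of one species from the plot where it is more abundant to the plot where it is less abundant, with t no larger than half the original difference. The plot vectors and the helpers `transfer`, `bray_curtis` and `city_block` are hypothetical names and values used only for illustration.

```python
def bray_curtis(x, y):
    """Sum of absolute differences over the total abundance of both plots."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))

def city_block(x, y):
    """Sum of absolute differences in species abundances."""
    return sum(abs(a - b) for a, b in zip(x, y))

def transfer(donor, receiver, j, t):
    """Return copies of both plots after moving t units of species j."""
    d, r = list(donor), list(receiver)
    d[j] -= t
    r[j] += t
    return d, r

# Hypothetical plots (assumption): species index 1 has 6 units in W, 2 in U.
W = [3, 6, 0, 0]
U = [1, 2, 0, 0]

amounts = [0.0, 0.5, 1.0, 1.5, 2.0]  # t stays within half the difference (2)
bc_values, cb_values = [], []
for t in amounts:
    w_t, u_t = transfer(W, U, 1, t)
    bc_values.append(bray_curtis(w_t, u_t))
    cb_values.append(city_block(w_t, u_t))

# Change in BC per step of 0.5 units transferred.
bc_steps = [b - a for a, b in zip(bc_values, bc_values[1:])]

print(cb_values)  # [6.0, 5.0, 4.0, 3.0, 2.0]
print(bc_steps)   # four (nearly) identical negative steps
```

Under this assumption the total abundance of the two plots is unchanged by the transfer, while the absolute difference for the affected species shrinks by 2t. Both coefficients therefore decrease by the same amount at every step, which is one way to read the linear response mentioned in the abstract.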

The question whether a given index is a suitable measure of dissimilarity is usually answered axiomatically by assessing whether the index meets some properties that are intuitively considered to

Discussion

Ecologists have proposed an extensive arsenal of coefficients for summarizing different aspects of plot-to-plot dissimilarity. In this view, the behavior of such measures must be understood to assess whether these measures allow useful biological distinctions between a pair of plots. In this paper we thus reviewed some of the properties of the Bray-Curtis dissimilarity that may be relevant in the context of ecology.

We started from the suggestion that the BC index is additively decomposable into

Acknowledgments

We wish to thank Paulo Inácio Prado, Dave Roberts and one anonymous reviewer for their very constructive comments on a previous version of our paper.

References (24)

  • J.R. Bray et al. An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. (1957)
  • A.J. Cain et al. An analysis of the taxonomist’s judgement of affinity. Proc. Zoolog. Soc. Lond. (1958)
  • K.R. Clarke. Non-parametric multivariate analyses of changes in community structure. Aust. J. Ecol. (1993)
  • H. Dalton. Measurement of the inequality of incomes. Econ. J. (1920)
  • D.P. Faith et al. Compositional dissimilarity as a robust measure of ecological distance. Vegetatio (1987)
  • J.F. Grassle et al. A similarity measure sensitive to the contribution of rare species and its use in investigation of variation in marine benthic communities. Oecologia (1976)
  • P. Holgate. Notes on the Marczewski-Steinhaus coefficient of similarity
  • G.N. Lance et al. Mixed data classificatory programs. I. Agglomerative systems. Aust. Comput. J. (1967)
  • P. Legendre et al. Ecologically meaningful transformations for ordination of species data. Oecologia (2001)
  • P. Legendre et al. Numerical Ecology (2012)
  • P. Legendre. Interpreting the replacement and richness difference components of beta diversity. Global Ecol. Biogeogr. (2014)
  • M. Levandowsky. An ordination of phytoplankton populations in ponds of varying similarity and temperature. Ecology (1972)