The Correlational Agreement Coefficient CA(≤,D)—a mathematical analysis of a descriptive goodness-of-fit measure

doi:10.1016/j.mathsocsci.2004.03.003

Mathematical Social Sciences

Volume 48, Issue 3, November 2004, Pages 281-314

https://doi.org/10.1016/j.mathsocsci.2004.03.003

Abstract

The Correlational Agreement Coefficient, CA(≤,D), was introduced by J.F.J. van Leeuwe in 1974 within Item Tree Analysis (ITA), a data-analytic method for deriving quasi orders (surmise relations) on sets of bi-valued test items. Recently, it has become of interest in connection with Knowledge Space Theory (KST). The coefficient CA(≤,D) is used as a descriptive goodness-of-fit measure to select, from competing surmise relations, one with maximal CA(≤,D) value. Formal aspects such as boundedness, decomposition, and the interplay between consistency of a surmise relation (with a binary data matrix) and the attainment of the maximum value of CA(≤,D) are investigated. The dependence of CA(≤,D) on trivial response patterns is quantified by a functional relationship that collects the impact of trivial response patterns in a single “bias term”. These considerations should warn against careless use of the coefficient. Mathematical reasons are presented for why some heuristically plausible properties fail to hold.

Introduction

In the field of knowledge assessment and acquisition based on prerequisite relationships, a central problem is to derive reflexive, transitive binary relations on sets of bi-valued test items. This is done to model hierarchies between items based on solvability dependencies of the type: “Given a positive response to an item J (e.g., J solved), it can be surmised that another item I will also be responded to positively (e.g., I solved)”. Such binary relations (quasi orders) are central within Knowledge Space Theory (KST), introduced by Doignon and Falmagne (1985, 1999). In KST, they are called surmise relations. However, given a field of knowledge and a set of bi-valued test items appropriate enough to allow for fine-grained and representative coverage of the field, the problem is how to establish a reasonable surmise relation on the item set. Item Tree Analysis (ITA) is a data-analytic method for the derivation of surmise relations on sets of bi-valued test items. ITA was introduced by Airasian, Bart, and Krus in 1973 (Airasian and Bart, 1973; Bart and Krus, 1973) and was developed into its present form by Leeuwe (1974). In particular, Leeuwe (1974) introduced the Correlational Agreement Coefficient, CA(≤,D), as part of ITA.¹ In ITA, CA(≤,D) is used as a descriptive goodness-of-fit measure to select, from competing surmise relations, one with maximal CA(≤,D) value.

Recently, ITA, and in particular CA(≤,D), has become of interest in connection with KST; see Held et al. (1995), Held and Korossy (1998), Schrepp (1999, 2003), and Schrepp et al. (1999). For the application of efficient adaptive computer-based knowledge assessment procedures, one requires surmise relations of “a trade-off type”. On the one hand, such a relation should reflect the data as well as possible (descriptive adequacy); on the other hand, it should have as large a cardinality as possible as a set. These authors tried to achieve this by applying ITA and the coefficient CA(≤,D) (cp. Section 12).

Leeuwe (1974) reports:² “…This coefficient [partial order reproducibility coefficient]³ cannot serve therefore [stationarity in tolerance level L=0] as a criterion for choosing the best solution…This procedure [CA(≤,D)] has the advantage that it gives a lower value not only in the case that too many relations are constructed [larger tolerance levels], but also in the case that the number of relations is very low [smaller tolerance levels]”.

ITA's renaissance in connection with KST has led to criticisms of CA(≤,D). Held and Korossy (1998) stress the “ad hoc” (descriptive) nature of CA(≤,D): “…we will apply two ad hoc criteria [one, the CA(≤,D)]”. Schrepp (1999) illustrates that CA(≤,D) can be reduced by non-comparable item pairs: “…for relations which contain many non-connected item pairs it seems possible that the correct relation ≤_L will not have the best CA(≤_L) value”. Another criticism of ITA and CA(≤,D) is voiced by Wesiak et al. (2004). They observe that trivial response patterns (i.e., all or none of the items answered positively), though empirically irrelevant with respect to solvability dependencies between items, do drastically manipulate ITA solutions. This is due to CA(≤,D)'s dependence on such patterns (cp. Section 11).

In the light of these observations, a comprehensive mathematical analysis of CA(≤,D) is missing. Rather, the elaborations so far are heuristic, based on experimentation with certain data sets. Deeper properties of CA(≤,D) are not known so far. This work therefore presents a coherent and extensive mathematical treatment of CA(≤,D). In particular, it warns against careless use of the coefficient and, if the coefficient is used, indicates what one needs to pay attention to. This work may also be viewed as a general guide for carrying out a first mathematical analysis of ad hoc formulated coefficients. Additionally, Section 12 addresses methodological issues regarding goodness-of-fit measures in general. Besides the criteria proposed by Goodman and Kruskal (1954, 1959, 1963, 1972) (reviewed by Bishop et al., 1975; Liebetrau, 1983), Section 12 mentions the importance of purpose-specific goodness-of-fit measures and the problem of trade-off between different fit criteria.

This section reviews Leeuwe's (1974) Item Tree Analysis.

We use the following notation (m,n∈ $N$ )⁴:

Q≔{I_l: 1≤l≤m} set of dichotomous items,
P≔{P_k: 1≤k≤n} sample of subjects,
D≔(d_kl′) corresponding binary (=0/1) n×m data matrix,

and for every (I_i,I_j)∈Q×Q (1≤i, j≤m), the 2×2 table notation

$\begin{array}{c|cc} I_{i} \backslash I_{j} & 1 & 0 \\ \hline 1 & a_{ij} & b_{ij} \\ 0 & c_{ij} & d_{ij} \end{array}$

with a_ij,b_ij,c_ij,d_ij∈ $N$ ∪{0}; in respective order, the absolute frequencies of subjects solving items I_i and I_j [a_ij], solving I_i, not I_j [b_ij], solving I_j, not I_i [c_ij], and solving neither I_i nor I_j [d_ij]. Then, the ITA rule for generating binary relations ≤_L (0≤L≤n) is given by

I_{i} ≤_{L} I_{j} :⇔c_{ij} ≤L.

This L (0≤L≤n) is called tolerance level. The ITA rule represents STEP1 of ITA. The latter consists of five steps, STEP1–STEP5:

STEP1.
Determine the binary relations ≤_L for L=0, 1,…, n.
STEP2.
From the ≤_L (0≤L≤n), remove those that are not transitive.
STEP3.
Set a critical value 0<c≤1 for the proportions, p_L, of subjects not contradicting the respective surmise relations ≤_L in STEP2.
STEP4.
From the surmise relations in STEP2, remove those with p_L<c.
STEP5.
From the remaining surmise relations (after STEP4)—≤₀ is always contained—select one with maximal CA(≤,D) value.

The Correlational Agreement Coefficient is used as a goodness-of-fit measure to handle the selection problem in STEP5. From the remaining surmise relations, select an “optimal” one, i.e., one with maximal CA(≤,D) value.
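The first steps of the procedure can be illustrated with a minimal Python sketch. The data matrix `D` below is hypothetical, and all function names are illustrative choices rather than part of ITA; the sketch covers STEP1 and STEP2 and computes the proportions p_L needed for STEP3 and STEP4, reading "not contradicting ≤_L" as "the response pattern violates no pair of ≤_L" (an assumption about the exact definition of p_L).

```python
from itertools import product

def counterexamples(D):
    """c[i][j] = number of subjects solving item j but not item i."""
    m = len(D[0])
    return [[sum(1 for row in D if row[j] == 1 and row[i] == 0)
             for j in range(m)] for i in range(m)]

def relation(D, L):
    """ITA rule: the pair (i, j) belongs to <=_L iff c_ij <= L."""
    c = counterexamples(D)
    m = len(c)
    return {(i, j) for i, j in product(range(m), repeat=2) if c[i][j] <= L}

def is_transitive(rel):
    """True iff (i, j) and (j, k) in rel always imply (i, k) in rel."""
    return all((i, k) in rel
               for (i, j) in rel for (j2, k) in rel if j == j2)

def proportion_consistent(D, rel):
    """Assumed reading of p_L: share of rows violating no pair in rel."""
    ok = sum(1 for row in D
             if all(not (row[j] == 1 and row[i] == 0) for i, j in rel))
    return ok / len(D)

# Hypothetical data: 5 subjects (rows), 3 items (columns).
D = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0],
     [0, 0, 0],
     [0, 1, 0]]

n = len(D)
candidates = {L: relation(D, L) for L in range(n + 1)}               # STEP1
surmise = {L: r for L, r in candidates.items() if is_transitive(r)}  # STEP2
p = {L: proportion_consistent(D, r) for L, r in surmise.items()}     # for STEP3/4
```

STEP5 would then compare the CA(≤,D) values of the relations that survive the cut-off on p_L.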

Basic concepts and the definition of empirical Pearson correlation are reviewed (Section 2). The definition of theoretical correlation is presented (Section 3). Empirical and theoretical correlation are compared in regard to coincidence (Section 4) and boundedness (Section 5). Based on this, CA(≤,D) is defined coherently (Section 6). A natural decomposition of the coefficient CA(≤,D) into four partial functions is given (Section 7). The coefficient is analyzed in regard to boundedness (Section 8). An analysis of the interplay between the consistency of a surmise relation ≤ with a data matrix D and the attainment of the maximum value of CA(≤,D) is presented (Sections 9 and 10). We conclude with the analysis of the dependence of CA(≤,D) on trivial response patterns (Section 11). The work ends with a discussion (Section 12).

Note that all proofs are deferred to an appendix, section-wise (Appendix A).

Section snippets

Basic concepts

We review basic conventions regarding terminology and notation.

Let Q, P, and D be defined as in Section 1.2. The row z_k (1≤k≤n) of D encodes the responses of subject P_k to all items in Q, whereas column s_l (1≤l≤m) of D encodes the responses of all subjects in P to item I_l.

Definition 1

Let Q={I_l: 1≤l≤m} (m∈ $N$ ). We define:⁵ $S ≔{≤⊆Q×Q:≤ quasi order on Q},$ $D ≔ ∪ n∈$

Theoretical correlation derived through idealization

Section 4 gives motivation for the form and name of theoretical correlation.

Definition 8

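Let (I_i,I_j)∈[Q×Q]_A and ≤∈ $S$ . Theoretical correlation, r_ij*, between I_i and I_j, derived through idealization, is defined as $r_{ij}^{*} := \begin{cases} 1 & : (I_{i},I_{j}) \in \leq \wedge (I_{j},I_{i}) \in \leq \\ \sqrt{\dfrac{(1-p_{I_{i}}) \cdot p_{I_{j}}}{(1-p_{I_{j}}) \cdot p_{I_{i}}}} & : (I_{i},I_{j}) \in \leq \wedge (I_{j},I_{i}) \notin \leq \\ \sqrt{\dfrac{(1-p_{I_{j}}) \cdot p_{I_{i}}}{(1-p_{I_{i}}) \cdot p_{I_{j}}}} & : (I_{i},I_{j}) \notin \leq \wedge (I_{j},I_{i}) \in \leq \\ 0 & : (I_{i},I_{j}) \notin \leq \wedge (I_{j},I_{i}) \notin \leq \end{cases}$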

Theoretical correlation r_ij* is well-defined for every (I_i,I_j)∈[Q×Q]_A. It is the case that s_i,s_j≠ $0$ _n, $1$ _n, i.e., p_{I_i}, p_{I_j}≠0, 1.
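The case distinction in Definition 8 can be written out as a short Python sketch. It assumes that the two middle cases are square roots of the displayed ratios, which is the reading consistent with the bound n−1 in Proposition 13; the function name and the Boolean arguments `i_le_j` and `j_le_i` are illustrative, not notation from the paper.

```python
from math import sqrt

def theoretical_correlation(p_i, p_j, i_le_j, j_le_i):
    """Theoretical correlation r_ij* from solution frequencies p_i, p_j.

    i_le_j / j_le_i state whether (I_i, I_j) / (I_j, I_i) lies in the
    surmise relation. The square-root form of the two middle cases is
    an assumed reading of Definition 8.
    """
    if not (0 < p_i < 1 and 0 < p_j < 1):
        # Outside this range the ratios are undefined or degenerate.
        raise ValueError("p_i and p_j must lie strictly between 0 and 1")
    if i_le_j and j_le_i:
        return 1.0
    if i_le_j:
        return sqrt((1 - p_i) * p_j / ((1 - p_j) * p_i))
    if j_le_i:
        return sqrt((1 - p_j) * p_i / ((1 - p_i) * p_j))
    return 0.0
```

The guard on `p_i` and `p_j` mirrors the requirement p_{I_i}, p_{I_j}≠0, 1 stated above.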

Comparing empirical and theoretical correlation: coincidence

Lemma 9

Let ≤∈ $S$ , which is consistent with binary response data D. Then, for all (I_i,I_j)∈≤∩ [Q×Q]_A, $r_{ij} = \begin{cases} 1 & : (I_{j},I_{i}) \in \leq \\ \sqrt{\dfrac{(1-p_{I_{i}}) p_{I_{j}}}{(1-p_{I_{j}}) p_{I_{i}}}} & : (I_{j},I_{i}) \notin \leq \end{cases}$

Proof

See Appendix A.1.□
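The second case of Lemma 9 can be checked by a short calculation. This is only a sketch and not necessarily the argument of Appendix A.1. It assumes that the empirical correlation of two binary items is written in the usual 2×2 form (the phi coefficient), that consistency with I_i≤I_j means c_ij=0, and that the square-root reading of the ratio applies.

```latex
\begin{align*}
r_{ij} &= \frac{a_{ij} d_{ij} - b_{ij} c_{ij}}
               {\sqrt{(a_{ij}+b_{ij})(c_{ij}+d_{ij})(a_{ij}+c_{ij})(b_{ij}+d_{ij})}}
        && \text{(phi coefficient)} \\
       &= \frac{a_{ij} d_{ij}}{\sqrt{(a_{ij}+b_{ij})\, d_{ij}\, a_{ij}\, (b_{ij}+d_{ij})}}
        && \text{(using } c_{ij}=0\text{)} \\
       &= \sqrt{\frac{a_{ij} d_{ij}}{(a_{ij}+b_{ij})(b_{ij}+d_{ij})}}
        = \sqrt{\frac{(1-p_{I_i})\, p_{I_j}}{(1-p_{I_j})\, p_{I_i}}},
\end{align*}
% since, for c_{ij} = 0:
%   p_{I_i} = (a_{ij}+b_{ij})/n,   p_{I_j} = a_{ij}/n,
%   1-p_{I_i} = d_{ij}/n,          1-p_{I_j} = (b_{ij}+d_{ij})/n.
```

The quantities a_ij and d_ij are positive here because p_{I_j}≠0 and p_{I_i}≠1. If in addition (I_j,I_i)∈≤, then b_ij=0 as well and the expression reduces to 1.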

The next corollary gives a first answer to the question of coincidence.

Corollary 10

Let ≤∈ $S$ be consistent with D, and let (I_i,I_j)∈≤ fulfill A. Then, theoretical correlation r_ij* equals empirical correlation r_ij (i.e., r_ij*=r_ij).

What can be said about coincidence in case of not-≤-comparable item pairs?⁸

Comparing empirical and theoretical correlation: boundedness

Empirical correlation uniformly lies in the interval [−1,1] (Lemma 6). What about theoretical correlation?

Proposition 13

Let Q={I_l: 1≤l≤m} (m∈ $N$ ). It holds:

(Relative Interval Nesting). Let D∈ $M$ (n×m; {0,1}), ≤∈ $S$ , and let (I_i,I_j)∈[Q×Q]_A. Then (since (I_i,I_j)∈[Q×Q]_A, n≥2), $0 \leq r_{ij}^{*} \leq n-1.$
(Proper Divergence to +∞). For m≥2, there exists an ≤_∗∈ $S$ and a pair (I_i,I_j)∈Q×Q with i<j and [(I_i,I_j)∈≤_∗∧(I_j,I_i)∉≤_∗], such that $\forall n \geq 2 \ \exists D_{n-1} \in M(n \times m; \{0,1\}): \left[ (r_{ij})_{n-1} = -1 \right] \wedge \left[ (r_{ij}^{*})_{n-1} = n-1 \right].$ ¹¹
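One family of data matrices with the properties stated in Proposition 13 can be sketched numerically. The construction below (one subject solving only I_i, the remaining n−1 subjects solving only I_j) is an assumption made for illustration and need not be the one used in the paper's proof; the function names are hypothetical, and r_ij* is computed with the square-root reading of Definition 8 for the case I_i≤I_j, not I_j≤I_i.

```python
from math import sqrt

def phi(x, y):
    """Empirical Pearson correlation of two 0/1 columns via the 2x2 table."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def extreme_columns(n):
    """Assumed construction: one subject solves only I_i, n-1 solve only I_j."""
    s_i = [1] + [0] * (n - 1)
    s_j = [0] + [1] * (n - 1)
    return s_i, s_j

def r_and_r_star(n):
    """Empirical and theoretical correlation for the assumed columns."""
    s_i, s_j = extreme_columns(n)
    p_i, p_j = sum(s_i) / n, sum(s_j) / n
    r = phi(s_i, s_j)
    r_star = sqrt((1 - p_i) * p_j / ((1 - p_j) * p_i))
    return r, r_star

for n in (2, 5, 10, 100):
    r, r_star = r_and_r_star(n)
    print(n, r, round(r_star, 6))
```

In this family the empirical correlation stays at its minimum while the theoretical correlation grows linearly in n, which is the gap that Proposition 13 describes.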

Defining the coefficient CA(≤,D)

Definition 15

Let Q≔{I_l: 1≤l≤m} (m∈ $N$ , m≥2), ≤ be a surmise relation on Q, and D=(d_kl′)∈ $M$ (n×m; {0,1}). Further, let $<_{Q}' := \{(I_{i},I_{j}) \in Q \times Q : i<j \text{ and } (I_{i},I_{j}) \text{ fulfills } A\}.$

The Correlational Agreement Coefficient, CA(≤,D), is defined as $CA(\leq,D) := 1 - \frac{2}{m(m-1)} \sum_{(I_{i},I_{j}) \in <_{Q}'} (r_{ij} - r_{ij}^{*})^{2}.$
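The definition can be turned into a small Python sketch. Two assumptions are made and should be checked against the full text: condition A is taken to mean that both item columns are non-constant (so that the empirical correlation exists), and the theoretical correlation uses the square-root reading of Definition 8. The data matrix and all names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Empirical Pearson correlation of two equally long columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)

def ca(rel, D):
    """CA(<=, D) for a relation given as a set of index pairs (i, j)."""
    n, m = len(D), len(D[0])
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            x = [row[i] for row in D]
            y = [row[j] for row in D]
            p_i, p_j = sum(x) / n, sum(y) / n
            if p_i in (0, 1) or p_j in (0, 1):
                continue  # assumed condition A: skip constant columns
            fwd, bwd = (i, j) in rel, (j, i) in rel
            if fwd and bwd:
                r_star = 1.0
            elif fwd:
                r_star = sqrt((1 - p_i) * p_j / ((1 - p_j) * p_i))
            elif bwd:
                r_star = sqrt((1 - p_j) * p_i / ((1 - p_i) * p_j))
            else:
                r_star = 0.0
            total += (pearson(x, y) - r_star) ** 2
    return 1 - 2 * total / (m * (m - 1))

# Hypothetical Guttman-type data: item 0 is easiest, item 2 hardest.
D = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
reflexive = {(0, 0), (1, 1), (2, 2)}
chain = reflexive | {(0, 1), (1, 2), (0, 2)}

ca_chain = ca(chain, D)          # chain is consistent with D
ca_reflexive = ca(reflexive, D)  # no item pair is comparable
```

For this data the chain contains only comparable pairs and is consistent with D, so each empirical correlation coincides with its theoretical counterpart and the coefficient reaches its maximum; the purely reflexive relation scores lower because every theoretical correlation is set to 0.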

We close this section with two (actually obvious) remarks.

Decomposing the coefficient CA(≤,D)

We begin with some notation.

Definition 16

Let Q≔{I_l: 1≤l≤m} (m≥2), D∈ $M$ (n×m; {0,1}), and ≤∈ $S$ . We define:
$<_{Q|_{\cong}}' = <_{Q}' \cap \{(I_{i},I_{j}) \in Q \times Q : (I_{i},I_{j}) \in \leq \wedge (I_{j},I_{i}) \in \leq\},$
$<_{Q|_{\ll}}' = <_{Q}' \cap \{(I_{i},I_{j}) \in Q \times Q : (I_{i},I_{j}) \in \leq \wedge (I_{j},I_{i}) \notin \leq\},$
$<_{Q|_{\gg}}' = <_{Q}' \cap \{(I_{i},I_{j}) \in Q \times Q : (I_{i},I_{j}) \notin \leq \wedge (I_{j},I_{i}) \in \leq\},$
$<_{Q|_{\not\asymp}}' = <_{Q}' \cap \{(I_{i},I_{j}) \in Q \times Q : (I_{i},I_{j}) \notin \leq \wedge (I_{j},I_{i}) \notin \leq\}.$

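The family $F$ ≔ $(<_{Q|_{\cong}}', <_{Q|_{\ll}}', <_{Q|_{\gg}}', <_{Q|_{\not\asymp}}')$ of subsets of <_Q′ fulfills
$<_{Q}' = \bigcup_{k \in \{\cong, \ll, \gg, \not\asymp\}} <_{Q|_{k}}'$ (Covering property),
$<_{Q|_{k}}' \cap <_{Q|_{l}}' = \emptyset$ for $k,l \in \{\cong, \ll, \gg, \not\asymp\}$, $k \neq l$ (Pairwise disjointness).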

In general, $F$ may not be a partition of <_Q′, since one of the members <_Q^∣ⁱ′

Boundedness of CA(≤,D)

Proposition 18

Let Q≔{I_l: 1≤l≤m} (m≥2). It holds:

(Relative Interval Nesting). If D∈ $M$ (n×m; {0,1}) for n∈ $N$ fixed, then, for all ≤∈ $S$ , $1−n^{2} ≤CA(≤,D)≤1.$ That is, partial function CA(.,D): $S$ → $R$ , ≤↦CA(.,D)(≤)≔CA(≤,D) has a bounded range CA(.,D)( $S$ )⊂[1−n²,1].
(Proper Divergence to −∞). There exists an ≤_∗∈ $S$ and (D_n)_{n∈ $N$} in $D$ : $\lim_{n \to \infty} CA(\leq_{*}, D_{n}) = -\infty,$ in the sense of diverging properly to −∞.

Proof

See Appendix A.4.□
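For m=2 the lower bound can be seen to be attained by a short calculation. This is a sketch built on the situation described in Proposition 13 (a pair with r_ij=−1 and r_ij*=n−1), and it may differ from the construction in Appendix A.4.

```latex
% Assumption: m = 2, so 2/(m(m-1)) = 1 and <_Q' contains the single
% pair (I_1, I_2); the relation \leq_* has I_1 \leq_* I_2 but not
% I_2 \leq_* I_1, and D_n is chosen such that r_{12} = -1 and
% r_{12}^{*} = n - 1.
\begin{align*}
CA(\leq_{*}, D_{n})
  &= 1 - \bigl(r_{12} - r_{12}^{*}\bigr)^{2} \\
  &= 1 - \bigl(-1 - (n-1)\bigr)^{2} \\
  &= 1 - n^{2} \;\longrightarrow\; -\infty \quad (n \to \infty).
\end{align*}
```

Under these assumptions the value 1−n² is reached for every n≥2, so the interval [1−n²,1] cannot be narrowed from below in general.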

Consistency–maximum problem

Reconsider the example in Lemma 11:

Lemma 20 Counterexample

Let Q≔{I_l: 1≤l≤m} (m≥2) and ≤∈ $S$ be a total fit to D. Then, it is not necessarily the case that CA(≤,D)=1. In other words, consistency does not imply maximum in general.

Proof

See Appendix A.5.□

Remark

If we presuppose consistency, and that CA(≤,D) depends on (r_ij−r_ij*)²>0 for a not-≤-comparable item pair (I_i,I_j)∈<_Q′, then we have: $CA(\leq,D) := 1 - \frac{2}{m(m-1)} \sum_{(I_{i},I_{j}) \in <_{Q}'} (r_{ij} - r_{ij}^{*})^{2}$ $= \underbrace{1 - \frac{2}{m(m-1)} \underbrace{\sum_{(I_{i},I_{j}) \in <_{Q|_{\not\asymp}}',\ \delta_{ij} > 0} \underbrace{(r_{ij} - r_{ij}^{*})^{2}}_{>0}}_{>0}}_{<1}.$

Equivalence between consistency and maximum is not a

Maximum–consistency problem

The converse implication is also not true in general.

Lemma 22 Counterexample

Let Q≔{I_l: 1≤l≤m} (m≥2), D∈ $D$ , and ≤∈ $S$ with CA(≤,D)=1. Then, it is not necessarily the case that ≤ is consistent with D. In other words, maximum does not imply consistency in general.

Proof

See Appendix A.6.□

Proposition 23 states that maximum CA(≤,D)=1 implies consistency, provided no subject contradicts any of the non-reflexive²⁰ pairs I_i≤I_j with non-existent empirical correlation r_ij.

Proposition 23

Functional relationship for equivalent data matrices

Wesiak et al. (2004) observe a “data-related” problem arising when trivial response patterns are included/excluded in/from the input data matrix for ITA. Such response patterns, though empirically irrelevant with respect to solvability dependencies between items, do drastically manipulate ITA solutions. Larger/smaller optimal L_opt (stronger/weaker structures ≤_opt) are obtained by adding/removing trivial patterns to/from the input data.
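The sensitivity to trivial patterns is already visible at the level of a single empirical correlation, as the following Python sketch suggests. It is only an illustration with hypothetical data and function names; it does not reproduce the bias term of Lemma 24 or any result of Wesiak et al. (2004).

```python
from math import sqrt

def phi(D, i, j):
    """Empirical correlation of items i and j from the 2x2 frequencies."""
    a = sum(1 for row in D if row[i] == 1 and row[j] == 1)
    b = sum(1 for row in D if row[i] == 1 and row[j] == 0)
    c = sum(1 for row in D if row[i] == 0 and row[j] == 1)
    d = sum(1 for row in D if row[i] == 0 and row[j] == 0)
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def add_trivial(D, k):
    """Append k all-solved and k none-solved response patterns."""
    m = len(D[0])
    return D + [[1] * m for _ in range(k)] + [[0] * m for _ in range(k)]

# Hypothetical data in which the two items are uncorrelated.
D = [[1, 0],
     [0, 1],
     [1, 1],
     [0, 0]]

r_before = phi(D, 0, 1)
r_after = phi(add_trivial(D, 4), 0, 1)
print(r_before, r_after)
```

The added rows contradict no surmise relation, yet they raise the empirical correlation, and with it every squared difference entering CA(≤,D) changes.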

Lemma 24 bunches the impact of such patterns in a single

Major misconceptions in CA(≤,D) publications

Two major misconceptions are present in some of the CA(≤,D) publications mentioned in Section 1.1:

(A)
The coefficient CA(≤,D) does not measure goodness-of-fit with respect to the fit criterion “number of response patterns in D matching all pairs in ≤”. In the terminology of knowledge spaces (Doignon and Falmagne, 1999), this is referred to as “number of response patterns in D matching one of the knowledge states in the quasi ordinal knowledge space $K$ _≤, corresponding to ≤”.²³

Acknowledgements

This research was supported by grants from the University of Graz to Ali Ünlü.

References (27)

J.-P. Doignon et al., Spaces for the assessment of knowledge, International Journal of Man–Machine Studies (1985)
M. Schrepp, On the empirical construction of implications between bi-valued test items, Mathematical Social Sciences (1999)
P.W. Airasian et al., Ordering theory: a new and useful measurement model, Educational Technology (1973)
W.M. Bart et al., An ordering-theoretic method to determine hierarchies among items, Educational and Psychological Measurement (1973)
G. Birkhoff, Rings of sets, Duke Mathematical Journal (1937)
Y.M.M. Bishop et al., Discrete Multivariate Analysis: Theory and Practice (1975)
J. Bortz, Statistik (1989)
H. Cramér, Mathematical Methods of Statistics (1946)
C.M. Dayton et al., A probabilistic model for validation of behavioral hierarchies, Psychometrika (1976)
J.-P. Doignon et al., Knowledge Spaces (1999)
G. Fischer, Lineare Algebra (1995)
L.A. Goodman et al., Measures of association for cross classifications, Journal of the American Statistical Association (1954)
L.A. Goodman et al., Measures of association for cross classifications: II. Further discussion and references, Journal of the American Statistical Association (1959)

