Models of Statistical Relationship*

Herbert F. Weisberg

doi:10.2307/1959947

Published online by Cambridge University Press: 02 September 2013

Herbert F. Weisberg, Ohio State University

Abstract

The choice among measures of relationship has increasingly become a matter of the interpretation of their intermediate values. Interpretations are important, but a prior question is the statistic's implicit model of a relationship—what it considers a perfect relationship, and what it considers a null relationship. A family of models based on combinations of certain maximum- and null-value conditions is analyzed in this paper. The distinction between the models can be used to shed light on the stakes involved in the choice among dichotomous variable measures as well as that among familiar ordinal statistics.

The models are ordered in terms of their leniency, and the coefficients based on each model are specified. An empirical analysis shows that the different measures are positively correlated, but those measures based on different models can differ sharply from one another. Statistics based on the same model covary regardless of differences in their interpretations. Since different models are intended to measure different concepts, multiple coefficients can allow investigators to examine their data in greater detail. Several political examples of the use of multiple models are provided.

Type: Research Article
Information: American Political Science Review, Volume 68, Issue 4, December 1974, pp. 1638–1655

DOI: https://doi.org/10.2307/1959947
Copyright: Copyright © American Political Science Association 1974

References

¹ Goodman, Leo A. and Kruskal, William H., “Measures of Association for Cross Classifications,” Journal of the American Statistical Association, 49 (December, 1954), 732–764.

² Kruskal, William H., “Ordinal Measures of Association,” Journal of the American Statistical Association, 53 (December, 1958), 814–861.

³ Somers, Robert H., “A New Asymmetric Measure of Association for Ordinal Variables,” American Sociological Review, 27 (December, 1962), 799–811.

⁴ Costner, Herbert L., “Criteria for Measures of Association,” American Sociological Review, 30 (June, 1965), 341–353.

⁵ See, for example, Somers's presentation of a proportional-reduction-in-error interpretation of his d_xy in Somers, Robert H., “On the Measurement of Association,” American Sociological Review, 33 (April, 1968), 291–292, and also Wilson, Thomas P., “A Proportional-Reduction-in-Error Interpretation for Kendall's Tau-b,” Social Forces, 47 (March, 1969), 340–342.

⁶ There would be little need to resort to a summary statistic were we considering only the relation between a single vote and party, but the interest usually extends to the importance of party across a wide range of votes. The fundamental concern would be in comparing the strength of party effects to other effects (such as regional divisions and urban-rural differences) or in comparing the role of partisan effects across issue areas, across time, between legislative chambers, and/or between different legislatures. Being able to summarize the importance of party (and other) effects on each vote permits comparing the distributions of the summary measures over the votes.

⁷ Grumm has employed the phi-statistic in this regard, Turner used the related chi-square statistic, and MacRae worked with Yule's Q (without reporting specific results). Anderson's handbook on roll-call analysis mentions several standard statistical coefficients, but does not consider the stakes involved in the choice between them. MacRae's roll-call analysis text includes an excellent discussion of measuring party differences, including comments on the choice between the regression slope and Yule's Q. See Grumm, John G., “The Means of Measuring Conflict and Cohesion in the Legislature,” Southwestern Social Science Quarterly, 44 (March, 1964), 377–388; Turner, Julius, Party and Constituency (Baltimore: The Johns Hopkins Press, 1951); MacRae, Duncan Jr., Issues and Parties in Legislative Voting (New York: Harper and Row, 1970), pp. 175–200; Anderson, Lee F., Watts, Meredith W. Jr., and Wilcox, Allen R., Legislative Roll-Call Analysis (Evanston, Ill.: Northwestern University Press, 1966), pp. 43–54.

⁸ If the marginals are accepted as given, the number of Democrats voting no in Chart 2.3 could range between 0 and 10. If there were 1 Democrat voting no (and 9 Republicans voting no), the relationship between party and vote would not be as strong as in Chart 2.3 and similarly if more than 1 Democrat voted no. If there were 10 Democrats but no Republicans casting negative votes, the party responsibility would again be maximal given the vote marginals. Thus the strongest relationships between party and vote with the given marginals occur when there is one zero cell in the table—no Democrats or no Republicans voting in the negative direction. Any other case represents a weaker relationship.
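The argument in this note can be checked numerically. The sketch below fixes a set of hypothetical marginals (60 Democrats, 40 Republicans, 90 yes votes, 10 no votes; these figures are assumptions chosen for illustration and are not the values of Chart 2.3) and computes Yule's Q as the number of Democrats voting no runs from 0 to 10. Under these assumptions, the magnitude of Q reaches 1 only at the two ends of the range, where one cell of the table is zero.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table with rows (a, b) and (c, d)."""
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical marginals (assumed for illustration; not taken from Chart 2.3).
DEMOCRATS, REPUBLICANS = 60, 40
NO_VOTES = 10

def table_for(dem_no):
    """Cells (Dem yes, Dem no, Rep yes, Rep no) implied by the fixed marginals."""
    rep_no = NO_VOTES - dem_no
    return (DEMOCRATS - dem_no, dem_no, REPUBLICANS - rep_no, rep_no)

# Q for every admissible count of Democrats voting no.
q_by_dem_no = {k: yules_q(*table_for(k)) for k in range(NO_VOTES + 1)}

for k, q in q_by_dem_no.items():
    print(f"Democrats voting no = {k:2d}  Q = {q:+.3f}")
```

With these assumed marginals, statistical independence falls at 6 Democrats voting no, so Q changes sign there; the sign only records which party supplies the negative votes.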

⁹ The literature which argues that democratic stability depends on individuals having numerous cross-cutting group memberships is reviewed in Rae, Douglas W. and Taylor, Michael, The Analysis of Political Cleavages (New Haven: Yale University Press, 1970), chapter 4.

¹⁰ McGinnis, Robert, “Logical Status of the Concept of Association,” The Midwest Sociologist, 20 (May, 1958), 72–77.

¹¹ Models 3a and 3b would regard Chart 2.3 as both a perfect and a null case. Model 2a would treat unanimous and near-unanimous votes as both perfect and null relationships. Balance can occur on unanimous (and near-unanimous) votes, which would make model 2b indeterminate. Maximum departure from cleavage does exist on unanimous votes, but such a relationship would not fulfill the conditions for strong monotonicity; consequently model 1d cannot exist. As a result of these problems, measures cannot be based on models 1d, 2a, 2b, 3a, and 3b.

Additionally, models 2c and 3c would regard all unanimous cases as both a perfect and a null relationship. Measures can be based on them only if they are undefined for strictly unanimous items.

¹² As noted in the chart, the computational formulas shown linearly transform the coefficient values to range from zero for a null relationship to one for a perfect relationship. The transformations are not always appropriate for nondichotomous variables, or for comparing relationships involving dichotomous variables with ones involving nondichotomous variables. Particularly model 1b and 2d measures might validly reflect the numbers of categories for the variables. Absolute values have been employed for measures of departure from statistical independence since signs are of limited meaning when dealing with dichotomous categoric variables.

Several measures are undefined for strictly unanimous items. A measure which does not approach zero as a variable becomes unanimous is regarded as satisfying the moderate (or, if appropriate, weak) monotonicity rule.

The opposition index is suggested by MacRae, Issues and Parties in Legislative Voting, p. 199. Koppa is a linear function of the usual simple agreement score. The cross-cutting measure is based on the work of Rae and Taylor, The Analysis of Political Cleavages. An unconventional formula is shown for Goodman and Kruskal's tau, based on an algebraic simplification of its formula to phi-squared for dichotomous variables. Theta is described in Messenger, Robert C. and Mandell, Lewis M., “A Modal Search Technique for Predictive Nominal Scale Multivariate Analysis,” Journal of the American Statistical Association, 67 (December, 1972), 768–772. The phi/phi-max formula differs from the usual version of that statistic in that it assumes the value of unity whenever there is a zero cell in the cross-tabulation. The cosine-pi approximation for tetrachoric r is employed. The backward regression coefficient, b_xy, is included in Chart 5 since its maximum and null value conditions are among those treated in the text. Other measures are standard statistical coefficients, simple transformations of such coefficients, or totally ad hoc measures.
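Two of the computational points in this note lend themselves to a quick numerical check. The sketch below is illustrative only: the formulas of Chart 5 are not reproduced here, so it uses the textbook definition of Goodman and Kruskal's tau and the commonly cited cosine-pi form, cos(π / (1 + √(ad/bc))), rearranged so that a zero cell causes no division by zero. The cell counts are hypothetical.

```python
import math

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table with rows (a, b) and (c, d)."""
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

def goodman_kruskal_tau(a, b, c, d):
    """Goodman and Kruskal's tau, predicting the column variable from the row variable."""
    n = a + b + c + d
    within_rows = (a * a + b * b) / (a + b) + (c * c + d * d) / (c + d)
    col_term = ((a + c) ** 2 + (b + d) ** 2) / n
    return (within_rows - col_term) / (n - col_term)

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to tetrachoric r (undefined if ad and bc are both 0)."""
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return math.cos(math.pi * root_bc / (root_ad + root_bc))

# Hypothetical cell counts, assumed for illustration.
cells = (40, 10, 20, 30)
tau_value = goodman_kruskal_tau(*cells)
phi_squared = phi(*cells) ** 2
print(f"tau = {tau_value:.6f}, phi squared = {phi_squared:.6f}")

# A zero cell drives the cosine-pi approximation to its maximum.
print(tetrachoric_cos_pi(10, 0, 5, 5))
```

For this table, tau and phi-squared agree to rounding error, which is consistent with the simplification described above for dichotomous variables.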

¹³ An a priori ordering of the severity of null conditions b and c is not evident, but the ordering in the text has been chosen because condition b behaves like condition a in this instance (and in the cases mentioned in footnote 11) while condition c behaves like condition d.

¹⁴ The use of Pearson's r here is debatable, but it is consistent with the usual practice of treating party voting measures at the interval level (as by taking their mean over a series of votes). The Pearson's r values should be considered conservative since they measure only linear rather than monotonic relationships; for example, the Yule's Q and Yule's Y values have a correlation of “only” .94 with one another while in fact their relationship is perfectly monotonic.
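The point about Pearson's r being conservative can be illustrated with a small sketch. It computes Yule's Q and Yule's Y for a handful of hypothetical 2x2 tables (assumed for illustration; they are not the roll-call data analyzed in the text) and compares the Pearson correlation of the raw values with the same correlation computed on ranks, which is Spearman's rho when there are no ties.

```python
import math

def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table with rows (a, b) and (c, d)."""
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Yule's Y (coefficient of colligation) for the same table."""
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the tables below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

# Hypothetical tables with steadily increasing association.
TABLES = [(50, 50, 50, 50), (60, 40, 40, 60), (70, 30, 30, 70),
          (80, 20, 20, 80), (90, 10, 10, 90), (99, 1, 1, 99)]

q_values = [yules_q(*t) for t in TABLES]
y_values = [yules_y(*t) for t in TABLES]

linear_r = pearson_r(q_values, y_values)
rank_rho = pearson_r(ranks(q_values), ranks(y_values))
print(f"Pearson r = {linear_r:.3f}, Spearman rho = {rank_rho:.3f}")
```

Because Q is a monotonic but nonlinear function of Y, the rank correlation is perfect while the linear correlation falls short of unity; the exact shortfall depends on the tables chosen.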

The correlational approach employed in this section would not be possible in legislatures where the relationship between party and vote is identical for all votes, in which case there would be no variance in the relationship statistics. Fortunately for the analysis, the relationship between party and vote varies considerably among different votes in the American Congress. The standard deviations of the measures could range from .00 to .50 (except for the opposition index with a maximum of 1.00 since it has been allowed to take on negative values); the actual values range from .19 to .30 (with the exception of the opposition index with a standard deviation of .38).

¹⁵ Variation on the first component is slight (all values around .65); its effect can be visualized in Figure 1 by curling the page so the bottom three points and the top point are slightly below the others.

¹⁶ Sharp differences among the models can also emerge with respect to unanimous and near-unanimous variables. If the dependent variable is strictly unanimous when independent variable categories are of equal sizes, null conditions a-c would be satisfied, while models 2d and 3d would be maximal. If there is a single dissenting vote with equal-sized independent variable categories, null condition a would be satisfied, models 1b and 1c would be near null, model 2c would be moderate, model 2d would be near maximal, and weak monotonicity (maximum condition 3) would be achieved. Thus models 3d, 3c, 2d, and to some extent 2c are more distinct from the other models in the presence of unanimous and near-unanimous variables.

The analysis in the text was replicated with unanimous and near-unanimous roll calls retained. The correlations between the different measures (upper right of Table 1) remain positive, except for some involving the cleavage model (2d and 3d) statistics, but the correlations involving model 3c and even more so models 2d and 3d fall. The first principal component is strong, but the model 3d and especially 2d coefficients are virtually independent of it. The model 3c measures and, to a lesser extent, b_xy (a model 2c statistic) are more distinct from the remaining measures in the component solution.

¹⁷ Measures based on different models should behave differently, at least to the extent to which their underlying models differ. This maxim specifically contradicts the view that coefficients should be stable when the variable categories are collapsed, since collapsing categories may validly alter the degree of fit with some models. Furthermore, statistics should not be judged according to whether they are sensitive to the Pearson's r correlation “underlying” the cross-tabulation because they may be purposefully based on relationship models different from that of Pearson's r. An interesting empirical analysis of the behavior of several measures is given by Rutherford, Brent M., “The Accuracy, Robustness and Relationships Among Correlational Models for Social Analysis: A Monte Carlo Simulation,” (paper delivered at the Annual Meeting of the American Political Science Association, Washington, D.C., September 5–9, 1972). A useful discussion of intuitive notions of a relationship and of the factors underlying differences among measures is provided by Hunter, A. A., “On the Validity of Measures of Association,” American Journal of Sociology, 79 (July, 1973), 99–109. However, neither author's interpretation of the results acknowledges sufficiently the meaningfulness of differences in the behavior of measures.

¹⁸ MacRae, Issues and Parties in Legislative Voting, chapters 3 and 5.

¹⁹ Sokal, Robert R. and Sneath, Peter H. A., Principles of Numerical Taxonomy (San Francisco: Freeman, 1963).

²⁰ Maximum value condition 3 was described as providing an adjustment for unequal marginals. That involves a contrast of independent variables and dependent variable marginals, whereas the sensitivity described here is a further sensitivity to changes in the independent variable marginals.

²¹ The sole exception involves the opposition index when low-variance votes are retained in the analysis. This measure differs from the others in employing negative values to assess varying degrees of party accord (its null value condition). Low-variance votes would be marked by strong party accord, so this index differs from other model 1a measures for such votes, which results in somewhat lower correlations.

Note that raising the measures (except the opposition index) to a positive power yields another statistic having the same maximum and null value conditions and therefore the same relationship model. Raising to a power provides a statistic whose intermediate values “look” different from those of the original measure, but they would covary perfectly according to Spearman's rho correlation of rank orders. The measures would differ meaningfully only in their magnitudes.
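A minimal sketch of the point about powers, using hypothetical coefficient values on the zero-to-one scale (assumed for illustration, not taken from the tables): squaring changes the magnitudes but leaves the rank order, and hence Spearman's rho, untouched.

```python
def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the values below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's rho from rank differences (valid when there are no ties)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical values of some measure over five votes.
measure = [0.35, 0.10, 0.95, 0.50, 0.80]
powered = [v ** 2 for v in measure]  # raising to a positive power

rho = spearman_rho(measure, powered)
print(f"Spearman rho = {rho}")
for v, p in zip(measure, powered):
    print(f"original = {v:.2f}  squared = {p:.4f}")
```

The same check would hold for any positive exponent, since a positive power is a strictly increasing function on the zero-to-one range.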

²² These results seem most liable to being dependent on the empirical data being analyzed. Greater variation in the independent variable marginals may have an effect, but the effects are unlikely to be pronounced unless the marginal split between parties is 80–20 or 90–10.

²³ The relationships of Chart 8 were originally detected by inspection of the party voting measures for 1953–54 and 1965. Proofs have subsequently been obtained for all of the inequalities shown, while the lack of a consistent ordering with the roll-call data is sufficient to demonstrate the absence of an order relationship.

²⁴ Considering only the moderate-valued measures, the restrictive model 1a statistics have the lowest means, the model 1b, 1c, and 2c coefficients have lower means than the model 3c and 2d measures, and the lenient model 3d statistic has the highest mean, an ordering of means consistent with both Charts 4 and 8. The differences in mean values in Table 2 are fairly sharp; the value of a coefficient is related to its underlying model. The means for most coefficients drop when unanimous and near-unanimous votes are included, except that the model 2d and 3d measures are higher in this instance (and the model 2c and 3c statistics fall by less than .03 with these data), since such votes are extremely predictable and cohesion is necessarily high.

²⁵ Separate analysis of the two Congresses would yield similar results. The differences in measure means between the two are minimal. The largest difference is .09 for the symmetric lambda; others are .04 or less. There is no trend for higher values in one Congress than the other, though the measures' variances are consistently higher in the 1953–54 Congress in which party sizes were virtually equal.

²⁶ In particular, the tau-b used to measure “status polarization” in Campbell, Angus, Converse, Philip E., Miller, Warren E., and Stokes, Donald E., The American Voter (New York: Wiley, 1960) is identical to phi since the variables are dichotomous. Rae and Taylor, The Analysis of Political Cleavages, also suggest the applicability of their cross-cutting measure for studying group voting. See also Korpi, Walter, “Some Problems in the Measurement of Class Voting,” American Journal of Sociology, 78 (November, 1972), 627–642.
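The identity between tau-b and phi for dichotomous variables can be verified by brute force. The sketch below expands a hypothetical 2x2 table (cell counts assumed for illustration) into individual observations, counts concordant, discordant, and tied pairs directly, and compares the resulting tau-b with phi computed from the cell counts.

```python
import math
from itertools import combinations

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table with rows (a, b) and (c, d)."""
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

def kendall_tau_b(observations):
    """Kendall's tau-b from an explicit count over all pairs of (x, y) observations."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(observations, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
        elif dx == 0 and dy != 0:
            tied_x_only += 1
        elif dy == 0 and dx != 0:
            tied_y_only += 1
        # Pairs tied on both variables do not enter the formula.
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + tied_x_only) * (untied + tied_y_only))

# Hypothetical cell counts, assumed for illustration.
a, b, c, d = 40, 10, 20, 30
observations = [(0, 0)] * a + [(0, 1)] * b + [(1, 0)] * c + [(1, 1)] * d

tau_b_value = kendall_tau_b(observations)
phi_value = phi(a, b, c, d)
print(f"tau-b = {tau_b_value:.6f}, phi = {phi_value:.6f}")
```

The agreement is not special to this table: with two categories per variable, the two factors under the square root reduce to the products of the column totals and of the row totals, which is the denominator of phi.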

²⁷ Converse, Philip E. and DuPeux, Georges, “Politicization of the Electorate in France and the United States,” Public Opinion Quarterly, 26 (Spring, 1962), 1–23.

²⁸ Blalock, Hubert M. Jr., “Causal Inferences, Closed Populations, and Measures of Association,” American Political Science Review, 61 (March, 1967), 130–136.

²⁹ See also Hernes's interpretation of several measures of relationship in the context of James Coleman's effect parameters for time-continuous Markov processes in Hernes, Gudmund, “A Markovian Approach to Measures of Association,” American Journal of Sociology, 75 (May, 1970), 992–1011.

³⁰ Actually tau-c attains its maximum value only under moderate monotonicity with a uniform distribution of cases on the variable with fewer categories.
