Abstract
Whenever two or more survey statistics are compared, the question arises whether this comparison is warranted. Warranted usually means that there is no methodological artifact that could possibly explain any differences: I term this the “strong” interpretation of comparability. The “weak” interpretation of comparability is then that artifacts might exist, but evidence shows that they are not strong enough to explain away a particular substantive finding. In this chapter I discuss some methods to prevent, detect, and correct for incomparability. Translation issues and coding of design characteristics of questions in different countries are particularly relevant to cross-cultural studies. Strong and weak comparability, and the methods associated with them, are discussed for different aspects of total survey error (TSE). On the “measurement side” of TSE, invariance testing, differential item functioning, and anchoring vignettes are well-known techniques. On the “representation side,” I discuss the use of the R-indicator to provide evidence that the comparison of survey statistics is warranted.
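On the representation side, the R-indicator mentioned above summarizes how much estimated response propensities vary across sample units: R = 1 − 2·S(ρ̂), where S is the standard deviation of the propensities (Schouten, Cobben, & Bethlehem, 2009). A minimal sketch, using made-up propensities in place of values that would normally come from a logistic or probit response model fitted on sampling-frame variables:

```python
import numpy as np

def r_indicator(propensities):
    """R-indicator: R = 1 - 2 * S(rho), where S is the standard
    deviation of the estimated response propensities rho.
    R = 1 means every unit is equally likely to respond (fully
    representative response); lower values signal selective
    nonresponse."""
    rho = np.asarray(propensities, dtype=float)
    return 1.0 - 2.0 * rho.std(ddof=1)

# Hypothetical propensities for two surveys with the same mean
# response rate (0.60) but very different selectivity.
rho_a = np.array([0.60, 0.58, 0.62, 0.61, 0.59])  # nearly uniform
rho_b = np.array([0.20, 0.90, 0.35, 0.95, 0.60])  # highly selective
print(r_indicator(rho_a))  # close to 1: little evidence of selectivity
print(r_indicator(rho_b))  # much lower: comparison would be suspect
```

A value near 1 indicates response that is close to representative with respect to the modeled variables; comparing statistics between surveys with very different R-indicators should be done with caution.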
Notes
1.
2. For a complete description of the study design and the original questionnaires, see http://ess.nsd.uib.no/.
3. There might, of course, still be difficulty in translating into other languages. Source questions formulated in English, which is often claimed to have more words than any other natural language, would appear to be particularly prone to this type of problem.
4. A test of whether the factor model holds in each country is not possible in this case: with only three indicators, the model without equality constraints has zero degrees of freedom.
5. A two-parameter normal ogive model was estimated. Expected values were calculated by multiplying the country-specific item characteristic curves by the scores 0–10 and summing over categories.
6. The analysis and correction using probit models were done in Mplus 5.2.
7. The principle of anchoring vignettes is identical to that of response function analysis in classical psychophysics.
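The expected-value computation described in note 5 can be sketched as follows, assuming a graded normal ogive model in which P(Y ≥ k | θ) = Φ(a(θ − b_k)); the discrimination a and thresholds b below are illustrative, not the chapter's estimates. Summing k·P(Y = k) over the categories 0–10 is equivalent to summing the cumulative curves P(Y ≥ k):

```python
import numpy as np
from math import erf

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / 2.0 ** 0.5))

def expected_score(theta, a, b):
    """Expected 0-10 answer under a graded two-parameter normal ogive
    model with P(Y >= k | theta) = phi(a * (theta - b[k-1])).
    E[Y | theta] = sum_k k * P(Y = k) = sum_{k=1}^{10} P(Y >= k)."""
    return sum(phi(a * (theta - bk)) for bk in b)

# Illustrative (hypothetical) parameters: common thresholds, but a
# country-specific discrimination parameter.
b = np.linspace(-2.0, 2.0, 10)   # thresholds for categories 1..10
for a in (1.5, 0.8):
    print(f"a = {a}: E[Y | theta=0] = {expected_score(0.0, a, b):.2f}, "
          f"E[Y | theta=1] = {expected_score(1.0, a, b):.2f}")
```

With common thresholds but country-specific discriminations, the two hypothetical countries agree at θ = 0 yet yield different expected scores elsewhere, illustrating how country-specific item characteristic curves can produce different observed means for the same latent value.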
© 2012 Springer Science+Business Media New York
Cite this chapter
Oberski, D. L. (2012). Comparability of survey measurements. In L. Gideon (Ed.), Handbook of survey methodology for the social sciences. New York, NY: Springer. https://doi.org/10.1007/978-1-4614-3876-2_27
Print ISBN: 978-1-4614-3875-5
Online ISBN: 978-1-4614-3876-2