Using Amazon Mechanical Turk for linguistic research

Schnoebelen, Tyler; Kuperman, Victor


Psihologija 2010 Volume 43, Issue 4, Pages: 441-464
https://doi.org/10.2298/PSI1004441S


Schnoebelen Tyler (Stanford University, USA)
Kuperman Victor (McMaster University, Canada)

Amazon’s Mechanical Turk service makes linguistic experimentation quick, easy, and inexpensive. However, researchers have not been certain about its reliability. In a series of experiments, this paper compares data collected via Mechanical Turk to those obtained using more traditional methods. One set of experiments measured the predictability of words in sentences using the Cloze sentence completion task (Taylor, 1953). The correlation between traditional and Turk Cloze scores is high (rho=0.823), and both data sets perform similarly against alternative measures of contextual predictability. Five other experiments on the semantic relatedness of verbs and phrasal verbs (e.g., how much is “lift” part of “lift up”?) manipulate the presence of the sentence context and the composition of the experimental list. The results indicate that Turk data correlate well between experiments and with data from traditional methods (rho up to 0.9), and they show high inter-rater consistency and agreement. We conclude that Mechanical Turk is a reliable source of data for complex linguistic tasks in heavy use by psycholinguists. The paper provides suggestions for best practices in data collection and scrubbing.

Keywords: crowdsourcing, Amazon Mechanical Turk, web experiments, predictability, semantic similarity
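
The comparison reported in the abstract (Cloze scores from a traditional sample versus a Mechanical Turk sample, summarized with Spearman's rho) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the scoring rule (share of participants who supply the target word), the tie handling (average ranks), and the toy numbers are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def cloze_probability(responses, target):
    """Share of participants whose completion matches the target word.

    Assumption: matching is case-insensitive and ignores surrounding spaces.
    """
    counts = Counter(r.strip().lower() for r in responses)
    return counts[target.lower()] / len(responses)


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-item Cloze scores (made-up numbers, not data from the paper).
lab_scores = [0.90, 0.10, 0.55, 0.30]
turk_scores = [0.80, 0.20, 0.60, 0.25]
rho = spearman_rho(lab_scores, turk_scores)
```

A rank-based coefficient suits this comparison because it asks only whether the two samples order the items the same way, not whether the raw proportions match.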
