Psihologija 2010 Volume 43, Issue 4, Pages: 441-464
https://doi.org/10.2298/PSI1004441S
Using Amazon Mechanical Turk for linguistic research
Schnoebelen Tyler (Stanford University, USA)
Kuperman Victor (McMaster University, Canada)
Amazon’s Mechanical Turk service makes linguistic experimentation quick,
easy, and inexpensive. However, researchers have not been certain about its
reliability. In a series of experiments, this paper compares data collected
via Mechanical Turk to those obtained using more traditional methods. One set
of experiments measured the predictability of words in sentences using the
Cloze sentence completion task (Taylor, 1953). The correlation between
traditional and Turk Cloze scores is high (rho=0.823) and both data sets
perform similarly against alternative measures of contextual predictability.
Five other experiments on the semantic relatedness of verbs and phrasal
verbs (how much is “lift” part of “lift up”?) manipulate the presence of the
sentence context and the composition of the experimental list. The results
indicate that Turk data correlate well between experiments and with data
from traditional methods (rho up to 0.9), and they show high inter-rater
consistency and agreement. We conclude that Mechanical Turk is a reliable
source of data for complex linguistic tasks in heavy use by psycholinguists.
The paper provides suggestions for best practices in data collection and
scrubbing.
Keywords: crowdsourcing, Amazon Mechanical Turk, web experiments, predictability, semantic similarity
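
As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a Cloze probability (the share of participants who complete a sentence with a given word) and a Spearman rank correlation between two sets of scores. The function names and any inputs passed to them are hypothetical and chosen for illustration only; they are not taken from the paper, whose actual analysis pipeline is not described here.

```python
from collections import Counter
from math import sqrt


def cloze_probability(responses, target):
    """Share of responses matching the target word (case-insensitive)."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(r.strip().lower() for r in responses)
    return counts[target.strip().lower()] / len(responses)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (sx * sy)
```

In use, one would compute `cloze_probability` per target word separately for the laboratory and the Mechanical Turk responses, then pass the two resulting score lists (aligned by word) to `spearman_rho`.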