Abstract
Crowdsourcing allows human intelligence tasks to be outsourced to a large number of unspecified people at low cost. However, because crowd workers vary in ability and diligence, the quality of their submitted work is also uneven and sometimes quite low. Quality control is therefore one of the central issues in crowdsourcing research. In this paper, we consider a quality control problem for POI (points of interest) collection tasks, in which workers are asked to enumerate the location information of POIs. Since workers do not necessarily provide correct answers, and even answers indicating the same place rarely match exactly, we propose a two-stage quality control method consisting of an answer clustering stage and a reliability estimation stage. We implement the method with a new constrained exemplar clustering algorithm and a modified HITS algorithm, and demonstrate its effectiveness against baseline methods on several real crowdsourcing datasets.
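The reliability estimation stage described above can be illustrated with a plain HITS-style mutual-reinforcement loop over a bipartite worker-cluster graph: a worker is reliable if they contribute to high-quality answer clusters, and a cluster is high quality if reliable workers contribute to it. The sketch below is an assumption-laden illustration of this general idea, not the paper's actual modified HITS algorithm (the function name `hits_reliability` and the toy data are invented for the example).

```python
from collections import defaultdict

def hits_reliability(contributions, n_iters=50):
    """Illustrative HITS-style estimation: `contributions` is a list of
    (worker, cluster) pairs, one per answer assigned to a cluster.
    Worker reliability plays the role of a hub score, cluster quality
    the role of an authority score."""
    workers = sorted({w for w, _ in contributions})
    clusters = sorted({c for _, c in contributions})
    by_worker = defaultdict(set)
    by_cluster = defaultdict(set)
    for w, c in contributions:
        by_worker[w].add(c)
        by_cluster[c].add(w)

    rel = {w: 1.0 for w in workers}    # worker reliability (hub score)
    for _ in range(n_iters):
        # Cluster quality: sum of reliabilities of its contributors.
        qual = {c: sum(rel[w] for w in by_cluster[c]) for c in clusters}
        norm = sum(v * v for v in qual.values()) ** 0.5
        qual = {c: v / norm for c, v in qual.items()}
        # Worker reliability: sum of qualities of clusters they contributed to.
        rel = {w: sum(qual[c] for c in by_worker[w]) for w in workers}
        norm = sum(v * v for v in rel.values()) ** 0.5
        rel = {w: v / norm for w, v in rel.items()}
    return rel, qual

# Toy usage: two workers agree on one POI cluster; a third stands alone.
pairs = [("alice", "poi1"), ("alice", "poi2"), ("bob", "poi1"), ("carol", "poi3")]
rel, qual = hits_reliability(pairs)
```

In this toy run, the cluster supported by two workers ends up with the highest quality score, and the worker contributing to the most well-supported clusters ends up with the highest reliability, which is the intuition behind aggregating POI answers this way.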
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Kajimura, S., Baba, Y., Kajino, H., Kashima, H. (2015). Quality Control for Crowdsourced POI Collection. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science(), vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_20
DOI: https://doi.org/10.1007/978-3-319-18032-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer Science (R0)