Abstract
Crowdsourcing allows human intelligence tasks to be outsourced to a large number of unspecified people at low cost. However, because crowd workers vary in ability and diligence, the quality of their submitted work is also uneven and sometimes quite low. Quality control is therefore one of the central issues in crowdsourcing research. In this paper, we consider a quality control problem for POI (points of interest) collection tasks, in which workers are asked to enumerate the location information of POIs. Since workers do not necessarily provide correct answers, and even answers indicating the same place rarely match exactly, we propose a two-stage quality control method consisting of an answer clustering stage and a reliability estimation stage. We implement the method with a new constrained exemplar clustering algorithm and a modified HITS algorithm, and demonstrate its effectiveness against baseline methods on several real crowdsourcing datasets.
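The reliability estimation stage described above can be illustrated with a plain HITS-style mutual-reinforcement loop over a bipartite worker-cluster graph: a worker is reliable if they contribute to high-quality answer clusters, and a cluster is high quality if reliable workers contribute to it. The sketch below is an assumption-laden illustration of this general idea, not the paper's actual modified HITS algorithm (the function name `hits_reliability` and the toy data are invented for the example).

```python
from collections import defaultdict

def hits_reliability(contributions, n_iters=50):
    """Illustrative HITS-style estimation: `contributions` is a list of
    (worker, cluster) pairs, one per answer assigned to a cluster.
    Worker reliability plays the role of a hub score, cluster quality
    the role of an authority score."""
    workers = sorted({w for w, _ in contributions})
    clusters = sorted({c for _, c in contributions})
    by_worker = defaultdict(set)
    by_cluster = defaultdict(set)
    for w, c in contributions:
        by_worker[w].add(c)
        by_cluster[c].add(w)

    rel = {w: 1.0 for w in workers}    # worker reliability (hub score)
    for _ in range(n_iters):
        # Cluster quality: sum of reliabilities of its contributors.
        qual = {c: sum(rel[w] for w in by_cluster[c]) for c in clusters}
        norm = sum(v * v for v in qual.values()) ** 0.5
        qual = {c: v / norm for c, v in qual.items()}
        # Worker reliability: sum of qualities of clusters they contributed to.
        rel = {w: sum(qual[c] for c in by_worker[w]) for w in workers}
        norm = sum(v * v for v in rel.values()) ** 0.5
        rel = {w: v / norm for w, v in rel.items()}
    return rel, qual

# Toy usage: two workers agree on one POI cluster; a third stands alone.
pairs = [("alice", "poi1"), ("alice", "poi2"), ("bob", "poi1"), ("carol", "poi3")]
rel, qual = hits_reliability(pairs)
```

In this toy run, the cluster supported by two workers ends up with the highest quality score, and the worker contributing to the most well-supported clusters ends up with the highest reliability, which is the intuition behind aggregating POI answers this way.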
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Kajimura, S., Baba, Y., Kajino, H., Kashima, H. (2015). Quality Control for Crowdsourced POI Collection. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science(), vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_20
DOI: https://doi.org/10.1007/978-3-319-18032-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer Science (R0)