Abstract
Crowds of people can solve some problems faster than individuals or small groups. A crowd can also rapidly generate data about circumstances affecting the crowd itself. This crowdsourced data can be leveraged to benefit the crowd by providing information or solutions faster than traditional means. However, crowdsourced data can rarely be used directly to yield usable information. Intelligently analyzing and processing crowdsourced data can prepare it to maximize the usable information, thus returning the benefit to the crowd. This article highlights challenges and investigates opportunities associated with mining crowdsourced data to yield useful information, details how crowdsourced information and technologies can be used for response coordination when needed, and suggests related areas for future research.
Notes
Hashtag: a label for Tweets prefixed with the # character. See http://twitter.pbworks.com/Hashtags
Results from http://translate.google.com using “Detect Language” feature.
Results from http://cogcomp.cs.illinois.edu/demo/ner/results.php online named entity recognition demonstration.
Acknowledgements
The authors wish to acknowledge the members of the Arizona State University, Data Mining and Machine Learning laboratory for their motivating influence and thought-inspiring comments and questions with reference to this topic. This work, in particular, the content of Sect. 5, was inspired by and based on an ongoing project “ASU Coordination Tracker (ACT) for Disaster Relief”. This work was funded, in part, by the Office of Naval Research (ONR), the Air Force Office of Scientific Research (AFOSR), and the OSD-T&E (Office of Secretary Defense-Test and Evaluation), Defense-Wide/PE0601120D8Z National Defense Education Program (NDEP)/BA-1, Basic Research; SMART Program Office, www.asee.org/fellowships/smart, Grant Number N00244-09-1-0081. This work is approved for public release, case number 88ABW-2012-1644.
Cite this article
Barbier, G., Zafarani, R., Gao, H. et al. Maximizing benefits from crowdsourced data. Comput Math Organ Theory 18, 257–279 (2012). https://doi.org/10.1007/s10588-012-9121-2
DOI: https://doi.org/10.1007/s10588-012-9121-2