ABSTRACT
Here we leverage the power of the crowd: online users who are willing to answer questions about dish availability at restaurants they have visited. While motivated users are happy to contribute knowledge, they are much less likely to respond to "silly" or embarrassing questions (e.g., "Does Pizza Hut serve pizza?" or "Does Mike's Vegan Restaurant serve steak?").
In this paper, we study the problem of Vexation-Aware Active Learning (VAAL), where judiciously selected questions are targeted towards improving restaurant-dish model prediction, subject to a limit on the percentage of "unsure" answers or "dismissals" (e.g., swiping the app closed) measuring user vexation. We formalize the selection problem as an integer program and solve it efficiently using a distributed solution that scales linearly with the number of candidate questions. Since our algorithm relies on an accurate estimation of the unsure-dismiss rate (UDR), we present a regression model that provides high-quality results compared to baselines including collaborative filtering. Finally, we demonstrate in a live system that our proposed VAAL strategy performs competitively against classical (margin-based) active learning approaches while reducing the UDR for the questions being asked.
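The selection problem described above can be illustrated with a small sketch. The abstract does not give the integer program or the distributed solver, so everything below is an assumption made for illustration: each candidate question has an informativeness score and a predicted UDR, exactly `budget` questions are chosen, and the mean predicted UDR of the chosen set must not exceed `max_udr`. The sketch swaps in a simple Lagrangian heuristic (rank by informativeness minus a penalty on predicted UDR, then bisect on the penalty weight); it is not the paper's algorithm, and the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def select_questions(
    informativeness: Sequence[float],
    predicted_udr: Sequence[float],
    budget: int,
    max_udr: float,
    iterations: int = 50,
) -> List[int]:
    """Return indices of `budget` questions whose mean predicted UDR is <= max_udr.

    Heuristic only (assumed formulation, not the paper's solver): questions are
    ranked by informativeness - lam * predicted_udr, and lam >= 0 is bisected
    to find a small penalty whose top-`budget` set satisfies the UDR limit.
    """
    n = len(informativeness)
    if len(predicted_udr) != n:
        raise ValueError("score lists must have the same length")
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of questions")

    def top_k(lam: float) -> List[int]:
        # Penalized score: informative questions win unless they are vexing.
        ranked = sorted(
            range(n),
            key=lambda i: informativeness[i] - lam * predicted_udr[i],
            reverse=True,
        )
        return ranked[:budget]

    def mean_udr(chosen: List[int]) -> float:
        return sum(predicted_udr[i] for i in chosen) / len(chosen)

    # With no penalty this is plain "most informative first" selection.
    best = top_k(0.0)
    if mean_udr(best) <= max_udr:
        return sorted(best)

    # The lowest-UDR questions tell us whether any feasible set exists.
    safest = sorted(range(n), key=lambda i: predicted_udr[i])[:budget]
    if mean_udr(safest) > max_udr:
        raise ValueError("no selection of this size satisfies the UDR limit")

    # Grow the penalty until the selection becomes feasible.
    lo, hi = 0.0, 1.0
    best = top_k(hi)
    while mean_udr(best) > max_udr:
        lo, hi = hi, hi * 2.0
        if hi > 1e12:  # numerical safeguard: fall back to the safest set
            return sorted(safest)
        best = top_k(hi)

    # Bisect towards the smallest feasible penalty, keeping a feasible set.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        candidate = top_k(mid)
        if mean_udr(candidate) <= max_udr:
            hi, best = mid, candidate
        else:
            lo = mid
    return sorted(best)


if __name__ == "__main__":
    # Toy data (made up): questions 0 and 1 are informative but likely to vex.
    scores = [0.9, 0.8, 0.7, 0.3, 0.2]
    udr = [0.8, 0.6, 0.1, 0.05, 0.1]
    print(select_questions(scores, udr, budget=2, max_udr=0.4))
```

Each evaluation of the penalized ranking depends only on per-question scores, which is one reason a penalty-based relaxation is a natural fit for large candidate sets; whether the paper's distributed solution works this way is not stated in the abstract.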
Index Terms
- Vexation-Aware Active Learning for On-Menu Restaurant Dish Availability