ABSTRACT
The constant growth of machine-generated mail, which today accounts for more than 90% of non-spam mail traffic, is a major contributor to information overload in email, where users become overwhelmed by a flood of messages from commercial entities. A large part of this traffic is junk mail that the user would prefer not to receive. Surprisingly, nearly 95% of this traffic is in fact solicited by the users themselves in the form of subscriptions to mailing services. These subscriptions are often unintentional. Although commercial laws require such services to offer an unsubscription option, users rarely use it. We perform a large-scale study of unsubscribable traffic, namely, messages that provide users with an unsubscription option. We examine user behavior over such traffic in the Yahoo Web mail service, and demonstrate a significant gap between users' low interest in this traffic and their lack of active effort to reduce its load. We conjecture that the cause of this gap is the lack of an efficient and easily accessible mechanism that would help users unsubscribe. We validate our conjecture with a large-scale online experiment, in which we provide users with a novel mail feature for managing unsubscribable traffic, based on personalized recommendations. The experiment demonstrates the pressing need for such a mechanism.
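As an illustration of what "unsubscribable" could mean operationally, the sketch below flags a message as unsubscribable when it carries a non-empty `List-Unsubscribe` header, as defined in RFC 2369. This is an assumption made for illustration only: the abstract does not say how the study identifies such messages, and the function name `is_unsubscribable` and the sample messages are hypothetical.

```python
from email import message_from_string
from email.message import Message


def is_unsubscribable(msg: Message) -> bool:
    """Return True if the message has a non-empty List-Unsubscribe header.

    Assumption: the RFC 2369 header is used as the only signal. Real
    traffic may also expose unsubscription links only in the body.
    """
    value = msg.get("List-Unsubscribe")  # header lookup is case-insensitive
    return value is not None and value.strip() != ""


# Hypothetical sample messages (addresses use reserved example domains).
raw_newsletter = (
    "From: deals@shop.example\n"
    "To: user@mail.example\n"
    "Subject: Weekly offers\n"
    "List-Unsubscribe: <mailto:unsubscribe@shop.example>\n"
    "\n"
    "This week's offers...\n"
)
raw_personal = (
    "From: friend@mail.example\n"
    "To: user@mail.example\n"
    "Subject: Lunch?\n"
    "\n"
    "Are you free on Friday?\n"
)

print(is_unsubscribable(message_from_string(raw_newsletter)))
print(is_unsubscribable(message_from_string(raw_personal)))
```

A header-only check like this is cheap enough to run over large mail logs, but it would miss messages whose unsubscription option appears only as a link in the body.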
Index Terms
- Unsubscription: A Simple Way to Ease Overload in Email