DOI: 10.1145/3308558.3313498
research-article

Listening between the Lines: Learning Personal Attributes from Conversations

Published: 13 May 2019

ABSTRACT

Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
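
To make the idea of ranking object values for one predicate concrete, here is a toy sketch. It is not the paper's Hidden Attribute Models: the 2-d word and object vectors are hand-picked, the attention query is fixed rather than learned, and nothing is trained. All names (`WORD_VECS`, `OBJECT_VECS`, `attend`, `rank_objects`) are hypothetical and exist only for this illustration.

```python
import math

# Hypothetical 2-d "embeddings", hand-picked for illustration only.
# Dimension 0 loosely means "medical", dimension 1 loosely means "legal".
WORD_VECS = {
    "patients": [1.0, 0.0],
    "emergency": [0.9, 0.1],
    "room": [0.3, 0.3],
    "court": [0.0, 1.0],
    "the": [0.0, 0.0],
    "in": [0.0, 0.0],
}

# Candidate object values for a single predicate (here: profession).
OBJECT_VECS = {
    "doctor": [1.0, 0.0],
    "nurse": [0.9, 0.0],
    "lawyer": [0.0, 1.0],
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens, query):
    """Attention-weighted average of the vectors of known tokens."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    if not vecs:
        return [0.0] * len(query)
    weights = softmax([dot(v, query) for v in vecs])
    return [sum(w * v[i] for w, v in zip(weights, vecs))
            for i in range(len(query))]


def rank_objects(tokens, query):
    """Rank object values by dot product with the attended representation."""
    rep = attend(tokens, query)
    scored = [(dot(rep, vec), name) for name, vec in OBJECT_VECS.items()]
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    utterance = "the patients in the emergency room".split()
    print(rank_objects(utterance, [1.0, 1.0]))
```

With these toy vectors, an utterance about patients and an emergency room ranks `doctor` and `nurse` above `lawyer`, mirroring the example in the abstract. A learned model would replace the fixed query and the hand-set vectors with trained parameters.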

  • Published in

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
