research-article

Listening between the Lines: Learning Personal Attributes from Conversations

Authors:
Anna Tigunova, Max Planck Institute for Informatics, Germany
Andrew Yates, Max Planck Institute for Informatics, Germany
Paramita Mirza, Max Planck Institute for Informatics, Germany
Gerhard Weikum, Max Planck Institute for Informatics, Germany

WWW '19: The World Wide Web Conference, May 2019, Pages 1818–1828
https://doi.org/10.1145/3308558.3313498

Published: 13 May 2019

ABSTRACT

Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work, we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
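
To make the ranking idea in the abstract concrete, here is a minimal sketch of attention-weighted pooling over term embeddings followed by scoring of candidate object values for one predicate. It is not the paper's Hidden Attribute Models: the embedding values, the attention query vector, the candidate list and all names (`EMBEDDINGS`, `CANDIDATES`, `ATTENTION_QUERY`, `rank_values`) are assumptions invented for illustration, and a real model would learn these parameters from data.

```python
import math

# Toy 3-dimensional term embeddings; values are invented for illustration only.
EMBEDDINGS = {
    "patients":  [0.9, 0.1, 0.0],
    "emergency": [0.8, 0.2, 0.1],
    "the":       [0.0, 0.0, 0.1],
    "code":      [0.1, 0.9, 0.0],
}

# Hypothetical candidate object values for the predicate "profession".
CANDIDATES = {
    "doctor":   [1.0, 0.0, 0.0],
    "nurse":    [0.9, 0.1, 0.0],
    "engineer": [0.0, 1.0, 0.0],
}

# Hypothetical attention query; in a trained model this would be learned.
ATTENTION_QUERY = [1.0, 1.0, 0.0]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_values(terms):
    """Rank candidate object values for one speaker's terms.

    Terms without an embedding are skipped. Returns a list of
    (value, score) pairs sorted by descending score.
    """
    vectors = [EMBEDDINGS[t] for t in terms if t in EMBEDDINGS]
    if not vectors:
        return []
    # Attention: weight each term by its similarity to the query vector.
    weights = softmax([dot(v, ATTENTION_QUERY) for v in vectors])
    dim = len(vectors[0])
    # Speaker representation: attention-weighted sum of term embeddings.
    speaker = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    # Score each candidate value against the speaker representation.
    scored = [(value, dot(speaker, emb)) for value, emb in CANDIDATES.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for value, score in rank_values(["the", "patients", "emergency"]):
        print(f"{value}: {score:.3f}")
```

Keeping one set of candidates and parameters per predicate mirrors the per-predicate training described in the abstract; under this sketch, a second predicate such as age would get its own candidate table and query vector.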

Published in
WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)
Copyright © 2019 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags: information extraction, attention mechanisms, conversational text, neural networks, personal knowledge, user profiling
Qualifiers: research-article, Research, Refereed limited
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%