Abstract
This work presents a Natural Language Processing study that aims to recognise personality traits in written Portuguese text. To this end, we first built a corpus of Facebook status updates labelled with the personality traits of their authors, and then used it to train a number of computational models of personality recognition. The models range from a standard approach relying on lexical knowledge from the LIWC dictionary and other resources to purely text-based methods such as bag of words and word embeddings, among others. Results suggest that word embedding models slightly outperform the alternatives under consideration, with the advantage of not requiring any language-specific lexical resources.
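To make the contrast between the text-based representations concrete, the following is a minimal sketch of how a status update could be turned into bag-of-words counts or into an averaged word embedding. It is an illustration under stated assumptions, not the authors' implementation: the function names, the whitespace tokenizer, the three-word vocabulary and the two-dimensional embedding values are all hypothetical, and real embeddings would be trained on a large corpus.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(text, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def mean_embedding(text, embeddings, dim):
    """Average the vectors of the known words; all zeros if none are known."""
    vectors = [embeddings[t] for t in tokenize(text) if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical toy data, invented for illustration only.
vocabulary = ["festa", "amigos", "casa"]
embeddings = {
    "festa": [1.0, 0.0],
    "amigos": [0.0, 1.0],
    "casa": [0.5, 0.5],
}
status = "Festa com amigos e mais amigos"

bow_features = bag_of_words(status, vocabulary)
emb_features = mean_embedding(status, embeddings, dim=2)
```

Either feature vector could then be passed to a standard classifier, one per personality trait. Neither representation needs a language-specific lexicon such as LIWC, which is the practical advantage the abstract points to.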
Acknowledgements
The second author received support from grant # 2016/14223-0, São Paulo Research Foundation (FAPESP).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
da Silva, B.B.C., Paraboni, I. (2018). Personality Recognition from Facebook Text. In: Villavicencio, A., et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science, vol 11122. Springer, Cham. https://doi.org/10.1007/978-3-319-99722-3_11
DOI: https://doi.org/10.1007/978-3-319-99722-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99721-6
Online ISBN: 978-3-319-99722-3
eBook Packages: Computer Science, Computer Science (R0)