Graph-Based Features for Automatic Online Abuse Detection

  • Conference paper
Statistical Language and Speech Processing (SLSP 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10583)

Abstract

While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually. Automating this task is an important step in reducing the financial cost associated with moderation, but the majority of automated approaches strictly based on message content are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks based on raw multi-participant chat logs, and we study the contribution of graph features to a classification system that aims to determine if a given message is abusive. The conversational graph-based system yields unexpectedly high performance, with results comparable to those previously obtained with a content-based approach.

Notes

  1. https://play.spaceorigin.fr/

Acknowledgments

This work was financed by a grant from the Provence-Alpes-Côte d’Azur region (France) and the Nectar de Code company.

Author information

Corresponding author

Correspondence to Etienne Papegnies.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Papegnies, E., Labatut, V., Dufour, R., Linarès, G. (2017). Graph-Based Features for Automatic Online Abuse Detection. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_6

  • DOI: https://doi.org/10.1007/978-3-319-68456-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68455-0

  • Online ISBN: 978-3-319-68456-7

  • eBook Packages: Computer Science, Computer Science (R0)
