Abstract
The impact of offensive language on public and professional discourse highlights the need for effective mitigation measures. This work uses computational linguistic techniques to identify and treat such language in a novel way. When harmful content is found, a two-pronged mechanism is applied: the offending terms are either removed or passed through Natural Language Processing (NLP) to produce rephrased text that preserves the original meaning. Two freely accessible datasets are used for text classification. The technique is distinctive in that, during the rephrasing stage, we retrieve synonyms of the inappropriate words and choose the one that best fits the phrase as a replacement. The best classification accuracy achieved is about 95%. The method is comprehensive and aims to create a setting that encourages courteous and peaceful discussion while maintaining semantic integrity. By fully addressing inappropriate language, this research offers an approach to fostering meaningful relationships in both public and professional contexts.
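The two-pronged mechanism described in the abstract (remove an offending term, or replace it with a synonym) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word list `OFFENSIVE`, the table `SYNONYMS`, and the functions `pick_replacement` and `sanitize` are hypothetical names invented for illustration. A simple lexicon lookup stands in for the trained classifier, and taking the first listed synonym stands in for the paper's step of choosing the synonym that best fits the phrase.

```python
import re

# Hypothetical lexicon and synonym table, for illustration only.
# The paper instead uses a trained classifier and a synonym lookup.
OFFENSIVE = {"stupid", "idiot"}
SYNONYMS = {"stupid": ["unwise", "ill-considered"]}

WORD = re.compile(r"[A-Za-z']+")


def pick_replacement(word):
    """Return the first known synonym, or None if there is none.

    Assumption: a real system would rank candidates by how well
    they fit the surrounding phrase rather than take the first.
    """
    candidates = SYNONYMS.get(word.lower(), [])
    return candidates[0] if candidates else None


def sanitize(text):
    """Replace offensive words with a synonym, or remove them."""

    def swap(match):
        word = match.group(0)
        if word.lower() not in OFFENSIVE:
            return word  # leave harmless words untouched
        replacement = pick_replacement(word)
        # Prong 1: rephrase with a synonym; prong 2: remove the word.
        return replacement if replacement is not None else ""

    cleaned = WORD.sub(swap, text)
    # Tidy up the gaps left behind by removed words.
    cleaned = re.sub(r"\s+", " ", cleaned)
    cleaned = re.sub(r"\s+([.,!?;:])", r"\1", cleaned)
    return cleaned.strip()


if __name__ == "__main__":
    print(sanitize("You idiot, that plan is stupid."))
```

In this sketch, "idiot" has no entry in the synonym table and is removed, while "stupid" is replaced by "unwise". Text with no flagged words passes through unchanged.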
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jain, S., Tripathy, B.K. (2024). Inappropriate Text Detection and Rephrasing Using NLP. In: Patel, K.K., Santosh, K., Patel, A., Ghosh, A. (eds) Soft Computing and Its Engineering Applications. icSoftComp 2023. Communications in Computer and Information Science, vol 2030. Springer, Cham. https://doi.org/10.1007/978-3-031-53731-8_21
DOI: https://doi.org/10.1007/978-3-031-53731-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53730-1
Online ISBN: 978-3-031-53731-8
eBook Packages: Computer Science, Computer Science (R0)