Inappropriate Text Detection and Rephrasing Using NLP

  • Conference paper
Soft Computing and Its Engineering Applications (icSoftComp 2023)

Abstract

Offensive language harms public and professional discourse, which calls for effective mitigation measures. This work applies computational linguistic techniques to detect and handle such language. When inappropriate content is found, a two-pronged mechanism is applied: the offending terms are either removed or passed through Natural Language Processing (NLP) to produce rephrased text that preserves the original meaning. Two freely accessible datasets are used for text classification. The technique is distinctive in its rephrasing stage: we look up synonyms of the inappropriate words and choose the one that best fits the phrase as a replacement. The best classification accuracy we achieved is about 95%. The method aims to encourage courteous and peaceful discussion while maintaining semantic integrity. By addressing inappropriate language comprehensively, this research offers an approach to fostering meaningful relationships in both public and professional contexts.
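
The two-pronged mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word list `OFFENSIVE`, the synonym table `SYNONYMS`, and the function `rephrase` are hypothetical stand-ins, and the rule "take the first synonym that is not itself flagged" is an assumption made here in place of the paper's unspecified fit criterion.

```python
import re

# Hypothetical toy resources (assumptions for illustration only); the paper's
# actual lexicon and synonym source are not described in the abstract.
OFFENSIVE = {"stupid", "idiot", "crap"}
SYNONYMS = {
    "stupid": ["idiot", "unwise"],
    "crap": ["nonsense"],
}


def rephrase(text: str) -> str:
    """Replace each flagged word with an acceptable synonym, or drop it."""
    kept = []
    # Split into word tokens and single punctuation marks.
    for token in re.findall(r"\w+|[^\w\s]", text):
        word = token.lower()
        if word not in OFFENSIVE:
            kept.append(token)
            continue
        # Prong 1: substitute the first synonym that is not itself flagged
        # (an assumed selection rule, not the paper's).
        candidates = [s for s in SYNONYMS.get(word, []) if s not in OFFENSIVE]
        if candidates:
            kept.append(candidates[0])
        # Prong 2: with no acceptable synonym, the word is simply removed.
    sentence = " ".join(kept)
    # Reattach punctuation to the preceding word.
    return re.sub(r"\s+([^\w\s])", r"\1", sentence)


print(rephrase("That idea is stupid, you idiot!"))
```

In this sketch "stupid" is replaced because an unflagged synonym exists, while "idiot" has no entry in the synonym table and is removed. A fuller system would likely rank candidates by how well they fit the surrounding context.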

Corresponding author

Correspondence to B. K. Tripathy.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jain, S., Tripathy, B.K. (2024). Inappropriate Text Detection and Rephrasing Using NLP. In: Patel, K.K., Santosh, K., Patel, A., Ghosh, A. (eds) Soft Computing and Its Engineering Applications. icSoftComp 2023. Communications in Computer and Information Science, vol 2030. Springer, Cham. https://doi.org/10.1007/978-3-031-53731-8_21

  • DOI: https://doi.org/10.1007/978-3-031-53731-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53730-1

  • Online ISBN: 978-3-031-53731-8

  • eBook Packages: Computer Science, Computer Science (R0)
