DOI: 10.1145/3543507.3583356 · Research article · ACM Web Conference Proceedings

HateProof: Are Hateful Meme Detection Systems really Robust?

Published: 30 April 2023

ABSTRACT

The exploitation of social media to spread hate has increased tremendously over the years. Lately, multi-modal hateful content such as memes has gained considerably more traction than uni-modal content. Moreover, their implicit payloads make them challenging for existing hateful meme detection systems to detect. In this paper, we present a case study analyzing such systems’ vulnerabilities to external adversarial attacks. We find that even very simple perturbations in uni-modal and multi-modal settings, performed by humans with little knowledge of the model, can leave existing detection models highly vulnerable. Empirically, we observe performance drops of up to 10% in the macro-F1 score for certain attacks. As a remedy, we attempt to boost model robustness using contrastive learning as well as an adversarial-training-based method, VILLA. Using an ensemble of these two approaches, we are able to regain the lost performance to a large extent for certain attacks on two of our high-resolution datasets. We believe ours is a first step toward addressing this crucial problem in an adversarial setting and will inspire more such investigations in the future.
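The abstract does not enumerate the specific attacks studied. As a rough, hedged illustration of what "very simple perturbations" in the uni-modal settings can look like, the sketch below applies salt-and-pepper noise to a meme image and random adjacent-character swaps to its overlaid text. The file path, noise fraction, and swap count are hypothetical placeholders, not values or procedures taken from the paper.

```python
# Illustrative sketch only: simple human-feasible perturbations of the kind the
# abstract alludes to (image noise, text typos). Not the paper's attack suite.
import random

import numpy as np
from PIL import Image


def add_salt_pepper_noise(image: Image.Image, fraction: float = 0.05) -> Image.Image:
    """Flip a small fraction of pixels to black or white (uni-modal image perturbation)."""
    arr = np.array(image.convert("RGB"))
    h, w, _ = arr.shape
    n = int(fraction * h * w)
    ys = np.random.randint(0, h, n)
    xs = np.random.randint(0, w, n)
    arr[ys[: n // 2], xs[: n // 2]] = 0    # pepper
    arr[ys[n // 2:], xs[n // 2:]] = 255    # salt
    return Image.fromarray(arr)


def add_typos(text: str, n_swaps: int = 2) -> str:
    """Swap a few adjacent characters (uni-modal text perturbation a human could do by hand)."""
    chars = list(text)
    for _ in range(n_swaps):
        if len(chars) < 2:
            break
        i = random.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


if __name__ == "__main__":
    # "meme.png" is a placeholder path for any meme image.
    add_salt_pepper_noise(Image.open("meme.png")).save("meme_noisy.png")
    print(add_typos("this is just a harmless joke"))
```

A detector that is robust in the sense the paper targets should score the perturbed meme and the original (nearly) identically; large score shifts under such low-effort edits are the vulnerability the study measures.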


Published in

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023, 4293 pages
ISBN: 9781450394161
DOI: 10.1145/3543507
Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions, 23%