HCPG: a highlighted contrastive learning framework for exemplar-guided paraphrase generation

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Exemplar-guided Paraphrase Generation aims to use an exemplar sentence to guide the generation of a paraphrase that retains the semantic content of the source sentence while following the syntax of the exemplar. Some methods guide generation with syntactic structures extracted from exemplars, but this preprocessing may cause information loss. Other methods directly use the natural exemplar sentences (NES) as syntactic guidance, which avoids information loss but fails to capture and integrate the exemplar’s syntax and the source sentence’s semantics effectively. In this paper, we propose a Highlighted Contrastive learning framework for exemplar-guided Paraphrase Generation (HCPG), which addresses the shortcomings of using NES as syntactic guidance. The “highlight” refers to a continuous process of supplementing and refining that effectively captures both the semantic and syntactic information of the sentences. HCPG also includes a contrastive loss layer that helps the decoder fully integrate the highlighted semantic and syntactic information when generating the final paraphrases. Experiments on ParaNMT and QQP-Pos show that HCPG is comparable to several state-of-the-art models, including SAGP and GCPG, and achieves an average 3.19% improvement over CLPG.
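
The abstract describes a contrastive loss layer that pulls the highlighted semantic and syntactic representations together for the decoder. The full objective is not given in this preview, so the sketch below is only a generic InfoNCE-style contrastive loss with in-batch negatives; the `info_nce_loss` name, the (batch, dim) layout, and the temperature value are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(anchors: torch.Tensor,
                  positives: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE contrastive loss with in-batch negatives.

    anchors, positives: (batch, dim) sentence representations; row i of
    `positives` is the positive for row i of `anchors`, and every other
    row in the batch serves as a negative.
    """
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    logits = anchors @ positives.t() / temperature   # (batch, batch) cosine similarities
    labels = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, labels)           # diagonal entries are the positives
```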

Data Availability Statement

The data that support the findings of this study are openly available in HCPG-Dataset repository.

References

  1. Min J, McCoy RT, Das D, Pitler E, Linzen T (2020) Syntactic data augmentation increases robustness to inference heuristics. arXiv preprint arXiv:2004.11999

  2. Thompson B, Post M (2020) Automatic machine translation evaluation in many languages via zero-shot paraphrasing. arXiv preprint arXiv:2004.14564

  3. Gao S, Zhang Y, Ou Z, Yu Z (2020) Paraphrase augmented task-oriented dialog generation. arXiv preprint arXiv:2004.07462

  4. Lyu Y, Liang PP, Pham H, Hovy E, Póczos B, Salakhutdinov R, Morency L-P (2021) Styleptb: A compositional benchmark for fine-grained controllable text style transfer. arXiv preprint arXiv:2104.05196

  5. Zhang Y, Ge T, Sun X (2020) Parallel data augmentation for formality style transfer. arXiv preprint arXiv:2005.07522

  6. Shen T, Lei T, Barzilay R, Jaakkola T (2017) Style transfer from non-parallel text by cross-alignment. Adv Neural Inform Process Syst 30

  7. Kumar A, Ahuja K, Vadapalli R, Talukdar P (2020) Syntax-guided controlled generation of paraphrases. Trans Assoc Comput Linguis 8:330–345

  8. Yang E, Liu M, Xiong D, Zhang Y, Meng Y, Xu J, Chen Y (2022) Improving generation diversity via syntax-controlled paraphrasing. Neurocomputing 485:103–113

  9. Yang E, Bai C, Xiong D, Zhang Y, Meng Y, Xu J, Chen Y (2022) Learning structural information for syntax-controlled paraphrase generation. In: Findings of the association for computational linguistics: NAACL 2022, pp 2079–2090

  10. Yang K, Liu D, Lei W, Yang B, Zhang H, Zhao X, Yao W, Chen B (2022) Gcpg: A general framework for controllable paraphrase generation. In: Findings of the association for computational linguistics: ACL 2022, pp 4035–4047

  11. Manning CD, Surdeanu M, Bauer J, Finkel JR, Bethard S, McClosky D (2014) The stanford corenlp natural language processing toolkit. In: Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations, pp 55–60

  12. Chen M, Tang Q, Wiseman S, Gimpel K (2019) Controllable paraphrase generation with a syntactic exemplar. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 5972–5984

  13. Yang H, Lam W, Li P (2021) Contrastive representation learning for exemplar-guided paraphrase generation. arXiv preprint arXiv:2109.01484

  14. Jaiswal A, Babu AR, Zadeh MZ, Banerjee D, Makedon F (2020) A survey on contrastive self-supervised learning. Technologies 9(1):2

  15. Li Y, Feng R, Rehg I, Zhang C (2020) Transformer-based neural text generation with syntactic guidance. arXiv preprint arXiv:2010.01737

  16. Wieting J, Gimpel K (2018) Paranmt-50m: Pushing the limits of paraphrastic sentence embeddings with millions of machine translations. In: Proceedings of the 56th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 451–462

  17. Wu X, Gao C, Zang L, Han J, Wang Z, Hu S (2021) Smoothed contrastive learning for unsupervised sentence embedding. arXiv preprint arXiv:2109.04321

  18. Gupta A, Zhang Z (2018) To attend or not to attend: a case study on syntactic structures for semantic relatedness. In: Proceedings of the 56th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 2116–2125

  19. Vashishth S, Bhandari M, Yadav P, Rai P, Bhattacharyya C, Talukdar P (2019) Incorporating syntactic and semantic information in word embeddings using graph convolutional networks. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3308–3318

  20. Dhole K, Manning CD (2020) Syn-qg: Syntactic and shallow semantic rules for question generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 752–765

  21. Fei H, Ren Y, Ji D (2020) Improving text understanding via deep syntax-semantics communication. In: Findings of the association for computational linguistics: EMNLP 2020, pp 84–93

  22. Gu X, Zhang Z, Lee S-W, Yoo KM, Ha J-W (2022) Continuous decomposition of granularity for neural paraphrase generation. In: Proceedings of the 29th international conference on computational linguistics, pp 6369–6378

  23. Li Z, Jiang X, Shang L, Liu Q (2019) Decomposable neural paraphrase generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3403–3414

  24. Zhang D, Hong M, Zou L, Han F, He F, Tu Z, Ren Y (2019) Attention pooling-based bidirectional gated recurrent units model for sentimental classification. Int J Comput Intell Syst 12(2):723

  25. Fei H, Ren Y, Ji D (2020) Mimic and conquer: heterogeneous tree structure distillation for syntactic nlp. In: Findings of the association for computational linguistics: EMNLP 2020, pp 183–193

  26. Hosking T, Tang H, Lapata M (2022) Hierarchical sketch induction for paraphrase generation. In: Proceedings of the 60th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 2489–2501

  27. Yang E, Liu M, Xiong D, Zhang Y, Meng Y, Hu C, Xu J, Chen Y (2021) Syntactically-informed unsupervised paraphrasing with non-parallel data. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 2594–2604

  28. Chen M, Tang Q, Wiseman S, Gimpel K (2019) A multi-task approach for disentangling syntax and semantics in sentence representations. arXiv preprint arXiv:1904.01173

  29. Sun J, Ma X, Peng N (2021) Aesop: Paraphrase generation with adaptive syntactic control. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 5176–5189

  30. Goyal T, Durrett G (2020) Neural syntactic preordering for controlled paraphrase generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 238–252

  31. Fei H, Wu S, Ren Y, Zhang M (2022) Matching structure for dual learning. In: International conference on machine learning, PMLR, pp 6373–6391

  32. Cai Y, Cao Y, Wan X (2021) Revisiting pivot-based paraphrase generation: language is not the only optional pivot. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 4255–4268

  33. Huang K-H, Chang K-W (2021) Generating syntactically controlled paraphrases without using annotated parallel pairs. In: Proceedings of the 16th conference of the european chapter of the association for computational linguistics: Main Volume, pp 1022–1033

  34. Rim DN, Heo D, Choi H (2021) Adversarial training with contrastive learning in nlp. arXiv preprint arXiv:2109.09075

  35. Cho WS, Zhang Y, Rao S, Celikyilmaz A, Xiong C, Gao J, Wang M, Dolan B (2021) Contrastive multi-document question generation. In: EACL

  36. He W, Dai Y, Hui B, Yang M, Cao Z, Dong J, Huang F, Si L, Li Y (2022) Space-2: Tree-structured semi-supervised contrastive pre-training for task-oriented dialog understanding. In: Proceedings of the 29th international conference on computational linguistics, pp 553–569

  37. Li B, Hou Y, Che W (2022) Data augmentation approaches in natural language processing: a survey. AI Open

  38. Yan Y, Li R, Wang S, Zhang F, Wu W, Xu W (2021) Consert: A contrastive framework for self-supervised sentence representation transfer. arXiv preprint arXiv:2105.11741

  39. Liu D, Gong Y, Fu J, Yan Y, Chen J, Lv J, Duan N, Zhou M (2020) Tell me how to ask again: question data augmentation with controllable rewriting in continuous space. In: EMNLP (1)

  40. Chi X, Xiang Y (2021) Augmenting paraphrase generation with syntax information using graph convolutional networks. Entropy 23(5):566

  41. Feng SY, Gangal V, Wei J, Chandar S, Vosoughi S, Mitamura T, Hovy E (2021) A survey of data augmentation approaches for nlp. arXiv preprint arXiv:2105.03075

  42. Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

  43. Gao T, Yao X, Chen D (2021) Simcse: Simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821

  44. Yang W, Xie Y, Lin A, Li X, Tan L, Xiong K, Li M, Lin J (2019) End-to-end open-domain question answering with bertserini. arXiv preprint arXiv:1902.01718

  45. Bai J, Wang Y, Chen Y, Yang Y, Bai J, Yu J, Tong Y (2021) Syntax-bert: Improving pre-trained transformers with syntax trees. In: Proceedings of the 16th conference of the european chapter of the association for computational linguistics: Main Volume, pp 3011–3020

  46. Fei H, Ren Y, Ji D (2020) Retrofitting structure-aware transformer language model for end tasks. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), pp 2151–2161

  47. Jawahar G, Sagot B, Seddah D (2019) What does BERT learn about the structure of language? In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3651–3657

  48. Sugiyama A, Yoshinaga N (2019) Data augmentation using back-translation for context-aware neural machine translation. In: Proceedings of the 4th workshop on discourse in machine translation (DiscoMT 2019), pp 35–44

  49. Behr D (2017) Assessing the use of back translation: the shortcomings of back translation as a quality testing method. Int J Soc Res Methodol 20(6):573–584

  50. Lee S, Kang M, Lee J, Hwang SJ (2021) Learning to perturb word embeddings for out-of-distribution QA. arXiv preprint arXiv:2105.02692

  51. Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

  52. Wang Z, Hamza W, Florian R (2017) Bilateral multi-perspective matching for natural language sentences. In: Proceedings of the 26th international joint conference on artificial intelligence (IJCAI)

  53. Iyyer M, Wieting J, Gimpel K, Zettlemoyer L (2018) Adversarial example generation with syntactically controlled paraphrase networks. In: Proceedings of the 2018 conference of the north American chapter of the association for computational linguistics: human language technologies, Vol. 1 (Long Papers), pp 1875–1885

  54. Papineni K, Roukos S, Ward T, Zhu W-J (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311–318

  55. Banerjee S, Lavie A (2005) Meteor: An automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, pp 65–72

  56. Lin C-Y (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out, pp 74–81

  57. Zhang K, Shasha D (1989) Simple fast algorithms for the editing distance between trees and related problems. SIAM J Comput 18(6):1245–1262

  58. Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543

  59. Qi W, Yan Y, Gong Y, Liu D, Duan N, Chen J, Zhang R, Zhou M (2020) Prophetnet: Predicting future n-gram for sequence-to-sequence pre-training. arXiv preprint arXiv:2001.04063

Acknowledgements

This research was partially supported by a grant from the National Natural Science Foundation of China (No. 61877051).

Author information

Corresponding author

Correspondence to Li Li.

Ethics declarations

Conflicts of interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A hyper-parameter tuning

Although the middle layers are able to embed rich syntactic information, the result is not perfect; that is, the captured syntactic information is incomplete. Experiments in [47] also show that BERT needs its deeper layers to capture long-range dependency information. As shown in Fig. 7, when \(\beta \) in Eq. 5 is near 1, the long-range dependency information from the preceding layers is lost, which produces the non-monotonic curve observed in our experiments.

Fig. 7: The changing trend of model performance when \(\alpha \) or \(\beta \) changes
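
Because Eq. 5 is not reproduced in this preview, the sketch below only illustrates one plausible reading of the \(\beta \) hyper-parameter: a scalar interpolation between a middle-layer and a deep-layer BERT representation, where pushing \(\beta \) toward 1 down-weights the deep layers that [47] associates with long-range dependencies. The layer indices, the mixing direction, and the model name are assumptions for illustration, and \(\alpha \) from Fig. 7 is not covered.

```python
import torch
from transformers import BertModel, BertTokenizer

# Illustrative only: mix a middle-layer and a deep-layer BERT representation with a
# scalar beta (assumed form; the paper's Eq. 5 is not reproduced in this preview).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

inputs = tokenizer("she slowly opened the old wooden door", return_tensors="pt")
with torch.no_grad():
    # hidden_states is a 13-tuple: embedding output plus one tensor per encoder layer
    hidden_states = model(**inputs).hidden_states

beta = 0.9                    # near 1: the deep-layer (long-range) contribution almost vanishes
h_middle = hidden_states[6]   # a middle layer, rich in local syntactic information [47]
h_deep = hidden_states[12]    # the last layer, needed for long-range dependencies [47]
h_mixed = beta * h_middle + (1 - beta) * h_deep
print(h_mixed.shape)          # torch.Size([1, seq_len, 768])
```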

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, H., Li, L. HCPG: a highlighted contrastive learning framework for exemplar-guided paraphrase generation. Neural Comput & Applic 35, 17267–17279 (2023). https://doi.org/10.1007/s00521-023-08609-7

