Abstract
Exemplar-guided Paraphrase Generation aims to use an exemplar sentence to guide the generation of a paraphrase that retains the semantic content of the source sentence along with the syntax of the exemplar. Some methods use syntactic structures extracted from exemplars to guide generation, but the extraction preprocessing may cause information loss. Other methods directly use the natural exemplar sentences (NES) as syntactic guidance, which avoids this loss but fails to effectively capture and integrate the exemplar's syntax and the source sentence's semantics. In this paper, we propose a Highlighted Contrastive learning framework for exemplar-guided Paraphrase Generation (HCPG), which addresses the shortcomings of using NES as syntactic guidance. The "highlight" refers to a continuous process of supplementing and refining that effectively captures both the semantic and syntactic information of the sentences. HCPG also includes a contrastive loss layer that helps the decoder fully integrate the highlighted semantic and syntactic information to generate the final paraphrases. Experiments on ParaNMT and QQP-Pos show that HCPG is comparable to several state-of-the-art models, including SAGP and GCPG, and achieves an average 3.19% improvement over CLPG.
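HCPG's exact contrastive objective is not reproduced in this excerpt. As a point of reference only, the sketch below shows a standard InfoNCE-style contrastive loss of the kind such frameworks typically build on, in which matched representation pairs (e.g., a source sentence and its paraphrase) are pulled together while in-batch mismatches serve as negatives. The function name and temperature value are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchors: torch.Tensor,
                     positives: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss over a batch.

    anchors, positives: (batch, dim) sentence representations;
    the i-th anchor matches the i-th positive, and all other
    in-batch pairs act as negatives.
    """
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    # (batch, batch) cosine-similarity matrix, scaled by temperature
    logits = anchors @ positives.t() / temperature
    # diagonal entries are the positive pairs
    targets = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, targets)
```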
Data Availability Statement
The data that support the findings of this study are openly available in the HCPG-Dataset repository.
References
Min J, McCoy RT, Das D, Pitler E, Linzen T (2020) Syntactic data augmentation increases robustness to inference heuristics. arXiv preprint arXiv:2004.11999
Thompson B, Post M (2020) Automatic machine translation evaluation in many languages via zero-shot paraphrasing. arXiv preprint arXiv:2004.14564
Gao S, Zhang Y, Ou Z, Yu Z (2020) Paraphrase augmented task-oriented dialog generation. arXiv preprint arXiv:2004.07462
Lyu Y, Liang PP, Pham H, Hovy E, Póczos B, Salakhutdinov R, Morency L-P (2021) Styleptb: A compositional benchmark for fine-grained controllable text style transfer. arXiv preprint arXiv:2104.05196
Zhang Y, Ge T, Sun X (2020) Parallel data augmentation for formality style transfer. arXiv preprint arXiv:2005.07522
Shen T, Lei T, Barzilay R, Jaakkola T (2017) Style transfer from non-parallel text by cross-alignment. Adv Neural Inform Process Syst 30
Kumar A, Ahuja K, Vadapalli R, Talukdar P (2020) Syntax-guided controlled generation of paraphrases. Trans Assoc Comput Linguis 8:330–345
Yang E, Liu M, Xiong D, Zhang Y, Meng Y, Xu J, Chen Y (2022) Improving generation diversity via syntax-controlled paraphrasing. Neurocomputing 485:103–113
Yang E, Bai C, Xiong D, Zhang Y, Meng Y, Xu J, Chen Y (2022) Learning structural information for syntax-controlled paraphrase generation. In: Findings of the association for computational linguistics: NAACL 2022, pp 2079–2090
Yang K, Liu D, Lei W, Yang B, Zhang H, Zhao X, Yao W, Chen B (2022) Gcpg: A general framework for controllable paraphrase generation. In: Findings of the association for computational linguistics: ACL 2022, pp 4035–4047
Manning CD, Surdeanu M, Bauer J, Finkel JR, Bethard S, McClosky D (2014) The stanford corenlp natural language processing toolkit. In: Proceedings of 52nd annual meeting of the association for computational linguistics: system demonstrations, pp 55–60
Chen M, Tang Q, Wiseman S, Gimpel K (2019) Controllable paraphrase generation with a syntactic exemplar. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 5972–5984
Yang H, Lam W, Li P (2021) Contrastive representation learning for exemplar-guided paraphrase generation. arXiv preprint arXiv:2109.01484
Jaiswal A, Babu AR, Zadeh MZ, Banerjee D, Makedon F (2020) A survey on contrastive self-supervised learning. Technologies 9(1):2
Li Y, Feng R, Rehg I, Zhang C (2020) Transformer-based neural text generation with syntactic guidance. arXiv preprint arXiv:2010.01737
Wieting J, Gimpel K (2018) Paranmt-50m: Pushing the limits of paraphrastic sentence embeddings with millions of machine translations. In: Proceedings of the 56th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 451–462
Wu X, Gao C, Zang L, Han J, Wang Z, Hu S (2021) Smoothed contrastive learning for unsupervised sentence embedding. arXiv preprint arXiv:2109.04321
Gupta A, Zhang Z (2018) To attend or not to attend: a case study on syntactic structures for semantic relatedness. In: Proceedings of the 56th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 2116–2125
Vashishth S, Bhandari M, Yadav P, Rai P, Bhattacharyya C, Talukdar P (2019) Incorporating syntactic and semantic information in word embeddings using graph convolutional networks. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3308–3318
Dhole K, Manning CD (2020) Syn-qg: Syntactic and shallow semantic rules for question generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 752–765
Fei H, Ren Y, Ji D (2020) Improving text understanding via deep syntax-semantics communication. In: Findings of the association for computational linguistics: EMNLP 2020, pp 84–93
Gu X, Zhang Z, Lee S-W, Yoo KM, Ha J-W (2022) Continuous decomposition of granularity for neural paraphrase generation. In: Proceedings of the 29th international conference on computational linguistics, pp 6369–6378
Li Z, Jiang X, Shang L, Liu Q (2019) Decomposable neural paraphrase generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3403–3414
Zhang D, Hong M, Zou L, Han F, He F, Tu Z, Ren Y (2019) Attention pooling-based bidirectional gated recurrent units model for sentimental classification. Int J Comput Intell Syst 12(2):723
Fei H, Ren Y, Ji D (2020) Mimic and conquer: heterogeneous tree structure distillation for syntactic nlp. In: Findings of the association for computational linguistics: EMNLP 2020, pp 183–193
Hosking T, Tang H, Lapata M (2022) Hierarchical sketch induction for paraphrase generation. In: Proceedings of the 60th annual meeting of the association for computational linguistics (Vol. 1: Long Papers), pp 2489–2501
Yang E, Liu M, Xiong D, Zhang Y, Meng Y, Hu C, Xu J, Chen Y (2021) Syntactically-informed unsupervised paraphrasing with non-parallel data. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 2594–2604
Chen M, Tang Q, Wiseman S, Gimpel K (2019) A multi-task approach for disentangling syntax and semantics in sentence representations. arXiv preprint arXiv:1904.01173
Sun J, Ma X, Peng N (2021) Aesop: Paraphrase generation with adaptive syntactic control. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 5176–5189
Goyal T, Durrett G (2020) Neural syntactic preordering for controlled paraphrase generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 238–252
Fei H, Wu S, Ren Y, Zhang M (2022) Matching structure for dual learning. In: International conference on machine learning, PMLR, pp 6373–6391
Cai Y, Cao Y, Wan X (2021) Revisiting pivot-based paraphrase generation: language is not the only optional pivot. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 4255–4268
Huang K-H, Chang K-W (2021) Generating syntactically controlled paraphrases without using annotated parallel pairs. In: Proceedings of the 16th conference of the european chapter of the association for computational linguistics: Main Vol., pp 1022–1033
Rim DN, Heo D, Choi H (2021) Adversarial training with contrastive learning in nlp. arXiv preprint arXiv:2109.09075
Cho WS, Zhang Y, Rao S, Celikyilmaz A, Xiong C, Gao J, Wang M, Dolan B (2021) Contrastive multi-document question generation. In: EACL
He W, Dai Y, Hui B, Yang M, Cao Z, Dong J, Huang F, Si L, Li Y (2022) Space-2: Tree-structured semi-supervised contrastive pre-training for task-oriented dialog understanding. In: Proceedings of the 29th international conference on computational linguistics, pp 553–569
Li B, Hou Y, Che W (2022) Data augmentation approaches in natural language processing: a survey. AI Open
Yan Y, Li R, Wang S, Zhang F, Wu W, Xu W (2021) Consert: A contrastive framework for self-supervised sentence representation transfer. arXiv preprint arXiv:2105.11741
Liu D, Gong Y, Fu J, Yan Y, Chen J, Lv J, Duan N, Zhou M (2020) Tell me how to ask again: question data augmentation with controllable rewriting in continuous space. In: EMNLP (1)
Chi X, Xiang Y (2021) Augmenting paraphrase generation with syntax information using graph convolutional networks. Entropy 23(5):566
Feng SY, Gangal V, Wei J, Chandar S, Vosoughi S, Mitamura T, Hovy E (2021) A survey of data augmentation approaches for nlp. arXiv preprint arXiv:2105.03075
Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Gao T, Yao X, Chen D (2021) Simcse: Simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821
Yang W, Xie Y, Lin A, Li X, Tan L, Xiong K, Li M, Lin J (2019) End-to-end open-domain question answering with bertserini. arXiv preprint arXiv:1902.01718
Bai J, Wang Y, Chen Y, Yang Y, Bai J, Yu J, Tong Y (2021) Syntax-bert: Improving pre-trained transformers with syntax trees. In: Proceedings of the 16th conference of the european chapter of the association for computational linguistics: Main Volume, pp 3011–3020
Fei H, Ren Y, Ji D (2020) Retrofitting structure-aware transformer language model for end tasks. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), pp 2151–2161
Jawahar G, Sagot B, Seddah D (2019) What does BERT learn about the structure of language? In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 3651–3657
Sugiyama A, Yoshinaga N (2019) Data augmentation using back-translation for context-aware neural machine translation. In: Proceedings of the 4th workshop on discourse in machine translation (DiscoMT 2019), pp 35–44
Behr D (2017) Assessing the use of back translation: the shortcomings of back translation as a quality testing method. Int J Soc Res Methodol 20(6):573–584
Lee S, Kang M, Lee J, Hwang SJ (2021) Learning to perturb word embeddings for out-of-distribution QA. arXiv preprint arXiv:2105.02692
Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555
Wang Z, Hamza W, Florian R (2017) Bilateral multi-perspective matching for natural language sentences. arXiv preprint arXiv:1702.03814
Iyyer M, Wieting J, Gimpel K, Zettlemoyer L (2018) Adversarial example generation with syntactically controlled paraphrase networks. In: Proceedings of the 2018 conference of the north American chapter of the association for computational linguistics: human language technologies, Vol. 1 (Long Papers), pp 1875–1885
Papineni K, Roukos S, Ward T, Zhu W-J (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, pp 311–318
Banerjee S, Lavie A (2005) Meteor: An automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, pp 65–72
Lin C-Y (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out, pp 74–81
Zhang K, Shasha D (1989) Simple fast algorithms for the editing distance between trees and related problems. SIAM J Comput 18(6):1245–1262
Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543
Qi W, Yan Y, Gong Y, Liu D, Duan N, Chen J, Zhang R, Zhou M (2020) Prophetnet: Predicting future n-gram for sequence-to-sequence pre-training. arXiv preprint arXiv:2001.04063
Acknowledgements
This research was partially supported by grants from the National Natural Science Foundation of China (No. 61877051).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A Hyper-parameter tuning
Although BERT's middle layers are able to embed rich syntactic information, the result is not perfect; that is, the captured syntactic information is incomplete. Experiments in [47] also show that BERT needs its deeper layers to capture long-range dependency information. As shown in Fig. 7, when \(\beta \) in Eq. 5 approaches 1, the long-range dependency information from the other layers is lost, which produces the non-monotonic curve observed in our experiments.
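Equation 5 itself is not reproduced in this excerpt. Assuming, as one reading of the text above, that it mixes a syntax-rich middle-layer representation with the last-layer representation (which carries long-range dependency information) via a weight \(\beta \), the trade-off can be sketched with Hugging Face transformers as follows. The layer indices and the value of \(\beta \) are illustrative, not the paper's settings.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("she sells seashells by the seashore", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states: tuple of 13 tensors (embedding layer + 12 encoder layers),
# each of shape (batch, seq_len, hidden_size)
middle = out.hidden_states[6]   # a middle layer: rich syntactic information [47]
last = out.hidden_states[-1]    # the last layer: long-range dependencies

# Hypothetical reading of Eq. 5: a convex combination weighted by beta.
# beta near 1 over-weights the middle layer, so the long-range dependency
# information carried by the deeper layers is lost.
beta = 0.9
mixed = beta * middle + (1.0 - beta) * last
```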
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, H., Li, L. HCPG: a highlighted contrastive learning framework for exemplar-guided paraphrase generation. Neural Comput & Applic 35, 17267–17279 (2023). https://doi.org/10.1007/s00521-023-08609-7