Measuring Content Preservation in Textual Style Transfer

  • Conference paper
Data Mining (AusDM 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1741)

Abstract

Textual style transfer rewrites text from one style, such as that of the works of Shakespeare, into another. To determine whether the content of a sentence (its meaning) has been preserved, current evaluation takes the cosine similarity between the sentence embeddings of the original and transferred sentences. This assumes, however, that such sentence embeddings are style invariant; if they are not, measurements of content preservation can be inaccurate. To investigate this, we compared the average similarity of multiple styles of text from the Corpus of Diverse Styles using a variety of sentence embedding methods. We found that embeddings created by aggregating word embeddings are style invariant, whereas those produced directly by sentence embedding methods are not.
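The content-preservation measure described in the abstract can be illustrated with a minimal sketch. It assumes the simplest aggregation (an unweighted mean of word vectors) and uses a hypothetical three-dimensional toy vocabulary, `TOY_VECTORS`, in place of pretrained embeddings; the function names are illustrative and are not taken from the paper.

```python
import math

# Hypothetical 3-dimensional word vectors, for illustration only.
# A real experiment would load pretrained word embeddings instead.
TOY_VECTORS = {
    "thou": [0.9, 0.1, 0.0],
    "you": [0.8, 0.2, 0.0],
    "art": [0.1, 0.9, 0.1],
    "are": [0.1, 0.8, 0.2],
    "kind": [0.0, 0.2, 0.9],
}


def mean_embedding(sentence, vectors=TOY_VECTORS):
    """Average the vectors of the known words in a whitespace-tokenised sentence."""
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        raise ValueError("sentence contains no words with a known vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def content_preservation(original, transferred, vectors=TOY_VECTORS):
    """Score content preservation as the cosine similarity of mean embeddings."""
    return cosine_similarity(
        mean_embedding(original, vectors),
        mean_embedding(transferred, vectors),
    )


# Same content in two styles: the score should be close to 1.
score = content_preservation("Thou art kind", "You are kind")
print(f"content preservation score: {score:.4f}")
```

A score near 1 is read as "content preserved". That reading is only sound if the embedding is itself insensitive to style, which is the assumption the paper examines.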

Author information

Corresponding author

Correspondence to Stuart Fitzpatrick.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fitzpatrick, S., Park, L., Obst, O. (2022). Measuring Content Preservation in Textual Style Transfer. In: Park, L.A.F., et al. Data Mining. AusDM 2022. Communications in Computer and Information Science, vol 1741. Springer, Singapore. https://doi.org/10.1007/978-981-19-8746-5_1

  • DOI: https://doi.org/10.1007/978-981-19-8746-5_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8745-8

  • Online ISBN: 978-981-19-8746-5

  • eBook Packages: Computer Science (R0)
