Table of contents (12 chapters)
About this book
The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano--and most other languages--remains limited. Bridging this digital divide is important for scientific and democratic reasons, and it also represents enormous growth potential. A key challenge in doing so is learning to align the basic meaning-bearing units of different languages.
In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods into a comparable form, which makes it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic.
About the authors
Sebastian Ruder is a Research Scientist at DeepMind. He obtained his Ph.D. in Natural Language Processing at the National University of Ireland, Galway in 2019. He is interested in transfer learning and cross-lingual learning and has published widely read reviews as well as more than ten peer-reviewed research papers in top-tier conference proceedings in NLP.
Manaal Faruqui is a Senior Research Scientist at Google, working on industrial-scale NLP and ML problems. He obtained his Ph.D. at the Language Technologies Institute at Carnegie Mellon University while working on representation learning, multilingual learning, and distributional and lexical semantics. He received a best paper award at NAACL 2015 for his work on incorporating semantic knowledge in word vector representations. He serves on the editorial board of the Computational Linguistics journal and has been an area chair for several ACL conferences.
Bibliographic Information
Book Title: Cross-Lingual Word Embeddings
Authors: Anders Søgaard, Ivan Vulić, Sebastian Ruder, Manaal Faruqui
Series Title: Synthesis Lectures on Human Language Technologies
DOI: https://doi.org/10.1007/978-3-031-02171-8
Publisher: Springer Cham
eBook Packages: Synthesis Collection of Technology (R0), eBColl Synthesis Collection 9
Copyright Information: Springer Nature Switzerland AG 2019
Softcover ISBN: 978-3-031-01043-9
Softcover Published: 05 June 2019
eBook ISBN: 978-3-031-02171-8
eBook Published: 31 May 2022
Series ISSN: 1947-4040
Series E-ISSN: 1947-4059
Edition Number: 1
Number of Pages: XI, 120
Topics: Artificial Intelligence, Natural Language Processing (NLP), Computational Linguistics