Abstract
In Natural Language Processing (NLP), one traditionally considers a single task (e.g. part-of-speech tagging) for a single language (e.g. English) at a time. However, recent work has shown that it can be beneficial to take advantage of relatedness between tasks, as well as between languages. In this work I examine the concept of relatedness and explore how it can be utilised to build NLP models that require less manually annotated data. A large selection of NLP tasks is investigated for a substantial language sample comprising 60 languages. The results show potential for joint multitask and multilingual modelling, and hint at linguistic insights which can be gained from such models.
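As a minimal sketch of what joint multitask modelling can look like, the PyTorch snippet below shares one encoder between two task-specific output layers (hard parameter sharing). The task names, tag-set sizes, dimensions, and the use of PyTorch are illustrative assumptions, not the exact architecture of the thesis.

```python
import torch
import torch.nn as nn

class MultitaskTagger(nn.Module):
    """Sketch of hard parameter sharing: one shared encoder, one head per task."""

    def __init__(self, vocab_size=10000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim,
                               bidirectional=True, batch_first=True)
        # Task-specific output layers; the embedding layer and encoder are
        # shared, so supervision for one task also shapes the other.
        self.heads = nn.ModuleDict({
            "pos": nn.Linear(2 * hidden_dim, 17),     # e.g. a POS tag set
            "semtag": nn.Linear(2 * hidden_dim, 73),  # e.g. a semantic tag set
        })

    def forward(self, token_ids, task):
        states, _ = self.encoder(self.embed(token_ids))
        return self.heads[task](states)

model = MultitaskTagger()
tokens = torch.randint(0, 10000, (1, 6))   # one 6-token sentence of word ids
pos_logits = model(tokens, task="pos")     # shape: (1, 6, 17)
```

Because the encoder parameters are updated by the losses of both tasks, annotated data for a related auxiliary task can reduce the amount of data needed for the main task.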
Notes
In NLP, words are commonly represented by embedding them in a vector space, typically with 64–256 dimensions. These representations are learnt by predicting contexts in large text corpora, so that words occurring in similar contexts end up close to one another. This is useful because such words tend to have similar meanings (i.e. distributional semantics).
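The snippet below sketches this with gensim's Word2Vec, one common context-prediction model; the library, the toy corpus, and the hyperparameters are assumptions for illustration only.

```python
# A minimal sketch of learning word embeddings by predicting contexts.
from gensim.models import Word2Vec

# Toy corpus; in practice this would be a large text corpus.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["a", "cat", "chased", "a", "dog"],
]

# vector_size in the 64-256 range mentioned above; window sets the context size.
model = Word2Vec(sentences, vector_size=64, window=2, min_count=1, epochs=50)

# Words occurring in similar contexts end up close in the vector space.
print(model.wv.most_similar("cat", topn=2))
```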
Sharing between languages can be achieved by learning multilingual word embeddings, in which, e.g., the English word dialects and its German translation Dialekten are close to one another.
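One standard way to obtain such a shared space, sketched below, is to map one language's embeddings onto another's with an orthogonal (Procrustes) transformation learnt from a small seed dictionary. The method and the random arrays are illustrative assumptions, not the procedure used in the thesis.

```python
# A minimal sketch of aligning two monolingual embedding spaces.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))  # source-language embeddings (e.g. English)
Y = rng.normal(size=(1000, 64))  # target-language embeddings (e.g. German)

# Rows of X and Y are assumed paired by a seed dictionary
# (e.g. dialects <-> Dialekten); here they are random placeholders.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt        # orthogonal map minimising ||XW - Y||

mapped = X @ W    # source embeddings expressed in the target space;
                  # translation pairs should now be near one another
```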
Bi-directional RNNs are frequently used in NLP. One advantage is that both the preceding and the succeeding context of a word can be used when predicting its tag.
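The PyTorch sketch below shows the mechanics: a bi-directional LSTM yields, at each position, a forward state summarising the preceding context concatenated with a backward state summarising the succeeding context. The sizes and the choice of LSTM are assumptions.

```python
# A minimal sketch of why bi-directional RNNs help tagging.
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=64, hidden_size=128,
              bidirectional=True, batch_first=True)
embeddings = torch.randn(1, 6, 64)   # one 6-token sentence, 64-dim embeddings
states, _ = rnn(embeddings)

# states[:, t, :128] encodes tokens up to position t (forward pass);
# states[:, t, 128:] encodes tokens from t onwards (backward pass).
# A tag classifier applied at t therefore sees both contexts.
print(states.shape)                  # torch.Size([1, 6, 256])
```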
That is, a model trained on one language is evaluated on test instances from a language it has never observed during training.
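A hedged sketch of this zero-shot evaluation loop follows; the tagger object and its predict() method are hypothetical stand-ins, and transfer presupposes a shared input space such as the multilingual embeddings above.

```python
# A minimal sketch of zero-shot cross-lingual evaluation:
# train on one language, evaluate on an unobserved one.
def accuracy(tagger, sentences, gold_tags):
    correct = total = 0
    for sent, gold in zip(sentences, gold_tags):
        pred = tagger.predict(sent)  # hypothetical predict() method
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total

# Train only on the source language ...
# tagger.fit(english_sentences, english_tags)
# ... then evaluate on a language never seen during training:
# print(accuracy(tagger, finnish_sentences, finnish_tags))
```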