
Multitask and Multilingual Modelling for Lexical Analysis

  • Dissertation and Habilitation Abstracts
  • Published in: KI - Künstliche Intelligenz

Abstract

In Natural Language Processing (NLP), one traditionally considers a single task (e.g. part-of-speech tagging) for a single language (e.g. English) at a time. However, recent work has shown that it can be beneficial to take advantage of relatedness between tasks, as well as between languages. In this work I examine the concept of relatedness and explore how it can be utilised to build NLP models that require less manually annotated data. A large selection of NLP tasks is investigated for a substantial language sample comprising 60 languages. The results show potential for joint multitask and multilingual modelling, and hint at linguistic insights which can be gained from such models.


Notes

  1. In NLP words are commonly represented by embedding them in a vector space, typically with 64–256 dimensions. These representations are learnt by predicting contexts in large text corpora, such that words occurring in similar contexts are close to one another, which is useful since such words tend to have similar meanings (i.e. distributional semantics).
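To make the distributional idea concrete, the sketch below builds count-based context vectors from an invented toy corpus and compares them with cosine similarity. This is a deliberate simplification: real embeddings are learnt by prediction over large corpora and have 64–256 dimensions, whereas here the corpus, window size, and count-based vectors are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

# Toy corpus (invented): "cat" and "dog" appear in similar contexts.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "the cat chased a mouse",
    "the dog chased a ball",
]

# Count co-occurrences within a +/-2 word window around each token.
vectors = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if j != i:
                vectors[w][words[j]] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

# Words occurring in similar contexts end up with similar vectors.
print(cosine(vectors["cat"], vectors["dog"]))  # close to 1.0
print(cosine(vectors["cat"], vectors["mat"]))  # clearly lower
```

In this toy data "cat" and "dog" share all their contexts, so their vectors are nearly identical, mirroring how distributionally similar words cluster in a learnt embedding space.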

  2. Semantic tags (SemTags): [1, 9]. Part-of-speech (POS) tags: Universal Dependencies v1.3 (universaldependencies.org).

  3. This can be done by learning multilingual word embeddings, in which, e.g., the words dialects and Dialekten are close to one another.
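The sketch below illustrates what such a shared multilingual space enables: cross-lingual nearest-neighbour retrieval. The vectors here are hand-crafted, hypothetical 3-dimensional values chosen so that translation pairs lie close together; in practice these positions would be learnt, not assigned.

```python
from math import sqrt

# Hypothetical multilingual embedding space: translation equivalents
# (e.g. English "dialects" and German "Dialekten") have nearby vectors.
embeddings = {
    ("en", "dialects"):  [0.90, 0.10, 0.20],
    ("de", "Dialekten"): [0.88, 0.12, 0.19],
    ("en", "syntax"):    [0.10, 0.90, 0.30],
    ("de", "Syntax"):    [0.12, 0.88, 0.31],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def nearest(lang, word, target_lang):
    """Return the closest word in target_lang: cross-lingual retrieval."""
    query = embeddings[(lang, word)]
    candidates = [(w, cosine(query, vec))
                  for (l, w), vec in embeddings.items() if l == target_lang]
    return max(candidates, key=lambda c: c[1])[0]

print(nearest("en", "dialects", "de"))  # Dialekten
```

Because translations occupy neighbouring points, a model trained on annotated English input can be applied to German input represented in the same space.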

  4. Bi-directional RNNs are frequently used in NLP. An advantage of this architecture is that both the preceding and succeeding contexts of a word can be used when predicting its tag.
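A minimal pure-Python sketch of this idea: run a simple Elman-style RNN over the sequence once left-to-right and once right-to-left, and concatenate the two hidden states per token. For brevity the two directions share one set of randomly initialised weights here; real bi-directional taggers use separate, trained parameters per direction.

```python
import math
import random

random.seed(0)

def rnn_pass(inputs, W_xh, W_hh, hidden_size):
    """One directional Elman RNN pass; returns a hidden state per time step."""
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        pre = [sum(W_xh[i][j] * x[j] for j in range(len(x))) +
               sum(W_hh[i][j] * h[j] for j in range(hidden_size))
               for i in range(hidden_size)]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states

def bidirectional(inputs, W_xh, W_hh, hidden_size):
    """Concatenate forward and backward states: each token's representation
    now depends on both its preceding and its succeeding context."""
    fwd = rnn_pass(inputs, W_xh, W_hh, hidden_size)
    bwd = rnn_pass(inputs[::-1], W_xh, W_hh, hidden_size)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]

vocab_size, hidden = 4, 3
W_xh = [[random.uniform(-0.5, 0.5) for _ in range(vocab_size)] for _ in range(hidden)]
W_hh = [[random.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]

# One-hot encoded toy sentence of 5 tokens.
sentence = [[1 if j == t % vocab_size else 0 for j in range(vocab_size)]
            for t in range(5)]
reps = bidirectional(sentence, W_xh, W_hh, hidden)
print(len(reps), len(reps[0]))  # 5 tokens, each with 2 * hidden dimensions
```

A tag classifier would then be applied on top of each concatenated representation, so the prediction for a token can exploit context on both sides.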

  5. Evaluating a model trained on one language on test instances from a language not observed during training.

References

  1. Abzianidze L, Bjerva J, Evang K, Haagsma H, van Noord R, Ludmann P, Nguyen DD, Bos J (2017) The parallel meaning bank: towards a multilingual corpus of translations annotated with compositional meaning representations. In: EACL, pp 242–247

  2. Bjerva J (2016) Byte-based language identification with deep convolutional networks. In: VarDial3, pp 119–125

  3. Bjerva J (2017) One model to rule them all – multitask and multilingual modelling for lexical analysis. Ph.D. thesis, University of Groningen. http://hdl.handle.net/11370/73e67d8a-14b0-42b1-9dcf-292eab63539c. Accessed 01 May 2018

  4. Bjerva J (2017) Will my auxiliary tagging task help? Estimating auxiliary tasks effectivity in multi-task learning. In: NoDaLiDa, pp 216–220

  5. Bjerva J, Augenstein I (2018) From phonology to syntax: unsupervised linguistic typology at different levels with language embeddings. In: NAACL-HLT

  6. Bjerva J, Augenstein I (2018) Tracking typological features of uralic languages in distributed language representations. In: IWCLUL

  7. Bjerva J, Bos J, van der Goot R, Nissim M (2014) The meaning factory: formal semantics for recognizing textual entailment and determining semantic similarity. In: SemEval 2014, pp 642–646

  8. Bjerva J, Östling R (2017) Cross-lingual learning of semantic textual similarity with multilingual word representations. In: NoDaLiDa, pp 211–215

  9. Bjerva J, Plank B, Bos J (2016) Semantic tagging with deep residual networks. In: COLING, pp 3531–3541

  10. Bos J, Basile V, Evang K, Venhuizen NJ, Bjerva J (2017) The Groningen meaning bank. Springer, Dordrecht, pp 463–496


  11. Caruana R (1997) Multitask learning. Mach Learn 28(1):41–75


  12. de Lhoneux M, Bjerva J, Augenstein I, Søgaard A (2018) Parameter sharing between dependency parsers for related languages. In: EMNLP


Author information

Corresponding author

Correspondence to Johannes Bjerva.


Cite this article

Bjerva, J. Multitask and Multilingual Modelling for Lexical Analysis. Künstl Intell 32, 287–290 (2018). https://doi.org/10.1007/s13218-018-0557-5
