Combining Distributional Semantics and Structured Data to Study Lexical Change

  • Conference paper
Knowledge Engineering and Knowledge Management (EKAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10180)

Abstract

Statistical Natural Language Processing (NLP) techniques make it possible to quantify lexical semantic change using large text corpora. However, the word-level results of these methods can be hard to analyse in the context of sets of semantically or linguistically related words. Structured knowledge sources, on the other hand, represent semantic relationships explicitly but ignore the problem of semantic change. We aim to address these limitations by combining the statistical and symbolic approaches: we enrich WordNet, a structured lexical database, with quantitative lexical change scores provided by HistWords, a dataset produced by distributional NLP methods. We publish the result as Linked Open Data and demonstrate how queries on the combined dataset can provide new insights.
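
To illustrate the kind of query the combined dataset makes possible, the following minimal sketch averages word-level change scores over the lemmas of each synset. It is not the authors' implementation, which publishes the data as Linked Open Data: plain Python dictionaries stand in for the enriched WordNet graph, and every identifier, lemma and score below is a hypothetical placeholder rather than a value from WordNet or HistWords.

```python
from statistics import mean

# Hypothetical word-level change scores (placeholder values, not HistWords data).
CHANGE_SCORE = {
    "lemma_a": 0.5,
    "lemma_b": 0.25,
    "lemma_c": 0.75,
    "lemma_d": 1.0,
    "lemma_e": 0.5,
}

# Hypothetical synsets: each groups lemmas that share a meaning, as in WordNet.
SYNSETS = {
    "synset_1": ["lemma_a", "lemma_b", "lemma_c"],
    "synset_2": ["lemma_d", "lemma_e", "lemma_without_score"],
    "synset_3": ["lemma_without_score"],
}


def mean_change_per_synset(synsets, change_score):
    """Return the mean change score of each synset's scored lemmas.

    Lemmas without a score are skipped; synsets in which no lemma
    has a score are left out of the result.
    """
    result = {}
    for synset_id, lemmas in synsets.items():
        scores = [change_score[lemma] for lemma in lemmas if lemma in change_score]
        if scores:
            result[synset_id] = mean(scores)
    return result


if __name__ == "__main__":
    for synset_id, score in mean_change_per_synset(SYNSETS, CHANGE_SCORE).items():
        print(synset_id, score)
```

Aggregating per synset is only one possible design choice; the same join between structured groupings and word-level scores could equally be expressed as a query over the published Linked Open Data.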

This paper is an extended version of [13].

Notes

  1. http://snap.stanford.edu/historical_embeddings/eng-all_sgns.zip, fullstats.

  2. http://storage.googleapis.com/books/ngrams/books/datasetsv2.html.

  3. www.github.com/aan680/SemanticChange.

  4. http://www.w3.org/TR/owl-time/.

References

  1. Blank, A.: Words and concepts in time: towards diachronic cognitive onomasiology (2003)

  2. De Bolla, P.: The Architecture of Concepts: The Historical Formation of Human Rights. Oxford University Press, New York (2013)

  3. Gabrielatos, C., Baker, P.: Fleeing, sneaking, flooding: a corpus analysis of discursive constructions of refugees and asylum seekers in the UK press, 1996–2005. J. Eng. Linguist. 36(1), 5–38 (2008)

  4. Gulordava, K., Baroni, M.: A distributional similarity approach to the detection of semantic change in the Google Books Ngram corpus. In: Proceedings of the GEMS 2011 Workshop on Geometrical Models of Natural Language Semantics, pp. 67–71. Association for Computational Linguistics (2011)

  5. Hamilton, W.L., Leskovec, J., Jurafsky, D.: Diachronic word embeddings reveal statistical laws of semantic change. In: ACL 2016, pp. 1489–1501 (2016)

  6. Kenter, T., Wevers, M., Huijnen, P., de Rijke, M.: Ad hoc monitoring of vocabulary shifts over time. In: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pp. 1191–1200. ACM (2015)

  7. Khan, F., Díaz-Vera, J.E., Monachini, M.: Representing polysemy and diachronic lexico-semantic data on the Semantic Web. In: Proceedings of the Second International Workshop on Semantic Web for Scientific Heritage Co-located with 13th Extended Semantic Web Conference (ESWC 2016) (2016)

  8. Kim, Y., Chiu, Y.-I., Hanaki, K., Hegde, D., Petrov, S.: Temporal analysis of language through neural language models. In: Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, pp. 61–65 (2014)

  9. Kulkarni, V., Al-Rfou, R., Perozzi, B., Skiena, S.: Statistically significant detection of linguistic change. In: Proceedings of the 24th International Conference on World Wide Web, pp. 625–635. ACM (2015)

  10. McCrae, J.P., Fellbaum, C., Cimiano, P.: Publishing and linking WordNet using lemon and RDF. In: Proceedings of the 3rd Workshop on Linked Data in Linguistics (2014)

  11. Mikolov, T., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems (2013)

  12. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)

  13. van Aggelen, A., Hollink, L., van Ossenbruggen, J.: Combining distributional semantics and structured data to study lexical change. In: Proceedings of the 1st Workshop on Detection, Representation and Management of Concept Drift in Linked Open Data. CEUR Workshop Proceedings, vol. 1799, pp. 18–25. http://ceur-ws.org/Vol-1799

  14. Van Assem, M., Gangemi, A., Schreiber, G.: Conversion of WordNet to a standard RDF/OWL representation. In: Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, pp. 237–242 (2006)

  15. Wielemaker, J., Beek, W., Hildebrand, M., van Ossenbruggen, J.: ClioPatria: A SWI-Prolog infrastructure for the semantic web. Semant. Web 7(5), 529–541 (2016). IOS Press

Acknowledgments

This work was partially supported by H2020 project VRE4EIC under grant agreement No. 676247.

Author information

Corresponding author

Correspondence to Astrid van Aggelen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

van Aggelen, A., Hollink, L., van Ossenbruggen, J. (2017). Combining Distributional Semantics and Structured Data to Study Lexical Change. In: Ciancarini, P., et al. Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol. 10180. Springer, Cham. https://doi.org/10.1007/978-3-319-58694-6_4

  • DOI: https://doi.org/10.1007/978-3-319-58694-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58693-9

  • Online ISBN: 978-3-319-58694-6

  • eBook Packages: Computer Science, Computer Science (R0)
