DOI: 10.1145/1839707.1839730
research-article

Using linked open data to bootstrap corporate knowledge management in the OrganiK project

Published: 01 September 2010

ABSTRACT

Tagging has become a widespread tool for organising content, from photos and music to research papers and data visualisations. Organising tags in a taxonomy adds hierarchical structure and relationships. This can be helpful both for finding tags and applying them to new content, and for enabling query expansion when searching. However, taxonomies can be very time-consuming to create and maintain. If a hierarchical taxonomy could be automatically built and adapted to a particular domain, the entry cost for using taxonomies to structure information would go down. Small and medium enterprises (SMEs) do not currently have sufficient resources to invest in Enterprise 2.0 technologies like taxonomies, wikis or blogging, as the entry cost is too high. The OrganiK project aims to make Enterprise 2.0 features available with low entry and maintenance costs.

In this paper, an algorithm and methodology to automatically create and maintain taxonomies is presented. It analyses enterprise document corpora and uses background information from domain-specific data sources or from the Linked Open Data cloud to improve and contextualise the created SKOS taxonomy. Content created in a Drupal-based Enterprise 2.0 content management system is automatically categorised, and the automatically created taxonomy is extended when needed. The system has been tested with corpora of medical abstracts, computer science papers, and the Enron email collection, and is in productive use.
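
As an illustration of the query expansion mentioned in the abstract, the following is a minimal sketch of how a hierarchical taxonomy can widen a search term to include its narrower concepts. It is not the OrganiK algorithm: the `NARROWER` mapping, the concept names, and the `expand_query` function are all hypothetical, and the mapping only loosely mirrors the `skos:narrower` relation of a SKOS taxonomy.

```python
from collections import deque

# Hypothetical toy taxonomy (an assumption for illustration only):
# each concept maps to its directly narrower concepts.
NARROWER = {
    "medicine": ["cardiology", "oncology"],
    "cardiology": ["arrhythmia"],
    "oncology": [],
    "arrhythmia": [],
}


def expand_query(term, narrower, max_depth=None):
    """Return the term followed by its narrower concepts, breadth-first.

    max_depth limits how many levels below the term are included;
    None means the whole subtree is included.
    """
    expanded = [term]
    queue = deque([(term, 0)])
    while queue:
        concept, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not descend below the requested depth
        for child in narrower.get(concept, []):
            if child not in expanded:  # guard against repeats and cycles
                expanded.append(child)
                queue.append((child, depth + 1))
    return expanded


if __name__ == "__main__":
    print(expand_query("medicine", NARROWER))
    print(expand_query("medicine", NARROWER, max_depth=1))
```

A search for "medicine" would then also match content tagged with "cardiology", "oncology" or "arrhythmia"; limiting `max_depth` trades recall for precision.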


Published in

I-SEMANTICS '10: Proceedings of the 6th International Conference on Semantic Systems
September 2010
281 pages
ISBN: 9781450300148
DOI: 10.1145/1839707

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 40 of 182 submissions, 22%
