A Discourse Information Radio News Database for Linguistic Analysis

Chapter in: Linked Data in Linguistics

Abstract

In this paper we present DIRNDL, an annotated corpus resource comprising syntactic annotations as well as information status labels and prosodic information. We introduce each annotation layer and then focus on the linking of the data in a standoff approach. The corpus is based on data from radio news broadcasts, i.e. two sets of primary data: spoken radio news files and a written text version which sometimes deviates from the actual spoken data. We utilize a generic relational database management system to bridge the gap between the deviating primary data as well as between the different properties of the annotation levels. We show how the resource can support data extraction concerning the interface between information status, syntax and prosody.
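
The standoff linking described in the abstract can be illustrated with a minimal sketch. It assumes a toy relational layout in which the written and the spoken tokens live in separate tables, a link table connects them where they correspond, and each annotation layer points at the primary data it was created on. All table names, column names, example tokens, and labels below are hypothetical placeholders chosen for illustration; they are not the DIRNDL schema or its tag sets, and SQLite merely stands in for a generic relational database management system.

```python
import sqlite3

# Hypothetical toy schema: every table, column, and label name here is an
# illustrative assumption, not the actual DIRNDL database layout.
SCHEMA = """
CREATE TABLE written_token (id INTEGER PRIMARY KEY, form TEXT NOT NULL);
CREATE TABLE spoken_token  (id INTEGER PRIMARY KEY, form TEXT NOT NULL);
CREATE TABLE token_link (
    written_id INTEGER NOT NULL REFERENCES written_token(id),
    spoken_id  INTEGER NOT NULL REFERENCES spoken_token(id)
);
CREATE TABLE info_status  (written_id INTEGER REFERENCES written_token(id), label TEXT);
CREATE TABLE pitch_accent (spoken_id  INTEGER REFERENCES spoken_token(id),  label TEXT);
"""


def build_demo_db():
    """Create an in-memory database with invented example data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO written_token VALUES (?, ?)",
                     [(1, "Der"), (2, "Minister"), (3, "sagte")])
    # The spoken version deviates: it has one extra word at the end.
    conn.executemany("INSERT INTO spoken_token VALUES (?, ?)",
                     [(1, "Der"), (2, "Minister"), (3, "sagte"), (4, "heute")])
    # Links exist only where written and spoken tokens correspond.
    conn.executemany("INSERT INTO token_link VALUES (?, ?)",
                     [(1, 1), (2, 2), (3, 3)])
    # Information status is attached to written tokens ...
    conn.executemany("INSERT INTO info_status VALUES (?, ?)", [(2, "given")])
    # ... while prosodic labels are attached to spoken tokens.
    conn.executemany("INSERT INTO pitch_accent VALUES (?, ?)",
                     [(2, "accent_x"), (4, "accent_y")])
    return conn


def accents_by_info_status(conn):
    """Pair information status labels with pitch accents via the link table."""
    query = """
        SELECT w.form, i.label, p.label
        FROM info_status AS i
        JOIN written_token AS w ON w.id = i.written_id
        JOIN token_link    AS l ON l.written_id = w.id
        JOIN pitch_accent  AS p ON p.spoken_id = l.spoken_id
        ORDER BY w.id
    """
    return conn.execute(query).fetchall()


def unlinked_spoken_tokens(conn):
    """List spoken tokens that have no counterpart in the written text."""
    query = """
        SELECT s.form
        FROM spoken_token AS s
        LEFT JOIN token_link AS l ON l.spoken_id = s.id
        WHERE l.spoken_id IS NULL
        ORDER BY s.id
    """
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    db = build_demo_db()
    print(accents_by_info_status(db))
    print(unlinked_spoken_tokens(db))
```

In this sketch neither primary data set is altered to match the other: the deviation stays visible as a spoken token without a link, and a query across annotation layers only returns results where the link table connects the two versions.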

Corresponding author

Correspondence to Kerstin Eckart.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Eckart, K., Riester, A., Schweitzer, K. (2012). A Discourse Information Radio News Database for Linguistic Analysis. In: Chiarcos, C., Nordhoff, S., Hellmann, S. (eds) Linked Data in Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28249-2_7

  • DOI: https://doi.org/10.1007/978-3-642-28249-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28248-5

  • Online ISBN: 978-3-642-28249-2

  • eBook Packages: Computer Science (R0)