Abstract
The Turkish Discourse Bank (TDB) is a resource of approximately 400,000 words in its current release, in which explicit discourse connectives and phrasal expressions are annotated along with the textual spans they relate. The corpus was annotated using a semiautomatic annotation tool. We expect that it will enable researchers to study aspects of language beyond the sentence level. The TDB follows the Penn Discourse TreeBank (PDTB) in adopting a connective-based annotation of discourse. The connectives are considered heads of the annotated discourse relations. We have so far found only applicative structures in Turkish discourse; unlike syntactic heads, discourse connectives seem to have no need for composition. Interleaved text spans of arguments appear to be only apparently crossing and to be related to information structure.
Notes
1. The tools and the data are available to researchers free of charge by applying to the TDB research team through medid.ii.metu.edu.tr (Accessed Sept. 14, 2017).
2. The connectives that gave low κ values are amaçla “for this purpose”, ayrıca “in addition/separately”, dolayısıyla “in consequence of”, fakat “but”, oysa “however”, rağmen “despite/despite this”, tersine “in contrast”, and yandan “on the one hand/on the other hand”.
3. Of the 7,486 ve “and” tokens in the TDB, 2,111 are annotated as discourse connectives.
4. The text spans on which the annotators disagreed are rendered in both italics and boldface.
5. Here we follow Forbes-Riley et al. (2006), who argue that discourse adverbials and other connectives such as coordinating and subordinating conjunctions differ in how they take their arguments. Discourse adverbials take only their second argument structurally, their first argument being anaphoric. Other kinds of discourse connectives take both of their arguments structurally. Thus, we use the term “structural discourse connective” for coordinating and subordinating conjunctions, and “anaphoric discourse connective” for discourse adverbials as well as expressions that contain a deictic anaphor (i.e., phrasal expressions).
6. Indices show argument-taking.
References
Aktaş B, Bozşahin C, Zeyrek D (2010) Discourse relation configurations in Turkish and an annotation environment. In: Proceedings of the linguistic annotation workshop, Uppsala, pp 202–206
Asher N (1993) Reference to abstract objects in discourse. Kluwer Academic Publishers, Dordrecht
Demirşahin I, Zeyrek D (2017) Pair annotation as a novel annotation procedure: the case of Turkish Discourse Bank. In: Pustejovsky J, Ide N (eds) Handbook of linguistic annotation. Springer, Dordrecht
Demirşahin I, Yalçınkaya İ, Zeyrek D (2012) Pair annotation: adaption of pair programming to corpus annotation. In: Proceedings of the linguistic annotation workshop, Jeju, pp 31–39
Demirşahin I, Öztürel A, Bozşahin C, Zeyrek D (2013) Applicative structures and immediate discourse in the Turkish Discourse Bank. In: Proceedings of the linguistic annotation workshop, Sofia, pp 122–130
Egg M, Redeker G (2008) Underspecified discourse representation. In: Benz A, Kühnlein P (eds) Constraints in discourse. John Benjamins, Amsterdam, pp 117–138
Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378
Forbes-Riley K, Webber B, Joshi A (2006) Computing discourse semantics: the predicate-argument semantics of discourse connectives in D-LTAG. J Semant 23(1):55–106
Hobbs JR (1985) On the coherence and structure of discourse. Tech. Rep. CSLI-85-37, CSLI, Stanford, CA
Joshi A (2011) Some aspects of transition from sentence to discourse. Keynote address, Informatics Science Festival, Middle East Technical University
Lee A, Prasad R, Joshi A, Dinesh N, Webber B (2006) Complexity of dependencies in discourse: are dependencies in discourse more complex than in syntax? In: Proceedings of the international workshop on treebanks and linguistic theories, Prague
Lee A, Prasad R, Joshi A, Webber B (2008) Departures from tree structures in discourse: shared arguments in the Penn Discourse Treebank. In: Proceedings of the third workshop on constraints in discourse, Potsdam
Mann WC, Thompson SA (1988) Rhetorical structure theory: toward a functional theory of text organization. Text 8(3):243–281
Nakatsu C, White M (2010) Generating with discourse combinatory categorial grammar. Linguist Issues Lang Technol 4(1):1–62
Polanyi L (1988) A formal model of the structure of discourse. J Pragmat 12(5):601–638
Prasad R, Webber BL, Joshi A (2014) Reflections on the Penn Discourse TreeBank, comparable corpora, and complementary annotation. Comput Linguist 40(4):921–950
Say B, Zeyrek D, Oflazer K, Özge U (2004) Development of a corpus and a treebank for present-day written Turkish. In: Proceedings of the international conference on Turkish Linguistics, Magosa, TRNC, pp 183–192
Shieber S (1985) Evidence against the context-freeness of natural language. Linguist Philos 8:333–343
Tın E, Akman V (1994) Situated processing of pronominal anaphora. In: Proceedings of the Konferenz, Verarbeitung natürlicher Sprache, Vienna, pp 369–378
Tüfekçi P, Kılıçaslan Y (2005) A computational model for resolving pronominal anaphora in Turkish using Hobbs’ naïve algorithm. Int J Comput Intell 2(1):71–75
Tüfekçi P, Küçük D, Yöndem MT, Kılıçaslan Y (2007) Comparison of a syntax-based and a knowledge-poor pronoun resolution systems for Turkish. Poster presented at international symposium on computer and information sciences (ISCIS)
Webber B (2004) D-LTAG: Extending lexicalized TAG to discourse. Cognit Sci 28(5):751–779
Williams L, Kessler RR, Cunningham W, Jeffries R (2000) Strengthening the case for pair programming. IEEE Softw 17(4):19–25
Wolf F, Gibson E (2004) Representing discourse coherence: a corpus-based analysis. In: Proceedings of COLING, Geneva, pp 134–140
Wolf F, Gibson E (2005) Representing discourse coherence: a corpus-based study. Comput Linguist 31(2):249–287
Yıldırım S, Kılıçaslan Y, Aykaç RE (2004) A computational model for anaphora resolution in Turkish via centering theory: an initial approach. In: Proceedings of the international conference on computational intelligence, Istanbul, pp 124–128
Yüksel Ö, Bozşahin C (2002) Contextually appropriate reference generation. Nat Lang Eng 8(1):69–89
Zeyrek D, Webber BL (2008) A discourse resource for Turkish: annotating discourse connectives in the METU corpus. In: Proceedings of the workshop on Asian language resources, Hyderabad, pp 65–72
Zeyrek D, Turan ÜD, Bozşahin C, Çakıcı R, Sevdik-Çallı A, Demirşahin I, Aktaş B, Yalçınkaya İ, Ögel H (2009) Annotating subordinators in the Turkish Discourse Bank. In: Proceedings of the 3rd linguistic annotation workshop, Singapore, pp 44–47
Zeyrek D, Demirşahin I, Sevdik-Çallı A, Balaban HÖ, Yalçınkaya İ, Turan ÜD (2010) The annotation scheme of the Turkish Discourse Bank and an evaluation of inconsistent annotations. In: Proceedings of the 4th linguistic annotation workshop, Uppsala, pp 282–289
Zeyrek D, Demirşahin I, Sevdik-Çallı A, Çakıcı R (2013) Turkish Discourse Bank: porting a discourse annotation style to a morphologically rich language. Dialogue Discourse 4(2):174–184
Zeyrek D, Kurfalı M (2017) TDB 1.1: Extensions on Turkish Discourse Bank. In: Proceedings of the 11th linguistic annotation workshop, Valencia, pp 76–81
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Zeyrek, D., Demirşahin, I., Bozşahin, C. (2018). Turkish Discourse Bank: Connectives and Their Configurations. In: Oflazer, K., Saraçlar, M. (eds) Turkish Natural Language Processing. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-90165-7_16
DOI: https://doi.org/10.1007/978-3-319-90165-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90163-3
Online ISBN: 978-3-319-90165-7
eBook Packages: Computer Science (R0)