Syntactic complexity and ambiguity resolution in a free word order language: Behavioral and electrophysiological evidences from Basque

https://doi.org/10.1016/j.bandl.2008.12.003

Abstract

In natural languages some syntactic structures are simpler than others. Syntactically complex structures require further computation that is not required by syntactically simple structures. In particular, canonical, basic word order represents the simplest sentence-structure. Natural languages have different canonical word orders, and they vary in the degree of word order freedom they allow. In the case of free word order, whether canonical word order plays any role in processing is still unclear. In this paper, we present behavioral and electrophysiological evidence that simpler, canonical word order preference is found even in a free word order language.

Canonical and derived structures were compared in two self-paced reading experiments and one ERP experiment. Non-canonical sentences required further syntactic computation in Basque: they showed longer reading times and a modulation of anterior negativities and P600 components, providing evidence that even in free word order, case-marking grammars, underlying canonical word order can play a relevant role in sentence processing. These findings could signal universal processing mechanisms because similar processing patterns are found in typologically very distant grammars.

We also provide evidence from syntactically fully ambiguous sequences. Our results on ambiguity resolution showed that fully ambiguous sequences were processed as canonical sentences. Moreover, when fully ambiguous sequences were forced toward the complex interpretation by the participants' world knowledge, a frontal negativity distinguished simple and complex ambiguous sequences. Thus the preference for simple structures is presumably a universal design property of language processing, despite differences in the parametric specification of a given grammar.

Introduction

In recent years, a rapidly growing body of experimental studies using neuroimaging techniques has explored syntactic processing of natural language. As a result, findings from linguistics and neuroscience are progressively converging and becoming more relevant to each other. However, the vast majority of language neuroimaging studies focus on rather similar languages, such as English, Italian, French, German, Spanish or Dutch. These languages belong to the Indo-European family, and share many central design properties (Baker, 2001). In linguistics, a significant expansion of the language pool investigated was crucial to uncover the interplay between universal and variable aspects of the language faculty (Chomsky, 1981, Greenberg, 1963). It is to be expected, therefore, that the exploration of language in the brain will also benefit from cross-linguistic research, so that we can differentiate language-particular processing strategies from universal language processing mechanisms. In order to discriminate between the two, it is necessary to conduct studies and gather evidence from a wide array of languages belonging to various typological groups (Bates et al., 2001, Bornkessel and Schlesewsky, 2006).

The present study attempts to broaden the empirical basis of experimental studies, and does so by presenting and discussing a series of behavioral and electrophysiological experiments in Basque, a free word order, case-marking and ergative language (De Rijk, 2007, Hualde and Ortiz de Urbina, 2003, Laka, 1996). From the large array of features of Basque that could be relevant for the emerging field of neurocognition of language, this paper focuses on two aspects: the relevance of underlying, canonical word order in sentence processing and its role in sentence-ambiguity resolution.

Languages vary with respect to word order freedom; some have a very fixed word order, like English, others allow a greater variation of word order, like Spanish, and still others allow for almost all possible combinations of phrases in a sentence, like Basque (Baker, 2001). Despite this variation, it has been argued in linguistics that grammars have an underlying, canonical word order (Chomsky, 1981, Greenberg, 1963), which surfaces in a declarative sentence that initiates discourse, that is, a sentence where no constituent is focalized and where the entire event constitutes new information (Lambrecht, 1994). Canonical word order thus reflects the simplest phrase sequence generated by the grammar. Most human languages, independently of the degree of word order freedom, group into two main types regarding canonical word order (Greenberg, 1963): Subject–Verb–Object (SVO) languages, such as English or Spanish, and Subject–Object–Verb (SOV) languages, such as Japanese or Basque (these two main types are also known as head-first and head-last languages within the Principles and Parameters model, Baker, 2001, Chomsky, 1981).

A wide array of experiments have explored the asymmetries between subjects and objects in relative clauses (e.g., Gibson, 1998, King and Kutas, 1995, Müller et al., 1997, Traxler et al., 2002), in interrogative sentences (e.g., Ben-Shachar et al., 2004, Bornkessel et al., 2004, Felser et al., 2003, Fiebach et al., 2002) and in declarative sentences (e.g., Bahlmann et al., 2007, Hagiwara et al., 2007, Matzke et al., 2002, Rösler et al., 1998, Schlesewsky et al., 2003). Most of these studies found that Object-before-Subject structures require higher processing effort than Subject-before-Object structures. For instance, subject relative clauses (The reporter [who attacked the senator] admitted the error) and subject interrogative clauses (Thomas asks himself [who called the doctor]) are easier to process than object relatives (The reporter [who the senator attacked] admitted the error) and object interrogatives (Thomas asks himself [who the doctor called]). These studies have proposed that the larger processing effort observed might be due either to the syntactic complexity of the transformations or to the load on working memory (see Fiebach, Schlesewsky, Lohmann, von Cramon, & Friederici, 2005).

However, Traxler et al. (2002) showed that the processing cost of object relatives with respect to subject relatives could be reduced by manipulating the semantic properties of the subjects and objects. For example, this reduction has been accomplished by manipulating the animacy of the subjects and the objects of subject relatives and object relatives (Mak, Vonk, & Schriefers, 2006), by manipulating the argument structure of the verbs (Bornkessel et al., 2005, Schlesewsky and Bornkessel, 2006), or by comparing pronominal subjects and objects (Schlesewsky et al., 2003). These results could not be explained by the transformation-based framework or by the working memory framework. Bornkessel et al. (2005) argued that the higher processing effort of the sentences could be related to the syntactic–semantic interface. These authors suggested that the linearization of the constituents’ thematic roles in a given sentence could vary the processing effort independently of, or jointly with, the syntactic structure of the sentence (Bornkessel et al., 2005).

Regarding the signs of processing effort, behaviorally, easier processing correlates with shorter reading times and/or lower error rates (see Sekerina, 2003, for an overview). Electrophysiological studies, on the other hand, showed a variety of results. For instance, Object interrogative clauses showed a left anterior negativity (LAN) and P600 pattern compared with Subject interrogatives (Felser et al., 2003, Fiebach et al., 2002). This pattern has been explained as reflecting the storage and the integration of displaced elements, respectively (Felser et al., 2003, Fiebach et al., 2002). On the other hand, Bornkessel et al. (2004) showed a sustained negativity in the Object interrogatives and an N400 in the object-second position of Subject interrogatives. This N400 effect could be explained by appealing to the Minimality-driven processing strategy (Bornkessel & Schlesewsky, 2006), which predicts that subject-initial sentences are considered intransitive in German, so that processing the unpredicted object generates an N400 related to semantic integration. Thus, in some ERP studies the subject/object asymmetries in non-declarative sentences are considered to be syntactic processes because the LAN and the P600 components are related to syntactic processing (Felser et al., 2003, Fiebach et al., 2002). In other studies, the N400 component related the subject/object asymmetries to the processing of argument structure (Bornkessel et al., 2004).

In declarative sentences, behavioral studies have shown that sentence processing is sensitive to word order. In most languages canonical word order is processed faster and with greater ease (see Sekerina, 2003, for an overview). However, in a scrambling language like Japanese, the results of behavioral studies do not converge. While some self-paced reading studies reported no differences between SOV and OSV word orders in Japanese (Tamaoka et al., 2003, Yamashita, 1997), other studies showed that OSV word orders required higher processing demands (Mazuka et al., 2002, Miyamoto and Takahashi, 2002).

ERP studies have investigated word order processing in sentences in which no interrogative or relative marker is heading the clause. Most of these studies were carried out in German (Bornkessel et al., 2002, Matzke et al., 2002, Rösler et al., 1998, Schlesewsky et al., 2003). German is not a fixed word order language: it allows SVO word order as in the sentence “der Mann sah den Jungen” (the man – SUBJECT saw the boy – OBJECT) but it also allows other word orders like OVS, as in “den Jungen sah der Mann” (the boy – OBJECT saw the man – SUBJECT). Rösler et al. (1998) and Matzke et al. (2002) observed a LAN component at object-first position, which has been attributed to the storage cost of the displaced Object-phrase. The authors argued that the displaced Object-phrase must wait until its canonical position is reached for interpretation. This result agrees with previous studies in which differences in word order using relative or interrogative clauses were explored. In these studies, the LAN component has been attributed to syntactic working memory storage (Felser et al., 2003, Fiebach et al., 2002, King and Kutas, 1995, Kluender and Kutas, 1993, Müller et al., 1997, Münte et al., 1998). In a recent functional magnetic resonance imaging (fMRI) study using the same type of sentences as Matzke et al. (2002), Bahlmann et al. (2007) reported that non-canonical sentences elicited larger activation in the left inferior frontal gyrus. This result confirmed the idea that non-canonical sentences require a larger integration cost and make greater demands on working memory. These results in turn also converge with linguistic accounts of German syntax, where object-initial main sentences involve displacing the object from its canonical position to a higher place in the sentence-structure (Schwartz & Vikner, 1996). On the other hand, comparing sentence-initial Accusative objects with Nominative subjects, Bornkessel et al. (2002) showed a central negativity between 300 and 450 ms, but no differences were observed when comparing sentence-initial Dative objects with Nominative subjects. These results converged with the results of an fMRI study in which verb argument demands required dative objects (Bornkessel et al., 2005), demonstrating that syntactic complexity can be modulated by means of the semantic properties of the verbs (active vs. object-experiencer) and the semantic properties of nouns (animate vs. inanimate).

In contrast to these previous ERP studies on German, a recent study in Japanese did not replicate the presence of a negative component in object-first position of non-canonical declarative sentences (Hagiwara et al., 2007). In Japanese, canonical word order is SOV as in the sentence “hishoga bengoshio sagasiteiru” (the secretary – SUBJECT the lawyer – OBJECT was looking for). However, it also allows other verb-final word orders, even to a greater degree than German, as for example in the sentence “bengoshio hishoga sagasiteiru” (the lawyer – OBJECT the secretary – SUBJECT was looking for). Hagiwara et al. (2007) interpreted the lack of a LAN in object-first sentences as an indication that the non-canonical OSV structure is not complex enough to elicit an ERP component at Object-position. Nevertheless, when the Object-phrase was displaced from an embedded clause, at a greater distance, the complexity increased and the LAN component was observed at object-first position.

Moving along the constituents in the sentence, Matzke et al. (2002) reported a LAN effect in subject position of OVS non-canonical German sentences. Rösler et al. (1998), using German verb-final sentences, also reported a LAN-like component in sentence-second position when comparing non-canonical Objects (S–O–IO–V) to canonical Indirect Objects (S–IO–O–V). Finally, Bornkessel et al. (2002) reported an early positivity (between 300 and 400 ms) at subject-second position in Dative Object initial sentences. In Japanese, the S in an OSV sentence (compared to the O of SOV) showed a Frontal Negativity in the right hemisphere (Hagiwara et al., 2007).

Basque (or Euskara, as it is known to its speakers) is an isolate language; it does not belong to any known language family. It is a free word order language; nearly all constituent combinations yield a grammatical sentence, as shown partially in (1):

Despite this freedom in sentence word order, Basque grammar has been argued to be of the SOV type (De Rijk, 1969, Greenberg, 1963), because it has most correlated properties of this type: relative clauses and genitives are prenominal (2a, b), it has postpositions and suffixes instead of prepositions (1, 2b), determiners follow the noun (1), (2a, b), and inflected auxiliaries follow the verb (1), (2a):

Given these properties, linguists consider Basque to be a Head-final language (Ortiz de Urbina, 1989, Laka, 1994, Baker, 2001). Basque is a case-marking language (1), (3), and the verb agrees with the subject, the object and the dative, if contained in the sentence (3a, b).

Basque is a three-way pro-drop language (Laka, 1996, Ortiz de Urbina, 1989): subjects, objects and datives can be phonologically unrealized:

Basque is also an ergative language (Dixon, 1994, Levin, 1983, Ortiz de Urbina, 1989); this means that subjects of intransitive clauses (SINTR) and objects of transitive clauses (O) are morphologically identical, and bear no overt case ending, while agentive subjects of transitive clauses (S) are morphologically distinct, and carry an ergative case marker (5):

Given this combination of grammatical features, if the constituent gizona “the man” is encountered at the beginning of an utterance, the following possibilities arise: (i) it is the S(ubject) of an SV intransitive sentence like (5a); (ii) it is the O(bject) of an OSV sentence like (1b,c,h,i); and (iii) it is the O of an OV transitive sentence where S has been pro-dropped as in (4).

Basque Noun-Phrase morphology presents an interesting homomorphism that has been exploited in this study to explore structural ambiguity resolution. As shown in (6a, b), the form of the plural determiner for either SINTR or O is -ak, and as shown in (6b) the combination of the singular determiner -a plus the ergative marker -k yields a sequence -ak that is homophonous with the plural determiner. Consequently, given the free word order property, the same sequence of sounds could be interpreted as a canonical SOV sentence (6b) or as a non-canonical OSV sentence (6c).

In the absence of any other means to disambiguate the sentence, both (6b) and (6c) are possible parsings.
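The -ak ambiguity described above can be made concrete with a small sketch. It is only a toy illustration, not the materials or analysis used in the experiments: the suffix table covers just the forms mentioned in the text (-a, plural -ak, and singular -a plus ergative -k), verb agreement is ignored, and the stem emakume ("woman") as well as the names `Analysis`, `analyses` and `word_orders` are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Analysis:
    number: str  # "singular" or "plural"
    case: str    # "absolutive" (no overt ending) or "ergative" (-k)


# Toy suffix table, restricted to the forms discussed in the text.
SUFFIX_READINGS = {
    "a": [Analysis("singular", "absolutive")],
    "ak": [
        Analysis("plural", "absolutive"),   # plural determiner -ak
        Analysis("singular", "ergative"),   # singular -a + ergative -k
    ],
}


def analyses(word, stem):
    """Return every reading of `word` that the toy table allows."""
    if not word.startswith(stem):
        return []
    return SUFFIX_READINGS.get(word[len(stem):], [])


def word_orders(np1, np2):
    """Possible orders for two (word, stem) noun phrases before a transitive verb."""
    orders = set()
    for first, second in product(analyses(*np1), analyses(*np2)):
        if first.case == "ergative" and second.case == "absolutive":
            orders.add("SOV")
        elif first.case == "absolutive" and second.case == "ergative":
            orders.add("OSV")
    return sorted(orders)


# Both noun phrases end in -ak: the sequence is fully ambiguous.
print(word_orders(("gizonak", "gizon"), ("emakumeak", "emakume")))
# Only the first noun phrase ends in -ak: one reading survives.
print(word_orders(("gizonak", "gizon"), ("emakumea", "emakume")))
```

Under these assumptions, a sequence in which both noun phrases end in -ak receives both an SOV and an OSV reading, whereas replacing either one with an -a form leaves a single reading.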

Ambiguity is a pervasive phenomenon in all dimensions of natural language (semantics, phonology, morphology, syntax), and it has been extensively explored in language processing at least since Bever (1970). Frazier and Fodor (1978) assumed that when confronted with a syntactically ambiguous structure, the simplest one is favored. If true, this predicts that canonical word order will be the preferred interpretation given an ambiguous input. Matzke et al. (2002) compared temporarily ambiguous SVO sentences (e.g., die Frau hatte den Mann gesehen; lit.: the woman – AMBIGUOUS had the man – OBJECT seen) to temporarily ambiguous OVS sentences (e.g., die Frau hatte der Mann gesehen; lit.: the woman – AMBIGUOUS had the man – SUBJECT seen). Notice that in both sentences the initial feminine noun phrases are compatible with either a Subject-first or an Object-first interpretation due to the morphological ambiguity of the determiner “die”. The ERP results showed a P600 component at the disambiguation point only for OVS sentences. The authors interpreted this effect as a reanalysis of syntactic structure after an initial subject-first interpretation. Matzke et al. (2002) did not find any difference between unambiguous and ambiguous SVO sentences at the beginning of the sentence, suggesting a general subject-first processing strategy for German (Bates et al., 1988, Schlesewsky et al., 2000). However, using similar materials, Frisch, Schlesewsky, Saddy, and Alpermann (2002) found a P600 component for subject-initial temporarily ambiguous sentences, which they interpreted as an indicator of syntactic ambiguity. Similarly, comparing unambiguously marked subject-initial interrogatives with questions starting with an ambiguous marker, Bornkessel et al. (2004) found that high span participants generated a sustained anterior negativity while low span participants generated a sustained parietal positivity.

The present study was designed to investigate word order effects and ambiguity resolution in Basque, using behavioral (self-paced reading) and electrophysiological techniques (ERPs). We carried out three different experiments: The purpose of Experiment 1 was to measure reading time patterns for SOV versus OSV sentences (see Table 1) in order to determine the canonical processing strategy of Basque speakers. In Experiment 2 we aimed at determining whether fully ambiguous sentences (see Table 1) are processed by means of a default subject-first strategy (Bates et al., 1988). Finally, Experiment 3 aimed to extend and to evaluate the previous word order effects and the world knowledge guided ambiguity resolution effects using ERPs.

Section snippets

Experiment 1

The main objective of this experiment was to find out whether native speakers of Basque process SOV and OSV sentences differently. According to the majority of syntactic analyses (De Rijk, 2007, Hualde and Ortiz de Urbina, 2003), SOV is the canonical word order, and OSV is derived (see examples of Experiment 1 in Table 1). Using a self-paced reading paradigm, reading times for each word and for the whole sentence were studied. We expected sentences featuring an OSV word order to require longer

Experiment 2

Given our previous results, which suggest that the processing cost of OSV sentences is greater than that of SOV sentences, we wanted to further explore the processing of fully ambiguous sentences. In particular, we wanted to determine whether fully ambiguous sentences were processed by resorting to a default Subject-first strategy, a processing strategy that has been thoroughly reported in the literature for many languages (Bates et al., 1988).

In this experiment we presented three types

Experiment 3

Using ERPs we aimed to obtain electrophysiological evidence of the syntactic complexity observed in previously presented behavioral Experiments 1 and 2. On the one hand, we investigated whether the Object of non-canonical OSV sentences, in comparison with the Subject of canonical SOV sentences (respectively, example 9b and 9a in Table 1) generated any electrophysiological response as it has been previously observed in German (Matzke et al., 2002, Rösler et al., 1998) or not, as in Japanese (

Discussion

In the last ERP experiment we investigated the electrophysiological correlates of canonical (SOV) and non-canonical word orders (OSV) in ambiguous and unambiguous syntactic contexts. The main results confirmed the evidence obtained in the previous behavioral experiments with regard to the higher processing cost derived from the higher syntactic complexity of non-canonical Object-first sentences (Experiment 1), and with regard to the existence of a default subject-first processing mechanism

Conclusions

Our study shows that certain previous findings in ERP language-processing studies observed in head initial, nominative and relatively fixed word order languages are also observed in a head final, ergative, free word order and highly inflected language like Basque. This strongly suggests that these findings signal universal processing mechanisms, independent of parametric specifications of the grammar. We have also found strong evidence for a canonical word order processing

Acknowledgments

This research has been supported by research grants of the European Science Foundation (BFF2002-20379-E), the Spanish Ministry of Education and Science (CSD2007-00012, SEJ2007-60751/PSIC) and the Basque Government (GIU06/52) to Itziar Laka and Kepa Erdozia, a predoctoral grant from the University of the Basque Country to Kepa Erdozia, the Spanish Government (SEJ2005-06067/PSIC) to Antoni Rodriguez-Fornells and a predoctoral grant from the Spanish Government to Anna Mestres-Missé. We want to

References (79)

  • O. Hauk et al.

    Effects of word length and frequency on the human event-related potential

    Clinical Neurophysiology

    (2004)
  • R. Linares et al.

    Stem allomorphy in the Spanish mental lexicon: Evidence from behavioral and ERP experiments

    Brain and Language

    (2006)
  • W. Mak et al.

    Animacy in processing relative clauses: The hikers that rocks crush

    Journal of Memory and Language

    (2006)
  • M. Matzke et al.

    The cost of freedom: An ERP-study of non-canonical sentences

    Clinical Neurophysiology

    (2002)
  • B. Mohr et al.

    Lexical decision after left, right, and bilateral presentation of function words, content words, and non-words: Evidence for interhemispheric interaction

    Neuropsychologia

    (1994)
  • H. Müller et al.

    Event-related potentials elicited by spoken relative clauses

    Cognitive Brain Research

    (1997)
  • T.F. Münte et al.

    Brain potentials and syntactic violations revisited: No evidence for specificity of the syntactic positive shift

    Neuropsychologia

    (1998)
  • A. Newman et al.

    An ERP study of regular and irregular English past tense inflection

    NeuroImage

    (2007)
  • R.C. Oldfield

    The assessment and analysis of handedness: The Edinburgh inventory

    Neuropsychologia

    (1971)
  • L. Osterhout et al.

    Event-related brain potentials elicited by syntactic anomaly

    Journal of Memory and Language

    (1992)
  • L. Osterhout et al.

    Brain potentials elicited by words: Word length and frequency predict the latency of an early negativity

    Biological Psychology

    (1997)
  • F. Rösler et al.

    Parsing of sentences in a language with varying word order: word-by-word variations of processing demands are revealed by event-related potentials

    Journal of Memory and Language

    (1998)
  • M. Schlesewsky et al.

    The neurophysiological basis of word order variations in German

    Brain and Language

    (2003)
  • M. Schlesewsky et al.

    Context-sensitive neural responses to conflict resolution: Electrophysiological evidence from subject-object ambiguities in language comprehension

    Brain Research

    (2006)
  • M. Traxler et al.

    Processing subject and object relative clauses: Evidence from eye movements

    Journal of Memory and Language

    (2002)
  • Arregi, K. (2002). Focus on Basque movements. Ph.D. Dissertation....
  • R. Assadollahi et al.

    Neuromagnetic evidence for early access to cognitive representations

    Neuroreport

    (2001)
  • J. Bahlmann et al.

    An fMRI study of canonical and noncanonical word order in German

    Human Brain Mapping

    (2007)
  • M.C. Baker

    Atoms of Language: The mind’s hidden rules of grammar

    (2001)
  • E. Bates et al.

    Psycholinguistics: A cross-language perspective

    Annual Review of Psychology

    (2001)
  • T.G. Bever

    The cognitive basis for linguistic structures

  • I. Bornkessel et al.

    The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages

    Psychological Review

    (2006)
  • T.S. Braver et al.

    Functional neuroimaging of executive functions

  • C.M. Brown et al.

    Electrophysiological signatures of visual lexical processing: Open- and closed-class words

    Journal of Cognitive Neuroscience

    (1999)
  • N. Chomsky

    Lectures on government and binding

    (1981)
  • S. Coulson et al.

    Expect the unexpected: Event-related brain response to morphosyntactic violations

    Language and Cognitive Processes

    (1998)
  • C.J. Davis

    N-Watch: A program for deriving neighborhood size and other psycholinguistic statistics

    Behavior Research Methods

    (2005)
  • R. De Rijk

    Is Basque an S.O.V. language?

    Fontes Linguae Vasconum

    (1969)
  • R. De Rijk

    Standard Basque, a progressive grammar

    (2007)