Syntactic complexity and ambiguity resolution in a free word order language: Behavioral and electrophysiological evidences from Basque
Introduction
In recent years, a rapidly growing body of experimental studies using neuroimaging techniques has explored syntactic processing of natural language. As a result, findings from linguistics and neuroscience are progressively reaching increasing levels of convergence and reciprocal relevance. However, the vast majority of language neuroimaging studies focus on rather similar languages, such as English, Italian, French, German, Spanish or Dutch. These languages belong to the Indo-European family, and share many central design properties (Baker, 2001). In linguistics, a significant expansion of the language pool investigated was crucial to uncover the interplay between universal and variable aspects of the language faculty (Chomsky, 1981, Greenberg, 1963). It is to be expected, therefore, that the exploration of language in the brain will also benefit from cross-linguistic research, so that we can differentiate language-particular processing strategies from universal language processing mechanisms. In order to discriminate between the two, it is necessary to conduct studies and gather evidence from a wide array of languages pertaining to various typological groups (Bates et al., 2001, Bornkessel and Schlesewsky, 2006).
The present study attempts to broaden the empirical basis of experimental studies, and it does so by presenting and discussing a series of behavioral and electrophysiological experiments in Basque, a free word order, case-marking and ergative language (De Rijk, 2007, Hualde and Ortiz de Urbina, 2003, Laka, 1996). From the large array of features of Basque that could be relevant for the emerging field of neurocognition of language, this paper focuses on two aspects: the relevance of underlying, canonical word order in sentence processing and its role in sentence-ambiguity resolution.
Languages vary with respect to word order freedom; some have a very fixed word order, like English, others allow a greater variation of word order, like Spanish, and still others allow for almost all possible combinations of phrases in a sentence, like Basque (Baker, 2001). Despite this variation, it has been argued in linguistics that grammars have an underlying, canonical word order (Chomsky, 1981, Greenberg, 1963), which surfaces in a declarative sentence that initiates discourse, that is, a sentence where no constituent is focalized and where the entire event constitutes new information (Lambrecht, 1994). Canonical word order thus reflects the simplest phrase sequence generated by the grammar. Most human languages, independently of the degree of word order freedom, group into two main types regarding canonical word order (Greenberg, 1963): Subject–Verb–Object (SVO) languages, such as English or Spanish, and Subject–Object–Verb (SOV) languages, such as Japanese or Basque (these two main types are also known as head-first and head-last languages within the Principles and Parameters model, Baker, 2001, Chomsky, 1981).
A wide array of experiments has explored the asymmetries between subjects and objects in relative clauses (e.g., Gibson, 1998, King and Kutas, 1995, Müller et al., 1997, Traxler et al., 2002), in interrogative sentences (e.g., Ben-Shachar et al., 2004, Bornkessel et al., 2004, Felser et al., 2003, Fiebach et al., 2002) and in declarative sentences (e.g., Bahlmann et al., 2007, Hagiwara et al., 2007, Matzke et al., 2002, Rösler et al., 1998, Schlesewsky et al., 2003). Most of these studies found that Object-before-Subject structures require higher processing effort than Subject-before-Object structures. For instance, subject relative clauses (The reporter [who attacked the senator] admitted the error) and subject interrogative clauses (Thomas asks himself [who called the doctor]) are easier to process than object relatives (The reporter [who the senator attacked] admitted the error) and object interrogatives (Thomas asks himself [who the doctor called]). These studies have proposed that the larger processing effort observed might be due either to the syntactic complexity of the transformations or to the load on working memory (see Fiebach, Schlesewsky, Lohmann, von Cramon, & Friederici, 2005).
However, Traxler et al. (2002) showed that the processing cost of object relatives with respect to subject relatives could be reduced by manipulating the semantic properties of the subjects and objects. For example, this reduction has been accomplished by manipulating the animacy of the subjects and objects of subject and object relatives (Mak, Vonk, & Schriefers, 2006), by manipulating the argument structure of the verbs (Bornkessel et al., 2005, Schlesewsky and Bornkessel, 2006), or by comparing pronominal subjects and objects (Schlesewsky et al., 2003). These results could not be explained by the transformation-based framework or by the working memory framework. Bornkessel et al. (2005) argued that the higher processing effort of the sentences could be related to the syntactic–semantic interface. These authors suggested that the linearization of the constituents’ thematic roles in a given sentence could modulate processing effort independently of, or jointly with, the syntactic structure of the sentence (Bornkessel et al., 2005).
Regarding the signs of processing effort, behaviorally, easier processing correlates with shorter reading times and/or lower error rates (see Sekerina, 2003, for an overview). Electrophysiological studies, on the other hand, have shown a variety of results. For instance, Object interrogative clauses showed a left anterior negativity (LAN) and P600 pattern compared with Subject interrogatives (Felser et al., 2003, Fiebach et al., 2002). This pattern has been explained as reflecting the storage and the integration of displaced elements, respectively (Felser et al., 2003, Fiebach et al., 2002). On the other hand, Bornkessel et al. (2004) showed a sustained negativity in Object interrogatives and an N400 at the object-second position of Subject interrogatives. This N400 effect could be explained by appealing to the Minimality-driven processing strategy (Bornkessel & Schlesewsky, 2006), which predicts that subject-initial sentences are initially considered intransitive in German, so that processing the unpredicted object generates an N400 related to semantic integration. Thus, in some ERP studies the subject/object asymmetries in non-declarative sentences are considered to reflect syntactic processes, because the LAN and the P600 components are related to syntactic processing (Felser et al., 2003, Fiebach et al., 2002). In other studies, the N400 component related the subject/object asymmetries to the processing of argument structure (Bornkessel et al., 2004).
In declarative sentences, behavioral studies have shown that sentence processing is sensitive to word order. In most languages canonical word order is processed faster and with greater ease (see Sekerina, 2003, for an overview). However, in a scrambling language like Japanese, the results of behavioral studies do not converge. While some self-paced reading studies reported no differences between SOV and OSV word orders in Japanese (Tamaoka et al., 2003, Yamashita, 1997), other studies showed that OSV word orders required higher processing demands (Mazuka et al., 2002, Miyamoto and Takahashi, 2002).
ERP studies have investigated word order processing in sentences in which no interrogative or relative marker is heading the clause. Most of these studies were carried out in German (Bornkessel et al., 2002, Matzke et al., 2002, Rösler et al., 1998, Schlesewsky et al., 2003). German is not a fixed word order language: it allows SVO word order as in the sentence “der Mann sah den Jungen” (the man – SUBJECT saw the boy – OBJECT) but it also allows other word orders like OVS as in “den Jungen sah der Mann” (the boy – OBJECT saw the man – SUBJECT). The ERP studies of Rösler et al. (1998) and Matzke et al. (2002) observed a LAN component at object-first position, which has been attributed to the storage cost of the displaced Object-phrase. The authors argued that the displaced Object-phrase must wait until its canonical position is reached for interpretation. This result agrees with previous studies in which differences in word order using relative or interrogative clauses were explored. In these studies, the LAN component has been attributed to syntactic working memory storage (Felser et al., 2003, Fiebach et al., 2002, King and Kutas, 1995, Kluender and Kutas, 1993, Müller et al., 1997, Münte et al., 1998). In a recent functional magnetic resonance imaging (fMRI) study using the same type of sentences as Matzke et al. (2002), Bahlmann et al. (2007) reported that non-canonical sentences elicited larger activation in the left inferior frontal gyrus. This result confirmed the idea that non-canonical sentences require a larger integration cost and make greater demands on working memory. These results in turn also converge with linguistic accounts of German syntax, where object-initial main sentences involve displacing the object from its canonical position to a higher place in the sentence-structure (Schwartz & Vikner, 1996). On the other hand, comparing sentence-initial accusative objects with nominative subjects, Bornkessel et al. (2002) found a central negativity between 300 and 450 ms, whereas no differences were observed when comparing sentence-initial dative objects with nominative subjects. These results converged with the results of an fMRI study in which verb argument demands required dative objects (Bornkessel et al., 2005), demonstrating that syntactic complexity can be modulated by means of the semantic properties of the verbs (active vs. object-experiencer) and the semantic properties of nouns (animate vs. inanimate).
In contrast to these previous ERP studies on German, a recent study in Japanese did not replicate the presence of a negative component in object-first position of non-canonical declarative sentences (Hagiwara et al., 2007). In Japanese, canonical word order is SOV as in the sentence “hishoga bengoshio sagasiteiru” (the secretary – SUBJECT the lawyer – OBJECT was looking for). However, it also allows other verb-final word orders, even to a greater degree than German, as for example in the sentence “bengoshio hishoga sagasiteiru” (the lawyer – OBJECT the secretary – SUBJECT was looking for). Hagiwara et al. (2007) interpreted the lack of a LAN in object-first sentences as an indication that the non-canonical OSV structure is not complex enough to elicit an ERP component at Object-position. Nevertheless, when the Object-phrase was displaced from an embedded clause, at a greater distance, the complexity increased and the LAN component was observed at object-first position.
Moving along the constituents in the sentence, Matzke et al. (2002) reported a LAN effect in subject position of OVS non-canonical German sentences. Rösler et al. (1998), using German verb-final sentences, also reported a LAN-like component in sentence-second position when comparing non-canonical Objects (S–O–IO–V) to canonical Indirect Objects (S–IO–O–V). Finally, Bornkessel et al. (2002) reported an early positivity (between 300 and 400 ms) at subject-second position in Dative Object initial sentences. In Japanese, the S in an OSV sentence (compared to the O of SOV) showed a Frontal Negativity in the right hemisphere (Hagiwara et al., 2007).
Basque (or Euskara, as it is known to its speakers) is an isolate language; it does not belong to any known language family. It is a free word order language; nearly all constituent combinations yield a grammatical sentence, as shown partially in (1):
Despite this freedom in sentence word order, Basque grammar has been argued to be of the SOV type (De Rijk, 1969, Greenberg, 1963), because it has most correlated properties of this type: relative clauses and genitives are prenominal (2a, b), it has postpositions and suffixes instead of prepositions (1, 2b), determiners follow the noun (1), (2a, b), and inflected auxiliaries follow the verb (1), (2a):
Given these properties, linguists consider Basque to be a Head-final language (Ortiz de Urbina, 1989, Laka, 1994, Baker, 2001). Basque is a case-marking language (1), (3), and the verb agrees with the subject, the object and the dative, if contained in the sentence (3a, b).
Basque is a three-way pro-drop language (Laka, 1996, Ortiz de Urbina, 1989): subjects, objects and datives can be phonologically unrealized:
Basque is also an ergative language (Dixon, 1994, Levin, 1983, Ortiz de Urbina, 1989); this means that subjects of intransitive clauses (SINTR) and objects of transitive clauses (O) are morphologically identical, and bear no overt case ending, while agentive subjects of transitive clauses (S) are morphologically distinct, and carry an ergative case marker (5):
Given this combination of grammatical features, if the constituent gizona “the man” is encountered at the beginning of an utterance, the following possibilities arise: (i) it is the S(ubject) of an SV intransitive sentence like (5a); (ii) it is the O(bject) of an OSV sentence like (1b,c,h,i); and (iii) it is the O of an OV transitive sentence where S has been pro-dropped as in (4).
Basque Noun-Phrase morphology presents an interesting homomorphism that has been exploited in this study to explore structural ambiguity resolution. As shown in (6a, b), the form of the plural determiner for either SINTR or O is -ak, and as shown in (6b) the combination of the singular determiner -a plus the ergative marker -k yields a sequence -ak that is homophonous with the plural determiner. Consequently, given the free word order property, the same sequence of sounds could be interpreted as a canonical SOV sentence (6b) or as a non-canonical OSV sentence (6c).
In the absence of any other means to disambiguate the sentence, both (6b) and (6c) are possible parses.
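The -ak ambiguity described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the procedure used in the experiments: the suffix table covers only the two endings mentioned in the text, the noun forms gizonak and emakumeak/emakumea are illustrative inputs rather than the experimental materials, and the function names `np_readings` and `word_orders` are hypothetical. Verb agreement and world knowledge are deliberately ignored.

```python
from itertools import product

# Toy suffix table covering only the endings discussed in the text:
# -a  = singular determiner, no overt case ending
# -ak = plural determiner, OR singular determiner -a + ergative -k
SUFFIX_READINGS = {
    "ak": [("plural", "absolutive"), ("singular", "ergative")],
    "a": [("singular", "absolutive")],
}


def np_readings(form):
    """Return the (number, case) readings of a determiner-final noun phrase."""
    # Try the longest suffix first so that "-ak" is not mistaken for "-a".
    for suffix in sorted(SUFFIX_READINGS, key=len, reverse=True):
        if form.endswith(suffix):
            return SUFFIX_READINGS[suffix]
    return []


def word_orders(np1, np2):
    """Enumerate word-order parses of an 'NP1 NP2 Verb' string.

    Simplifying assumption: a transitive clause has exactly one
    ergative subject (S) and one caseless object (O).
    """
    parses = []
    for (_, case1), (_, case2) in product(np_readings(np1), np_readings(np2)):
        if case1 == "ergative" and case2 == "absolutive":
            parses.append("SOV")
        elif case1 == "absolutive" and case2 == "ergative":
            parses.append("OSV")
    return parses


# Two -ak phrases: both the SOV and the OSV parse survive.
print(word_orders("gizonak", "emakumeak"))
# Second phrase ends in -a: only the SOV parse survives.
print(word_orders("gizonak", "emakumea"))
```

With two -ak phrases the sketch returns both an SOV and an OSV parse, whereas a second phrase ending in -a leaves only SOV, which mirrors the contrast between fully ambiguous and unambiguous strings.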
Ambiguity is a pervasive phenomenon in all dimensions of natural language (semantics, phonology, morphology, syntax), and it has been extensively explored in language processing at least since Bever (1970). Frazier and Fodor (1978) assumed that when confronted with a syntactically ambiguous structure, the simplest one is favored. If true, this predicts that canonical word order will be the preferred interpretation given an ambiguous input. Matzke et al. (2002) compared temporarily ambiguous SVO sentences (e.g., die Frau hatte den Mann gesehen; lit.: the woman – AMBIGUOUS had the man – OBJECT seen) to temporarily ambiguous OVS sentences (e.g., die Frau hatte der Mann gesehen; lit.: the woman – AMBIGUOUS had the man – SUBJECT seen). Notice that in both sentences the initial feminine noun phrases are compatible with either a Subject-first or an Object-first interpretation due to the morphological ambiguity of the determiner “die”. The ERP results showed a P600 component at the disambiguation point only for OVS sentences. The authors interpreted this effect as a reanalysis of syntactic structure after an initial subject-first interpretation. Matzke et al. (2002) did not find any difference between unambiguous and ambiguous SVO sentences at the beginning of the sentence, suggesting a general subject-first processing strategy for German (Bates et al., 1988, Schlesewsky et al., 2000). However, using similar materials, Frisch, Schlesewsky, Saddy, and Alpermann (2002) found a P600 component for subject-initial temporarily ambiguous sentences, which they interpreted as an indicator of syntactic ambiguity. Similarly, comparing unambiguously marked subject-initial interrogatives with questions starting with an ambiguous marker, Bornkessel et al. (2004) found that high span participants generated a sustained anterior negativity while low span participants generated a sustained parietal positivity.
The present study was designed to investigate word order effects and ambiguity resolution in Basque, using behavioral (self-paced reading) and electrophysiological techniques (ERPs). We carried out three different experiments: The purpose of Experiment 1 was to measure reading time patterns for SOV versus OSV sentences (see Table 1) in order to determine the canonical processing strategy of Basque speakers. In Experiment 2 we aimed at determining whether fully ambiguous sentences (see Table 1) are processed by means of a default subject-first strategy (Bates et al., 1988). Finally, Experiment 3 aimed to extend and to evaluate the previous word order effects and the world knowledge guided ambiguity resolution effects using ERPs.
Section snippets
Experiment 1
The main objective of this experiment was to find out whether native speakers of Basque process SOV and OSV sentences differently. According to the majority of syntactic analyses (De Rijk, 2007, Hualde and Ortiz de Urbina, 2003), SOV is the canonical word order, and OSV is derived (see examples of Experiment 1 in Table 1). Using a self-paced reading paradigm, reading times for each word and for the whole sentence were studied. We expected sentences featuring an OSV word order to require longer
Experiment 2
Given our previous results, which suggest that the processing cost of OSV sentences is greater than that of SOV sentences, we wanted to further explore the processing of fully ambiguous sentences. In particular, we wanted to determine whether fully ambiguous sentences were processed by resorting to a default Subject-first strategy, a processing strategy that has been thoroughly reported in the literature for many languages (Bates et al., 1988).
In this experiment we presented three types
Experiment 3
Using ERPs we aimed to obtain electrophysiological evidence of the syntactic complexity observed in previously presented behavioral Experiments 1 and 2. On the one hand, we investigated whether the Object of non-canonical OSV sentences, in comparison with the Subject of canonical SOV sentences (respectively, example 9b and 9a in Table 1) generated any electrophysiological response as it has been previously observed in German (Matzke et al., 2002, Rösler et al., 1998) or not, as in Japanese (
Discussion
In the last ERP experiment we investigated the electrophysiological correlates of canonical (SOV) and non-canonical (OSV) word orders in ambiguous and unambiguous syntactic contexts. The main results confirmed the evidence obtained in the previous behavioral experiments with regard to the higher processing cost derived from the higher syntactic complexity of non-canonical Object-first sentences (Experiment 1), and with regard to the existence of a default subject-first processing mechanism
Conclusions
Our study shows that certain previous findings in ERP language-processing studies observed in head-initial, nominative and relatively fixed word order languages are also observed in a head-final, ergative, free word order and highly inflected language like Basque. This strongly suggests that these findings signal universal processing mechanisms, independent of parametric specifications of the grammar. We have also found strong evidence for a canonical word order processing
Acknowledgments
This research has been supported by research grants from the European Science Foundation (BFF2002-20379-E), the Spanish Ministry of Education and Science (CSD2007-00012, SEJ2007-60751/PSIC) and the Basque Government (GIU06/52) to Itziar Laka and Kepa Erdozia, a predoctoral grant from the University of the Basque Country to Kepa Erdozia, a grant from the Spanish Government (SEJ2005-06067/PSIC) to Antoni Rodriguez-Fornells and a predoctoral grant from the Spanish Government to Anna Mestres-Missé. We want to
References (79)
- et al.
On the preservation of word order in aphasia
Brain and Language
(1988)
- et al.
Neural correlates of syntactic movement: Converging evidence from two fMRI experiments
NeuroImage
(2004)
- et al.
Grammar overrides frequency: Evidence from online processing of flexible word order
Cognition
(2002)
- et al.
On the cost of syntactic ambiguity in human language comprehension: An individual differences approach
Cognitive Brain Research
(2004)
- et al.
Who did what to whom? The neural basis of argument hierarchies during language comprehension
NeuroImage
(2005)
- et al.
Separating syntactic integration cost during parsing: The processing of German WH-questions
Journal of Memory and Language
(2002)
- et al.
The sausage machine: A new two-stage parsing model
Cognition
(1978)
- et al.
The P600 as an indicator of syntactic ambiguity
Cognition
(2002)
Linguistic complexity: Locality of syntactic dependencies
Cognition
(1998)
- et al.
Differential task effects on semantic and syntactic processes as revealed by ERPs
Cognitive Brain Research
(2002)
Effects of word length and frequency on the human event-related potential
Clinical Neurophysiology
Stem allomorphy in the Spanish mental lexicon: Evidence from behavioral and ERP experiments
Brain and Language
Animacy in processing relative clauses: The hikers that rocks crush
Journal of Memory and Language
The cost of freedom: An ERP-study of non-canonical sentences
Clinical Neurophysiology
Lexical decision after left, right, and bilateral presentation of function words, content words, and non-words: Evidence for interhemispheric interaction
Neuropsychologia
Event-related potentials elicited by spoken relative clauses
Cognitive Brain Research
Brain potentials and syntactic violations revisited: No evidence for specificity of the syntactic positive shift
Neuropsychologia
An ERP study of regular and irregular English past tense inflection
NeuroImage
The assessment and analysis of handedness: The Edinburgh inventory
Neuropsychologia
Event-related brain potentials elicited by syntactic anomaly
Journal of Memory and Language
Brain potentials elicited by words: Word length and frequency predict the latency of an early negativity
Biological Psychology
Parsing of sentences in a language with varying word order: word-by-word variations of processing demands are revealed by event-related potentials
Journal of Memory and Language
The neurophysiological basis of word order variations in German
Brain and Language
Context-sensitive neural responses to conflict resolution: Electrophysiological evidence from subject-object ambiguities in language comprehension
Brain Research
Processing subject and object relative clauses: Evidence from eye movements
Journal of Memory and Language
Neuromagnetic evidence for early access to cognitive representations
Neuroreport
An fMRI study of canonical and noncanonical word order in German
Human Brain Mapping
Atoms of Language: The mind’s hidden rules of grammar
Psycholinguistics: A cross-language perspective
Annual Review of Psychology
The cognitive basis for linguistic structures
The extended argument dependency model: A neurocognitive approach to sentence comprehension across languages
Psychological Review
Functional neuroimaging of executive functions
Electrophysiological signatures of visual lexical processing: Open- and closed-class words
Journal of Cognitive Neuroscience
Lectures on government and binding
Expect the unexpected: Event-related brain response to morphosyntactic violations
Language and Cognitive Processes
N-Watch: A program for deriving neighborhood size and other psycholinguistic statistics
Behavior Research Methods
Is Basque an S.O.V. language?
Fontes Linguae Vasconum
Standard Basque, a progressive grammar