Seeking an optimal design of Search and Merge: its consequences and challenges

Nobu Goto; Toru Ishii

doi:10.1515/tlr-2024-2005

Article Open Access

Published/Copyright: January 19, 2024

From the journal The Linguistic Review Volume 41 Issue 1

Abstract

We propose that Merge, both External Merge and Internal Merge, is totally free from Minimal Search, and more specifically, Search Σ to determine the input of Merge only obeys Binarity and the Phase Impenetrability Condition but not Minimal Search (the Minimal Search-free Merge Hypothesis). We argue that our proposal provides a unified account of various movement restrictions, such as the freezing effect, the that-trace effect, the anti-locality effect, the vacuous movement hypothesis, and the economy of derivation. We also argue that our proposal derives the insights/consequences of Minimal Yield such as ruling out so-called extensions of Merge by limiting the search space at a later stage of a derivation in terms of Binarity and the PIC. We further expand the empirical and theoretical scope of our proposal by considering exceptions to the freezing effect. We suggest that the exceptions can be dealt with by adopting Form Copy.

Keywords: Strong Minimalist Thesis; Merge; Search Σ; Resource Restriction; Binarity

1 Introduction

The minimalist program (MP) proposes as its core hypothesis the Strong Minimalist Thesis (SMT). The SMT states that the internal computational system for human language, which generates expressions that interface with the Conceptual-Intentional (CI) system, is an optimal system (Chomsky 1995, 2013, 2015, 2021, to appear). The MP with the SMT has validated optimal language design by revealing that core computational operations work in accord with Optimal Computation, a more general and independently motivated third factor principle. Chomsky (2021) proposes that Merge, both External Merge (EM) and Internal Merge (IM), obeys Resource Restriction, a general property of brain computation. Resource Restriction reduces resources available to computational operations, i.e., the set of elements accessible to computational operations, to the minimum, thereby contributing to Optimal Computation. Assuming that Resource Restriction includes the conditions that restrict accessible elements such as Binarity, Minimal Search (MS), Minimal Yield (MY), and the Phase Impenetrability Condition (PIC), Chomsky argues that IM is constrained by MS. This paper proposes the Minimal Search (MS)-free Merge Hypothesis, according to which Merge is totally free from MS, and thus Search Σ to determine the input of Merge only obeys Binarity and the PIC but not MS. It is shown that our hypothesis can get rid of unnecessary complications of Chomsky’s system where only IM is constrained by MS and provide a unified account of various movement constraints in such a way that cannot be obtained otherwise.

This paper is organized as follows. Section 2 introduces the framework in Chomsky (2021), and points out problems with the claim that IM is constrained by MS. Section 3 proposes the MS-free Merge Hypothesis, claiming that Merge is totally free from MS. Section 4 discusses some empirical and theoretical consequences of the MS-free Merge Hypothesis. We first show that the MS-free Merge Hypothesis can account for various movement restrictions, such as the freezing effect, the that-trace effect, the anti-locality effect, the vacuous movement hypothesis, and the economy of derivation. We then argue that the MS-free Merge Hypothesis rules in the legitimate IM but rules out the so-called extensions of Merge such as Late Merge, Parallel Merge, and Sideward Movement without recourse to Minimal Search or Minimal Yield. Section 5 presents an empirical challenge to the MS-free Merge Hypothesis. We investigate exceptions to the freezing effect, arguing that they can be accommodated under our hypothesis by extending Chomsky’s (2021) Form Copy analysis from A-positions to A′-positions. Section 6 concludes the paper.

2 Framework

2.1 Merge and Search

The most basic properties of human languages that any linguistic theory has to capture are discrete infinity and displacement. Since Chomsky (1995), it has been assumed that these properties are captured by Merge. The standard definition of Merge is as follows (1):

(1)

Merge (P, Q) = {P, Q}

The output of Merge, {P, Q}, can be obtained in two ways: either by External Merge (EM) or by Internal Merge (IM). EM takes two distinct Syntactic Objects (SOs) and combines them into one, yielding the effect of base generation. IM differs from EM in taking a subpart of an existing SO as one of the two SOs. IM thus yields the effect of syntactic movement. If EM applies to P and Q, where neither is a term of the other, the set {P, Q} is yielded. If IM applies to P and Q, where Q is a term of P, the set {Q₁ {_P … Q₂ …}} is yielded.^[1] Discrete infinity is captured by recursive application of EM and IM, and displacement by application of IM.

Under this standard definition of Merge, we cannot answer (at least) the following two questions: How is the input of Merge selected? How are exocentric constructions such as the subject-predicate construction generated?^[2] If Merge is just a combinatorial set-formation operation, then a process of finding and selecting the items to be combined should not be part of Merge. Also, in order to construct exocentric structures like the subject-predicate construction, NP and VP are constructed in parallel before they are combined. This means that there must be a space in which the subject NP and the predicate VP are formed in parallel and put together. To clarify these matters, Chomsky (2021, to appear) incorporates an operation Search Σ to select items to which Merge applies, and proposes that Merge should be an operation that applies to Workspace (WS), not to particular SOs, reformulating Merge as follows:^[3]

(2)

Merge (P, Q, WS) = WS’ = [{P, Q} …]

In (2), Merge applies to WS containing P and Q, and maps to a new workspace WS’, i.e., the set containing the new item {P, Q} and other things carried over, unaffected by the operation. Note that P and Q are items selected from the lexicon (LEX) or WS by Search Σ. Before Merge applies to WS, Search Σ accesses LEX and WS = [X₁, …, X_n] from which it selects P and Q, providing P and Q to Merge as its input operand. Search Σ is essential to the application of Merge, and Merge is not applied unless Search Σ, which provides the elements to Merge, is applied.
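The mapping in (2) can be illustrated with a minimal sketch, assuming that SOs are modeled as strings (lexical items) or frozensets and that WS is a Python list. The function name `merge` and the items `X`, `Y`, `Z` are hypothetical and serve only as illustration; the operands are assumed to have been supplied by Search Σ beforehand.

```python
from typing import FrozenSet, List, Union

# A syntactic object (SO) is a lexical item (str) or a set of SOs.
SO = Union[str, FrozenSet]


def merge(p: SO, q: SO, ws: List[SO]) -> List[SO]:
    """Map WS to WS' = [{P, Q}, ...] as in (2)."""
    new_item = frozenset({p, q})
    # Members of WS used as input are replaced by the new item; all other
    # members are carried over unaffected.
    carried_over = [x for x in ws if x != p and x != q]
    return [new_item] + carried_over


# External Merge: X and Y are distinct members of WS.
ws1 = ["X", "Y", "Z"]
ws2 = merge("X", "Y", ws1)

# Internal Merge: Y is a term of the member {X, Y}.
xy = frozenset({"X", "Y"})
ws3 = merge("Y", xy, ws2)
```

In both calls the untouched member `Z` is carried over to the new workspace, and the only difference between EM and IM lies in where the second operand was found.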

2.2 Search Σ and Optimal Computation

Following Chomsky (2021, to appear), we assume that Search Σ satisfies Optimal Computation. Let us consider how Search Σ works in accordance with Optimal Computation. When the operation Σ searches LEX, where items have no structural relationship to each other, simply stored randomly, it searches the entire LEX simultaneously. On the other hand, when Σ searches WS, which is a set of already generated items that is hierarchically constructed in a stepwise manner, it applies in a way that follows Optimal Computation as advocated by Chomsky (2021, to appear) (see also Ke 2023). Let us consider how Chomsky’s search system works, taking WS (3) as an example:^[4]

(3)

WS = [P, Q], where P = {R, S}

When Σ searches WS = [P, Q], it selects a member of WS, either P or Q, but no terms of P or Q. Suppose that Search Σ selects P. With P fixed, Search Σ either (i) selects Q, the other member of WS, or (ii) searches into P and selects its term, say S. Merge with P and Q selected in the form (i) as input is EM, which forms {P, Q}. Merge with P and S selected in the form (ii) as input is IM, which forms {S, {_P R, S}}. Q, which is unaffected by IM, is carried over to the next WS. Notice that Search Σ cannot directly select S, which is not a member of WS. Search Σ must first select P, a member of WS, and then select S, a term of P.
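The two-step selection just described can be sketched as follows, under the assumption that SOs are strings or frozensets. The helper names `terms` and `search_options` are hypothetical; the sketch only enumerates the pairs that Search Σ could hand to Merge for WS (3) and does not model any further restriction.

```python
def terms(so):
    """Return every term occurrence properly contained in so."""
    if isinstance(so, str):
        return []
    found = []
    for member in so:
        found.append(member)
        found.extend(terms(member))
    return found


def search_options(ws):
    """List the (first, second, kind) inputs Search Σ can hand to Merge."""
    options = []
    for i, first in enumerate(ws):
        # (i) pick another member of WS: input for External Merge
        for j, other in enumerate(ws):
            if i != j:
                options.append((first, other, "EM"))
        # (ii) search into the selected member and pick a term: Internal Merge
        for term in terms(first):
            options.append((first, term, "IM"))
    return options


P = frozenset({"R", "S"})
options = search_options([P, "Q"])
```

Because the first element of every option is a member of WS, S is never selected directly; it is reached only through P.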

2.3 Resource Restriction

Based on the idea that language design depends on properties of the brain “concerning the fundamental neural processes of discarding vast amounts of information provided by super-sensitive sensory organs” (Chomsky 2021: 19), Chomsky (2021) argues that Merge obeys Resource Restriction, a general property of brain computation. Resource Restriction reduces resources available to computational operations, i.e., the set of elements accessible to computational operations, to the minimum, thereby contributing to Optimal Computation. According to Chomsky, Resource Restriction includes conditions such as Binarity, Minimal Search (MS), Minimal Yield (MY), and the Phase Impenetrability Condition (PIC).

Let us consider the relationship between IM and MS. In the case of IM, MS, a “least effort” condition, is a process that terminates once it reaches the head of a chain formed by movement. Chomsky’s argument that IM is constrained by MS is based on the legitimacy of IM. Assuming that IM is constrained by MS, Chomsky (2021: 18) analyses (4) as follows: “If minimality of search is abandoned, nothing bars raising of who₁, which is otherwise a legitimate operation, yielding (6) [= (4): NG and TI]”:

(4)

*who₃ do you wonder if ~~who~~₂ was appointed ~~who~~₁

In (4), given that IM obeys MS, Search Σ terminates once it reaches who₂, the head of the chain (who₂, who₁). Raising of who₁ is thus blocked by MS, since who₂ is closer than who₁ to the target of movement for who. Raising of who₂, on the other hand, causes a violation of the Empty Category Principle, which is meant here as the descriptive generalization which derives the Comp-trace effect. Chomsky (2021) argues that without being constrained by MS, IM would be illegitimately allowed to apply to who₁, and (4) would be derived legitimately contrary to fact.

Furthermore, without being constrained by MS, IM would never be allowed due to Minimal Yield (MY), which is proposed by Chomsky (2021: 19) as part of Resource Restriction:

(5)

Minimal Yield (MY): Merge adds only one new accessible element to Workspace (WS).

MY requires that Merge should add only one new accessible element to WS in order to reduce the computational burden for further operations at a later stage of derivation.

Consider how EM satisfies MY, taking as an example WS₁ in (6a), where accessible elements are a, b, and c. Suppose that EM applies to a and b, forming {a, b}. Then WS₁ (6a) becomes WS₂ (6b), where accessible elements are a, b, c, and {a, b}.

(6)

a.	WS₁ = [a, b, c]
b.	WS₂ = [{a, b}, c]

Here, EM yields only one new accessible element, {a, b}, hence satisfying MY.

Likewise, consider how IM satisfies MY, taking (7) as an example, where IM applies to c:

(7)

a.	WS₁ = [{a, {b, c}}]
b.	WS₂ = [{c, {a, {b, c}}}]

In WS₁ (7a), accessible elements are a, b, c, {b, c}, and {a, {b, c}}. IM applies to c and {a, {b, c}} in WS₁, forming {c, {a, {b, c}}} in WS₂. WS₁ is mapped to WS₂. Here, IM yields two new accessible elements, i.e., another copy of c and {c, {a, {b, c}}}, hence violating MY. Chomsky (2021) argues, however, that in WS₂ (7b), no violation of MY has occurred thanks to Minimal Search (MS); the lower c is c-commanded and thus protected/blocked by the higher c, thereby being no longer accessible. This is because Search Σ terminates once it reaches the higher c, the head of the chain. Hence only one new accessible element, i.e., {c, {a, {b, c}}}, is added, satisfying MY. In this way, MS is considered to play a key role in not only ruling out illegitimate applications of IM but also ruling in legitimate applications of IM.
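The counts discussed for (6) and (7) can be reproduced with a short sketch, assuming SOs are strings or frozensets. The function names are hypothetical, and the `minimal_search` option is a deliberate simplification: it keeps only the shallowest occurrence of each SO as a stand-in for protection by the c-commanding head of a chain.

```python
def occurrences(so, depth=0):
    """Yield (so, depth) for so itself and for every term occurrence in it."""
    yield so, depth
    if not isinstance(so, str):
        for member in so:
            yield from occurrences(member, depth + 1)


def accessible(ws, minimal_search=False):
    """Return the accessible (so, depth) pairs in a workspace."""
    found = [pair for member in ws for pair in occurrences(member)]
    if not minimal_search:
        return found
    # Simplified Minimal Search: keep only the shallowest occurrence of each
    # SO, treating deeper identical copies as no longer accessible.
    kept = {}
    for so, depth in sorted(found, key=lambda pair: pair[1]):
        kept.setdefault(so, depth)
    return list(kept.items())


a, b, c = "a", "b", "c"
em_before = [a, b, c]                              # (6), first workspace
em_after = [frozenset({a, b}), c]                  # (6), second workspace
im_before = [frozenset({a, frozenset({b, c})})]    # (7), first workspace
im_after = [frozenset({c, im_before[0]})]          # (7), second workspace

em_new = len(accessible(em_after)) - len(accessible(em_before))
im_new = len(accessible(im_after)) - len(accessible(im_before))
im_new_ms = len(accessible(im_after, True)) - len(accessible(im_before, True))
```

Under these assumptions EM adds one accessible element, IM adds two when every occurrence is counted, and IM adds one once the lower copy of c is treated as hidden.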

3 Proposals

3.1 Search Σ for Merge and other operations

We argue that although Search Σ for Merge applies in a way that follows Optimal Computation when searching WS, it does not obey Minimal Search (MS). This is contrary to Chomsky’s (2021) claim that IM is subject to MS. Essentially following Ke (2023), we define MS as follows:

(8)

Minimal Search (MS): Minimal Search looks into its search domain for a specific target and terminates as soon as the specific target is found.

It should be noted that MS is “minimal” in the sense that it is terminated as soon as its specific target is found by Search Σ. We are not claiming that Search Σ never obeys MS, but only argue that Search Σ for Merge does not, and do not exclude the possibility that Search Σ for other operations is constrained by MS. It is plausible to claim that Search Σ for operations such as Agree, Form Copy, and Labeling conforms to MS. This is because Search Σ for these operations terminates immediately once it finds the first occurrence of its specific target, i.e., a specific feature in Agree, a specific term in Form Copy, and a specific head in Labeling (see Section 5.1.2 for Form Copy).

Let us consider WS (9), where [uF] undergoes Agree with [iF], as an example:

(9)

WS = [{P_[uF] {_Q R, S_[iF]}}]

In (9), Search Σ for Agree has the specific feature [F] as its target, and terminates as soon as it finds the specific target [F] in its search domain. In the case of Search Σ for Merge, however, MS is not involved. Under Chomsky’s (2008, 2013, 2015) free Merge system, Merge is no longer a feature-triggered operation, as it was in Chomsky’s (2001) probe-goal system; it is a feature-free operation that simply pairs any two SOs. In the case of Search Σ for Merge, no specific target can be defined. It does not search for any specific target but rather searches every element in LEX and/or WS (unless it is rendered inaccessible by the PIC/Transfer). Without a specification of the search target, it does not make any sense to claim that Search Σ for Merge is subject to MS. Thus, Search Σ for Merge is totally free from MS.^[5]
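The target-driven character of MS in (8) can be illustrated with a minimal sketch, assuming a breadth-first traversal over a small tree encoding of the search domain in (9). The `Node` class, the function name, and the extra node `Z` are hypothetical; `Z` is not part of (9) and is added only to show that a deeper bearer of [iF] is never reached.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    features: frozenset = frozenset()
    children: tuple = ()


def minimal_search(domain, feature):
    """Breadth-first search that stops at the first node bearing `feature`."""
    queue = deque([domain])
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        if feature in node.features:
            return node, visited  # terminate as soon as the target is found
        queue.extend(node.children)
    return None, visited


# Search domain of P[uF] in (9): {_Q R, S[iF]}, plus a hypothetical deeper Z[iF].
z = Node("Z", frozenset({"iF"}))
domain = Node("Q", children=(Node("R", children=(z,)), Node("S", frozenset({"iF"}))))
found, visited = minimal_search(domain, "iF")
```

The function needs a `feature` argument in order to know when to stop. A search with no specified target has no such stopping point, which is the intuition behind exempting Search Σ for Merge from MS.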

3.2 Minimal Search-free Merge Hypothesis

We propose that Search Σ, which is involved in essentially every operation, follows Optimal Computation, but Search Σ for Merge, by its nature, is not subject to MS. We call this hypothesis Minimal Search (MS)-free Merge Hypothesis:

(10)

Minimal Search-free Merge Hypothesis: Search Σ for Merge is free from Minimal Search.

In the MS-free Merge Hypothesis, Binarity and the PIC, both of which are part of Resource Restriction, play important roles in limiting Search Σ for Merge. Binarity restricts the number of the accessible targets of Search Σ to two, and the PIC makes the complement of a phase head inaccessible for Search Σ by Transfer. Both Binarity and the PIC reduce resources available to computational operations and thus count as part of Resource Restriction. Note that under the MS-free Merge Hypothesis, although Merge is a constraint-free combinatorial set-formation operation, Search Σ for Merge to determine the input of Merge is constrained by Binarity and the PIC. Since Merge does not apply unless Search Σ does, it follows that Merge is indirectly constrained by Binarity and the PIC through Search Σ, which is explicated in the next subsection.

3.3 Search Σ for Merge under Binarity and the PIC

To identify Transfer domains, we adopt Phase Theory in Chomsky (2013, 2015). According to this theory, the Transfer domain of the transitive v*P phase is the complement of R(oot) (R-COMP), as in (11a), and that of the CP phase is the complement of C (C-COMP), i.e., TP, as in (11b) (where the transferred materials are in gray):

(11)

Chomsky (2015) proposes that “R raises to v* forming R with v* affixed, hence invisible, so phasehood is activated on the copy of R,” claiming that the Transfer domain is R-COMP, whose SPEC can be responsible for successive-cyclic A′-movement at the next phase.^[6]

Let us look at how successive-cyclic A′-movement proceeds under the MS-free Merge Hypothesis:

(12)

First, Search Σ selects R(buy) and what from LEX. This satisfies Binarity on Search Σ; Merge applies to R(buy) and what, yielding WS₁ (12a).^[7] Then, under successive-cyclic A′-movement, Search Σ selects RP and what, a term of RP, from WS₁. This satisfies Binarity on Search Σ; Merge applies to what and RP, yielding WS₂ (12b). Hereafter, for ease of illustration, such a situation will be denoted as Search Σ for Merge(R, what), referring only to the relevant label/head to intend the entire SO to be merged. Then, Search Σ selects v* from LEX and RP from WS₂.^[8] This satisfies Binarity on Search Σ; Merge applies to v* and RP, yielding WS₃ (12c). With the introduction of v*, R raises to v*, forming an amalgam R-v*. Note that in WS₃ (12c), although there are two copies of what, i.e., what₁ and what₂, what₁ in R-COMP is inaccessible by the PIC, since phasehood is activated on R(buy), and R-COMP undergoes Transfer. In a similar fashion, recursive application of Search Σ and Merge satisfying Binarity will yield WS₄ (12d) (for the sake of simplicity, we are ignoring raising of subject to SPEC-T). After this, Search Σ selects C from LEX and TP from WS₄. This satisfies Binarity on Search Σ; Merge applies to C and TP, yielding WS₅ (12e) with the CP phase. Under successive-cyclic A'-movement, Search Σ for IM(C, what) applies to WS₅ (12e). This satisfies Binarity on Search Σ, since it is only what₂ in SPEC-R that is still accessible to further wh-movement; Merge applies to C and what₂, yielding WS₆ (12f) with a new copy of what₃ in the embedded SPEC-C. As is the case of the RP phase level, note that in WS₆ (12f), it is only what₃ in SPEC-C that is still accessible to further wh-movement, since C-COMP is inaccessible by the PIC (TP-Transfer).^[9] After WS₆ (12f), the sentence in (12) is derived by the successive-cyclic movement of what through SPEC-R to SPEC-C in the matrix clause. 
IM of what from the embedded SPEC-C to the matrix SPEC-R is derived as in (12b), and IM from the matrix SPEC-R to the matrix SPEC-C is derived as in (12f). Importantly, thanks to the PIC/Transfer at each phase level, a violation of Binarity on Search Σ for IM is avoided. In this way, successive-cyclic IM is ensured under the MS-free Merge Hypothesis (see also Goto and Ishii 2020 for an MS-free approach to movement under Determinacy, a successor concept of Minimal Yield). It should be noted that in our theory, anything can be merged with anything as long as Search Σ for Merge satisfies Binarity and the PIC, but illegitimate results are ruled out by independent factors. Thus, to satisfy Binarity, Search Σ for Merge in (12f) can in principle select an element other than what along with C, but the resulting output would be ruled out by a labeling failure of <Q, Q>.

4 Consequences

4.1 Empirical consequences

4.1.1 The freezing effect

The MS-free Merge Hypothesis can provide a principled explanation for the freezing effect, i.e., that movement is not possible out of a moved element (see, among many others, Bošković 2018; Wexler and Culicover 1980 and references cited therein). One of the most typical examples of the freezing effect is the subject island effect as shown in (13) (Chomsky 1973):

(13)

*[_CP who_i did [_TP [_DP pictures of t_i]_j [_v*P t_j please you]]]

In (13), who is extracted out of the subject DP moved from SPEC-v* to SPEC-T. We can account for the ungrammaticality of (13) as a Binarity violation on Search Σ for Merge. Consider the WS before who undergoes IM to SPEC-C:

(14)

WS = [{C {_TP {_DP2 … who₂} {T {_v*P {_DP1 … who₁} {R(please)-v* … }}}}}]

In (14), the DP₁ containing who₁ occupies SPEC-v*, and the DP₂ containing who₂ SPEC-T. These are required from θ-Theory (Chomsky 2021) and Labeling Theory (Chomsky 2013, 2015), respectively. The former dictates that language must provide argument structure at the Conceptual-Intentional (CI) interface, and the latter that language must provide labeled SOs at the CI interface. DP₁ is assigned a θ-role by the R(please)-v* amalgam, and the SPEC-T construction is labeled as <φ, φ> through φ-agreement between DP₂ and T. Note that since the SPEC-T construction is of symmetric {XP, YP} type, its label cannot be determined by MS. Its label is rather determined by prominent feature sharing via agreement. To generate (13), Search Σ for IM(C, who) must apply to (14). There are, however, two copies of who, i.e., who₂ and who₁. If Search Σ were to select C and who, the elements provided to IM would be ternary, i.e., C, who₂, and who₁. This would violate Binarity on Search Σ. Since (13) cannot be generated by Merge, it is ungrammatical.^[10]

Notice in this analysis that if Search Σ for Merge selected C and one of the two copies of who, i.e., either who₂ or who₁, it would satisfy Binarity on Search Σ. This search process, however, would not provide an appropriate input to Merge. This is because Merge is defined as an operation that combines two particular SOs (see Section 2.1). Phase-level memory dictates that who₂ and who₁, which are within the same Transfer domain (i.e., TP), are the occurrences of the same SO who. In other words, the SO who counts as a discontinuous SO in WS (14). Hence, there is no way of selecting C and part of a particular SO, i.e., either who₂ or who₁, so as to provide an appropriate input to Merge of C and who.^[11] It should also be recalled that to satisfy Binarity, Search Σ for Merge here can in principle select an element other than who along with C, but the resulting output would be ruled out by independent principles (e.g., by a labeling failure of <Q, Q>).

Our analysis predicts that the subject island effect is obviated in an environment where a subject DP containing a wh-phrase stays in-situ. This prediction is borne out by the following example (see Lasnik and Park 2003 for more examples):

(15)

[_CP who_i is [_TP there [_vP [_DP a picture of t_i] [on the wall]]]]

In (15), there occupies SPEC-T, and accordingly, the DP containing who stays in SPEC-v. The grammaticality of (15) can be accounted for by considering the following WS:

(16)

WS = [{C {_TP there {T {_vP {_DP … who} {v … }}}}}]

To generate (15), Search Σ for IM(C, who) applies to WS (16). There is only one copy of who in (16). When Search Σ is to select who along with C, the elements provided to IM are binary, i.e., C and who. This satisfies Binarity. Hence, (15), which can be generated by Merge, is grammatical.
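The Binarity reasoning applied to (14) and (16) can be made concrete with a small sketch. It assumes a tree encoding with a `transferred` flag marking Transfer domains; the class, the function names, and the third structure `ws_pic` are hypothetical. `ws_pic` is a schematic rendering of the successive-cyclic configuration described in Section 3.3, where the lower copy sits inside R-COMP.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str
    children: tuple = ()
    transferred: bool = False  # True for a Transfer domain (PIC)


def accessible_copies(node, name):
    """Collect leaf occurrences of `name` outside any Transfer domain."""
    if node.transferred:
        return []
    if not node.children:
        return [node] if node.label == name else []
    hits = []
    for child in node.children:
        hits.extend(accessible_copies(child, name))
    return hits


def search_for_im(root, head, name):
    """Return the operands for IM(head, name) and whether Binarity holds."""
    operands = [head] + [copy.label for copy in accessible_copies(root, name)]
    return operands, len(operands) == 2


who = Node("who")
# (14): the subject DP has raised, so each DP copy contains a copy of who.
dp1 = Node("DP1", (who,))
dp2 = Node("DP2", (who,))
vp14 = Node("v*P", (dp1, Node("R-v*")))
ws14 = Node("CP", (Node("C"), Node("TP", (dp2, Node("T"), vp14))))
# (16): there fills SPEC-T and the DP containing who stays in situ.
vp16 = Node("vP", (Node("DP", (who,)), Node("v")))
ws16 = Node("CP", (Node("C"), Node("TP", (Node("there"), Node("T"), vp16))))
# Schematic phase case: the lower copy is inside a transferred R-COMP.
r_comp = Node("R-COMP", (who,), transferred=True)
ws_pic = Node("CP", (Node("C"), Node("TP", (Node("T"), Node("RP", (who, r_comp))))))
```

On this encoding `search_for_im(ws14, "C", "who")` yields three operands and fails Binarity, whereas the calls on `ws16` and `ws_pic` yield exactly two, in the first case because only one copy exists and in the second because the PIC hides the lower one.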

To see how our theory deals with extraction from object position, consider (17) as an example:

(17)

[_CP who_i did [_TP you [_v*P see [_DP a picture of t_i]]]]

In (17), who is extracted out of the object DP, and the sentence is acceptable. Consider the WS before who undergoes IM to SPEC-C:

(18)

To generate (17), Search Σ for IM(C, who) must apply to WS (18). Note that who₁ in DP within R-COMP is inaccessible by the PIC/Transfer. When Search Σ is to select who along with C, the elements provided to IM are binary, i.e., C and who₂. This satisfies Binarity; (17) is grammatical. The fact that it is possible to extract a wh-phrase from an ECM subject (Chomsky 2008: 153) can be treated on a par with the extraction from object position, given the raising-to-object analysis of ECM, which claims that the ECM subject is raised to SPEC-R of the matrix clause.

(19)

Of which car did they believe [the picture t_i] to have caused a scandal?

Another example of the freezing effect is concerned with verb-particle constructions in English:

(20)

a.	Mikey looked up the reference.
b.	Mikey looked the reference up.

Johnson (1991) proposes that V(erb)-Part(icle) is a single item in the underlying structure, arguing that the basic word order of the verb particle construction is the V-Part-DP order, and the V-DP-Part order is derived, as shown in (21):

(21)

Mikey [looked_i [_VP the reference_j [_V’ [_V t_i up ] t_j]]]

Johnson (1991: 590–591) presents evidence for V-Part as a single verb. The first piece of evidence comes from morphological processes. As shown in (22), -ing to form a noun and -ed to form an adjective, which only apply to verbs, can be attached to V-Part as a whole:

(22)

a.	Mikey’s looking up of the reference is a trying affair.
b.	a looked up number

These facts indicate that the particle up should be treated as part of the single verb looked up.

The second piece of evidence is obtained from Gapping (Johnson 1991: 591):

(23)

a.	Gary looked up Sam’s number, and Mittie, my number.
b.	*Gary looked up Sam’s number, and Mittie, up my number.

Gapping treats the V-Part looked up as a single verb (23a), but it cannot apply to the verb looked alone, stranding the particle up (23b). Since only verbs, either singly or in series, can be gapped, (23b) shows that the particle up should be treated as part of the single verb looked up.

Johnson (1991: 607) reports that when DP precedes Part, extraction out of the DP becomes degraded (see also Lasnik 2001); the freezing effect is observed in the V-DP-Part order:

(24)

a.	What_i did Chris look up [stories about t_i]?
b.	*What_i did Chris look [stories about t_i] up?

Within the framework assumed here, if V-Part forms a single verb and if R universally raises to v* for root-categorization (Chomsky 2015: 15; see also Section 3.3), it then follows that [_R V-Part] raises to v*, whether it is in the V-Part-DP order or the V-DP-Part order, as represented in (25):

(25)

[_v*P [_R V-Part]_i-v* [_RP DP_j [_R’ t_i t_j]]]

In (25), [_R V-Part] has raised to v* for root-categorization. Given feature inheritance from a phase head to the head of its complement, the copy of R (t_i) inherits φ-features from v*. The object DP in SPEC-R agrees with the copy of R (t_i) for <φ, φ> labeling. This derives the V-Part-DP order. To derive the V-DP-Part order from this structure, V has to raise further, leaving Part in v*, and DP also needs to move to a position around SPEC-v*, as shown in (26):

(26)

[V_k … [_v*P DP_j [_v*’ [_R t_k-Part]_i-v* [_RP t_j [_R’ t_i t_j]]]]]

Given that V-Part is a single item in the underlying structure, wherever the exact landing sites of the moved V and the moved DP are, derivation (26) should be involved in the V-DP-Part order.^[12]

Importantly, if (24b) has the derivation as in (26), then the degraded status can be accounted for as a violation of Binarity on Search Σ for Merge. Consider the following relevant WS:

(27)

In (27), there are three copies of DP, i.e., DP₃ in SPEC-v*, DP₂ in SPEC-R, and DP₁ in COMP-R. Note that DP₁ is inaccessible by the PIC/Transfer, but DP₃ and DP₂ are accessible. Thus, if Search Σ were to select C and what, the elements provided to IM would be ternary, i.e., C, what₃, and what₂. This would violate Binarity; (24b) is ungrammatical.

4.1.2 The that-trace effect

The MS-free Merge Hypothesis can also range over the that-trace effect exemplified by (28) (Bošković 2016; Chomsky 1986; Ishii 2004; Kayne 1984; Lasnik and Saito 1992; Perlmutter 1971):

(28)

*[_CP who_i do [_TP you [_v*P think [_CP that [_TP t_i saw Bill]]]]]

In (28), who is moved to the matrix SPEC-C from the embedded SPEC-T over the embedded SPEC-C with the overt COMP-that. The unacceptability of (28) can be accounted for as a Binarity violation. Consider the WS before who undergoes IM to SPEC-C in the embedded clause:

(29)

WS = [{C(that) {_TP who₂ {T {_v*P who₁ {v … }}}}}]

To generate (28), Search Σ for IM(C, who) must apply to WS (29). There are, however, two copies of who, i.e., who₂ and who₁. If Search Σ were to select C and who, the elements provided to IM would be ternary, i.e., C, who₂, and who₁. This would violate Binarity; (28) is ungrammatical.

It is well-known that the that-trace effect disappears when C is deleted, as shown in (30):

(30)

[_CP who_i do [_TP you [_v*P think [_CP C(that) → Ø [_TP t_i T [_v*P saw Bill]]]]]]

We assume with Chomsky (2015) that finite clauses without overt COMP-that (that-less clauses) are TPs, T is a phase head through the transfer of phasehood from C to T, and T-COMP (v*P) is inaccessible by the PIC/Transfer. Evidence for the view that that-less clauses are TPs comes from the fact that topicalization is impossible in that-less clauses (see Bošković 1997; Doherty 2000):

(31)

a.	Peter doesn’t believe *(that) Mary_i John likes t_i.
b.	I hope *(that) this book_i you will read t_i.

Assuming that topicalization targets the CP field, this fact suggests that that-less clauses are TPs: topicalization in that-less clauses is impossible because there is no C head (see also, among many others, Ishii 2004; Weisler 1980, for the TP approach to that-less clauses). If there is no CP in the embedded clause of (30), the matrix RP is the next relevant phase level for further wh-movement:

(32)

To generate (30), Search Σ for IM(R, who) must apply to (32) under successive-cyclic movement. Note here that who₁ in the embedded T-COMP (the shaded v*P) is inaccessible by the PIC/Transfer. As a result, when Search Σ is to select who along with R, the elements provided to IM are binary, i.e., R and who₂. This satisfies Binarity on Search Σ for Merge. (30) is grammatical.^[13]

We can obtain the same prediction as in the analysis of obviation of the subject island effect: the that-trace effect does not occur in an environment where a wh-phrase stays in-situ. This prediction is borne out by the following contrast (Rizzi and Shlonsky 2007: 126):

(33)

a.	*What_i do you think that t_i is in the box?
b.	What_i do you think that there is t_i in the box?

The grammaticality of (33b) can be accounted for in the same way as the fact that the subject island effect is obviated when there occupies SPEC-T (see (15) and (16)).

4.1.3 Why Japanese has neither the that-trace effect nor the subject island effect

Japanese exhibits neither the that-trace effect (Ishii 2004) nor the subject island effect (Ishii 1997, 2011; Kayne 1984; Lasnik and Saito 1992; Saito and Fukui 1998):

(34)

Dare-ni_i	[John-ga	[[Mary-ga	t_i	atta]	koto]-ga	mondai-da	to]
who-Dat	John-Nom	Mary-Nom		met	fact-Nom	problem-is	that
omotteru	no?
think	Q
Lit. ‘Who_i, John thinks that [the fact that Mary met t_i] is a problem.’

(35)

John-ga	[t_i	zikan-doorini	tootyaku-si-ta	to]	omotteiru	no-wa
John-Nom		on.time	arrived	that	think	NL-Top
basu-ga	san-dai_i	da.
bus-Nom	three-Cl	be.Pres
Lit. ‘It is three busses that John thinks that arrived on time.’

In (34) (Ishii 2011: 408), the wh-element dare-ni ‘who-Dat’ is scrambled out of the nominative phrase that occupies the subject position in the embedded clause, and the sentence is acceptable. This shows that there is no subject island effect. In (35), the subject basu-ga san-dai ‘three busses’ is extracted out of the embedded clause with the COMP-to ‘that’, and the sentence is acceptable. This indicates that there is no that-trace effect. The question is why English has both the subject island effect and the that-trace effect, whereas Japanese has neither.

If we assume with Fukui (1986) and Kuroda (1988) that the subject in Japanese stays in SPEC-v throughout a derivation, we can give a principled explanation to this fact.^[14] One of the arguments for the claim that the subject stays in SPEC-v in Japanese is presented by Kasai (2018). His argument is based upon how coordination structures like (36) are derived, where the verb tabe ‘eat’ in the first conjunct is bare and lacks a tense morpheme:

(36)

Taroo-ga	nattoo-o	tabe,	Jiroo-ga	koohii-o	nom-u	yooni	natta.
Taro-Nom	nattoo-Acc	eat	Jiro-Nom	coffee-Acc	drink-Pres	Comp	happened
‘It happened that Taro ate natto and Jiro drank coffee.’

Kasai (2018: 9) argues that if the subject stays in SPEC-v, (36) is supposed to be derived from coordination of the embedded vPs, as illustrated in (37a), and on the other hand, if the subject moves to SPEC-T, (36) should be derived from coordination of the matrix TPs by applying ellipsis in the first conjunct, as illustrated in (37b) (strike-through here indicates ellipsis):

(37)

a.	[[_vP Taroo-ga nattoo-o tabe] & [_vP Jiroo-ga koohii-o nom] u] yooni natta.
b.	[Taroo-ga nattoo-o tabe-~~ru yooni natta~~] & [Jiroo-ga koohii-o nom-u yooni natta].

Kasai argues that the non-ellipsis analysis (37a) is more plausible than the ellipsis analysis (37b) based on the ungrammaticality of (38) below, where Taroo-ga, Hanako-ni, nattoo-o and tabe do not make a constituent:

(38)

*Taroo-ga	Hanako-ni	nattoo-o	tabe,	John-ga	Mary-ni	koohii-o
Taro-Nom	Hanako-Dat	nattoo-Acc	eat	John-Nom	Mary-Dat	coffee-Acc
nomu	yooni	itta.
drink	Comp	told
‘Taro told Hanako to eat nattoo and John told Mary to drink coffee.’

Kasai argues that under the ellipsis analysis, it is not clear why (38) cannot be derived in the same way as (37b) (by applying ellipsis in the first conjunct), but under the non-ellipsis analysis, the ungrammaticality of (38) is easily captured by a ban against coordination of non-constituents. He concludes that (36) should be analyzed as in (37a), and the subject stays in SPEC-v in Japanese.^[15]

A similar argument appears in Chomsky (2021: 34). (39) shows that the verb in the first conjunct appears in the present tense, and the verb in the second conjunct in the past tense:

(39)

John arrives every day at noon and met Bill yesterday

Based on the fact that a tense feature is not necessarily shared by two conjuncts, Chomsky (2021: 33–34) argues that tense is a feature of v. He further argues that such cases as (40) are supposed to be derived from coordination of the embedded vPs not with a single T head, as shown in (40a), but with a single INFL that only has φ-features but not a tense feature, as shown in (40b):

(40)

a.	[T, [_vP John arrive every day at noon] & [_vP John meet Bill yesterday]]
b.	[INFL, [_vP John arrive every day at noon] & [_vP John meet Bill yesterday]]

It is plausible to analyze (39) as (40b). This is because if T has a tense feature, as in (40a), not only is it unclear why the two conjuncts do not share the same tense (although they are c-commanded by the same single T head), but it is also not clear at all how the different tenses are realized by the two conjuncts. If, as Chomsky argues, tense is a feature of v and there is an INFL only with φ-features without tense, however, it naturally follows that a tense feature does not have to be shared by two conjuncts like (39). Given that tense is a feature of v, since there is no φ-feature agreement in Japanese (Fukui 1986; Kuroda 1988; Saito 1985), it is natural to claim that in Japanese there is no INFL whose SPEC is the target of subject raising, and subjects remain in SPEC-v in the language.^[16]

Given that the subject in Japanese stays in SPEC-v throughout a derivation, the relevant WS of (34) and (35) before the wh-element dare-ni ‘who-Dat’ and the subject basu-ga san-dai ‘three busses’ undergo IM to SPEC-C in the embedded clause is as follows (where English glosses are used for ease of exposition):

(41)

WS = [{C {_TP {T {_vP {…who/three busses} {v … }}}}}]

To generate (34) and (35), Search Σ for IM(C, who/three busses) applies to WS (41). There is only one copy of who/three busses in WS (41). When Search Σ is to select who/three busses with C, the elements provided to IM are binary, i.e., C and who/three busses. (34) and (35), which are Merge-generable, are grammatical.^[17]

4.1.4 The anti-locality effect

The anti-locality effect that a subject cannot undergo “short” topicalization (Fiengo et al. 1988; Lasnik and Saito 1992) also receives a principled account under the MS-free Merge Hypothesis. As shown by the contrast between (42a) and (42b), “short” topicalization of a subject from SPEC-T to SPEC-C is impossible, but clause-internal topicalization of an object is possible (we are assuming that an element to be topicalized is merged with C; see the discussion around (31)):

(42)

a.	*[_CP John _i [_TP t_i came yesterday]]
b.	[_CP Mary _i [_TP John like t_i]]

Lasnik and Saito (1992: 110–111) present two pieces of evidence for the non-existence of “short” topicalization of a subject. The first evidence is concerned with the effects of topicalization on anaphor binding. Consider the following examples:

(43)

a.	*John thinks that Mary likes himself
b.	John thinks that *himself* _i, Mary likes t_i
c.	*John thinks that himself likes Mary
d.	*John thinks that *himself* _i, t_i likes Mary

(43a) and (43b) show that if the embedded object anaphor himself does not undergo topicalization, it cannot be bound by John, but if himself undergoes topicalization, it can be bound by John. The ungrammaticality of (43c) illustrates that if the embedded subject anaphor himself does not undergo topicalization, it cannot be bound by John, in the same way as (43a). These facts lead us to predict that (43d), where the embedded subject anaphor himself undergoes topicalization, is grammatical for the same reason as (43b). This prediction, however, is not borne out. Lasnik and Saito (1992) argue that the ungrammaticality of (43d) supports the claim that a subject cannot undergo “short” topicalization.

The second evidence has to do with the following paradigm involving the subject island effect (Lasnik and Saito 1992: 111):

(44)

a.	?*Who do you think that pictures of are on sale
b.	?Which athletes do you wonder which picture of Mary bought
c.	?Which athletes do you wonder which pictures of are on sale
d.	??Which athletes do you think that pictures of, Mary bought
e.	?*Which athletes do you think that pictures of, are on sale

(44a), where the wh-phrase is extracted out of the subject DP that is moved from SPEC-v* to SPEC-T in the embedded clause, is a case of the subject island effect. On the other hand, (44b) and (44c), where the wh-phrase is extracted out of the subject/object DP that undergoes wh-movement to SPEC-C in the embedded clause, indicate that wh-extraction out of DPs in an A′-position is reasonably acceptable, irrespective of whether the DPs are subjects or objects. Lasnik and Saito (1992: 111) observe that (44d), where the wh-phrase is extracted out of the object DP that undergoes topicalization in the embedded clause, is marginally allowed, conforming to the generalization that wh-extraction out of DPs in an A′-position is acceptable. These facts lead us to predict that (44e), where a wh-phrase is extracted out of the subject DP that undergoes topicalization in the embedded clause, is as acceptable as (44b), (44c), and (44d), since the DP is in an A′-position. However, this is not the case. Lasnik and Saito (1992) argue that this indicates that a subject cannot undergo “short” topicalization.

Turning back to (42a), we can account for the anti-locality effect as a Binarity violation. Consider the following WS of (42a) before John undergoes IM to SPEC-C for topicalization:

(45)

WS = [{C {_TP John₂ {T {_vP John₁ {v … }}}}}]

To generate (42a), Search Σ for IM(C, John) must apply to WS (45). There are, however, two copies of John, i.e., John₂ and John₁. If Search Σ were to select John along with C, the elements provided to IM would be ternary, i.e., C, John₂, and John₁. This would violate Binarity. Since (42a) is not Merge-generable, it is ungrammatical. Topicalization of an object as in (42b), on the other hand, is possible. This fact can be accounted for in the same way as the extraction from an object and an ECM subject seen above.
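
The Binarity reasoning applied to (41) and (45) can be sketched in code. This is only a minimal illustration under simplifying assumptions, not a formalization proposed in this paper: workspaces are encoded as nested Python tuples, copies are plain strings sharing one name, every position shown is taken to be accessible (the PIC is ignored), and the helper names `occurrences` and `satisfies_binarity` are hypothetical.

```python
# Illustrative sketch only. Syntactic objects are nested tuples; each copy of
# a lexical item is the same string, so copies count as identical inscriptions.

def occurrences(so, name):
    """Count every occurrence of `name` inside the syntactic object `so`."""
    if isinstance(so, str):
        return 1 if so == name else 0
    return sum(occurrences(part, name) for part in so)

def satisfies_binarity(ws, head, name):
    """Search for IM(head, name) hands Merge the head plus every accessible
    copy of `name`; Binarity requires exactly two elements in total."""
    selected = occurrences(ws, head) + occurrences(ws, name)
    return selected == 2

# (45): English subject with copies in SPEC-T and SPEC-v.
ws_45 = ("C", ("John", ("T", ("John", ("v",)))))
# (41): Japanese subject that stays in SPEC-v, hence a single copy.
ws_41 = ("C", ("T", ("three busses", ("v",))))

print(satisfies_binarity(ws_45, "C", "John"))          # False: C, John, John
print(satisfies_binarity(ws_41, "C", "three busses"))  # True: C, three busses
```

On this encoding, the English and Japanese cases differ only in how many copies of the subject the workspace contains when C is merged.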

4.1.5 The vacuous movement hypothesis

Our theory provides a principled explanation for the Vacuous Movement Hypothesis (VMH) that a wh-subject does not move locally to SPEC-C (Chomsky 1986; George 1980; Ishii 2004):

(46)

Who left?
a.	[C [_TP who₂ [T [_vP who₁ [v …
b.	[_CP who*₃ [C [_TP who₂ [T [_vP who₁ [v …

According to the VMH, the derivation of (46a), where who stays in SPEC-T, is chosen over that of (46b), where who undergoes IM to SPEC-C from SPEC-T. In other words, string-vacuous wh-movement of a subject is not allowed.

Chomsky (1986: 50) argues that the difference in acceptability between (47a) and (47b) below constitutes evidence for the VMH:

(47)

a.	?he is the man *to whom* _i I wonder [ *who* knew [which book_j to give t_j t_i]]
b.	?*he is the man *to whom* _i I wonder [ *who* John told [which book_j to give t_j t_i]]

According to Chomsky (1986), (47a), which includes relativization from within the wh-island in which who functions as a subject, is more acceptable than (47b), which involves relativization from within the wh-island in which who functions as an object. Chomsky (1986) argues that if the subject who in (47a) does not occupy the embedded SPEC-C, then the embedded SPEC-C serves as an escape hatch for successive-cyclic movement of to whom; to whom moves out of the CP complement of wonder without violating Subjacency. Since the object who in (47b) occupies the embedded SPEC-C, on the other hand, the embedded SPEC-C does not serve as an escape hatch for successive-cyclic movement of to whom; movement of to whom out of the CP complement of wonder violates Subjacency. Hence, the contrast between (47a) and (47b) follows from the VMH.

Chomsky (2021: 35) also presents parasitic gap (PG) facts as evidence in favor of the VMH:

(48)

a.	*What* _i did John file t_i [without [reading PG_i]]?
b.	* *What* _i was filed t_i [without [reading PG_i]]?

The structures of (48a) and (48b) are represented below:

(49)

a.	[_CP what₄ C-did [_TP John₁ file what₃ [without [what₂ [John₂ reading what₁]]]]]
b.	[_CP C [_TP what₄ was filed what₃ [without [what₂ [John reading what₁]]]]]

As shown by the contrast between (48a) and (48b), the object wh-phrase licenses a PG, while the subject wh-phrase does not. Following the well-established generalization that the antecedent of a PG must be in an A′-position, and a PG can only be licensed by overt movement (see, among others, Chomsky 1982, 1986; Engdahl 1983), Chomsky argues that this contrast follows from the VMH coupled with the Form Copy analysis of PG and the ban against improper Copy Pair, the Copy analog of improper movement.^[18] In (49a), since what₄ moves to SPEC-C, an A′-position, the identical inscriptions <what₄, what₂>, which are in an IM configuration, are assigned a Copy relation by Form Copy; what₄ and what₂ are interpreted as in a Copy relation. This forms the Copy pair <what₄, what₂>. This Copy pair is legitimate, since both what₄ in SPEC-C and what₂ are in an A′-position. The dependency of the parasitic chain on the licenser’s chain is guaranteed by this Copy pair; the PG is licensed. In (49b), on the other hand, the VMH requires what₄ to remain in SPEC-T, an A-position. The identical inscriptions <what₄, what₂> in an IM configuration are assigned a Copy relation by Form Copy. This results in the improper Copy pair <what₄, what₂>, however, since what₄ in SPEC-T is in an A-position whereas what₂ is in an A′-position. The PG cannot be licensed in (49b). It should be noted that if what₄ moved further from SPEC-T to SPEC-C as represented in (50), the identical inscriptions <what₅, what₂> could be assigned a Copy relation by Form Copy, which is an optional operation. This Copy pair would license PG in (48b), contrary to fact. Hence, the contrast between (48a) and (48b) presents evidence for the VMH:

(50)

[_CP what₅ [C [_TP what₄ was filed what₃ [without [what₂ [John reading what₁]]]]]]

Chomsky (2021: 35) argues that this analysis also rules out (51), whose structure is (52), due to the improper Copy pair <what₄, what₂>. It should be noted that since Form Copy only applies within a phase, what₅ in the matrix SPEC-C cannot form a Copy relation with what₂ due to the intervening phase boundaries. Hence, PG is not licensed in (51):

(51)

* What did you say was filed t [without reading PG]?

(52)

*[_CP what₅ [C [_TP did you say [_CP C [_TP what₄ was filed what₃ [without [what₂ [John reading what₁]]]]]]]]

The VMH follows from the MS-free Merge Hypothesis. Consider the relevant WS of (46a, b) below, where who still stays in SPEC-T:

(53)

WS = [{C {_TP who₂ {_vP who₁ {v …}}}}]

In WS (53), there are two copies of who, i.e., who₂ and who₁. If who were to undergo IM to SPEC-C vacuously as in (46b), Search Σ for IM(C, who) must apply. In so doing, however, there would be three accessible elements for Search Σ, i.e., C, who₂, and who₁. This would violate Binarity.^[19] In order not to violate Binarity, Search Σ for IM(C, who) should not be applied, as shown in (46a). This is exactly what the VMH requires, which follows from the MS-free Merge Hypothesis.^[20]

4.1.6 No superfluous steps

The MS-free Merge Hypothesis also provides us with an important insight to understand the last resort nature of successive-cyclic A′-movement that avoids superfluous steps. Let us compare two possible derivations of (54), which are represented in (54a) and (54b):

(54)

Who_i do you like t_i?
a.	[_CP who [C [_TP you [_vP you [_RP who* [R(like) [ …
b.	[_CP who* [C [_TP who [_TP you [_vP you [_RP who* [R(like) [ …

In (54a), who undergoes IM from SPEC-R to SPEC-C successive-cyclically without stopping over any intermediate positions. In (54b), on the other hand, who undergoes IM from SPEC-R to the TP-adjoined position before moving to SPEC-C. Within the framework of Chomsky’s (1986) Barriers system, Chomsky (1986: 5, 32) and Lasnik and Saito (1992: 72–73) present evidence to show that a wh-phrase may not adjoin to TP, as in (55b). We present their argument against TP-adjunction of a wh-phrase within the Barriers system for an expository purpose. They argue that if TPs were possible landing sites for wh-movement, then (55) would not violate Subjacency at all, as shown in (55a):

(55)

??Who_i did he wonder if she saw t_i?
a.	[_CP who [_TP he [_RP who [_RP wonder [_CP if [_TP who [_TP she [_RP who [_RP saw who]]]]]]]]]
b.	[_CP who [_TP he [_RP who [_RP wonder [_CP if [_TP she [_RP who [_RP saw who]]]]]]]]

In the Barriers system, any maximal projection is a potential barrier. A potential barrier, however, is exempted from barrierhood if it is θ-marked.^[21] In (55a), where who adjoins to the embedded TP, there are no barriers between the embedded TP-adjoined position and the matrix RP-adjoined position (i.e., VP-adjoined position in Chomsky’s 1986 system). The embedded CP is devoid of its barrierhood for Subjacency, since it is θ-marked by the matrix predicate wonder. The embedded TP is not a barrier for who in the TP-adjoined position, either. We would incorrectly predict that (55) is perfect. In (55b), where who does not adjoin to the embedded TP, the embedded CP, though θ-marked by the matrix predicate wonder, functions as a barrier for Subjacency due to the inheritance of barrierhood from the embedded TP. Since the movement of who from the embedded RP-adjoined position to the matrix RP-adjoined position crosses one barrier, i.e., the embedded CP, conforming to 1-subjacency, the derivation (55b) can correctly capture the marginality of (55).^[22]

The question, then, is why (54b) with TP-adjunction is prohibited, and why (54a) with phase-by-phase cyclicity is chosen over (54b) with a superfluous step. If Merge is free, even such “superfluous” steps should be a freely available option. In Chomsky (1995), derivations such as (54b) with superfluous steps were excluded by the principle of Economy of Derivation, such that “shorter derivations are always chosen over longer ones” (Chomsky 1995: 139). Significantly, this economy principle follows from the MS-free Merge Hypothesis. Consider the following relevant WSs of (54a) and (54b) before who undergoes IM to SPEC-C (where who in R-COMP is inaccessible by the PIC/Transfer):

(56)

a.	WS = [{C … {_RP who {R {…
b.	WS = [{C … {_TP who₂ {_RP who₁ {R {…

To generate the sentence (54), Search Σ for IM(C, who) must apply. If it applies to (56a), with phase-by-phase cyclicity, it satisfies Binarity, as the elements provided to IM are binary, i.e., C and who. But, if it applied to (56b), with a superfluous step, it would violate Binarity, as the elements provided to IM would be ternary, i.e., C, who₂ and who₁. Hence, the preference for a derivation with phase-by-phase cyclicity follows from our MS-free Merge Hypothesis.
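
The comparison of (56a) and (56b) can be illustrated with a small sketch that also models the PIC. It rests on assumptions made purely for exposition: workspaces are nested tuples, the transferred complement of the phase head is wrapped in a hypothetical `Transferred` marker, and the function names are not part of the theory.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transferred:
    """Hypothetical wrapper marking a phase-head complement after Transfer."""
    content: tuple

def accessible(so, name):
    """Count copies of `name` that Search can still reach under the PIC."""
    if isinstance(so, Transferred):
        return 0  # nothing inside a transferred domain is accessible
    if isinstance(so, str):
        return 1 if so == name else 0
    return sum(accessible(part, name) for part in so)

def im_inputs(ws, name):
    """Elements handed to IM(C, name): C itself plus each accessible copy."""
    return 1 + accessible(ws, name)

r_comp = Transferred(("who",))  # lowest copy of who, sealed off by the PIC

ws_56a = ("C", ("who", ("R", r_comp)))           # phase-by-phase derivation
ws_56b = ("C", ("who", ("who", ("R", r_comp))))  # extra TP-adjoined copy

print(im_inputs(ws_56a, "who"))  # 2: binary, Binarity satisfied
print(im_inputs(ws_56b, "who"))  # 3: ternary, Binarity violated
```

The copy inside the transferred domain never contributes to the count, so only the superfluous intermediate copy in (56b) pushes the input past two.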

4.2 Theoretical consequences

4.2.1 Eliminating redundancy

In addition to the empirical consequences, our MS-free Merge Hypothesis has theoretical consequences. Our hypothesis clears up some theoretical and conceptual uncertainties about the relation between IM and Minimal Search (MS) in the system advocated by Chomsky (2021). The claim that IM obeys MS induces a redundancy problem. Let us consider the configuration [X₂ … [PH ~~X₁~~]], where there is one copy of X above the phase head (PH) and another copy of X below PH. In this configuration, MS requires that the higher X₂ is accessible but the lower X₁ is not accessible (here the strike-through line stands for inaccessibility). This (in)accessibility is also ensured by the PIC. The PIC dictates that items within the complement of PH cannot be accessed due to Transfer at a phase level. The PIC/Transfer makes the higher X₂ accessible but the lower X₁ inaccessible as it is contained in the complement of PH, as indicated by the gray shade. Since eliminating redundancies has been a working hypothesis in the minimalist program (Chomsky 1995: 152), such a redundancy between MS and the PIC should be eliminated. Our MS-free Merge Hypothesis, where MS is dissociated from Merge, eliminates this redundancy.^[23]

4.2.2 Our analysis of the illegitimate IM

Since Chomsky’s argument that IM is constrained by MS is based on the legitimacy of IM, we need to show that our hypothesis can explain the legitimacy of IM even without assuming MS.

Let us first reconsider the illegitimate IM (4) (repeated here as (57)). (58) is the relevant WS of (57) before IM is applied:

(57)

*who₃ do you wonder if ~~who~~₂ was appointed ~~who~~₁

(58)

WS = [{C {_TP who₂ {T {v {R who₁}}}}}]

To generate (57), Search Σ for IM(C, who) must apply to WS (58). There are, however, two copies of who, i.e., who₂ and who₁. If Search Σ were to select C and who, the elements provided to IM would be ternary, i.e., C, who₂, and who₁. This would violate Binarity on Search Σ. Since (57) cannot be generated by Merge, it is ungrammatical. Note that who₁ in R-COMP is accessible if the complement of the passive v phase head is not blocked by the PIC/Transfer, unlike the transitive v* phase head.^[24] Thus, our theory can correctly rule out illegitimate IM without assuming MS.

4.2.3 Our analysis of the legitimate IM

Turning now to legitimate applications of IM, let us reconsider (7) (repeated here as (59)):

(59)

a.	WS₁ = [{a, {b, c}}]
b.	WS₂ = [{c, {a, {b, c}}}]

In WS₂ (59b), IM applies to c. Recall that in Chomsky’s analysis, the lower c is protected/blocked by the higher c because of MS, thereby being no longer accessible; there is no violation of Minimal Yield (MY).

How can the MS-free Merge Hypothesis rule in the legitimate IM without assuming MS while capturing the effect of MY? In order to derive (59b) from (59a), Search Σ for IM needs to select {a, {b, c}} and c in it in accord with Optimal Computation. This search process satisfies Binarity, so our theory rules in IM without assuming MS. It is true that Merge adds two new accessible elements from (59a) to (59b), i.e., one copy of c and {c, {a, {b, c}}}. Hence, one might say that it is a violation of MY. It is important to notice, however, that the main purpose of MY is not to restrict the application of Merge. In fact, Chomsky (2021: 19) argues that “Merge should construct the fewest possible new items that are accessible to further operations, thereby limiting Σ [emphasis ours]”. That is, the ultimate effect to be exhibited by MY is to limit Search at a later stage of a derivation, not to restrict application of Merge at the present stage of a derivation. Reinterpreting MY in this way, there is nothing wrong with IM producing (59b) with two new elements added at the present stage of a derivation. If further Search Σ for IM applies to c in (59b), it results in a violation of Binarity due to the two copies of c, thereby limiting Search at a later stage of a derivation. Hence, our theory can capture the insight of MY without positing it as an independent constraint on Merge. Note that if further Search Σ for IM is not applied to (59b), then the derivation is maintained without any problems. Hence, our theory can correctly rule in legitimate applications of IM without assuming MS.

4.2.4 Our analysis of the extensions of Merge

One significant consequence of Chomsky’s (2021) MY is that the so-called extensions of Merge, such as Late Merge (Ishii 1997; Lebeaux 1991), Parallel Merge (Citko 2005), and Sideward Movement (Nunes 1995), are ruled out.

Let us consider (60), the relevant WSs of Parallel Merge/Sideward Movement, where (60b) is yielded by applying Parallel Merge/Sideward Movement to a and c in {b, c} in (60a):

(60)

a.	WS₁ = [a, {b, c}]
b.	WS₂ = [{a, c}, {b, c}]

In WS₁ (60a), the number of accessible elements is four (i.e., a, b, c, {b, c}), while in WS₂ (60b), it is six (i.e., a, c, b, c, {a, c}, {b, c}). Note here that c in {b, c} remains accessible, since it is not c-commanded and thus not protected/blocked by c in {a, c}. From WS₁ (60a) to WS₂ (60b), Merge adds two new accessible elements to WS; this violates MY, according to Chomsky (2021).^[25]
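
The element counts for (60a) and (60b) can be checked mechanically. The sketch below is a simplification for illustration: sets are written as Python tuples, every occurrence of a term is counted as accessible (which matches (60), where neither occurrence of c c-commands the other), and the function names are hypothetical.

```python
def terms(so):
    """Return `so` itself plus, recursively, all terms contained in it."""
    if isinstance(so, str):
        return [so]
    found = [so]
    for member in so:
        found.extend(terms(member))
    return found

def accessible_count(ws):
    """Total number of terms across all members of the workspace.
    Assumes no term is blocked by a c-commanding copy."""
    return sum(len(terms(member)) for member in ws)

ws1 = ["a", ("b", "c")]         # (60a)
ws2 = [("a", "c"), ("b", "c")]  # (60b)

print(accessible_count(ws1))                          # 4
print(accessible_count(ws2))                          # 6
print(accessible_count(ws2) - accessible_count(ws1))  # 2 new accessible elements
```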

Under the MS-free Merge Hypothesis, the extensions of Merge can be ruled out as a violation of Binarity on Search Σ for Merge without recourse to MY. Let us consider (60) again, now with particular attention to WS₁ (60a) before Parallel Merge/Sideward Movement applies. In order to derive WS₂ (60b) from WS₁ (60a) in one step, Search Σ for Merge needs to select three items, i.e., a, {b, c}, and c in {b, c}. This violates Binarity on Search Σ for Merge; hence it follows that the extensions of Merge can be ruled out without recourse to MY. Recall that according to Optimal Computation, Search Σ cannot directly select c, which is not a member of WS₁. In order to select c in {b, c}, Search Σ must first select {b, c}, a member of WS₁.
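
The difference between legitimate IM in (59) and Parallel Merge/Sideward Movement in (60) can be sketched as a difference in how many items Search must select. The code is a rough model under stated assumptions: Search may pick a workspace member directly, and an embedded term only via the member that hosts it; intermediate nodes are not counted; tuples stand in for sets; `search_selection` is a hypothetical helper, not part of the theory.

```python
def contains(so, target):
    """True if `target` is `so` itself or a term somewhere inside it."""
    if so == target:
        return True
    if isinstance(so, str):
        return False
    return any(contains(member, target) for member in so)

def search_selection(ws, targets):
    """Items Search must select to supply `targets` to Merge: each target,
    plus the workspace member hosting it when the target is embedded."""
    selected = []
    for target in targets:
        host = next(m for m in ws if contains(m, target))
        for item in (host, target):
            if item not in selected:
                selected.append(item)
    return selected

# IM in (59a): Merge {a, {b, c}} with the c inside it.
ws_59a = [("a", ("b", "c"))]
im = search_selection(ws_59a, [("a", ("b", "c")), "c"])

# Parallel Merge/Sideward Movement in (60a): Merge a with the c inside {b, c}.
ws_60a = ["a", ("b", "c")]
parallel = search_selection(ws_60a, ["a", "c"])

print(len(im))        # 2: satisfies Binarity
print(len(parallel))  # 3: a, {b, c}, and c violate Binarity
```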

5 An empirical challenge: exceptions to the freezing effect

Since we have shown in Section 4.1 that the MS-free Merge Hypothesis can capture the freezing effect, which bans extraction out of a moved phrase, any phenomenon that appears to be an exception to the freezing effect poses a problem for the hypothesis. Bošković (2018) points out that Serbo-Croatian examples like (61) are such apparent exceptions:

(61)

a.	*Jovanovu* _i	je	on	[_NP	t _i	sliku]_j	video	t_j.
	John’s.Acc	is	he			picture.Acc	seen
	‘He saw John’s picture.’

b.	*Jovanovu* _i	je [_NP	t _i	sliku]_j	ukradena	t_j.
	John’s.Acc	is		picture.Acc	stolen
	‘John’s picture is stolen.’

c.	*Jovanovu* _i	je	[_NP	t _i	prijatelj]_j	vjerovatno	t _j	otpustio	Mariju.
	John’s.Nom	is			friend.Nom	probably		fired	Maria.Acc
	‘John’s friend probably fired Maria.’

In (61a), (61b), and (61c), the possessor Jovanovu ‘John’s’ is moved/Left-Branch Extracted (LBE-ed) out of a moved NP, yet the sentences are acceptable, voiding the freezing effect. In the following, we suggest that these apparent exceptions to the freezing effect can be accommodated within the MS-free Merge Hypothesis by adopting and elaborating Chomsky’s (2021) Form Copy (FC) approach to Copy relations.

Chomsky (2021: 31) proposes that the relation Copy is not created by Merge but by FC, claiming that FC applies at the phase level and assigns a Copy relation to structurally identical inscriptions that are in a c-c(ommand) configuration. He argues that application of EM is restricted by the principle of Duality of Semantics (DoS) that “For A-positions, EM and EM alone fills a θ-position”, and application of FC is limited by the principle of Univocality, which bans an element from being assigned more than one θ-role from the same θ-assigner. Given these, he analyses raising, obligatory control, and transitive constructions as follows:

(62)

a.	John₂ seems [~~John~~₁ to have left]
b.	John₂ tried [~~John~~₁ to win]
c.	John₂ saw John₁

In (62a), the DoS requires that John₁ be introduced by EM and John₂ by IM. Since they are structurally identical inscriptions in a cc-configuration and assigned only one θ-role by leave, FC applies to them, assigning a Copy relation to <John₂, John₁>. (62a) obtains by deleting John₁. In (62b), the DoS requires that both John₁ and John₂ be introduced by EM. Since they are structurally identical inscriptions in a cc-configuration and each of them is independently assigned a θ-role by the different θ-assigners, i.e., win and try, FC applies to them, assigning a Copy relation to <John₂, John₁>. (62b) obtains by deleting John₁. (62c) is subject to the same EM process as (62b). Unlike (62b), however, John is assigned more than one θ-role by the same θ-assigner see in (62c). Hence, FC cannot apply to them due to Univocality; John₂ and John₁ are interpreted as repetitions.
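
The way Univocality separates (62a, b) from (62c) can be made concrete with a toy check. This is a deliberately simplified sketch: each pair of identical inscriptions in a cc-configuration is reduced to the list of predicates that assign its θ-roles, and the function name `form_copy_applies` is hypothetical.

```python
def form_copy_applies(theta_assigners):
    """Univocality blocks FC when one and the same θ-assigner assigns more
    than one θ-role to the pair; otherwise FC may assign a Copy relation."""
    return len(theta_assigners) == len(set(theta_assigners))

examples = {
    "(62a) raising": ["leave"],          # a single θ-role, from leave
    "(62b) control": ["win", "try"],     # two θ-roles, distinct assigners
    "(62c) transitive": ["see", "see"],  # two θ-roles, same assigner
}

for label, assigners in examples.items():
    relation = "Copy" if form_copy_applies(assigners) else "repetition"
    print(label, "->", relation)
# (62a) raising -> Copy
# (62b) control -> Copy
# (62c) transitive -> repetition
```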

Although the DoS only limits application of Merge to A-positions, Chomsky (2021: 30, fn. 44) suggests: “This [= the DoS] can be extended to A′-positions by generalizing θ-role to include positions of the left periphery in Rizzi’s sense.” Following Chomsky’s suggestion, we propose that Rizzi’s (2006) Criterial Freezing (63) counts as the A′-position (i.e., left periphery position) counterpart of the DoS:^[26]

(63)

Criterial Freezing: An element satisfying a criterion is frozen in place.

This states that for A′-positions, IM cannot apply to an element in a criterial position. We argue that criterial features in the A′-system correspond with θ-roles in the A-system so that Criterial Freezing limits application of Merge to an A′-position just as the DoS limits that of Merge to an A-position.

Based on the fact that in languages like Serbo-Croatian, elements in the nominal left periphery undergo Case and φ-feature agreement within NP as in (64) (Despić 2013: 247), we propose that a bundle of Case and φ-features in the nominal left periphery counts as a criterial feature.

(64)

onih	Milanovih	zelenih
those.Fem.Pl.Gen	Milan’s.Fem.Pl.Gen	green.Fem.Pl.Gen
knjiga
books.Fem.Pl.Gen

Given this assumption, we can analyze (61a) as in (65) (where we use English glosses for the sake of exposition):

(65)

a.	WS₁ = [{is, {_TP he {_NP John’s₁ picture}, {_vP seen, …
b.	WS₂ = [{John’s₂ {is, {_TP he {_NP John’s₁ picture}, {_vP seen, …

(65a) is the WS before John’s is extracted out of the fronted NP object. We assume with Bošković (2014, 2018) that languages like Serbo-Croatian do not have overt D elements or project DP above NP. We also assume with Arano and Oda (2019) that crossing a nominal boundary as well as a clause boundary forces A′-movement. Given that NP is a phase in Serbo-Croatian, SPEC-N, being the edge of the phase used as an escape hatch, counts as an A′-position. Since John’s₁ in SPEC-N, which undergoes Case and φ-feature agreement within NP, is in a criterial A′-position, Criterial Freezing (63) prevents IM from applying to John’s₁ in WS₁ (65a). Hence, when we are to map WS₁ (65a) to WS₂ (65b), John’s₂ must be introduced by EM. John’s₂ and John’s₁ are structurally identical inscriptions in a cc-configuration; FC applies to them, assigning a Copy relation to <John’s₂, John’s₁>. (61a) obtains by deleting John’s₁. The acceptability of (61b) and (61c) can be accounted for in the same way.^[27], ^[28], ^[29], ^[30] Hence, the exceptions to the freezing effect can be accommodated by extending Chomsky’s FC analysis from A-positions to A′-positions in terms of Criterial Freezing, in the sense that application of Merge to an A-position is constrained by θ-roles whereas application of Merge to an A′-position is constrained by criterial features.^[31]

A remaining question is how to rule out typical cases of criterial freezing such as (66), whose relevant WS is (67):

(66)

*[_CP which book does [Bill wonder [_CP t’ [she read t]]]]?

(67)

WS = [{_CP which book₂ {C, {Bill wonder {which book₁ {C { …

Unless stipulated otherwise, nothing would prevent which book₂ in SPEC-C from being introduced by EM, and FC assigns a Copy relation to <which book₂, which book₁>. This would wrongly predict that (66) is grammatical, contrary to fact. We thus propose (68) as the A′-position counterpart of Univocality (cf. Gallego 2009). Call it A′-oriented Univocality:

(68)

One and only one A′-oriented interpretation is assigned to elements in a Copy relation.

In (67), since which book₂ and which book₁, which are in a Copy relation, are independently assigned an A′-oriented interpretation, i.e., an interrogative interpretation, in the matrix clause periphery and the embedded clause periphery, respectively, <which book₂, which book₁> is assigned more than one A′-oriented interpretation. This violates (68). Hence the ungrammaticality of (66) can be accounted for as an intolerable situation for interpretation just like the transitive construction (62c), where the subject and the object form a Copy relation.^[32]

Note that Univocality relevant for (62a) and A′-oriented Univocality relevant for (66) are parallel to each other in that it results in “an intolerable situation for interpretation” when an element is assigned more than one relevant interpretation, i.e., a θ-role for the former or a scope/discourse-oriented interpretation for the latter. It should be noted that there are points where these principles are not completely parallel to each other. They differ as to whether an element is assigned the relevant interpretation by the same head. Recall that while an argument can be assigned more than one θ-role by distinct heads, an element cannot have more than one discourse interpretation in relation to distinct functional heads in different clauses. We thus argue that with respect to θ-relation, if an element is assigned more than one θ-role from the same element, it counts as “an intolerable situation for interpretation”, and with respect to criterial relation, if an element is assigned more than one criterial feature (not necessarily from the same element), it counts as “an intolerable situation for interpretation”. It remains to be investigated, however, how to derive this difference between these two univocality principles.^[33]

6 Conclusions

We have proposed that Search Σ for Merge is free from Minimal Search (the MS-free Merge Hypothesis), and demonstrated that the hypothesis is empirically adequate in that it provides a unified account of the various movement restrictions that were dealt with by different constraints or principles, such as (i) the freezing effect, (ii) the that-trace effect, (iii) the anti-locality effect, (iv) the vacuous movement hypothesis, and (v) the economy of derivation. We have also argued that our hypothesis is theoretically desirable in that it makes it possible to (i) eliminate the potential incompatibility of Merge and MS; (ii) solve the redundancy problem between MS and the PIC; (iii) rule in the legitimate IM without recourse to MS; and (iv) rule out the so-called extensions of Merge without assuming Minimal Yield. We have furthermore strengthened our theory by tackling the empirical challenge to our theory, and suggesting that the exceptions to the freezing effect can be accommodated by extending Chomsky’s Form Copy analysis from A-positions to A′-positions.

Corresponding author: Nobu Goto, Toyo University, Bunkyo City, Japan, E-mail: ngoto@toyo.jp

Portions of this paper have been presented at the workshop of “Workspace, MERGE, and Labeling” at Generative Linguistics in the Old World in Asia XIII (August 4–7, 2022). We would like to thank the audience and in particular, Mamoru Saito, Željko Bošković, Myung-Kwan Park, and Rajesh Bhatt for their helpful comments and suggestions on this work, and Naoki Fukui for insightful comments on an earlier version of this paper. This work is supported by JSPS KAKENHI Grant Number 19K00692.

References

Arano, Akihiko & Hiromune Oda. 2019. The A-/A’-distinction in scrambling revisited. In Richard Stockwell, Maura O’Leary, Zhongshi Xu & Z.L. Zhou (eds.), Proceedings of the 36th West Coast Conference on Formal Linguistics, 48–54. Somerville, MA: Cascadilla Proceedings Project.

Bošković, Željko. 1997. The syntax of nonfinite complementation: An economy approach. Cambridge, MA: MIT Press.

Bošković, Željko. 2004. Be careful where you float your quantifiers. Natural Language & Linguistic Theory 22(4). 681–742. https://doi.org/10.1007/s11049-004-2541-z.

Bošković, Željko. 2005. Left branch extraction, structure of NP, and scrambling. In Joachim Sabel & Mamoru Saito (eds.), The free word order phenomenon: Its syntactic sources and diversity, 13–73. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110197266.13.

Bošković, Željko. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38(4). 589–644. https://doi.org/10.1162/ling.2007.38.4.589.

Bošković, Željko. 2008. What will you have, DP or NP? In Emily Elfner & Martin Walkow (eds.), Proceedings of the 37th Annual Meeting of the North East Linguistic Society (1), 101–114. Amherst: University of Massachusetts, Graduate Linguistic Student Association.

Bošković, Željko. 2014. Now I’m a phase, now I’m not a phase: On the variability of phases with extraction and ellipsis. Linguistic Inquiry 45(1). 27–89. https://doi.org/10.1162/ling_a_00148.

Bošković, Željko. 2016. On the timing of labeling: Deducing comp-trace effects, the subject condition, the adjunct condition, and tucking in from labeling. The Linguistic Review 33(1). 17–66. https://doi.org/10.1515/tlr-2015-0013.

Bošković, Željko. 2018. On movement out of moved elements, labels, and phases. Linguistic Inquiry 49(2). 247–282. https://doi.org/10.1162/ling_a_00273.

Bošković, Željko. 2020. On the coordinate structure constraint, across-the-board-movement, phases, and labeling. In Jeroen van Craenenbroeck, Cora Pots & Tanja Temmerman (eds.), Recent developments in phase theory, 133–182. Berlin/Boston: De Gruyter Mouton. https://doi.org/10.1515/9781501510199-006.

Bošković, Željko. 2022. On Wh and subject positions and the EPP. Paper presented at GLOW in Asia XIII, The Chinese University of Hong Kong, August 4–7.

Bošković, Željko. To appear. The comp-trace effect and contextuality of the EPP. In Proceedings of the 39th West Coast Conference on Formal Linguistics. Somerville, MA: Cascadilla Proceedings Project.

Bresnan, Joan. 1977. Variables in the theory of transformations. In Peter Culicover, Thomas Wasow & Adrian Akmajian (eds.), Formal syntax, 157–196. New York: Academic Press.

Chomsky, Noam. 1973. Conditions on transformations. In Stephen Anderson & Paul Kiparsky (eds.), A Festschrift for Morris Halle, 232–286. New York: Holt, Rinehart & Winston.

Chomsky, Noam. 1982. Some concepts and consequences of the theory of government and binding. Cambridge, MA: MIT Press.

Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.

Chomsky, Noam. 1995. The minimalist program. Cambridge, MA: MIT Press.

Chomsky, Noam. 2001. Derivation by phase. In Michael Kenstowicz (ed.), Ken Hale: A life in language, 1–52. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/4056.003.0004.

Chomsky, Noam. 2008. On phases. In Robert Freidin, Carlos P. Otero & Maria Luisa Zubizarreta (eds.), Foundational issues in linguistic theory: Essays in honor of Jean-Roger Vergnaud, 133–166. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/7713.003.0009.

Chomsky, Noam. 2013. Problems of projection. Lingua 130. 33–49. https://doi.org/10.1016/j.lingua.2012.12.003.

Chomsky, Noam. 2015. Problems of projection: Extensions. In Elisa Di Domenico, Cornelia Hamann & Simona Matteini (eds.), Structures, strategies and beyond – studies in honor of Adriana Belletti, 3–16. Amsterdam/Philadelphia: John Benjamins.

Chomsky, Noam. 2021. Minimalism: Where are we now, and where can we hope to go. Gengo Kenkyu 160. 1–41.

Chomsky, Noam. To appear. The miracle creed and SMT. In Matteo Greco & Davide Mocci (eds.), A Cartesian dream: A geometrical account of syntax. In honor of Andrea Moro, Rivista di Grammatica Generativa/Research in Generative Grammar. Lingbuzz Press. Available at: http://www.icl.keio.ac.jp/news/2023/Miracle%20Creed-SMT%20FINAL%20%2831%29%201-23.pdf.

Citko, Barbara. 2005. On the nature of Merge: External merge, internal merge, and parallel merge. Linguistic Inquiry 36(4). 475–496. https://doi.org/10.1162/002438905774464331.

Despić, Miloje. 2013. Binding and the structure of NP in Serbo-Croatian. Linguistic Inquiry 44(2). 239–270. https://doi.org/10.1162/ling_a_00126.

Doherty, Cathal. 2000. Clauses without “that”: The case for bare sentential complementation in English. New York/London: Garland.

Douglas, Jamie. 2017. Unifying the that-trace and anti-that-trace effects. Glossa: A Journal of General Linguistics 2(60). 1–28. https://doi.org/10.5334/gjgl.312.

Engdahl, Elisabet. 1983. Parasitic gaps. Linguistics and Philosophy 6(1). 5–34. https://doi.org/10.1007/bf00868088.

Fiengo, Robert, C.-T. James Huang, Howard Lasnik & Tanya Reinhart. 1988. The syntax of wh-in situ. In Hagit Borer (ed.), Proceedings of the 7th West Coast Conference on Formal Linguistics, 81–98. Stanford: Center for the Study of Language and Information.

Fukui, Naoki. 1986. A theory of category projection and its application. Cambridge, MA: MIT dissertation.

Gallego, Ángel. 2009. On freezing effects. Iberia: An International Journal of Theoretical Linguistics 1(1). 33–51.

George, Leland. 1980. Analogical generalization in natural language syntax. Cambridge, MA: MIT dissertation.

Goto, Nobu & Toru Ishii. 2020. The determinacy theory of movement. In Mariam Asatryan, Yixiao Song & Ayana Whitmal (eds.), Proceedings of the 50th Annual Meeting of the North East Linguistic Society (2), 29–38. Amherst, MA: Graduate Linguistics Student Association, University of Massachusetts Amherst.

Ishii, Toru. 1997. An asymmetry in the composition of phrase structure and its consequences. California: University of California, Irvine dissertation.

Ishii, Toru. 2004. The phase impenetrability condition, the vacuous movement hypothesis and that-t effects. Lingua 114(2). 183–215. https://doi.org/10.1016/s0024-3841(03)00045-7.

Ishii, Toru. 2011. The subject condition and its crosslinguistic variations. In Christina Galeano, Emrah Görgülü & Irina Presnyakova (eds.), Proceedings of the 40th Western Conference on Linguistics, vol. 21, 407–418. Fresno: Department of Linguistics, California State University.

Johnson, Kyle. 1991. Object positions. Natural Language & Linguistic Theory 9(4). 577–636. https://doi.org/10.1007/bf00134751.

Kasai, Hironobu. 2018. Case valuation after scrambling: Nominative objects in Japanese. Glossa: A Journal of General Linguistics 3(1). 1–29. https://doi.org/10.5334/gjgl.676.

Kayne, Richard. 1984. Connectedness and binary branching. Dordrecht: Foris. https://doi.org/10.1515/9783111682228.

Ke, Alan Hezao. 2023. Can Agree and Labeling be reduced to minimal search? Linguistic Inquiry. 1–22. https://doi.org/10.1162/ling_a_00481.

Kishimoto, Hideki. 2010. Binding of indeterminate pronouns and clause structure in Japanese. Linguistic Inquiry 32(4). 597–633. https://doi.org/10.1162/002438901753373014.

Kornfilt, Jaklin. 2012. Revisiting ‘suspended affixation’ and other coordinate mysteries. In Laura Brugé, Anna Cardinaletti, Giuliana Giusti, Nicola Munaro & Cecilia Poletto (eds.), Functional heads: The cartography of syntactic structures, vol. 7, 181–196. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199746736.003.0014.

Kuroda, Shige-Yuki. 1988. Whether we agree or not: A comparative syntax of English and Japanese. Linguisticae Investigationes 12. 1–47. https://doi.org/10.1075/li.12.1.02kur.

Lasnik, Howard. 2001. Subjects, objects, and the EPP. In William D. Davies & Stanley Dubinsky (eds.), Objects and other subjects: Grammatical functions, functional categories, and configurationality, 103–121. Dordrecht: Kluwer. https://doi.org/10.1007/978-94-010-0991-1_5.

Lasnik, Howard & Mamoru Saito. 1992. Move α: Conditions on its application and output. Cambridge, MA: MIT Press.

Lasnik, Howard & Myung-Kwan Park. 2003. The EPP and the subject condition under sluicing. Linguistic Inquiry 34(4). 649–660. https://doi.org/10.1162/ling.2003.34.4.649.

Lebeaux, David. 1991. Relative clauses, licensing and the nature of derivations. In Susan Rothstein & Margaret Speas (eds.), Phrase structure, heads and licensing, 209–239. San Diego, CA: Academic Press. https://doi.org/10.1163/9789004373198_011.

Legate, Julie Anne. 2002. Some interface properties of the phase. Linguistic Inquiry 34(3). 506–516. https://doi.org/10.1162/ling.2003.34.3.506.

Messick, Troy. 2020. The derivation of highest subject questions and the nature of the EPP. Glossa: A Journal of General Linguistics 5(1). 1–12. https://doi.org/10.5334/gjgl.1029.

Neeleman, Ad, Elena Titov, Hans van de Koot & Reiko Vermeulen. 2009. A syntactic typology of topic, focus and contrast. Studies in Generative Linguistics 100. 15–51. https://doi.org/10.1515/9783110217124.15.

Nunes, Jairo. 1995. The copy theory of movement and linearization of chains in the minimalist program. Maryland: University of Maryland, College Park dissertation.

Perlmutter, David. 1971. Deep and surface structure constraints in syntax. New York: Holt, Rinehart & Winston.

Ramchand, Gillian. 2008. Verb meaning and the lexicon. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486319.

Riemsdijk, Henk van. 1997. Push chains and drag chains: Complex predicate split in Dutch. In Shigeo Tonoike (ed.), Scrambling, 7–33. Tokyo: Kurosio.

Rizzi, Luigi. 2006. On the form of chains: Criterial positions and ECP effects. In Lisa Lai-Shen Cheng & Norbert Corver (eds.), Wh-movement: Moving on, 97–133. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/7197.003.0010.

Rizzi, Luigi & Ur Shlonsky. 2007. Strategies of subject extraction. In Uli Sauerland & Hans-Martin Gärtner (eds.), Interfaces + recursion = language? Chomsky’s minimalism and the view from syntax-semantics, 115–160. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110207552.115.

Saito, Mamoru. 1985. Some asymmetries in Japanese and their theoretical implications. Cambridge, MA: MIT dissertation.

Saito, Mamoru & Naoki Fukui. 1998. Order in phrase structure and movement. Linguistic Inquiry 29(3). 439–474. https://doi.org/10.1162/002438998553815.

Weisler, Steven. 1980. The syntax of that-less relatives. Linguistic Inquiry 11(3). 624–631.

Wexler, Kenneth & Peter Culicover. 1980. Formal principles of language acquisition. Cambridge, MA: MIT Press.

Published Online: 2024-01-19

Published in Print: 2024-02-26

This work is licensed under the Creative Commons Attribution 4.0 International License.

DOI: doi.org/10.1515/tlr-2024-2005
