Pesetsky Language-Particular Processes and The Earliness Principle
LANGUAGE-PARTICULAR PROCESSES AND THE EARLINESS PRINCIPLE*
David Pesetsky/MIT
June 1989
(1) Principles of Economy
a. Principle of Least Effort: If two derivations from a given
D-structure each yield legitimate outputs, and one contains
more transformational steps than the other, only the shorter
derivation is grammatical.
b. Last Resort Principle: “UG principles are applied wherever
possible, with language-particular rules used only to “save” a
D-structure yielding no output.”
This paper proposes an alternative to these principles. Chomsky’s arguments for his
approach involve a reanalysis of Pollock’s (1989) discoveries concerning verb movement and
INFL. I will begin by briefly summarizing how Chomsky’s system works.
Chomsky uses a filter due to Lasnik (1981) as a prime mover for the various
transformations and insertions found in the English and French verbal auxiliary system. Lasnik’s
filter requires morphemes designated as affixes to be “supported” by lexical material at PF. It is
informally stated in (2):
(2) Lasnik’s Filter: An affix must be lexically supported at PF.
*This is a very rough draft, minimally expanded from a talk read at GLOW 1989. Here and there I indicate what I intend to elaborate on
later. In particular, the citations and footnotes are quite incomplete, and dates for citations have been supplied from memory.
(3) V-to-I Raising
a. Marie ne parle_i pas e_i français.
b. Marie parle_i souvent e_i français.
I-to-V Lowering
c. *Marie ne e_i pas parl-e_i français.
d. *Marie e_i souvent parl-e_i français.
On the other hand, we know that Lowering of I-to-V does exist, but only where V-to-I
is impossible, as it is with English main verbs:
(4) V-to-I Raising
a. *Bill speaks_i not e_i French.
b. *Bill speaks_i often e_i French.
I-to-V Lowering
c. Bill [INFL e_i] speak-s_i French.
d. Bill often [INFL e_i] speak-s_i French.
Why does V-to-I Raising take precedence over I-to-V Lowering? Chomsky suggests
that length of derivation makes the crucial distinction. Given the ECP, it is somewhat surprising that
Affix Lowering should be allowed at all. Chomsky proposed that the ECP is satisfied for the trace
of finite affix Lowering because at LF the trace position is actually filled by Raising of V back to
INFL. Crucially, this sort of “round trip” derivation involves two steps, Lowering and Raising,
while a derivation that simply raises V to INFL involves only one, Raising. The “Least Effort”
principle requires that the shorter derivation be picked: hence the round trip induced by Lowering is
picked only when the one-way trip involving S-structure Raising is unavailable.
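The comparison just described can be made concrete with a small Python sketch. It is an illustration only, not part of the analysis itself: the class `Derivation`, the function `least_effort`, and the step labels are names assumed here, and the legitimacy of each derivation is simply stipulated rather than derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    label: str
    steps: tuple      # transformational steps, in order of application
    legitimate: bool  # stipulated: does the derivation yield a legitimate output?


def least_effort(derivations):
    """Return the legitimate derivations that use the fewest steps."""
    legit = [d for d in derivations if d.legitimate]
    if not legit:
        return []
    fewest = min(len(d.steps) for d in legit)
    return [d for d in legit if len(d.steps) == fewest]


RAISE_SS = "V-to-I Raising (S-structure)"
LOWER_SS = "I-to-V Lowering (S-structure)"
RAISE_LF = "V-to-I Raising (LF)"

# French finite clause: the one-step derivation is legitimate, so it is chosen.
french = [
    Derivation("one-way trip", (RAISE_SS,), True),
    Derivation("round trip", (LOWER_SS, RAISE_LF), True),
]

# English main verb: S-structure Raising is unavailable, so only the
# two-step round trip remains.
english = [
    Derivation("one-way trip", (RAISE_SS,), False),
    Derivation("round trip", (LOWER_SS, RAISE_LF), True),
]

french_choice = [d.label for d in least_effort(french)]
english_choice = [d.label for d in least_effort(english)]
print(french_choice, english_choice)
```

Note that the function has to inspect whole derivations in order to count their steps.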
The round trip derivation found with I-to-V Lowering is empirically supported by the
Head Movement Constraint (HMC) effects (Travis 1984) induced by negation. Chomsky follows
the analysis assumed in the first part of Pollock (1989) in treating negation as a head of a NegP (but
cf. section ??? for an alternative suggestion, also due to Pollock). If the HMC derives from the
ECP, then an intervening negation should not affect the S-structure Lowering portion of the round
trip, but should block the LF Raising portion. Indeed, it is exactly in negative clauses that simple
I-to-V Lowering is impossible. Instead, an inflected do is inserted above Neg to support INFL. This is
Do-support:
(5)a. *Bill e_i not speak-s_i French.
b. Bill does not speak French.
Let us assume that the movements and insertions that are assumed on Chomsky’s
analysis are correct, including the round trip derivation for cases of finite-I-to-V Lowering. In the
beginning, however, I will assume a simplified version of his analysis in order to keep our rather
complicated discussion manageable. In particular, two questions arise with respect to Chomsky’s
analysis that are quite important, but avoidable for the moment. The first is seen in (7a):
(7)a. Question: Why can main verbs in English raise to INFL
at LF, but not at S-structure?
b. Provisional Answer: English INFL is “θ-opaque” at S-structure,
but the effects of this type of opacity are turned off at LF.
Chomsky’s actual answer to this question follows Pollock in part by relying on the
structural distinction between an I adjoined to a V (which is a V) and a V adjoined to an I (which is
an I).3
Another question can be seen in (8a):
(8)a. Question: Why can have and be verbs in English, and all verbs
in French finite clauses, raise over negation at S-structure,
while negation blocks raising at LF?
b. Provisional Answer: Not, like Pollock’s pas, is a modifier of
NegP, not its head. The head of NegP is in fact empty at
S-structure, allowing movement through it, but is filled at LF,
blocking movement through it.
The phenomenon of LF lexical insertion of functional heads has been argued to exist
for the English complementizer for in Pesetsky (in prep.). Chomsky’s actual answer to (8a) is quite
different, and has to do with issues to which I return in section ???.
The simplifications in (7) and (8) will introduce certain distortions into our argument,
which will be cleared up in section ???. Leaving the questions in (7) and (8) aside, and assuming the
basic facts of Chomsky’s analysis, I will propose an alternative characterization of the reasons for
the various steps.
(11) At what level is Lasnik’s Filter satisfied?
a. V-to-I Raising           │ S-structure.
────────────────────────────┼──────────────
b. (Finite) I-to-V Lowering │ LF
   + V-to-I Raising         │
────────────────────────────┼──────────────
c. Do-support               │ LP-structure.
Now recall that the preferences for satisfaction of Lasnik’s filter are as in (12):
Putting (11) and (12) together, it is clear that the combined effects of Chomsky’s Least
Effort and Last Resort principles can be captured by the Earliness Principle in (13) below:
(13) Earliness Principle: Satisfy filters as early as possible on the
hierarchy of levels: (DS >) SS > LF > LP.4
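As an informal illustration, the choice imposed by (13) can be sketched in Python. The pairing of strategies with levels follows (11); the names `LEVELS`, `STRATEGY_LEVEL` and `earliest_strategy`, and the three availability sets at the bottom, are assumptions made for this sketch. Which strategies are available in a given clause is stipulated, not derived.

```python
# Hierarchy of levels from (13), with the optional DS omitted.
LEVELS = ["SS", "LF", "LP"]

# Level at which each strategy satisfies Lasnik's Filter, following (11).
STRATEGY_LEVEL = {
    "V-to-I Raising": "SS",
    "I-to-V Lowering + LF Raising": "LF",
    "Do-support": "LP",
}


def earliest_strategy(available):
    """Pick the available strategy that satisfies the filter at the earliest level."""
    candidates = [s for s in available if s in STRATEGY_LEVEL]
    if not candidates:
        return None
    return min(candidates, key=lambda s: LEVELS.index(STRATEGY_LEVEL[s]))


# Hypothetical availability sets, stipulated for illustration.
french_finite = {"V-to-I Raising", "I-to-V Lowering + LF Raising"}
english_main_verb = {"I-to-V Lowering + LF Raising", "Do-support"}
english_main_verb_negated = {"Do-support"}

print(earliest_strategy(french_finite))
print(earliest_strategy(english_main_verb))
print(earliest_strategy(english_main_verb_negated))
```

The selection looks only at the level at which the filter is satisfied; it never counts steps or compares whole derivations.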
1.3 Differences
The Earliness Principle differs in three salient respects from Chomsky’s proposals.
First, it does not require examination of derivations as a whole, but only the structure of
chains affected by the Structural Description of Filters. This difference may have some
computational interest, but since I have not attempted to examine these questions, I will leave them
for now.
Second, Earliness is a homogeneous condition, unlike the two distinct conditions of
Economy given in (1). In the next part of this paper, I will try to establish that a “homogeneous”
Earliness Principle is feasible in one crucial respect. The Earliness Principle relies upon the idea
that Language-Particular insertion rules like do-support are not merely marked as Language-Particular,
but actually apply at a level of representation set aside for rules of this sort. This idea
must be defended. I will attempt to do just this, by developing two arguments that do-support does
not feed Move α. The arguments will involve inversion constructions and adverb placement.
Third, the Earliness Principle is both stronger and weaker than Economy, in ways that I
will argue are advantageous:
2 LP Structure: The Inertness of do
The grammar we are arguing for has the articulations in (14):
               ┌────── LF
(14) DS─────SS─┤
               └────── LP
This grammar predicts that LP insertion rules should not feed any S-structure
processes. Suppose Move α is part of the mapping from D-structure to S-structure and LF, but not
part of the mapping from S-structure to LP-structure. We predict that inserted do should not
undergo the sort of head-movement that other verbs undergo. I will provide two arguments that this
is correct. The first argument involves the inability of do to undergo I-to-C Raising. The second
argument involves the inability of do to raise over adverbs.
(15) a. If John had read the book, he would have known the answer.
b. Had John read the book, he would have known the answer.
c. *If had John read the book, he would have known the answer.
d. *Had if John read the book, he would have known the answer.
(16) a. If John had solved the problem, he would have shown up.
b. If Mary should meet him, she would certainly come and
tell us. [“non-obligational should”]
c. If John were to solve the problem, we would be happy.
d. If Mary were dying, she would look worse.
e. If Mary could speak French, she would have shown up.
f. ?If we were to take out the garbage every day, they would
have left us a note.
(on the reading “if we were to” ≈ “If we were expected to”)
g. If John would drive a little faster, he would get there a
little sooner. [“agentive would”]
h. *If Mary can speak French, she would have translated for us.
i. *If Bill may leave, we would have been told.
j. *If Bill might leave, Sue would have informed us.
k. *If Sue must take out the garbage each morning, she would have
asked for higher wages. [cf. (93f)]
l. *If Bill ought to take out the garbage each morning, Sue would
have informed us.
m. *If Mary shall speak French, she would have started already.
n. *If Bill should take out the garbage, we would have known about it.
[“obligational should”]
o. *If Sue will speak French, she would have told us.
(17) shows that paraphrases of many of the bad modals in (16) are acceptable. We
therefore can’t look to their meaning for a simple answer:
(17)a. If Bill were permitted to leave, Sue would have informed us.
(cf. (16g))
b. If Bill were supposed to take out the garbage each morning,
Sue would have informed us. (cf. (16l),(16n))
With should treated in this fashion, the generalization in (20) appears to be true:
(20) What can occur in the protasis of a counterfactual?
An auxiliary or modal α is acceptable in INFL of the protasis of a
counterfactual iff α is acceptable as a past-tense form.
2.1.3 Do in counterfactual conditionals
Now let us consider the do of do-support. Do behaves as predicted so far. It is
acceptable as a past-tense form, as seen in (22):
(24) Auxiliaries that independently occur in counterfactuals with ‘if’ (surprising facts):
h. *Can Mary speak French, she would have translated for us.
i. *May Bill leave, we would have been told.
j. *Might Bill leave, Sue would have informed us.
k. *Must Sue take out the garbage each morning, she would have asked for
higher wages. [cf. (93f)]
l. *Ought Bill to take out the garbage each morning, Sue would have
informed us.
m. *Shall Mary speak French, she would have started already.
n. *Should Bill take out the garbage, we would have known about it.
o. *Will Sue speak French, she would have told us.
(25)
                    │ok in   │ok as  │
                    │counter-│true   │
                    │factual │Past   │ undergoes
                    │protasis│tense  │ I-to-C
════════════════════╪════════╪═══════╪═══════════
a. had              │   +    │   +   │    +
b. should/would     │   +    │   +   │    +      non-θ-assigners
c. were-to(NON-OBL) │   +    │   +   │    +
d. were             │   +    │   +   │    +
────────────────────┼────────┼───────┼───────────
e. could            │   +    │   +   │    –
f. were-to(OBL)     │   ?    │   +   │    –      θ-assigners
g. would(AGENT)     │   +    │   +   │    –
────────────────────┼────────┼───────┼───────────
h. can              │   –    │   –   │    –
i. may              │   –    │   –   │    –
j. might            │   –    │   –   │    –
k. must             │   –    │   –   │    –
l. ought            │   –    │   –   │    –
m. shall(AGENT)     │   –    │   –   │    –
n. should(OBLIG)    │   –    │   –   │    –
o. will             │   –    │   –   │    –
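The pattern in table (25) can be checked mechanically. The following Python sketch transcribes the table and tests two statements against it: generalization (20), and the claim that, among the forms the table classifies, I-to-C is possible just in case the form is a non-θ-assigner. The encoding is an assumption of this sketch: the marginal “?” in row (f) is counted as acceptable, and rows (h)-(o), which the table does not classify for θ-assignment, are marked `None` and skipped by the second check.

```python
# Rows of (25): (form, ok in protasis, ok as past tense, undergoes I-to-C,
#                theta-assigner?). None = not classified in the table.
TABLE_25 = [
    ("had",               "+", "+", "+", False),
    ("should/would",      "+", "+", "+", False),
    ("were-to (NON-OBL)", "+", "+", "+", False),
    ("were",              "+", "+", "+", False),
    ("could",             "+", "+", "-", True),
    ("were-to (OBL)",     "?", "+", "-", True),
    ("would (AGENT)",     "+", "+", "-", True),
    ("can",               "-", "-", "-", None),
    ("may",               "-", "-", "-", None),
    ("might",             "-", "-", "-", None),
    ("must",              "-", "-", "-", None),
    ("ought",             "-", "-", "-", None),
    ("shall (AGENT)",     "-", "-", "-", None),
    ("should (OBLIG)",    "-", "-", "-", None),
    ("will",              "-", "-", "-", None),
]


def violates_20(row):
    """(20): acceptable in the protasis iff acceptable as a past-tense form."""
    _, protasis, past, _, _ = row
    return (protasis != "-") != (past == "+")


def violates_theta_cut(row):
    """Classified forms should undergo I-to-C iff they are non-theta-assigners."""
    _, _, _, i_to_c, theta = row
    if theta is None:
        return False
    return (i_to_c == "+") != (not theta)


exceptions_20 = [row[0] for row in TABLE_25 if violates_20(row)]
exceptions_cut = [row[0] for row in TABLE_25 if violates_theta_cut(row)]
print(exceptions_20, exceptions_cut)
```

Under these encoding choices neither check finds an exception in the table as given.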
The contrast between (25a-d) and (25e-g) is immediately and strikingly reminiscent of
Pollock’s description of verb movement to INFL in English and in French infinitives.
Recall that Pollock suggested that certain instances of INFL are designated
“[+θ-opaque]” (Pollock called them simply “opaque”). A θ-assigner that moves to such an INFL
cannot assign its θ-role through its trace; this creates a θ-criterion violation. It is worth noting that
the arguments for Pollock’s Description were somewhat weak in that they rested on a particular
theory (due to Guéron and Kayne) under which possessional have is treated as a licensor of a
possessional Small Clause, rather than as a θ-assigning predicate in its own right. This assumption
was necessary to account for V-to-I by possessional have in some registers of English usage. If
have were itself a θ-assigner, such movement should be impossible:
The evidence mounts for the correctness of this linkage between θ-assignment and
movement when we observe the same generalization governing I-to-C in (25). I-to-C affects a
wider range of verbal elements than are affected by V-to-I, because I-to-C also affects elements that
are base-generated in INFL, e.g. the modals.
I thus argue that conditional COMP is θ-opaque just like English INFL, and that the
cut between (25a-d) and (25e-g) is simply the cut between θ-assigners and non-θ-assigners.
Whether I-to-C in these cases is an adjunction, as in Pollock’s analysis of movement to T and to
AGR, or is a substitution, as the complementary distribution with if suggests, I will leave open.
One point should be noted here, which might cause confusion. The same θ-assigning
modals that cannot move to C may, of course, occur in INFL. Why is this? An answer can be
given if the effect of θ-opacity is crucially tied to movement (whether specifically adjunction, as in
Pollock’s work, I leave open). There is evidence, familiar since Emonds, that English modals are
Finally, a speculation. We have accounted for which of the verbs that occur in
counterfactuals can also occur in inverted counterfactuals. We have not accounted for the
restriction of I-to-C to counterfactuals in the first place. I suggest that the restriction of I-to-C to
counterfactuals is also an example of Pollock’s Description. Assume that the availability of a past
tense form for a modal really means the availability of an unmarked tense form to which a
past-tense interpretation can be assigned in certain environments. Imagine that the relation borne by
tense to its clause is in some fashion analogous to the relation of a θ-assigner to its arguments. The
impossibility of moving to C a form with full tense interpretation can be likewise analogized to the
impossibility of moving a θ-assigner to a θ-opaque category. This would prevent all but verbs in an
unmarked tense form from moving to C in English. Among conditionals, this in effect restricts
movement to counterfactuals.
2.1.5 Non-inversion with do
Leaving these speculations aside, now consider do. We saw in (22)-(23) that do
participates in counterfactuals. This is not surprising, since it has a past-tense form in accordance
with (20).9
Now let us consider do’s expected behavior with respect to Pollock’s observations. Do
is a non-θ-assigner par excellence. This can be easily demonstrated by (28):
(28) Do is a non-θ-assigner
There didn’t seem to be any need for this example.
In this light, it is very surprising that it does not undergo I-to-C, as can be seen in (29):
I would like to briefly propose a solution to this problem. With Hale (English), Diesing
(Yiddish), and Rögnvaldsson/Thráinsson (Icelandic), I explain the lack of restrictions on modals
and do in these constructions by showing that they are not instances of I-to-C; in fact, they are not
instances of I-to-anywhere. Instead, in keeping with an old suggestion of McCawley’s, made in
“English as a VSO Language”, let us assume that the “inversion” seen in questions involves not
INFL raising to C, but incomplete subject raising within IP.
(33) Exceptional Case Marking by INFL
Let us also propose that the converse of is true: embedded clauses are always CPs.
This is clearly true for tensed clauses, as seen again by (35b).12 In other work (Pesetsky (in
preparation)), I defend this assumption for infinitives as well (arguing against bare IPs or
CP-deletion analyses).
I thus assume (36)
The idea that [SPEC,IP] can be an A-bar position stems from work by Diesing on
Yiddish, and by Rögnvaldsson & Thráinsson on Icelandic. Icelandic and Yiddish differ from
English in allowing [SPEC,IP] to function as an A-bar position not only for question words and
negatives, but also for topics— yielding the well-known phenomenon of “embedded V2” in these
languages: the verb moves to INFL, and the topic moves to [SPEC,IP].14 The proposal for English
is summarized in (38):
This analysis is supported by data from topicalization and from the distribution of do-support
in questions.
Topicalization evidence: Suppose matrix WH-movement lands in (SPEC,IP), but
embedded WH-movement lands in (SPEC,CP), as I have suggested. Consider the consequences of
Baltin’s observation that English topics are IP-adjoined. Topics should land to the left of a moved
WH-phrase in a matrix question, and to the right of the moved WH-phrase in an embedded
question. While all topicalization in a question is somewhat odd, these predictions seem correct, as
the data in (39)-(42) show.15
Matrix questions
Embedded questions
(41)a. ?I wonder [CP why [IP A book like this, I should buy]].
b. ?I wonder to whom this book we should give?
c. ?Tell me what to Bill you’re going to give for Christmas?
d. ?Ask him what book to John he would give.
e. ?I need to know what with Bill he’s going to discuss.
(42)a. *I wonder [CP A book like this, [CP why I should buy]]?
b. *I wonder this book to whom we should give?
c. *I need to know to John what he would give.
d. *Tell me this book to whom you will give.
Finally, in embedded questions, which are CPs, the subject can Raise to (SPEC,IP), so
that there is no problem with nominative Case assignment in the normal fashion, and do is not
needed.
Actually, something more needs to be said, given that we are deriving the “Last Resort”
character of do Insertion from the Earliness Principle, rather than from any “Last Resort Principle”
as such. In fact, Earliness as developed so far requires INFL Lowering in examples like (44)b,
thereby preventing Do Insertion and incorrectly allowing no grammatical output. We must thus
revise the notion of “satisfies” in (9)-(10) so that Lasnik’s Filter is not satisfied until the Subject
which is to be Case-marked by INFL is also legal with respect to all grammatical principles. I will
not spell out this revision here [i.e. in this draft], but it will involve a generalization of the notion of
“chain” presented there such that if INFL is to assign nominative Case to the subject, the two
elements must form a chain.17
There is thus a plausible account of Question Inversion that does not interfere with our
account of C-Inv, and brings with it certain advantages of its own.18 The crucial point is that if
matrix Question Inversion is not an instance of I-to-C, then the availability of do in questions is not
a problem for do-inertness. Recall in turn that our whole discussion of do-inertness is an argument
for LP-structure, which in turn is a crucial part of our Earliness hypothesis.
Pollock added the important observation that movement over an adverb does not
necessarily diagnose V-to-I, but could diagnose V movement to a position between V and INFL
which he called “AGR”. I shall call this position, not AGR, but µ. I will be arguing later that µ is
not an agreement element at all, but is, in fact, contentless (or, perhaps, an empty verb). Pollock’s
evidence for the existence of the µ-position comes from French infinitives. French infinitives do
not allow θ-assigners to move all the way to INFL, but do allow them to move “half-way” to µ over
an adverbial, as seen in (46a-b):
Now let us add a new observation: leftward movement over an adverb in English is
possible for almost every verbal element in IP — not merely for the main verb:
Either side of modals, base-generated in INFL.
(47)a. Bill absolutely must be shovelling his walkway by 6:00.
b. Bill must absolutely be shovelling his walkway by 6:00.
c. *Bill must be absolutely shovelling his walkway by 6:00.
(48)a. John soon will have won the Nobel prize.
b. John will soon have won the Nobel prize.
c. *John will have soon won the Nobel prize.
Either side of aspectuals in INFL.
Either side of aspectuals not in INFL.
(51)a. Mary soon will recently have won the Nobel prize.
b. Mary soon will have recently won the Nobel prize.
c. Mary will soon recently have won the Nobel prize.
d. Mary will soon have recently won the Nobel prize.
e. *Mary will have soon recently won the Nobel prize.
These data suggest that µ is not limited to the area immediately above the main verb.
Each member of the auxiliary system, including INFL itself, may have a µ-position above it. (52a)
demonstrates such a position above have, and (52b) demonstrates such a position for modals in
INFL. Note that I am suggesting that µ projects a maximal phrase containing a specifier into which
Bill moves in (52b).
(52)a. [IP Bill_j might_i [µP t_j have_i [PerfectP long t_i [VP said …]]]]
                                  └─────<───────────────┘
    b. [µP Bill_j will_i [IP soon t_j t_i [VP …]]]
                  └──────────<────────┘
But now note something important: once again, do is exceptional. Since every other
element in the auxiliary system can have its own µP above it, into which it moves, we might expect
the same possibilities for auxiliary do. Yet the facts are otherwise: do does not move to µ.
Relevant data are seen in (53)-(54).
The inertness of ‘do’ for movement over an adverb
(54)a. Mary certainly has not returned from the conference yet.
b. Mary has certainly not returned from the conference yet.
c. Mary certainly did not understand the question.
d. *Mary did certainly not understand the question.
These data are expected if verb Adv order derives from verb-movement to µ. Such
movement, like the I-to-C seen above, is impossible for do, since do is inserted under (INFL,IP) too
late for Move α to apply. And this is exactly what is predicted if do-support takes place at
LP-structure.
3 Economy is Too Strong: µ or AGR?
Recall my observation earlier that Economy, but not Earliness, prohibits “spontaneous
movement” unmotivated by any filter such as Lasnik’s Filter or the ECP. I now turn to an argument
that this restriction imposed by Economy is too strong. My argument is designed to show a case
where a longer derivation is taken even when the shorter derivation is available.
My arguments concern movement over an adverb to µ in English, drawing on some of the facts
we have just been discussing in another context. I will argue first that, contrary to Pollock’s and
Chomsky’s analyses, µ is not an AGR and is not any other type of contentful affix.
Assume that movement to µ is motivated by Lasnik’s morphological filter. What kind
of morphology is µ? We can see that though it is phonologically null, it is not the familiar “zero
morpheme”, since it is semantically null as well: it makes no detectable contribution to meaning.
Thus either µ is not an affix, and hence cannot trigger Lasnik’s filter, or it has some syntactic role to
play in the sentence. This is just what Pollock and Chomsky claim: they claim that µ is some sort
of agreement element. Pollock’s main evidence for identifying µ with AGR is the claim that
movement to µ filters movement to INFL: if you can’t move to µ, you can’t move to INFL. This is
predicted by the HMC if µ is an obligatory member of IP, and hence is an obligatory half-way
house on the journey to INFL.
I will argue that Pollock was wrong in his description of what can move to µ. Things
can move to µ that cannot move to INFL, and things can move to INFL that cannot move to µ.
Given the HMC, we conclude that µ is not an obligatory member of IP.19 Things can move to
INFL that do not move to µ first because µ is not always present in the tree. The one restriction on
movement to µ will involve, not θ-marking, but Case assignment. On the other hand, the one
restriction on movement to INFL will involve, not Case-assignment, but θ-marking.
I conclude from this that if µ is an affix of some sort, it is not merely phonologically
null and semantically null, but is syntactically null as well: it licenses nothing and makes no
contribution to its structure. If this is a syntactic affix, it is like no other known affix. Notice as
well that if such affixes exist that are semantically, syntactically and phonologically null, we expect
to find affixes that are semantically and syntactically null, but not phonologically null. This would
be an utterly optional and meaningless phonological string inserted as part of the morphology of
certain words. No such affix exists, which strongly suggests that µ is not an affix either. Therefore,
I conclude that Lasnik’s filter does not force movement to µ.
I will suggest finally that not even the ECP requires filling of µ — though here I will
have to cut short my explanation in the present draft. From all this I will conclude that movement
to µ is as unnecessary as µ itself. Given that this movement does occur (as we’ve just seen), we
have an argument for spontaneous movement.
3.1 Movement to µ
3.1.1 Prolegomena to main verb movement to µ in English
I will argue that the examples in (55a-c) have a structural description in which they
involve leftward main-verb movement to µ:
The ordering seen with PP objects in (55a-c) is, of course, impossible with the NP
objects seen in (55d-f). In fact, contrasts of this sort are the stock-in-trade of demonstrations of
Case adjacency. I will suggest that this reflects a basic property of movement to µ given in (56):
(58)a. disambiguate HNPS by controlling the focus
#As for War and Peace, I gave to Bíll that book
b. disambiguate HNPS by extracting from the object
??War and Peace, which I gave to Bill a copy of…
(59)a. *As for this book, Bill has looked now at it.
b. ?As for this book, Bill has looked recently at it.
(65)      /\
         /  \
      verb   \
             /\
            /  \
         adv1   \
                /\
               /  \
            adv2   \
                   /\
                  /  \
              ____    direct obj.
This evidence will argue in favor of verb movement from a position between adv2 and
the direct object. The following sections present evidence of this sort.
Scope argument:
Andrews (1983) noted that when adverbs are stacked on the left or right periphery of
the VP, the relative scope of the adverbs is as predicted if the structures are “articulated” rather than
“flat”, as indicated in (66)-(67):
In (66a), twice unambiguously has scope over intentionally: the sentence can only refer
to two events of intentional knocking. (66b) is unambiguous in the same way: it too can only refer
to two events of intentional knocking. The examples in (67) have only the opposite scope
interpretation: there was one intention, which was to knock twice.
Scope judgments of this sort give us a probe into the constituency of cases in which
adverbs intervene between a main verb and its object. If we construct examples in which two
adverbs come between a main verb and its object (examples of the form V Adv1 Adv2 PP) and
the PP is modestly “heavy”, we observe that scope is suddenly ambiguous, as in (68a) or (69a).
I suggest that this ambiguity is structural: (68a) and (69a) either show adverbs stacked
on the left of VP plus leftward verb movement, yielding the hierarchy in (70a), or else they show
adverbs stacked on the right of VP plus rightward PP movement, yielding the hierarchy in (70b).
Indeed, the heavier the PP gets, the more the latter interpretation is available.
(70)a.      µ’                    b.         VP
           /  \                             /  \
          µ    VP                         VP    d.o._j
          |   /  \                       /  \
      verb_i adv1 \                    /\    adv2
                  /\                  /  \
                 /  \               /\    adv1
              adv2   \             /  \
                     /\         verb   t_j
                    /  \
                  t_i   d.o.
Indeed, if we alter the examples in ways which tend to rule out a rightward Heavy
Shift analysis of (70b), the relative scope of the two adverbs becomes unambiguously that predicted
by (70a), as can be seen in (71).
(71) Disambiguating for rightward Heavy Shift
a. As for Mary, Bill relied stupidly twice on her. (focus)
b. Mary’s the one who Bill relied stupidly twice on __. (extraction)
[unambiguously (stupidly (twice …))]
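The structural ambiguity behind these judgments can be illustrated with a small Python sketch. It encodes the two hierarchies in (70) as nested tuples and shows that they produce the same surface string but opposite relative scope for the two adverbs. Several simplifications are assumptions of this sketch, not claims of the text: node labels are omitted, traces are written `t_verb` and `t_obj`, and relative scope is read off depth of embedding, which stands in for c-command only in trees of this shape.

```python
def leaves(tree):
    """Return the terminal strings of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def surface(tree):
    """Terminal string with traces (written 't_...') left unpronounced."""
    return [w for w in leaves(tree) if not w.startswith("t_")]


def depth(tree, leaf, d=0):
    """Depth of embedding of a terminal, or None if it is absent."""
    if isinstance(tree, str):
        return d if tree == leaf else None
    for child in tree:
        found = depth(child, leaf, d + 1)
        if found is not None:
            return found
    return None


def wide_scope(tree, a, b):
    """The less deeply embedded adverb is taken to have wider scope."""
    return a if depth(tree, a) < depth(tree, b) else b


# (70a): adverbs stacked on the left of VP, verb moved leftward.
LEFT = ("relied", ("stupidly", ("twice", ("t_verb", "on her"))))

# (70b): adverbs stacked on the right of VP, PP shifted rightward.
RIGHT = (((("relied", "t_obj"), "stupidly"), "twice"), "on her")

print(surface(LEFT) == surface(RIGHT))
print(wide_scope(LEFT, "stupidly", "twice"))
print(wide_scope(RIGHT, "stupidly", "twice"))
```

When rightward shift is excluded, as in (71), only the first tree is available, which matches the unambiguous reading (stupidly (twice …)).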
We thus conclude, contra Pollock and Emonds, that main verbs do move leftward
over adverbs in English, to the position that we are calling µ (the position that Pollock called
AGR-S, and Chomsky AGR-O).
(79) English
µ: [-θ-opaque, +Case-opaque]
French
µ: [-θ-opaque, –Case-opaque]
Let us now ask about English finite INFL. English INFL must be [+θ-opaque] —
unlike µ and unlike French finite INFL. The evidence for this is what it always was: English main
verbs may not raise past negation to INFL or invert in questions.
English INFL is [+θ-opaque], but is it [+Case-opaque]? I think the answer (unlike the
answer for µ) must be “no”. The relevant evidence comes from constructions in which it can be argued
that a Case-marking verb has moved to INFL in English. Such examples are discussed in a recent
manuscript by Lasnik.
Lasnik’s paper argues for a version of Belletti’s idea that existential be assigns Case to
its object. In support of this claim, Lasnik notes that when existential be is not the main tensed
verb, it cannot be separated from its object by an adverb. On the other hand, when existential be is
the main tensed verb, it can be separated by an adverb. (80) shows this; (81)-(82) present similar
data for have in INFL:
(80) Existential be moves to INFL
a. There are never any cops when you need them.
Existential be moves to µ
b. *My whole life, there have been never any cops when I’ve
needed them.
Non-existential be moves to µ
c. My whole life, cops have been never where I’ve needed
them.
(81) Have moves to INFL
a. ?John has never time to do anything good.
Have moves to µ
b. *John has had never time to do anything good.
Movement to µ over never is possible
c. John relies never on anyone important.
(82)a. John has always something on his mind.
b. *John must have always something on his mind.
c. John knocks always on my door by mistake.
(83) English
µ: [-θ-opaque, +Case-opaque]
INFL: [+θ-opaque, –Case-opaque]
French
µ: [-θ-opaque, –Case-opaque]
INFL (finite): [-θ-opaque, –Case-opaque]
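The feature settings in (83) can be turned into a small predicate, sketched here in Python for illustration. The head features are taken from (83); everything else is an assumption of the sketch: the names `Head`, `Verb` and `can_move`, the feature values assigned to the sample verbs, and the decision to evaluate each landing site on its own, with no requirement that movement to INFL pass through µ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    theta_opaque: bool
    case_opaque: bool


@dataclass(frozen=True)
class Verb:
    assigns_theta: bool
    assigns_case: bool


# Feature settings transcribed from (83); "mu" stands for the position µ.
HEADS = {
    ("English", "mu"): Head(theta_opaque=False, case_opaque=True),
    ("English", "INFL"): Head(theta_opaque=True, case_opaque=False),
    ("French", "mu"): Head(theta_opaque=False, case_opaque=False),
    ("French", "INFL"): Head(theta_opaque=False, case_opaque=False),
}


def can_move(verb, language, head):
    """A verb may land in a head unless an opacity feature blocks it."""
    h = HEADS[(language, head)]
    if h.theta_opaque and verb.assigns_theta:
        return False
    if h.case_opaque and verb.assigns_case:
        return False
    return True


# Illustrative verb classes; the feature values are assumptions.
EXISTENTIAL_BE = Verb(assigns_theta=False, assigns_case=True)
NON_EXISTENTIAL_BE = Verb(assigns_theta=False, assigns_case=False)
RELY_ON = Verb(assigns_theta=True, assigns_case=False)   # takes a PP object
TRANSITIVE = Verb(assigns_theta=True, assigns_case=True)  # takes an NP object

for name, verb in [("existential be", EXISTENTIAL_BE), ("rely on", RELY_ON)]:
    print(name, can_move(verb, "English", "INFL"), can_move(verb, "English", "mu"))
```

On these settings existential be reaches English INFL but not µ, as in (80a-b), while a verb like rely reaches µ but not INFL, as in (81c).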
But this result is of great importance: both Pollock and Chomsky assume tacitly that µ
is to be identified with some syntactically contentful position. Thus, Pollock suggests AGR-S;
Chomsky, AGR-O. Neither of them gives any real argument for these identifications, just,
presumably, a background assumption that a head position must have some name and must fulfill
some function other than merely acting as a landing site. The data summarized in (83) argue against
this background assumption. If the HMC (or the ECP) holds of verb movement, then V-to-I should
necessarily involve movement to µ as an intermediate step. We should therefore be quite surprised
to learn about verbs that can move to INFL but not move to µ. Movement to µ should filter
movement to INFL, given the HMC.24
Yet we have just examined this type of “surprising” data. Verbs that do not assign a
θ-role but do assign Case (existential be and have) may move to INFL but may not move to µ.
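The pattern summarized in (83) can be made concrete with a small sketch. It assumes, as a simplification not stated in this form in the text, that a head which is opaque to a property (θ or Case) cannot host a verb that assigns that property. The class and function names are hypothetical, and the classification of rely as a θ-assigner that assigns no Case (its complement is a PP) is an illustrative assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    name: str
    theta_opaque: bool
    case_opaque: bool


@dataclass(frozen=True)
class Verb:
    name: str
    assigns_theta: bool
    assigns_case: bool


def can_land(verb: Verb, head: Head) -> bool:
    """Assumed rule: a head opaque to a property blocks verbs that assign it."""
    if head.theta_opaque and verb.assigns_theta:
        return False
    if head.case_opaque and verb.assigns_case:
        return False
    return True


# English feature values taken from (83).
ENGLISH_MU = Head("µ", theta_opaque=False, case_opaque=True)
ENGLISH_INFL = Head("INFL", theta_opaque=True, case_opaque=False)

# Verb classifications are illustrative assumptions.
EXISTENTIAL_BE = Verb("existential be", assigns_theta=False, assigns_case=True)
RELY = Verb("rely", assigns_theta=True, assigns_case=False)


def landing_sites(verb: Verb) -> list:
    """Names of the English heads in (83) that the verb may occupy."""
    return [h.name for h in (ENGLISH_MU, ENGLISH_INFL) if can_land(verb, h)]


print(landing_sites(EXISTENTIAL_BE))  # ['INFL']
print(landing_sites(RELY))            # ['µ']
```

On these assumptions existential be can reach INFL although it could not have landed in µ, which is the configuration that is unexpected if every movement to INFL must pass through µ.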
The simplest explanation of this phenomenon goes as follows: when a non-θ-marking
verb moves to INFL, it does not have to pass through µ because µ does not have to be generated. In
other words, I am claiming that µ is not a syntactically contentful affix of any kind at all: neither
AGR-S nor AGR-O nor any other sort of contentful affix. Since it is also not a phonologically or
semantically contentful affix, it would be an affix like none other we’ve seen, motivated only to
save the Economy principles. I conclude that µ is not an affix, and hence not subject to Lasnik’s
filter. But remember that the argument is weak in one respect: if one is willing to posit an
affix like none we’ve seen to save the Economy principles, this possibility is, of course, open.
Let us now step back and consider how the argument has progressed. The argument is
presented in (84).
(84)i. A µ position may exist above each auxiliary and main verb.
We almost have what we need to conclude that a longer derivation may be chosen when
a shorter is available — that is, an argument against Economy.
***THIS SECTION TO BE EXPANDED (or spun off into a separate squib)**********
There is one piece missing from the argument above, however. It may be the case that
µ is strictly an optional node (hence not AGR-anything) but we might still claim that when µ is by
chance generated in a tree, V movement to µ is forced by the ECP. This would amount to claiming
that empty µ nodes must be head-governed or lexically filled — a not unreasonable claim. It looks
like this claim is false for µ, however, though a full development of the argument will await another
draft or paper.
Direct evidence for the legality of empty µ comes from multiple quantifier floating. In
a recent article, which I will assume is convincing, Sportiche has argued that the “quantifier
floating” seen in (85) actually shows “quantifier stranding” in subject position internal to VP:
(85) The kids_i must have been [cleverly [VP [all t_i] pretending to
sleep]] (Sportiche (1988))
If this idea is correct, then Floating phenomena give us a probe for intermediate subject
positions. In a longer version of this paper, I demonstrate first that “floated” emphatic reflexives and
subject-oriented even are also instances of the general Q-float phenomenon (for even it is first
necessary to distinguish its floated use from its use in “association with focus” constructions in the
sense of Jackendoff and Rooth). Now let us ask: if floated quantifiers are actually stranded in
subject positions, what subject positions are relevant in multiple floating constructions like (86)
(similar to some examples from Dowty and Brodie (1984))?
It can be shown, based on ordering restrictions among the quantifiers, that each of the
floated elements in (86c) has been stranded separately, but I cannot pursue the matter here. The
point of these examples for our purposes is to ask what position the middle floated elements are
occupying. Even if one blames the order of the leftmost floater and have on leftward V-movement,
we must still assume at least two phrases with subject positions sitting between have and been,
whose heads are empty at S-structure. This is demonstrated in (87):
(87)
have_i [µP even e_i [µP all e [µP themSELVES e [VP each been awarded …
3.3 Alternatives to the argument that µ is not an affixal position
At this point, we must consider a number of alternatives to our conclusions concerning
movement to µ and to INFL. Up to now, I have contrasted my analysis with a drastically simplified
version of Chomsky’s hypotheses. If one considers the actual mechanisms that are needed to
handle the cases Chomsky considered in his paper, a variety of important alternatives immediately
suggest themselves.
For example, I have tacitly assumed that “Case-opacity” is a filter on movement, rather
than a condition on representations. It followed from this assumption that when V-to-I fails to show
Case-opacity effects, V does not stop in µ on its way to I. I have also assumed the HMC of Travis.
From all these assumptions plus our empirical observations, the conclusion followed that µ is not an
obligatory constituent of IP. Any of the various assumptions just discussed could be false, however.
Indeed, the assumption about the HMC is quite crucially false in Chomsky’s actual system: HMC
effects are derived from the ECP, and the HMC as a descriptive generalization is, in certain cases,
argued to be wrong. The other assumptions are open to similar challenges.
Consider first Case-opacity. Case-opacity might not be a property of movement, but
rather a property of chains created by movement. This alternative view might be summarized as in
(88):
(89)a. INFL [µ e] [VP V…   --V-to-µ-->
b. INFL [µ V_i] [VP t_i…   --V-to-I-->
c. V_i-INFL [µ t_i [VP t_i…   --µ-deletion-->
2. The filter is satisfied by V-to-I Raising whenever this is possible; V-to-I Raising is only
possible for verbs of the have or be class, due to θ-opacity.
3. Where V-to-I Raising is impossible, lower I-to-V. Except in the case where T may be
deleted, the ECP motivates subsequent LF Raising to T.
Granted that Raising at LF past Negation is impossible (as noted in point 5 above), why
is Raising at S-structure possible? This is where the replacement of Travis’s HMC with the ECP is
relevant, and where Chomsky makes crucial use of µ-deletion.
Chomsky suggests that the impossibility of movement over Negation at LF is due to
the ECP. Chomsky here adopts Pollock’s idea that Neg heads its own maximal projection.
The possibility of movement over Neg at S-structure is explained as follows: have or be
moves from its underlying position over Neg to INFL in two steps. First, have or be moves to µ,
and then it moves from µ to INFL:
Returning to the impossibility of movement over Neg at LF, Chomsky provides reasons
why t’ should be undeletable in these cases, which I will not discuss here. In any case, if
Chomsky’s discussion is correct, then S-structure movement, but not LF movement, can sidestep
the ECP. In turn, if µ-deletion is possible, then we also have a means of sidestepping Case-opacity
formulated as in (88). Derivations that satisfy both the ECP and (88) are schematized in (92):
(92)a. INFL [µ e] [VP V…   --V-to-µ-->
b. INFL [µ V_i] [VP t_i…   --V-to-I-->
c. V_i-INFL [µ t_i [VP t_i…   --γ-marking-->
c. V_i-INFL [µ t_i [VP t_i[+γ]…   --µ-deletion-->   (the trace in µ assigns +γ to the trace in VP)
d. V_i-INFL [µ ∅ [VP t_i[+γ]…   satisfies ECP and (88)
Some of the data I have presented in this paper allow a rather strong argument against
Chomsky’s use of µ-deletion as a method of sidestepping the ECP. This argument, if correct,
eliminates any presently available independent motivation for µ-deletion, but does require an
alternative account of the S-structure/LF asymmetry in movement over Neg.
In section 2.2, we saw that µ positions, while they may not be omnipresent, are
ubiquitous: a µ position can be found above each auxiliary or main verb. The relevant data was
presented in (48)-(52), of which a sampling is repeated here:
(93)a. Mary soon will recently have won the Nobel prize.
b. Mary soon will have recently won the Nobel prize.
c. Mary will soon recently have won the Nobel prize.
d. Mary will soon have recently won the Nobel prize.
e. *Mary will have soon recently won the Nobel prize.
Now let us return to the way Chomsky’s system allows S-structure movement over
Neg. Verbs that can move over Neg, like auxiliaries have and be, move first to an intermediate
position, µ. The trace of this movement assigns +γ to the trace internal to VP, then deletes. The
availability of this procedure has as a consequence that whenever movement over a head H can be
accomplished in two steps, the first of which involves a position lower than H which is deletable, no
ECP violation will ensue. This consequence opens the door to unwelcome HMC violations.
Consider, for example, the possibility of “leapfrogging” one auxiliary element over
another, which is quite impossible. For example, only the highest auxiliary element or main verb
may move to I:
The system does not rule this out without additional mechanisms. Consider the
structure in (95):
If µ may be an intermediate landing point for movement to INFL over negation by the
highest verbal element, there is no reason why it should not be an intermediate landing point for
movement to INFL by any member of the auxiliary system. For example:
(96)a. Be moves to µ3 and then over HAVE and NEG to INFL; the trace in
µ3 γ-marks the trace of be and deletes.
b. V moves to µ4, and then over BE, HAVE and NEG to INFL; the
trace in µ4 γ-marks the trace of V and deletes.
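The overgeneration at issue in (96) can be sketched as follows. The sketch assumes a clausal spine in which a µ position sits immediately above each verbal head, and it treats a verbal head as able to evade the ECP whenever such a µ is available as a first landing site whose trace γ-marks and then deletes. The spine layout, the label µ2, and all function names are hypothetical, chosen only for illustration.

```python
# Hypothetical spine, listed from highest head to lowest.
SPINE = ["INFL", "NEG", "µ2", "HAVE", "µ3", "BE", "µ4", "V"]


def is_mu(head: str) -> bool:
    """True for the contentless intermediate positions labeled µ."""
    return head.startswith("µ")


def escapes_ecp_via_mu_deletion(spine: list, mover: str) -> bool:
    """Assumed licensing: a µ immediately above the mover serves as a
    first landing site; its trace γ-marks the original trace and deletes."""
    i = spine.index(mover)
    return i > 0 and is_mu(spine[i - 1])


def is_highest_verbal_element(spine: list, mover: str) -> bool:
    """Descriptive generalization: only the highest verbal head reaches INFL."""
    verbal = [h for h in spine if h not in ("INFL", "NEG") and not is_mu(h)]
    return verbal[0] == mover


verbal_heads = ["HAVE", "BE", "V"]
predicted = [h for h in verbal_heads if escapes_ecp_via_mu_deletion(SPINE, h)]
attested = [h for h in verbal_heads if is_highest_verbal_element(SPINE, h)]

print(predicted)  # ['HAVE', 'BE', 'V']
print(attested)   # ['HAVE']
```

Under these assumptions the two lists differ, which restates the problem: nothing in the γ-marking and deletion procedure itself singles out the highest verbal element.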
On the other hand, Lasnik (1981) notes that this sort of movement might create
problems for the assignment of affixes like past participle –en or present participle –ing. On certain
plausible assumptions about these affixes, for example, the derivation in (96a) would lead to outputs
like (97a) (where be moves to INFL before –en can be affixed to it) or (97b) (where be moves to
INFL after –en has been affixed to it):
Lasnik suggests that independent properties of English word-structure filter out forms
like be-en-s or read-ing-en. This suggestion might conceivably be developed into a reasonable
theory. Nonetheless, other examples can be created that are not amenable to this solution.
Consider, for example, (98a-b):
(98)a. Bill ha-s_i not t_i seemed [IP [I to] have enjoyed himself for
many years now].
b. *Bill ha-s_i not have seemed [IP [I to] t_i enjoyed himself for
many years now].
Example (98b) shows verb raising from an embedded clause to the INFL of the matrix
clause. Example (98a) shows normal movement within a single clause. By the hypothesis in
Chomsky (1989), both examples involve two steps: (A) movement to the µ minimally
c-commanding the moved have, (B) movement to INFL. The first step creates the conditions
necessary for γ-marking of the original trace of have. Thus, no considerations of Economy of
Derivation distinguish the two cases. Additionally, neither example (98a) nor example (98b)
violates any conceivable principles of English word-structure. Each ends up with one instance of
unaffixed have and one instance of have adjoined to INFL.
Example (98b) might be taken to violate the subjacency condition of Chomsky’s
Barriers, depending on the barrier status of the embedded µP, the embedded IP, and the
auxiliary-verb projections of the higher clause. In the spirit of Barriers, however, it should be
immediately obvious that (98b) is considerably worse than any subjacency violation.27
The particular examples discussed above were chosen in order not to beg the question
of the nature of µ. Depending on what conclusions we draw about µ, other examples can make the
same point as (98). For example, an auxiliary or main verb should be able to raise through the µ
that minimally c-commands it to some other µ-projection, as in (99):
This example would not violate the ECP if the trace in µ could delete. The example
might be taken to violate some sort of morphological constraint if µ, contrary to my claims, is some
sort of affix. A restriction in the spirit of Lasnik’s suggestions concerning (97) might filter out a
verb that has acquired two µ-affixes in the course of a derivation.
In any case, the status of (98) should suffice to make the desired point: µ-deletion after
γ-assignment allows too many HMC violations to satisfy the ECP. If we are to follow Chomsky’s
(1986) suggestion that all desirable instances of Travis’s HMC reduce to the ECP, and if no
plausible alternative principles take care of the troublesome cases, then we have an argument
against µ-deletion before γ-assignment.
If this is correct, then we need to find an alternative account of why S-structure
movement across negation is possible, while LF movement is impossible. My best suggestion has
in fact already been given in (8b), which I repeat below as (100):
(100) Not, like Pollock’s pas, is a modifier of NegP, not its head.
The head of NegP is in fact empty at S-structure, allowing
movement through it, but is filled at LF, blocking movement
through it.
This suggestion makes sense if adjunction to a filled Neg° is not allowed and if
γ-assignment precedes LF filling of Neg°.28 S-structure movement across NegP is possible
on this account not because movement to some lower deletable position provides the γ-marking
necessary for the ECP, but because movement to Neg° itself provides the necessary γ-marking:
(101) Mary has_i [NegP not [Neg° t_i] [t_i left the room]]
As I noted in connection with (8b), there is at least one other known case of a head that
behaves as empty at S-structure but filled at LF: this is the case of verbs that select irrealis infinitival
complements, verbs like desire. These verbs behave at S-structure (e.g. for purposes of
Exceptional Case Marking) as if the COMP of their object were empty or missing, but behave at
LF (e.g. with respect to the ECP) as if this same COMP were filled. Let us assume that at
S-structure and PF, the notion “empty category” refers to phonetic emptiness, so that a lexical item
(like null for) with no phonetic matrix counts as empty for movement purposes: the position can
be moved through. At LF, on the other hand, the notion “empty category” refers to categories that
are both phonetically and semantically contentless. Thus, a null version of for will count as filled at
LF, as will a null version of not.
Actually, Pollock, in a later section of his paper, proposes that “NegP” is actually
“AssertionP”, where “Assertion°” can be modified by negation or by emphasis. Chomsky (1957)
already noted a strong parallelism between negative sentences and sentences with emphatic do:
Pollock adapts Chomsky’s suggestions and posits that NegP and whatever accounts for
focus as in (102) are both modifiers of Assertion°. If future investigations can invest Assertion°
with some sort of semantic content, then we shall be justified in proposing an analogy between its
behavior and the behavior of null for, and (8b) will be an acceptable account of the S-structure/LF
asymmetry in movement over negation.
Taking up once more the main thread of this section, there are still a number of possible
alternatives to our hypothesis concerning µ, which I wish to mention briefly. For example, one
might accept the argument made above against µ-deletion before γ-assignment. One could then
propose that µ-deletion occurs as in (89), but does not counterbleed γ-assignment in the fashion
suggested by Chomsky (1989). Instead, one would suggest that µ deletes after the ECP has applied
(and thus plays no role in explaining movement across negation), but before the version of
Case-opacity in (88) applies. Notice that this hypothesis would make of µ-deletion merely a device
to explain why Case-opacity effects never show up for movement to INFL. In other words, this
hypothesis would account for the desired facts, but would be (for now) ad hoc.
One final possibility would be to restrict Case-opacity effects to chains whose heads
occupy µ. Movement through µ would escape Case-opacity effects, since the resulting chain would
not be headed by µ, but movement to µ would be subject to Case-opacity. This hypothesis faces
essentially the objections lodged against the ordering hypothesis of the preceding paragraph: the
head/non-head distinction relates to nothing else in the system, and is therefore suspicious, though
not inconceivable.
3.4 Summary
We saw in our discussion of do-insertion that the Earliness Alternative is feasible:
there is good evidence that do-insertion applies at its own Level of Representation. We have just
seen a somewhat more complicated argument which, if correct, militates against the Economy of
Derivation. Finally, in Part Four, we will look at an argument that militates for Earliness.
Fortunately, this argument is quite simple.
4 Economy is Too Weak: D-linking and WH-movement
*********THIS SECTION TO BE EXPANDED SLIGHTLY***********
In earlier work of my own (Pesetsky (1987)), I argued that WH-movement is motivated
in English and in languages like Polish by two separate conditions.
The first is a condition on the Q-morpheme in COMP (or INFL) which requires that
one WH-phrase move to its SPEC by S-structure: thus every WH-question (particularly an
embedded question) will contain at least one instance of WH-movement to SPEC,CP (or
SPEC,IP, as discussed above).
The second is the familiar condition on WH-phrases themselves which requires certain
of them to move to an appropriate A-bar position by LF.
Why these two overlapping conditions? The evidence concerned a distinction between
what I called “D(iscourse)-linked” and “non-D-linked” WH-phrases. Discourse-linked
WH-phrases, roughly speaking, ask questions where the range of possible answers is limited to
some set given in the prior discourse or “in the air”. WH-phrases of the form which person or
which thing are prototypical D-linked phrases, but almost any WH-phrase may have a D-linked
usage.29
Crucially, D-linked WH-phrases in English show none of the syntactic indications of
LF WH-movement. They are, for example, immune from Superiority and ECP effects, as can be
seen by comparing the contrasting examples in (103):
This left open the question of why D-linked phrases must undergo S-structure
WH-movement in the examples in (105):
My answer was that WH-movement in (105a) is forced not by the needs of the
WH-phrase, but by the needs of the Q morpheme in COMP. I thus proposed that not only is (106b)
a possible motive for WH-movement, but (106a) is as well:
(106)a. Q Filter: Q must be supported by a WH-phrase in its SPEC.
b. Scope Filter: All WH-phrases must be assigned scope:
(i) either by (LF) movement, or
(ii) for D-linked phrases, by coindexation with Q.
Important supporting evidence in favor of this approach, and in favor of the analysis of
(103) vs. (106) in terms of LF movement vs. coindexation, was supplied by Polish. The relevant
data were already presented by Wachowicz (1974), and have been consistently confirmed by most
native speakers I have asked. As is well-known, Polish, like all Slavic languages, allows multiple
WH-movement in multiple questions.30 (107) gives an example:
As has been often noted, the Slavic languages in this respect seem to “wear their LF on
their sleeve”. In my earlier paper, I took it as an exciting fact that this “sleeve-wearing” extended to
the distinction between D-linked and non-D-linked phrases. Wachowicz had already noted that
there were certain circumstances under which WH-phrases could stay in situ at S-structure in Polish
(the same facts appear to be true for Czech, Russian and perhaps Romanian), and these appeared to
be precisely when the WH-phrases were D-linked. She considered examples like (108), and made
the observation in (109):
(108) W końcu, kto robi co?
finally who does what
(109)
“[Such] questions are somewhat different from echo questions. We can call them
clarifying questions. The speaker could ask [(108)] in the following situation.
There are various tasks, and several people to be assigned for them. Proposals have
been made how to pair up people and tasks, but no fixed plan has been set up yet.
The speaker of [(108)] is confused by the proposals, and wants to have a fixed plan.”
(Wachowicz, 1974)
I thus noted that exactly those WH-phrases which, under my analysis of English, could
be assigned scope without LF movement, were exempt from S-structure movement in multiple
fronting languages like Polish. If one thinks more carefully about the matter, however, it becomes
apparent that the contrast between the interpretive possibilities of (107) and (108) was not really
explained. To be sure, the contrast between (107) and (108) suggested a link between D-linking and
movement that strongly recalls the English data in (103) and (104), but Wachowicz’s fundamental
observation actually had no satisfactory explanation.
If co in (108) happens to be D-linked and thus receives scope by coindexation, we
know why it need not move and why it cannot move at S-structure or LF. But suppose co is not
interpreted as D-linked? Why isn’t LF movement available to non-D-linked in-situ co?
(110) Pesetsky (1987)                      Pol   Eng
a. multiple S-structure WH-movement      +     –
b. LF WH-movement                        –     +
No connection was made between these two English-Polish differences. Thus, the facts
in (107)-(108) did not really follow from the D-linking hypothesis alone, but from D-linking
combined with a stipulation about Polish LF. If the difference at LF is to be viewed as a parametric
difference, it runs afoul of arguments by Higginbotham that there can in principle be no LF
parameters.
In fact, preliminary evidence from Russian (which otherwise seems to behave like
Polish; I have not checked the Polish equivalents as of this writing) suggests that the LF parameter
is actually wrong. Consider phrases meaning ‘how many X’ and ‘how much’. I argued in Pesetsky
(1987) that these phrases were not automatically D-linked, and thus show Superiority and ECP
effects in English. The Russian equivalent of ‘how much’, skol’ko, behaves as predicted given
the theory outlined above. Example (111b) shows only the D-linked reading for skol’ko, and (111a)
shows only the non-D-linked reading:
For unknown reasons, however, full phrases of the form skol’ko N are not allowed to
participate in the multiple movement construction:
(112)a. ??Kto skol’ko dollarov zaplatil za ètu knigu?
Who how-many dollars paid for this book
(114) This paper                           Pol   Eng
multiple S-structure WH-movement         +     –     + Earliness
Notice now that Economy is too weak to achieve this result: it cannot derive the
preference for S-structure WH-movement over LF WH-movement. In particular, a derivation
involving S-structure WH-movement for a non-D-linked phrase is exactly the length of the
comparable derivation involving LF WH-movement.32 An Economy story would have to fall back
on the LF parameter in (110b).
We thus have a case that the Earliness Principle can explain, but not Economy — a case
of derivations with identical numbers of steps, where nonetheless the one that “finishes first” is
preferred over the other. This is exactly the sort of phenomenon we expect if there is some sort of
Earliness Principle, and provides evidence for it.
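The contrast between the two principles can be illustrated with a minimal sketch. It assumes that Economy compares derivations only by their number of steps, and that Earliness prefers the derivation in which the relevant requirement is satisfied at the earliest level of representation. The dictionary layout, the function names, and the single-step encoding of WH-movement are hypothetical simplifications.

```python
# Levels of representation, ordered from earliest to latest.
LEVELS = ["D-structure", "S-structure", "LF"]


def step_count(derivation: dict) -> int:
    """Number of transformational steps in a derivation."""
    return len(derivation["steps"])


def economy_choice(derivations: list) -> list:
    """Economy as assumed here: keep every derivation of minimal length."""
    shortest = min(step_count(d) for d in derivations)
    return [d["name"] for d in derivations if step_count(d) == shortest]


def earliness_choice(derivations: list) -> list:
    """Earliness as assumed here: keep the derivations whose requirement
    is met at the earliest level."""
    earliest = min(LEVELS.index(d["satisfied_at"]) for d in derivations)
    return [
        d["name"]
        for d in derivations
        if LEVELS.index(d["satisfied_at"]) == earliest
    ]


# Two derivations for a non-D-linked WH-phrase, each with one movement step.
overt = {
    "name": "S-structure WH-movement",
    "steps": ["move WH-phrase to SPEC"],
    "satisfied_at": "S-structure",
}
covert = {
    "name": "LF WH-movement",
    "steps": ["move WH-phrase to SPEC"],
    "satisfied_at": "LF",
}

print(economy_choice([overt, covert]))
# ['S-structure WH-movement', 'LF WH-movement']
print(earliness_choice([overt, covert]))
# ['S-structure WH-movement']
```

Because both derivations have the same length, the step-counting metric leaves them tied, while the level-based metric selects the S-structure derivation.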
NOTES
1. Lasnik’s filter (p. 162) actually states that “a morphologically realized affix must be a syntactic
dependent at surface structure”, where by “syntactic dependent” is meant (fn 8) B in the structure [ B
A B]. The subsequent literature tacitly extends this to B in [A A B] as well.
3. This solution raises, in any case, an important question which our shortcut sidesteps: namely,
the question of why the “round trip” cannot be completed before S-structure, thereby in effect
allowing S-structure V-to-INFL movement in English. Sidestepping a problem is not, of course, the
same as solving it.
4. As in Chomsky’s system, lexical insertion must have “wide scope” with respect to the Earliness
Principle. Neither principle can be allowed to constrain lexical insertion, or else there would be, for
example, a preference against passive morphology (which can force movement), the versions of
Latin and Irish infinitivals that do not assign S-internal accusative case (forcing Raising of the
subject of the infinitive; Rouveret and Vergnaud (1980); McCloskey (1986)), etc. The inclusion of
D-structure in (13) thus has no empirical consequences: if Lexical Insertion is not governed by (13),
then “nothing can be done” to satisfy the Earliness Principle by D-structure.
5. A third possibility, which follows from considerations explored below, is that the mappings
from D-structure to S-structure and from S-structure to LF involve Move α, but the mapping from
S-structure to LP-structure does not.
6. I am indebted to Palmer (198x) for his thorough coverage of many of the properties of English
modals, though his work does not deal directly with the problems raised below.
8. The be-to of obligation is an apparent counterexample, but only apparent. If we assume that
(ia) is good, and that be-to is acceptable in a counterfactual, then the fact that were cannot raise to
C ((ib)) suggests that it is a θ-assigner. But then we are surprised to find it even in INFL. That it is
in INFL can be seen by its order with respect to negation ((ic)). The answer lies in the suggestion
that the be of this construction is a modal, base-generated in INFL. Its incompatibility with
infinitives, seen in (ii), shows this:
(i) a. ?If we were to take out the garbage daily, we would have been
given instructions to that effect.
b. *Were we to take out the garbage daily, we would have been
given instructions to that effect.
c. Our instructions were that we weren’t to go near the television.
(ii)a. We are to take out the garbage daily.
b. *For an aristocrat to be to take out the garbage daily would
be a serious insult to his dignity.
c. We are expected to take out the garbage daily.
d. For an aristocrat to be expected to take out the garbage
daily would be a serious insult to his dignity.
9. Thus, in the terms of our speculation, where past tense is just a particular interpretation of a
tenseless form, it has a tenseless form.
11. I am using the term “ECM” to refer to Case-marking across a constituent boundary, not
necessarily an IP-boundary.
12. Though one might attempt to defend a claim that ungoverned clauses are all CP, while
allowing governed clauses to be IP.
13. In the next version of this draft, I will elaborate on this: basically my idea is that A-bar
specifier positions are not just a default case, but are determined by the selectional properties of the
heads of which they are specifiers. This kind of A-bar-selection feeds into a generalization of
Burzio’s idea that Case can be assigned by α only if α selects (θ-marks) its subject: only instances
of I that select for their subject, e.g. I containing Q, can assign Case via ECM.
14. The extension to English questions was independently proposed in a March 1988 paper by
Akira Watanabe (antedating my own work). Watanabe also gives the topicalization argument
presented below (attributed by him to Imanishi (1986a,b)).
15. As an aside, note that C-Inv structures — which involve full CP structure — act like embedded
questions, as predicted, with respect to Topicalization:
*This book [CP were [IP you to buy]], you would discover…
16. Important questions are raised about the nature of the chain: which element will count as a
variable at LF, and which elements of an A-bar chain can in general bear Case. Here, the A-bar head
of the chain bears Case. Perhaps this will force further A-bar movement at LF, if only a variable
may bear Case in an A-bar chain.
17. And if INFL and the subject do not form a chain, no grammatical output is possible due to the
Case filter.
18. One semi-argument in favor of the distinction between the two inversion processes discussed
above, involves AUX-NEG contraction. Assume that AUX-NEG contraction applies at
LP-structure, cliticizing NEG to INFL. Then it cannot feed I-to-C. We predict contraction in QI but
not in CI, correctly:
This is a “semi-argument” at the moment, since I have no principled reason why the
contraction process per se should be an LP-structure rule. Perhaps the existence of allomorphy (e.g.
won’t) is relevant here, but such considerations would also make many INFL+V combinations (are,
were) into LP-rules, an undesirable result.
19. The situation is considerably more complex if the HMC derives from the ECP, as suggested in
Chomsky (1986), and if the gamma-marking mechanisms assumed by Chomsky (1989) are
adopted. Section 3.3 discusses a variety of alternatives that might be proposed in the spirit of
Chomsky’s ideas, and provides some evidence that seems to weigh against these alternatives.
20. A problem for my discussion is the fact that adverbs of the scarcely class, which Emonds and
Zagona have noted are strictly VP-initial, do not very felicitously allow leftward verb movement of
the sort we have been looking at. Intonation does make a difference, however: the adverbs
improve in leftward verb-movement constructions if the verb bears focal stress and the adverb is
unstressed. Note that this does not make the adverb parenthetical, since no pauses are necessary.
However, it is a problem for my approach that similar verb focalization is not necessary for
corresponding examples in French like those in (iv), and I have no explanation for this at present:
(V. Deprez (personal communication) notes, however, that even in French there are certain
difficulties with examples like (iv)a-b: in certain cases the adverb is actually modifying the direct
object; in others, verb class appears to make a difference (e.g. parler presque français vs.
??rencontrer presque Marie).)
21. These occasionally receive one or two question marks in the literature (for example, from
Andrews (1983)), but I have not found any objection to them among informants.
22. This observation puts one in mind of Chomsky’s identification of µ with object agreement
(AGR-O), and his suggestion that AGR-O is somehow implicated in objective Case assignment, but
I do not see how to connect the two facts, and in any case the discussion below argues against
identifying µ with a morpheme of this sort.
23. Inter alia, this shows that the restrictions on movement to English INFL cannot be reduced to
restrictions on movement to µ, as claimed by Pollock.
24. Here I commit a gross oversimplification of Chomsky’s approach. The theory of verb
movement in Chomsky (1989) is actually a sustained argument that the HMC is only a partially
correct generalization, and that those cases that seem correct follow from the ECP. In section 3.3 I
deal directly with Chomsky’s actual claims.
25. There is a certain parallel with Safir’s (1982) claim that expletive elements may be exempt
from the ECP of Chomsky (1981).
26. Another argument can be constructed on the basis of subject-oriented adverb interpretation.
Assume a subject-oriented adverb takes as an argument the highest A-position in the clause to
27. This fact is even more striking when we note that the trace of have in (98a) is a “doubtless”
gap, given the absence of any “hole” in the upstairs clause and the presence of past-participle
morphology in the lower clause. Gaps of adjuncts like why or how are generally not “doubtless” in
this way.
28. Or if LF filling of Neg° does not eliminate an index on Neg° created by S-structure substitution
for that position.
29. The term “D-linked” is somewhat misleading, since no actual discourse need have taken
place. It is merely sufficient that a class of possible answers be “in the air”.
30. Though see Rudin (1989) for arguments that the phenomenon is syntactically rather different
in Bulgarian and Romanian than in the other multiple-fronting languages.
31. The relevant notion of “unavailable” does not seem to include island violations in this case:
when a WH-phrase is buried in a complex NP or tensed S (an island in Russian), S-structure
movement is impossible, but the only LF interpretation available seems to involve D-linking.
Additionally, such examples are not that easily accepted by my consultant. Thanks to Maria A.
Babyonyshev and Olga Brown for the Russian data cited here, but these results are the result of
very shallow questioning, and should be treated with caution at present.
32. Indeed, if those who have argued that subjacency fails to hold at LF are correct, an LF
derivation might involve fewer steps than the S-structure derivation.