Minimalist Syntax and Word Order
It is interesting to observe that there actually is very little empirical content to

this particular line of development of the theory; it is driven almost exclusively
by Uniformity. Assuming uniform binary branching, it follows that linear order must
be derived entirely through movement, and various formal devices must then be invoked
to require or prevent movement of a given phrase in a given language. In the case of
apparent optionality of movement, it must be presumed that the given device is
optionally present (e.g. the feature can be optionally ‘strong’ or ‘weak’). The empirical
considerations enter into the question of what the features are and at what point in the
derivation they must be licensed.
{Cinque 1997} carries the idea of deriving linear order from hierarchical
structure to its logical conclusion. He demonstrates how linear order can be correlated
exactly with hierarchical structure to a very high degree of precision, if for every element
that appears in the linear order there is a functional head in the tree. To illustrate, we let
x = α, β, ... be morphemes, words or phrases. Assume that each x is the specifier of
some functional head, F(x). Assuming that linear order is correlated with relative height
in the tree ({Kayne 1994}), α precedes β iff F(α) c-commands F(β).

(90)  [ α [ F(α) ... [ β [ F(β) ... ] ] ] ]

(α sits in the specifier of F(α), which dominates β in the specifier of F(β), so F(α) c-commands F(β).)

Cinque shows that there are word order universals of adverbs that can be characterized
in terms of a fixed hierarchy of functional heads, given these assumptions.
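Under the stated assumptions, the correlation can be checked mechanically. The following sketch (our own illustration; the tree shape and node names are hypothetical) implements c-command over binary trees and verifies that α precedes β in the terminal string exactly when F(α) c-commands F(β) in a structure like (90).

```python
# A hedged sketch (ours, not Kayne's or Cinque's formalism) of the claim
# "alpha precedes beta iff F(alpha) c-commands F(beta)".
# Trees are nested 2-tuples; leaves are strings.

def yield_of(tree):
    """Left-to-right terminal string of a tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return yield_of(left) + yield_of(right)

def dominates(tree, node):
    if tree == node:
        return True
    if isinstance(tree, str):
        return False
    return dominates(tree[0], node) or dominates(tree[1], node)

def c_commands(root, a, b):
    """a c-commands b iff the sister of a dominates b."""
    if isinstance(root, str):
        return False
    left, right = root
    if left == a and dominates(right, b):
        return True
    if right == a and dominates(left, b):
        return True
    return c_commands(left, a, b) or c_commands(right, a, b)

# Structure (90): alpha in Spec of F(alpha), dominating beta in Spec of F(beta).
tree = ("alpha", ("F(alpha)", ("beta", ("F(beta)", "..."))))

order = yield_of(tree)
assert order.index("alpha") < order.index("beta")   # alpha precedes beta
assert c_commands(tree, "F(alpha)", "F(beta)")      # and F(alpha) c-commands F(beta)
assert not c_commands(tree, "F(beta)", "F(alpha)")
```

The asymmetry of c-command in a uniformly binary-branching tree is what lets hierarchical height stand proxy for linear precedence.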

4. Minimalist Program
The Minimalist Program of {Chomsky 1995} assumes that the structures and
derivations of Principles and Parameters Theory are essentially correct. The objective
of MP is to explain why PPT works the way it does on the basis of first principles,
especially those of Economy. In its overt philosophy MP is indeed minimalist in our
sense. But, in practice, it is a variant of PPT.
Culicover and Jackendoff, Simple(?) Syntax, Version of 7/20/03. Chapter 2, p. 45

A strictly minimalist syntactic theory would begin by eliminating all theoretical
devices except for what is known to exist: sounds paired with meanings. It would then
seek to find the minimal assumptions that would explain how a learner acquires the
mapping between these, in order to account for the learner’s capacity to identify
relationships and generalizations. The theory would not assume any formal operations
that were not required, and it would assume that the learner came to the task
equipped only with knowledge that it could not extract from the primary linguistic data
of sound/meaning pairs. That is, the theory would be Minimal Syntax in our sense.
The MP does not meet all of these general conditions for an adequate minimalist
theory in several respects. First, it does not take the task of the learner to be central;
rather, it develops out of a conception of human language as a ‘perfect’ system. Second,
it assumes that the correct characterization of linguistic knowledge takes the form of a
derivation. Third, it imposes certain notions of economy on its formulations that appear
to have little if any empirical motivation. And fourth, it relies heavily on the Uniformity
methodology of much earlier work. We will develop each of these points as we survey
the Minimalist Program.

4.1. Perfection
{Chomsky n.d.} characterizes the search for perfection in language in the
following way:

“We are now asking how well language is designed. How closely does language resemble
what a superbly competent engineer might have constructed, given certain design
specifications. ... I suggested that the answer to the question might turn out to be that
language is very well designed, perhaps close to ‘perfect’ in satisfying external conditions.

If there is any truth to this conclusion, it is rather surprising, for several reasons. First,
languages have often been assumed to be such complex and defective objects as to be hardly
worth studying from a stern theoretical perspective. ... Second, one might not expect to find
such design properties in biological systems, which evolve over long periods through
incremental changes under complicated and accidental circumstances, making the best of
difficult and murky contingencies.

Suppose nonetheless that we turn aside initial skepticism and try to formulate some
reasonably clear questions about optimality of language design. The ‘minimalist program’,
as it has come to be called, is an effort to examine such questions.”

So, to put it somewhat differently, there is little if any empirical evidence to suggest that
language is a perfect system. The fact that it is a biological system also weighs against
this view. Therefore, the minimalist program has been launched as an attempt to
discover a way in which language can be construed as a perfect system, in spite of the
prior indications that it is not. The notion of perfection is imposed upon the Principles
and Parameters framework, requiring that it be reevaluated in light of this demanding
criterion.
The notion of perfection goes beyond the standard scientific criterion of
simplicity and generality that might be demanded by Occam’s Razor, what has been
called ‘methodological minimalism.’ Methodological minimalism requires that our
descriptions be maximally simple, general, and elegant, given empirical
observations. The further criterion of ‘substantive minimalism’ (the term used by
{Atkinson 200x}) is that language is maximally simple, general, and elegant independent
of empirical observations.
Thus, the goal of the MP is to design the minimal computational system for
human language (CHL) that can accommodate the type of system that we know a
language to be. Since language does not obviously appear to be such a system, there are
three choices when encountering recalcitrant facts. (i) We can complicate the
computational system, departing from maximal economy in order to accommodate the
facts, concluding that language deviates (minimally) from perfection. It may even be
possible to explain why this deviation occurs. (ii) We can make the facts the
responsibility of systems other than the computational system itself (e.g. PF and LF).
(iii) We can set aside the facts if we don’t see how to do either of these.
Let us consider, then, what the minimal requirements are for a computational
system that can ‘do’ language. Chomsky reasons as follows: CHL must be able to
combine words into phrases. Therefore a derivation begins with a set N (the
‘numeration’) of primitive elements of the language, taken from the lexicon. The
primitive operation Merge recursively combines elements of N and eliminates them from
N. The minimal domain of Merge would be two elements of N. Since language is
produced in time, the interface level PF must interpret the output of Merge as an ordered
string of these two elements. This interpretation process is called Spell Out. Spell Out
occurs throughout the derivation; hence in the MP there is no level of S-structure that
interfaces with PF.
Since expressions have meaning, the result of Merge must also be mapped into
the interface level LF. This interpretive operation occurs throughout the derivation;
hence in the MP there is no level of D-structure that interfaces with LF.
Given this, the minimal conception of a computational system that forms strings
is the following:

“Given N, CHL computes until it converges (if it does) at PF and LF with the pair
(π, λ). In a 'perfect language' any structure Σ formed by the computation – hence
π and λ – is constituted of elements already present in the lexical elements
selected for N; no new objects are added in the course of computation (in
particular, no indices, bar-levels, etc.).” ({Chomsky 1994:393})

For convenience of exposition, we will call CHL in the MP ‘syntax’. From what
we have said so far, it would appear that the responsibility of syntax in MP is
significantly more limited than it is in GB/PPT. Syntax in MP is not responsible for
linear order; that belongs to PF. Syntax in MP is not responsible for binding; that
belongs to LF (where the indexing can be done). Subcategorization does not belong to
syntax in MP, since Merge can pair any two elements in N. Hence the ungrammaticality
of a string like *eat the must be a consequence of its failure to map into a well-formed
object at LF. It appears that word order variation, of the sort encountered in
extraposition, is not the province of syntax in MP, but of PF. In fact, so is the type of
word order variation that arises out of the passive, wh-Movement, topicalization, and so
on. On the maximally minimalist view, syntax does nothing but build strings by putting
things together, pairwise.
There is a problem with the view just arrived at, however, which is that it is clear
from fifty years of syntactic research that the facts of language cannot be accounted for
simply in terms of properties of the strings. Linguistic relations are structure-dependent,
but whatever structure is constructed in the syntax is lost in the mapping to PF. So PF
does not have sufficient information to actually carry out the operations that are being
required of it.35
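The point about lost structure can be made concrete with a small sketch (ours, not from the text; the attachment structures are hypothetical): linearization is many-to-one, so a bare PF string underdetermines the structure that produced it and cannot support structure-dependent operations.

```python
# A toy demonstration (ours) that the mapping from structure to PF string
# is not invertible: two distinct constituent structures, the familiar
# high vs. low attachment of a PP, yield the same terminal string.

def linearize(tree):
    """Flatten a binary tree (nested tuples of word strings) to its PF string."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return linearize(left) + " " + linearize(right)

# PP attached high (modifying the seeing) vs. low (modifying the man):
high_attach = (("saw", ("the", "man")), ("with", ("the", "telescope")))
low_attach = ("saw", (("the", "man"), ("with", ("the", "telescope"))))

assert high_attach != low_attach
assert linearize(high_attach) == linearize(low_attach) == "saw the man with the telescope"
```

Since distinct structures collapse to one string, a component that sees only the string has no basis for structure-dependent decisions.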
The fact that languages have word order variation is important evidence that
language is not a perfect system. Introducing mechanisms to address this empirical fact
appears to be a tolerable complication of the system, since it is a fact that is hard to
ignore. Hence, reasons Chomsky, we introduce into the concept of CHL an additional
operation that takes part of an already constructed string Σ, copies it, and attaches the
copy to Σ. This operation is called Move. The intuition of MP is that copying is the
minimal operation that will change the arrangement of elements. Literal movement
involves not only copying but erasure.36 Given that there are two copies, call them
‘original’ and ‘new’, either copy or both (or neither) can be mapped into PF. If only the
original copy is mapped into PF, we derive the effect of ‘LF movement,’ where the
moved constituent functions as though it adjoined high in the structure but appears in situ
(as in the case of wh-movement in Japanese).

(91) PF: [ .... Original]
     LF: New_i [ .... Original_i]

If only the new copy is mapped into PF, we derive the appearance of overt movement.

(92) PF: New [ .... ]
     LF: New_i [ .... Original_i]

Given that actual copying is not often found in natural language in place of movement,

35
We do not rule out the possibility that someone will propose enriching PF with syntactic information
that will allow movements to be formulated in PF. Of course, making PF into a syntactic component would be
a perfectly reasonable way to actually construct a grammar. But such a move would flatly violate the strong
assumptions of MP.

36
These primitive operations were originally identified in {Chomsky 1955}.

some additional complication of the theory will be required.37
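The copy-theoretic conception of Move can be illustrated with a toy sketch (ours, heavily simplified; the flat list and the tags stand in for real syntactic structure): Move copies a phrase and attaches the copy at the edge, and Spell Out decides which copy is pronounced, yielding either overt movement or the in-situ ("LF movement") pattern.

```python
# A hedged illustration (ours) of Move as Copy plus selective Spell Out.

def move(structure, phrase):
    """Copy `phrase` and attach the copy at the left edge of `structure`."""
    assert phrase in structure, "can only move a phrase already built"
    return [("new", phrase)] + [("orig", x) for x in structure]

def spell_out(structure, pronounce="new"):
    """PF: linearize, pronouncing only one copy of each moved phrase."""
    out = []
    seen = set()
    for tag, x in structure:
        if x in seen:              # second copy of a moved phrase
            if pronounce == "orig":
                out.append(x)
            continue
        seen.add(x)
        if tag == "new" and pronounce != "new":
            continue               # leave the new copy unpronounced
        out.append(x)
    return " ".join(out)

s = move(["Robin", "saw", "what"], "what")

assert spell_out(s, pronounce="new") == "what Robin saw"    # overt wh-movement
assert spell_out(s, pronounce="orig") == "Robin saw what"   # wh-in-situ, as in Japanese
```

Both outputs come from the same post-Move structure; only the choice of which copy PF pronounces differs, which is the content of (91) and (92).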
The question now arises as to what licenses the result of Move. We have already
noted that PF cannot be responsible for syntactic structure. MP borrows the PPT view
of movement as licensed by feature checking (see §3.4). Crucially, there is limited
independent empirical evidence to suggest that the features assumed by PPT that are
implicated in movement actually exist, let alone that they are ‘strong’ or ‘weak’, and
there is none in the case of languages that lack a robust inflectional morphology. But
putting this aside, MP can adopt the PPT devices more or less without alteration: Move
derives the hierarchical structure, and PF expresses this structure in linear terms.

4.2. Derivational economy


With the introduction of Move comes the question of whether all combinations
of Merge and Move yield grammatical sentences in a language, assuming that the results
satisfy the requirements of the PF and LF interfaces. The answer is ‘no’. As is well
known, many logically possible movements produce ungrammaticality in natural
language. Constraints on movement have been treated as irreducible properties of
syntactic theory. A natural goal of MP is to explain them in terms of the overall criterion
of computational simplicity.
Again there must be a complication of the theory beyond what is minimally
required, in order to accommodate the empirical facts. MP assumes that the critical
property of a derivation, that is, a sequence of Merges and Moves, is whether it is the
computationally simplest one against some comparison set. PPT assumes that all
features must be licensed, and furthermore that some features must be licensed prior to
the mapping to PF. Given these two assumptions, research has been devoted to
investigating whether there is a cogent characterization of computational economy in
terms of the number of Moves, and the length of each Move. Questions also arise
regarding the definition of the comparison set for any given derivation; see {Johnson &
Lappin 1999} for discussion. We will not go into these technical matters here, except
to note that at this point in the theory, considerations of inherent computational
simplicity of the system have been supplanted by questions of whether there is evidence
for one concept of computational simplicity over another.
Moreover, there does not appear to be any strong empirical motivation for any
constraint on derivations in terms of the number of moves or the length of moves, with
one exception. It appears that in many cases, when there are two candidates for a move
to a particular location, the closer one is the one that moves, as illustrated in (93). This
result follows from the requirement that the shortest move is more economical than the
longer move.

37
Since the result of Move is licensed by mapping into PF,

(93)  [ H[F] ... [ XP[F] ... [ YP[F] ... ] ] ]

(H bears the attracting feature [F]; XP and YP both bear [F], and XP is the closer of the two.)

Empirical evidence that is generally consistent with this result involves Superiority –

(94) a. Who saw what?
     b. *What did who see?

(95) a. Robin forgot where Leslie saw what.
     b. *What did Robin forget where Leslie saw.
     c. *Where did Robin forget what Leslie saw.
     d. What did Robin see where?
     e. Where did Robin see what?
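The shortest-move logic behind these judgments can be sketched as follows (our own construal; the depth numbers are hypothetical stand-ins for structural distance from the attracting head):

```python
# A sketch (ours) of the economy condition behind Superiority: when a head
# must check feature [F] and several phrases bear [F], only the closest
# candidate, i.e. the shortest move, may raise.

def attract_closest(candidates):
    """candidates: (phrase, depth) pairs, depth = structural distance
    from the attracting head H[F]. The most economical derivation
    moves the candidate with the smallest depth."""
    assert candidates, "derivation crashes: feature [F] goes unchecked"
    return min(candidates, key=lambda c: c[1])[0]

# Configuration (93): H[F] ... XP[F] ... YP[F], with XP closer to H.
assert attract_closest([("XP", 1), ("YP", 2)]) == "XP"

# Superiority: in 'who saw what', subject 'who' is higher than object
# 'what', so C attracts 'who' (94a); fronting 'what' over 'who' (94b)
# would be a longer, hence less economical, move.
assert attract_closest([("who", 1), ("what", 2)]) == "who"
```

This captures the one robust case noted in the text; it says nothing about how the comparison set of derivations is defined, which is where the technical difficulties discussed by {Johnson & Lappin 1999} arise.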

At this point it should be clear that relatively little
of PPT has been reconstructed in terms of the MP. As {Koopman 1999} writes:

“The MP has led to a much improved and cleaner theory. However,
compared to the GB framework, the Minimalist Program led to relatively few new
insights in our understanding of phenomena in the first half of the nineties. This
is probably because it did not generate new analytical tools, and thus failed to
generate novel ways of looking at well-known paradigms or expand and solve old
problems, an essential ingredient for progress to be made at this point.”

We agree with this sentiment except for the word “improved”, which is difficult
to substantiate in the face of the empirical limitations that Koopman notes. The MP
lacks an account of most of the phenomena handled by GB/PPT and other syntactic
theories. In some cases we consider this to be a welcome result, consistent with our own
minimalist principles. For example, the absence of syntactic indices forces Binding
theory out of syntax into the interpretation, although we would argue that the proper
representation is Conceptual Structure and not LF. But it is not clear how the MP can
incorporate the relevant syntactic relations of linear precedence into its account of
Binding, since linear order is not in syntax in MP, but in PF. Other descriptive
limitations of MP are its inability to express constraints on movement in terms of
inherent computational economy, its inability to account in a natural way for true
optionality of syntactic structure, or for syntactic alternatives that are linked not to LF
but to information structure (in the sense of {Roberts 1998} and others working in
Discourse Representation Theory), and its inability to
account for the ‘periphery’ of language, which is a non-trivial aspect of linguistic
knowledge (see Chapter One and {Culicover 1999}).

5. Uniformity entails Generative Semantics


5.1. MP meets GS
While MP diverges from earlier syntactic theories in regard to empirical
coverage, it does share many important architectural features with earlier work. One
important characteristic that MP shares with earlier theories is its insistence on
Uniformity; another is the centrality of derivation. The two are inextricably linked, since
without derivations (and in particular, movement), it is impossible to map a single
structure into distinct strings. Also, in MP branching is binary and uniform, as in PPT.
As we have discussed, UTAH ensures a uniform assignment of θ-roles on the
basis of syntactic configuration. The consequence of applying UTAH rigorously is that
all derivational as well as inflectional morphology is syntacticized (see {Baker 1988} for
an early development of this idea in PPT).
It has not escaped the attention of many researchers that the rigorous application
of UTAH produces analyses that are hauntingly reminiscent of those of Generative
Semantics of the late 1960s/early 1970s. We reproduce here a representative passage
from {Bach & Harms 1968:viii}.

“The main point that we would like to mention is the development of 'deeper' and more
abstract underlying structures than are to be found, say, in Chomsky's Aspects of the Theory
of Syntax. We cite one example: the surface sentence Floyd broke the glass is composed of
no less than eight sentences [which we discuss below–PWC/RJ].... Each can be justified on
syntactic grounds. It is but a step, then, to the position represented in McCawley's 'Postscript'
where these 'deep structures' are taken to be identical with the semantic representations of
sentences.”

It is fundamental to an understanding of the history of syntactic theory to
recognize that Generative Semantics was based on a literal and uniform application of
the Katz-Postal Hypothesis. Its major innovation was to say that lexical insertion is
scattered through the derivation. All inflectional and derivational morphology in GS was
carried out derivationally in the syntax. The similarity between MP and GS is noted by
Hale and Keyser (1991), who write:

“When we claim that the English verb saddle has underlying it a syntactic representation of
the form depicted in (17) [that is, [v saddle v p the horse] - PWC & RJ], it is clear that we
are accepting – to some extent, at least – a viewpoint represented in the Generative
Semantics framework, as in the work of Lakoff (1971), McCawley (1971), and others. The
Generative Semantics program was motivated, in part, by a vision of the nature of lexical
items which is essentially the same as ours. This is the idea that the notion ‘possible lexical
item’ (in relation to the view that syntax is projected from the lexicon), is defined, or
constrained, by certain principles of grammar which also determine the well-formedness of
syntactic structures.”

Given this convergence, it will be important to reconsider the original arguments raised
against the GS program, and see whether or not they are relevant to the MP.

5.2. Classic cases


Let us consider some classic cases from GS.

i. Syntactic derivation of agentive nouns


{Lakoff 1965/1970} observed that an agentive noun, such as builder, is subject
to the same selectional restrictions as the verb build. So, if builder of X violates a
selectional restriction, build X violates the same selectional restriction, and in the same
way. E.g.,

(96) a. Robin is building {a house / ?similarity}.
     b. Robin is the builder of {a house / ?similarity}.

The subject of build and the subject of is the builder of are subject to the same
restrictions.

(97) a. {Robin / ?Sincerity} built a house.
     b. {Robin / ?Sincerity} is the builder of this house.

Applying the Katz-Postal Hypothesis, Lakoff argues that builder must be derived from
build in a derivation along the following lines.

(98) Robin is [NP (one) [who build ... ]] =>
     Robin is [NP build +er ... ]

The relative lack of syntactic precision of the early analysis can be ameliorated
by imposing reasonable constraints on the derivation, such as the requirement that heads
move to head positions, which would make it look a lot like Baker, Hale/Keyser, and
Larsonian head-movementVP-shell derivations. The other central aspect of Lakoff’s
derivation is the substitution of +er for the underlying (one) who, a step that can be
aligned with Spell Out in the MP.

ii. Causative alternation (head-to-head)
{Lakoff 1965/70} and {McCawley 1968} argued that verbs of causation have
an underlying structure in which there is a predicate that conveys the relation CAUSE.
The derivation of John opened the door along these lines, in a version due to
{McCawley 1968}, is summarized by {Shibatani 1976} as follows:

(99) [McCawley’s derivation of John opened the door from an underlying structure built on CAUSE, BECOME, NOT, and CLOSED, with successive predicate raising; diagram not reproduced]
The parallels with the more recent treatments of V-raising are striking. At the same time,
there are significant differences. The choice of primitives, e.g. NOT CLOSED instead of
OPEN, has a distinctly arbitrary quality. And, as has often been noted, there were no
constraints in GS on the types of operations that could be involved in the derivation of
complex structures that correspond to lexical items.
In fact, {Lakoff & Ross 1967} wrote a note entitled “Is Deep Structure
necessary?” in which they argue, anticipating in many ways {Chomsky 1995}, that there
is no need to stipulate a level of Deep Structure distinct from semantic representation.
Deep Structure is taken to have the following properties. It is:

(99) A. The base of the simplest syntactic component.
     B. The place where co-occurrence and selection restrictions are defined.
     C. The place where basic grammatical relations are defined.
     D. The place where lexical items are inserted from the lexicon.

Lakoff and Ross argue that none of these properties have to be associated with a
particular level of representation: they can be associated with steps in a derivation where
the appropriate conditions are met.
Similar arguments are made in the MP. Merge and Move derive sequences of
structures that conform to the conditions associated with A, but there is no single
representation that meets these conditions. In MP, as in GS, co-occurrence and selection
are defined by the lexical entries and can be checked at any point in the derivation where
they are applicable. In GS, the grammatical relations subject and object are not relevant
for semantic interpretation, and they are not relevant notions in the MP either. And in
MP, as in GS, “lexical items are inserted at many points of a derivation” (Chomsky
1995:160). In MP lexical insertion is taken to be Spell Out, where a morpheme or
combination of morphemes receives a phonetic interpretation. This is more or less
equivalent to the way that lexical insertion was conceived of in GS, as well.

iii. Association of aspectual and temporal morphology with independent heads

The flavor of a GS analysis is given by the following structure for Floyd broke
the glass.38

38
As far as we know this structure was never published, so we are forced to rely on our memories,
including a mobile of it constructed by Susan Fischer that hung in Ross’s office at MIT in the late 1960s. The
representation given here is somewhat richer than that cited by Bach and Harms in the passage given earlier and,
we believe, somewhat more accurate.

(100) [the GS underlying structure for Floyd broke the glass, with a distinct head for each aspectual and temporal component of the meaning; diagram not reproduced]

There are a number of important points of comparison between this tree and what
we would expect to find in a MP representation. First, as noted already, each aspectual
and temporal component of the meaning is represented by a distinct head. In GS this
head is called V, since GS lacked functional categories, and every maximal projection
is called S. It is a small step to reinterpret this tree using contemporary categories,
particularly since the category of the head and of the projection lack independent
syntactic justification.39 Second, the subject Floyd appears twice, as the subject of CAUSE

39
Which is not to say that independent justification is irrelevant. For example, {Ross 1969a} argued
that English auxiliaries are verbs, and {Ross 1970} and {[Link] 1968} provided arguments that performative
heads are verbs. It is not clear whether such arguments would be valid in the MP.

and as the subject of DO. Again, this is eerily prescient, since a contemporary derivation
in the MP would Move Floyd up to the subject position of higher verbs in order for it to
get assigned the proper θ-roles, leaving behind a copy in each lower position (see
{Hornstein 2000}). Third, as in Larson’s proposals, the direct object of broke is the
underlying subject of the intransitive broken, denoting the property of the glass.
Transitivity in both approaches is a property of an abstract verb.

iv. VP Internal Subject Hypothesis

Another conclusion of MGG that is anticipated in GS is {McCawley 1970}’s
proposal that English is a VSO language. Viewed from the present perspective,
McCawley’s main contribution in this paper is the demonstration that without a
principled account of movement, any initial configuration can be manipulated so as to
produce the observed word order and constituent structure.40 As we noted earlier, current
approaches to the core structure of a sentence assume that it arises from a verb that
Merges with one argument or adjunct, yielding an interpretation that is equivalent to the
assignment of a θ-role; then the verb raises to the next verbal head and repeats the
operation on the next argument or adjunct. The NPs raise up in order to have their Case
feature checked (e.g., the direct object moves to the AgrO position), or to satisfy other
grammatical requirements (e.g. a sentence in English must have a subject, a requirement
known as the EPP and satisfied by a feature sometimes called ‘EPP’). In English, the
verb raises so that it ends up to the left of the VP arguments and adjuncts.

(101) [VP v [AgrOP AgrO [VP [NP the potatoes] [V′ [V put] [PP in the pot]]]]]

On McCawley’s account, the V originates to the left of the rest of VP.

40
Such a criticism has been directed against derivations in Kayne’s Antisymmetry theory; cf. for
example {Rochemont & Culicover 1997}.

Subsequent movement moves the subject NP to the left of V. Since modals and tense
are also verbs, the derivation from this point is strikingly similar to that of MP, where
the subject continues to move up until it is in initial position.

(102) Robin would have put the potatoes in the pot  [derivation tree not reproduced]

5.3. How is this possible?


The fact that the machinery of GS is so similar to that of PPT/MP now appears
inevitable, given the assumption of Strong Interface Uniformity. The question naturally
arises, Why was GS so roundly criticized and ultimately discredited, given that the major
critic, Chomsky, and his contemporary colleagues, are now pursuing what appears to be
essentially the same line of research? Do the criticisms that held then hold now?
The answer to the first question appears to be largely tied to the sociology of the
field. For extensive discussion of the sociological questions focusing particularly on the
MP, see {Lappin, et al. 2001} and the subsequent exchange. And indeed, many of the
forces that are active today were in effect in the late 1960s during the dispute over GS
(see {Newmeyer 1980}, {Harris 1993}, {Huck & Goldsmith 1995}). At the same time,
there were substantive problems with the GS program which rendered it intractable as
an account of syntactic structure per se. Perhaps most significantly, GS did not
distinguish between literal meaning and non-literal meaning (or pragmatics), and
proposed to incorporate into the syntax/semantics representation everything that a native
speaker might be entitled to know about the world in virtue of a particular sentence being
true.41 However, there were serious criticisms of the part of the program that concerned
lexical semantics.
So, turning to the second question, let us consider the specific criticisms of GS
in the literature. The fundamental criticism comes from {Chomsky 1970:14ff}.
Chomsky considers the analysis of sentences like John felt sad. If the underlying
structure were John felt [John be sad], then we would expect that John felt sad could
mean the same as John felt that he was sad. However, Chomsky writes, “if we are
correct in assuming that it is the grammatical relations of the deep structure that
determine the semantic interpretation,” John felt sad must have a different DS
representation from John felt that he was sad. So feel would subcategorize both an S
complement and an AP complement. This conclusion is, of course, in violation of

41
And similarly, ceteris paribus, for non-declaratives: imperatives and exclamatives.

Interface Uniformity.42 If accepted, it undermines the GS program, which is what
Chomsky intended. Chomsky goes on to make the same kind of argument for
nominalizations, showing that the semantic interpretation of a nominal, while related in
argument structure to its corresponding verb, is nevertheless distinct. This is the
‘lexicalist’ view of grammar.
Similarly, {Chomsky 1972b:80-84}, following {Bresnan 1969}, offers an
argument against the GS/MP approach to argument structure articulated originally by
{Lakoff 1968} and subsequently by Baker/Hale/Keyser. Lakoff originally proposed that
the following two sentences have the same Deep Structure.

(103) a. Seymour cut the salami with a knife.


b. Seymour used a knife to cut the salami.

The conclusion Lakoff arrived at was that the instrumental role associated with
with in Surface Structure is actually expressed by an argument of the verb use.
Chomsky’s arguments against this view are, first, that use can appear with with –

(104) Seymour used a knife to cut the salami with.

– and second, that use cannot always be paraphrased by with –

(105) Seymour used this car to escape.

      ≠ Seymour escaped with this car.
      cf. Seymour used this car to escape in.

So the fact that certain examples appear to be paraphrases of one another in terms of
Argument Structure does not in itself support the conclusion that they share the same
underlying syntactic representation, contrary to the Baker/Hale/Keyser view.43
The issue then devolves upon the status of the non-argument-structure
(non-AS) aspects of interpretation in the grammar. Following Chomsky 1972a,
{Jackendoff 1972; 1975; 2000; 2002} and {Pustejovsky 1995}, {Fodor & Lepore 1999}
argue that non-AS information is an essential part of the information associated with a
lexical item. The Baker/Hale/Keyser approach, on the other hand, abandons lexicalism

42 In contemporary terms, these sentences would be represented as John felt [SC PRO sad] and John felt
[CP that he was sad]. The lack of uniformity still exists, although it takes a different form. At the same time, the
fact that “I am sad” is not a feeling attributed to John in the case of the small clause could be attributed to the
fact that PRO is not referential and therefore is not within the scope of John’s belief. See {Zwart 1999}.

43 Another line of argument mounted by Chomsky against GS is not pertinent to MP; it is that at least
some semantic interpretation is determined only at S-structure. With the introduction of traces, all semantic
interpretation can be determined at S-structure, and with linear ordering relegated to PF, D-structure has no role
beyond constraining the combinatorial relationships among words and phrases.



in favor of a theory in which the AS aspects of interpretation are encoded in a uniform
way, while (presumably) the non-AS aspects of interpretation are still associated with
individual lexical items. The AS part of the account is intended to explain certain facts
about ‘possible lexical item’ in terms of movement and conflation of complex underlying
structures, along the lines of GS. However, as we showed in §2.1.3, arguments parallel
to Chomsky’s can be mounted against these accounts, based on differences in non-AS
aspects of word meaning; Hale and Keyser are aware of such arguments but declare that
they will not address them. Since the non-AS information must be associated with
individual lexical items in any case, the move to a non-lexicalist account of AS relations
between lexical items can only be motivated by a demonstration that a lexicalist account
cannot correctly account for them. To our knowledge no such demonstration has ever
been attempted, let alone made; and there has been no reply from the derivational camp
to the various lexicalist accounts in the literature such as Jackendoff 1972, 1976, 1983,
2002; Pinker 1989, Pustejovsky 1995, and Talmy 1988.

5.4. Non-lexicalism and integrated syntax


This is not to say, however, that a non-lexicalist account is impossible in
principle. {Marantz 1998} argues that the properties of lexical items are not uniformly
represented in the lexicon, but parceled out among various components of the grammar
in such a way that the combinatorial devices that derive new lexical items are just those
of phrasal syntax, while the core meanings and idiosyncratic properties are stored in
various ‘lists’ that interface with syntactic representations in different ways.
On this view, the core lexicon contains primitive roots, and thematic properties
such as AGENT are associated with primitive light verbs. The alternation between destroy
and destruction discussed by {Chomsky 1970} is expressed not as a lexical one but as
a syntactic one, with the special meanings associated with the nominalizations
represented in a distinct list. Let v-1 represent the verb that assigns AGENT, and D the
nominalizing morpheme.

(107) a. v-1 + √DESTROY => destroy

      b. D + √DESTROY => destruction

Hence there must be a list that specifies the form of D + √DESTROY and another (or
the same one) that specifies any idiosyncratic semantic properties of destruction.
Of course, distribution of this information into various lists would be completely
unmotivated unless there was a reason for doing it. Since the information in question
must be represented in one way or the other, the ultimate issue is whether there is a basis
for integrating the syntactic component that deals with the derivation of lexical items and
the syntactic component that deals with the derivation of phrases. The arguments that



we have given above constitute prima facie evidence against this integration. The
phonological and semantic idiosyncrasies of a lexical item need to be associated with it
in some way, and treating the structure of lexical items outside of the lexicon does not
account for these idiosyncrasies (and in fact predicts that they do not exist).
On the other hand, if there is good reason to treat these as a single component,
then there cannot be a lexicon in which derivation, phonological form and
idiosyncratic meaning are dealt with together. This integration is at the core of the
Baker/Hale/Keyser program, as well as Marantz’. And if lexical derivation is different
from phrasal derivation in fundamental ways that cannot be attributed to the properties
of the individual components, then there is a lexicon.
To resolve this issue it would be necessary to consider in some detail the
properties of lexical and phrasal derivation. Doing so immediately brings out the fact,
however, that the resolution of the issue depends to some extent on independent
assumptions.
For example, we would claim that lexical derivation does not involve movement,
while phrasal derivation does, in the sense that there are systematic word order
variations at the phrasal level that do not appear at the lexical level. For instance,
languages have syntactic scrambling, but not lexical scrambling. But the natural way to
deal with this in a syntactic approach to lexical derivation is to treat word order variation
as the consequence of empty functional heads (see for example {Miyagawa 1997}),
which are absent from lexical phrases.
Or, to take another example, consider the fact that lexical items and phrases
appear to have different stress assignment.

(108) a. black BIRD


b. BLACKbird

Applying Uniformity, one could say that stress assignment in English is uniform, and
that the structure of (108b) is

(109) bird black

Stress would then be assigned to the right, to black, and then black would move to the
left of bird, yielding (108b).44 It would follow, then, that there would be movement in
both the lexical and the phrasal domain. On the other hand, all compounds in English
are stressed initially.

(110) a. CUTthroat (V-N)

44 Thanks to Jan-Wouter Zwart for suggesting this possibility to us.



b. CATnip (N-?)
c. BIRDbath (N-N)
d. OUTcast (P-V)
e. OFFroad (P-N)
f. DROPin (V-P)
g. HARDon (A-P)
h. CASHback (N-P)
i. HARDwon (A-V)

One might reasonably take the position that there are in fact different stress patterns for
lexical items and phrases, and that a movement analysis is motivated strictly by the goal
of maintaining Uniformity.
The question of whether locality conditions constrain lexical derivation depends
on the details of the assumed structure. Derivation of shelve NP from [v NP p shelf] is
subject to locality conditions stated over more or less conventional syntactic structures,
assuming a head-to-head derivation and empty v and p. But lexical derivations are
necessarily local because they apply only to the constituents of lexical items, that is, the
arguments. So, an argument, such as a Cause, can be added to a lexical item to create
a new lexical item, or an argument can be deleted, but it is impossible to create a lexical
item on the basis of a complex structure composed of one lexical item nested inside of
the other. For discussion, see {Jackendoff 1997}.
These examples suggest that it is not possible to fix the properties of lexical
derivation independently of other assumptions. However, the fact that it is necessary to
make particular assumptions that are not otherwise motivated, as in the case of stress
assignment and movement, suggests that the case for integrating lexical and phrasal
syntax is one that is difficult to sustain on the basis of empirical considerations, and gets
its strongest motivation from Uniformity.

6. The alternative
Our survey of the history of mainstream generative grammar has shown how at
every stage of development, innovations have been motivated by Uniformity,
particularly Interface Uniformity. As we went through the earlier stages, we mentioned
some alternative positions that have arisen in the literature. On the other hand, the
development of the later stages, from UTAH on, has been driven more by theory-internal
considerations, and there have been virtually no points of substantive contact with other
syntactic frameworks. At this point, however, it is useful to step back and see what
syntactic theory might look like if the MGG train of argument were not adopted.
As mentioned in §2, the empirical drawbacks of the transformational account of
the passive were addressed by the middle 1970s (e.g. Bresnan, Brame). Within ten
years there were at least two major frameworks, LFG ({Bresnan 1982a}) and GPSG



({Gazdar, et al. 1985}), in which the active-passive relation was captured not through
syntactic movement, but rather through relations in argument structure in the lexicon.
This implies relaxing Interface Uniformity.

(110) No passive movement ⇒ Weaker Interface Uniformity

If the passive is done through argument structure, so is Raising to Subject; the same
mechanisms of argument structure can account for the interpretation. The alternation
between a tensed complement of seem and an infinitival with a “raised” subject is a
matter of lexical subcategorization by seem plus an extra principle of interpretation.

(111) No passive movement ⇒ No Raising to Subject movement

Furthermore, consider the NPs originally analyzed as undergoing Raising to Object, then
by NRO treated as complement subjects governed by the matrix verb. The reason they
had to be subjects of the complement is to satisfy Interface Uniformity. If “raised”
subjects are generated in situ, then “raised” objects can be as well, using a parallel
principle of interpretation.

(112) No Raising to Subject movement ⇒ “Raised” objects in situ

Next recall that NRO was used to motivate conditions on movement such as
NIC, plus the notions of government and abstract Case. If “raised” objects are generated
in situ, these additions to UG are unnecessary.45

(113) “Raised” objects in situ ⇒ No need for NIC, government, or abstract Case

A weakened criterion of Interface Uniformity also has consequences for
quantifier scope. Should one accept the price of LF and covert movement in UG in order
to assure that different quantifier scopes correspond to different syntactic structures? Or,
given that Interface Uniformity is weakened anyway, forgo these additions to syntactic
theory and substitute a richer syntax-semantics interface, as urged in {Chomsky 1972},
and {Jackendoff 1972}? Chapters Five through Nine of the present study will show that
this latter course is empirically preferable.

(114) Weaker Interface Uniformity ⇒ No level of LF, no covert movement

The enrichment of the lexicon required by nontransformational passive and

45 They are properties of lexical relations, which are constrained by the amount and type of information
in a lexical item.



raising makes it easy to formulate lexicalist alternatives to Baker’s account of noun-
incorporation. As pointed out above, the empirical basis of Baker’s move was in any
event challenged by {Rosen 1989}. Noun-incorporation, in turn, was the principal
motivation for head movement. In addition, since movement in the passive has been
eliminated, head movement is independently less plausible.

(115) No passive movement + Weaker Interface Uniformity ⇒ richer lexical relations

      ⇒ no head movement

If there is no head movement, there can be no VP-shells (with their attendant empirical
difficulty). In turn there can be no Hale-Keyser style derivational morphology (with its
attendant empirical difficulties), and no Pollock-style functional heads.

(116) No head movement ⇒ No VP shells ⇒ No syntactic derivation of denominal verbs

      ⇒ No functional heads to which verbs move

And the absence of head movement makes it impossible to implement strictly binary
branching and agreement through movement.

(117) No head movement ⇒ No strictly binary branching

      ⇒ No agreement through movement

At this point we also see that, with so much movement eliminated, it is hardly
such a central concern in linguistic theory to outlaw all kinds of exuberant movement
through constraints on movement or derivational economy.

(118) Little if any movement ⇒ Many fewer constraints

      ⇒ No need for derivational economy

In this chapter we have examined primarily theoretical developments
surrounding argument structure alternations and their formulation in terms of A-movement
and head movement. We have not said much about long-distance
dependencies, formulated in MGG as A′-movement. However, similar arguments could
be made there (an important and virtually neglected critique of the MGG approach to
A′-movement is {Bresnan 1977}).
The upshot is that adopting a nontransformational account of passive forces us
to adopt a theory of UG in which many of the theoretical and empirical problems of
MGG do not arise. As mentioned earlier, most of the non-mainstream frameworks of
generative grammar have gone the route of non-movement, constraint-based description
(an exception is Tree-Adjoining Grammar, which is derivational). That does not mean
that the goals of explanation, of constraining the grammar, of learnability, and of
innateness go away. From our point of view, the issue for these non-mainstream



frameworks is how lexical relations, argument structure relations, and the syntax-
semantics interface – as well as the syntactic component – can be constrained
appropriately so as to achieve these goals.
