Monday, December 18, 2006

Maximal Assumptions, Minimal Output

One of the books I need to read eventually is Noam Chomsky's The Minimalist Program. I dipped into it a bit today, and it started annoying me already in the introduction.

Here are the lines in question:

This work is motivated by two related questions: (1) what are the general conditions that the human language faculty should be expected to satisfy? and (2) to what extent is the language faculty determined by these conditions, without special structure that lies beyond them?

And later:

To the extent that the answer to question (2) is positive, language is something like a "perfect system," meeting external constraints as well as can be done, in one of the reasonable ways. The Minimalist Program for linguistic theory seeks to explore these possibilities.

I actually have a lot of sympathy for this view, to the extent that I understand it. The idea seems to be this: language serves a function, and if it does no more than satisfy that function, and if all of its specifications are more or less in the service of that function and can therefore be explained by the function being served (communication, one presumes), then it's "perfect." The Minimalist Program is a kind of "guess." We will try to explain all of language as far as possible within the framework of how it works to fulfill its prime function. We therefore make the assumption that language is "perfect." But we do not necessarily expect to see this assumption borne out. The point is just that the Minimalist Program is a kind of discipline (in the monastic sense) in research. We know the conclusion we want to reach, and we try to reach it. We may not, but the discipline will have paid off by giving us the minimal theory necessary to account for the data.

In a sly way, Chomsky seems to be responding to some of the baroque excesses of GB. It isn't science just to capture the data. You've got to capture it in the most general possible way - without sacrificing accuracy. Accuracy is easy to get on the cheap - but you don't learn much by just describing things as they are.

What I guess annoys me about this is that, although Chomsky's wording here is actually quite precise, it's also easy to misinterpret. And so a lot of syntacticians have simply taken on faith that language is "perfect," and they do not realize that this was just a disciplinary assumption rather than a proven theorem. I'm also not completely sure that this is a good device to build into your research methods. It's one thing to say "all other things being equal, we'll go with the simplest explanation." All sciences have taken that as a general principle for centuries! It's quite another to say, "for lack of a better explanation, I'll just assume that things work out in the predicted way." And it's the second that I think a lot of syntacticians are guilty of. Syntax is missing a vital step, in other words. In Physics, it goes something more like this: "Here's how we want it to be, and we have independent evidence that it probably is that way, so for lack of a better explanation we'll assume it is until someone shows otherwise. And on that assumption I conclude about phenomena x and y that z." The part about independent evidence is often left out in Syntax. I'm convinced that in most cases there's no evidence against a lot of Minimalist assumptions, but generally speaking, science needs a bit more than that to go on. In most sciences, we need at least prima facie evidence that what we're assuming is, in fact, the case before we proceed. In Syntax, that would be done by checking how your conclusions about the specific phenomenon you're studying would affect previously drawn conclusions about other phenomena. But most syntacticians omit this step - they don't bother to check that adding device y to the system doesn't mess up something about device x (which was proposed to explain another phenomenon).

I think Computational Linguistics has a definite role to play here. We can make parsers for corpora that allow researchers to "test" their conclusions (namely, add a device to the grammar, reparse a previously parsed corpus, and see if you "lost" anything that you were previously able to account for). But few researchers are currently using these techniques. Minimalism in particular is difficult to code parsers for - which may explain a lot of the laziness.
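
To make the "reparse and see what you lost" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the toy context-free grammar in Chomsky normal form, the two-sentence corpus, and the hypothetical function names recognizes and lost_coverage. A real Minimalist grammar would need a very different (and much harder to write) parser, so read this as the shape of the check, not as a tool.

```python
def recognizes(grammar, sentence, start="S"):
    """CKY recognition: True if `grammar` derives `sentence` from `start`.

    `grammar` is a set of (lhs, rhs) rules in Chomsky normal form, where
    rhs is either a 1-tuple holding a word or a 2-tuple of nonterminals.
    """
    words = sentence.split()
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = {lhs for lhs, rhs in grammar if rhs == (word,)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, rhs in grammar:
                    if (len(rhs) == 2
                            and rhs[0] in chart[i][k]
                            and rhs[1] in chart[k][j]):
                        chart[i][j].add(lhs)
    return start in chart[0][n]


def lost_coverage(old_grammar, new_grammar, corpus):
    """Sentences the old grammar accounted for but the revised one does not."""
    return [s for s in corpus
            if recognizes(old_grammar, s) and not recognizes(new_grammar, s)]


# Toy grammar (hypothetical): intransitive "sleeps" is listed directly as a VP.
OLD_GRAMMAR = {
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("VP", ("sleeps",)),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("sees",)),
}

# A "tidier" revision: treat every verb as V. Nothing now builds a VP
# from a bare V, so intransitive sentences quietly stop parsing.
NEW_GRAMMAR = (OLD_GRAMMAR - {("VP", ("sleeps",))}) | {("V", ("sleeps",))}

CORPUS = ["the dog sees the cat", "the dog sleeps"]

print(lost_coverage(OLD_GRAMMAR, NEW_GRAMMAR, CORPUS))  # ['the dog sleeps']
```

The revision here looks harmless on its own, and it still handles the transitive sentence; it is only the reparse of the old corpus that shows the intransitive case was dropped. That is the check I'm saying most syntacticians skip.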

In any case, this is oily Chomsky at his best. Nothing he says here is objectionable - but neither is it optimally clear. Lots of people have gotten the wrong impression here, and while that's obviously not directly Chomsky's fault (he worded it correctly, after all), I sometimes wonder if he really minds the misinterpretation.


At 4:30 PM, Anonymous said...


Using computational linguistics to test the theory is an interesting thought. You could almost call it test-driven hypothesis building.

The only thing I wonder about is whether grammar parsers or corpus-based techniques are the right way to go about this testing approach. If you had already parsed something, doesn't it mean that you have a full grammar for that corpus?

Or did you mean start with a very general grammar (noun|adjective|verb|adverb) and then make it more detailed by recognising more suffixes/prefixes/etc?

It feels like you have more thoughts in that paragraph than you have written down. It might deserve a full blog article IMHO.

At 4:40 PM, Blogger Joshua said...

Pleased you posted. I followed the link to your page and noticed that you have, in addition to a section on Computational Linguistics, also a section on Esperanto - something I'm also interested in.

You're right, there are more thoughts here. I'll post more on it in about two weeks after I've read through some books.

