Tuesday, September 26, 2006

Are Connectionist Networks Symbol Systems?

Yesterday in Philosophical Foundations of Cognitive Science the topic was Connectionist Networks. The discussion, rightly I think, focused on whether these are a fundamentally different kind of model of cognition from the one posited by Newell and Simon's Physical Symbol System Hypothesis.

The case that they are fundamentally different is superficially compelling. After all, the biggest problem with Connectionism is its lack of semantic transparency. For a complex enough network, it's exceedingly difficult to say how it arrives at the answers it provides, or what sort of generalizations it's capturing (if any - indeed, another pitfall is that if the network is large enough it may simply be storing patterns and not generalizing at all!). The reason is that the calculation is distributed over a mass of nodes with identical designs (though different internal values). One way to look at it is as a hugely parallel computer with lots of incredibly simple (but still separate!) processors linked in particular patterns. Because the design is uniform throughout, it's not so easy to pinpoint the source of any particular "decision" that affects the final output.
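To make that picture concrete, here's a minimal sketch (my own, not anything from the course) of the uniform-design idea: every node runs exactly the same rule, and only the stored weights differ, so whatever "decision" gets made is smeared across all of them rather than localized in any one place.

```python
# My own illustrative sketch: every unit runs the same simple rule;
# only its stored weights and bias differ.
import math

def unit(inputs, weights, bias):
    """One node: weighted sum squashed through a sigmoid."""
    total = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is just many copies of the same unit with different values."""
    return [unit(inputs, w, b) for w, b in zip(weight_rows, biases)]

# A two-layer feedforward pass (weights made up for illustration): the output
# depends on all the weights at once, which is why no single node can be
# pointed to as "the" source of a decision.
hidden = layer([0.2, 0.9], [[1.5, -0.7], [0.3, 0.8], [-1.1, 0.4]], [0.1, -0.2, 0.0])
output = layer(hidden, [[0.6, -1.2, 0.9]], [0.05])
print(round(output[0], 3))
```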

But is it really doing anything different?

I think this is a fascinating question, and I am seriously thinking about writing my final paper in the course on this topic. My answer is that they are indeed (not-so-)glorified Physical Symbol Systems (or, rather, implementations of same), but I'm at a point where I'm willing to be convinced otherwise.

The argument that they are really just implementations of Physical Symbol Systems goes something like this. Connectionist networks are cool because they learn. (Learning is a much bigger problem for systems implemented on the purely symbolic level, since in some sense setting up a symbol system commits you to deciding details that should really be left to the environment.) By slight modifications to their internal specifications over a (large) series of examples, they come to approximate the regularity underlying the examples. Provided one exercises restraint in the number of nodes on the hidden layer, and provided there is, in fact, something to be learned, the network should, given enough exposure, come to capture any generalizations that can be made over the examples. This is indeed a Very Good Thing.
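A toy illustration of that learning story (again my own sketch, with a deliberately trivial target - the logical OR function standing in for "the environment"): each example nudges the internal values only slightly, and over many exposures the unit comes to approximate the underlying regularity.

```python
# My own toy sketch of "slight modifications over a (large) series of examples":
# a single sigmoid unit learns logical OR by nudging its weights a little at a
# time in the direction that reduces its error on each example.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias, rate = [0.0, 0.0], 0.0, 0.5

for _ in range(5000):                               # many exposures
    x, target = random.choice(examples)
    out = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    error = target - out
    # each exposure makes only a slight modification to the internal values
    weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    bias += rate * error

print([round(sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))), 2)
       for x, _ in examples])                       # approaches [0, 1, 1, 1]
```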

The problem with saying that it's anything different, however, hinges on these words "underlying regularity." The assumption is that the network is ultimately learning a function - a regular pairing of inputs with outputs. Since we (hopefully) constrain it to keep it from simply storing the inputs and associated outputs (by not giving it enough internal space to do so), it will instead have to store some kind of "observations" (used VERY loosely here) about the inputs that allow it to classify them into categories and make decisions on that basis. Since it is indeed a "function" (a pairing of inputs with outputs) that it's storing, and since it is generalizing this function, it should be possible to capture whatever it comes up with in a straight-ahead mathematical representation. And indeed, from a certain point of view the internal structure of a connectionist network is really nothing other than a series of if...then...else statements (or, if preferred, logical primitives like "and" and "or"). After all, networks are all about thresholds: if a certain threshold is exceeded, do this; otherwise, do that. It's a cascade of logic gates like any other computer.
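Here's a hedged sketch of that "cascade of logic gates" point: a hard-threshold unit with hand-picked weights behaves exactly like a classical gate, and wiring a few of them together computes compound functions just as gates would (XOR, in this toy case).

```python
# Hand-picked weights, purely illustrative: a threshold unit as a logic gate.
def threshold_unit(inputs, weights, bias):
    """If the weighted sum exceeds zero, fire (1); else don't (0)."""
    return 1 if bias + sum(x * w for x, w in zip(inputs, weights)) > 0 else 0

def AND(a, b): return threshold_unit([a, b], [1, 1], -1.5)
def OR(a, b):  return threshold_unit([a, b], [1, 1], -0.5)
def NOT(a):    return threshold_unit([a], [-1], 0.5)

def XOR(a, b):
    # a small "cascade" of units: OR-but-not-AND
    return AND(OR(a, b), NOT(AND(a, b)))

print([XOR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```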

So the short answer, for me, is that yes, ok, there are some differences. The network is mutable on a (much!) finer level than symbol systems. It has a clear environment-based learning algorithm, which symbol systems don't necessarily have. These things are true. But ultimately, what's going on inside would seem to be a symbolic function like any other, and this is just a particular implementation. More to the point, it offers no counterevidence to Newell and Simon's hypothesis that thinking per se is a symbolic process.

I don't, however, think that should dampen cognitive scientists' interest in connectionist networks.

Networks are clearly useful tools. If we assume they are modeling functions (and I see no reason not to), it is still the case that they can be used to model functions that humans can't directly see. I won't link any here, but there are several documented examples of networks figuring out regularities that humans couldn't grasp - because the regularities in question were so complex as to be non-obvious, or not even obviously derivable. So thought of as "regularity detectors," they are very useful things indeed.

I have said before that science has to be more than simply representing - in the sense of re-presenting - a set of data. Science is data compression: it induces models from masses of data that allow us to make predictions. So it would, in any case, be pointless to assert that Connectionist Networks were some kind of new model of cognition that had to be taken as such. Even if this were true, it would be a useless fact if we had to take a series of connections and weights as the final answer on anything; a list of connections and weights is not a suitable answer to the kinds of questions that humans ask!

However, I do not believe that that is the alpha and omega of networks. What they are, as I said above, is generalizers - regularity detectors. If a network can learn something, that (presumably) means there is something "out there" in reality to be learned. And so I think interesting research can and should be done (and is being done) on figuring out how to read the functions networks find off of the connections. Obviously the behaviorist approach of noting which inputs produce which outputs is inadequate, since that is usually known beforehand. The nice thing about networks is that these are "brains" (rather, very crude computational models of brains) that we can cut open and look inside - down to the squirting juices. Developing systematic ways to figure out what's going on seems a very useful thing to do indeed!
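As a (very) crude gesture at what "reading the function off the connections" might look like - all the feature names and weights below are invented purely for illustration - one could start by asking, for each hidden unit, which inputs it weights most heavily:

```python
# Invented weights and feature names, for illustration only: a first, crude
# pass at "cutting the network open" is to rank each hidden unit's inputs
# by the strength of their connections.
feature_names = ["height", "width", "color", "texture"]
hidden_weights = [                      # one row of weights per hidden unit
    [2.3, -0.1, 0.2, -1.8],
    [0.0, 1.9, -2.1, 0.1],
]

for i, row in enumerate(hidden_weights):
    ranked = sorted(zip(feature_names, row), key=lambda p: abs(p[1]), reverse=True)
    summary = ", ".join(f"{name} ({w:+.1f})" for name, w in ranked[:2])
    print(f"hidden unit {i} responds mainly to: {summary}")
```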

That said, there will be times when the function learned is simply too complicated. And here's the fuzzy boundary that those who think this is a new kind of computation altogether can exploit. It may be (in fact, it probably is the case) that there are functions in the world that cannot be easily represented in human-understandable terms. Though these functions are presumably also symbolic and can be represented as mathematical computations just like any other, there may be so many symbols involved - the function may be of such fine complexity - that listing all the rules is pointless for humans, because doing so exceeds our attention threshold. In this case, there might be some meaningful distinction to be drawn between "symbolic" and "sub-symbolic" functions - though I myself would consider this an abuse of the terms (in the sense that "human-understandable" and "of super-human complexity" would be more accurate). For such functions, networks are a compact way of instantiating them and nothing more. They do not buy us anything in terms of understanding generalizations if the generalizations are, by their nature, things we can't understand. And it's this that leaves the question of whether connectionist networks really are just implementations of symbol systems open for me - technically.
