Thursday, January 22, 2009

Gray! Gooey!

The first part of this is extracted from another post that I started working on. Reading back through, the jump just didn't seem to work. But I wanted to make sure I set it down in some form. So happy birthday, little post.

Our brains are made of a large number of interconnected nodes. An individual node takes as input the current state of the nodes it is connected to and, running through a quick tabulation, decides to fire or not fire based on those inputs. Further, the nodes that it fires to are determined by the inputs (it doesn't flail out to every node that it "knows" just to say that it's on). That's a somewhat simplistic description, but it provides a nice trace back to the approach in AI that attempts to model the brain's structure through neural networks.
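To make that "quick tabulation" concrete, here's a minimal sketch in Python of one such node, using the standard threshold-unit abstraction from AI (the weights and threshold are invented for illustration, not measured from anything real):

    # One node: sum the weighted states of the nodes it listens to,
    # and fire if the total clears a threshold.
    def node_fires(input_states, weights, threshold=1.0):
        """input_states: 0/1 firing states of connected nodes.
        weights: how strongly each connection counts."""
        total = sum(s * w for s, w in zip(input_states, weights))
        return total >= threshold

    # Three upstream nodes; the first two are firing, the third is not.
    print(node_fires([1, 1, 0], [0.6, 0.5, 0.9]))  # True: 1.1 >= 1.0
    print(node_fires([0, 0, 1], [0.6, 0.5, 0.9]))  # False: 0.9 < 1.0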

The uses of neural networks in AI that I'm familiar with focus on matching particular inputs to particular output states. For example, such networks have been used to teach computers to recognize the difference between photographs of men and women. The process starts with a significant training period. The trainer provides an image, which is fed through the network (which has no meaningfully set nodes yet). The trainer then looks at the network's output and indicates whether it is correct or not. If correct, the network sits tight. If wrong, the network runs through a process whereby it adjusts the impedance levels (weights) between nodes so that, given that particular input, the output corresponds to a value (or series of values) you've designated as male or female.

A couple of asides. First, with something like this it gets a bit tricky to define what the specific input is. Do you just feed it raw .bmp or .jpg data? Probably not, if you want to differentiate gender. You'd probably pre-process the photograph to break out particular characteristics of its subject, especially the relationships between the parts, and feed those into your network.

Second, the output string could carry a strength-of-prediction component. Rather than simply 01110101 = male and 11010010 = female, the real measure of male/female could be the position and quantity of ones and zeros. Perhaps the first half of an 8-bit sequence represents non-overlapping male characteristics, and the second half non-overlapping female characteristics. More ones in the first half than in the second indicates male; more in the second half indicates female. The degree of certainty could be based on how far the pattern falls from a 100%-certain subject: 11110000 would be an output the network was 100% certain was male, and 00001111 one it was 100% certain was female. Along the way you might get 11010010 or 10010001. Both would register as male, but the first is missing one indicator of maleness and carries one indicator of femaleness, which signals a degree of uncertainty. And of course, this leaves an out for complete androgyny: 11111111.
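Since that encoding scheme is easier to see in code than in prose, here's a quick sketch of it in Python. Everything in it (the 4/4 split, the certainty formula) is just the hypothetical scheme described above, not any established output format:

    # Decode the hypothetical 8-bit output: the first four bits are
    # non-overlapping male indicators, the last four female indicators.
    # Certainty is the distance from a 100%-certain pattern.
    def decode(output):
        male_ones = output[:4].count("1")
        female_ones = output[4:].count("1")
        if male_ones == female_ones:
            return ("androgynous", 0.0)
        label = "male" if male_ones > female_ones else "female"
        certainty = abs(male_ones - female_ones) / 4.0
        return (label, certainty)

    print(decode("11110000"))  # ('male', 1.0): the 100%-certain case
    print(decode("11010010"))  # ('male', 0.5): one male bit missing, one female bit set
    print(decode("11111111"))  # ('androgynous', 0.0)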

So to get back on track: the trainer feeds the network input conditions and lets the network know if the output is right or wrong. The network updates, and the trainer feeds it another image. This is repeated through a large set of inputs. With nothing other than right/wrong as guidance, the network of nodes begins to have encoded in it an ability to correctly categorize this particular kind of input (although that ability is very limited: try feeding it a picture of a female giraffe, or a bowling ball, or any other curve ball you can think of, and it will still give an answer, but the answer won't be grounded in anything). The outputs are symbolic representations of categories that have been built up and defined.
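For what that update step can look like, here's a sketch of the simplest standard version of it, the perceptron learning rule: when the answer is wrong, nudge each weight in the direction that would have made it right; when it's right, sit tight. The features and labels are fabricated stand-ins for whatever pre-processing pulls out of a photograph:

    def predict(features, weights):
        return 1 if sum(f * w for f, w in zip(features, weights)) > 0 else 0

    def train(examples, weights, rate=0.1, passes=50):
        """examples: list of (features, correct_label) pairs."""
        for _ in range(passes):
            for features, correct in examples:
                error = correct - predict(features, weights)  # 0 if right
                if error:  # wrong answer: adjust; right answer: sit tight
                    weights = [w + rate * error * f
                               for w, f in zip(weights, features)]
        return weights

    examples = [([1.0, 0.2], 1), ([0.1, 0.9], 0)]  # made-up training data
    print(train(examples, [0.0, 0.0]))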

And because insertion and deletion have made things a bit disjoint, here's a random chunk:

When a symbol is represented, it is not realized in a single location in the brain, but is a diffuse pattern across a subset of the collective. Further, any individual node may be enmeshed in a large number of particular symbol patterns. So neuron #23646 may be one bit in the chain of firings that represents your sister, but also one bit in the chain of firings that represents lizards, and one bit in the chain that represents paper. As an added level of complexity, within any pattern of firings there's enough redundancy that the death of any one node (or number of nodes) would not break the symbol (otherwise, a night of drinking would wipe out any number of things that you know about the world).
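A toy illustration of that diffuse-pattern idea, with symbols as sets of node IDs (all the node numbers and the 80% survival threshold here are made up):

    # Each symbol is a set of node IDs; node #23646 belongs to all three.
    symbols = {
        "sister":  {23646, 101, 502, 777, 9001},
        "lizards": {23646, 88, 412, 777, 3141},
        "paper":   {23646, 55, 412, 2718, 161},
    }

    # A symbol survives losing nodes as long as enough of its pattern remains.
    def symbol_intact(name, dead_nodes, keep_fraction=0.8):
        pattern = symbols[name]
        return len(pattern - dead_nodes) / len(pattern) >= keep_fraction

    # Kill node #23646; every symbol it took part in still stands.
    print(symbol_intact("sister", {23646}))   # True: 4/5 of the pattern survives
    print(symbol_intact("lizards", {23646}))  # True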

A great deal of criticism regarding neural networks emerges from the "amount of training required for real world training" (see the Wikipedia entry). So, another little jump: does the timeline of human development (in terms of our bits and pieces coming online, by which I mean when particular senses become more attuned, motor skills develop, etc.) make sense in this context? From what I've read thus far, it seems to lend the idea some credence. The most crucial categorization that we need for more complex learning is the differentiation between I/not I. And it seems that early on we are confined to a much smaller bubble. Here's a bit on eyesight in infancy:

"When they’re born, babies see in black and white and shades of gray. Because newborns can only focus eight to twelve inches, most of their vision is blurred. Babies first start to learn to focus their eyes by looking at faces and then gradually moving out to bright objects of interest brought near them. Newborns should be able to momentarily hold their gaze on an object for a few seconds, but by 8-12 weeks they should start to follow people or moving objects with their eyes. At first, infants have to move their whole head to move their eyes, but by 2-4 months they should start to move their eyes independently with much less head movement. When infants start to follow moving objects with their eyes they begin to develop tracking and eye teaming skills. Young infants haven't learned to use their eyes together; they haven't developed enough neuromuscular control yet to keep their eyes from crossing. This alarms many parents, but by 4 or 5 months babies usually have learned to coordinate their eye movements as a team and the crossed-eyes should stop."

The timeline of a child's development seems keyed to allowing specific categorizations to develop at specific times. (Looking at the above excerpt, I'm drawn to particular parts of it. Black and white vision to start - colors get integrated into our mental representations later. An 8-12 inch focus range - keeps attention on objects close to our physical space. Independent eye motions - do we start out seeing in two dimensions as opposed to three, with depth perception added later? Moving the whole head to move the eyes - we must face in the direction of the things we want to see; the body is not allowed to partially take part in a stimulus, we've got to about-face the whole thing if we want to engage. All things that indicate that our senses - or at least sight in this case - prevent overwhelming complexity early on in the learning process.)

With that I'll stop for the moment. There are definitely things here that I want more information on. I haven't really looked at anything related to neural networks since the Artificial Intelligence class I took senior year (2003). Hofstadter's talk of the Careenium and Simmballs got me thinking in that direction a bit. And the bits and pieces I've seen on child development tend to separate everything out by kind of function (vision gets its own section, hearing another, ...). It'd be a nice exercise to cross-reference the different senses to see where the various milestones (large and small) fit into context with each other.

There are three other posts, related to things that popped up while reading Hofstadter, currently being smacked around in various forms in an attempt to get them somewhat cohesive. Hopefully I'll get them together sometime in the next week.
