Archive for July, 2003

030730 – Synaptic plasticity (Hebbian learning)

Wednesday, July 30th, 2003

030730 – Synaptic plasticity in the brain

On average, whatever that might mean, each neuron in the brain is connected to about 10,000 other neurons via synapses.  “The human nervous system is able to perform complex visual tasks in time intervals as short as 150 milliseconds.  The pathway from the retina to higher areas in neocortex along which a visual stimuli [sic] is processed consists of about 10 ‘processing stages’.”  (Thomas Natschläger, December 1998.  “Networks of Spiking Neurons: A New Generation of Neural Network Models”.)  “Experimental results indicate that some biological neural systems … use the exact timing of individual spikes,” that is, “the firing rate alone does not carry all the relevant information.”  That certainly makes sense in a system that has to deal with race conditions.

“Each cubic millimeter of cortical tissue contains about 10^5 neurons. This impressive number suggests that a description of neuronal dynamics in terms of an averaged population activity is more appropriate than a description on the single-neuron level. Furthermore, the cerebral cortex is huge. More precisely, the unfolded cerebral cortex of humans covers a surface of 2200-2400 cm^2, but its thickness amounts on average to only 2.5-3.0 mm. If we do not look too closely, the cerebral cortex can hence be treated as a continuous two-dimensional sheet of neurons.” (Wolfram Gerstner & Werner M. Kistler.  Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, 2002. Chap. 9.)

“Correlation-based learning is, as a whole, often called Hebbian learning. The Hebb rule (10.2) is a special case of a local learning rule because it only depends on pre- and postsynaptic firing rates and the present state w_ij of the synapse, i.e., information that is easily ‘available’ at the location of the synapse.
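A minimal sketch of such a local, rate-based Hebbian update, using only quantities “available” at the synapse (the learning rate, saturation bound, and firing rates below are my illustrative assumptions, not values from the text):

```python
# Rate-based Hebbian update: a sketch of a local learning rule in the
# spirit of the Hebb rule quoted above. The weight change depends only
# on the presynaptic rate, the postsynaptic rate, and the weight itself.

def hebb_update(w, pre_rate, post_rate, eta=0.01, w_max=1.0):
    """Return the updated synaptic weight; eta and w_max are assumed."""
    w = w + eta * pre_rate * post_rate
    return min(w, w_max)  # crude saturation to keep the weight bounded

w = 0.2
for _ in range(100):  # repeated co-activation strengthens the synapse
    w = hebb_update(w, pre_rate=0.8, post_rate=0.5)
```

Each co-activation nudges the weight upward; without the saturation bound (or some decay term), pure Hebbian growth is unstable, which is one reason real models add such constraints.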

“Recent experiments have shown that the relative timing of pre- and postsynaptic spikes critically determines the amplitude and even the direction of changes of the synaptic efficacy. In order to account for these effects, learning rules on the level of individual spikes are formulated with a learning window that consists of two parts: If the presynaptic spike arrives before a postsynaptic output spike, the synaptic change is positive. If the timing is the other way round, the synaptic change is negative (Zhang et al., 1998; Markram et al., 1997; Bi and Poo, 1998,1999; Debanne et al., 1998). For some synapses, the learning window is reversed (Bell et al., 1997b), for others it contains only a single component (Egger et al., 1999).
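The two-part learning window just quoted can be sketched as a pair of exponentials; the amplitudes and time constant here are illustrative assumptions, not values from Gerstner and Kistler:

```python
import math

# Spike-timing-dependent learning window: pre-before-post gives a
# positive synaptic change, post-before-pre gives a negative one.

def learning_window(dt_ms, a_plus=1.0, a_minus=1.0, tau_ms=20.0):
    """dt_ms = t_post - t_pre. Positive dt means the presynaptic spike
    arrived before the postsynaptic spike."""
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)   # potentiation
    elif dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)  # depression
    return 0.0
```

The reversed windows and single-component windows mentioned in the quote would correspond to flipping the signs or dropping one branch.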

“Hebbian learning is considered to be a major principle of neuronal organization during development. The first modeling studies of cortical organization development (Willshaw and von der Malsburg, 1976; Swindale, 1982) have incited a long line of research, e.g., Linsker (1986b); Obermayer et al. (1992); Linsker (1986a); Kohonen (1984); Miller et al. (1989); MacKay and Miller (1990); Linsker (1986c). Most of these models use in some way or another an unsupervised correlation-based learning rule similar to the general Hebb rule of Eq. (10.2); see Erwin et al. (1995) for a recent review.”

030728 – The simplest incomplete grammar

Monday, July 28th, 2003

030728 – The simplest incomplete grammar

If grammars are inherently incomplete, what is the simplest incomplete grammar?  Actually, the question should be: give an example, based on, say, English, of the simplest incomplete grammar.

Even if grammars are not inherently incomplete, one may argue that individuals acquire aspects of a grammar over time.  I vaguely recall that certain grammatical structures are in fact acquired at different ages as children learn languages.  Moreover, there are some built-in conflicts in the grammar of English (and probably just about any other language).  For example:

It’s me.  (Arguably based on the Norman French equivalent of modern French C’est moi).

It is I.  (Based on the rule that the verb to be takes the nominative case on both sides).

We’re truly unaccustomed to thinking about massively parallel computing.  Our approach to computing has been to create very fast single threaded processors; and as an afterthought, ordinarily to take advantage of idle time, we have introduced multiprogramming.  I think it is fair to say that our excursions into the realm of massively parallel computing are still in their infancy.  Without having done a careful survey of the literature, it would seem that the challenge of massively parallel computing (at least that which would be patterned after neural structures in the mammalian brain) is to be able to handle the large number of interconnections found in the brain as well as the large number of projections from place to place.  [However, it is emphatically not the case that in the brain everything is connected directly to everything else.  It would be impractical, and it’s hard to see what it would accomplish beyond confusion.]

To hazard a gross oversimplification of the computational architecture of the brain, the brain is composed of layers of neurons, whose layers are identified by their common synaptic distance from some source of input.  Layers are stacked like layers in a cake (giving rise to “columns”, identified by their association with predecessor and successor synapses).  To the extent the word “column” suggests a cylinder of roughly constant diameter, or even constant cross-section, it may be a bad choice of metaphor.  I imagine the diameter of a “column” to increase initially (as inputs pass deeper into the processor) and then to decrease (as signals that are to become outputs pass towards the effectors).  At various stages in the processing, intermediate outputs are transmitted to other areas (projections, via fiber bundles).  Depending on the stage of processing, a layer may receive synchronic input (that is, all inputs represent some class of inputs that originated at essentially the same moment in time, e.g., visual input from the retina) or, it may receive diachronic input (that is, a set of samples over time that originated at essentially the same location).  Indeed, some layers may receive both synchronic and diachronic inputs.
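A toy rendering of the synchronic/diachronic distinction above; everything here (the layer function, the sample counts, the mixing rule) is an assumption for illustration:

```python
# A "layer" as a function of two kinds of input: a synchronic frame
# (many locations sampled at one instant) and a diachronic history
# (one location sampled over time).

def layer(synchronic_frame, diachronic_history):
    """Combine a same-moment frame with a same-place history."""
    spatial = sum(synchronic_frame) / len(synchronic_frame)
    temporal = sum(diachronic_history) / len(diachronic_history)
    return spatial + temporal  # arbitrary mixing rule, for illustration

frame = [0.1, 0.4, 0.3]          # e.g. retinal samples at one instant
history = [0.2, 0.25, 0.3, 0.1]  # e.g. one location sampled over time
out = layer(frame, history)
```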

We don’t know much about how to think about the functions computed (computable) by such a system.  Not to mention that I don’t know much of anything about synaptic transmission.  Yeah, yeah, neurotransmitters pour into the synaptic gap.  Some of them are taken up by receptors on the axon and if enough of them arrive, the axon fires into the neuron.  But there are lots of different neurotransmitters.  Why?  How do the stellate glia affect the speed and nature of the pulses and slow potentials?  Do concentrations of neurotransmitters change globally?  Locally?

Somebody pointed out (Damasio?) that “homeostasis” is not really a good metaphor because the “set point” (my term) of the system changes depending on things.  In some cases, it’s clear what goes on: Too much water in the system?  Excrete water?  But the other side of that: Too much salt in the system?  Conserve water?  Well, yes, but what needs to happen is the triggering of an appetitive state that leads to locating a source of water (in some form, e.g., a water tap, a pond, a peach) and taking the appropriate steps to make that water internally available (e.g., get a glass, open the tap, fill the glass, drink the water; stick face in water, slurp it up; eat the peach).

At its core, this is a sort of low-level optimizer.  Based on the readings of a set of enteroceptors (sensors), create an internal state that either modifies the internal environment directly or that “motivates” (“activates”) behaviors that will indirectly modify the internal state.
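That low-level optimizer might be sketched, very crudely, as follows; the sensor semantics, gain, and threshold are my assumptions:

```python
# A homeostatic "governor": read an enteroceptor, compare against a
# (movable) set point, and emit both a direct internal adjustment and a
# "motivational" drive that can activate behavior.

def homeostat(sensor_reading, set_point, direct_gain=0.5, drive_threshold=0.3):
    """Return (direct_adjustment, drive_strength)."""
    error = set_point - sensor_reading
    direct = direct_gain * error                     # e.g. conserve/excrete water
    drive = max(0.0, abs(error) - drive_threshold)   # appetitive state if the
    return direct, drive                             # error is large enough

# A large deficit produces both a direct correction and a behavioral drive.
direct, drive = homeostat(sensor_reading=0.2, set_point=0.9)
```

Note that the set point is a parameter, not a constant, which matches the observation above that “homeostasis” understates how movable the target is.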

It’s all very well to say that if one drips hypersaline solution into the CSF by the hypothalamus, the goat “gets thirsty and drinks lots of water,” but an awful lot has to happen on the way.

And it’s not very safe for the optimizer (governor?) to call for specific external behavior.  It can specify the goal state and monitor whether the organism is getting closer to or farther away from the goal state, but it’s not clear (with respect to thirst, say) how the information about getting closer to the goal can get to the optimizer at any time before an appropriate change in the internal environment is detected, e.g., the organism begins to ingest something that triggers the “incoming water” detectors.  Prior to that, it’s all promises.  Presumably, it goes something like this: behaviors “associated” with triggering the “incoming water” detectors are “primed”.  How?  Maybe by presentation of the feeling of thirst.  Priming of those behaviors triggers back-chained behaviors associated with the initiation of the “directly” primed behaviors.  And so on, like ripples in a pond.  The ever-widening circles of primed behaviors are looking for triggers that can be found in the current environment (more correctly, that can be found in the current internal environment as it represents the current external environment).
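The ever-widening circles of primed behaviors can be sketched as spreading activation over a behavior graph; the graph, decay factor, and floor below are hypothetical, chosen only to illustrate the back-chaining:

```python
# Back-chained priming as spreading activation: priming a goal behavior
# primes the behaviors that can initiate it, with decaying strength,
# "like ripples in a pond."

def prime(graph, start, decay=0.5, floor=0.05):
    """graph maps a behavior to the behaviors that back-chain into it.
    Returns each reachable behavior's priming strength."""
    primed = {start: 1.0}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for pred in graph.get(current, []):
            strength = primed[current] * decay
            if strength >= floor and strength > primed.get(pred, 0.0):
                primed[pred] = strength
                frontier.append(pred)
    return primed

behaviors = {
    "ingest_water": ["drink_from_glass", "drink_from_pond"],
    "drink_from_glass": ["fill_glass"],
    "fill_glass": ["open_tap", "get_glass"],
}
ripples = prime(behaviors, "ingest_water")
```

The primed behaviors then sit waiting for triggers in the (internally represented) current environment, exactly as described above.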

[Back-chaining seems related to abduction, the process of concocting hypotheses to account for observed circumstances.]

I keep coming around to this pattern matching paradigm as an explanation of all behavior.  It’s really a variation of innate releasing mechanisms and fixed action patterns.

030723 – Limits of consciousness

Wednesday, July 23rd, 2003

030723 – Limits of consciousness

It is important to note that we’re not conscious of absolutely everything that goes on in our bodies.  We’re not conscious of the normal functioning of our lymphatic system.  We’re not conscious of the normal functioning of the stomach, the liver, the pancreas, etc. We’re not conscious of changes in the iris of the eye.  With respect to these functions, we’re zombies.

We’re not ordinarily conscious of breathing, although we have the ability to take deep breaths or to hold our breaths.  Breathing is sometimes conscious, sometimes not.

I wouldn’t say we’re very good at imagining smells or tastes, but I can’t speak to the abilities of a skilled smeller or taster.  Still, we can recognize specific tastes and smells (new Coca-Cola didn’t taste like “Classic” Coca-Cola and people didn’t need a side-by-side comparison to know that).

I think I vote with Lakoff on the fact that our model of just about anything is ultimately based on our model of our self.  Or at least our models ultimately refer “metaphorically” to built-in [well, maybe not built-in, but acquired in the course of post-natal (and possibly some pre-natal) experience] “concepts” relating in some way to perceptual experience, often kinesthetic.  It is certainly the case that some of our knowledge is factual, e.g., the battle of Hastings was fought in 1066.  Other knowledge is procedural, I would say “model based”.  Model based knowledge is of necessity based on metaphor.  That is, the behavior of something is referenced mutatis mutandis to the behavior of something else already understood or at least already modeled.

An important model is our internal model of another person.  It is not clear to me whether the origin of this model is self-observation or observation of others.  Is there an internal model of the self and an internal model of another person?  Or are they one and the same, available to be applied equally to oneself or another?  Certainly, a key element of our model of another is projection of our own understanding onto the other.  Now comes the fun part.  By “introspection” it is clear that because I have a model of another person, my model of another person should include a model of that person’s model of yet another person.  So from these models, I now have available my own behavior (whether actual or under consideration), my anticipation of the behavior of another, and my anticipation of the other’s understanding of my behavior [and so on, but not infinitely because of (literally) memory limitations].

030721 – Consciousness and (philosophical) zombies

Monday, July 21st, 2003

[Added 040426]

Is consciousness an expert system that can answer questions about the behavior of the organism?  That is, does SHRDLU have all the consciousness there is?  Does consciousness arise from the need to have a better i/o interface?  Maybe the answer to the zombie problem is that there are nothing but zombies, so it’s not a problem.

In effect, everything happens automatically.  The i/o system is available to request clarification if the input is ambiguous and is available to announce the result of the computations as an output report.

030721 – Consciousness and zombies

The reason the zombie problem and the Chinese room problem are significant is that they are both stand-ins for the physicalism/dualism problem.  That being the case, it seems pointless to continue arguing about zombies and Chinese rooms absent a convincing explanation of how self-awareness can arise in a physical system.  That is the explanation I am looking for.

Ted Honderich (2000) observes that, “Something does go out of existence when I lose consciousness.”  From a systems point of view, loss of consciousness entails loss of the ability (the faculty?) to respond to ordinary stimuli and to initiate ordinary activities.  Loss of consciousness is characterized by inactivity and unresponsiveness.  Loss of consciousness is distinguished from death in that certain homeostatic functions necessary to the continued biological existence of the organism, but not generally accessible to consciousness, are preserved.

In sleep, the most commonly occurring loss of consciousness, these ongoing homeostatic functions have the ability to “reanimate” consciousness in response to internal or external stimuli.

Honderich observes that “consciousness can be both effect and cause of physical things.”  This is consistent with my sense that consciousness is an emergent property of the continuous flow of stimuli into the organism and equally continuous flow of behaviors emanating from the organism.  I’m not real happy about “emergent property”, but it’s the best I can do at the moment.

Honderich identifies three kinds of consciousness: perceptual consciousness, which “contains only what we have without inference;” reflective consciousness, which “roughly speaking is thinking without perceiving;” and affective consciousness, “which has to do with desire, emotion and so on.”

Aaron Sloman (“The Evolution of What?”  1998) notes that in performing a systems analysis of consciousness, we need to consider “what sorts of information the system has access to…, how it has access to this information (e.g., via some sort of inference, or via something more like sensory perception), [and] in what form it has the information (e.g., in linguistic form or pictorial form or diagrammatic form or something else).”

Sloman also identifies the problem that I had independently identified that leads to it being in the general case impossible for one to predict what one will do in any given situation.  “In any system, no matter how sophisticated, self-monitoring will always be limited by the available access mechanisms and the information structures used to record the results.  The only alternative to limited self-monitoring is an infinite explosion of monitoring of monitoring of monitoring…  A corollary of limited self-monitoring is that whatever an agent believes about itself on the basis only of introspection is likely to be incomplete or possibly even wrong.”

Sloman (and others), in discussing what I would call levels of models or types of models, identifies “a reactive layer, a deliberative layer, and a meta management (or self-monitoring) layer.”

030718 – Self-Reporting

Friday, July 18th, 2003

030718 – Self-Reporting

Is there any advantage to an organism to be able to report its own internal state to another organism?  For that is one of the things that human beings are able to do.  Is there any advantage to an organism to be able to use language internally without actually producing an utterance?

Winograd’s SHRDLU program had the ability to answer questions about what it was doing.  Many expert system programs have the ability to answer questions about the way they reached their conclusions.  In both cases, the ability to answer questions is implemented separately from the part of the program that “does the work” so to speak.  However, in order to be able to answer questions about its own behavior, the question answering portion of the program must have access to the information required to answer the questions.  That is, the expertise required to perform the task is different from the expertise required to answer questions about the performance of the task.

In order to answer questions about a process that has been completed, there must be a record of, or a way to reconstruct, the steps in the process.  Actually, it is not sufficient simply to be able to reconstruct the steps in the process.  At the very least, there must be some record that enables the organism to identify the process to be reconstructed.

Not all questions posed to SHRDLU require memory.  For example one can ask SHRDLU, “What is on the red block?”  To answer a question like this, SHRDLU need only observe the current state of its universe and report the requested information.  However, to answer a question like, “Why did you remove the pyramid from the red block?”  SHRDLU must examine the record of its recent actions and the “motivations” for its recent actions to come up with an answer such as, “In order to make room for the blue cylinder.”
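The distinction can be sketched with an explicit action record; the record format here is an assumption for illustration, not SHRDLU’s actual implementation:

```python
# Answering "why" requires that each action was logged together with the
# goal that motivated it; perception of the current scene is not enough.

action_log = [
    {"action": "remove pyramid from red block",
     "goal": "make room for the blue cylinder"},
    {"action": "place blue cylinder on red block",
     "goal": "satisfy the user's request"},
]

def why(log, action):
    """Answer a 'why' question by looking up the recorded motivation."""
    for entry in reversed(log):  # most recent occurrence first
        if entry["action"] == action:
            return "In order to " + entry["goal"] + "."
    return "I don't remember doing that."

answer = why(action_log, "remove pyramid from red block")
```

The question-answering expertise lives entirely in `why` and the log; the block-moving expertise lives elsewhere, which is exactly the separation described above.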

Not all questions that require memory require information about motivation as, for example, “When was the blue cylinder placed on the red cube?”

Is SHRDLU self-aware?  I don’t think anyone would say so.  Is an expert system that can answer questions about its reasoning self-aware?  I don’t think anyone would say so.  Still, the fact remains that it is possible to perform a task without being able to answer questions about the way the task was performed.  Answering questions is an entirely different task.


Thursday, July 17th, 2003


I think I am getting tired of the gee-whiz attitude of linguists who are forever marveling at “the astronomical variety of sentences a natural language user can produce and understand.”  Hauser, Chomsky, and Fitch (2002).  I can’t recall anyone marveling at the astronomical variety of visual images a human observer can understand, or the astronomical variety of visual images a human artist can produce.  I am also tired of the gee-whiz attitude linguists take with respect to the fact that there can be no “longest” sentence.  With essentially the same argument, I can assert that there can be no “largest” painting.  So what?

Another gee-whiz topic for linguists is the fact that, “A child is exposed to only a small proportion of the possible sentences in its language, thus limiting its database for constructing a more general version of that language in its own mind/brain.”  Hauser, Chomsky, and Fitch (2002).  It is also the case that a child is exposed to only a small proportion of the possible visual experiences in the universe, thus limiting its database for constructing a more general version of visual experience in its own mind/brain.  If one is to marvel at “the open ended generative property of human language,” one must marvel at the open ended generative property of human endeavor in art and music as well.  And if we do that, must we also marvel at the open ended generative property of bower bird endeavor in bower building and whale endeavor in whale song composition?

Hauser, Chomsky, and Fitch (2002) refer to “the interface systems — sensory-motor and conceptual-intentional”.  Note that there is a nice parallelism between sensory and conceptual and between motor and intentional.  I like it.

Hauser, Chomsky, and Fitch (2002) observe that it is possible that “recursion in animals represents a modular system designed for a particular function (e.g., navigation) and impenetrable with respect to other systems.  During evolution, the modular and highly domain specific system of recursion may have become penetrable and domain general.  This opened the way for humans, perhaps uniquely, to apply the power of recursion to other problems.”

Here, again, is a suggestion that to me points at a new kind of model found only in humans: a model of the self?  perhaps in some sense a model of models, but otherwise behaving like models in other animals.

A cat may be conscious, but does it, can it, know that it is conscious?


Wednesday, July 16th, 2003


Babies are born with reflexes (IRM-FAP’s).  I wonder if the corresponding models mirror the reflexes.  It’s certainly a better place to start than a) all connection weights set to zero or b) connection weights set to random values.

How do babies do imitation?  How does the organism make the connection between what is seen and its own body?  Is the basic rule for babies: imitate unless homeostatic needs dictate otherwise?

“No” is an active response.  Absence of “no” seems to indicate “no objection”.

With respect to internal models, updating the model is not the trick.  The trick is turning off the Plant (effectors) for the purpose of thinking about actions.  Being able to talk to oneself is an outgrowth of being able to think about actions without acting.  The model can only be updated when action is taken, because that’s the only time the model can get an error signal.  Well, that’s true when the model models an internal process.  It’s an interesting question to consider when a model of an external process gets updated.
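A sketch of that claim about internal-process models: the model can only be corrected when the Plant actually runs, because only then does a real outcome exist to compare against the prediction. The linear model and learning rate are illustrative assumptions:

```python
# A forward model that can run "offline" (thinking) or "online" (acting).
# Offline there is no error signal, so no learning can occur.

class ForwardModel:
    def __init__(self, gain=0.5, eta=0.1):
        self.gain = gain  # the model's estimate of effort -> outcome
        self.eta = eta

    def predict(self, effort):
        return self.gain * effort

    def step(self, effort, plant=None):
        """With plant=None we are thinking offline: predict, but no
        outcome arrives, so the model cannot be updated."""
        prediction = self.predict(effort)
        if plant is not None:                        # acting: outcome arrives
            outcome = plant(effort)
            error = outcome - prediction
            self.gain += self.eta * error * effort   # delta-rule correction
        return prediction

model = ForwardModel()
model.step(1.0)                           # offline: gain is unchanged
model.step(1.0, plant=lambda e: 0.8 * e)  # online: gain moves toward 0.8
```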

An appeal to parsimony would suggest that a model of an external process gets updated when the model is being used, shall I say, unconsciously.  That is, if we assume a model of an external process is some kind of generalization of a model of an internal process, then the circumstances under which a model of an external process is updated will be some kind of generalization of the circumstances under which a model of an internal process is updated.

As an off the wall aside, this might account for the difficulty humans experience in psychotherapeutic circumstances.  Simply thinking about one’s worldview and recognizing that it should be changed is, by my model, not going to change one’s worldview.  In effect, change to an unconscious process can only take place unconsciously.

Margaret Foerster (personal communication) indicates that in her therapeutic experience, change begins when a patient is confronted with a highly specific example of his/her maladaptive behavior.  Not all highly specific examples have the effect of initiating change, but examples that do are recognizable by the therapist from the reaction of the patient (who also recognizes, at a “gut” level, the significance of the example).  That is, the example creates a state in which the predictions of the patient’s internal model do not match the actual results.  To the extent that the internal model was invoked automatically rather than using the model analytically, the mismatch triggers (by my hypothesis) the (automatic) model correction (learning) process.

Foerster observes that in the sequel to such a significant therapeutic intervention, the patient experiences (and reports) additional related mismatches.  I don’t know that my model has anything to say about the fact that such mismatches are experienced consciously.  Nonetheless, I would be surprised to find that an unconscious model would change in a major way in response to a single mismatch.  I would rather expect gradual change based on accumulating evidence of consistently erroneous predictions.  On the other hand, I would expect the model to respond fairly rapidly to correct itself.  Notice that I say “correct itself”.  That is my way of indicating that the process is unconscious and not directly accessible, although significant change will manifest itself in the form of a recognizably (to both patient and therapist) different “way of thinking”.

Actually, I don’t think I have to worry about the fact that the mismatches Foerster describes are experienced consciously.  On reflection, I think mismatches are experienced consciously.  For example, when one is not paying attention and steps off a curb, the mismatch between expectation (no curb) and reality (sudden drop in the level of the ground) is most assuredly experienced consciously.

But back to the double life of models: it is all very well to say that a model can be used off line and that the experience of so doing is a mental image of some sort, but aside from the question of how a model is placed on line or off line, there remains the question of how inputs to the off line model are created.  Not to mention, of course, the question of why we “experience” anything.  So far, it would seem that there is nothing in a description of human behavior from the outside (for example, as seen by a Martian) that would lead one to posit “experience”, aside, that is, from our heterophenomenological reports of “experience”.  That’s still a stumper.

Query: do heterophenomenological reports of “experience” require the faculty of language?  Without the faculty of language, how could one obtain a heterophenomenological report?  How could one interpret such a report?  Is it the case that the only way a Martian can understand a heterophenomenological report is to learn the language in which the report is made?  How much of the language?

Would it be sufficient for a Martian to understand only some form of pidgin like “me happy feelings now”?  The point seems to be that somehow English speakers generally come to understand what the word “experience” means and can use it in appropriate heterophenomenological contexts.  What would be necessary for a Martian to understand what “experience” means?


Tuesday, July 15th, 2003


Hauser, Chomsky, and Fitch in their Science review article (2002) indicate that “comparative studies of chimpanzees and human infants suggest that only the latter read intentionality into action, and thus extract unobserved rational intent.”  This goes along with my own conviction that internal models are significant in the phenomenon of human self-awareness.

Hauser, Chomsky, and Fitch argue that “the computational mechanism of recursion” is critical to language ability and “is recently evolved and unique to our species.”  I am well aware that many have died attempting to oppose Chomsky and his insistence that practical limitations have no place in the description of language capabilities.  I am reminded of Dennett’s discussion of the question of whether zebra is a precise term, that is, whether there exists anything that can be correctly called a zebra.  It seems fairly clear that Chomsky assumes that language exists in the abstract (much the way we naively assume that zebras exist in the abstract) and then proceeds to draw conclusions based on that assumption.  The alternative is that language, like zebras, is in the mind of the beholder, but that when language is placed under the microscope it becomes fuzzy at the boundaries precisely because it is implemented in the human brain and not in a comprehensive design document.

Uncritical acceptance of the idea that our abstract understanding of the computational mechanism of recursion is anything other than a convenient crutch for understanding the way language is implemented in human beings is misguided.  In this I vote with David Marr (1982) who believed that neither computational iteration nor computational recursion is implemented in the nervous system.

On the other hand, it is interesting that a facility which is at least a first approximation to the computational mechanism of recursion exists in human beings.  Perhaps the value of the mechanism from an evolutionary standpoint is that it does make possible the extraction of intentionality from the observed behavior of others.  I think I want to turn that around.  It seems reasonable to believe that the ability to extract intentionality from observed behavior would confer an evolutionary advantage.  In order to do that, it is necessary to have or create an internal model of the other in order to get access to the surmised state of the other.

Once such a model is available it can be used online to surmise intentionality and it can be used off line for introspection, that is, it can be used as a model of the self.  Building from Grush’s idea that mental imagery is the result of running a model in off line mode, we may ask what kind of imagery would result from running a model of a human being off line.  Does it create an image of a self?

Alternatively, since all of the other models proposed by Grush are models of some aspect of the organism itself, it might be more reasonable to suppose that a model of the complete self could arise as a relatively simple generalization of the mechanism used in pre-existing models of aspects of the organism.

If one has a built-in model of one’s self in the same way one has a built-in model of the musculoskeletal system, then language learning may become less of a problem.  Here’s how it would work.  At birth, the built-in model is rudimentary and needs to be fine-tuned to bring it into closer correspondence with the system it models.  An infant is only capable of modeling the behavior of another infant.  Adults attempting to teach language skills to infants use their internal model to surmise what the infant is attending to and then name it for the child.  To the extent that the adult has correctly modeled the infant and the infant has correctly modeled the adult (who has tried to make it easy to be modeled), the problem of establishing what it is that a word refers to becomes less problematical.


Monday, July 14th, 2003


Here’s what’s wrong with Dennett’s homunculus exception.  It’s a bit misleading to discuss a flow chart for a massively parallel system.  We’re accustomed to high bandwidth interfaces between modules where high bandwidth is implemented as a high rate of transmission through a narrow pipe.  In the brain, high bandwidth is implemented as a leisurely rate of transmission through the Mississippi River delta.


Friday, July 11th, 2003


The body is hierarchical in at least one obvious way.  In order to make a voluntary movement, a particular muscle is targeted, but the specific muscle cell doesn’t matter.  What matters is the strength of the contraction.  In assessing the result of the contraction, what matters is the change in position of the joint controlled by the muscle, not the change in position of specific muscle cells.  Thus, an internal model needs only to work with intensity of effort, predicted outcome, and perceived outcome.  This kind of model is something that computer “neural networks” can shed some light on.  Certainly, there are more parameters, like “anticipated resistance” but there are probably not an overwhelming number of them.

The point of this is that, as Grush (2002) points out, the internal model has to be updateable in order to enable the organism to handle changes to its own capabilities over time.  At least at this level, Hebbian learning (as if I knew exactly what that denoted) seems sufficient.
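A sketch of such an updatable internal model along the lines of the previous entry: intensity of effort and anticipated resistance in, predicted outcome out, corrected by a simple error-driven local update of the sort that Hebbian-style learning might plausibly implement. All parameters here are assumptions for illustration:

```python
# An internal model of a joint: predicts the outcome of a contraction
# from effort and anticipated resistance, and corrects itself from the
# perceived outcome so it can track changes in the organism's own
# capabilities over time.

class JointModel:
    def __init__(self):
        self.w_effort = 0.3   # weight on intensity of effort
        self.w_resist = -0.1  # weight on anticipated resistance

    def predict(self, effort, resistance):
        return self.w_effort * effort + self.w_resist * resistance

    def update(self, effort, resistance, perceived_outcome, eta=0.05):
        """Local, error-driven (delta-rule) correction of the weights."""
        error = perceived_outcome - self.predict(effort, resistance)
        self.w_effort += eta * error * effort
        self.w_resist += eta * error * resistance

model = JointModel()
for _ in range(200):  # the muscle is stronger than the model thinks
    model.update(effort=1.0, resistance=0.5, perceived_outcome=0.7)
```

After repeated movements the prediction converges on the perceived outcome, which is the kind of slow recalibration Grush’s updateable emulator seems to require.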