Is Learning Shaped by Consequences?

Lee Kelley
Jan 21, 2016
Is Learning Shaped by Consequences or by Detecting Environmental Changes?

Originally published in different form on September 16, 2010 at PsychologyToday.com.

There’s a tendency among +R (positive reinforcement) trainers to believe that their method is based on “the science of how animals learn,” when in fact there are still many gaps in our knowledge about how learning actually takes place. For instance, the idea that dogs learn by making associations between a behavior (“I sit”) and its consequences (“I get a treat”) may not be accurate. There’s a growing body of clinical research, particularly in neuroscience, which strongly suggests that the learning process may follow a very different set of rules than the ones we’ve previously been taught.

Dr. Ian Dunbar, one of the figureheads in the +R movement, wrote on his blog recently that he'd been puzzling over why “dog training isn't working that well” anymore. He also writes: “The first gift that we can give to all animal owners, parents and teachers is to simplify the ridiculously ambiguous and unnecessarily complicated and confusing [behavioral science] terminology. Second, let’s simplify the underlying theory by going back to Thorndike’s original premise—that behavior is influenced by [its] consequences.”

This idea that pleasant or unpleasant outcomes shape behavior can be traced directly back to Freud’s “pleasure principle”: we tend to be attracted to things that increase pleasure (or decrease internal tension), and we tend to avoid things that do the opposite. However, new research suggests that behavior is not learned via its consequences.

I think one of the biggest misunderstandings about positive reinforcement is the idea that animals learn new behaviors primarily because a neurotransmitter called dopamine creates a feeling of well-being in connection with an external reward, and that even the anticipation of a reward releases dopamine.

Here’s what Wikipedia has to say: “Dopamine is commonly associated with the reward system of the brain, providing feelings of enjoyment and reinforcement to motivate a person to perform certain activities.”

That sounds about right, doesn’t it?

Yes, but here’s the problem. In testing this idea directly on the brains of certain animals (mainly rats, mice, and monkeys), researchers have found an interesting set of anomalies. In his paper “Dopamine and Reward: Comment on Hernandez et al. (2006),” neuroscientist Randy Gallistel of Rutgers University writes, “In the monkey, dopamine neurons do not fire in response to an expected reward, only in response to an unexpected or uncertain one, and, most distressingly of all, to the omission of an expected one.”

So missing out on a reward is pleasurable? How could that be?

In another article, “Deconstructing the Law of Effect,” Gallistel poses the problem of learning from an information theory perspective, contrasting Edward Thorndike’s model, which operates as a feedback system, and a feedforward model based on Claude Shannon’s information theory.

It’s well-known that shaping animal behavior via operant or classical conditioning requires a certain amount of time and repetition. But in the feedforward model learning can take place instantly, in real time.

Why the difference? And is it important?

I think so. Which is more adaptive, being able to learn a new behavior on the fly, in the heat of the moment, or waiting for more and more repetitions of the exact same experience to set a new behavior in place?

In Thorndike’s model, the main focus is on targeting which events in a stream seem to create changes in behavior. But according to information theory, the intervals between events, where nothing is happening, also carry information, sometimes even more than is carried during the unconditioned stimulus. This would explain why the monkeys’ brains were producing dopamine when they detected a big change in the pattern of reward, i.e., no reward at all!
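To make the information-theory point concrete, here is a minimal sketch of Shannon's measure of how much information a single outcome carries (its "surprisal"). The 95% figure is a made-up number chosen only for illustration, not something taken from Gallistel's papers, and the function name is my own.

```python
import math

def surprisal_bits(probability: float) -> float:
    """Shannon information content, in bits, of an outcome with this probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)

# Hypothetical numbers: the animal has come to expect a reward
# after the signal 95% of the time.
p_reward = 0.95
p_omission = 1.0 - p_reward

print(f"reward arrives: {surprisal_bits(p_reward):.2f} bits")
print(f"reward omitted: {surprisal_bits(p_omission):.2f} bits")
```

Under these assumed numbers, the expected reward carries less than a tenth of a bit, while its omission carries a little over four bits. In other words, the moment when "nothing happens" is the far more informative event.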

We’re now discovering that the real purpose of dopamine is to help motivate us to gather new information about the outside world quickly and efficiently. In fact, dopamine is released during negative experiences as well as positive ones. (The puppy who gets his nose scratched by the cat doesn’t need further lessons to reinforce the “no-chasing-the-cat” rule; he learns that instantaneously, with a single swipe of the cat’s paw.)

This adds further importance to the idea that learning is not as much about pairing behaviors with their consequences as it is about paying close attention to salient changes in our environment: the bigger the changes, the more dopamine is released, and, therefore, the deeper the learning.

Randy Gallistel again: “...behavior is not the result of a learning process that selects behaviors on the basis of their consequences ... both the appearance of ‘conditioned’ responses and their relative strengths may depend simply on perceived patterns of reward without regard to the behavior that produced those rewards.” (“The Rat Approximates an Ideal Detector of Changes in Rates of Reward: Implications for the Law of Effect,” Journal of Experimental Psychology: 2001, 27, 354-372.)
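For readers who want a feel for what "detecting a change in the rate of reward" could mean in practice, here is a rough sketch. It is not the analysis from Gallistel's paper. It is a simple likelihood comparison I'm assuming for illustration, with made-up data and hypothetical function names. It asks which split point in a run of rewarded (1) and unrewarded (0) trials is best explained by two different reward rates rather than one.

```python
import math

def bernoulli_log_likelihood(successes: int, trials: int) -> float:
    """Best-case log-likelihood of the data under one fixed reward rate."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0  # an estimated rate of 0 or 1 fits such data exactly
    rate = successes / trials
    return successes * math.log(rate) + (trials - successes) * math.log(1.0 - rate)

def most_likely_change_point(rewards: list[int]) -> tuple[int, float]:
    """Return (index, evidence) for the split that best separates two reward rates.

    `evidence` is how much better "the rate changed at index" explains the
    data than "the rate never changed" (a log-likelihood difference).
    """
    total = len(rewards)
    total_rewarded = sum(rewards)
    no_change = bernoulli_log_likelihood(total_rewarded, total)
    best_index, best_gain = 0, 0.0
    rewarded_so_far = 0
    for index in range(1, total):
        rewarded_so_far += rewards[index - 1]
        split = (bernoulli_log_likelihood(rewarded_so_far, index)
                 + bernoulli_log_likelihood(total_rewarded - rewarded_so_far,
                                            total - index))
        gain = split - no_change
        if gain > best_gain:
            best_index, best_gain = index, gain
    return best_index, best_gain

# Hypothetical session: a reward on every trial, then the rewards stop.
session = [1] * 10 + [0] * 10
index, evidence = most_likely_change_point(session)
print(index, round(evidence, 2))
```

Note that nothing in this sketch refers to what the animal was doing when the rewards arrived. It looks only at the pattern of rewards over time, which is the distinction the quote above is drawing.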

Temple Grandin always provides us with keen insights into animal behavior, and more particularly, their thought processes. I think she hits the nail on the head when she says that animal minds are geared toward perceiving vivid sensory details about their environments while the human brain tends to gather these details into conceptual chunks. In general terms: the animal mind is, in most cases, a difference detector, while the human mind is a similarity detector.

So if learning takes place through recognizing changes in the environment—an instantaneous process that releases dopamine—and not through the slow, random, trial-and-error recognition of connections between behaviors and their consequences, this would indicate that the foundation of behavioral science is no longer valid. It would also explain why “dog training isn't working that well” anymore.

This is the 21st century. And while Thorndike's work on how animals learn may have been relevant in 1905, current science no longer supports it.

So what do we replace the outmoded science of “learning by consequences” with?

Good question. My vote would be Natural Dog Training, which marries ancient dog-training wisdom with 21st-century science.

LCK

“Life Is an Adventure—Where Will Your Dog Take You?”
