Chapter 5: Learning
Pavlov's Conditioning Experiments
Russian psychologist Ivan Pavlov hit upon classical (or Pavlovian) conditioning almost by accident when studying digestive processes. He trained a dog to salivate at the sound of a bell by presenting the sound just before food was brought into the room. Eventually the dog began to salivate at the sound of the bell alone.
Elements of Classical Conditioning
Classical conditioning involves pairing a response naturally caused by one stimulus with another, previously neutral stimulus. There are four basic elements to this transfer: The unconditioned stimulus (US), often food, invariably causes an organism to respond in a specific way. The unconditioned response (UR) is the reaction (such as salivation) that always results from the unconditioned stimulus. The conditioned stimulus (CS) is a stimulus (such as a bell) that does not initially bring about the desired response; over the course of conditioning, however, the CS comes to produce the desired response when presented alone. Finally, the conditioned response (CR) is the behavior that the organism learns to exhibit in the presence of a conditioned stimulus.
Classical Conditioning in Humans
Humans also learn to associate certain sights or sounds with other stimuli. John Watson and Rosalie Rayner conditioned a little boy, Albert, to fear white rats by making a loud, frightening noise every time the boy was shown a rat. Using much the same principle, Mary Cover Jones developed a method for unlearning fears: She paired the sight of a caged rat, at gradually decreasing distances, with a child's pleasant experience of eating candy. This method evolved into desensitization therapy, a conditioning technique designed to gradually reduce anxiety about a particular object or situation. Recently, scientists have discovered that the immune system may respond to classical conditioning techniques, thus allowing doctors to use fewer drugs in treating certain disorders.
Classical Conditioning Is Selective
Some kinds of conditioning are accomplished very easily, whereas other kinds may never occur. Research demonstrating that we develop phobias about snakes and spiders, for example, but almost never about flowers or cooking utensils illustrates Seligman's principles of preparedness and contrapreparedness, respectively. The ease with which we develop conditioned food (or taste) aversions also illustrates learning preparedness. Conditioned food aversions are exceptions to the general rules about classical conditioning. Animals can learn to avoid poisonous food even if there is a lengthy interval between eating the food and becoming ill. In many cases, only one pairing of conditioned and unconditioned stimuli is necessary for learning to take place.
Classical conditioning focuses on a behavior that invariably follows a particular event, whereas operant (or instrumental) conditioning concerns the learning of behavior that operates on the environment: The person or animal behaves in a particular way to gain something desired or avoid something unpleasant. This behavior is initially emitted rather than elicited: you wave your hand to flag down a taxi; dogs beg at the dinner table to get food.
Thorndike's Conditioning Experiments
Psychologist Edward Lee Thorndike was the first researcher to study operant behavior systematically. He used a "puzzle box" to determine how cats learn.
Elements of Operant Conditioning
Types of Reinforcement
There are several kinds of reinforcers; all of them strengthen behavior just as steel rods reinforce or strengthen concrete. The presence of positive reinforcers (such as food) adds to or increases the likelihood that a behavior will recur. Negative reinforcers (such as terminating electric shocks) also increase the likelihood that a behavior will recur, but they do so by reducing or eliminating something unpleasant from the environment.
All reinforcers (both positive and negative) increase the likelihood that a behavior will occur again. Punishment, by contrast, is any event whose presence decreases the likelihood that ongoing behavior will recur. Reinforcement always strengthens behavior; punishment weakens it. Avoidance training involves learning a desirable behavior that prevents an unpleasant condition, such as punishment, from occurring.
Operant Conditioning Is Selective
Studies have revealed that in operant conditioning the behaviors that are easiest to condition are those that animals typically would perform in the training situation. These behaviors vary from species to species, and put significant constraints on both classical and operant conditioning.
When something we do is followed closely by a reinforcer, we tend to repeat that behavior, even if it was not actually responsible for producing the reinforcement. Such behaviors are called superstitious. Nonhumans as well as humans exhibit superstitious behaviors.
The failure to avoid or escape from an unpleasant or aversive stimulus that occurs as a result of previous exposure to unavoidable painful stimuli is referred to as learned helplessness. Learned helplessness, which has been demonstrated in both animals and humans, is associated with many of the symptoms characteristic of depression.
COMPARING CLASSICAL AND OPERANT CONDITIONING
A number of phenomena characterize both classical conditioning and operant conditioning, and there are several terms and concepts common to both kinds of learning.
In classical conditioning, responses occur naturally and automatically in the presence of the unconditioned stimulus. During the phase of the learning process called response acquisition, these naturally occurring responses are attached to the conditioned stimulus by pairing that stimulus with the unconditioned stimulus. Intermittent pairing reduces both the rate of learning and the final level of learning achieved.
In operant conditioning, response acquisition refers to the phase of the learning process in which desired responses are followed by reinforcers. A Skinner box is often used to limit the range of available responses and thus increase the likelihood that the desired response will occur. To speed up this process and make the occurrence of a desired response more likely, motivation may be increased by letting the animal become hungry; the number of potential responses may also be reduced by restricting the animal's environment.
For behaviors outside the laboratory, which cannot be controlled so conveniently, the process of shaping is often useful: Reinforcement is given for successive approximations to the desired behavior. However, there are differences among species in what behaviors can be learned and the circumstances under which learning will take hold.
Extinction and Spontaneous Recovery
If the unconditioned stimulus and the conditioned stimulus are no longer paired, extinction occurs, meaning the strength and/or frequency of the learned response diminishes. When Pavlov's dogs received no food after repeatedly hearing the bell, they ceased to salivate at the sound of the bell. Extinction is complete when the subject no longer produces the conditioned response. After a while, however, an extinguished response may reappear without retraining, a process called spontaneous recovery.
Extinction occurs in operant conditioning when reinforcement is withheld. However, the ease with which a behavior is extinguished varies according to several factors: the strength of the original learning, the variety of settings in which learning takes place, and the schedule of reinforcement used during conditioning. Especially hard to extinguish is behavior learned through punishment rather than reinforcement.
Generalization and Discrimination
NEW LEARNING BASED ON ORIGINAL LEARNING
In both classical and operant conditioning, original learning serves as a building block for new learning.
Higher-Order Conditioning in Classical Conditioning
Higher-order conditioning in classical conditioning uses an earlier conditioned stimulus as an unconditioned stimulus for further training. For example, Pavlov used the bell to condition his dogs to salivate at the sight of a black square. This sort of conditioning is difficult to achieve because of extinction: Unless the first unconditioned stimulus is presented occasionally, the initial conditioned response will be extinguished.
Secondary Reinforcers in Operant Conditioning
In operant conditioning, neutral stimuli can become reinforcers by being paired or associated with other reinforcers. A primary reinforcer is one that, like food and water, is rewarding in and of itself. A secondary reinforcer is one whose value is learned through its association with primary reinforcers or with other secondary reinforcers. Money is an example of a secondary reinforcer: in and of itself, it is not rewarding; it is valuable only for what it can buy.
The "if-then" relationship between conditioned stimuli and unconditioned stimuli in classical conditioning or between responses and reinforcers (or punishers) in operant conditioning is called a contingency.
Contingencies in Classical Conditioning
Robert Rescorla has demonstrated that classical conditioning requires more than merely presenting an unconditioned stimulus and a conditioned stimulus together in time. His work shows that for conditioning to occur, a conditioned stimulus must provide information about the unconditioned stimulus; that is, there must be a CS-US contingency. Blocking can occur when prior conditioning prevents conditioning to a second stimulus, even when the two stimuli are presented simultaneously.
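The idea that a conditioned stimulus must carry information about the unconditioned stimulus can be illustrated with a small calculation. The sketch below is not taken from the text: it assumes one common way of putting a number on a contingency, the difference between the probability of the US when the CS is present and when it is absent. The function name `contingency` and the trial lists are hypothetical, made up for illustration only.

```python
def contingency(trials):
    """Return P(US | CS) - P(US | no CS) for a list of trials.

    Each trial is a (cs_present, us_present) pair of booleans.
    Assumption: this difference is used here as a simple index of
    how much information the CS provides about the US.
    """
    us_with_cs = [us for cs, us in trials if cs]
    us_without_cs = [us for cs, us in trials if not cs]
    if not us_with_cs or not us_without_cs:
        raise ValueError("need trials both with and without the CS")
    p_us_given_cs = sum(us_with_cs) / len(us_with_cs)
    p_us_given_no_cs = sum(us_without_cs) / len(us_without_cs)
    return p_us_given_cs - p_us_given_no_cs


# Both made-up procedures pair the CS with the US on 8 of 10 CS trials.
# In the first, the US never occurs without the CS.
informative = ([(True, True)] * 8 + [(True, False)] * 2
               + [(False, False)] * 10)
# In the second, the US is just as likely without the CS.
uninformative = ([(True, True)] * 8 + [(True, False)] * 2
                 + [(False, True)] * 8 + [(False, False)] * 2)

print(contingency(informative))
print(contingency(uninformative))
```

Both procedures contain the same number of CS-US pairings, yet only the first gives the CS any predictive value. Under the assumption above, that is the sense in which pairing in time alone is not enough.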
Contingencies in Operant Conditioning
In operant conditioning, response contingencies are usually referred to as schedules of reinforcement. We rarely receive reinforcement every time we do something. Interestingly, it turns out that partial reinforcement, in which rewards are given for some correct responses but not for every one, results in behavior that persists longer than that learned by continuous reinforcement. The schedule of reinforcement specifies when a reinforcer will be delivered. Reinforcers may be provided on the basis of time since the last reinforcement (the interval between reinforcements). Or reinforcement may depend on the number of correct responses since the last reinforcement (the ratio of reinforcement per correct response).
A fixed-interval schedule provides reinforcement of the first correct response after a fixed, unchanging period of time. A variable-interval schedule reinforces the learner for the first correct response that occurs after various periods of time, so the subject never knows exactly when a reward is going to be delivered. In a fixed-ratio schedule, behavior is rewarded each time a fixed number of correct responses is given; in a variable-ratio schedule, reinforcement follows a varying number of correct responses.
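The four schedules can be sketched as small rules that decide, for each correct response, whether a reinforcer is delivered. This is only an illustration under stated assumptions, not a model from the text: the function names are hypothetical, time is measured in arbitrary units, and the "varying" counts and waiting periods are drawn from uniform random ranges chosen purely for convenience.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response."""
    count = 0

    def schedule(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of correct responses."""
    # Assumption: required counts are uniform on 1 .. 2*mean_n - 1.
    needed = rng.randint(1, 2 * mean_n - 1)
    count = 0

    def schedule(t):
        nonlocal needed, count
        count += 1
        if count >= needed:
            count = 0
            needed = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return schedule


def fixed_interval(period):
    """Reinforce the first correct response after a fixed period."""
    last = 0.0  # time of the last reinforcement (session starts at 0)

    def schedule(t):
        nonlocal last
        if t - last >= period:
            last = t
            return True
        return False

    return schedule


def variable_interval(mean_period, rng):
    """Reinforce the first correct response after a varying period."""
    # Assumption: waiting periods are uniform on 0 .. 2*mean_period.
    wait = rng.uniform(0, 2 * mean_period)
    last = 0.0

    def schedule(t):
        nonlocal wait, last
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return schedule


def reinforced_times(schedule, response_times):
    """Return the times of the correct responses that were reinforced."""
    return [t for t in response_times if schedule(t)]


responses = range(1, 13)  # one correct response per time unit, t = 1..12
print(reinforced_times(fixed_ratio(3), responses))     # [3, 6, 9, 12]
print(reinforced_times(fixed_interval(5), responses))  # [5, 10]
print(reinforced_times(variable_ratio(3, random.Random(0)), responses))
print(reinforced_times(variable_interval(5, random.Random(0)), responses))
```

On the two fixed schedules the reinforced responses fall at regular, predictable points. On the two variable schedules they do not, which matches the description that the subject never knows exactly when a reward is going to be delivered.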
A REVIEW OF CLASSICAL CONDITIONING AND OPERANT CONDITIONING
Despite their differences, classical and operant conditioning share many similarities: both involve associations between stimuli and responses, and both are subject to extinction and spontaneous recovery as well as generalization and discrimination. In fact, many psychologists now question whether classical and operant conditioning are not simply two ways of bringing about the same kind of learning. Biofeedback is an operant conditioning technique in which instruments are used to give learners information about the strength of a biological response over which they seek to gain control.
Both human and nonhuman animals also demonstrate cognitive learning, learning that is not tied to immediate experience by stimuli and reinforcers.
Latent Learning and Cognitive Maps
Insight and Learning Sets
One phenomenon that highlights the importance of cognitive processing in learning is insight, in which learning seems to occur in a "flash." Through insight learning, humans and some nonhuman animals suddenly discover whole patterns of behavior or solutions to problems. Learning sets refer to the increasing effectiveness at problem solving that comes about as more problems are solved.
Learning by Observing
Social learning theory argues that we learn not just from firsthand experience, but also from watching others or hearing about something. Albert Bandura contends that observational (or vicarious) learning accounts for many aspects of human learning. His highly influential theory of learning holds that although reinforcement is unrelated to learning itself, reinforcement may influence whether learned behavior is actually displayed. Such observational learning stresses the importance of models in our lives. To imitate a model's behavior, we must (1) pay attention to what the model does; (2) remember what the model did; and (3) convert what we learned from the model into action. The extent to which we display behaviors that have been learned through observation can be affected by vicarious reinforcement and vicarious punishment. Social cognitive theory emphasizes that learning a behavior from observing others does not necessarily lead to performing that behavior. We are more likely to imitate behaviors we have seen rewarded.
Cognitive Learning in Nonhumans
© 1995-2002 by Prentice-Hall, Inc.
A Pearson Company