Griggs Chapter 4: Learning


Learning
Psychology:
A Concise Introduction
2nd Edition
Richard Griggs
Chapter 4
Prepared by
J. W. Taylor V
Different Perspectives


Behavioral psychologists have focused on
the learning of associations through classical
conditioning and operant conditioning
Cognitive psychologists studying learning
are interested in the more complex type of
learning involved in human memory
The Journey…

Learning Through Classical
Conditioning

Learning Through Operant
Conditioning

Biological and Cognitive Aspects
of Learning
Learning Through
Classical Conditioning
The Elements and Procedures of
Classical Conditioning
General Learning Processes
in Classical Conditioning
Classical Conditioning


The process of learning in which one
stimulus signals the arrival of another
stimulus
Sometimes called “Pavlovian” conditioning
because Ivan Pavlov was the first
researcher to systematically study this type
of learning
Pavlov



A Russian physiologist studying digestive
processes in dogs
The dogs were strapped into harnesses and had tubes
inserted into their cheeks to measure the amount of
salivation, the initial step in
the digestive process
With time, he noticed that
the dogs started to salivate
before the meat powder
was even put in their
mouths, and wanted to know
why this was happening
Elements and Procedures
of Classical Conditioning
Unconditioned
Stimulus
(UCS)
Unconditioned
Response
(UCR)
Associated
Similar
Conditioned
Stimulus
(CS)
Conditioned
Response
(CR)
UCS and UCR



Dogs salivate when meat powder is put
in their mouths – this is a reflex, a
response that occurs automatically in
the presence of a certain stimulus
(i.e., meat powder)
An unconditioned stimulus (UCS)
is the stimulus that elicits the reflexive
response (here, the UCS is the meat
powder)
An unconditioned response (UCR)
is the response automatically elicited
by the UCS (here, the UCR is salivating
in response to the meat powder)
CS and CR




A neutral stimulus is a stimulus that does not naturally
elicit the to-be-conditioned response (e.g., auditory tones)
To achieve conditioning, the neutral stimulus (a tone) is
presented just before (ideally one-half to one full second
before) the UCS (meat powder) for several trials
Once the conditioning occurs (that is, the dog
starts to salivate to the sound of the tone
before the food is put in its mouth), the
neutral stimulus is called the
conditioned stimulus (CS)
The learned response to the
conditioned stimulus is called the
conditioned response (CR)
Delayed and Trace Conditioning

In delayed conditioning, the offset of the CS is delayed
until after the UCS is presented so that the two stimuli
occur at the same time

The tone would be turned on and continue to sound until the
meat powder was placed in the dog’s mouth

In trace conditioning, there is a period of time between
the offset of the CS and the onset of the UCS when
neither stimulus is present

It is called “trace” conditioning because there must be a memory
trace of the CS for the association between stimuli to be learned

Delayed conditioning is the most effective procedure for
classical conditioning; trace conditioning can be effective
provided the interval between stimuli is brief
A Summary of
Classical Conditioning
The “Little Albert” Study




John Watson and Rosalie Rayner conducted a
study on an 11-month-old infant named Albert
While Albert was looking at a little white rat,
Watson quietly sneaked behind him with a long
iron bar and a hammer and clanged the two
together
Albert’s reflexive response, the UCR, was a fear-avoidance response (e.g., crying and trying to
crawl away) to the loud noise, which was the UCS
After pairing the white rat with the unexpected loud
noise only 7 times, the white rat became a CS
Other Evidence


Elise Bregman was unable to condition infants to
fear inanimate objects such as wooden blocks
and cloth curtains, suggesting possible biological
predispositions to learn certain fears more easily
than others
Classical conditioning can be used to condition
positive reactions, such as in advertising to
condition positive attitudes and feelings toward
certain products (e.g., a celebrity serves as a
UCS, and the product as the CS)
General Learning Processes
in Classical Conditioning

Acquisition is the process of acquiring a new
response, that is, a CR to a CS


The strength of the CR increases during acquisition
Other processes:
Stimulus
Generalization
Extinction
Spontaneous
Recovery
Stimulus
Discrimination
Extinction


A CS must reliably signal that the UCS is
coming
Extinction is the disappearance of the CR
when the UCS no longer follows the CS

The strength of the CR decreases during
extinction
Spontaneous Recovery


During the extinction process, however, the
CR mysteriously increases somewhat in
strength following a rest interval
This is called spontaneous recovery

As extinction continues, the recovery observed
following rest intervals continues to decrease
until it is minimized
Acquisition, Extinction,
and Spontaneous Recovery
Stimulus Generalization

Stimulus generalization is giving the CR to a
stimulus similar to the CS



The more similar the stimulus is to the
CS, the stronger the response will be
For example, if a dog learns to bark at
the door bell, she may, at least in a new
home, also bark at the telephone
because both stimuli are ringing noises
This is an adaptive process because
classical conditioning would not be very
useful if it only allowed us to learn
relationships between specific stimuli
Stimulus Discrimination


Overgeneralizing a response may not be adaptive,
however; thus, we need to learn to discriminate
among stimuli
Stimulus discrimination
is learning to give the CR only to the CS or only to a
small set of very similar stimuli
including the CS

For example, after being in her
new home for a period of time,
the dog will learn to
differentiate/discriminate
between the door bell and the
phone ringing noises
Discrimination Training

During discrimination training, you present
many different stimuli numerous times, but
the UCS only follows one CS

This procedure will extinguish the responses to
other stimuli
Stimulus Generalization
and Discrimination
Learning Through
Operant Conditioning
Learning Through Reinforcement
and Punishment
General Learning Processes
in Operant Conditioning
Partial-Reinforcement Schedules
in Operant Conditioning
Motivation, Behavior,
and Reinforcement
Operant Conditioning
Learning to associate behaviors
with their consequences


Behaviors that are reinforced (lead to satisfying
consequences) will be strengthened, and
behaviors that are punished (lead to
unsatisfying consequences) will be weakened
Called “operant” conditioning because the
organism needs to “operate” on the environment
to bring about consequences from which to learn
The Law of Effect

Thorndike’s Law of Effect states that any
behavior that results in satisfying
consequences tends to be repeated, and
any behavior that results in unsatisfying
consequences tends not to be repeated
Learning Through
Reinforcement and Punishment


A reinforcer is a stimulus that increases the probability
of a prior response
Reinforcement is the process by which the probability of
a response is increased by the presentation of a
reinforcer following the response

For example, if you operantly
condition your dog to bark by
giving her a treat each time
she “speaks,” the food would
be a reinforcer, and the process
of increasing the dog’s speaking
behavior by using this reinforcer
would be called reinforcement
Learning Through
Reinforcement and Punishment


A punisher is a stimulus that
decreases the probability of a prior
response
Punishment is the process by
which the probability of a response
is decreased by the presentation of
a punisher following the response

For example, if you conditioned your
dog to stop getting up on the couch by
spraying her with water each time she
got up on the couch, the spraying
would be the punisher, and the process
of decreasing her couch jumping
behavior would be called punishment
Positive and Negative

Positive means that a stimulus is presented


In both positive reinforcement and positive
punishment, a stimulus is presented
Negative means that a stimulus is removed

In both negative reinforcement and negative
punishment, a stimulus is removed
Appetitive and Aversive


An appetitive stimulus is a stimulus that an
organism finds pleasing (e.g., food, money)
An aversive stimulus is a stimulus that an
organism finds unpleasing (e.g., sickness,
social isolation)
Appetitive and Aversive

Consequently...




In positive reinforcement, an appetitive stimulus
is presented (e.g., praise for good work)
In positive punishment, an aversive stimulus is
presented (e.g., scolding for doing poor work)
In negative reinforcement, an aversive stimulus
is removed (e.g., using a heating pad for a sore
back)
In negative punishment, an appetitive stimulus
is removed (e.g., parents taking away dessert from
a child)
Positive and Negative
Reinforcement and Punishment
                 Positive               Negative
Reinforcement    Appetitive stimulus    Aversive stimulus
                 presented              removed
Punishment       Aversive stimulus      Appetitive stimulus
                 presented              removed
How do we know?

In any example of positive or negative
reinforcement or punishment, it is critical to
realize that we only know that a stimulus has
served as a reinforcer or a punisher, and led
to reinforcement or punishment, by observing
whether the behavior subsequently increases
or decreases
Primary and
Secondary Reinforcers


A primary reinforcer is innately reinforcing
since birth (e.g., food, social contact)
A secondary reinforcer is not innately
reinforcing, but gains reinforcing properties
through learning (e.g., money, good grades)
General Learning Processes
in Operant Conditioning

Shaping occurs when an animal is trained to make
a particular response by reinforcing successively closer
approximations to the desired response


With humans, this might mean reinforcing a child the closer he
comes to making his bed correctly each morning
Responding in an operant conditioning
experiment is depicted in a
cumulative record – a record
of the total number of
responses over time


It is a visual depiction of the
rate of responding
As the slope of a line in a cumulative
record gets steeper, the response rate is faster
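To make the idea concrete, here is a small sketch (not from the text; the function name and the sample response times are purely illustrative) that builds a cumulative record from a list of response times. A fast responder’s record climbs steeply; a slow responder’s stays nearly flat.

```python
def cumulative_record(response_times, end_time):
    """Total number of responses made by each whole second up to end_time."""
    times = sorted(response_times)
    counts = []
    total = 0
    i = 0
    for t in range(end_time + 1):
        # Count every response that has occurred by time t
        while i < len(times) and times[i] <= t:
            total += 1
            i += 1
        counts.append(total)
    return counts

# A steady, fast responder versus a slow one over the same 3 seconds
fast = cumulative_record([0.5, 1.0, 1.5, 2.0, 2.5, 3.0], end_time=3)
slow = cumulative_record([1.0, 3.0], end_time=3)
# fast climbs more steeply than slow, i.e., a higher response rate
```

Plotting `counts` against time gives exactly the cumulative record described above: the steeper the curve, the faster the responding.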
How to Understand
a Cumulative Record
Acquisition, Extinction,
and Spontaneous Recovery


Acquisition refers to the strengthening of the
reinforced operant response
Extinction is the disappearance of the
operant response when it is no longer
reinforced


The decreasing slope of the record indicates that
the response is being extinguished (i.e., there are
fewer and fewer responses over time)
Spontaneous recovery is the temporary
recovery of the operant response following a
break during extinction training
Acquisition, Extinction, and
Spontaneous Recovery
Vending Machines



We learn that by putting money into a
vending machine, we get something we
really like. We acquire the response of
inserting money into this particular machine.
But one day, we put money in and get no
food out. This happens again. Soon, we stop
putting money in the vending machine. Our
response is being extinguished.
However, after a period of time, we go back
and try again (spontaneous recovery). If
the machine has been repaired, we will get
our food, and our response rate returns to its
previous level. If not, we continue along our
extinction trail.
Discrimination and
Generalization

A discriminative stimulus is one that has to be
present for the operant response to be reinforced



It “sets the occasion” for the response to be reinforced
For example, a rat learns that pressing a lever will result in
food only when a light is on, but not when the light is off
Stimulus generalization is giving the operant
response in the presence of stimuli similar to the
discriminative stimulus

For example, the rat learns to press
the lever for food only when the light
is a certain shade of red. Presentation
of different colored lights following
acquisition constitutes a test for
generalization.
Partial-Reinforcement Schedules
in Operant Conditioning


Reinforcing every response is called a
continuous schedule of reinforcement
Partial schedules of reinforcement reinforce
behavior only part of the time

The partial-reinforcement effect states that responses
that are reinforced according to a partial schedule
rather than a continuous schedule are more
resistant to extinction
Partial-Reinforcement Schedules




A ratio schedule is based on the number of
responses made
An interval schedule is based on the amount of
time that has elapsed
In a fixed schedule, the number of responses
required for a ratio schedule or the amount of time
needed for an interval schedule is fixed
In a variable schedule, the number of responses
required for a ratio schedule and amount of time
for an interval schedule varies on each trial
Ratio Schedules


In a fixed ratio schedule, a reinforcer is delivered
after a fixed number of responses are made (e.g.,
a rat has to press a lever 10 times before receiving
the reinforcer of food)
In a variable ratio schedule, the number of
responses it takes to obtain a reinforcer varies on
each trial but averages out to be a certain number
over trials (e.g., slot machine payoffs)
Fixed Ratio and Variable Ratio
Schedules of Partial Reinforcement
Interval Schedules


In a fixed interval schedule, a reinforcer is
delivered after the first response is given once a
set interval of time has elapsed (e.g., periodic
exams in a class, with most behaving/studying
occurring right before the exam/reinforcer)
In a variable interval schedule, a reinforcer is
delivered after a different time interval on each
trial, but the time intervals over trials average to be
a set time (e.g., pop quizzes)
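The four schedule rules can be sketched as simple decision functions, one per schedule. This is only an illustration of the definitions above, not a formalism from the text; all function names and parameter values are assumed for the example.

```python
import random

def fixed_ratio(n_responses, ratio=10):
    # Reinforce every `ratio`-th response (e.g., every 10th lever press)
    return n_responses > 0 and n_responses % ratio == 0

def variable_ratio(mean_ratio=10):
    # Each response pays off with probability 1/mean_ratio, so payoffs
    # average one per `mean_ratio` responses (like a slot machine)
    return random.random() < 1 / mean_ratio

def fixed_interval(seconds_since_reinforcer, interval=60):
    # The first response after a set interval has elapsed is reinforced
    return seconds_since_reinforcer >= interval

def variable_interval(seconds_since_reinforcer, mean_interval=60):
    # The required wait differs on each trial but averages `mean_interval`
    required = random.uniform(0.5 * mean_interval, 1.5 * mean_interval)
    return seconds_since_reinforcer >= required
```

Note the two axes from the definitions: ratio vs. interval is *what is counted* (responses vs. time), and fixed vs. variable is *whether the requirement changes from trial to trial*.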
Fixed Interval and Variable Interval
Schedules of Partial Reinforcement
Which is best?

Ratio schedules lead to higher rates of
responding than interval schedules (steeper
slopes on the cumulative record)

Variable schedules lead to fewer breaks (no
responding occurring) after reinforcement than
fixed schedules

With respect to extinction, it will take longer to
extinguish a response with a partial-reinforcement
schedule than with a continuous-reinforcement schedule
Motivation, Behavior,
and Reinforcement
Motivation is the set of internal and external
factors that energize our behavior and direct it
toward goals
Theories of Motivation

Drive-reduction theory proposes that first,
a bodily need (such as hunger) creates a
state of bodily tension called drive; then, a
motivated behavior (seeking food) works to
reduce this drive by obtaining
reinforcement (food) to eliminate
this need and return the body to a
balanced internal state.


In essence, we are “pushed” into action
by unpleasant drive states
Effective at explaining biological needs
such as hunger and thirst
Theories of Motivation

Incentive theory proposes that we are “pulled”
into action by incentives, external environmental
stimuli that do not involve drive reduction


For instance, students may be
motivated by getting good grades,
leading them to work and study hard
Money is another classic example of
an incentive that “pulls” us into
behaving in certain ways
Theories of Motivation

Arousal theory contends that our behavior is
motivated to maintain an optimal level of
arousal, which varies among people



When below the optimal level, we are motivated to
raise our arousal to that level
When over-aroused, we are motivated to lower our
arousal level to our optimal level of arousal
Arousal theory argues that our level of arousal
impacts our performance level, with a certain level
being optimal
The Yerkes-Dodson Law
Increased arousal will aid performance up to a
point, after which further arousal impairs
performance
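The inverted-U relationship can be sketched with a hypothetical function. The shape of the curve and the location of the optimum vary across people and tasks, so every number here is purely illustrative, not data from the text.

```python
def performance(arousal, optimal=0.5):
    """Hypothetical inverted-U: performance peaks at `optimal` arousal
    (arousal scaled 0-1); the quadratic shape is illustrative only."""
    width = max(optimal, 1 - optimal)
    return 1.0 - ((arousal - optimal) / width) ** 2

# Performance rises with arousal up to the optimum, then declines
under = performance(0.1)   # under-aroused
best = performance(0.5)    # optimal arousal
over = performance(0.9)    # over-aroused
```

The key qualitative feature is the single peak: moving toward the optimum from either side improves performance; moving past it hurts.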
Extrinsic vs. Intrinsic Motivation


Extrinsic motivation is the desire to perform
behavior to obtain an external reinforcement or to
avoid an external aversive stimulus
Intrinsic motivation is the desire to perform a
behavior effectively and for its own sake


Reinforcement is provided by the activity itself
For example, why do students study for classes?


An extrinsic motivator would be grades
An intrinsic motivator would be enjoyment of the
information
The Overjustification Effect

In a study by Lepper, Greene, and Nisbett
(1973), some children who enjoyed playing
with felt-tipped pens (they did so during free-play
periods and were hence initially
intrinsically motivated) were subsequently
given prizes, an extrinsic
incentive, for playing with
the pens. Other such
children were not given
prizes for playing with
the pens.
The Overjustification Effect

A week later, when no prizes were given for
playing with the pens, the children who had not
received prizes a week earlier still continued
to play with the pens, but children who had
been given prizes spent much less time
playing with the pens.
The Overjustification Effect



Occurs when there is a decrease in an
intrinsically motivated behavior after the
behavior is extrinsically reinforced and the
reinforcement is then discontinued
The overjustification effect indicates that a
person’s cognitive processing is influencing
their behavior and that such processing may
lessen the effectiveness of extrinsic reinforcers
However, extrinsic reinforcement is not likely
to impact intrinsic motivation if the extrinsic
reinforcement is dependent upon doing
something well versus just doing it
Biological and Cognitive
Aspects of Learning
Biological Preparedness
in Learning
Latent Learning and
Observational Learning
Biological Preparedness
in Learning
Our preparedness to learn to fear objects
dangerous to us (e.g., heights) and to
avoid foods and drinks that make us sick
has adaptive significance
Taste Aversion


Garcia and Koelling (1966) were studying the
effects of radiation on rats
After several radiation treatments, the rats
would still go into their experimental cages
where they had been radiated, but would no
longer drink the water in their experimental
cages
Taste Aversion


The researchers discovered that the water
bottles in the experimental cages were made
of a different substance than those in the home
cages – plastic versus glass
Thus, the water had a different taste in the two
cages, and the rats quickly learned an
aversion to the experimental cage water
bottles, even though the sickness came hours
after being in the experimental cages
Taste Aversion


Thus, taste aversion is a dramatic
counterexample to the rule that the UCS
(sickness) must immediately follow the CS (the
different tasting water) for learning to happen
The rats did not learn
taste aversion for any
pairing of cue and
consequence…
Taste Aversion

The researchers examined two cues that were both
paired with sickness through radiation:



Sweet-tasting water
Normal-tasting water with clicking noises and flashing lights
occurring when the rats drank the water
The rats that drank the sweet-tasting water easily
learned the aversion to the water, but the rats that
drank normal-tasting water while they experienced
clicking noises and flashing lights did not do so

The rats just couldn’t learn to pair these environmental
auditory and visual cues that occurred during their drinking
with their later sickness; this pairing did not make any
“biological” sense to the rats
Instinctual Drift



The tendency of an animal to drift back from a
learned operant response to an object toward an
innate, instinctual response
Thus, biologically instinctual responses
sometimes limit or hinder our ability to condition
other less natural responses
Organisms will learn certain associations (those
consistent with their natural behavior) more
easily than others (those less consistent with
their natural behavior)
Latent Learning and
Observational Learning

Latent learning is learning that occurs but is not
demonstrated until there is incentive to do so


For example, students study for classes, but do not
openly demonstrate learning until an exam, for which
the incentive is a good grade
Observational learning (modeling) is learning
by observing others and imitating their behavior
Latent Learning


Edward Tolman was a pioneer researcher
on latent learning
Food-deprived rats had to negotiate a maze, and the
number of wrong turns/errors was counted
 For example, in a three-group
experiment, food (reinforcement)
was always available in the
goal box at the maze’s end
for one group but never
available for another group.
For a third group, no food was
available until the 11th day of the
experiment.
Latent Learning

Interestingly, the performance of the group that only
started getting food reinforcement on the 11th day
improved immediately on the next day equal to that of
the group that had always gotten food reinforcement
 Thus, it appears that they had been
learning the maze all along, but
did not demonstrate their
learning until the incentive
was made available
Latent Learning
Observational Learning
Albert Bandura’s pioneering research on modeling
In one experiment, some children were exposed to an adult
who beat, kicked, and yelled at a Bobo doll





After observing this behavior, a child is taken to another room
filled with many appealing toys, but is told the toys are for other
children and s/he cannot play with them
Later, the child goes to another room with toys s/he can play
with, including a Bobo doll
The child proceeds to beat the Bobo doll in much the same way
the adult model did
However, when exposed to a gentle model, children acted
more gently toward the doll than children exposed to the
aggressive model
Thus the children’s behavior was guided by the behavior of
the model to which they were exposed
Observational Learning



In another experiment, an adult was rewarded for
aggressive behavior, punished for aggressive behavior,
or received no consequences at all
The children who saw the adult get reinforced for
aggressive behavior acted more aggressively toward
the Bobo doll than those who had seen the model act
with no consequences
In addition, the children who had watched the adult get
punished were less likely to act aggressively toward the
doll than children who had not been exposed to any
consequences for acting aggressively toward the doll