Chapter 5
Learning
What is Learning?
• Learning: experience leads to a relatively permanent change in behavior
• Conditioning: a behavior becomes associated with a stimulus
• Stimulus: anything that influences behavior
Classical Conditioning
• Classical (Pavlovian) Conditioning: a response (salivation) naturally elicited by one stimulus (food) comes to be elicited by a different stimulus (bell)
• Ivan Pavlov: Russian physiologist, discovered classical conditioning
4 Classical Conditioning Terms
• 1. unconditioned stimulus (US): stimulus that naturally (reflexively) “elicits” a response
– food
• 2. unconditioned response (UR): response (reflexively) elicited by the unconditioned stimulus
– salivation
4 Classical Conditioning Terms
• 3. conditioned stimulus (CS): through pairing with the US (food), the CS (bell) comes to elicit the same response (salivation)
• 4. conditioned response (CR): same as the unconditioned response (salivation) BUT is elicited by the CS (bell), NOT the US (food)
Classical Conditioning Procedure
Before Conditioning:
Food (US) > Salivation (UR)
Bell (CS) > No Salivation (no CR)
Classical Conditioning Procedure
During Conditioning:
Food (US) + Bell (CS) > Salivation (UR)
Classical Conditioning Procedure
After Conditioning:
Bell (CS) > Salivation (CR)
J.B. Watson’s Little Albert Study
• By “pairing” a loud noise with a white rat (cute), the white rat became a CS for fear in Little Albert
• What is the US? > Loud Noise
• What is the UR? > Crying
• What is the CS? > White Rat
• What is the CR? > Crying
Factors Affecting Conditioning
(not in book)
• Order of presentation: conditioning is much more effective if the CS precedes the US (e.g., first the white rat, then the noise)
• Inter-Stimulus Interval (ISI): about 1/2 second to a few seconds is best
• Number of CS - US pairings: usually many are needed (except in conditioned taste aversion!)
Biological Preparedness Hypothesis
• Martin Seligman: evolution has made us more likely to become conditioned to stimuli that are “potentially dangerous”
– heights, thunder, animals, water, fire, people, insects, etc.
Conditioned Taste Aversion
• An exception to the need for multiple CS - US pairings
• One-trial learning: animals (and sometimes humans) will learn to avoid a taste/smell that has been associated with sickness ONE time. This has “survival value”
Operant Conditioning
• Operant conditioning: the organism “operates” on the environment to cause the occurrence (or nonoccurrence) of some event (an if-then statement)
• Instrumental Conditioning: another name for operant conditioning (the organism is “instrumental” in its own learning)
Edward L. Thorndike
• used his “puzzle box” with cats to study operant-type behavior
Thorndike’s Law of Effect
• Behavior that leads to a pleasant outcome is “stamped in”
– e.g., cat hits lever and escapes box
• Behavior that leads to an unpleasant outcome is “stamped out”
– e.g., cat hits lever and gets shocked
B. F. Skinner
• Greatly expanded on Thorndike’s ideas
• Invented the “Skinner Box”
– a box with a lever (for a rat to press) or a disk (for a pigeon to peck)
– a system for delivering food
– a metal floor that could deliver a mild shock
• Believed all behaviors, thoughts, and words were “learned” and could be studied and, perhaps, changed
Language of Operant Conditioning
• reinforcement: following a behavior with a consequence (event) that INCREASES the probability that the behavior will be repeated
• Positive: something pleasant is added to the situation (e.g., candy)
• Negative: something unpleasant is removed from the situation (e.g., no homework)
Language of Operant Conditioning
• punishment: following a behavior with a consequence (event) that DECREASES the probability that the behavior will be repeated
• Positive: something is added to the situation (e.g., a shock)
• Negative: something is removed from the situation (e.g., fine for speeding)
Four Operant Procedures
• 1. Positive Reinforcement: behavior results in something pleasant being added (behavior increases)
– rat presses lever > rat gets food
• 2. Negative Reinforcement: behavior results in something unpleasant being removed (behavior increases)
– rat presses lever > bigger aggressive rat is removed
Four Operant Procedures
• 3. Positive Punishment: behavior results in something unpleasant being added (behavior decreases)
– rat presses lever > rat gets shocked
• 4. Negative Punishment: behavior results in something pleasant being removed (behavior decreases)
– rat presses lever > rat’s food gets taken away
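The four procedures above boil down to two questions: is a stimulus added or removed, and is that stimulus pleasant or unpleasant? A minimal sketch of that decision logic in Python (the function name and labels are illustrative, not from the slides):

# Hypothetical sketch: classify a consequence into one of the four operant
# procedures described above (names and labels are illustrative).
def operant_procedure(stimulus_is_pleasant, stimulus_is_added):
    if stimulus_is_added:
        # something is added to the situation ("positive")
        return "positive reinforcement (behavior increases)" if stimulus_is_pleasant \
            else "positive punishment (behavior decreases)"
    # something is removed from the situation ("negative")
    return "negative punishment (behavior decreases)" if stimulus_is_pleasant \
        else "negative reinforcement (behavior increases)"

# The four rat examples from the slides:
print(operant_procedure(stimulus_is_pleasant=True,  stimulus_is_added=True))   # food delivered
print(operant_procedure(stimulus_is_pleasant=False, stimulus_is_added=False))  # aggressive rat removed
print(operant_procedure(stimulus_is_pleasant=False, stimulus_is_added=True))   # shock delivered
print(operant_procedure(stimulus_is_pleasant=True,  stimulus_is_added=False))  # food taken away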
Superstitious Behavior
• Discovered by Skinner
• Organism learns “incorrectly” that a behavior produces an outcome
• Example:
– pigeon is turning as food pellet drops
– pigeon “assumes” turning will bring food
Learned Helplessness
• Learned helplessness: may be one cause of human depression
• Organism learns it has no control over the situation (environment)
• Martin Seligman: demonstrated this using dogs and shock
Seligman’s Learned Helplessness Study
• Two groups of dogs are exposed to shock
– control group could escape shock
– “no escape” group could NOT escape shock
• Later, when escape was possible, “no escape” dogs didn’t even try
• Learned that they had NO CONTROL
Shaping
• shaping: complex behaviors are learned in small steps (“reinforcing successive approximations to a goal”)
• any complex behavior (e.g., dog fetching a stick) is learned in small steps
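As a rough illustration of “reinforcing successive approximations,” here is a minimal simulation sketch in Python; the numeric goal, criterion, and update values are hypothetical, not from the slides:

import random

# Hypothetical sketch of shaping: reinforce responses that meet the current
# criterion, then require a little more each time (all numbers are illustrative).
random.seed(0)
goal = 10.0        # the full target behavior (e.g., carrying the stick all the way back)
habit = 0.0        # the learner's current typical response
criterion = 0.5    # responses at or above this level are reinforced

for trial in range(1, 301):
    response = habit + random.uniform(-1.0, 1.0)   # behavior varies around the habit
    if response >= criterion:                      # a successive approximation to the goal
        habit += 0.5 * (response - habit)          # reinforcement strengthens that behavior
        criterion = min(goal, habit + 0.5)         # tighten the criterion toward the goal
    if trial % 50 == 0:
        print(f"trial {trial:3d}: habit = {habit:4.1f}, criterion = {criterion:4.1f}")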
Response Acquisition
• Learning is slow at first, becomes more rapid, then levels off
(Figure: strength of the CR plotted against trials 0-25, showing the acquisition curve)
Extinction
• Repeated presentation of the CS without the US weakens, then eliminates the CR
(Figure: strength of the CR plotted against trials 0-25 with NO US (food); the CR declines)
Spontaneous Recovery
• After extinction, and then a period of rest, the CR returns without any additional conditioning
(Figure: strength of the CR plotted against trials 0-25; after a rest the CR reappears)
Generalization - Discrimination
• stimulus generalization: response is made to the original AND “similar” CS
– Little Albert
• stimulus discrimination: organism learns to respond to the original CS but NOT to similar ones
– pigeon learns to peck at red light but not at green light
Higher Order Conditioning
• Can a CS be used as if it were a US to condition a new, second CS? YES
• A CS (bell) is used as a US to condition a new CS (light) to elicit the same CR (salivation)
Higher Order Conditioning
Before Higher Order Conditioning:
Bell (CS1) > Salivation (UR)
Light (CS2) > No Response
Higher Order Conditioning
During Higher Order Conditioning:
Bell (CS1) + Light (CS2) > Salivation (UR)
Higher Order Conditioning
After Higher Order Conditioning:
Light (CS2) > Salivation (CR)
Primary and Secondary Reinforcers
• primary reinforcers: naturally reinforcing
– food, water, sex
• secondary reinforcers: become reinforcing through association with primary reinforcers
– for humans, MONEY is the most potent secondary reinforcer
The Blocking Effect & Contingencies
• A problem for Pavlov:
– Pavlov thought all that is needed for conditioning is for a CS and US to be presented together
• Leon Kamin: showed that it was not this simple
– prior conditioning of a tone with shock prevented rats from later being able to associate a light with the shock
Original Fear Conditioning
Tone is paired with a shock:
Tone (CS) + Shock (US) > Fear (UR)
After Original Fear Conditioning
Tone (CS) > Fear (CR)
Tone signals shock and elicits fear
Try to Condition Light
Light is presented WITH tone AND shock:
Tone (CS) + Light (CS) + Shock (US) > Fear (UR/CR)
IMPORTANT! In contrast to “higher order conditioning,” the US “shock” is still present
Light has NOT Conditioned
Light (CS) > NO Fear (no CR)
The light-shock association has NOT been learned
The light does not tell the rat anything it doesn’t already know, so the rat ignores the light
Prior learning of the tone-shock relationship “BLOCKED” learning of the light-shock relationship
Contingency Theory
• Explains the blocking effect
• For conditioning to occur, the CS must provide “useful information” about the US to the organism
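One standard way to formalize this “useful information” idea (not named on these slides) is the Rescorla-Wagner learning rule: a CS gains associative strength only to the extent that the US is surprising. A minimal sketch, with an illustrative learning rate and trial counts, shows how prior tone-shock learning blocks the light:

# Hypothetical sketch of the Rescorla-Wagner rule (not from the slides):
# each CS that is present gains strength in proportion to how surprising the US is.
def rescorla_wagner(trials, strengths, present, alpha=0.3, us_max=1.0):
    for _ in range(trials):
        prediction = sum(strengths[cs] for cs in present)  # combined prediction of the US
        error = us_max - prediction                        # surprise: actual US minus prediction
        for cs in present:
            strengths[cs] += alpha * error
    return strengths

v = {"tone": 0.0, "light": 0.0}
rescorla_wagner(20, v, present=["tone"])             # phase 1: tone alone paired with shock
rescorla_wagner(20, v, present=["tone", "light"])    # phase 2: tone + light, shock still present
print(v)   # tone ends near 1.0; light stays near 0.0 because the shock was never surprising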
Schedules of Reinforcement
• Continuous reinforcement: each response is reinforced
• Partial (intermittent) reinforcement: reinforcement does not follow every response
– ratio schedules: several responses are required before reinforcement
– interval schedules: a certain amount of time must pass before the next reinforcement
Fixed Ratio
• FR-10: a set number of bar presses (10) must be performed to get food
• rat presses 10 times, gets food, rests, starts again
– produces a “stepped” response record
– ex. being paid for “piece work”
Variable Ratio
• VR-10: the number of bar presses needed for food will vary but will AVERAGE a certain number (10)
• Rat never knows when food is coming, so he keeps pressing
– produces a steady, steep response record
– ex. playing a slot machine
– **** “Highest” rate of responding ****
Fixed Interval
• FI-10: food is available every 10 seconds IF the bar is being pressed
– the rat senses time; as the 10-sec. mark approaches, he presses more quickly
– produces a “scallop-shaped” record
– ex. more frequent trips to the employment office when more jobs are available
Variable Interval
• VI-10: food is available ABOUT every 10 seconds, but the rat must be pressing the bar as the time approaches
– steady, low-rate response record
– ex. studying for a teacher who gives “pop” quizzes
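To make the four partial schedules concrete, here is a minimal simulation sketch (illustrative only; the ratio/interval of 5, the once-per-second responding, and the function names are assumptions, not from the slides):

import random

def simulate(schedule, steps=60):
    # The animal responds once per second for `steps` seconds (a simplification);
    # returns the seconds at which food was delivered under the given schedule.
    rewards, state = [], {"count": 0, "last": 0, "next_at": None}
    for t in range(1, steps + 1):
        if schedule(t, state):
            rewards.append(t)
    return rewards

def fr5(t, s):   # Fixed Ratio 5: every 5th response is reinforced
    s["count"] += 1
    if s["count"] == 5:
        s["count"] = 0
        return True
    return False

def vr5(t, s):   # Variable Ratio 5: each response pays off with p = 1/5, so on AVERAGE every 5th
    return random.random() < 0.2

def fi5(t, s):   # Fixed Interval 5: the first response at least 5 s after the last food is reinforced
    if t - s["last"] >= 5:
        s["last"] = t
        return True
    return False

def vi5(t, s):   # Variable Interval 5: the required wait varies (2-8 s), averaging about 5 s
    if s["next_at"] is None:
        s["next_at"] = t + random.randint(2, 8)
    if t >= s["next_at"]:
        s["next_at"] = None
        return True
    return False

for name, schedule in [("FR-5", fr5), ("VR-5", vr5), ("FI-5", fi5), ("VI-5", vi5)]:
    print(name, simulate(schedule))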
Cumulative records of a rat bar-pressing in a Skinner Box. The lines are made up of dots, with each dot (moving upward) representing a bar press.
If you want a lot of work from a rat or a person, use a variable ratio schedule!
Latent Learning
• Latent learning: learning that is not apparent in behavior
• Demonstrated in an experiment by Tolman and Honzik
• A problem for Skinner’s idea that learning requires “reinforcement” or “punishment”
Tolman and Honzik’s Study
• rats (3 groups) learned a maze over about 2 weeks
• Group A:
– no food at end; their speed through the maze did not increase much at all
• Group B:
– food at end daily; they ran the maze a little faster each day
• Group C (actually a sub-group of A):
– no food till day 11, then they immediately ran the maze as fast as Group B
Influence of Tolman’s Study
• It showed that learning CAN occur without “reinforcement”
• Group C rats were learning about the maze but didn’t show it until given a good reason (food at the end of the maze)
• Learning may be naturally reinforcing
• So, there is more to operant learning than Skinner thought
Albert Bandura’s Social Learning Theory
• We can learn by watching
• Observational (vicarious) learning - we observe the behaviors of others and the consequences of those behaviors
• Vicarious reinforcement - if their behaviors are reinforced, we tend to imitate the behaviors
• Vicarious punishment - if their behaviors are punished, we tend NOT to imitate the behaviors
Social Learning Theory (cont.)
• the “Bobo Doll Study” - Bandura’s classic experiment demonstrating observational learning
• two groups of children watched an adult get either rewarded or punished for behaving aggressively with a doll
• children who saw the adult rewarded were later more likely to be aggressive when placed in the same situation
Learning and Human Behavior
Classical Conditioning
• Explains how some phobias develop
– dog is not feared
– dog bites person
– dog becomes CS for fear
Two Factor Theory of Avoidance
• O. H. Mowrer (not in book)
– the two “factors” are classical and operant conditioning
• 1. person learns to fear a dog via classical conditioning
• 2. each time the dog is avoided, avoidance behavior is “reinforced” by?
• negative reinforcement
– avoidance (removal) of the dog lowers anxiety, so the behavior is repeated