Operant Conditioning

Introduction

Through classical (Pavlovian) conditioning, an organism associates different stimuli that it does not control. Through operant conditioning, the organism associates its behaviors with consequences. Behaviors followed by reinforcers increase; those followed by punishers decrease. This simple but powerful principle has many applications and also several important qualifications.

Operant means…

….Explain and train

Operant Conditioning

• A type of learning in which responses can be controlled by their consequences

i.e. rewards or punishments

Reward vs. Punishment

Reward = more likely behavior will repeat
Punishment = less likely to repeat behaviors
Which is better?

Behavior

Respondent Behavior
– Behavior that occurs as an automatic response to some stimulus
Ex: food when hungry; water when thirsty

Operant Behavior
– The act operates on the environment to produce rewarding or punishing stimuli
Ex: good grades = MONEY; bad grades = grounded

Important People in Operant Conditioning

B.F. Skinner: Radical Behaviorism, Skinner Box

Edward Thorndike: Law of Effect, Puzzle Box

Skinner

Operant Chamber
– “Skinner Box”
– Soundproof
– Bar or key that an animal presses or pecks to release a reward of food or water
– Device that records these responses

Shaping
– Procedure in which reinforcers (like food) gradually guide an animal’s actions toward a desired behavior

Edward L. Thorndike

Law of Effect:
– Rewarded behavior is likely to recur
– Puzzle Box

Operant Conditioning Chamber

Skinner Box Puzzle Box

Two important concepts used in Operant Conditioning

• Reinforcer – A stimulus or event that increases the odds of repeating a behavior
– I give my kids money when they clean their room…this stimulus increases the odds they will do it again
• Punisher – A stimulus or event that functions to decrease the odds of repeating a behavior
– I spank my kids when they throw food at the dinner table…this event decreases the odds they will do it again
• Remember…
– It is often the learner that determines if something is a reinforcement or punishment
– This is called the Premack Principle
– I might give Ryan broccoli after he did a chore and if he likes it he will do more chores
– Or I might give Ryan broccoli after he did a chore and he may never do that chore again
– My feelings toward broccoli make no difference

Reinforcer

Anything likely to increase a behavior

Two Types of Reinforcement: Positive and Negative

Positive Reinforcement

• Something desirable is added to the environment and this encourages (reinforces) behavior
– Behaviors are strengthened when they are followed by the introduction of a stimulus

Negative Reinforcement

• Something undesirable is subtracted from the environment and this encourages (reinforces) behavior
– Negative reinforcers (NR) are aversive stimuli such as loud noise, cold, pain, or nagging
• We are more likely to repeat behaviors that lead to their removal
– Example
• Say I have a headache
• The NR is the pain of the headache
• I take aspirin and the headache goes away
• Headache pain (stimulus) → aspirin (response) → headache gone (consequence)
• I will take aspirin again because it removed something unpleasant

So…positive and negative do not mean good or bad. Instead, positive means adding a stimulus, and negative means removing a stimulus.
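The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_consequence` and its string arguments are made up for this sketch, and the labels "positive punishment" and "negative punishment" come from applying the same add/remove rule to punishers rather than from the slides so far.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name a consequence from two observations.

    stimulus_change: "added" or "removed" (what happened to the stimulus)
    behavior_change: "increases" or "decreases" (what happened to the behavior)
    Both argument vocabularies are assumptions made for this sketch.
    """
    # "positive" and "negative" describe adding or removing, not good or bad
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement strengthens a behavior; punishment weakens it
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Examples taken from the slides
money_for_cleaning = classify_consequence("added", "increases")
aspirin_for_headache = classify_consequence("removed", "increases")
spanking_for_food_throwing = classify_consequence("added", "decreases")
```

Here money for a clean room comes out as positive reinforcement, aspirin removing a headache as negative reinforcement, and spanking as positive punishment, because in each case the label depends only on whether a stimulus was added or removed and on what the behavior does afterward.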

The Simpsons

Reinforcement Schedules

The pattern (schedule) in which reinforcement is given.

These schedules influence learning

Continuous Reinforcement

Reinforcing the desired response every time it occurs.

Example – vending machine

Quick Acquisition
Quick Extinction

Partial Reinforcement

Reinforcing a response only part of the time.
– Slot machine
– You don’t expect to win every time but hope to win sometime
– The acquisition process is slower, but…
– Greater resistance to extinction.

• 4 different partial reinforcement schedules
– Two focus on time between reinforcement (interval schedule)
– Two focus on number of responses between reinforcement (ratio schedule)

Fixed-Interval Schedule

• Reinforcement of a behavior after a specified or fixed time (interval) has passed.
• You get paid every two weeks
• A worker gets a bonus once a year
– After receiving a reward (a reinforcement) the worker has to wait one year for another reward (fixed interval)

Variable-interval Schedule

• Reinforcement of a behavior at unpredictable (variable) time intervals.
• You don’t know when the reinforcement is coming so you keep trying or have to be prepared to take action
• Example: pop quizzes

Fixed-ratio Schedules

• Reinforcement of a behavior only after a specified (fixed) number of responses
• Movie rentals that say rent 5 get one free
• A worker gets a bonus after every three items he sells

Variable-ratio Schedule

• Reinforcement of a behavior after an unpredictable (variable) number of responses.
– Working on sales commission
• Sometimes called the gambler’s schedule
– Back to the lottery…
– You don’t know when you will win but you do know the more you buy the better your chances
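The four partial reinforcement schedules differ only in the rule that decides whether a given response earns a reward. A minimal Python sketch of those rules follows. All function names, the time units, and the choice of random distributions for the variable schedules are assumptions made for illustration; they are not part of the slides or of any standard library for conditioning experiments.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g. a bonus after every three sales)."""
    count = 0

    def schedule(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n, so a reward
    arrives after an unpredictable number of responses averaging mean_n."""
    def schedule(now):
        return rng.random() < 1.0 / mean_n

    return schedule


def fixed_interval(wait):
    """Reinforce the first response made once `wait` time units have
    passed since the last reward (e.g. payday every 14 days)."""
    last = 0.0

    def schedule(now):
        nonlocal last
        if now - last >= wait:
            last = now
            return True
        return False

    return schedule


def variable_interval(mean_wait, rng):
    """Like fixed_interval, but the required wait is redrawn after each
    reward from a uniform range whose average is mean_wait (an assumption)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_wait)

    def schedule(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_wait)
            return True
        return False

    return schedule


def count_rewards(schedule, response_times):
    """Feed response times to a schedule in order and count the rewards."""
    return sum(schedule(t) for t in response_times)


# 20 responses on a "rent 5, get one free" style schedule
rentals = count_rewards(fixed_ratio(5), range(1, 21))
# One response per day for 28 days, paid every 14 days
paydays = count_rewards(fixed_interval(14), range(1, 29))
```

With these inputs the fixed-ratio learner earns 4 rewards in 20 responses and the fixed-interval learner earns 2 in 28 days. The variable schedules take a `random.Random` instance, so their reward counts change from run to run, which is the point of those schedules: the learner cannot predict which response will pay off.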

Overjustification Effect

• When external rewards undermine the intrinsic satisfaction of performing a behavior
– Makes people only do something for a reward or prize and not for pure joy
– The reward may lessen and replace the person’s original, natural motivation so that the behavior stops if the reward is eliminated
• Pizza for reading
– “What, I don’t get a free pizza for reading 10 books?”

Before we move on…

• Operant Conditioning uses much of the same terminology as classical conditioning…(acquisition, extinction, generalization, discrimination, etc…)

• For example, if I want a child to increase his bathing behavior, I can give him an extra 30 minutes of TV time after he bathes.

• The reinforcer is extra TV time and acquisition occurs when he links together the idea that bathing gives him more Cartoon Network.

• Extinction would occur if I stop giving him TV time for bathing and he stops seeing the association.

Types of Reinforcers

• Primary Reinforcers: reinforcers that happen naturally; not learned (e.g. getting food when hungry, taking your hand off a burning stove to relieve pain)

• Conditioned Reinforcers (secondary reinforcers): reinforcers that are learned (e.g. if a rat in Skinner’s box learns that a light signals food, the light becomes a secondary reinforcer)

Primary Reinforcer

Things that are in themselves rewarding and satisfy biological needs

Like food, warmth, or water

Secondary (or Conditioned) Reinforcer

• Something that you have learned to value through classical conditioning
– Money, fines or grades
• Secondary reinforcers can lose their effectiveness

Intrinsic vs. Extrinsic Motivation

Punishment

• Flip side of reinforcement
• The introduction of a bad stimulus or the removal of a reinforcing stimulus after a response occurs
– Weakens a behavior or makes it less likely to occur again in the future

Does punishment work?

Yes, but…

Often tells the learner what behavior should NOT be exhibited and not what behavior should be

And…don’t forget the Premack Principle

Difference between Negative Reinforcement and Punishment

Punishment
• The introduction of a negative consequence after a behavior weakens the behavior
• Example: time out for hitting other children

Negative Reinforcement
• The removal of a negative stimulus after a behavior strengthens the behavior
• Example: picking up a crying baby

Observational Learning

Learning by Observation

Learning occurs not only through conditioning but also from our observation of others.

“We are, in truth, more than half what we are by imitation” Lord Chesterfield

Observational Learning: Definition

• Observe and imitate others
• Modeling: the process of observing and imitating a specific behavior
• We learn all kinds of social behaviors by observing and imitating others

Mirror Neurons

• Mirror neurons provide a neural basis for observational learning
• Example: when a baby imitates a face an adult is making, mirror neurons are firing

Bandura’s Experiment

Albert Bandura

• Pioneer of research in observational learning
• Bobo Doll Experiment

Reinforcement and punishment of a model affect whether an observer imitates the behavior

Social Influence on Observational Learning

• Columbine High School “copycat threats”
• Prosocial: models can have positive effects
– Gandhi and Martin Luther King Jr.
• Television:
– The more hours children spend watching violent TV or playing violent video games, the more at risk they are for aggression and crime as teens and adults
– Homicides doubled between 1957 and 1974, coinciding with the introduction of television

Aversive Conditioning

• In aversive conditioning, the client is exposed to an unpleasant stimulus while engaging in the targeted behavior
• Goal: create an aversion to it
• In adults, aversive conditioning is often used to combat addictions such as smoking or alcoholism
• Example: a nausea-producing drug is given while the client is smoking or drinking so that unpleasant associations are paired with the addictive behavior
• Also used to treat nail biting, sex addiction, and other strong habits or addictions

Observational Learning influenced debates on the effect of television violence and parental role models

Studies have shown the amount of violent TV watched by children in elementary school is correlated with their aggressiveness as teenagers and with their criminal behavior as adults

Antisocial models vs. prosocial models