Learning and Sleeping

Perceptual Learning, Roving and the Unsupervised Bias
By Aaron Clarke, Henning Sprekeler, Wolfram Gerstner and Michael Herzog
Brain Mind Institute
École Polytechnique Fédérale de Lausanne
Switzerland
Talk Outline
• Perceptual Learning & Roving
• The Unsupervised Bias
• Critical Experiment
Perceptual Learning
[Figure: d' as a function of block number (blocks 0 to 50; d' axis from 0 to 4.5).]
Roving
Learning Task 1: 1200”
[Figure: d' as a function of block number (blocks 0 to 50; d' axis from 0 to 4.5).]
Roving
Learning Task 1: 1200”
Learning Task 2: 1800”
Roving
[Figure: d' as a function of block number (blocks 0 to 25) in two panels, Non-Roved and Roved, with data labeled 1200" and 1800". Adapted from Tartaglia, Bamert, Mast & H. Herzog (2009).]
Hypotheses
• Roving may disrupt memory-trace buildup for the roved stimuli (Yu et al., 2004).
• Roving may diminish the stimuli’s predictability (Adini et al., 2004).
• Roving may prevent the participants from conceptually tagging each stimulus type in order to switch their attention to the appropriate perceptual template (Zhang et al., 2008).
Talk Outline
• Perceptual Learning & Roving
• The Unsupervised Bias
• Critical Experiment
Model Predictions
• Unsupervised: Δw_ij = pre_i × post_j
– No feedback
• Supervised: Δw_ij = pre_i × e_ij
– Trial by trial feedback
– Error feedback
– Teacher signal
• Reward-Based: Δw_ij = Cov(R, w_ij) + ‹R› ‹w_ij›
– Feedback after many trials
– Error feedback
– Teacher signal
[Diagrams: one network schematic per model, with labels Input (i), Output (j), Desired Output, Error and Reward.]
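The three rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the function names, the learning rate `lr` and the list-of-lists weight layout are assumptions. The supervised error is assumed to be `desired - output` per output unit, and the reward-based rule is written as a per-trial reward-modulated Hebbian update, whose trial average splits into a covariance term plus a product of means.

```python
def unsupervised_update(pre, post, lr=0.1):
    """Hebbian rule: dw[i][j] = lr * pre[i] * post[j]. No feedback is used."""
    return [[lr * p * q for q in post] for p in pre]


def supervised_update(pre, output, desired, lr=0.1):
    """Error-driven rule: dw[i][j] = lr * pre[i] * e[j].

    Assumption: the error is e[j] = desired[j] - output[j].
    """
    error = [d - o for d, o in zip(desired, output)]
    return [[lr * p * e for e in error] for p in pre]


def reward_based_update(pre, post, reward, lr=0.1):
    """Reward-modulated Hebbian rule (assumed per-trial form):
    dw[i][j] = lr * reward * pre[i] * post[j].
    """
    return [[lr * reward * p * q for q in post] for p in pre]


if __name__ == "__main__":
    pre, post = [1.0, 2.0], [0.5]           # hypothetical activations
    print(unsupervised_update(pre, post))
    print(supervised_update(pre, output=[0.2], desired=[1.0]))
    print(reward_based_update(pre, post, reward=1.0))
```

Only the reward-based function needs nothing beyond a scalar outcome signal; the supervised function needs the desired output on every trial.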
Model Predictions
• Unsupervised: Δw_ij = pre_i × post_j
• Supervised: Δw_ij = pre_i × e_ij
• Reward-Based: Δw_ij = Cov(R, w_ij) + ‹R› ‹w_ij›
• Learning is possible without feedback.
• Feedback improves performance.
Herzog & Fahle (1998)
Reward-Based Learning
Reward & current activations
Averages of past trials
Δw_ij = Cov(R, w_ij) + ‹R› ‹w_ij›
• Δw_ij: weight change
• Cov(R, w_ij): covariation between reward and weight change
• ‹R›: average reward
Reward-Based Learning
Reward & current activations
Averages of past trials
Δw_ij = Cov(R, w_ij) + ‹R› ‹w_ij›
• Δw_ij: weight change
• Cov(R, w_ij): covariation between reward and weight change
• ‹R›: average reward (= 0)
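Why a zero average reward matters can be checked numerically. A minimal sketch, assuming the per-trial update is the reward multiplied by a Hebbian term h (an assumption; the slide writes the decomposition in terms of w_ij only): the trial average of R × h equals Cov(R, h) + ‹R›‹h› exactly, and subtracting the mean reward removes the second term while leaving the covariance untouched. The reward and Hebbian values below are made up for illustration.

```python
from statistics import mean


def decompose(rewards, hebb):
    """Return (mean of R*h, population covariance of R and h, mean(R)*mean(h))."""
    r_bar, h_bar = mean(rewards), mean(hebb)
    total = mean([r * h for r, h in zip(rewards, hebb)])
    cov = mean([(r - r_bar) * (h - h_bar) for r, h in zip(rewards, hebb)])
    return total, cov, r_bar * h_bar


rewards = [1.0, 0.0, 1.0, 1.0]   # hypothetical per-trial rewards
hebb = [0.8, 0.2, 0.6, 0.4]      # hypothetical pre * post values

# Raw rewards: the product-of-means term is non-zero.
total, cov, bias = decompose(rewards, hebb)

# Mean-subtracted rewards: the product-of-means term vanishes.
centered = [r - mean(rewards) for r in rewards]
total_c, cov_c, bias_c = decompose(centered, hebb)

if __name__ == "__main__":
    print("raw:     ", total, cov, bias)
    print("centered:", total_c, cov_c, bias_c)
```

With the mean removed, the average update is driven only by how reward covaries with activity, which is the part that carries information about the task.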
Reward-Based Learning
Reward & current activations
Averages of past trials
Δw_ij = Cov(R_1 + R_2, w_ij) + ‹R_1 + R_2› ‹w_ij›
• Δw_ij: weight change
• Cov(R_1 + R_2, w_ij): covariation between reward and weight change
• ‹R_1 + R_2›: average reward
• Learning is impossible with two stimuli.
Roving
[Figure: d' as a function of block number (blocks 0 to 25) in two panels, Non-Roved and Roved, with data labeled 1200" and 1800". Adapted from Tartaglia, Bamert, Mast & H. Herzog (2009).]
Talk Outline
• Perceptual Learning & Roving
• The Unsupervised Bias
• Critical Experiment
Hypothesis
• Roving impairs perceptual learning when the average reward for the two learned stimuli differs significantly.
– This kind of situation occurs when the two roved tasks differ in their difficulty levels.
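The hypothesis can be illustrated with simple arithmetic. A minimal sketch, assuming reward is 1 for a correct response and 0 otherwise, and assuming the learner subtracts a single running average computed over both roved tasks (neither assumption is stated on the slide; the trial outcomes are invented): when the tasks differ in difficulty, the shared baseline leaves a non-zero mean residual reward for each task.

```python
from statistics import mean


def residual_reward(rewards_by_task):
    """Mean of (reward - shared baseline) for each task.

    The baseline is the average reward over all trials of all tasks,
    i.e. one estimate shared by the roved stimuli (an assumption).
    """
    all_rewards = [r for rs in rewards_by_task.values() for r in rs]
    baseline = mean(all_rewards)
    return {task: mean(rs) - baseline for task, rs in rewards_by_task.items()}


# Hypothetical outcomes: 75% correct on the easy task, 50% on the hard task.
unequal = residual_reward({"easy": [1, 1, 1, 0], "hard": [1, 0, 1, 0]})

# Hypothetical outcomes: both tasks at 50% correct.
equal = residual_reward({"task_a": [1, 0, 1, 0], "task_b": [0, 1, 1, 0]})

if __name__ == "__main__":
    print(unequal)  # residuals of opposite sign for easy and hard
    print(equal)    # residuals of zero for both tasks
```

In the unequal case the residual is positive for the easy task and negative for the hard task, so the product-of-means term cannot be cancelled for both at once.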
Roving
Learning Task 1: 1200”
Learning Task 2: 1800”
Results
[Figure: d' as a function of block number (blocks 0 to 20) for the roved Easy and Hard conditions, with curves labeled 1800” and 1200”.]
• H0: Mean Easy Slopes = 0: t(7) = -0.222, p = 0.415
• H0: Mean Hard Slopes = 0: t(7) = -1.115, p = 0.151
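The tests on this slide compare mean learning slopes against zero with 7 degrees of freedom, which suggests one slope per participant for eight participants. A minimal sketch of that kind of analysis, assuming each slope is a least-squares line fit of d' against block number (the slide does not state the fitting method); the function names and the example data are hypothetical.

```python
import math
from statistics import mean, stdev


def ls_slope(values):
    """Least-squares slope of values against block index 0, 1, ..., n-1."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def one_sample_t(samples, mu0=0.0):
    """One-sample t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(samples)
    t_stat = (mean(samples) - mu0) / (stdev(samples) / math.sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # Hypothetical d' values per block for three participants.
    curves = [[1.0, 1.2, 1.1, 1.4], [2.0, 1.9, 2.1, 2.0], [1.5, 1.6, 1.8, 1.9]]
    slopes = [ls_slope(c) for c in curves]
    print(slopes, one_sample_t(slopes))
```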
Results
[Figure: d' as a function of block number (blocks 0 to 20) in two panels; legend: Easy, Hard.]
• H0: Mean Non-Roved Slopes = 0: t(7) = 2.144, p = 0.035
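The reported p-values can be recomputed from the t statistics. The numbers are consistent with one-tailed tail areas of a t distribution with 7 degrees of freedom; that is an inference from the values, not something stated on the slides. The sketch below uses only the standard library and integrates the t density with Simpson's rule; the function names and the step count are assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def tail_area(t_stat, df, steps=2000):
    """One-tailed area beyond |t_stat|, via Simpson's rule on [0, |t_stat|].

    steps must be even.
    """
    a = abs(t_stat)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3


if __name__ == "__main__":
    for label, t_stat in [("Easy", -0.222), ("Hard", -1.115), ("Non-Roved", 2.144)]:
        print(label, round(tail_area(t_stat, df=7), 3))
```

Under this reading, the three t values reproduce the reported 0.415, 0.151 and 0.035 to three decimals.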
Summary
• There are three types of learning models: supervised, unsupervised and reward-based.
• Only reward-based learning withstands empirical falsification, and it suffers from the unsupervised bias.
• When an easy and a hard task are roved, learning fails, as can be shown mathematically. That is why the roving impairment occurs empirically.
• A counterintuitive prediction is that roving a hard task with a very easy task should impair performance. Roving two hard tasks might make learning easier than roving a hard and an easy task, and this has actually been shown in other studies.
Thank you for your attention.
When is Learning During Roving Successful?
[Figure: three pairwise stimulus comparisons (“Vs.”), with timing labels 150 ms and 500 ms.]
Experiment
• Used two stimuli: 1800” and 1200”.
• Measured pre-training thresholds for both stimuli in isolation.
• Trained subjects with fixed offsets (easy = 1.5 × pre-training threshold, hard = 0.9 × pre-training threshold) in 20 blocks of 80 trials.
• Roved the stimuli.
[Figure: stimulus examples labeled Easy and Hard for the 1200” and 1800” stimuli.]
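The training parameters above translate directly into a small calculation. A minimal sketch, in which the constant and function names are assumptions and the example threshold of 100 is a made-up value in whatever unit the pre-training threshold was measured:

```python
EASY_FACTOR = 1.5        # easy offset = 1.5 x pre-training threshold
HARD_FACTOR = 0.9        # hard offset = 0.9 x pre-training threshold
N_BLOCKS = 20
TRIALS_PER_BLOCK = 80


def training_offsets(pre_training_threshold):
    """Fixed easy and hard offsets derived from one pre-training threshold."""
    return {
        "easy": EASY_FACTOR * pre_training_threshold,
        "hard": HARD_FACTOR * pre_training_threshold,
    }


def total_trials():
    """Number of training trials across all blocks."""
    return N_BLOCKS * TRIALS_PER_BLOCK


if __name__ == "__main__":
    print(training_offsets(100.0))   # hypothetical threshold
    print(total_trials())
```

Because the hard offset lies below the pre-training threshold and the easy offset above it, the two conditions start at different accuracy levels, and so at different average reward.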
Other Hypotheses
• Roving may interact with the participants’ initial performance levels, with poor initial performers learning more than good initial performers.
• Roving might cause low-level interference between stimulus types (Tartaglia et al., 2009; Zhaoping, Herzog, & Dayan, 2003).