How good is a robot tutor?
The effectiveness of Excel as a teaching resource multiplier in teaching statistics
Dave Nunez
Colin Tredoux
Susan Malcolm-Smith
ACSENT Lab
University of Cape Town
Jacob Jaftha
Dept. of Mathematics and Applied Mathematics
University of Cape Town

Context
UCT Psychology has an extensive statistics teaching programme (1st year to honours)
Research focus makes this an imperative
By honours, students are expected to apply stats to a significant individual research project
A mixed group of students entering
All have high-school maths, or have completed (or are concurrently completing) a year-long numeracy course
Stats is largely disliked, and provokes significant anxiety

Context
Large classes, few tutors
Typically 40:1 student:tutor ratio
Excel-based tutorials developed to counter this (lab facilities can cope with numbers) – “tutor in a can”; “tutorbot”; “tutortron-2000”
Positive student feedback from Excel tuts
Liked that they could take them home
Seemed to compensate for poor lecture attendance
BUT – very little interaction between teachers & students (how were explanations/queries handled? Was it necessary?)

The Excel-based tutorials used
In development since 2003
Almost all technical glitches resolved
Contain text, exercises and evaluation
Text supplements textbook (text and images); also includes animations & simulations
Teaches concepts and tools
Provides exercises which are immediately scored (feedback given for each question)
Each tut ends with a mini-test which must be submitted [electronically]
Each tut takes 120-150 minutes to complete

The Excel-based tutorials used
The tutorials aim to be more than simple exercises
Embed some teaching by interaction & feedback
Raises the issue: can interactive, discovery-based learning surpass student-tutor interaction for learning statistics?
Some topics are well suited for discovery (sampling distribution of the mean)
Some topics are poorly suited for it (probability)
Do the Excel tutorials lead to skill transfer?

Methods used in the past
Pre-test/post-test
Without a control, cannot show the tutorial is the cause (even a bad tut teaches something)
Voluntary assignment
No control for motivation variables
No control for repetition
Performance often measured by means of psychological variables
Confidence, mastery, conceptual learning
No absolute task-based criteria

Deficits in past methods
Poor controls
No proper control within subjects (natural learning)
No proper control across conditions (subjects self-assign to conditions)
These are often related to ethical concerns
Measures are generally poor
Single measure of complex, time-dependent phenomenon
No criterion-based assessment (i.e. low ecological validity of findings)

Research questions
Do Excel-based tutorials (EBTs) compare in performance (marks scored) to pen-and-paper tutorials (PnPs)?
Is there a difference in terms of psychological variables (mastery, confidence) between EBTs and PnPs?

Strengths of the current study
Two-group quasi-experiment
Pseudo-random assignment of students to Excel/pen-and-paper tutorials
Strong control/similarity of tutorials (we think)
Semester-long, continuous assessment
Standard test after each tutorial (criterion and psychological measures)
Final exam at the end of the semester

Sample
The 2007 PSY2006F class
Statistics lecture each Friday; one stats tut a week
172 students (only Humanities students)
Almost all have been through 3 tutorials in PSY1001W on using Excel for stats
2007 cohort not significantly different from other years
Not told about the study; simply told strange tutorial structure was due to logistical reasons

Materials
PnP tuts are ‘traditional’ as done in the dept. before advent of Excel tuts
Published in a textbook (we partly wrote) – in 2001
Choose tutors who excel (!) at statistics
They lead students (groups of 30-40) through worksheets and explain problems and theory as they go along
Students are given 2-hour classroom sessions to complete tuts (mostly don’t finish)
Students are required to submit the completed worksheet a week after the classroom session

Materials
Excel tuts (latest versions)
Developed by us (2003-2007)
1 senior tutor in the lab for stats queries, junior tutors for technical problems
Students are given 2-hour lab sessions (groups of 30-40) to complete tuts (mostly don’t finish)
Students are required to submit the completed Excel worksheet a week after the lab session

Design
Control for individual variation and cross-group effects
Each student does 4 EBTs, 4 PnPs (8 topics in the course)
Two ‘streams’ – EPEPEPEP, PEPEPEPE
Within-subjects design, and cross-group comparison
The non-statistics marks in the course (research methods, psychometrics & qualitative methods) can be used to validate (traditionally high R² between them)
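The two-stream counterbalancing described on this slide can be sketched in a few lines. This is only an illustration, not the procedure actually used: the stream names `excel_first` and `paper_first`, the shuffle-and-alternate assignment and the fixed seed are all assumptions. The slides say only that assignment was pseudo-random and that the class had 172 students.

```python
import random

# One letter per topic: E = Excel-based tutorial, P = pen-and-paper tutorial.
# The stream names are hypothetical labels for the two sequences on the slide.
STREAMS = {
    "excel_first": "EPEPEPEP",
    "paper_first": "PEPEPEPE",
}


def assign_streams(student_ids, seed=0):
    """Shuffle students, then alternate them between the two streams.

    Assumed procedure for illustration; the slides do not describe
    how the pseudo-random assignment was carried out.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    names = list(STREAMS)
    return {sid: names[i % 2] for i, sid in enumerate(shuffled)}


def tutorial_format(stream, topic):
    """Return 'E' or 'P' for a 1-based topic number (1 to 8)."""
    return STREAMS[stream][topic - 1]


assignment = assign_streams(range(1, 173))  # 172 students, as on the Sample slide

for topic in range(1, 9):
    print(topic,
          tutorial_format("excel_first", topic),
          tutorial_format("paper_first", topic))
```

Because the two sequences are mirror images, every topic is taught in both formats (cross-group comparison) and every student meets both formats four times (within-subjects comparison).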

Measures
Exam at the end
2-hour practical exam (given data, problem solving – no concepts)
Do each exam section in the same technology form as the tuts were done in

Measures
Monday assessments
Each tut has a set of MCQ items
6 MCQ items: 3 concepts, 3 calculations; one each easy, moderate, hard
5 Likert items about confidence with the material, usefulness of tut, degree of understanding, how much extra help is needed

Measures (3)
Two students, Able and Baker, want to get into the honours class, but they have taken different third year subjects. Able did the PSY300X course (which had a mean mark of 53% and a standard deviation of 11%) and he got a mark of 80%. Baker on the other hand did the PSY300Y course (mean mark of 57% and a standard deviation of 7.5%), and got a mark of 77%. If honours places are awarded to students who stand out the most in their courses, which one of the students should get into honours and why?
a) Able should get in, because he scored 27% above the course average
b) Baker should get in, because he scored 20% above the course average
c) Able should get in, because he scored proportionately higher above the course average
d) Baker should get in, because he scored proportionately higher above the course average
Distribution X is normally distributed; distribution Y has a standard normal distribution. Which of the following statements MUST BE FALSE?
a) The mean of distribution X is 2
b) The standard deviation of distribution Y is 1
c) Distribution Y must always give the same proportion of high scores as low scores when sampled randomly
d) Distribution X never gives scores lower than distribution Y when sampled randomly.
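The Able and Baker item can be worked through with z-scores, which express each mark as a number of standard deviations above the course mean. The sketch below assumes that "stand out the most" is meant in this standardized sense; the slides do not give an answer key, and the function name is illustrative.

```python
def z_score(mark, mean, sd):
    """Number of standard deviations by which a mark exceeds the course mean."""
    return (mark - mean) / sd


# Figures taken from the question text (all in percentage points).
able = z_score(80, 53, 11)     # 27 / 11
baker = z_score(77, 57, 7.5)   # 20 / 7.5

print(f"Able:  z = {able:.2f}")
print(f"Baker: z = {baker:.2f}")
print("Stands out more:", "Baker" if baker > able else "Able")
```

Able is further above his course mean in raw marks (27 points against 20), but Baker's course has the smaller spread, so on this reading Baker's mark is the more unusual one.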

Validation
N=170
                 Comp.   Paper
Quant. methods   0.15    0.33
Psychometrics    0.35    0.40
Qual. methods    0.25    0.36

Validation
[Figure: GROUP; LS Means for exam_quant, exam_psychom and exam_qual, by GROUP (A, B). Wilks lambda=.99259, F(3, 165)=.41061, p=.74559. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]
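For a comparison of two groups, Wilks' lambda converts exactly to an F statistic, so the figures reported for this validation analysis can be cross-checked by hand. The sketch below assumes the effect has a single hypothesis degree of freedom (two groups, A and B) and that 3 and 165 are the numerator and denominator degrees of freedom; the function name is illustrative.

```python
def wilks_to_f(wilks_lambda, df_num, df_den):
    """Exact F for Wilks' lambda when the effect compares two groups.

    df_num is the number of dependent variables and df_den the error
    degrees of freedom, as reported in F(df_num, df_den).
    """
    return (1.0 - wilks_lambda) / wilks_lambda * (df_den / df_num)


# Values as reported on the slide.
f_value = wilks_to_f(0.99259, 3, 165)
print(f"F(3, 165) = {f_value:.3f}")
```

The result agrees with the reported F of .41061 to within the rounding of lambda, which supports reading the analysis as a two-group multivariate test on the three exam sections.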

Attitude results
[Figure: R1*GROUP; LS Means of positive attitude (0-5) for att1, att3, att4, att6, att7 and att8, plotted for the Paper first and Comp. first groups; several points are marked "C". Current effect: F(5, 370)=4.7192, p=.00034. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

Preference effects
[Figure: eval_1; LS Means of score for computer questions, by eval_1 response (Classroom tuts, Both were helpful, Lab tuts). Current effect: F(2, 150)=1.1240, p=.32769. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

Preference effects
[Figure: eval_1; LS Means for paper based questions, by eval_1 response (Classroom tuts, Both were helpful, Lab tuts). Current effect: F(2, 151)=1.5940, p=.20651. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

Monday assessments
[Figure: R1*GROUP; LS Means of test score (out of 6) for Topic1, Topic3, Topic4, Topic6, Topic7 and Topic8, plotted for the Computer first and Paper first groups; several points are marked "C". Current effect: F(5, 455)=1.5736, p=.16607. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

Exam results
[Figure: R1*GROUP; LS Means of exam mark (0-1) for Q1std to Q8std (x-axis R1), plotted for GROUP A and GROUP B; several points are marked "C". Current effect: F(7, 1169)=7.3499, p=.00000. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

Testing effects
[Figure: R1*GROUP; LS Means of improvement from tut to exam for q1-t1, q3-t3, q4-t4, q6-t6, q7-t7 and q8-t8, plotted for the Paper first and Comp first groups; several points are marked "C". Current effect: F(5, 840)=1.0067, p=.41255. Effective hypothesis decomposition. Vertical bars denote 0.95 confidence intervals.]

What the data shows
The EBTs can function as a robot tutor
With small tutor team, marks at least as good as traditional tutorials, better in a few topics for some students
Student preference/attitude is not associated with performance
Lack of significant findings
No patterned differences

What the data shows
EBTs can show an advantage
At exam time rather than test time
May indicate poor test or that EBTs need repetition to take effect
It is a weak effect - does not generalize to the entire class easily (group B only)

What the data DOES NOT show
Excel-based statistics teaching is better
Content is confounded with form
Tutor ability is confounded with form
Students enjoy/get confidence from the EBTs
Only differences show the opposite
Students can leverage existing computer skills for learning statistics
Skills were pre-existing and not manipulated