CIS Drexel University - The Math Forum @ Drexel

Data Analysis of Coded Chats
Study of correlation and regression between different
dimension variables
Progress Report, VMT Meeting, Jan. 19th 2005
Fatos Xhafa
VMT Project
Outline





The variables under study
Test for Normal distribution of variables
Correlation between different variables
Regression between different variables
Discussion


From a statistical perspective
From an interaction-based / CA perspective
January 19th, 2005. VMT Meeting
The variables under study



Social Reference
Pbm Solving
Math Move
Still at the first level of analysis
The same sample of six powwows
Test for Normal distributions (I)

Correlation and regression analyses assume that the
variables under study approximately follow a Normal
distribution

We tested whether each of the dimension variables
approximates a Normal distribution:



Social reference
Problem Solving
Math Move
Test for Normal distributions (II)
Social reference
dimension variable:


Not a good approximation to a
Normal distribution
This could indicate one or more outliers
[Figure: Normal P-P Plot of Percentage Social reference postings; x-axis: Observed Cum Prob (0.0 to 1.0), y-axis: Expected Cum Prob (0.0 to 1.0)]
Test for Normal distributions (III)
Social reference
dimension variable:


Pow18 appears to be an
outlier
After removing it from the
sample, a “perfect”
approximation to a Normal
distribution is obtained
[Figure: Normal P-P Plot of Percentage Social reference postings; x-axis: Observed Cum Prob (0.0 to 1.0), y-axis: Expected Cum Prob (0.0 to 1.0)]
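The coordinates of a normal P-P plot like the ones above can be computed by hand. The following is a minimal sketch, not the procedure used to produce the slides: it assumes Blom's plotting positions for the observed cumulative proportions, and the percentages in `with_outlier` are hypothetical placeholders, since the slides do not list the raw values.

```python
from statistics import NormalDist, mean, stdev


def normal_pp_points(values):
    """Return (observed, expected) cumulative probabilities for a normal P-P plot."""
    xs = sorted(values)
    n = len(xs)
    # Normal distribution fitted with the sample mean and standard deviation.
    fitted = NormalDist(mean(xs), stdev(xs))
    # Observed cumulative proportions from ranks (Blom's formula, an assumption).
    observed = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
    # Expected cumulative probabilities under the fitted Normal distribution.
    expected = [fitted.cdf(x) for x in xs]
    return observed, expected


def max_pp_deviation(values):
    """Largest vertical distance between the P-P points and the diagonal."""
    observed, expected = normal_pp_points(values)
    return max(abs(o - e) for o, e in zip(observed, expected))


# HYPOTHETICAL percentages: five similar values plus one much lower value.
with_outlier = [27.0, 29.0, 30.5, 32.0, 34.0, 15.0]
without_outlier = with_outlier[:-1]

deviation_with = max_pp_deviation(with_outlier)
deviation_without = max_pp_deviation(without_outlier)
```

With these placeholder values the points lie much closer to the diagonal once the extreme value is dropped, which is the pattern the two plots describe for pow18.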
Test for Normal distributions (IV)


The Pbm Solving and Math Move variables show good
approximations to a Normal distribution
Correlation and regression between:
    Social reference and Pbm Solving
    Social reference and Math Move
can be studied (pow18 excluded)

Correlation and regression between:
    Pbm Solving and Math Move
can be studied for the whole sample
Correlations

                                                             Social reference   Pbm Solving   Math Move
Percentage Social reference postings   Pearson Correlation   1                  -.970(**)     -.942(*)
                                       Sig. (2-tailed)       .                  .006          .017
                                       N                     5                  5             5
Percentage Pbm Solving postings        Pearson Correlation   -.970(**)          1             .967(**)
                                       Sig. (2-tailed)       .006               .             .007
                                       N                     5                  5             5
Percentage Math Move postings          Pearson Correlation   -.942(*)           .967(**)      1
                                       Sig. (2-tailed)       .017               .007          .
                                       N                     5                  5             5

** Correlation is significant at the 0.01 level (2-tailed).
* Correlation is significant at the 0.05 level (2-tailed).
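The coefficients in the table are Pearson correlations, and their two-tailed significance is conventionally tested with t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 degrees of freedom. The sketch below is illustrative only: the function names are hypothetical, and only the r values and N = 5 come from the table.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# r values and N = 5 as reported in the correlation table.
t_social_vs_pbm = t_from_r(-0.970, 5)   # about -6.9
t_social_vs_math = t_from_r(-0.942, 5)  # about -4.9
```

With only 3 degrees of freedom, even correlations this strong rest on very little data, which is why the sample-size caveat in the discussion matters.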
Regression: Social reference vs. Pbm
Solving


The two variables are strongly and negatively
correlated (-.970)
What type of correlation? How are they
correlated?
Regression: Social reference vs. Pbm
Solving
Percentage Pbm Solving postings = 126.01 - 3.36 * PercSocR
R-Square = 0.94
[Figure: scatter plot with linear regression line; x-axis: Percentage Social reference postings (26.00 to 34.00), y-axis: Percentage Pbm Solving postings (10.00 to 40.00); points: pow1, pow2_1, pow2_2, pow9, pow10]
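To read the fitted line concretely, it can be evaluated at a given percentage of Social reference postings. This is a small sketch using the rounded coefficients from the equation above. The function name is hypothetical, and the line is only meaningful inside the observed range of roughly 26% to 34%.

```python
# Rounded coefficients taken from the regression equation on the slide.
INTERCEPT = 126.01
SLOPE = -3.36


def predict_pbm_solving(perc_social_ref):
    """Predicted percentage of Pbm Solving postings for a given
    percentage of Social reference postings (linear model)."""
    return INTERCEPT + SLOPE * perc_social_ref


# Each additional percentage point of Social reference postings lowers
# the predicted Pbm Solving percentage by about 3.36 points.
prediction_at_30 = predict_pbm_solving(30.0)  # 126.01 - 100.8 = 25.21
```

Extrapolating far outside the sampled range is not supported by five data points. For example, the line reaches zero near 37.5% Social reference postings, which is an artifact of the linear form rather than a finding.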
Analytically…
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .970(a)   .941       .921                3.17911
a Predictors: (Constant), Percentage Social reference postings

ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   480.208          1    480.208       47.514   .006(a)
Residual     30.320           3    10.107
Total        510.528          4
a Predictors: (Constant), Percentage Social reference postings
b Dependent Variable: Percentage Pbm Solving postings
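The Model Summary figures follow directly from the ANOVA sums of squares. As a consistency check, the sketch below recomputes them from the reported values. The function and key names are hypothetical, and only the sums of squares and N = 5 come from the tables.

```python
import math


def regression_summary(ss_regression, ss_residual, n, n_predictors=1):
    """Derive R Square, adjusted R Square, F and the standard error of the
    estimate from the ANOVA sums of squares of a linear regression."""
    ss_total = ss_regression + ss_residual
    df_residual = n - n_predictors - 1
    ms_regression = ss_regression / n_predictors
    ms_residual = ss_residual / df_residual
    r_square = ss_regression / ss_total
    return {
        "r_square": r_square,
        "adjusted_r_square": 1.0 - (1.0 - r_square) * (n - 1) / df_residual,
        "f": ms_regression / ms_residual,
        "std_error": math.sqrt(ms_residual),
    }


# Sums of squares from the ANOVA table (Social reference -> Pbm Solving).
summary = regression_summary(480.208, 30.320, n=5)
```

The recomputed values agree with the table to the printed precision: R Square about .941, adjusted R Square about .921, F about 47.5, standard error about 3.18.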
Analytically…
Coefficients(a)
                                       Unstandardized Coefficients   Standardized Coefficients
Model 1                                B          Std. Error         Beta                        t        Sig.
(Constant)                             126.014    14.683                                         8.582    .003
Percentage Social reference postings   -3.356     .487               -.970                       -6.893   .006
a Dependent Variable: Percentage Pbm Solving postings
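Each t value in the Coefficients table is the coefficient divided by its standard error, and the same two numbers give a confidence interval for the slope. The sketch below assumes the standard two-sided 95% critical value of Student's t with 3 degrees of freedom (about 3.182). The interval itself is not reported on the slides and is shown only to illustrate how wide the uncertainty is with N = 5.

```python
# Two-sided 95% critical value of Student's t with 3 degrees of freedom.
T_CRIT_95_DF3 = 3.182


def t_ratio(coefficient, std_error):
    """t statistic of a regression coefficient."""
    return coefficient / std_error


def confidence_interval(coefficient, std_error, t_crit=T_CRIT_95_DF3):
    """Confidence interval: coefficient +/- t_crit * std_error."""
    margin = t_crit * std_error
    return coefficient - margin, coefficient + margin


# B and Std. Error as reported in the Coefficients table.
constant_t = t_ratio(126.014, 14.683)          # about 8.58
slope_t = t_ratio(-3.356, 0.487)               # about -6.89
slope_ci = confidence_interval(-3.356, 0.487)  # about (-4.91, -1.81)
```

The interval excludes zero, consistent with the reported significance of .006, but it spans roughly 1.8 to 4.9 points of Pbm Solving per point of Social reference.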
Regression: Social reference vs. Math
Move


The two variables are strongly and negatively
correlated (-.942)
What type of correlation? How are they
correlated?
Regression: Social reference vs. Math
Move
Percentage Math Move postings = 89.50 - 2.44 * PercSocR
R-Square = 0.89
[Figure: scatter plot with linear regression line; x-axis: Percentage Social reference postings (26.00 to 34.00), y-axis: Percentage Math Move postings (5.00 to 25.00); points: pow1, pow2_1, pow2_2, pow9, pow10]
Analytically…
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .942(a)   .887       .850                3.27875
a Predictors: (Constant), Percentage Social reference postings

ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   254.157          1    254.157       23.642   .017(a)
Residual     32.251           3    10.750
Total        286.408          4
a Predictors: (Constant), Percentage Social reference postings
b Dependent Variable: Percentage Math Move postings
Analytically…
Coefficients(a)
                                       Unstandardized Coefficients   Standardized Coefficients
Model 1                                B          Std. Error         Beta                        t        Sig.
(Constant)                             89.505     15.143                                         5.911    .010
Percentage Social reference postings   -2.441     .502               -.942                       -4.862   .017
a Dependent Variable: Percentage Math Move postings
Regression: Pbm Solving vs. Math Move


The two variables are strongly and positively
correlated (.967 with pow18 excluded; .956 for the whole sample, see Annex)
What type of correlation? How are they
correlated?
Regression: Pbm Solving vs. Math Move
Percentage Math Move postings = -2.63 + 0.73 * PercPbmS
R-Square = 0.91
[Figure: scatter plot with linear regression line; x-axis: Percentage Pbm Solving postings (20.00 to 40.00), y-axis: Percentage Math Move postings (10.00 to 25.00); points: pow1, pow2_1, pow2_2, pow9, pow10, pow18]
Regression: Math Move vs. Pbm Solving
Percentage Pbm Solving postings = 5.50 + 1.26 * PerMathM
R-Square = 0.91
[Figure: scatter plot with linear regression line; x-axis: Percentage Math Move postings, y-axis: Percentage Pbm Solving postings; points: pow1, pow2_1, pow2_2, pow9, pow10, pow18]
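The two regressions (Math Move on Pbm Solving, and Pbm Solving on Math Move) describe the same relationship from both directions. In simple linear regression the product of the two slopes equals R-Square, which gives a quick consistency check on the rounded values shown above. The variable names in this sketch are hypothetical.

```python
import math

# Rounded slopes from the two regression equations on the slides.
SLOPE_MATH_ON_PBM = 0.73  # Math Move regressed on Pbm Solving
SLOPE_PBM_ON_MATH = 1.26  # Pbm Solving regressed on Math Move

# For simple linear regression, b_yx * b_xy = r^2.
r_square_from_slopes = SLOPE_MATH_ON_PBM * SLOPE_PBM_ON_MATH

# Both slopes are positive, so the correlation is the positive root.
r_from_slopes = math.sqrt(r_square_from_slopes)
```

The product is about 0.92 and its square root about 0.96, in line with the reported R-Square of 0.91 and the whole-sample correlation of .956 given in the Annex, allowing for rounding of the slopes.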
Discussion: correlations (I)



Social reference is strongly and negatively
correlated with Pbm Solving (-.970) and Math
Move (-.942)
The strength of the correlation may change when
the sample is enlarged
The strong correlation suggests that the same
tendency can be expected:
    when the sample is enlarged (the sample was ‘randomly’ chosen)
    even if the coders might have influenced the strength of the correlation
Pow18 appears to be an outlier and requires
careful examination
Discussion: correlations (II)



Question 1: Why does the “production” of Social reference
negatively influence the “production” of Pbm Solving
and Math Move?
A first interpretation:
    The math pbm solving activity takes place during a fixed
    amount of time (roughly an hour).
    The more effort goes into the “production” of Social reference, the less
    “production” of math there is.
Question 2: Does this have anything to do with
“exploratory” vs. “expository” mode?
    e.g. pow2-1 vs. pow2-2
    we see that there is a considerable “distance” between
    the two (cf. regression)
Discussion: correlations (III)

Study at the second level (subcategories)

Two codes from the Social Ref. dimension seem particularly interesting.
References to individual actions vs. group actions seem to be a key
point!
Code: Individual reference = Any utterance with a
reference to the self or another member. This refers to
the collaboration in a broader sense (an activity that
has been done or will be done by the self or another
group member)
Code: Group reference = Any utterance with a
reference to the group. This refers to the collaboration
in a broader sense (an activity that has been done or is
assumed to be done or will be done by the group)
Let’s look at pow2-1 vs. pow2-2
Individual vs. group references in Pbm Solving
"I thought of factoring (n + 2)^2 and n(n + 5)" -> Pbm Solving (Tactic) & Individual Ref.
"we could find a range" -> Pbm Solving (Tactic) & Group Ref.
[Figure: pie charts for POWWOW2-1 and POWWOW2-2, one pie per Pbm Solving subcategory (Check, Orientation, Perform, Result, Restate, Reflect, Strategy, Tactic), each showing the share of Social reference codes (Group Ref., Individual Ref., Identify other, Risk-taking). Pies show percents.]
This leads to…

Hypothesis:
    in “expository” powwows there is more Individual Ref. than
    Group Ref. and,
    in “exploratory” powwows there is more Group Ref. than
    Individual Ref.
that we will study through:
    Statistical approach (second level of analysis)
        distribution of frequencies of individual vs. group refs
        distribution of frequencies of other subcategories
    Thread analysis
        computing and visualizing individual-like threads and group-like threads and combinations of them
    CA approach
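The statistical part of this plan starts from simple frequency counts of individual and group references per powwow. The sketch below shows one way such counts could be computed. It assumes that the codes `Ci` and `Cg` stand for the Individual and Group reference codes, the function name is hypothetical, and the code list is a made-up placeholder, not data from the coded chats.

```python
from collections import Counter


def reference_shares(social_ref_codes, individual="Ci", group="Cg"):
    """Share of individual vs. group references among the Social
    reference codes of one powwow (other codes are ignored)."""
    counts = Counter(social_ref_codes)
    n_individual = counts[individual]
    n_group = counts[group]
    total = n_individual + n_group
    if total == 0:
        return None  # no individual or group references were coded
    return {
        "individual": n_individual / total,
        "group": n_group / total,
    }


# HYPOTHETICAL list of Social reference codes for one powwow.
example_codes = ["Cg", "Ci", "Ci", "Ss", "Ci", "Io", "Cg"]
shares = reference_shares(example_codes)  # 3 individual vs. 2 group references
```

Comparing these shares between powwows would be a first, descriptive look at the hypothesis; with so few sessions it would not by itself test it.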
Discussion: from CA perspective


How does the “social activity” unfold
sequentially during the pbm solving?
And, specifically, how do individual vs.
group references unfold?
Discussion: from CA perspective (I)
Powwow2-1
Handle   Posting   [code]
AVR      it's okay
PIN      hahaa
SUP      my internet explorer wouldnt open
PIN      ena you gotta hurtet!   [Ci]
PIN      haha jk   [Ss]
PIN      hurry*
AVR      so now for the new triangle we have: 194.79 = 1/2bh   [Cg]
AVR      do you follow me?   [Cg]
PIN      hey its 124.708
PIN      cuz look
AVR      http://www.math.com/students/calculators/source/scientific.htm
AVR      and do the calculation
PIN      we agree it is 10.392   [Cg]
SUP      then einstein over here was confusing me   [Io]
PIN      or no?
AVR      yes we do
Other codes on this slide (columns Soc. Ref, Pbm Solving, Math Move): P, Geo, Ch, Nc, Ch, Nc, Ss, Rs, Cg, Ch
Discussion: from CA perspective (II)
Powwow2-2
Handle column: REA, MCP, REA, AH3, REA, AH3, REA, REA, REA
Posting column:
    I got 15
    Yep, that's right– I got 15 also
    I'm getting 15 also
    I'll explain
    now
    For the extra, let
    first i got the area to both triangles
    With the first one with edgelengths of 9
    I used the 30-60-90 fourmla
Code columns (Soc. Ref., Pbm Solv., Math Move): R, Ch, Nc, Ci, Ch, Nc, Ci, T, Geo, Ci, P, Geo, Ci
Discussion: from CA perspective (III)
Powwow2-1
Handle   Posting   [code]
AVR      so now we add the two areas   [Cg] [T]
SUP      just a little
PIN      its 194.852   [R]
AVR      exactly   [Ch]
AVR      or 194.85 as I got it :-)
AVR      multiply it by two
AVR      and you get 389.704 = bh
PIN      we should get the exact measurement   [Cg]
Other codes on this slide (columns Soc. Ref, Pbm Solving, Math Move): Ci, Ci, Nc, Re, P, Nc, P, Geo, Ch, Nc
Discussion: from CA perspective (IV)
Powwow2-1
Handle   Posting   [code]
OFF      hey...   [Gr]
SUP      what do we fdomwith the area   [Ss]
AVR      off spring do SO not rule!
OFF      lol
AVR      especially if you are a woman!
AVR      no jk jk   [Ss]
OFF      lol   [Ss]
OFF      im no woman
PIN      lol
AVR      well I am
SUP      hey hey
SUP      women are great
GER      why don't the three old timers explain what you have figured out
OFF      oh
AVR      women are great...
SUP      ok
AVR      but pain-enduring
SUP      they wont explain it to me   [Cg]
AVR      okay, let's explain   [Cg]
Other codes on this slide (columns Soc. Ref, Pbm Sol, Math Move): Ss, Ss, Ss, Ss
Discussion: regression

Significant linear regressions between:
    Social reference and Pbm Solving
    Social reference and Math Move
    Pbm Solving and Math Move
The coefficients in each regression equation give the
estimated intercept and slope for that pair of variables.
Annex
LLR Smoother (for the whole sample)
[Figure: scatter plot with LLR smoother; x-axis: Percentage Social reference postings (15.00 to 35.00), y-axis: Percentage Pbm Solving postings (20.00 to 40.00); points: pow1, pow2_1, pow2_2, pow9, pow10, pow18]
A smoother is a trend line that
shows how the two variables
(X and Y) are related to one
another.
It is not a statistical test
of the relationship between X and Y,
although in most cases it is
possible to infer the practical
significance of the relationship from it.
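For readers unfamiliar with the technique, an LLR (local linear regression) smoother fits a weighted straight line around each point of interest and takes the fitted value there. The sketch below is a generic illustration, not the implementation behind the plot. The Gaussian kernel, the `bandwidth` parameter and the function name are assumptions, and the data are hypothetical.

```python
import math


def llr_smooth(xs, ys, x0, bandwidth):
    """Local linear regression estimate of y at x0.

    Fits a weighted least-squares line in which points near x0 get
    larger weights (Gaussian kernel) and evaluates that line at x0.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (x0 - mean_x)


# HYPOTHETICAL data: a roughly decreasing trend with some curvature.
xs = [15.0, 26.0, 28.0, 30.0, 32.0, 34.0]
ys = [24.0, 38.0, 32.0, 22.0, 18.0, 12.0]
smoothed = [llr_smooth(xs, ys, x, bandwidth=4.0) for x in xs]
```

A small bandwidth makes the curve follow individual points, while a large one approaches the ordinary regression line, so the visual impression depends on that choice. This is one reason a smoother is descriptive rather than a test.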
Correlation Pbm Solving vs. Math Move
(without removing pow18)
Correlations
                                                        Pbm Solving   Math Move
Percentage Pbm Solving postings   Pearson Correlation   1             .956(**)
                                  Sig. (2-tailed)       .             .003
                                  N                     6             6
Percentage Math Move postings     Pearson Correlation   .956(**)      1
                                  Sig. (2-tailed)       .003          .
                                  N                     6             6
** Correlation is significant at the 0.01 level (2-tailed).
Individual vs. group action references in Social
Activity (count; for percents look at slide 23)
"I thought of factoring (n + 2)^2 and n(n + 5)" -> Pbm Solving (Tactic) & Individual Ref.
"we could find a range" -> Pbm Solving (Tactic) & Group Ref.
[Figure: two bar charts titled "Composition of Pbm Solving in terms of Social Reference", panels POWWOW2-1 and POWWOW2-2; x-axis: Pbm Solving subcategory (Check, Perform, Restate, Strategy, Orientation, Result, Reflect, Tactic), y-axis: Count; Social Ref. legend of the first chart: Collaboration group, Collaboration individual, Identify other, Risk-taking; legend of the second chart: Collaboration group, Collaboration individual, Identify self, Resource]