Quantitative Measures

Statistics for Linguistics
Students
Michaelmas 2004
Week 5
Bettina Braun
www.phon.ox.ac.uk/~bettina
Overview
• P-values
• How can we tell that data are taken from a
normal distribution?
• Speaker normalisation
• Data aggregation
• Practicals
• Non-parametric tests
p-values
• For every test, the p-value tells us whether or
not to reject the null hypothesis (and with what
confidence)
• In linguistic research, a confidence level of
95% is often sufficient; some use 99%
• This decision is up to you. Note that the
more stringent your confidence level, the
more likely a type II error becomes (you don’t
find a difference that is actually there)
p-values
• If you choose a significance level of 0.05 (i.e. a
95% confidence level), then a p-value smaller than
0.05 indicates that you can reject the
null-hypothesis
• Remember: the null-hypothesis generally
predicts that there is no difference
• If the output says p = 0.000, the value has been
rounded and we cannot be certain that it is not,
say, 0.00049; so we generally report p < 0.001
p-values
• So, in a t-test, p = 0.07 means that you cannot
reject the null hypothesis that there is no
difference
→ there is no significant difference between the
two groups
• In the Levene test for homogeneity of variances,
p = 0.001 means that you have to reject the
null-hypothesis that there is no difference
→ so there is a difference in the variances of
the two groups
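The reporting and decision rules from the p-value slides can be sketched in a few lines. This is only an illustration: the function name `report_p` and the default level `alpha = 0.05` are assumptions made for the example, not part of SPSS.

```python
def report_p(p, alpha=0.05):
    """Format a p-value for reporting and state the decision at level alpha."""
    # Output such as "p = 0.000" is rounded, so report an upper bound instead.
    text = "p < 0.001" if p < 0.001 else f"p = {p:.3f}"
    # The null-hypothesis is rejected only if p falls below the chosen level.
    decision = "reject H0" if p < alpha else "do not reject H0"
    return text, decision

print(report_p(0.07))     # t-test example from the slide
print(report_p(0.001))    # Levene test example from the slide
print(report_p(0.00049))  # a value that SPSS would display as 0.000
```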
Kolmogorov-Smirnov test
• Parametric tests assume that the data are
taken from normal distributions
• The Kolmogorov-Smirnov test can be used to
compare actual data to a normal distribution
– the cumulative probabilities of values in the
data are compared with the cumulative
probabilities in a theoretical normal
distribution
– Null-hypothesis: your sample is taken from a
normal distribution
Kolmogorov-Smirnov test
• Non-parametric test
• The Kolmogorov-Smirnov statistic is the greatest
difference in cumulative probabilities across the
range of values
• If its value exceeds a threshold, the
null-hypothesis is to be rejected
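As a rough sketch of how that statistic is obtained, the code below computes the greatest distance between the empirical cumulative distribution of a sample and a normal distribution with the sample's own mean and standard deviation. The function name and the duration values are hypothetical. Estimating the mean and sd from the same sample is an assumption of this sketch, and it makes the standard critical values only approximate.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic(sample):
    """Greatest distance between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    # Assumption: the reference normal uses the sample's own mean and sd.
    normal = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = normal.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d

# Hypothetical segment durations in ms (not taken from the course data).
durations = [92, 101, 98, 110, 105, 97, 103, 99, 108, 95]
d = ks_statistic(durations)
print(round(d, 3))
```

The Z value shown in the SPSS output is commonly described as this distance multiplied by the square root of the sample size; treat that as an assumption if you need to reproduce SPSS figures exactly.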
Kolmogorov-Smirnov test
• The Kolmogorov-Smirnov test is not significant,
i.e. the null-hypothesis that our sample is drawn
from a normal distribution cannot be rejected
• The distribution can therefore be assumed to be
normal: Kolmogorov-Smirnov Z = 0.59; p = 0.9
Speaker normalisation
• We often collect data from different
subjects but we are not interested in the
speaker differences (e.g. mean pitch
height, average speaking rate)
• We can convert the data to z-scores
(which tell us how many sd away a given
score is from the speaker mean)
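Outside SPSS, the same per-speaker conversion can be sketched as follows. The speaker labels and pitch values are hypothetical, and the sketch assumes the sample standard deviation (n - 1 in the denominator).

```python
from collections import defaultdict
from statistics import mean, stdev

def z_scores_by_speaker(rows):
    """rows: list of (speaker, value). Returns a list of (speaker, value, z)."""
    by_speaker = defaultdict(list)
    for speaker, value in rows:
        by_speaker[speaker].append(value)
    # Mean and sample sd are computed separately for each speaker.
    stats = {s: (mean(v), stdev(v)) for s, v in by_speaker.items()}
    return [(s, v, (v - stats[s][0]) / stats[s][1]) for s, v in rows]

# Hypothetical mean pitch values in Hz for two speakers.
rows = [("A", 100), ("A", 120), ("A", 140),
        ("B", 200), ("B", 220), ("B", 240)]
normalised = z_scores_by_speaker(rows)
for speaker, value, z in normalised:
    print(speaker, value, round(z, 2))
```

After normalisation both speakers have the same z-scores, although speaker B's raw values are about 100 Hz higher.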
Speaker normalisation in SPSS
• First, you have to split the file according
to the speakers (Data -> Split File)
Speaker normalisation in SPSS
• Then, Analyze -> Descriptive Statistics ->
Descriptives
• This will create an output, but also a new column
with z-values (provided the option “Save
standardized values as variables” is ticked)
Sorting data for within-subjects
designs
Aggregating data
• One can easily build a mean for different
categories, preserving the structure of the
SPSS table
• Data -> Aggregate
– Independent variables you want to preserve
are “break variables”
– Dependent variables for which you’d like to
calculate the mean are “Aggregated variables”
– By default, the new table will be stored as
aggr.sav
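To make the roles of break variables and aggregated variables concrete, here is a minimal sketch of the same operation outside SPSS. The column names (`speaker`, `condition`, `f0`) and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate_means(rows, break_vars, target):
    """Mean of `target` for every combination of the break variables."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[v] for v in break_vars)
        groups[key].append(row[target])
    # One output row per combination, keeping the break variables as columns.
    return [dict(zip(break_vars, key), **{target + "_mean": mean(vals)})
            for key, vals in groups.items()]

# Hypothetical data: several tokens per speaker and condition.
rows = [
    {"speaker": "A", "condition": "focus", "f0": 210},
    {"speaker": "A", "condition": "focus", "f0": 230},
    {"speaker": "A", "condition": "neutral", "f0": 190},
    {"speaker": "B", "condition": "focus", "f0": 120},
    {"speaker": "B", "condition": "neutral", "f0": 100},
    {"speaker": "B", "condition": "neutral", "f0": 110},
]
aggregated = aggregate_means(rows, ["speaker", "condition"], "f0")
for row in aggregated:
    print(row)
```

The result has one row per speaker and condition, which is the layout needed for within-subjects comparisons.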
Aggregating data
• SPSS-dialogue-box
Non-parametric tests
• If the assumptions for parametric tests are not
met, you have to do non-parametric tests.
• They are statistically less powerful (i.e.
they are more likely not to find a difference
that is actually there – Type II error)
• On the other hand, if a non-parametric test
shows a significant difference, you can
draw strong conclusions
Mann-Whitney test
• Non-parametric equivalent to independent
t-test
• Null-hypothesis: The two samples we are
comparing are from the same distribution
• All data are ranked and calculations are
done on the ranks
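The ranking step can be sketched as follows. This is a minimal illustration with hypothetical scores, not the SPSS procedure: it pools both samples, assigns ranks (tied values share the mean of their ranks) and derives the U statistic from the rank sum of the first sample. No p-value is computed here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Smaller of the two U values for independent samples a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

# Hypothetical scores for two independent groups.
group1 = [12, 15, 11, 18, 14]
group2 = [20, 22, 19, 25, 17]
u = mann_whitney_u(group1, group2)
print(u)
```

A small U means that the two groups hardly overlap in rank; whether it is significant is then looked up in a table or taken from the software output.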
Wilcoxon Signed ranks test
• Non-parametric equivalent to paired t-test
• The absolute differences in the two
conditions are ranked
• Then the sign is added and the sum of the
negative and positive ranks is compared
• Requires that the two samples are drawn
from populations with the same distribution
shape
(if this is not the case, use the Sign Test)
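The ranking of signed differences can be sketched like this. The marks are hypothetical and are not the contents of any course file. The sketch drops pairs with a zero difference, which is a common convention but an assumption here, and it returns only the two rank sums, not a p-value.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Assumption: pairs with no difference are dropped before ranking.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank(v):
        # Mean of the 1-based positions that v occupies (handles ties).
        first = abs_sorted.index(v) + 1
        count = abs_sorted.count(v)
        return first + (count - 1) / 2

    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

# Hypothetical paired marks for six pupils in two subjects.
subject_a = [11, 9, 13, 8, 12, 10]
subject_b = [9, 10, 10, 8, 7, 6]
w_plus, w_minus = wilcoxon_signed_rank(subject_a, subject_b)
print(w_plus, w_minus)
```

If the two conditions do not differ, the positive and negative rank sums should be roughly equal; a large imbalance speaks against the null-hypothesis.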
Examples
• English is closer to German than French is
• A teacher compares the marks of a group
of German students who take English and
French (according to the German system
from 1 to 15)
• His research hypothesis is that pupils have
better marks in English than in French
• One-tailed prediction!
• File: language_marks.sav
Example
• For a one-tailed test, divide the significance
value by 2
• Marks in English are better than in French
(Z = -2.28, p = 0.011)
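The halving can be checked against the reported Z value. The sketch below assumes that the SPSS significance value comes from the standard normal approximation, and recomputes the two-tailed and one-tailed p-values for Z = -2.28.

```python
from statistics import NormalDist

z = -2.28  # value reported on the slide

# Two-tailed: probability of a |Z| at least this large in either direction.
p_two_tailed = 2 * NormalDist().cdf(-abs(z))
# One-tailed: half of that, valid only if the effect lies in the predicted direction.
p_one_tailed = p_two_tailed / 2

print(round(p_two_tailed, 3), round(p_one_tailed, 3))
```

The one-tailed value agrees with the p = 0.011 given above. Halving is only justified when the direction was predicted in advance and the observed difference goes in that direction.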
What are frequency data?
• Number of subjects/events in a given
category
• You can then test whether the observed
frequencies deviate from your expected
frequencies
• E.g. in an election, there is an a priori
chance of 50-50 for each candidate.
• Note that you must determine your
expected frequencies beforehand
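X2-test
• Null-hypothesis: there is no difference between
expected and observed frequency
• Data
            Kerry supporter   Bush supporter
observed          56                44
expected          50                50
• Calculation
X2 = sum over all cells of (observed – expected)^2 / expected
   = (56 – 50)^2 / 50 + (44 – 50)^2 / 50 = 1.44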
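The calculation for this example can be sketched as follows, using the observed and expected frequencies from the table. The p-value shortcut via `erfc` is an assumption that holds only for one degree of freedom, which is the case here (two categories).

```python
from math import erfc, sqrt

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [56, 44]  # Kerry supporters, Bush supporters
expected = [50, 50]  # a priori 50-50 split

x2 = chi_square(observed, expected)
# For df = 1 only: the upper-tail probability equals erfc(sqrt(x2 / 2)).
p = erfc(sqrt(x2 / 2))
print(round(x2, 2), round(p, 2))
```

The statistic stays below 3.84, the critical value for df = 1 at the 0.05 level, so the null-hypothesis cannot be rejected for these counts.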
Looking up the p-value
For a significant result, the calculated value for X2
must be larger than the critical value found in the
table
Degrees of freedom:
• If there is one independent variable with a
categories:
df = (a – 1)
• If there are two independent variables with a
and b categories:
df = (a – 1)(b – 1)
X2-test
• Limitations:
– All raw data for X2 must be frequencies (not
percentages!)
– Each subject or event is counted only once
(if we wish to find out whether boys or girls are more
likely to pass or fail a test, we might observe the
performance of 100 children on a test. We may not
observe the performance of 25 children on 4 tests,
however)
– The total number of observations should be greater
than 20
– The expected frequency in any cell should be greater
than 5
X2 as test of association
• Calculation of expected frequencies:
Cell freq = (Row total x column total) / Grand total

                  Past tense   Present tense   Aspect total
Progressive          308            476             784
Non-progressive      315            297             612
Total                623            773            1396
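For the tense and aspect table, the expected frequencies and the resulting X2 value can be sketched as follows. The function name is hypothetical; the counts are the ones given in the table, and the degrees of freedom follow the (a – 1)(b – 1) rule.

```python
def expected_frequencies(table):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: progressive, non-progressive; columns: past tense, present tense.
table = [[308, 476],
         [315, 297]]

expected = expected_frequencies(table)
x2 = sum((o - e) ** 2 / e
         for obs_row, exp_row in zip(table, expected)
         for o, e in zip(obs_row, exp_row))
df = (len(table) - 1) * (len(table[0]) - 1)

for row in expected:
    print([round(e, 1) for e in row])
print(round(x2, 2), df)
```

With one degree of freedom, the resulting X2 lies well above the 0.05 critical value of 3.84, so tense and aspect are associated in these counts.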