Transcript View Notes
Non-parametric statistical methods
for testing questionable data/population assumptions
Philip Twumasi-Ankrah, PhD
November 15, 2012
Parametric or Non-Parametric Tests
• Choosing the right test to compare
measurements is a bit tricky, as you must
choose between two families of tests:
– parametric and
– nonparametric
Parametric Tests
• Parametric statistical tests are based on the
assumption that the data are sampled from a
Gaussian distribution.
• These tests include the t test and analysis of
variance.
Non-Parametric Tests
• Tests that do not make assumptions about the
population distribution are referred to as
nonparametric tests.
• All commonly used nonparametric tests rank
the outcome variable from low to high and
then analyze the ranks.
• These tests include the Wilcoxon, Mann-Whitney, and Kruskal-Wallis tests.
• These tests are also called distribution-free
tests.
Validity of Assumptions
• For parametric statistical tests, it is important
that the assumptions made about the probability
distribution are valid.
• If this assumption about the data is true,
parametric tests:
– are more powerful than their equivalent nonparametric counterparts,
– can detect differences with smaller sample sizes, and
– can detect smaller differences with the same sample size.
Tests of Normality
• It is usually important to assure yourself of the
validity of the Normality Assumption.
• This involves tests of univariate normality,
which include:
– Graphical Methods
– Back-of-envelope Tests
– Some Historical Tests
– Diagnostic Tests
Graphical Tests
• Graphical Methods
– The Normal Quantile-Quantile (Q-Q) plot is constructed by plotting the empirical quantiles of the data against the corresponding quantiles of the normal distribution.
– Kernel Density Plot – a plot of the probability density function approximated from the observed data.
– The probability-probability plot (P-P plot or percent plot) – compares an empirical cumulative distribution function of a variable with a specific theoretical cumulative distribution function (e.g., the standard normal distribution function).
More Graphical Tests
• Graphical Methods
– Histogram plot of the data
– A box-plot of the data should indicate the nature of any skewness in the data.
– Stem-and-Leaf Plot
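A minimal plotting sketch for several of these displays, assuming Python with scipy and matplotlib (not part of the original slides); the array x is hypothetical stand-in data:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=200)  # stand-in for real data

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
stats.probplot(x, dist="norm", plot=axes[0, 0])  # Normal Q-Q plot
axes[0, 1].hist(x, bins=20)                      # histogram
axes[0, 1].set_title("Histogram")
axes[1, 0].boxplot(x)                            # box-plot (shows skewness)
axes[1, 0].set_title("Box-plot")
xs = np.linspace(x.min(), x.max(), 200)
axes[1, 1].plot(xs, stats.gaussian_kde(x)(xs))   # kernel density plot
axes[1, 1].set_title("Kernel density")
plt.tight_layout()
plt.show()
```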
Fast-and-Easy Tests
• Back-of-envelope Tests
– Using the sample maximum and minimum values,
compute their z-scores, and compare them to the
68–95–99.7 rule, as sketched below.
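A back-of-envelope sketch in Python (hypothetical data; the rough cutoff of |z| near 3 comes from the 99.7 part of the rule):

```python
import numpy as np

def extreme_zscores(x):
    # z-scores of the sample minimum and maximum
    z_min = (x.min() - x.mean()) / x.std(ddof=1)
    z_max = (x.max() - x.mean()) / x.std(ddof=1)
    return z_min, z_max

# Under normality roughly 99.7% of values fall within 3 standard
# deviations, so |z| much larger than ~3 in a modest sample is suspect.
z_min, z_max = extreme_zscores(np.array([4.1, 5.0, 4.8, 5.3, 4.6, 12.9]))
print(z_min, z_max)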
Historically Relevant Tests
• Some Historical Tests
– The third and fourth standardized moments
(skewness and kurtosis) were some of the earliest
tests for normality.
– Other early test statistics include the ratio of the
mean absolute deviation to the standard deviation, or
the ratio of the range to the standard deviation.
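These historical statistics are easy to compute with scipy; a sketch on stand-in data (how far from 0 is "too far" for the moments depends on the sample size):

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(1).normal(size=100)  # stand-in for real data
print("skewness:", stats.skew(x))              # third standardized moment
print("excess kurtosis:", stats.kurtosis(x))   # fourth standardized moment - 3
print("range/sd:", (x.max() - x.min()) / x.std(ddof=1))
```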
Diagnostic Tests
• Diagnostic Tests
– D'Agostino's K-squared test,
– Jarque–Bera test,
– Anderson–Darling test,
– Cramér–von Mises criterion,
– Lilliefors test for normality (an adaptation of the Kolmogorov–Smirnov test),
– Shapiro–Wilk test,
– Pearson's chi-squared test, and
– Shapiro–Francia test.
More recent tests include:
• The energy test
• Tests based on the empirical characteristic function, such as the Henze–Zirkler and BHEP tests.
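Several of these diagnostic tests are available in scipy.stats; a sketch, assuming x is a 1-D array of observations:

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(2).normal(size=200)  # stand-in for real data

print(stats.shapiro(x))                # Shapiro-Wilk
print(stats.normaltest(x))             # D'Agostino's K-squared
print(stats.anderson(x, dist="norm"))  # Anderson-Darling
print(stats.jarque_bera(x))            # Jarque-Bera
# Estimating the normal parameters from the data, as here, corresponds to
# the Lilliefors variant, so the standard K-S p-value is only approximate.
print(stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))))
```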
Choosing Between Parametric and Non-Parametric Tests:
Does it Matter?
• Does it matter whether you choose a
parametric or nonparametric test?
The answer depends on sample size.
There are four cases to think about:
Choosing Between Parametric and Non-Parametric Tests:
Does it Matter?
• Using a parametric test with data from a Non-Normal
population when sample sizes are large:
– The central limit theorem ensures that parametric tests
work well with large samples even if the population is
non-Gaussian. That is, parametric tests are robust to
deviations from Normal distributions, so long as the
samples are large.
– It is impossible to say how large is large enough.
• Nonparametric tests work well with large samples from
Normal populations.
– The P values tend to be a bit too large, but the discrepancy
is small. In other words, nonparametric tests are only
slightly less powerful than parametric tests with large
samples.
Choosing Between Parametric and Non-Parametric Tests:
Does it Matter?
• For small samples
– You can't rely on the central limit theorem, so the
P value may be inaccurate.
– In a nonparametric test with data from a Gaussian
population, the P values tend to be too high.
– The nonparametric tests lack statistical power
with small samples.
Choosing Between Parametric and Non-Parametric Tests:
Does it Matter?
• Does it matter whether you choose a
parametric or nonparametric test?
– Large data sets present no problems.
– Small data sets present a dilemma.
Non-Parametric Tests…
• Assume that your data have an underlying continuous
distribution.
• Assume that for groups being compared, their parent
distributions are similar in all characteristics other than
location.
• Are usually less sensitive than parametric methods.
• Are often more robust than parametric methods when
the parametric assumptions are not properly met.
• Can run into problems when there are many ties (data
with the same value).
• That take into account the magnitude of the difference
between categories (e.g., the Wilcoxon signed-ranks test) are
more powerful than those that do not (e.g., the sign test).
Choice of Non-Parametric Test
• It depends on the level of measurement obtained
(nominal, ordinal, or interval), the power of the test,
whether samples are related or independent, the number of
samples, and the availability of software support (e.g., SPSS).
• Related samples usually refer to matched-pair samples
(formed using randomization) or before-after samples.
• Other cases are usually treated as independent
samples. For instance, in a survey using random
sampling, we may have a sub-sample of males and a
sub-sample of females. These can be considered
independent samples, as they are all randomly selected.
Non-Parametric Tests in SPSS
(tests grouped by level of measurement and sample design)
• Nominal
– One-sample: Binomial
– Two related samples: McNemar for significance of changes
– Two independent samples: Fisher exact probability; Chi-square
– K related samples: Cochran Q (dichotomous)
– K independent samples: Chi-square
• Ordinal
– One-sample: Kolmogorov-Smirnov; Runs
– Two related samples: Sign; Wilcoxon matched-pairs signed-ranks
– Two independent samples: Mann-Whitney U; Kolmogorov-Smirnov; Wald-Wolfowitz runs; Moses test of extreme reactions
– K related samples: Friedman two-way analysis of variance; Kendall's W
– K independent samples: Kruskal-Wallis one-way analysis of variance
• Interval
– Two related samples: Walsh
– Two independent samples: Randomization
One-sample case
• Binomial – tests whether the observed
distribution of a dichotomous variable (a variable
that has only two values) is the same as that
expected from a given binomial distribution.
• The default value of p is 0.5. You can change
the value of p.
• For example, if a couple has given birth to
8 baby girls in a row and you would like to test
whether their probability of giving birth to a baby
girl is > 0.6 or > 0.7, you can test the hypothesis
by changing the default value of p in the SPSS
programme.
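The same test outside SPSS, sketched with scipy's binomtest on the 8-girls example above:

```python
from scipy.stats import binomtest

# H0: probability of a girl is p; observed 8 girls in 8 births.
result = binomtest(k=8, n=8, p=0.5, alternative="greater")
print(result.pvalue)  # 0.5**8, about 0.0039

# Changing the hypothesized p, as the slide suggests:
print(binomtest(8, 8, p=0.6, alternative="greater").pvalue)  # about 0.0168
print(binomtest(8, 8, p=0.7, alternative="greater").pvalue)  # about 0.0576
```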
One Sample Test Continued
• Kolmogorov-Smirnov – compares the
distribution of a variable with a
uniform, normal, Poisson, or
exponential distribution.
• Null hypothesis: the observed values
were sampled from a distribution of
that type.
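A sketch of the one-sample test with scipy, against a fully specified normal distribution (hypothetical data):

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(3).normal(loc=10, scale=2, size=50)  # stand-in data
stat, p = stats.kstest(x, "norm", args=(10, 2))  # H0: x ~ Normal(10, 2)
print(stat, p)  # a large p-value gives no evidence against H0
```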
More One Sample Tests
Runs
• A run is defined as a sequence of cases on the
same side of the cut point. (An uninterrupted
course of some state or condition, for e.g. a run of
good luck).
• You should use the Runs Test procedure when you
want to test the hypothesis that the values of a
variable are ordered randomly with respect to a
cut point of your choosing (default cut point:
the median); a code sketch follows below.
• Example:
• Suppose you ask 20 students how well they understand
a lecture on a scale from 1 to 5, and the median in the
class is 3. If the first 10 students give a value higher
than 3 and the second 10 give a value lower than 3,
there are only 2 runs:
5445444545 2222112211
• In a random situation there should be more runs (but
the number will not be close to 20 either, which would
mean the values alternate exactly: a value below 3 is
followed by one higher than it, and vice versa):
2,4,1,5,1,4,2,5,1,4,2,4
• The Runs Test is often used as a precursor to running
tests that compare the means of two or more groups,
including:
– The Independent-Samples T Test procedure.
– The One-Way ANOVA procedure.
– The Two-Independent-Samples Tests procedure.
– The Tests for Several Independent Samples procedure.
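A from-scratch sketch of the asymptotic (normal-approximation) runs test, using the lecture's 20-student example as hypothetical input:

```python
import numpy as np
from scipy.stats import norm

def runs_test(x, cut=None):
    x = np.asarray(x, dtype=float)
    if cut is None:
        cut = np.median(x)                    # default cut point: the median
    above = x >= cut                          # dichotomize at the cut point
    n1, n2 = above.sum(), (~above).sum()
    runs = 1 + np.sum(above[1:] != above[:-1])  # runs = sign changes + 1
    mean = 2 * n1 * n2 / (n1 + n2) + 1          # expected runs under randomness
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mean) / np.sqrt(var)
    return runs, z, 2 * norm.sf(abs(z))       # two-tailed asymptotic p-value

# First 10 answers above the cut point 3, last 10 below: only 2 runs,
# so the ordering is clearly non-random (tiny p-value).
print(runs_test([5,4,4,5,4,4,4,5,4,5, 2,2,2,2,1,1,2,2,1,1], cut=3))
```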
Runs Test

                          siblings
Test Value(a)                 1.00
Cases < Test Value               4
Cases >= Test Value             36
Total Cases                     40
Number of Runs                   7
Z                            -.654
Asymp. Sig. (2-tailed)        .513

a. Median
Two-sample case (Related Samples)
• McNemar – tests whether the
changes in proportions are the same
for pairs of dichotomous variables.
McNemar's test is computed like the
usual chi-square test, but only the
two cells in which the classifications
don't match are used.
• Null hypothesis: people are equally
likely to fall into two contradictory
classification categories.
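A sketch of McNemar's test using statsmodels, with a hypothetical 2x2 table of paired before/after classifications; only the two discordant (off-diagonal) cells drive the statistic:

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

table = np.array([[30, 12],   # rows: "before" category
                  [ 5, 23]])  # columns: "after" category
result = mcnemar(table, exact=False, correction=True)
print(result.statistic, result.pvalue)
```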
Related Sample Cases
• Sign test – tests whether the numbers of
positive (+) and negative (–) differences between
two samples are approximately the same. Each
pair of scores (before and after) is compared.
• When "after" > "before", the pair receives a
+ sign; when smaller, a – sign. When both are
the same, it is a tie.
• The sign test does not use all the information
available (the size of the difference), but it
requires fewer assumptions about the sample and
can avoid the influence of outliers.
Sign Test
• To test the association between the
following two perceptions:
• "Social workers help the
disadvantaged" and "Social workers
bring hope to those in adverse
situations".
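Because the sign test reduces to a binomial test on the +/- counts (ties dropped), it can be sketched directly with scipy; the before/after scores below are hypothetical:

```python
import numpy as np
from scipy.stats import binomtest

before = np.array([3, 4, 2, 5, 3, 4, 2, 3, 4, 3])
after  = np.array([4, 4, 3, 5, 4, 5, 3, 3, 5, 4])
diff = after - before
n_plus = int(np.sum(diff > 0))       # pairs with a + sign
n_nonzero = int(np.sum(diff != 0))   # ties contribute nothing
print(binomtest(n_plus, n_nonzero, p=0.5).pvalue)
```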
More Related Sample Cases
• Wilcoxon matched-pairs signed-ranks test –
similar to the sign test, but takes into consideration
the ranking of the magnitude of the difference among
the pairs of values. (The sign test only considers the
direction of the difference, not the magnitude.)
• The test requires that the differences (of the true
values) be a sample from a symmetric distribution
(but does not require normality). It is good practice
to run a stem-and-leaf plot of the differences first.
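A sketch with scipy's wilcoxon, reusing hypothetical before/after scores (zero differences are dropped by default):

```python
import numpy as np
from scipy.stats import wilcoxon

before = np.array([3, 4, 2, 5, 3, 4, 2, 3, 4, 3])
after  = np.array([4, 4, 3, 5, 4, 5, 3, 3, 5, 4])
stat, p = wilcoxon(before, after)  # ranks the magnitudes of the differences
print(stat, p)
```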
Two-sample case
(independent samples)
• Mann-Whitney U – similar to the Wilcoxon
matched-pairs signed-ranks test, except that the
samples are independent rather than paired. It is
the most commonly used alternative to the
independent-samples t test.
• Null hypothesis: the two groups come from
populations with the same distribution (under the
equal-shape assumption, this amounts to equal
location).
• The actual computation of the Mann-Whitney test
is simple. You rank the combined data values for
the two groups. Then you find the average rank in
each group.
• Requirement: the population variances for the
two groups must be the same, but the shape of
the distribution does not matter.
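A sketch with scipy's mannwhitneyu and two hypothetical independent groups:

```python
from scipy.stats import mannwhitneyu

group1 = [12, 15, 11, 18, 14, 13]
group2 = [22, 19, 25, 17, 21, 20]
# Ranks the pooled values, then compares the rank sums of the two groups.
stat, p = mannwhitneyu(group1, group2, alternative="two-sided")
print(stat, p)
```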
Two Independent Sample Cases
• Kolmogorov-Smirnov Z – tests whether
two distributions are different. It is
used when there are only a few
values available on the ordinal
scale. The K-S test is more powerful than
the M-W U test if the two distributions
differ in terms of dispersion instead
of central tendency.
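A sketch of the two-sample K-S test with scipy (hypothetical ordinal-style data); it reacts to any difference between the distributions, including dispersion:

```python
from scipy.stats import ks_2samp

stat, p = ks_2samp([1, 2, 2, 3, 3, 3, 4],
                   [1, 1, 2, 4, 4, 5, 5])
print(stat, p)  # statistic is the largest gap between the two empirical CDFs
```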
More Two Independent Sample Cases
• Wald-Wolfowitz Runs – based on the
number of runs within each group
when the cases are placed in rank
order.
• Moses test of extreme reactions –
tests whether the range (excluding
the lowest 5% and the highest 5%) of
an ordinal variable is the same in the
two groups.
K-sample case
(Independent samples)
• Kruskal-Wallis One-way ANOVA – more
powerful than the Chi-square test
when an ordinal scale can be assumed.
It is computed exactly like the Mann-Whitney
test, except that there are
more groups. The data must be
independent samples from
populations with the same shape
(but not necessarily normal).
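A sketch with scipy's kruskal and three hypothetical independent groups:

```python
from scipy.stats import kruskal

g1 = [7, 8, 6, 9, 7]
g2 = [5, 4, 6, 5, 4]
g3 = [9, 10, 8, 9, 11]
stat, p = kruskal(g1, g2, g3)  # rank-based one-way ANOVA
print(stat, p)
```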
K Related samples
• Friedman two-way ANOVA – tests
whether the k related samples could
probably have come from the same
population with respect to mean rank.
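A sketch with scipy's friedmanchisquare; each argument holds one condition measured on the same (hypothetical) subjects:

```python
from scipy.stats import friedmanchisquare

cond1 = [4, 5, 3, 4, 5]  # 5 subjects, condition 1
cond2 = [3, 4, 3, 3, 4]  # same subjects, condition 2
cond3 = [5, 5, 4, 5, 5]  # same subjects, condition 3
stat, p = friedmanchisquare(cond1, cond2, cond3)
print(stat, p)
```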
More K Related Samples Cases
• Cochran Q – determines whether it is
likely that the k related samples could
have come from the same population
with respect to proportion or
frequency of “successes” in the
various samples.
• In other words, it requires
dichotomous variables.
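A sketch with statsmodels' cochrans_q on hypothetical dichotomous data; rows are subjects, columns are the k related samples:

```python
import numpy as np
from statsmodels.stats.contingency_tables import cochrans_q

x = np.array([[1, 1, 0],   # each row: one subject's 0/1 outcomes
              [1, 0, 0],   # across the k related conditions
              [1, 1, 1],
              [0, 0, 0],
              [1, 1, 0],
              [1, 0, 1]])
result = cochrans_q(x)
print(result.statistic, result.pvalue)
```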
Other Interesting Use of
Non-Parametrics
• Non-parametric regression
– Is a form of regression analysis in which the
predictor does not take a predetermined form but
is constructed according to information derived
from the data.
– Nonparametric regression requires larger sample
sizes than regression based on parametric models
because the data must supply the model structure
as well as the model estimates.
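One common form of non-parametric regression is LOWESS (locally weighted scatterplot smoothing); a sketch with statsmodels on synthetic data, where frac controls how much of the data each local fit uses:

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=200)   # no assumed functional form
fitted = lowess(y, x, frac=0.3)  # returns columns: sorted x, smoothed y
print(fitted[:5])
```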
Questions