Six Sigma Black Belt Training


Transcript: Six Sigma Black Belt Training

Hypothesis Testing
Introduction
Always about a population parameter
An attempt to prove (or disprove) some assumption
Setup:
Alternate hypothesis: what you wish to prove
  Example: the person is guilty of a crime
Null hypothesis: assume the opposite of what is to be proven; the null is always stated as an equality
  Example: the person is innocent
The test
1. Take a sample and compute the statistic of interest.
   (The evidence gathered against the defendant.)
2. How likely is it that, if the null were true, you would get such a statistic? This likelihood is the p-value.
   (How likely is it that an innocent person would be found at the scene of the crime, with gun in hand, etc.?)
3. If very unlikely, then the null must be false, hence the alternate is proven beyond reasonable doubt.
4. If quite likely, then the null may be true, so there is not enough evidence to discard it in favor of the alternate.
Types of Errors

Decision                                    Null is really True      Null is really False
Reject null (assume alternate is proven)    Type I Error             Good Decision
                                            (convict the innocent)
Do not reject null (evidence for            Good Decision            Type II Error
alternate not strong enough)                                         (let the guilty go free)
Hypothesis Testing Roadmap

Continuous, Normal, Interval Scaled:
  Means: Z-tests, t-tests, ANOVA
  Variance: Chi-square, F-test, Bartlett's
  Medians: same tests as Non-Normal Medians
  Correlation: Correlation, Regression

Continuous, Non-Normal, Ordinal Scaled:
  Medians: Sign Test, Wilcoxon, Kruskal-Wallis, Mood's, Friedman's
  Variance: Levene's
  Correlation: Correlation

Attribute:
  Chi-square Contingency Tables
Parametric Tests

Use parametric tests when:
1. The data are normally distributed.
2. The variances of the populations (if more than one is sampled from) are equal.
3. The data are at least interval scaled.
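The first two conditions can be checked before choosing a test. A minimal sketch in Python, assuming SciPy is available (the two sample arrays are hypothetical placeholder data, not from the slides):

from scipy import stats

group1 = [10, 12, 9, 8, 7, 12, 14, 13]
group2 = [15, 16, 18, 12, 18, 19, 20, 17, 15]

# Condition 1: normality of each group (Shapiro-Wilk; a large p-value
# means no evidence against normality)
print(stats.shapiro(group1))
print(stats.shapiro(group2))

# Condition 2: equal variances (Bartlett's test, which assumes normal data)
print(stats.bartlett(group1, group2))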
One sample z - test
Used when testing whether a sample comes from a known population. A sample of 25 measurements shows a mean of 17. Test whether this is significantly different from the hypothesized mean of 15, assuming the population standard deviation is known to be 4.
One-Sample Z
Test of mu = 15 vs not = 15
The assumed standard deviation = 4

 N     Mean  SE Mean              95% CI     Z      P
25  17.0000   0.8000  (15.4320, 18.5680)  2.50  0.012
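The same numbers can be reproduced by hand. A minimal sketch in Python, assuming SciPy is available (Minitab is the tool actually used above):

from math import sqrt
from scipy.stats import norm

n, xbar, mu0, sigma = 25, 17.0, 15.0, 4.0

se = sigma / sqrt(n)                       # SE Mean = 0.8
z = (xbar - mu0) / se                      # Z = 2.50
p = 2 * norm.sf(abs(z))                    # two-sided P, about 0.012
ci = (xbar - 1.96 * se, xbar + 1.96 * se)  # approx. 95% CI, (15.43, 18.57)
print(z, p, ci)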
Z-test for proportions
70% of 200 customers surveyed say they prefer the taste of Brand X over competitors. Test the hypothesis that more than 66% of people in the population prefer Brand X.
Test and CI for One Proportion
Test of p = 0.66 vs p > 0.66

                           95% Lower
Sample    X    N  Sample p     Bound  Z-Value  P-Value
     1  140  200  0.700000  0.646701     1.19    0.116
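A minimal sketch of the same one-proportion z-test in Python, assuming SciPy; the standard error under the null uses p = 0.66, which is how the Z-value above is obtained:

from math import sqrt
from scipy.stats import norm

x, n, p0 = 140, 200, 0.66
phat = x / n                                          # 0.70

z = (phat - p0) / sqrt(p0 * (1 - p0) / n)             # about 1.19
p_value = norm.sf(z)                                  # one-sided, about 0.116
lower = phat - 1.645 * sqrt(phat * (1 - phat) / n)    # 95% lower bound, about 0.647
print(z, p_value, lower)

With a p-value of about 0.12, the claim that more than 66% of the population prefers Brand X is not supported at the usual 0.05 level.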
One sample t-test

The data show reductions in blood pressure (BP) in a sample of 17 people after a certain treatment. We wish to test whether the average reduction in BP was at least 13%, a benchmark set by another treatment that we wish to match or better.

BP Reduction %: 10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15

[Figure: Probability Plot of BP Reduction (Normal, 95% CI). Mean = 13.82, StDev = 3.925, N = 17, AD = 0.204, P-Value = 0.850. The high Anderson-Darling p-value indicates the data are consistent with a normal distribution, so a t-test is appropriate.]
One Sample t-test – Minitab results

One-Sample T: BP Reduction
Test of mu = 13 vs > 13

                                        95% Lower
Variable       N     Mean   StDev  SE Mean    Bound     T      P
BP Reduction  17  13.8235  3.9248   0.9519  12.1616  0.87  0.200
The p-value of 0.20 indicates that the reduction in BP could not be shown to be greater than 13%: if the true mean reduction were 13%, a sample mean at least this large would occur about 20% of the time, so there is not enough evidence to reject the null hypothesis.
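A minimal sketch of the same one-sided test in Python, assuming SciPy 1.6 or later (the alternative= argument is not available in older versions):

from scipy.stats import ttest_1samp

bp_reduction = [10, 12, 9, 8, 7, 12, 14, 13,
                15, 16, 18, 12, 18, 19, 20, 17, 15]

# H0: mu = 13  vs  H1: mu > 13
t_stat, p_value = ttest_1samp(bp_reduction, 13, alternative="greater")
print(t_stat, p_value)   # roughly t = 0.87, p = 0.20, matching the output above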
Two Sample t-test
You realize that though the overall reduction is not proven to be more than 13%, there seems to be a difference between how men and women react to the treatment. You separate the 17 observations by gender, and wish to test whether there is in fact a significant difference between the genders.
Test for Equal Variances for BP Reduction

BP Reduction by Gender
  F: 15, 16, 18, 12, 18, 19, 20, 17, 15
  M: 10, 12, 9, 8, 7, 12, 14, 13

F-Test:         Test Statistic = 0.96   P-Value = 0.941
Levene's Test:  Test Statistic = 0.14   P-Value = 0.716

[Figure: 95% Bonferroni Confidence Intervals for StDevs by Gender, plotted alongside the BP Reduction values for each group.]
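A minimal sketch of the two variance tests in Python, assuming NumPy and SciPy; scipy.stats.levene with its default center='median' should give results close to the Minitab output above, though not necessarily identical:

import numpy as np
from scipy import stats

bp_f = [15, 16, 18, 12, 18, 19, 20, 17, 15]
bp_m = [10, 12, 9, 8, 7, 12, 14, 13]

# F-test for two variances, p-value taken as twice the smaller tail
f_stat = np.var(bp_f, ddof=1) / np.var(bp_m, ddof=1)
df1, df2 = len(bp_f) - 1, len(bp_m) - 1
p_f = 2 * min(stats.f.cdf(f_stat, df1, df2), stats.f.sf(f_stat, df1, df2))

# Levene's test, robust to departures from normality
lev_stat, p_lev = stats.levene(bp_f, bp_m)

print(f_stat, p_f)       # should be close to 0.96 and 0.941
print(lev_stat, p_lev)   # should be close to 0.14 and 0.716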
Two Sample t-test

The test for equal variances shows that the variances of the two samples are not significantly different. Thus a pooled two-sample t-test may be conducted. The results are shown below. The p-value indicates there is a significant difference between the genders in their reaction to the treatment.

Two-sample T for BP Reduction M vs BP Reduction F

           N   Mean  StDev  SE Mean
BP Red M   8  10.63   2.50     0.89
BP Red F   9  16.67   2.45     0.82

Difference = mu (BP Red M) - mu (BP Red F)
Estimate for difference: -6.04167
95% CI for difference: (-8.60489, -3.47844)
T-Test of difference = 0 (vs not =): T-Value = -5.02  P-Value = 0.000  DF = 15
Both use Pooled StDev = 2.4749
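A minimal sketch of the pooled two-sample t-test in Python, assuming SciPy; equal_var=True requests the pooled standard deviation, as in the Minitab analysis:

from scipy.stats import ttest_ind

bp_m = [10, 12, 9, 8, 7, 12, 14, 13]
bp_f = [15, 16, 18, 12, 18, 19, 20, 17, 15]

# H0: mu_M = mu_F  vs  H1: mu_M != mu_F (two-sided), pooled variance
t_stat, p_value = ttest_ind(bp_m, bp_f, equal_var=True)
print(t_stat, p_value)   # roughly t = -5.02, p = 0.0002 (reported by Minitab as 0.000)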
Basics of ANOVA
Analysis of Variance, or ANOVA, is a technique used to test the hypothesis that there is a difference between the means of two or more populations. It is used in regression, in the analysis of factorial experiment designs, and in Gauge R&R studies.

The basic premise of ANOVA is that differences in the means of two or more groups can be seen by partitioning the Sum of Squares. Sum of Squares (SS) is simply the sum of the squared deviations of the observations from their mean. Consider the following example with two groups, where the measurements are the thumb lengths, in centimeters, of two types of primates.

Total variation (SS) is 28, of which only 4 (2 + 2) is within the two groups. Thus 24 of the 28 is due to the differences between the groups. This partitioning of SS into 'between' and 'within' is used to test the hypothesis that the groups are in fact different from each other.
See www.statsoft.com for more details.
Obs.    Type A  Type B
1            2       6
2            3       7
3            4       8
Mean         3       7
SS           2       2

Overall: Mean = 5, SS = 28
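The partition of the Sum of Squares can be verified directly. A minimal sketch in Python, assuming NumPy:

import numpy as np

type_a = np.array([2.0, 3.0, 4.0])
type_b = np.array([6.0, 7.0, 8.0])
both = np.concatenate([type_a, type_b])

def ss(x):
    # sum of squared deviations from the mean
    return float(np.sum((x - x.mean()) ** 2))

ss_total = ss(both)                    # 28
ss_within = ss(type_a) + ss(type_b)    # 2 + 2 = 4
ss_between = ss_total - ss_within      # 24
print(ss_total, ss_within, ss_between)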
Results of ANOVA

The results of running an ANOVA on the sample data from the previous slide are shown here. The hypothesis test computes the F-value as the ratio of MS 'Between' to MS 'Within'. The greater the value of F, the greater the likelihood that there is in fact a difference between the groups. Looking up the F-value of 24 in an F-distribution table (with 1 and 4 degrees of freedom) gives a p-value of 0.008, indicating 99.2% confidence that the difference is real (exists in the population, not just in the sample).
One-way ANOVA: Type A, Type B

Source  DF     SS     MS      F      P
Factor   1  24.00  24.00  24.00  0.008
Error    4   4.00   1.00
Total    5  28.00

S = 1  R-Sq = 85.71%  R-Sq(adj) = 82.14%
Minitab: Stat/ANOVA/One-Way (unstacked)
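A minimal sketch of the same one-way ANOVA in Python, assuming SciPy:

from scipy.stats import f_oneway

type_a = [2, 3, 4]
type_b = [6, 7, 8]

f_stat, p_value = f_oneway(type_a, type_b)
print(f_stat, p_value)   # F = 24.0, p of about 0.008, matching the table above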
Two-Way ANOVA

Is the strength of steel produced different for different temperatures to which it is heated and the speed with which it is cooled? Here 2 factors (speed and temp) are varied at 2 levels each, and the strengths of 3 parts produced at each combination are measured as the response variable.

Strength  Temp  Speed
20.0      Low   Slow
22.0      Low   Slow
21.5      Low   Slow
23.0      Low   Fast
24.0      Low   Fast
22.0      Low   Fast
25.0      High  Slow
24.0      High  Slow
24.5      High  Slow
17.0      High  Fast
18.0      High  Fast
17.5      High  Fast

The results show significant main effects as well as an interaction effect.

Two-way ANOVA: Strength versus Temp, Speed

Source        DF       SS       MS      F      P
Temp           1   3.5208   3.5208   5.45  0.048
Speed          1  20.0208  20.0208  31.00  0.001
Interaction    1  58.5208  58.5208  90.61  0.000
Error          8   5.1667   0.6458
Total         11  87.2292
S = 0.8036 R-Sq = 94.08% R-Sq(adj) = 91.86%
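A minimal sketch of the same two-way ANOVA in Python, assuming pandas and statsmodels are available; the column names simply mirror the worksheet above:

import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "Strength": [20.0, 22.0, 21.5, 23.0, 24.0, 22.0,
                 25.0, 24.0, 24.5, 17.0, 18.0, 17.5],
    "Temp":  ["Low"] * 6 + ["High"] * 6,
    "Speed": ["Slow", "Slow", "Slow", "Fast", "Fast", "Fast"] * 2,
})

# Model with both main effects and their interaction
model = ols("Strength ~ C(Temp) * C(Speed)", data=data).fit()
print(anova_lm(model, typ=2))   # SS, F and p for Temp, Speed and Temp:Speed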
Two-Way ANOVA
The box plots give an indication of the interaction effect. The effect of speed on the response is different for different levels of temperature. Thus, there is an interaction effect between temperature and speed.

[Figure: Boxplot of Strength by Temp, Speed, showing the distribution of Strength for each of the four Temp/Speed combinations.]
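As a quick numerical view of the same interaction, the four cell means can be computed directly. A minimal sketch, assuming pandas, using the same data as the two-way ANOVA sketch above:

import pandas as pd

data = pd.DataFrame({
    "Strength": [20.0, 22.0, 21.5, 23.0, 24.0, 22.0,
                 25.0, 24.0, 24.5, 17.0, 18.0, 17.5],
    "Temp":  ["Low"] * 6 + ["High"] * 6,
    "Speed": ["Slow", "Slow", "Slow", "Fast", "Fast", "Fast"] * 2,
})

# Mean Strength for each Temp/Speed combination
print(data.groupby(["Temp", "Speed"])["Strength"].mean().unstack())

Going from Slow to Fast raises the mean strength at Low temperature (about 21.2 to 23.0) but lowers it sharply at High temperature (24.5 to 17.5); that reversal is the interaction the box plots display.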