Hypothesis Testing
Example:
• Test the performance of two lists in terms of
response rates
• Sample (1,000) from the first list provides a
response rate of 3.5%
• Sample (1,200) from the second list provides
a response rate of 4.5%
• Do the two lists (populations) really differ, or is
the difference an artifact of the sample?
The Logic of Hypothesis Testing
• When you want to make statements about a
population, you usually draw samples
• How generalizable is your sample-based finding?
• Evidence has to be evaluated statistically before
arriving at a conclusion regarding the hypothesis
• The strength of the evidence depends on whether the
sample has fewer or more observations
Steps in Hypothesis Testing
1) Problem definition
2) Clearly state the null and alternate hypotheses
3) Choose the relevant test and the appropriate probability distribution
4) Choose the critical value
- Determine the significance level
- Determine the degrees of freedom
- Decide if one- or two-tailed test
5) Compute relevant test statistic
6) Compare test statistic and critical value
7) Does the test statistic fall in the critical region?
- Yes: reject null
- No: do not reject null
Basic Concepts of Hypothesis Testing
• The Null and Alternate hypothesis
• Choosing the relevant statistical test and
appropriate probability distribution. Depends on
- Size of the sample
- Whether the population standard deviation is
known or not
• Choosing the Critical Value. The three criteria
used are
- Significance Level
- Degrees of Freedom
- One or Two Tailed Test
Significance Level
• Indicates the percentage of sample means that is
outside the cut-off limits (critical value)
• The higher the significance level (α) used for
testing a hypothesis, the higher the probability of
rejecting a null hypothesis when it is true (Type I
error)
• Accepting a null hypothesis when it is false is
called a Type II error and its probability is (β)
Significance Level (Contd.)
• When choosing a level of significance, there
is an inherent tradeoff between these two
types of errors
• Power of hypothesis test (1 - β)
• A good test of hypothesis ought to reject a
null hypothesis when it is false
• 1 - β should be as high a value as possible
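The tradeoff between α and β can be made concrete with a small calculation. The sketch below assumes a two-tailed z-test and a hypothetical scenario (hypothesized mean 5000, true mean 4950, σ = 250, n = 100); the function name `power_two_tailed_z` and these numbers are illustrative only and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def power_two_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a two-tailed z-test when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se  # true mean expressed in standard errors
    # Probability that the test statistic lands in either rejection tail.
    return std_normal.cdf(-z_crit - shift) + (1 - std_normal.cdf(z_crit - shift))

# Hypothetical scenario: H0 says 5000, but the true mean is 4950.
for alpha in (0.10, 0.05, 0.01):
    power = power_two_tailed_z(5000, 4950, 250, 100, alpha)
    print(f"alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

Lowering α (fewer Type I errors) lowers the power in this scenario, which means β (Type II errors) goes up.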
Degrees of Freedom
• The number of "free" or unconstrained pieces
(bits) of data used in calculating a
sample statistic or test statistic
• A sample mean (X̄) has n degrees of
freedom
• A sample variance (s²) has (n-1) degrees of
freedom
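A short sketch of why the sample variance has only n-1 degrees of freedom: once the mean is fixed, the deviations from it must sum to zero, so the last one is determined by the others. The data values are made up for illustration.

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = [4, 8, 6, 5, 7]
n = len(data)

mean = sum(data) / n                    # uses all n observations
deviations = [x - mean for x in data]   # these always sum to zero

# Only n - 1 deviations are free, so the sum of squares is divided by n - 1.
sample_var = sum(d * d for d in deviations) / (n - 1)

print(mean, sum(deviations), sample_var)
print(statistics.variance(data))        # the standard library also divides by n - 1
```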
One or Two-tail Test
• One-tailed Hypothesis Test
• Determines whether a particular population parameter is larger
or smaller than some predefined value
• Uses one critical value of test statistic
• Two-tailed Hypothesis Test
• Determines the likelihood that a population parameter is within
certain upper and lower bounds
• May use one or two critical values
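As an illustration of how the choice of tails changes the critical value, the sketch below looks up standard normal critical values for α = 0.05 (an assumed level, used only as an example). A one-tailed test puts all of α in one tail; a two-tailed test splits it, giving a symmetric pair ±z.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# One-tailed: all of alpha sits in a single tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)

# Two-tailed: alpha/2 in each tail, critical values are -z and +z.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed critical value:  {z_one_tailed:.3f}")
print(f"two-tailed critical values: +/-{z_two_tailed:.3f}")
```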
Hypothesis Testing
                          In Population
Data Analysis Outcome     Null Hypothesis True    Null Hypothesis False
Accept Null Hypothesis    Correct Decision        Type II Error
Reject Null Hypothesis    Type I Error            Correct Decision
Hypothesis Testing About
a Single Mean - Step-by-Step
1) Formulate Hypotheses
2) Select appropriate formula
3) Select significance level
4) Calculate z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain critical value from table
7) Make decision regarding the Null-hypothesis
Hypothesis Testing About
a Single Mean - Example 1(2 tailed)
• Ho: μ = 5000 (hypothesized value of population)
• Ha: μ ≠ 5000 (alternative hypothesis)
• n = 100
• X̄ = 4960
• σ = 250
• α = 0.05
Rejection rule: if |z_calc| > z_{α/2} then reject Ho.
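The slide stops at the rejection rule; a minimal sketch of carrying out this two-tailed z-test with the values given is shown below. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, xbar, sigma, alpha = 5000, 100, 4960, 250, 0.05

se = sigma / sqrt(n)                           # standard error of the mean
z_calc = (xbar - mu0) / se                     # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value

reject = abs(z_calc) > z_crit
print(f"z_calc={z_calc:.2f}, z_crit={z_crit:.2f}, reject H0: {reject}")
```

With these inputs |z_calc| is 1.6, which is below the critical value of about 1.96, so Ho is not rejected.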
Hypothesis Testing About
a Single Mean - Example 2
• Ho: μ = 1000 (hypothesized value of population)
• Ha: μ ≠ 1000 (alternative hypothesis)
• n = 12
• X̄ = 1087.1
• s = 191.6
• α = 0.01
Rejection rule: if |t_calc| > t_{df, α/2} then reject Ho.
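A sketch of the t-test for this example follows. The Python standard library has no t-distribution, so the critical value is hard-coded from a t-table (df = 11, α/2 = 0.005 gives about 3.106); that table lookup is an assumption added here, not a number stated on the slide.

```python
from math import sqrt

mu0, n, xbar, s, alpha = 1000, 12, 1087.1, 191.6, 0.01

df = n - 1                      # degrees of freedom for a one-sample t-test
se = s / sqrt(n)                # estimated standard error of the mean
t_calc = (xbar - mu0) / se      # test statistic

# Assumed table value: t for df = 11 with alpha/2 = 0.005 in each tail.
t_crit = 3.106

reject = abs(t_calc) > t_crit
print(f"df={df}, t_calc={t_calc:.3f}, t_crit={t_crit}, reject H0: {reject}")
```

Here t_calc is roughly 1.57, well inside the critical value, so Ho is not rejected at the 0.01 level.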
Hypothesis Testing About
a Single Mean - Example 3(1 tailed)
• Ho: μ ≥ 5000 (hypothesized value of population)
• Ha: μ < 5000 (alternative hypothesis)
• n = 50
• X̄ = 4970
• σ = 250
• α = 0.01
Rejection rule: if z_calc < -z_α then reject Ho.
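A sketch of this one-tailed test, assuming the usual lower-tail rule (reject Ho when the test statistic falls below -z_α), since the alternative is μ < 5000:

```python
from math import sqrt
from statistics import NormalDist

mu0, n, xbar, sigma, alpha = 5000, 50, 4970, 250, 0.01

se = sigma / sqrt(n)
z_calc = (xbar - mu0) / se

# Lower-tail critical value: all of alpha sits in the left tail.
z_alpha = NormalDist().inv_cdf(1 - alpha)

reject = z_calc < -z_alpha
print(f"z_calc={z_calc:.3f}, -z_alpha={-z_alpha:.3f}, reject H0: {reject}")
```

z_calc is about -0.85, which is not below -2.33, so Ho is not rejected.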
Hypothesis Test of Difference
between Means
• Mayor of a city wants to see if males and
females earn the same
• A random sample of 400 males and 576
females was taken and the following was found
                Males      Females
Mean            $105.70    $112.80
Standard dev    $5.00      $4.80
Hypothesis Test of Difference
between Means
• The appropriate test depends on
- whether the samples are related or
unrelated
- whether population standard deviations are
known or not
- if not, whether they can be assumed to be
equal or not
Hypothesis Test of Difference
between Means
• In salary example, the null hypothesis is
Ho: μ1 - μ2 = c (= 0)
Ha: μ1 - μ2 ≠ c
• Since we have unrelated samples with known (for
large samples, we can use sample SD as pop SD) but
unequal σ's, the standard error of difference in means
is
$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{5^2}{400} + \frac{(4.80)^2}{576}} = .32 $$
Hypothesis Test of Difference
between Means
• The calculated value of z is
$$ z_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{(105.70 - 112.80) - 0}{.32} = -22.19 $$
• For α = .01 and a two-tailed test, the Z-table
value is 2.58
• Since |z_calc| is greater than z_{α/2}, the null
hypothesis is rejected
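The salary example can be reproduced with a few lines. This is a sketch using the sample standard deviations as stand-ins for the population values, as the slides do for large samples. Note that the code keeps the unrounded standard error, so its z differs slightly in the second decimal from the slide value, which divides by the rounded .32.

```python
from math import sqrt
from statistics import NormalDist

n1, mean1, s1 = 400, 105.70, 5.00   # males
n2, mean2, s2 = 576, 112.80, 4.80   # females
alpha = 0.01

# Standard error of the difference in means (unequal variances).
se_diff = sqrt(s1**2 / n1 + s2**2 / n2)

# Under H0 the hypothesized difference mu1 - mu2 is 0.
z_calc = ((mean1 - mean2) - 0) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_calc) > z_crit
print(f"se={se_diff:.4f}, z_calc={z_calc:.2f}, z_crit={z_crit:.2f}, reject H0: {reject}")
```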
Hypothesis Testing of Proportion
• Quality control dept of a light bulb
company claims 95% of its products are
defect free
• The CEO checks 225 bulbs and finds only
87% to be defect free
• Is the claim of 95% true at .05 level of
significance ?
• So we have hypothesized values po = 0.95, qo = 0.05,
and sample values p = 0.87, q = 0.13
Hypothesis Testing of Proportion
• The null hypothesis is Ho: p = 0.95
• The alternate hypothesis is Ha: p ≠ 0.95
• First, calculate the standard error of the proportion
using hypothesized values as
$$ \sigma_p = \sqrt{\frac{p_o q_o}{n}} = \sqrt{\frac{.95 \times .05}{225}} = .0145 $$
• Since np and nq are large, we can use the Z table.
The appropriate z value is 1.96
Hypothesis Testing of Proportion
• The limits of the acceptance region are
po ± 1.96 σp = .95 ± (1.96 × .0145) = (.922, .978)
• Since the sample proportion of 0.87 does
not fall within the acceptance region, the
CEO should reject the quality control
department’s claim
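The light bulb example can be checked numerically. The sketch below builds the acceptance region around the hypothesized proportion and tests whether the sample proportion falls inside it; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, p_sample, alpha = 0.95, 225, 0.87, 0.05

# Standard error uses the hypothesized proportion, not the sample one.
se = sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

lower, upper = p0 - z_crit * se, p0 + z_crit * se
reject = not (lower <= p_sample <= upper)

print(f"se={se:.4f}, acceptance region=({lower:.3f}, {upper:.3f}), reject H0: {reject}")
```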
Hypothesis Testing of Difference
between Proportions
• Manager wants to see if John and Linda,
two salespeople, have the same conversion rate
• He picks samples and finds that
         Sample size    Number converted    Proportion converted
John     100            84                  0.84 (= p_j)
Linda    100            82                  0.82 (= p_l)
Hypothesis Testing of Difference
between Proportions
• Are their conversion rates different at 0.05
significance level?
• The null hypothesis is Ho: p_j = p_l
• The alternate hypothesis is Ha: p_j ≠ p_l
• The best estimate of p (proportion of success)
is
$$ \hat{p} = \frac{n_1 p_j + n_2 p_l}{n_1 + n_2} = 0.83, \quad \text{also} \quad \hat{q} = 1 - \hat{p} = .17 $$
Hypothesis Testing of Difference
between Proportions
• An estimate of the standard error of the difference of
proportions is
$$ \hat{\sigma}_{p_j - p_l} = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} = .053 $$
• The z value can be calculated as
$$ z_{calc} = \frac{(p_j - p_l) - 0}{\hat{\sigma}_{p_j - p_l}} = .38 $$
• The z value obtained from the table is 1.96 (for α = .05).
Thus, we fail to reject the null hypothesis
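A sketch of the pooled two-proportion z-test for the John and Linda example, working from the conversion counts in the table:

```python
from math import sqrt
from statistics import NormalDist

n1, converted_john = 100, 84
n2, converted_linda = 100, 82
alpha = 0.05

p_j = converted_john / n1
p_l = converted_linda / n2

# Pooled estimate of the common proportion under H0: p_j = p_l.
p_hat = (converted_john + converted_linda) / (n1 + n2)
q_hat = 1 - p_hat

se_diff = sqrt(p_hat * q_hat / n1 + p_hat * q_hat / n2)
z_calc = ((p_j - p_l) - 0) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_calc) > z_crit
print(f"p_hat={p_hat:.2f}, se={se_diff:.3f}, z_calc={z_calc:.2f}, reject H0: {reject}")
```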
The Probability Values (P-value)
Approach to Hypothesis Testing
• P-value provides researcher with
alternative method of testing hypothesis
without pre-specifying α
• Largest level of significance at which we
would not reject Ho
The Probability Values (P-value)
Approach to Hypothesis Testing
Difference Between Using α and p-value
• Hypothesis testing with a pre-specified α
• Researcher is trying to determine, "is the probability
of what has been observed less than α?"
• Reject or fail to reject Ho accordingly
The Probability Values (P-value)
Approach to Hypothesis Testing
Using the p-Value
• Researcher can determine "how unlikely is the
result that has been observed?"
• Decide whether to reject or fail to reject Ho
without being bound by a pre-specified
significance level
• In general, the smaller the p-value, the greater is
the researcher's confidence in sample findings
The Probability Values (P-value) Approach
to Hypothesis Testing: Example
• Ho: μ = 25 (hypothesized value of population)
• Ha: μ ≠ 25 (alternative hypothesis)
• n = 50
• X̄ = 25.2
• σ = 0.7
$$ SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} = 0.1; \qquad Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = 2 $$
• From Z-table, prob Z > 2 is 0.0228. As this is a 2-tailed
test, the p-value is 2 × 0.0228 = .0456
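The p-value in this example can be computed directly rather than read from a table. The sketch below shows both the slide's route (standard error rounded to 0.1, so Z = 2) and the unrounded calculation; the two differ slightly because of that rounding.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, xbar, sigma = 25, 50, 25.2, 0.7
std_normal = NormalDist()

# Unrounded calculation.
se = sigma / sqrt(n)
z_exact = (xbar - mu0) / se
p_exact = 2 * (1 - std_normal.cdf(abs(z_exact)))

# Slide's calculation: SE rounded to 0.1, giving Z = 2.
z_rounded = 2.0
p_rounded = 2 * (1 - std_normal.cdf(z_rounded))

print(f"exact:   z={z_exact:.3f}, p-value={p_exact:.4f}")
print(f"rounded: z={z_rounded:.1f}, p-value={p_rounded:.4f}")
```

Either way the p-value is a little under 0.05, so Ho would be rejected at α = 0.05 but not at α = 0.01.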
The Probability Values (P-value)
Approach to Hypothesis Testing
Using the p-Value
• P-value is generally sensitive to sample size
• For a given observed difference, a larger sample yields a lower p-value
• P-value can report the impact of the sample
size on the reliability of the results
Relationship between C.I and
Hypothesis Testing (Example 1)
• A direct marketer knows that the average number of
purchases per month in the entire database is 5.6
• By sampling ‘loyals’ he finds that their average is
6.1 (i.e., X̄ = 6.1)
• Is it merely a sampling accident?
• Ho: μ = 5.6 (hypothesized value of population)
• Ha: μ ≠ 5.6 (alternative hypothesis)
• n = 35
• σ = 2.5
Relationship between C.I and
Hypothesis Testing (Example 1)

• Std err
$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = 0.42 $$
• The appropriate Z for α = .05 is 1.96
• The Confidence Interval is
μ0 ± 1.96 σ_X̄ = (4.78, 6.42)
• Since 6.1 falls in the interval, we cannot
reject the null hypothesis
Confidence Intervals and
Hypothesis Testing
• Hypothesis testing and Confidence Intervals
are two sides of the same coin.
$$ t = \frac{\bar{X} - \mu}{s_{\bar{x}}} $$
$$ \bar{X} \pm t\, s_{\bar{x}} = \text{interval estimate for } \mu $$
Relationship between C.I and
Hypothesis Testing (Example 2)
• Revisit the first example we started with
• Test the performance of two lists in terms of
response rates
• Sample (1,000) from the first list provides a
response rate of 3.5%
• Sample (1,200) from the second list provides
a response rate of 4.5%
• Do the two lists (populations) really differ, or is
the difference an artifact of the sample?
Relationship between C.I and
Hypothesis Testing (Example 2)
– C.I. of list 1:
• (0.035)+/- 1.96*(SE1)
• SE1 = Sqrt[(0.035*0.965)/1000]=0.006
• C.I.1=(0.0232,0.0467)
– C.I. of list 2:
• (0.045)+/-1.96*(SE2)
• SE2=Sqrt[(0.045*0.955)/1200]=0.006
• C.I.2 =(0.033,0.0568)
– What can we infer based on these confidence Intervals?
• The two intervals overlap, so there is not sufficient evidence to infer
any difference between the response rates of the two lists.
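The two intervals can be recomputed as follows. As an additional check that is not on this slide, the sketch also applies the pooled two-proportion z-test described earlier to the same data; the helper name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

z_crit = NormalDist().inv_cdf(0.975)   # two-tailed, alpha = 0.05

def proportion_ci(p, n):
    """95% confidence interval for a sample proportion p from n observations."""
    se = sqrt(p * (1 - p) / n)
    return p - z_crit * se, p + z_crit * se

n1, p1 = 1000, 0.035   # list 1
n2, p2 = 1200, 0.045   # list 2

ci1 = proportion_ci(p1, n1)
ci2 = proportion_ci(p2, n2)
overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]

# Extra check (assumption: same pooled test as the salesperson example).
p_hat = (n1 * p1 + n2 * p2) / (n1 + n2)
se_diff = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
z_calc = (p2 - p1) / se_diff

print(f"C.I.1=({ci1[0]:.4f}, {ci1[1]:.4f}), C.I.2=({ci2[0]:.4f}, {ci2[1]:.4f})")
print(f"overlap: {overlap}, z_calc={z_calc:.2f}, z_crit={z_crit:.2f}")
```

The intervals here are slightly narrower than the slide's because the standard errors are not rounded to 0.006. Both checks point the same way: the intervals overlap and z_calc is below 1.96.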