Statistical Hypothesis Testing & Bir


By: Becca Doll, Stevie Lemons, Eloise Nelson





Groups of five or fewer.
No looking at notes or readings.
You have one minute.
On scratch paper, write down as many key
terms and concepts from this class as you
can remember. Anything from the quarter.
There is a reward.

“Calculates the probability, p, that an
observed difference between two or more
data samples can be explained by random
chance alone, as opposed to any fundamental
difference between the underlying
populations that the sample came from.”
(www.dmaictools.com/dmaic-analyze/hypothesis-testing)
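One way to make this definition concrete is a permutation test: shuffle the pooled data many times and count how often chance alone produces a difference at least as large as the observed one. The sketch below is only an illustration. The function name `permutation_p_value` and the two data samples are hypothetical and do not come from the cited sources.

```python
import random
from statistics import mean

def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Under H0 the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_resamples

# Hypothetical measurements, invented for illustration only.
group_a = [12.1, 11.8, 12.6, 12.4, 11.9, 12.3]
group_b = [12.0, 12.2, 11.7, 12.5, 12.1, 11.9]
p = permutation_p_value(group_a, group_b)
print(f"p = {p:.3f}")
```

A small p means random relabeling rarely reproduces the observed difference; a large p means chance alone explains it easily.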
How to construct a test
1- Formulate a null hypothesis (H₀)
and a research hypothesis (H₁ / Hₐ)
2- Choose a sampling distribution
and a statistical test according to
the null hypothesis and data type.
3- Specify a significance level (α) and define
the region of rejection

(Nachmias © 1992)
Construction continued.


4- Compute the test statistic and reject / fail to reject the null
5- Describe your findings in English.
(Nachmias © 1992)
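The five steps can be walked through in a short sketch. It assumes a one-sample, two-tailed z-test with a known population standard deviation. The hypothesized mean, the standard deviation, and the sample values are all hypothetical, chosen only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: H0: population mean = 100; H1: population mean != 100.
mu_0 = 100.0
sigma = 15.0  # assumed known population standard deviation (hypothetical)

# Hypothetical ratio-level sample, invented for illustration.
sample = [108, 112, 95, 104, 118, 101, 109, 97, 115, 106]

# Step 2: sampling distribution = standard normal; test = one-sample z-test.
std_normal = NormalDist()

# Step 3: significance level and two-tailed region of rejection.
alpha = 0.05
z_crit = std_normal.inv_cdf(1 - alpha / 2)  # reject when |z| > z_crit

# Step 4: compute the test statistic and decide.
z = (mean(sample) - mu_0) / (sigma / sqrt(len(sample)))
reject = abs(z) > z_crit

# Step 5: describe the finding in plain English.
if reject:
    print(f"z = {z:.2f}: reject H0 at alpha = {alpha}.")
else:
    print(f"z = {z:.2f}: fail to reject H0 at alpha = {alpha}.")
```

With real data the choice of test in step 2 depends on the data type and on whether the population standard deviation is actually known.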
A little History



Hypothesis testing is largely the product of
Ronald Fisher, Jerzy Neyman, Karl Pearson
and (son) Egon Pearson.
Fisher → Rigorous experimental design to extract a
result from few samples.
Neyman and Egon Pearson → Emphasized mathematical rigor and methods to obtain more
results from many samples and a wider range of distributions.
Karl Pearson → Chi-square test, Pearson’s correlation
(Wikipedia)
Formulating Hypothesis

To formulate, use the principle of parsimony
(or economy).
◦ A null hypothesis (H₀) is a statement made by a researcher at
the beginning of an experiment that says, essentially, that nothing is
happening in the experiment.
◦ The alternative, also called the research hypothesis (H₁ / Hₐ), is
the antithesis of the null.
(Nachmias © 1992)
Sampling distribution and Statistical Test

The data collected from your sample
can be classified into one of four
categories.
Nominal → Yes/No answer
Ordinal → First, second, third (discrete scales, not continuous)
Interval → No true zero, like temperature
Ratio → True zero, like money

Once you know what kind of data you have,
you can choose the appropriate statistical
test
(Richardson)
Significance level and Region of Rejection

“The decision as to what result is sufficiently
unlikely to justify the rejection of the null
hypothesis is quite arbitrary”
“The sum of the probabilities of the results included in the region of
rejection is denoted as the level of significance (α)”
◦ The level is usually set at .05 or .01
Region of Rejection
“The area specified by a null hypothesis that covers the values of
the observed statistics that lead to the rejection of the null
hypothesis.”
(Nachmias © 1992)
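The "sum of the probabilities" wording is easiest to see with a discrete example. The sketch below assumes a hypothetical experiment of 10 coin flips with H₀: the coin is fair, and a hand-picked region of rejection containing the most extreme head counts. Neither the experiment nor the region comes from Nachmias.

```python
from math import comb

n = 10          # number of coin flips (hypothetical experiment)
p_heads = 0.5   # H0: the coin is fair

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Region of rejection: head counts judged too extreme to be chance.
rejection_region = [0, 1, 9, 10]

# The level of significance is the total probability of that region under H0.
alpha = sum(binom_pmf(k, n, p_heads) for k in rejection_region)
print(f"alpha = {alpha:.4f}")
```

Here the region sums to 22/1024, which is below .05. Adding 2 and 8 heads to the region would raise the sum to 112/1024, which is above .05, so the region is chosen to keep α at or under the chosen level.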
Region of Rejection continued



A statistical test can be one-tailed or two-tailed.
“Two-tailed: region of rejection is located at
both left and right tails.” (H₁ uses ≠)
“One-tailed: extreme results can be located
at either tail.” (H₁ uses < or >)
(Nachmias © 1992)
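The difference between the two kinds of test shows up in the cutoffs. This sketch assumes a z statistic and α = .05. The observed value `z_observed = 1.8` is hypothetical.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-tailed: alpha is split between both tails.
two_tailed_cut = std_normal.inv_cdf(1 - alpha / 2)   # reject if |z| > cut
# One-tailed: all of alpha sits in a single tail.
right_tailed_cut = std_normal.inv_cdf(1 - alpha)     # reject if z > cut
left_tailed_cut = std_normal.inv_cdf(alpha)          # reject if z < cut

z_observed = 1.8  # hypothetical test statistic
p_one_tailed = 1 - std_normal.cdf(z_observed)
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z_observed)))

print(f"two-tailed cutoff:  +/-{two_tailed_cut:.3f}")
print(f"one-tailed cutoffs: {left_tailed_cut:.3f} (left), {right_tailed_cut:.3f} (right)")
print(f"p (one-tailed) = {p_one_tailed:.4f}, p (two-tailed) = {p_two_tailed:.4f}")
```

The same observed statistic can fall inside a one-tailed region of rejection and outside a two-tailed one, so the direction of H₁ should be fixed before looking at the data.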
Reject or Fail to Reject the Null

Type I and Type II Errors

Decision            Null Hypothesis is True   Null Hypothesis is False
Reject Hypothesis   Type I error              No error
Accept Hypothesis   No error                  Type II error
(Nachmias © 1992)
                                     Defendant Innocent                         Defendant Guilty
Reject the Null Hypothesis           Type I Error (innocent person convicted)   Correct
Fail to Reject the Null Hypothesis   Correct                                    Type II Error (guilty person goes free)
(Rogers)
1) COURT CASE: H₀: The defendant is not guilty
2) MANUFACTURING AIRBAGS: H₀: The airbags are up to industry standards
3) MR. MAGIC MALE ENHANCER
(includes a drug, not approved by the FDA for male erectile dysfunction (ED). The drug may interact
with nitrates in prescription medications for heart conditions, high blood pressure, and diabetes,
lowering blood pressure to dangerous levels)
H₀: This drug has no side effects on blood pressure.
(Rogers) (U.S. Food and Drug Administration)

The probability of a Type I Error (α) is the
level of significance.
α → Type I Error
β → Type II Error
(Richardson)
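The link between α and Type I errors can be checked by simulation: when the null hypothesis really is true, a test run at level α should reject in roughly a fraction α of repeated experiments. The sketch assumes a two-tailed z-test on normally distributed data with known standard deviation 1. The function name and its default settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of simulated experiments that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate varies from run to run but should stay close to the chosen α.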

Power = the probability of rejecting a false
null hypothesis.

β is the probability of a Type II Error.

To find Power use 1 − β.

We want the power of a test to approach 1
(which means a lower likelihood of a Type II Error)
(Black 1999)
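For a simple case, β and power can be computed directly. The sketch assumes a two-tailed one-sample z-test with known standard deviation. The function `z_test_power` and the numbers passed to it are hypothetical and are not taken from Black (1999).

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z-test when the true mean is off by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean shifts the test statistic, in standard errors.
    shift = effect / (sigma / sqrt(n))
    # beta: probability the statistic still lands outside the rejection region.
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta

for n in (10, 50, 100):
    print(n, round(z_test_power(effect=5.0, sigma=15.0, n=n), 3))
```

Power climbs toward 1 as the sample size grows, and with no true effect it equals α, since every rejection is then a Type I error.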




◦ Counter-intuitive formulation and generally confusing terminology
◦ A logical claim, not an empirical one
◦ The larger the sample size, the greater the chance of finding a
statistically significant difference, regardless of whether the
difference is notable
◦ Statistical significance does not tell you anything about the size
of the effect being measured
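The sample-size point can be illustrated by holding a tiny observed difference fixed and letting only n grow. The sketch assumes a two-tailed z-test with known standard deviation. The difference of 0.5 units against a standard deviation of 15 is a hypothetical, deliberately unimpressive effect.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sigma, n):
    """Two-tailed p-value for an observed mean difference under a z-test."""
    z = mean_diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same small difference every time; only the sample size changes.
for n in (100, 1600, 6400, 25600):
    print(n, round(two_sided_p(mean_diff=0.5, sigma=15.0, n=n), 4))
```

The difference never changes, yet the p-value drops below .05 once n is large enough, which is why significance alone says nothing about how large or notable the effect is.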



Statistical significance does not equal
importance
Conversely, statistical insignificance does not
equal unimportance
“Absence of evidence is not evidence of
absence”

Ronald Carver's 4 suggestions:
◦ Insist that "statistically" be inserted in front of "significant" in research
reports.
◦ Insist that the results always be interpreted with respect to the data first,
and statistical significance, second.
◦ Insist that attention be paid to the size of the effect, whether it is
statistically significant or not.
◦ Insist that new journal editors present their views on statistical significance
testing prior to their selection.
(Carver 1993)

Focus instead on estimates of effect size and
sampling error
Repeat studies!
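Reporting an effect size together with its sampling error might look like the sketch below. It makes several assumptions not found in the slides: Cohen's d is used as one common standardized effect size, the confidence interval uses a normal approximation (a t-based interval would be more appropriate for small samples), and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def effect_summary(a, b, confidence=0.95):
    """Mean difference, its standard error, an approximate CI, and Cohen's d."""
    diff = mean(a) - mean(b)
    var_a, var_b = variance(a), variance(b)
    # Sampling error of the difference between two independent means.
    std_error = sqrt(var_a / len(a) + var_b / len(b))
    pooled_sd = sqrt(((len(a) - 1) * var_a + (len(b) - 1) * var_b)
                     / (len(a) + len(b) - 2))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return {
        "difference": diff,
        "std_error": std_error,
        "ci": (diff - z * std_error, diff + z * std_error),
        "cohens_d": diff / pooled_sd,
    }

# Hypothetical scores, invented for illustration only.
summary = effect_summary([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(summary)
```

The interval shows the range of effects compatible with the data, which is more informative than a bare "significant" or "not significant" verdict.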







Resources
David Nachmias. Research Methods in the Social Sciences, Fourth Edition (New York, 1992), 447-456.
Wikipedia. “Hypothesis Testing: Origins.” Available at
http://en.wikipedia.org/wiki/Hypothesis_testing#Origins . Accessed November 1, 2010.
DMAIC Tools: Six Sigma Training Resources. “Hypothesis Testing: Analyze Overview.” Available at
www.dmaictools.com/dmaic-analyze/hypothesis-testing. Accessed November 1, 2010.
Thomas R. Black. "Simulations for Complex Concepts: Teaching Statistical Power as
an Example." International Journal of Mathematical Education in Science and
Technology 30, no. 4 (1999): 473-81. Available at
http://dx.doi.org/10.1080/002073999287752. Accessed November 23, 2010.
John V. Richardson. Hypothesis Testing: Role of Type I and Type II Errors, Level of
Significance (alpha) and Power (Beta). Class Handout. Distributed in IS 280
on November 18, 2010.
Tom Rogers. "Type I and Type II Errors." Intuitor: How to Succeed through Creative
Learning. Available at http://www.intuitor.com/statistics/T1T2Errors.html. Accessed November
23, 2010.




U.S. Food and Drug Administration. "Mr. Magic Male Enhancer: Undeclared Drug
Ingredient." FDA: Safety. August 18, 2010. Accessed November 23, 2010.
http://www.fda.gov/Safety/MedWatch/SafetyInformation/SafetyAlertsforHumanMedicalProducts/ucm223837.htm.
Ronald P. Carver. 1993. “The Case against Statistical Significance Testing, Revisited.” The Journal of
Experimental Education, Vol. 61, No. 4, Statistical Significance Testing in Contemporary Practice
(Summer, 1993), pp. 287-292.
Bruce Thompson. 1995. “The Concept of Statistical Significance Testing.” ERIC Clearinghouse on
Assessment and Evaluation. Available at http://www.ericdigests.org/1995-1/testing.htm.
Accessed on November 21, 2010.
James P. Shaver. 1985. “Chance and Nonsense: A Conversation about Interpreting Tests of Statistical
Significance.” The Phi Delta Kappan, Vol. 67, No. 1 (Sep., 1985), pp. 57-60.