C19-22 - Inference for Proportions AP Statistics Review

Inference for Proportions(C18-C22 BVD)
C19-22: Inference for Proportions
* When you calculate a statistic from a sample, you often
do so to estimate a population parameter. The sample
result is a point estimate of the true value in the
population.
* We realize the population parameter is probably not
exactly equal to our sample result, so we want to say how
accurate the estimate is.
* We know from Ch. 18 that sampling distributions for
means and proportions are approximately Normal (when the
conditions are met), centered at the true population parameter.
* If we “reach out” a certain number of standard errors
(margin of error) from the sample statistic, we can tell
how close the estimate tends to be to the population true
value in repeated random sampling.
* A confidence level (like C%) tells the success rate of
“capturing” the true parameter in an interval of that size
if it were to be done repeatedly (although usually we
really only do it once).
* Confidence Level: To say that we are 95% confident
is shorthand for “95% of all possible samples of a
given size from this population will result in an
interval that captures the true population parameter.”
* Confidence Interval: To interpret a C% confidence
interval for an unknown parameter, say, “We are C%
confident that the interval from __________ to
___________ captures the actual value of the
(population parameter in context).”
*p-hat +/- z* × sqrt(p-hat × q-hat / n)
*Sample statistic +/- ME
*ME = # standard errors reaching
out from statistic.
*1 prop Z Int on calculator
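The interval formula above can be sketched in a few lines of Python. This is only an illustration, not the calculator's own routine: the function name `one_prop_z_interval` and the sample counts (120 successes out of 400) are made up for the example, and the standard library's `statistics.NormalDist` stands in for the z-table.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-proportion z-interval."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # z* leaves (1 - confidence)/2 in each tail of the standard Normal model
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * q_hat / n)   # standard error of p-hat
    me = z_star * se               # margin of error
    return p_hat - me, p_hat + me

# Hypothetical sample: 120 successes out of 400
low, high = one_prop_z_interval(120, 400, 0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The result should agree with 1-PropZInt for the same inputs, up to rounding.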
* Draw or imagine a normal model with C%
shaded, symmetric about the center.
* What percent is left in the two tails?
* What percentile is the upper or lower fence at?
* Look up that percentile in z-table to read off z*
(or use invNorm with that percentile, e.g. invNorm(.975)
for 95% confidence or invNorm(.95) for 90%).
* Good to memorize most common ones, like 95%
=> 1.96
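As a rough sketch of the steps above, the following Python turns a confidence level into z* by finding the percentile of the upper fence. The helper name `z_star` is an assumption for this example; `NormalDist().inv_cdf` plays the role of invNorm.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* for a two-sided interval at the given confidence."""
    tail = (1 - confidence) / 2          # area left in each tail
    upper_percentile = 1 - tail          # percentile of the upper fence
    return NormalDist().inv_cdf(upper_percentile)

for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%} confidence -> z* = {z_star(c):.3f}")
```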
*ME = z*(SE)
Plug in desired ME (like within 3% means ME = 0.03).
Plug in z* for desired level of confidence.
Plug in sample p-hat and q-hat if they exist,
or use 0.5 for both.
Solve equation for n.
See pages 374-375 for an example.
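One way to sketch the sample-size steps above in Python is shown below. The function name `sample_size` and its defaults are assumptions for illustration; it solves ME = z* × sqrt(p-hat × q-hat / n) for n and rounds up, since a fractional person cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def sample_size(me, confidence=0.95, p_hat=0.5):
    """Smallest n whose margin of error is at most `me`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # ME = z* sqrt(p q / n)  =>  n = (z*)^2 p q / ME^2
    n = z**2 * p_hat * (1 - p_hat) / me**2
    return ceil(n)  # always round up

# Within 3% at 95% confidence, no prior estimate (use 0.5 for both p and q)
print(sample_size(0.03, 0.95))
```

Using 0.5 for both p-hat and q-hat gives the largest possible product, so the resulting n is the conservative choice.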
*For all inference for proportions check:
*1. Is it plausible that sample/experiment
results are independent of one another?
*2. Random sampling/assignment?
*3. Sample less than 10% of population?
*4. Success/Fail – np and nq both at least
10?
*Success/Fail – if doing interval, use p-hat
and q-hat, if doing hypothesis test, use p0
and q0.
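The two numeric checks in the list above (10% condition and Success/Fail) can be sketched as a small helper. Independence and randomness are judgment calls about the study design and cannot be verified by arithmetic. The function name `check_conditions` is hypothetical, and the cutoff used is the common "at least 10" version of the Success/Fail rule.

```python
def check_conditions(n, p, population_size=None):
    """Numeric condition checks for one-proportion inference.

    Pass p-hat for an interval, or p0 for a hypothesis test.
    """
    results = {
        # expected successes and failures both at least 10
        "success_failure": n * p >= 10 and n * (1 - p) >= 10,
    }
    if population_size is not None:
        # sample is less than 10% of the population
        results["ten_percent"] = n < 0.10 * population_size
    return results

# Hypothetical survey: n = 400, p-hat = 0.3, population of 10,000
print(check_conditions(400, 0.3, population_size=10_000))
```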
* Null: p is hypothesized value (no hats!)
* Alternate: isn’t, is greater, is less than
* Hypothesized Model: centers at p0, has a
standard deviation of sqrt(p0q0/n)
* Find z-score of sample value.
* Use table or normalcdf to find area of shaded
region. (double for two-tail test).
* 1-prop-z-test on calculator – report z and p-value.
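The test steps above might look like this in Python. It is a sketch under assumed names (`one_prop_z_test`, the `alternative` strings) and made-up data, with `NormalDist().cdf` standing in for normalcdf.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z-test."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)      # hypothesized model uses p0, not p-hat
    z = (p_hat - p0) / sd
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    else:                              # two-tail: double the one-tail area
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

# Hypothetical data: 120 successes in 400 trials, H0: p = 0.25, Ha: p > 0.25
z, p_value = one_prop_z_test(120, 400, 0.25, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```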
* When stating which test/interval you’ll be
doing, and/or writing hypotheses, you should
define the variables used in writing those
hypotheses so there is no confusion.
* Beware of writing down whatever your
calculator spits out without defining/explaining
what those numbers represent.
* Two-tailed (not-equal) alternate if all you want to
know is whether there is evidence that the null isn’t
true.
* One-tailed alternate if you want to know if
there is evidence that the null value is too high or
too low.
* Use given alpha-level (significance level) or
choose alpha-level.
* If p-value is below alpha, reject null.
* If p-value is above alpha, fail to reject null.
* “The p-value of ________ is above/below our
alpha level of _________ therefore we fail to
reject/reject the null hypothesis and conclude
(there is not enough evidence for the alternate/there
is evidence for what the alternate says, in context).”
* If you choose alpha = 0.05, 1-alpha gives
confidence level for a two-tail test (95% CI)
that would lead to the same conclusions.
* If you’re doing a one-tail test, all of alpha is in
one tail. The corresponding CI ends up being
shortened. Instead of 1-alpha, its confidence level is 1-2alpha.
* For a one-tail test with alpha = 0.05, the CI
that would give same conclusions is 1-.1 or
90%.
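A quick numerical check of this relationship, using assumed helper names: the critical z for a one-tail test at alpha should match z* for a two-sided interval at confidence level 1-2alpha, because both put alpha in the relevant tail.

```python
from statistics import NormalDist

def one_tail_critical(alpha):
    """z cutoff that leaves all of alpha in one tail."""
    return NormalDist().inv_cdf(1 - alpha)

def matching_ci_level(alpha):
    """Confidence level of the interval matching a one-tail test."""
    return 1 - 2 * alpha

def z_star(confidence):
    """z* for a two-sided interval at the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

alpha = 0.05
level = matching_ci_level(alpha)
print(f"one-tail critical z: {one_tail_critical(alpha):.3f}")
print(f"matching CI level: {level:.0%}, z* = {z_star(level):.3f}")
```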
* Type I error – rejecting null when it is really true
* Type II error – failing to reject null that is really false
* Probability of Type I error = alpha
* Probability of Type II error represented by Beta, but there is
a different Beta for each specific alternate value you might
want to detect. You will not have to calculate Beta.
* The Power of a test is 1-Beta, and is the probability or
ability to correctly detect that a null is false.
* If you increase alpha (i.e. lowering C), you increase the
probability of a Type I error, but you decrease the
probability of a Type II error and so increase the power of
the test.
* Increasing sample size can decrease the probability of both
types of error while still increasing power.
* In reality, you almost never know what type of errors have
occurred, if any. You analyze what types of errors would be
the most damaging, and choose your sample size, alpha
level, confidence level and so on accordingly. Make a truth
table to help you analyze the risks.
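Although Beta is not something you have to calculate, a small sketch can make the trade-offs above concrete. It assumes a one-tail test (alternate "is greater") and one specific alternate value; the function name `power_one_sided` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(p0, p_alt, n, alpha=0.05):
    """Power of H0: p = p0 vs Ha: p > p0 when the true proportion is p_alt."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Reject the null when p-hat lands above this cutoff (null model uses p0)
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Under the alternate, p-hat is centered at p_alt with its own spread
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)
    beta = NormalDist(p_alt, sd_alt).cdf(cutoff)   # P(fail to reject | p_alt)
    return 1 - beta

# Hypothetical scenario: null says 0.5, truth is 0.6
for n, alpha in [(100, 0.05), (100, 0.10), (400, 0.05)]:
    print(f"n={n}, alpha={alpha}: power = {power_one_sided(0.5, 0.6, n, alpha):.3f}")
```

Raising alpha or raising n should each push the power up, which matches the bullets above.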
* Changes for two proportions:
* check if two groups are independent, not just if
individual sample data are independent.
* Check Success/Fail for both groups.
* SE formula for CI is sqrt(p1q1/n1 +p2q2/n2)
* CI = sample difference +/- ME
* Null usually is p1-p2 = 0
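* SE formula for hypothesis test uses pooled p-total and q-total
(all successes over all trials from both groups)
instead of p1,q1,p2,q2: sqrt(p-total*q-total/n1 + p-total*q-total/n2)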
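The two-proportion changes listed above can be sketched as follows. The function names and the counts (60 of 200 versus 45 of 200) are invented for illustration; note that the interval uses the unpooled SE while the test pools the two samples.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

def two_prop_z_test(x1, n1, x2, n2):
    """Two-tail test of H0: p1 - p2 = 0 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_total = (x1 + x2) / (n1 + n2)          # pooled proportion
    q_total = 1 - p_total
    se = sqrt(p_total * q_total / n1 + p_total * q_total / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical groups: 60 of 200 vs 45 of 200
low, high = two_prop_z_interval(60, 200, 45, 200)
z, p_value = two_prop_z_test(60, 200, 45, 200)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```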