Chapters 8 - 9
Estimation (Icelandic: mat og metlar)
Estimator and Estimate (Icelandic: metill og mat)
An estimator of a population parameter is a random variable that depends on the sample information and whose value provides approximations to this unknown parameter. A specific value of that random variable is called an estimate.
Point Estimator and Point Estimate (Icelandic: punktmetill og punktmat)
Let $\theta$ represent a population parameter (Icelandic: þýðisstiki), such as the population mean $\mu$ or the population proportion $\pi$. A point estimator, $\hat{\theta}$, of a population parameter, $\theta$, is a function of the sample information that yields a single number called a point estimate. For example, the sample mean, $\bar{X}$, is a point estimator of the population mean $\mu$, and the value that $\bar{X}$ assumes for a given set of data is called the point estimate.
Unbiasedness (Icelandic: óhneigður, óbjagaður)
The point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter $\theta$ if the expected value, or mean, of the sampling distribution of $\hat{\theta}$ is $\theta$; that is,
$$E(\hat{\theta}) = \theta$$
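Probability Density Functions for Unbiased and Biased Estimators (Icelandic: þéttifall fyrir hneigðan og óhneigðan metil)
(Figure 8.1)
[Figure 8.1: density curves of two estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$, plotted against $\hat{\theta}$]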
Bias (Icelandic: bjögun, skekkja)
Let $\hat{\theta}$ be an estimator of $\theta$. The bias in $\hat{\theta}$ is defined as the difference between its mean and $\theta$; that is,
$$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$$
It follows that the bias of an unbiased estimator is 0.
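The definition Bias = E(estimator) - parameter can be checked by simulation. The sketch below is illustrative only: the sample size, the value of sigma, and the choice of comparing the sample variance (divisor n - 1) with the divisor-n version are assumptions made for the example, not part of the slides.

```python
import random
import statistics

def simulate_bias(n=5, sigma=2.0, reps=10000, seed=1):
    """Approximate E(estimator) - sigma^2 for two variance estimators."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    total_div_n_minus_1 = 0.0
    total_div_n = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_div_n_minus_1 += statistics.variance(sample)   # divisor n - 1
        total_div_n += statistics.pvariance(sample)          # divisor n
    # Bias = average of the estimates minus the true parameter value
    return total_div_n_minus_1 / reps - true_var, total_div_n / reps - true_var

bias_s2, bias_div_n = simulate_bias()
print(f"bias with divisor n-1: {bias_s2:.3f}")
print(f"bias with divisor n:   {bias_div_n:.3f}")
```

With these settings the first value should be close to 0 and the second close to -sigma^2 / n = -0.8, since the divisor-n estimator has expectation (n - 1) sigma^2 / n.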
Most Efficient Estimator and Relative Efficiency (Icelandic: skilvirkasti metillinn og hlutfallsleg skilvirkni)
Suppose there are several unbiased estimators of $\theta$. Then the unbiased estimator with the smallest variance is said to be the most efficient estimator, or the minimum variance unbiased estimator, of $\theta$. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$, based on the same number of sample observations. Then,
a) $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if $\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$
b) The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances; that is,
$$\text{Relative Efficiency} = \frac{\mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)}$$
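As a rough illustration of relative efficiency, the following sketch compares the sample mean and the sample median as estimators of the mean of a normal population. The sample size, number of replications, and seed are arbitrary assumptions; the ratio is Var(median) / Var(mean), that is, the relative efficiency of the mean with respect to the median.

```python
import random
import statistics

def relative_efficiency(n=25, reps=4000, seed=7):
    """Simulated Var(median) / Var(mean) for standard normal samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # Ratio of the two sampling variances, estimated from the replications
    return statistics.variance(medians) / statistics.variance(means)

rel_eff = relative_efficiency()
print(f"relative efficiency of mean vs. median: {rel_eff:.2f}")
```

A ratio above 1 indicates that the mean has the smaller variance; for large normal samples the ratio is known to approach pi / 2, roughly 1.57.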
Point Estimators of Selected Population Parameters
(Table 8.1)
Population Parameter | Point Estimator | Properties
Mean, $\mu$          | $\bar{X}$       | Unbiased, most efficient (assuming normality)
Mean, $\mu$          | $X_m$           | Unbiased (assuming normality), but not most efficient
Proportion, $\pi$    | $p$             | Unbiased, most efficient
Variance, $\sigma^2$ | $s^2$           | Unbiased, most efficient (assuming normality)
Confidence Interval Estimator (Icelandic: metill fyrir öryggismörk)
A confidence interval estimator for a population parameter is a rule for determining (based on sample information) a range, or interval, that is likely to include the parameter. The corresponding estimate is called a confidence interval estimate (Icelandic: öryggismörk).
Confidence Interval and Confidence Level
Let $\theta$ be an unknown parameter. Suppose that, on the basis of sample information, random variables A and B are found such that $P(A < \theta < B) = 1 - \alpha$, where $\alpha$ is any number between 0 and 1. If the specific sample values of A and B are a and b, then the interval from a to b is called a $100(1 - \alpha)\%$ confidence interval of $\theta$. The quantity $(1 - \alpha)$ is called the confidence level (Icelandic: öryggisstig) of the interval.
If the population were repeatedly sampled a very large number of times, the true value of the parameter $\theta$ would be contained in $100(1 - \alpha)\%$ of the intervals calculated this way. The confidence interval calculated in this manner is written as $a < \theta < b$ with $100(1 - \alpha)\%$ confidence.
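The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: it uses a normal population with known standard deviation and the interval sample mean ± z * sigma / sqrt(n); the values of mu, sigma, n, and the number of replications are made up for the example.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=100.0, sigma=15.0, n=30, alpha=0.05, reps=5000, seed=42):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

observed_coverage = coverage()
print(f"share of intervals containing mu: {observed_coverage:.3f}")
```

The printed share should be close to the confidence level 0.95; it is the procedure, not any single interval, that has this long-run success rate.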
P(-1.96 < Z < 1.96) = 0.95, where Z is a Standard Normal Variable
(Figure 8.3)
[Figure 8.3: standard normal density with central area 0.95 = P(-1.96 < Z < 1.96) and area 0.025 in each tail, beyond -1.96 and 1.96]
Notation (Icelandic: táknmálsnotkun)
Let $Z_{\alpha/2}$ be the number for which
$$P(Z > Z_{\alpha/2}) = \frac{\alpha}{2}$$
where the random variable Z follows a standard normal distribution.
Selected Values $Z_{\alpha/2}$ from the Standard Normal Distribution Table
(Table 8.2)
$\alpha$         | 0.01 | 0.02 | 0.05 | 0.10
$Z_{\alpha/2}$   | 2.58 | 2.33 | 1.96 | 1.645
Confidence Level | 99%  | 98%  | 95%  | 90%
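The tabulated values can be reproduced without a printed table. A minimal sketch, assuming Python 3.8 or later, where `statistics.NormalDist.inv_cdf` gives standard normal quantiles; the helper name `z_upper` is invented for this example.

```python
from statistics import NormalDist

def z_upper(alpha):
    """Return z such that P(Z > z) = alpha / 2 for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.01, 0.02, 0.05, 0.10):
    level = 100 * (1 - alpha)
    print(f"alpha = {alpha:.2f}  z = {z_upper(alpha):.3f}  confidence level = {level:.0f}%")
```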
Confidence Intervals for the Mean of a Population that is Normally Distributed: Population Variance Known (Icelandic: öryggismörk fyrir meðaltal þýðis sem er normaldreift og með þekkta dreifni)
Consider a random sample of n observations from a normal distribution with mean $\mu$ and variance $\sigma^2$. If the sample mean is $\bar{X}$, then a $100(1 - \alpha)\%$ confidence interval for the population mean with known variance is given by
$$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
or equivalently,
$$\bar{X} \pm B$$
where the margin of error (also called the sampling error, the bound, or the interval half width) is given by
$$B = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Basic Terminology for Confidence Interval for a Population Mean with Known Population Variance (Icelandic: orðnotkun fyrir öryggismörk þýðismeðaltals með þekktri dreifni)
(Table 8.3)
Terms                                               | Symbol             | To Obtain
Standard Error of the Mean                          | $\sigma_{\bar{X}}$ | $\sigma / \sqrt{n}$
Z Value (also called the Reliability Factor)        | $Z_{\alpha/2}$     | Use Standard Normal Distribution Table
Margin of Error (Icelandic: skekkjumörk)            | B                  | $B = Z_{\alpha/2}\,\sigma/\sqrt{n}$
Lower Confidence Limit (Icelandic: neðri mörk)      | LCL                | $LCL = \bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n}$
Upper Confidence Limit (Icelandic: efri mörk)       | UCL                | $UCL = \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}$
Width (width is twice the bound; Icelandic: breidd) | w                  | $w = 2B = 2 Z_{\alpha/2}\,\sigma/\sqrt{n}$
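The quantities in Table 8.3 can be computed together. The sketch below assumes a known population standard deviation; the function name and the input values (sample mean 50, sigma 10, n = 25) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_variance(xbar, sigma, n, alpha=0.05):
    """Standard error, reliability factor, margin of error, limits and width."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor Z_{alpha/2}
    se = sigma / sqrt(n)                      # standard error of the mean
    b = z * se                                # margin of error B
    return {"standard_error": se, "z": z, "B": b,
            "LCL": xbar - b, "UCL": xbar + b, "width": 2 * b}

result = mean_ci_known_variance(xbar=50.0, sigma=10.0, n=25)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Here the standard error is 10 / 5 = 2, so the margin of error is about 1.96 * 2 = 3.92 and the width is twice that.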
Student's t Distribution
Given a random sample of n observations, with mean $\bar{X}$ and standard deviation s, from a normally distributed population with mean $\mu$, the variable t follows the Student's t distribution with (n - 1) degrees of freedom and is given by
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
Notation (Icelandic: táknmálsnotkun)
A random variable having the Student's t distribution with v degrees of freedom will be denoted $t_v$. The value $t_{v,\alpha/2}$ is defined as the number for which
$$P(t_v > t_{v,\alpha/2}) = \alpha/2$$
Confidence Intervals for the Mean of a Normal Population: Population Variance Unknown (Icelandic: öryggismörk fyrir vongildi í normaldreifðu þýði með óþekktri dreifni)
Suppose there is a random sample of n observations from a normal distribution with mean $\mu$ and unknown variance. If the sample mean and standard deviation are, respectively, $\bar{X}$ and s, then a $100(1 - \alpha)\%$ confidence interval for the population mean, variance unknown, is given by
$$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
or equivalently,
$$\bar{X} \pm B$$
where the margin of error, the sampling error, or bound, B, is given by
$$B = t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
and $t_{n-1,\alpha/2}$ is the number for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$$
The random variable $t_{n-1}$ has a Student's t distribution with v = (n - 1) degrees of freedom.
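A short sketch of the interval sample mean ± t * s / sqrt(n). Python's standard library has no t quantile function, so the critical value is passed in as read from a t table (2.262 for 9 degrees of freedom and alpha = 0.05); the data values are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci_t(data, t_crit):
    """Return (LCL, UCL, B) for the mean when the variance is unknown."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                  # sample standard deviation, divisor n - 1
    b = t_crit * s / sqrt(n)         # margin of error B
    return xbar - b, xbar + b, b

data = [10, 12, 9, 11, 13, 10, 12, 11, 9, 13]   # hypothetical sample, n = 10
T_9_0025 = 2.262                                 # t table value, v = 9, alpha/2 = 0.025
lcl, ucl, b = mean_ci_t(data, T_9_0025)
print(f"{lcl:.3f} < mu < {ucl:.3f}  (B = {b:.3f})")
```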
Confidence Intervals for Population Proportion (Large Samples) (Icelandic: öryggismörk fyrir þýðishlutfall, stór úrtök)
Let p denote the observed proportion of "successes" in a random sample of n observations from a population with a proportion $\pi$ of successes. Then, if n is large enough that $n\pi(1 - \pi) > 9$, a $100(1 - \alpha)\%$ confidence interval for the population proportion is given by
$$p - Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} < \pi < p + Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$$
or equivalently,
$$p \pm B$$
where the margin of error, the sampling error, or bound, B, is given by
$$B = Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$$
and $Z_{\alpha/2}$ is the number for which a standard normal variable Z satisfies
$$P(Z > Z_{\alpha/2}) = \alpha/2$$
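The large-sample interval for a proportion can be sketched as follows. Because the population proportion is unknown in practice, this example checks the size condition n * p * (1 - p) > 9 with the sample proportion p as a stand-in, which is an assumption of the sketch; the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Return (LCL, UCL, B) for a population proportion, large-sample case."""
    p = successes / n
    if n * p * (1 - p) <= 9:          # rule of thumb, evaluated at the sample p
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(p * (1 - p) / n)     # margin of error B
    return p - b, p + b, b

lcl, ucl, b = proportion_ci(successes=160, n=400)   # p = 0.40
print(f"{lcl:.3f} < proportion < {ucl:.3f}  (B = {b:.3f})")
```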
Notation (Icelandic: táknmálsnotkun)
A random variable having the chi-square distribution with v = n - 1 degrees of freedom will be denoted by $\chi^2_v$ or simply $\chi^2_{n-1}$. Define $\chi^2_{n-1,\alpha}$ as the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$$
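The Chi-Square Distribution
(Figure 8.17)
[Figure 8.17: chi-square density on an axis starting at 0, with the area $1 - \alpha$ and the point $\chi^2_{n-1,\alpha}$ marked]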
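The Chi-Square Distribution for n - 1 Degrees of Freedom and $(1 - \alpha)\%$ Confidence Level
(Figure 8.18)
[Figure 8.18: chi-square density with central area $1 - \alpha$ between $\chi^2_{n-1,1-\alpha/2}$ and $\chi^2_{n-1,\alpha/2}$, and area $\alpha/2$ in each tail]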
Confidence Intervals for the Variance of a Normal Population (Icelandic: öryggismörk fyrir dreifni í normaldreifðu þýði)
Suppose there is a random sample of n observations from a normally distributed population with variance $\sigma^2$. If the observed variance is $s^2$, then a $100(1 - \alpha)\%$ confidence interval for the population variance is given by
$$\frac{(n - 1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$
where $\chi^2_{n-1,\alpha/2}$ is the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
and $\chi^2_{n-1,1-\alpha/2}$ is the number for which
$$P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \frac{\alpha}{2}$$
and the random variable $\chi^2_{n-1}$ follows a chi-square distribution with (n - 1) degrees of freedom.
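A minimal sketch of the variance interval, with lower limit (n - 1) s^2 divided by the upper chi-square point and upper limit (n - 1) s^2 divided by the lower chi-square point. The two critical values are taken from a chi-square table for 9 degrees of freedom and alpha = 0.05 (19.023 and 2.700), since the standard library provides no chi-square quantiles; the data are invented.

```python
from statistics import variance

def variance_ci(data, chi2_upper, chi2_lower):
    """Return (LCL, UCL) for the population variance of a normal population.

    chi2_upper: value with upper-tail area alpha/2
    chi2_lower: value with lower-tail area alpha/2
    """
    n = len(data)
    s2 = variance(data)                      # sample variance, divisor n - 1
    # Dividing by the larger critical value gives the lower limit
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

data = [10, 12, 9, 11, 13, 10, 12, 11, 9, 13]   # hypothetical sample, n = 10
var_lcl, var_ucl = variance_ci(data, chi2_upper=19.023, chi2_lower=2.700)
print(f"{var_lcl:.3f} < variance < {var_ucl:.3f}")
```

Note that the interval is not symmetric around the sample variance, because the chi-square distribution is skewed.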
Confidence Intervals for Two Means: Matched Pairs (Icelandic: öryggismörk fyrir tvö vongildi, pör)
Suppose that there is a random sample of n matched pairs of observations from normal distributions with means $\mu_X$ and $\mu_Y$. That is, $x_1, x_2, \ldots, x_n$ denote the values of the observations from the population with mean $\mu_X$, and $y_1, y_2, \ldots, y_n$ the matched sampled values from the population with mean $\mu_Y$. Let $\bar{d}$ and $s_d$ denote the observed sample mean and standard deviation for the n differences $d_i = x_i - y_i$. If the population distribution of the differences is assumed to be normal, then a $100(1 - \alpha)\%$ confidence interval for the difference between means ($\mu_d = \mu_X - \mu_Y$) is given by
$$\bar{d} - t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$
or equivalently,
$$\bar{d} \pm B$$
Confidence Intervals for Two Means: Matched Pairs
(continued)
where the margin of error, the sampling error, or the bound, B, is given by
$$B = t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}}$$
and $t_{n-1,\alpha/2}$ is the number for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
The random variable $t_{n-1}$ has a Student's t distribution with (n - 1) degrees of freedom.
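The matched-pairs interval reduces to a one-sample t interval on the differences d_i = x_i - y_i. In this sketch the paired observations are invented and the critical value 2.776 is the t table entry for 4 degrees of freedom and alpha = 0.05.

```python
from math import sqrt
from statistics import mean, stdev

def matched_pairs_ci(x, y, t_crit):
    """Return (LCL, UCL, B) for the mean difference of matched pairs."""
    if len(x) != len(y):
        raise ValueError("matched pairs need equally long samples")
    d = [xi - yi for xi, yi in zip(x, y)]    # differences d_i = x_i - y_i
    n = len(d)
    b = t_crit * stdev(d) / sqrt(n)          # margin of error B
    return mean(d) - b, mean(d) + b, b

x = [12, 15, 11, 14, 13]     # hypothetical first measurements
y = [10, 14, 12, 11, 11]     # hypothetical matched second measurements
pair_lcl, pair_ucl, pair_b = matched_pairs_ci(x, y, t_crit=2.776)
print(f"{pair_lcl:.3f} < mu_d < {pair_ucl:.3f}  (B = {pair_b:.3f})")
```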
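Confidence Intervals for Difference Between Means: Independent Samples (Normal Distributions and Known Population Variances) (Icelandic: öryggismörk fyrir mismun vongilda, óháð úrtök)
Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and variances $\sigma^2_X$ and $\sigma^2_Y$. If the observed sample means are $\bar{X}$ and $\bar{Y}$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}}$$
or equivalently,
$$(\bar{X} - \bar{Y}) \pm B$$
where the margin of error is given by
$$B = Z_{\alpha/2}\sqrt{\frac{\sigma^2_X}{n_x} + \frac{\sigma^2_Y}{n_y}}$$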
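A brief sketch of the known-variance interval for a difference of means, with margin of error z * sqrt(var_x / n_x + var_y / n_y). All input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci_known(xbar, ybar, var_x, var_y, n_x, n_y, alpha=0.05):
    """Return (LCL, UCL, B) for mu_X - mu_Y with known population variances."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(var_x / n_x + var_y / n_y)   # margin of error B
    diff = xbar - ybar
    return diff - b, diff + b, b

dm_lcl, dm_ucl, dm_b = diff_means_ci_known(
    xbar=52.0, ybar=48.0, var_x=36.0, var_y=64.0, n_x=36, n_y=64)
print(f"{dm_lcl:.3f} < mu_X - mu_Y < {dm_ucl:.3f}  (B = {dm_b:.3f})")
```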
Confidence Intervals for Two Means: Unknown Population Variances that are Assumed to be Equal (Icelandic: öryggismörk fyrir mismun vongilda, óþekkt dreifni en dreifnin er eins skv. forsendu)
Suppose that there are two independent random samples with $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$ and a common, but unknown, population variance. If the observed sample means are $\bar{X}$ and $\bar{Y}$, and the observed sample variances are $s^2_X$ and $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}}$$
or equivalently,
$$(\bar{X} - \bar{Y}) \pm B$$
where the margin of error is given by
$$B = t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s^2_p}{n_x} + \frac{s^2_p}{n_y}}$$
Confidence Intervals for Two Means: Unknown Population Variances that are Assumed to be Equal
(continued)
The pooled sample variance, $s^2_p$, is given by
$$s^2_p = \frac{(n_x - 1)s^2_X + (n_y - 1)s^2_Y}{n_x + n_y - 2}$$
$t_{n_x+n_y-2,\alpha/2}$ is the number for which
$$P(t_{n_x+n_y-2} > t_{n_x+n_y-2,\alpha/2}) = \frac{\alpha}{2}$$
The random variable T approximately follows a Student's t distribution with $n_X + n_Y - 2$ degrees of freedom, and T is given by
$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{s_p\sqrt{\dfrac{1}{n_X} + \dfrac{1}{n_Y}}}$$
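The pooled variance and the resulting interval can be sketched as below. The two samples are invented and are treated as independent; the critical value 2.306 is the t table entry for n_x + n_y - 2 = 8 degrees of freedom and alpha = 0.05.

```python
from math import sqrt
from statistics import mean, variance

def pooled_ci(x, y, t_crit):
    """Return (LCL, UCL, B, pooled variance) assuming equal population variances."""
    n_x, n_y = len(x), len(y)
    # Weighted average of the two sample variances
    s2p = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    b = t_crit * sqrt(s2p / n_x + s2p / n_y)     # margin of error B
    diff = mean(x) - mean(y)
    return diff - b, diff + b, b, s2p

sample_x = [12, 15, 11, 14, 13]   # hypothetical independent sample 1
sample_y = [10, 14, 12, 11, 11]   # hypothetical independent sample 2
pl_lcl, pl_ucl, pl_b, pl_s2p = pooled_ci(sample_x, sample_y, t_crit=2.306)
print(f"pooled variance = {pl_s2p:.3f}")
print(f"{pl_lcl:.3f} < mu_X - mu_Y < {pl_ucl:.3f}  (B = {pl_b:.3f})")
```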
Confidence Intervals for Two Means: Unknown Population Variances, Assumed Not Equal
Suppose that there are two independent random samples of $n_x$ and $n_y$ observations from normally distributed populations with means $\mu_X$ and $\mu_Y$, and it is assumed that the population variances are not equal. If the observed sample means and variances are $\bar{X}$, $\bar{Y}$ and $s^2_X$, $s^2_Y$, then a $100(1 - \alpha)\%$ confidence interval for $(\mu_X - \mu_Y)$ is given by
$$(\bar{X} - \bar{Y}) - t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}} < \mu_X - \mu_Y < (\bar{X} - \bar{Y}) + t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}}$$
where the margin of error is given by
$$B = t_{v,\alpha/2}\sqrt{\frac{s^2_X}{n_x} + \frac{s^2_Y}{n_y}}$$
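Confidence Intervals for Two Means: Unknown Population Variances, Assumed Not Equal
(continued)
The degrees of freedom, v, is given by
$$v = \frac{\left[\left(\dfrac{s^2_X}{n_X}\right) + \left(\dfrac{s^2_Y}{n_Y}\right)\right]^2}{\left(\dfrac{s^2_X}{n_X}\right)^2 / (n_X - 1) + \left(\dfrac{s^2_Y}{n_Y}\right)^2 / (n_Y - 1)}$$
If the sample sizes are equal, then the degrees of freedom reduces to
$$v = \left(1 + \frac{2}{\dfrac{s^2_X}{s^2_Y} + \dfrac{s^2_Y}{s^2_X}}\right)(n - 1)$$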
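The degrees-of-freedom formula and its equal-sample-size reduction can be checked numerically against each other. The function names and the sample variances (4 and 9, with n = 12 in each sample) are assumptions chosen for this sketch.

```python
def welch_df(s2x, s2y, n_x, n_y):
    """Approximate degrees of freedom when the variances are not assumed equal."""
    a, b = s2x / n_x, s2y / n_y
    return (a + b) ** 2 / (a ** 2 / (n_x - 1) + b ** 2 / (n_y - 1))

def welch_df_equal_n(s2x, s2y, n):
    """Reduced form of the same quantity when both samples have size n."""
    return (1 + 2 / (s2x / s2y + s2y / s2x)) * (n - 1)

v_general = welch_df(4.0, 9.0, 12, 12)
v_reduced = welch_df_equal_n(4.0, 9.0, 12)
print(f"general formula: v = {v_general:.3f}")
print(f"equal-n formula: v = {v_reduced:.3f}")
```

Both expressions give the same value, which lies between n - 1 and 2(n - 1); it reaches 2(n - 1) only when the two sample variances are equal.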
Confidence Intervals for the Difference Between Two Population Proportions (Large Samples) (Icelandic: öryggismörk fyrir mismun þýðishlutfalla, stór úrtök)
Let $p_X$ denote the observed proportion of successes in a random sample of $n_X$ observations from a population with proportion $\pi_X$ of successes, and let $p_Y$ denote the proportion of successes observed in an independent random sample of $n_Y$ observations from a population with proportion $\pi_Y$ of successes. Then, if the sample sizes are large (generally at least forty observations in each sample), a $100(1 - \alpha)\%$ confidence interval for the difference between population proportions $(\pi_X - \pi_Y)$ is given by
$$(p_X - p_Y) \pm B$$
where the margin of error is
$$B = Z_{\alpha/2}\sqrt{\frac{p_X(1 - p_X)}{n_X} + \frac{p_Y(1 - p_Y)}{n_Y}}$$
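A compact sketch of the interval for a difference of proportions, (p_X - p_Y) ± z * sqrt(p_X (1 - p_X) / n_X + p_Y (1 - p_Y) / n_Y). The sample proportions and sizes are hypothetical, and the check for at least forty observations per sample follows the rule of thumb quoted above.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(p_x, n_x, p_y, n_y, alpha=0.05):
    """Return (LCL, UCL, B) for the difference of two population proportions."""
    if min(n_x, n_y) < 40:            # large-sample rule of thumb
        raise ValueError("each sample should have at least 40 observations")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    b = z * sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff - b, diff + b, b

dp_lcl, dp_ucl, dp_b = diff_proportions_ci(p_x=0.5, n_x=100, p_y=0.4, n_y=100)
print(f"{dp_lcl:.3f} < difference < {dp_ucl:.3f}  (B = {dp_b:.3f})")
```

In this example the interval includes 0, so the data would not rule out equal population proportions at the 95% level.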
Sample Size for the Mean of a Normally Distributed Population with Known Population Variance (Icelandic: stærð gagnasafns fyrir vongildi normaldreifðs þýðis með þekktri þýðisdreifni)
Suppose that a random sample from a normally distributed population with known variance $\sigma^2$ is selected. Then a $100(1 - \alpha)\%$ confidence interval for the population mean extends a distance B (sometimes called the bound, sampling error, or the margin of error) on each side of the sample mean, if the sample size, n, is
$$n = \frac{Z^2_{\alpha/2}\,\sigma^2}{B^2}$$
Sample Size for Population Proportion (Icelandic: stærð gagnasafns fyrir þýðishlutfall)
Suppose that a random sample is selected from a population. Then a $100(1 - \alpha)\%$ confidence interval for the population proportion, extending a distance of at most B on each side of the sample proportion, can be guaranteed if the sample size, n, is
$$n = \frac{0.25\,(Z_{\alpha/2})^2}{B^2}$$
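The two sample-size formulas, n = z^2 sigma^2 / B^2 for a mean and n = 0.25 z^2 / B^2 for a proportion, can be sketched as follows. Rounding up to the next whole observation is a practical convention added here, and the target margins of error are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, bound, alpha=0.05):
    """Smallest n giving margin of error <= bound for a mean, known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z ** 2 * sigma ** 2 / bound ** 2)

def sample_size_proportion(bound, alpha=0.05):
    """Smallest n guaranteeing margin of error <= bound for a proportion.

    Uses the worst case p(1 - p) = 0.25, which occurs at p = 0.5.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(0.25 * z ** 2 / bound ** 2)

n_mean = sample_size_mean(sigma=10.0, bound=2.0)
n_prop = sample_size_proportion(bound=0.03)
print(f"n for the mean:       {n_mean}")
print(f"n for the proportion: {n_prop}")
```

The factor 0.25 is the largest possible value of p(1 - p), which is why the proportion formula guarantees the bound whatever the sample proportion turns out to be.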
Key Words
Bias
Bound
Confidence interval:
  For mean, known variance
  For mean, unknown variance
  For proportion
  For two means, matched
  For two means, variances equal
  For two means, variances not equal
  For variance
Confidence Level
Estimate
Estimator
Interval Half Width
Lower Confidence Limit (LCL)
Margin of Error
Minimum Variance Unbiased Estimator
Most Efficient Estimator
Point Estimate
Point Estimator
Key Words
(continued)
Relative Efficiency
Reliability Factor
Sample Size for Mean, Known Variance
Sample Size for Proportion
Sampling Error
Student's t
Unbiased Estimator
Upper Confidence Limit (UCL)
Width