1
Schaum’s Outline
PROBABILITY and STATISTICS
Chapter 6
ESTIMATION THEORY
Presented by Professor Carol Dahl
Examples from
D. Salvitti
J. Mazumdar
C. Valencia
2
Outline Chapter 6
Trader in energy stocks
random variable Y = value of share
want estimates of µY, σY
Y = β0 + β1X + ε
want estimates Ŷ, b0, b1
Properties of estimators
unbiased estimates
efficient estimates
3
Outline Chapter 6
Types of estimators
Point estimates
µ=7
Interval estimates
µ = 7+/-2
confidence interval
4
Outline Chapter 6
Population parameters and confidence intervals
Means
Large sample sizes
Small sample sizes
Proportions
Differences and Sums
Variances
Variances ratios
5
Properties of Estimators - Unbiased
Unbiased Estimator of Population Parameter
expected value of the estimator is equal to the population parameter
$E(\hat{\theta}) = \theta$
6
Unbiased Estimates
Population Parameters: $\mu$; $\sigma^2$
Sample Parameters: $\bar{X}$, $\hat{S}^2$
$E(\bar{X}) = \mu$
$E(\hat{S}^2) = \sigma^2$
$\bar{X}$; $\hat{S}^2$ are unbiased estimates
Expected value of standard deviation not unbiased
$E(\hat{S}) \neq \sigma$
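The unbiasedness claims can be checked by brute force on a tiny example. The sketch below assumes a hypothetical five-value population (not from the slides) and enumerates every equally likely sample of size 2 drawn with replacement, then averages the sample mean, the sample variance (divisor n - 1), and the sample standard deviation.

```python
import itertools
import math
import statistics

# Hypothetical population, chosen only for illustration.
population = [1.0, 2.0, 3.0, 4.0, 5.0]
n = 2  # sample size

pop_mean = statistics.fmean(population)
pop_var = statistics.pvariance(population)  # sigma^2, divisor N
pop_sd = math.sqrt(pop_var)

# Every equally likely sample of size n drawn with replacement.
samples = list(itertools.product(population, repeat=n))

# Averages over all samples play the role of expected values.
mean_of_means = statistics.fmean(statistics.fmean(s) for s in samples)
mean_of_vars = statistics.fmean(statistics.variance(s) for s in samples)  # divisor n - 1
mean_of_sds = statistics.fmean(statistics.stdev(s) for s in samples)

print(mean_of_means, pop_mean)  # equal: sample mean is unbiased
print(mean_of_vars, pop_var)    # equal: sample variance (n - 1) is unbiased
print(mean_of_sds, pop_sd)      # average s falls below sigma: s is biased
```

Exhaustive enumeration is used instead of random simulation so that the comparison is exact rather than subject to sampling noise.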
7
Properties of Estimators - Efficient
Efficient Estimator:
if sampling distributions of two statistics have the same mean
more efficient estimator = the one with smaller variance
efficient = smallest variance of all unbiased estimators
8
Unbiased and Efficient Estimates
Target
Estimates which are efficient and unbiased
Not always possible
often use biased or inefficient estimates
because they are easy to obtain
9
Types of Estimates for Population
Parameter
Point Estimate
single number
Interval Estimate
between two numbers.
10
Estimates of Mean – Known Variance
Large Sample or Finite with Replacement
X = value of share
sample mean is $32
volatility is known: σ² = 4.00 (so σ = $2)
confidence interval for share value
Need
estimator for mean
need statistic with
mean of population
estimator
11
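Estimates of Mean - Sampling Statistic
$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
$P\left(-1.96 < \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 95\%$
[Figure: standard normal density with 2.5% in each tail]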
12
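Confidence Interval for Mean
$P\left(-1.96 < \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 95\%$
$P\left(-1.96\,\sigma/\sqrt{n} < \bar{X} - \mu < 1.96\,\sigma/\sqrt{n}\right) = 95\%$
$P\left(-1.96\,\sigma/\sqrt{n} - \bar{X} < -\mu < 1.96\,\sigma/\sqrt{n} - \bar{X}\right) = 95\%$
Multiply by -1, which changes direction of inequality
$P\left(1.96\,\sigma/\sqrt{n} + \bar{X} > \mu > -1.96\,\sigma/\sqrt{n} + \bar{X}\right) = 95\%$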
13
Confidence Interval for Mean
$P\left(1.96\,\sigma/\sqrt{n} + \bar{X} > \mu > -1.96\,\sigma/\sqrt{n} + \bar{X}\right) = 95\%$
Rearrange
$P\left(\bar{X} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{X} + 1.96\,\sigma/\sqrt{n}\right) = 95\%$
Plug in sample values and drop probabilities
X = value of share, sample size n = 64
sample mean is $32
volatility is σ² = 4, so σ = 2
{32 - 1.96*2/√64, 32 + 1.96*2/√64} = {31.51, 32.49}
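The arithmetic of the share-value interval can be reproduced in a few lines. This is a minimal sketch using the slide's values (sample mean 32, variance 4, n = 64); the variable names are illustrative, and the critical value comes from Python's standard-library `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

x_bar = 32.0             # sample mean from the slide
sigma = math.sqrt(4.0)   # known variance 4, so sigma = 2
n = 64                   # sample size
alpha = 0.05             # 95% confidence

# Critical value z_c with P(Z < z_c) = 1 - alpha/2, about 1.96
z_c = NormalDist().inv_cdf(1 - alpha / 2)

half_width = z_c * sigma / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"{lower:.2f} to {upper:.2f}")  # prints 31.51 to 32.49
```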
14
Estimates for Mean for Normal
Take a sample
point estimate
compute sample mean
interval estimate at 0.95 (95%) = (1 - 0.05)
$\bar{X} \pm 1.96\,\sigma/\sqrt{n}$
$\bar{X} \pm Z_c\,\sigma/\sqrt{n}$
$P(Z < Z_c) = 0.975 = (1 - 0.05/2)$
95% of intervals contain $\mu$
5% of intervals do not contain $\mu$
15
Estimates for Mean for Normal
interval estimate at 0.95 (95%) = (1 - 0.05)
$\bar{X} \pm Z_c\,\sigma/\sqrt{n}$
$P(Z < Z_c) = 0.975 = (1 - 0.05/2)$
interval estimate at $(1-\alpha)100\%$
$\bar{X} \pm Z_c\,\sigma/\sqrt{n}$
$P(Z < Z_c) = (1 - \alpha/2)$
$100\alpha\%$ of intervals don't contain $\mu$
$100(1-\alpha)\%$ of intervals do contain $\mu$
16
Confidence Interval Estimates
of Population Parameters
Common values for $z_c$ corresponding to various
confidence levels used in practice are:
Confidence level   99.73%   99%    98%    96%    95.45%   95%    90%     68.27%   50%
z_c                3.00     2.58   2.33   2.05   2.00     1.96   1.645   1.00     0.6745
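The tabulated critical values can be regenerated from the inverse normal CDF. The helper name `z_critical` below is an illustrative choice, not something from the slides; it relies only on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Two-sided critical value z_c with P(-z_c < Z < z_c) = confidence_level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


levels = [0.9973, 0.99, 0.98, 0.96, 0.9545, 0.95, 0.90, 0.6827, 0.50]
table = {level: z_critical(level) for level in levels}

for level, z in table.items():
    print(f"{level:.2%}  z_c = {z:.4f}")
```

Small differences from the printed table in the last digit are rounding effects.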
17
Confidence Interval Estimates
of Population Parameters
Functions in EXCEL
Menu: click on Insert Function, or type
=confidence(α,stdev,n)
=confidence(0.05,2,64) = 0.49
$\bar{X}$ +/- confidence(0.05,2,64)
=normsinv(1-α/2) gives Zc value
$\bar{X}$ +/- normsinv(1-α/2)*σ/√n
32 +/- 1.96*2/√64
18
Confidence Intervals for Means
Finite Population (N) no Replacement
$\bar{X} \pm Z_c\,\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$
Confidence interval @ $(1-\alpha)100\%$ confidence level
19
Example: Finite Population without
replacement
Evaluate density of oil in new reservoir
81 samples of oil (n)
from population of 500 different wells
sample average density is 29 °API
standard deviation is known to be 9 °API
α = 0.05
20
Confidence Intervals for Means
Finite Population (N) no Replacement
Known Variance
$\bar{X} = 29$, N = 500, n = 81, σ = 9, α = 0.05
Zc = 1.96
$\bar{X} \pm Z_c\,\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$
So $\mu = 29 \pm 1.96\cdot\dfrac{9}{\sqrt{81}}\sqrt{\dfrac{500-81}{500-1}} = 29 \pm 1.80$
or $27.20 \le \mu \le 30.80$
@ 95%
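The oil-density example can be checked numerically. This sketch plugs the slide's values into the finite-population formula; the variable names (`fpc` for the finite population correction, and so on) are illustrative.

```python
import math
from statistics import NormalDist

x_bar = 29.0   # sample average density, degrees API
sigma = 9.0    # known standard deviation
N = 500        # population size (wells)
n = 81         # sample size
alpha = 0.05

z_c = NormalDist().inv_cdf(1 - alpha / 2)

# Finite population correction for sampling without replacement
fpc = math.sqrt((N - n) / (N - 1))

half_width = z_c * sigma / math.sqrt(n) * fpc
lower, upper = x_bar - half_width, x_bar + half_width
print(f"{x_bar} +/- {half_width:.2f}: {lower:.2f} to {upper:.2f}")
```

The correction factor is below 1, so the interval is narrower than the one obtained by ignoring the finite population.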
21
But don't know Variance
t-Distribution
$\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$
$\dfrac{(n-1)\hat{s}^2}{\sigma^2} \sim \chi^2_{n-1}$
$\dfrac{N(0,1)}{\sqrt{\chi^2_{df}/df}} = t_{df}$
22
Confidence Intervals of Means
t-distribution
$t_{n-1} = \dfrac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)\hat{s}^2}{\sigma^2}\cdot\dfrac{1}{n-1}}} = \dfrac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\hat{s}^2/\sigma^2}} = \dfrac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$
23
Confidence Intervals of Means
Normal compared to t- distribution
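Normal:
$\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$ gives interval $\bar{X} \pm Z_c\,\sigma/\sqrt{n}$
t distribution:
$\dfrac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$ gives interval $\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$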
24
Confidence Interval Unknown Variance
Example:
Eight independent measurements of diameter of drill bit
3.236, 3.223, 3.242, 3.244, 3.228, 3.253, 3.253, 3.230
99% confidence interval for diameter of drill bit
$\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$
25
Confidence Intervals for Means
Unknown Variance
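$\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$
$\bar{X} = \sum X_i/n = (3.236+3.223+3.242+3.244+3.228+3.253+3.253+3.230)/8$
$\bar{X} = 3.239$
$\hat{s}^2 = \dfrac{\sum (X_i - \bar{X})^2}{n-1} = \dfrac{(3.236-\bar{X})^2 + \dots + (3.230-\bar{X})^2}{8-1}$
$\hat{s} = 0.0113$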
26
Confidence Intervals for Means
Unknown Variance
$\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$
$\bar{X} = 3.239$, n = 8, ŝ = 0.0113, α = 0.01,
1 - α/2 = 0.995
[Figure: t density with area α/2 = 0.005 in each tail, beyond -tc and tc]
Find tc from Table or Excel
27
Confidence Intervals for Means
Unknown Variance
1 - α/2 = 0.995
α/2 = 0.005
[Figure: t density with critical values -tc and tc = 3.499483]
Depends on Table
GHJ: α/2 = 0.005, tc = 3.499
Schaum's: 1 - α/2 = 0.995, tc = 3.50
Excel: =tinv(0.01,7) = 3.499483
28
Confidence Intervals for Means
Unknown Variance
$\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$
$\bar{X} = 3.239$, n = 8, ŝ = 0.0113, α = 0.01,
So diam $= 3.239 \pm 3.50\cdot\dfrac{0.0113}{\sqrt{8}}$
or
$3.225 \le$ diam $\le 3.253$
@ 99%
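The drill-bit interval can be recomputed from the raw measurements. In this sketch the mean and the sample standard deviation come from the data, while the critical value is copied from the slide's Excel result (`=tinv(0.01,7)`) rather than computed, since the Python standard library has no t quantile function.

```python
import math
import statistics

diameters = [3.236, 3.223, 3.242, 3.244, 3.228, 3.253, 3.253, 3.230]

n = len(diameters)
x_bar = statistics.fmean(diameters)
s_hat = statistics.stdev(diameters)  # divisor n - 1

# Assumed input: two-sided 99% critical value for 7 degrees of freedom,
# taken from the slide (Excel =tinv(0.01,7)), not computed here.
t_c = 3.499483

half_width = t_c * s_hat / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"mean={x_bar:.3f} s={s_hat:.4f} interval=({lower:.3f}, {upper:.3f})")
```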
29
Confidence Intervals for Proportions
Example
600 engineers surveyed
250 in favor of drilling a second exploratory well
95% confidence interval for
proportion in favor of drilling the second well
Approximate by Normal in large samples
Solution: n = 600, X = 250 (successes), α = 0.05
zc = 1.96
and $P = \dfrac{250}{600} = 0.4167 = 41.7\%$
31
Confidence Intervals for Proportions
sampling from large population
or finite one with replacement
$p \pm z_c\sqrt{\dfrac{p(1-p)}{n}}$
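Applying this formula to the survey of 600 engineers (250 in favor) gives a concrete interval. The sketch below is a straightforward evaluation under the large-sample normal approximation; the variable names are illustrative and the resulting bounds are computed here, not quoted from the slides.

```python
import math
from statistics import NormalDist

n = 600          # engineers surveyed
successes = 250  # in favor of drilling the second well
alpha = 0.05

p_hat = successes / n
z_c = NormalDist().inv_cdf(1 - alpha / 2)

# Normal approximation: standard error of a sample proportion
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
half_width = z_c * std_error
lower, upper = p_hat - half_width, p_hat + half_width
print(f"p = {p_hat:.4f}, interval = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 37.7% to 45.6%, so a majority in favor is not supported at the 95% level.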
32
Confidence Intervals Differences and Sums
Known Variances
Samples are independent
$\bar{X}_1 - \bar{X}_2 \pm z_c\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
33
Confidence Interval for
Differences and Sums – Known Variance
Example
sample of 200 steel milling balls
average life of 350 days - standard deviation 25 days
new model strengthened with molybdenum
sample of 150 steel balls
average life of 250 days - standard deviation 50 days
samples independent
Find 95% confidence interval for difference μ1-μ2
34
Confidence Intervals for
Differences and Sums
Example
$\bar{X}_1 - \bar{X}_2 \pm z_c\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Solution: $\bar{X}_1 = 350$, σ1 = 25, n1 = 200, $\bar{X}_2 = 250$, σ2 = 50, n2 = 150
Then $\mu_1 - \mu_2 = (350 - 250) \pm 1.96\sqrt{\dfrac{25^2}{200} + \dfrac{50^2}{150}}$ or $100 \pm 8.72$
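The milling-ball calculation is easy to verify. This sketch evaluates the difference-of-means formula with the slide's sample values; variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, sigma1, n1 = 350.0, 25.0, 200  # first sample: mean life, std dev, size
x2, sigma2, n2 = 250.0, 50.0, 150  # second sample
alpha = 0.05

z_c = NormalDist().inv_cdf(1 - alpha / 2)

# Variances of independent sample means add
std_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
diff = x1 - x2
half_width = z_c * std_error
lower, upper = diff - half_width, diff + half_width
print(f"{diff:.0f} +/- {half_width:.2f}")  # prints 100 +/- 8.72
```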

35
Confidence Intervals for
Differences and Sums - Large Samples
$(P_1 - P_2) \pm z_c\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
Where:
P1, P2 two sample proportions,
n1, n2 sizes of two samples
36
Confidence Intervals for
Differences and Sums
Example
random samples
200 drilled holes in mine 1, 150 found minerals
300 drilled holes in mine 2, 100 found minerals
Construct 95% confidence interval for difference in proportions
Solution: P1 = 150/200 = 0.75, n1 = 200, P2 = 100/300 = 0.33, n2 = 300
So $(0.75 - 0.33) \pm 1.96\sqrt{\dfrac{0.75(1-0.75)}{200} + \dfrac{0.33(1-0.33)}{300}} = 0.42 \pm 0.08$
With 95% confidence the difference of proportions lies in
[0.34, 0.50]
37
Confidence Intervals Differences and Sums
Example
Solution: P1 = 150/200 = 0.75, n1 = 200,
P2 = 100/300 = 0.33, n2 = 300
So $(0.75 - 0.33) \pm 1.96\sqrt{\dfrac{0.75(1-0.75)}{200} + \dfrac{0.33(1-0.33)}{300}} = 0.42 \pm 0.08$
With 95% confidence the difference of proportions lies in
[0.34, 0.50]
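A short computation makes it clear that 0.42 is the point estimate of the difference and 0.08 is the half-width, so the interval itself is about [0.34, 0.50]. The sketch uses the mine data from the slide, with P2 kept as the unrounded fraction 100/300; variable names are illustrative.

```python
import math
from statistics import NormalDist

hits1, n1 = 150, 200  # mine 1: holes that found minerals, holes drilled
hits2, n2 = 100, 300  # mine 2
alpha = 0.05

p1, p2 = hits1 / n1, hits2 / n2
z_c = NormalDist().inv_cdf(1 - alpha / 2)

std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
half_width = z_c * std_error
lower, upper = diff - half_width, diff + half_width
print(f"{diff:.2f} +/- {half_width:.2f} -> [{lower:.2f}, {upper:.2f}]")
```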

38
Confidence Intervals for Variances
Need statistic with
population parameter $\sigma^2$
estimate for population parameter $\hat{s}^2$
its distribution - $\chi^2$
39
Confidence Intervals for Variances
$\dfrac{(n-1)\hat{S}^2}{\sigma^2} = \dfrac{nS^2}{\sigma^2}$ has a chi-squared distribution
with n-1 degrees of freedom.
Find interval such that $\sigma^2$ lies in the interval for
95% of samples
95% confidence interval
$P\left(\chi^2_{\alpha/2\ \text{below}} < \dfrac{(n-1)\hat{s}^2}{\sigma^2} < \chi^2_{\alpha/2\ \text{above}}\right) = 1 - \alpha$
40
Confidence Intervals for Variances
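$P\left(\chi^2_{\alpha/2\ \text{below}} < \dfrac{(n-1)\hat{s}^2}{\sigma^2} < \chi^2_{\alpha/2\ \text{above}}\right) = 1 - \alpha$
Rearrange
$P\left(\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}} < \sigma^2 < \dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}\right) = 1 - \alpha$
Take square root if want confidence interval for
standard deviation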
41
Confidence Intervals for Variances and
Standard Deviations
Drop probabilities when substitute in sample values
1 - α confidence interval for variance
$\left[\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}},\ \dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}\right]$
1 - α confidence interval for standard deviation
$\left[\sqrt{\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}}},\ \sqrt{\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}}\right]$
42
Confidence Intervals for Variance
Example
Variance of amount of copper reserves
16 estimates chosen at random
ŝ² = 2.4 thousand million tons
Find 99% confidence interval for variance
Solution: ŝ² = 2.4, n = 16, α = 0.01
degrees of freedom = 16 - 1 = 15
$\left[\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}},\ \dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}\right]$
43
How to get χ² Critical Values
[Figure: χ² density, not symmetric, with area α/2 below χ² lower and area α/2 above χ² upper]
44
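How to get χ² Critical Values
[Figure: χ² density, not symmetric, with area α/2 in each tail and 1-α/2 on the other side of each critical value]
GHJ (area above): $\chi^2_{0.995} = 4.60092$, $\chi^2_{0.005} = 32.8013$
Schaum's (area below): $\chi^2_{0.005} = 4.60$, $\chi^2_{0.995} = 32.8$
Excel: =chiinv(0.995,15) = 4.60091559877155
Excel: =chiinv(0.005,15) = 32.8013206461633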
45
Confidence Intervals for Variances and
Standard Deviations
Example
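99% confidence interval for variance of reserves
$\left[\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}},\ \dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}\right]$
Solution: ŝ² = 2.4, (n-1) = 15
χ² lower = 4.60, χ² upper = 32.8
$\left[\dfrac{15 \times 2.4}{32.8},\ \dfrac{15 \times 2.4}{4.6}\right] \approx [1.10,\ 7.83]$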
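The copper-reserves interval can be evaluated directly. In this sketch the two χ² critical values are taken from the slide's rounded table values (4.60 and 32.8 for 15 degrees of freedom) as assumed inputs; the standard-deviation interval is added by taking square roots, as the slides suggest.

```python
import math

s2_hat = 2.4  # unbiased variance estimate from the slide
n = 16
df = n - 1

# Assumed inputs: chi-squared critical values for 15 df from the slide's tables
chi2_lower = 4.60  # area 0.005 below
chi2_upper = 32.8  # area 0.005 above

# The larger critical value gives the lower limit, and vice versa
var_lower = df * s2_hat / chi2_upper
var_upper = df * s2_hat / chi2_lower

sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)
print(f"variance: [{var_lower:.2f}, {var_upper:.2f}]")
print(f"std dev:  [{sd_lower:.2f}, {sd_upper:.2f}]")
```

Because the χ² distribution is not symmetric, the interval is not centered on the point estimate 2.4.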
46
Confidence Intervals for Ratio of Variances
Two independent random samples
size m and n
population variances $\sigma_1^2$, $\sigma_2^2$
estimated variances $\hat{s}_1^2$, $\hat{s}_2^2$
interested in whether variances are the same
$\sigma_1^2/\sigma_2^2$
47
Confidence Intervals for Ratio of Variances
Need statistic with
population parameter $\sigma_1^2/\sigma_2^2$
estimate for population parameter $\hat{s}_1^2/\hat{s}_2^2$
its distribution - F
48
F-Distribution
$\dfrac{\chi^2_{df_1}/df_1}{\chi^2_{df_2}/df_2} \sim F_{df_1,\,df_2}$
49
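F-Distribution
$\dfrac{\chi^2_{df_1}/df_1}{\chi^2_{df_2}/df_2} = \dfrac{\dfrac{(n_1-1)\hat{s}_1^2}{\sigma_1^2}\Big/(n_1-1)}{\dfrac{(n_2-1)\hat{s}_2^2}{\sigma_2^2}\Big/(n_2-1)} = \dfrac{\hat{s}_1^2\,\sigma_2^2}{\hat{s}_2^2\,\sigma_1^2} \sim F_{(n_1-1,\,n_2-1)}$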
50
Confidence Intervals for Ratio of Variances
Need statistic with
population parameter $\sigma_1^2/\sigma_2^2$
estimate for population parameter $\hat{s}_1^2/\hat{s}_2^2$
its distribution - F
$\dfrac{\hat{s}_1^2\,\sigma_2^2}{\hat{s}_2^2\,\sigma_1^2} \sim F_{(n_1-1,\,n_2-1)}$
51
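Confidence Intervals for Ratio of Variances
$P\left(F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2} < \dfrac{\hat{s}_1^2\,\sigma_2^2}{\hat{s}_2^2\,\sigma_1^2} < F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}\right) = 1 - \alpha$
Rearrange
$P\left(\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}} > \dfrac{\sigma_1^2}{\sigma_2^2} > \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}}\right) = 1 - \alpha$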
52
Confidence Intervals for Ratio of Variances
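Put smallest first, largest second
$P\left(\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}} < \dfrac{\sigma_1^2}{\sigma_2^2} < \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}}\right) = 1 - \alpha$
When substitute in values drop probabilities
1 - α confidence interval for $\sigma_1^2/\sigma_2^2$
$\left[\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}},\ \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}}\right]$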
53
Confidence Intervals for Variances
Example
Two nickel ore samples
of sizes 16 and 10
unbiased estimates of variances 24 and 18
Find 90% confidence limits for ratio of variances
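Solution: $\hat{s}_1^2 = 24$, n1 = 16, $\hat{s}_2^2 = 18$, n2 = 10, α = 0.10
$\left[\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}},\ \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}}\right]$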
54
Confidence Intervals for Ratio of Variances
[Figure: F density with df1, df2 degrees of freedom, area α/2 below F lower and area α/2 above F upper]
Tables
GHJ (area above): $F_{0.95,15,9}$ = ?, $F_{0.05,15,9}$ = 3.01
Schaum's (area below): $F_{0.05,15,9}$ = ?, $F_{0.95,15,9}$ = 3.01
Excel (area above): =Finv(0.95,15,9) = 0.386454546279388
Excel (area above): =Finv(0.05,15,9) = 3.00610197251669
55
Confidence Intervals for Ratio of Variances
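GHJ area above $F_{0.95,15,9}$
$P(F_{15,9} > F_c) = 0.95$
$P(1/F_{15,9} < 1/F_c) = 0.95$
But $1/F_{15,9} = F_{9,15}$
$P(F_{9,15} < 1/F_c) = 0.95$
$P(F_{9,15} > 1/F_c) = 0.05$
$1/F_c = 2.59$, so $F_c = 0.3861$
[Figure: F density with area α/2 below F lower and area α/2 above F upper]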
56
Confidence Intervals for Variances
Example
Two nickel ore samples
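Solution: $\hat{s}_1^2 = 24$, n1 = 16, $\hat{s}_2^2 = 18$, n2 = 10,
$\left[\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}},\ \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}}\right]$
$\left[\dfrac{24}{18}\cdot\dfrac{1}{3.0061},\ \dfrac{24}{18}\cdot\dfrac{1}{0.3865}\right] = [0.44,\ 3.45]$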
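The nickel-ore limits follow from two divisions once the F critical values are known. This sketch treats the Excel values quoted on the earlier slide (`Finv(0.05,15,9)` and `Finv(0.95,15,9)`) as assumed inputs rather than computing them; variable names are illustrative.

```python
s2_1, n1 = 24.0, 16  # unbiased variance estimate and size, sample 1
s2_2, n2 = 18.0, 10  # unbiased variance estimate and size, sample 2

# Assumed inputs: F critical values for (15, 9) degrees of freedom,
# copied from the slide's Excel output.
f_upper = 3.00610197251669   # area 0.05 above
f_lower = 0.386454546279388  # area 0.05 below

ratio = s2_1 / s2_2

# Dividing by the larger critical value gives the lower limit
lower = ratio / f_upper
upper = ratio / f_lower
print(f"[{lower:.2f}, {upper:.2f}]")  # prints [0.44, 3.45]
```

The interval contains 1, so at the 90% level these data are consistent with equal population variances.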
57
Maximum Likelihood Estimates
Point Estimates
x is population with density function f(x, θ)
if know θ - know the density function
e.g. $\chi^2_\nu$, where θ = ν = degrees of freedom
Poisson $\lambda^x e^{-\lambda}/x!$, where θ = λ (the mean)
If sample independently from f n times
x1, x2, . . .xn
a sample
if consider all possible samples of n
58
Maximum Likelihood Estimates
If sample independently from f n times
x1, x2, . . .xn
a sample
if consider all possible samples of n
a sampling distribution
called likelihood function
$L = f(x_1,\theta)\,f(x_2,\theta)\cdots f(x_n,\theta)$
59
Maximum Likelihood Estimates
$L = f(x_1,\theta)\,f(x_2,\theta)\cdots f(x_n,\theta)$
Pick the θ which maximizes the likelihood function
Take derivative of L with respect to θ and set it to 0
Solve for θ
Usually easier to take logs first
$\log L = \log f(x_1,\theta) + \log f(x_2,\theta) + \dots + \log f(x_n,\theta)$
60
Maximum Likelihood Estimates
$\log L = \log f(x_1,\theta) + \log f(x_2,\theta) + \dots + \log f(x_n,\theta)$
$\dfrac{1}{f(x_1,\theta)}\dfrac{\partial f(x_1,\theta)}{\partial\theta} + \dots + \dfrac{1}{f(x_n,\theta)}\dfrac{\partial f(x_n,\theta)}{\partial\theta} = 0$
Solution of this equation is maximum likelihood estimator
work out example 6.25
work out example 6.26
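As an illustration of the recipe (it is not one of the textbook examples named above), the Poisson case can be worked through numerically. For Poisson data, setting the derivative of log L to zero gives Σx/λ - n = 0, so the estimator is the sample mean. The sketch below assumes a small set of hypothetical counts and confirms by grid search that the log-likelihood peaks there.

```python
import math


def poisson_log_likelihood(lam, data):
    """log L(lambda) = sum over x of log(lambda^x * e^(-lambda) / x!)."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in data)


# Hypothetical counts, for illustration only
data = [2, 0, 3, 1, 4, 2, 2]

# Closed form from d(log L)/d(lambda) = sum(x)/lambda - n = 0
lam_hat = sum(data) / len(data)

# Numerical check: evaluate log L on a grid of candidate lambdas
grid = [k / 1000 for k in range(1, 10001)]
lam_grid = max(grid, key=lambda lam: poisson_log_likelihood(lam, data))

print(lam_hat, lam_grid)
```

A grid search is used only to keep the check transparent; the closed-form solution is what the derivative condition delivers.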
61
Sum Up Chapter 6
Y = β0 + β1X
Ŷ, b0, b1
Properties of estimators
unbiased estimates
efficient estimates
Types of estimators
Point estimates
Interval estimates
62
Sum Up Chapter 6
Y: µY, σY, Ȳ, ŝ²
In 590-690
Y = β0 + β1X
Ŷ, b0, b1
Properties of estimators
unbiased estimates
efficient estimates
Types of estimators
Point estimates
Interval estimates
63
Sum Up Chapter 6
Need statistic with
population parameter
estimate for population parameter
its distribution
64
Sum Up Chapter 6
Population parameters and confidence intervals
Mean - Normal
Know variance and population normal
Large sample size can use estimated
variance
$\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$
$\bar{X} \pm Z_c\,\sigma/\sqrt{n}$
65
Sum Up Chapter 6
Proportions
large sample approximate by normal
$p \pm z_c\sqrt{\dfrac{p(1-p)}{n}}$
Differences of means (known variance)
$\bar{X}_1 - \bar{X}_2 \pm z_c\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
66
Sum Up Chapter 6
Mean
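population normal - unknown variance
$\dfrac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$
$\bar{X} \pm t_c\,\hat{s}/\sqrt{n}$, with $t_c$ from the t distribution with n-1 degrees of freedom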
67
Sum Up Chapter 6
Variances
$\dfrac{(n-1)\hat{s}^2}{\sigma^2} \sim \chi^2_{n-1}$
$\left[\dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{above}}},\ \dfrac{(n-1)\hat{s}^2}{\chi^2_{\alpha/2\ \text{below}}}\right]$
68
Sum Up Chapter 6
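Variance ratios
$\dfrac{\hat{s}_1^2\,\sigma_2^2}{\hat{s}_2^2\,\sigma_1^2} \sim F_{(n_1-1,\,n_2-1)}$
$\left[\dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{above}\ \alpha/2}},\ \dfrac{\hat{s}_1^2}{\hat{s}_2^2}\cdot\dfrac{1}{F_{(n_1-1,\,n_2-1)\ \text{below}\ \alpha/2}}\right]$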
69
Sum Up Chapter 6
Maximum Likelihood Estimators
$L = f(x_1,\theta)\,f(x_2,\theta)\cdots f(x_n,\theta)$
Pick the θ which maximizes the function
70
End of Chapter 6!