The Central Limit Theorem



Paul Cornwell, March 31, 2011

Let X₁, …, Xₙ be independent, identically distributed random variables with mean μ and positive variance σ². Averages of these variables will be approximately normally distributed with mean μ and standard deviation σ/√n when n is large.
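As a quick illustration (not part of the original slides), the statement can be checked with a short simulation; Python/numpy stands in for the talk's R, and the choices of Exponential(1), n = 30, and the seed are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw 10,000 samples of the mean of n i.i.d. Exponential(1) variables.
# By the CLT these means should be approximately N(1, 1/sqrt(n)).
n = 30
means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)

print(means.mean())       # close to the population mean, 1
print(means.std(ddof=1))  # close to sigma/sqrt(n) = 1/sqrt(30), about 0.183
```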

 How large of a sample size is required for the Central Limit Theorem (CLT) approximation to be good?

 What is a ‘good’ approximation?

 Permits analysis of random variables even when the underlying distribution is unknown
 Estimating parameters
 Hypothesis testing
 Polling

 Performing a hypothesis test to determine whether a set of data came from a normal distribution
 Considerations:
◦ Power: the probability that a test will reject the null hypothesis when it is false
◦ Ease of use

 Problems:
◦ No test is desirable in every situation (there is no uniformly most powerful test)
◦ Some tests cannot handle the composite hypothesis of normality (i.e., a nonstandard normal)
◦ The reliability of tests is sensitive to sample size; with enough data, the null hypothesis will always be rejected

 Symmetric
 Unimodal
 Bell-shaped
 Continuous

 Skewness: measures the asymmetry of a distribution
◦ Defined as the third standardized moment
◦ Skewness of the normal distribution is 0

γ₁ = E[((X − μ)/σ)³]

estimated from a sample of size n by

g₁ = (1/n) Σ_{i=1}^{n} (Xᵢ − X̄)³ / s³

 Kurtosis: measures peakedness, or heaviness of the tails
◦ Defined as the fourth standardized moment
◦ Kurtosis of the normal distribution is 3

γ₂ = E[((X − μ)/σ)⁴]

estimated from a sample of size n by

g₂ = (1/n) Σ_{i=1}^{n} (Xᵢ − X̄)⁴ / s⁴

Binomial distribution
 Cumulative distribution function:

F(x; n, p) = Σ_{i=0}^{⌊x⌋} C(n, i) pⁱ (1 − p)ⁿ⁻ⁱ

E[X] = np
Var[X] = np(1 − p)
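The CDF above can be computed term by term from the formula; this small sketch (mine, with illustrative parameters n = 20, p = .2 matching the next slide) sanity-checks that the probabilities sum to 1:

```python
from math import comb

def binom_cdf(x, n, p):
    # F(x; n, p) = sum_{i=0}^{floor(x)} C(n, i) p^i (1-p)^(n-i)
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(int(x) + 1))

n, p = 20, 0.2
print(binom_cdf(n, n, p))  # total probability, approximately 1
print(n * p)               # E[X] = np = 4
print(n * p * (1 - p))     # Var[X] = np(1-p) = 3.2
```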

parameters: Binomial(n, p), p = .2 (theoretical values in parentheses)

                   n = 20        n = 25   n = 30   n = 50   n = 100
Kurtosis           -.0014 (.25)  .002     .0235    .0106    .005
Skewness           .3325 (1.5)   .3013    .2786    .209     .149
% outside 1.96·sd  .0434         .0743    .0363    .0496    .0574
K-S distance       .128          .116     .106     .083     .05988
Mean               3.9999        5.0007   5.997    10.001   19.997
Std Dev            1.786         2.002    2.188    2.832    4.0055

*from R
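The mean and standard deviation rows of the table can be partially re-run with a sketch like this (numpy in place of the talk's R; the replication count and seed are illustrative):

```python
import numpy as np

# For each n, draw 100,000 Binomial(n, .2) values and compare the sample
# mean and sd with the theoretical np and sqrt(np(1-p)).
rng = np.random.default_rng(3)
p = 0.2
for n in (20, 25, 30, 50, 100):
    x = rng.binomial(n, p, size=100_000)
    print(n, round(x.mean(), 4), round(x.std(ddof=1), 4),
          n * p, round(np.sqrt(n * p * (1 - p)), 4))
```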

Uniform distribution
 Cumulative distribution function:

F(x; a, b) = (x − a)/(b − a),  a ≤ x ≤ b

E[X] = (a + b)/2
Var[X] = (b − a)²/12
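The two moment formulas can be checked against a large sample; this sketch is mine, with (a, b) = (0, 50) chosen to match a column of the next slide:

```python
import numpy as np

# Check E[X] = (a+b)/2 and Var[X] = (b-a)^2/12 against a Uniform(a, b) sample.
rng = np.random.default_rng(4)
a, b = 0.0, 50.0
x = rng.uniform(a, b, size=1_000_000)
print(x.mean())  # near (a + b) / 2 = 25
print(x.var())   # near (b - a)^2 / 12, about 208.3
```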

parameters: Uniform(a, b), averages of n draws (theoretical values in parentheses)

                   n = 5          n = 5           n = 5           n = 3
                   (a,b) = (0,1)  (a,b) = (0,50)  (a,b) = (0,.1)  (a,b) = (0,50)
Kurtosis           -.236 (-1.2)   -.234           -.238           -.397
Skewness           .004 (0)       0               -.0008          -.001
% outside 1.96·sd  .0477          .04785          .048            .0468
K-S distance       .0061          .0058           .0060           .01
Mean               .4998          24.99           .0500           24.99
Std Dev            .1289 (.129)   6.468 (6.455)   .0129 (.0129)   8.326 (8.333)

*from R

Exponential distribution
 Cumulative distribution function:

F(x; λ) = 1 − e^(−λx),  x ≥ 0

E[X] = 1/λ
Var[X] = 1/λ²
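The same kind of spot check works here; λ = 2 and the sample size are illustrative choices of mine:

```python
import numpy as np

# Check E[X] = 1/lambda and Var[X] = 1/lambda^2 for Exponential(lambda = 2).
# Note numpy parameterizes by scale = 1/lambda.
rng = np.random.default_rng(5)
lam = 2.0
x = rng.exponential(scale=1 / lam, size=1_000_000)
print(x.mean())  # near 1/lambda = 0.5
print(x.var())   # near 1/lambda^2 = 0.25
```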

parameters: Exponential(λ = 1), averages of n draws (theoretical values in parentheses)

                   n = 5          n = 10       n = 15
Kurtosis           1.239 (6)      .597         .396
Skewness           .904 (2)       .630         .515
% outside 1.96·sd  .0434          .045         .0464
K-S distance       .0598          .042         .034
Mean               .9995          1.0005       .9997
Std Dev            .4473 (.4472)  .316 (.316)  .258 (.2581)

*from R

 Find n values for more distributions
 Refine criteria for quality of approximation
 Explore distributions with no mean
 Classify distributions in order to have more general guidelines for minimum sample size

Paul Cornwell, May 2, 2011

 Central Limit Theorem: averages of i.i.d. variables become normally distributed as sample size increases
 Rate of convergence depends on the underlying distribution
 What sample size is needed to produce a good approximation from the CLT?

 Real-life applications of the Central Limit Theorem
 What does kurtosis tell us about a distribution?
 What is the rationale for requiring np ≥ 5?
 What about distributions with no mean?

 Probability for total distance covered in a random walk tends toward normal
 Hypothesis testing
 Confidence intervals (polling)
 Signal processing, noise cancellation

 Measures the “peakedness” of a distribution
 Higher peaks mean fatter tails

γ₂ = E[((X − μ)/σ)⁴] − 3

(the excess kurtosis: defined so that the normal distribution scores 0)
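A small numerical check of the definition (mine, not from the talk): the normal should score near 0 and a heavier-tailed t distribution above 0 (for t with 10 degrees of freedom, the known value is 6/(10 − 4) = 1):

```python
import numpy as np

def excess_kurtosis(x):
    # gamma_2 = E[((X - mu)/sigma)^4] - 3, so the normal distribution scores 0.
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3

rng = np.random.default_rng(6)
print(excess_kurtosis(rng.normal(size=500_000)))          # near 0
print(excess_kurtosis(rng.standard_t(10, size=500_000)))  # near 6/(10-4) = 1
```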

 Traditional assumption for normality with the binomial is np > 5 (or 10)
 Skewness of the binomial distribution increases as p moves away from .5
 Larger n is required for convergence for skewed distributions
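The middle point can be made concrete: the skewness of a single Bernoulli(p) trial is (1 − 2p)/√(p(1 − p)), which is 0 at p = .5 and grows as p approaches 0 or 1. A quick sketch (illustrative values of p):

```python
import numpy as np

# Skewness of one Bernoulli(p) trial: 0 at p = .5, growing as p leaves .5,
# which is why more skewed binomials need larger n for normality.
def bernoulli_skewness(p):
    return (1 - 2 * p) / np.sqrt(p * (1 - p))

for p in (0.5, 0.3, 0.1, 0.05):
    print(p, bernoulli_skewness(p))
```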

Cauchy distribution
 Has no moments (no mean, no variance)
 The distribution of averages looks like the original distribution
 The CLT does not apply

f(x) = 1/(π(1 + x²))
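A short simulation (mine, with illustrative sample sizes) makes the point: the mean of standard Cauchy draws is itself standard Cauchy, so its spread never shrinks as n grows:

```python
import numpy as np

# The interquartile range of the sample mean of standard Cauchy draws stays
# near 2 (the Cauchy quartiles sit at -1 and +1) no matter how large n gets.
rng = np.random.default_rng(7)
for n in (1, 10, 100, 1000):
    means = rng.standard_cauchy(size=(20_000, n)).mean(axis=1)
    q1, q3 = np.percentile(means, [25, 75])
    print(n, round(q3 - q1, 3))  # stays near 2 for every n
```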

Beta distribution
 α = β = 1/3
 Distribution is symmetric and bimodal
 Convergence to normal is fast in averages

Student’s t distribution
 Heavier-tailed, bell-shaped curve
 Approaches the normal distribution as degrees of freedom increase

 4 statistics: K-S distance, tail probabilities, skewness, and kurtosis
 Different thresholds for “adequate” and “superior” approximations
 Both sets of thresholds are fairly conservative
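The K-S (Kolmogorov-Smirnov) distance named above is the largest gap between the empirical CDF of the simulated averages and the normal CDF. A minimal Python version (mine, not the talk's R code; the exponential example and sample counts are illustrative):

```python
import numpy as np
from math import erf, sqrt

def ks_distance_to_normal(x, mu, sigma):
    # Max gap between the empirical CDF of x and the N(mu, sigma) CDF,
    # checked on both sides of each jump in the empirical CDF.
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    normal_cdf = 0.5 * (1 + np.vectorize(erf)((x - mu) / (sigma * sqrt(2))))
    ecdf_hi = np.arange(1, n + 1) / n
    ecdf_lo = np.arange(0, n) / n
    return max(np.max(np.abs(ecdf_hi - normal_cdf)),
               np.max(np.abs(ecdf_lo - normal_cdf)))

rng = np.random.default_rng(8)
means = rng.exponential(size=(50_000, 30)).mean(axis=1)
d = ks_distance_to_normal(means, 1.0, 1 / sqrt(30))
print(d)  # small for averages of n = 30 exponentials
```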

Minimum n by criterion, “adequate” thresholds:

Distribution         ∣Kurtosis∣ < .5  ∣Skewness∣ < .25  Tail Prob. .04  K-S Distance < .05  max
Uniform              3                1                 2               2                   3
Beta (α = β = 1/3)   4                1                 3               3                   4
Exponential          12               64                5               8                   64
Binomial (p = .1)    11               114               14              332                 332
Binomial (p = .5)    4                1                 12              68                  68
Student’s t, 2.5 df  NA               NA                13              20                  20
Student’s t, 4.1 df  120              1                 1               2                   120

Minimum n by criterion, “superior” thresholds:

Distribution         ∣Kurtosis∣ < .3  ∣Skewness∣ < .15  Tail Prob. .04  K-S Distance < .02  max
Uniform              4                1                 2               2                   4
Beta (α = β = 1/3)   6                1                 3               4                   6
Exponential          20               178               5               45                  178
Binomial (p = .1)    18               317               14              1850                1850
Binomial (p = .5)    7                1                 12              390                 390
Student’s t, 2.5 df  NA               NA                13              320                 320
Student’s t, 4.1 df  200              1                 1               5                   200

 Skewness is difficult to shake
 Tail probabilities are fairly accurate for small sample sizes
 The traditional recommendation is too small for many common distributions