Descriptive Indices - National University of Singapore


Facts about Binomial & Poisson distribution

Fact 1: If $X \sim \mathrm{Bin}(n, p)$, then $E(X) = np$ and $\mathrm{var}(X) = np(1 - p)$.

Fact 2: If $X_1 \sim \mathrm{Bin}(n_1, p)$ and $X_2 \sim \mathrm{Bin}(n_2, p)$ independently, then $X_1 + X_2 \sim \mathrm{Bin}(n_1 + n_2, p)$.

Fact 3: If $X \sim \mathrm{Poisson}(\lambda)$, then $E(X) = \mathrm{var}(X) = \lambda$.
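The three facts can be checked numerically from the probability mass functions. The sketch below is only an illustration: the values n1 = 5, n2 = 7, p = 0.4 and lambda = 3 are arbitrary choices, not taken from the slides, and the Poisson sums are truncated at k = 60, where the remaining tail is negligible for this lambda.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def mean_var(pmf_values):
    """Mean and variance of a distribution on 0, 1, 2, ... given its pmf values."""
    mean = sum(k * q for k, q in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf_values))
    return mean, var

# Illustrative parameter values (assumptions for this sketch).
n1, n2, p, lam = 5, 7, 0.4, 3.0

# Fact 1: mean np and variance np(1 - p).
pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
bin_mean, bin_var = mean_var(pmf1)

# Fact 2: the pmf of X1 + X2 (a convolution) matches Bin(n1 + n2, p).
pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
pmf_sum = [
    sum(pmf1[i] * pmf2[s - i] for i in range(max(0, s - n2), min(n1, s) + 1))
    for s in range(n1 + n2 + 1)
]
pmf_direct = [binom_pmf(s, n1 + n2, p) for s in range(n1 + n2 + 1)]
max_gap = max(abs(a - b) for a, b in zip(pmf_sum, pmf_direct))

# Fact 3: mean and variance both equal lambda (sum truncated at k = 60).
poi_mean, poi_var = mean_var([poisson_pmf(k, lam) for k in range(60)])

print(bin_mean, bin_var, max_gap, poi_mean, poi_var)
```

With these values, Fact 1 predicts a mean of 5 × 0.4 = 2 and a variance of 5 × 0.4 × 0.6 = 1.2.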


Normal Approximation to Binomial Distribution

[Figure: probability histograms of Bin(n, p) for p = 0.4 with n = 5, n = 10, and n = 25]

$\mathrm{Bin}(n, p) \approx N(np,\ np(1 - p))$ if $n$ is large.

Demo: http://www.ruf.rice.edu/~lane/stat_sim/index.html
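To see how the approximation improves with n, one can compare the exact binomial CDF with the normal CDF. The sketch below uses p = 0.4 as in the figure; the extra value n = 100 and the continuity correction (evaluating the normal CDF at k + 0.5) are additions made for this illustration and are not part of the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using N(np, np(1 - p)).

    The + 0.5 is a continuity correction (an assumption of this sketch).
    """
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)

def max_cdf_error(n, p):
    """Largest absolute gap between exact and approximate CDF over k = 0..n."""
    return max(
        abs(binom_cdf(k, n, p) - normal_approx_cdf(k, n, p)) for k in range(n + 1)
    )

errors = {n: max_cdf_error(n, 0.4) for n in (5, 10, 25, 100)}
for n, err in errors.items():
    print(f"n = {n:3d}: max CDF error = {err:.4f}")
```

The largest gap should be visibly smaller for n = 100 than for n = 5.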

Normal Approximation to Poisson Distribution

$\mathrm{Poisson}(\lambda) \approx N(\lambda, \lambda)$ if $\lambda$ is large.
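The same comparison can be sketched for the Poisson case, where both the mean and the variance of the approximating normal equal the Poisson mean. The values of the mean tried below are arbitrary, and the continuity correction is again an assumption of this illustration.

```python
from math import exp, sqrt
from statistics import NormalDist

def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Poisson(lam), built up term by term."""
    term = exp(-lam)  # P(X = 0)
    total = term
    for j in range(1, k + 1):
        term *= lam / j  # P(X = j) from P(X = j - 1)
        total += term
    return total

def normal_approx_cdf(k, lam):
    """Approximate P(X <= k) using N(lam, lam) with a continuity correction."""
    return NormalDist(mu=lam, sigma=sqrt(lam)).cdf(k + 0.5)

def max_cdf_error(lam):
    """Largest absolute CDF gap over k from 0 to about 10 SDs above the mean."""
    upper = int(lam + 10 * sqrt(lam)) + 1
    return max(
        abs(poisson_cdf(k, lam) - normal_approx_cdf(k, lam))
        for k in range(upper + 1)
    )

errors = {lam: max_cdf_error(lam) for lam in (1, 4, 25, 100)}
for lam, err in errors.items():
    print(f"mean = {lam:3d}: max CDF error = {err:.4f}")
```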

Topic 8: Normal Distribution

A continuous random variable is one that can take on any value within an interval.

The distribution of a continuous variable $X$ is given by a probability density function $f(x)$ satisfying

(i) $f(x) \ge 0$ for all $x \in (-\infty, \infty)$

(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

(iii) $P(a \le X \le b) = \int_a^b f(x)\,dx$

Note that it is the integral of $f(x)$, i.e., the area under the density curve, and not $f(x)$ itself, that gives us the probability.

In particular, $P(X = x) = 0$ for all $x$, even where $f(x) > 0$.
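A small numerical sketch can make the "area, not height" point concrete. The density f(x) = 2x on [0, 1] used below is a hypothetical example chosen for this illustration, and the midpoint rule is just one simple way to approximate the integrals.

```python
def f(x):
    """Illustrative density (an assumption): f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (i + 0.5) * width) for i in range(steps))

# Total area under the density should be 1.
total_area = integrate(f, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area between 0.25 and 0.5: 0.5^2 - 0.25^2 = 0.1875.
prob_interval = integrate(f, 0.25, 0.5)

# A single point is an interval of zero width, so its probability is 0,
# even though the density at that point is f(0.5) = 1.
prob_point = integrate(f, 0.5, 0.5)

print(total_area, prob_interval, prob_point, f(0.5))
```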

Can think of the probability density $f(x)$ as the relative frequency histogram of a very large sample.

Expectation is defined as in the discrete case, but with an integral instead of a summation:

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$

Can again interpret the expected value $E(X)$ as the long-run average of $X$ under repeated sampling.
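The two views of the expected value, as an integral and as a long-run average, can be compared in a short simulation. The density f(x) = 2x on [0, 1] is a hypothetical example (its exact mean is 2/3), and drawing samples as the square root of a uniform number is an inverse-CDF trick specific to that density.

```python
import random
from math import sqrt

def f(x):
    """Illustrative density (an assumption): f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def expectation(density, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of x * density(x) over [a, b]."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        total += x * density(x) * width
    return total

def draw():
    """One draw from f: its CDF is F(x) = x^2 on [0, 1], so X = sqrt(U)."""
    return sqrt(random.random())

random.seed(1)  # fixed seed so the run is repeatable
expected_value = expectation(f, 0.0, 1.0)
draws = [draw() for _ in range(200_000)]
long_run_average = sum(draws) / len(draws)

print(expected_value, long_run_average)
```

Both numbers should land close to 2/3, with the simulated average fluctuating slightly from seed to seed.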

The most well known continuous distribution is the normal distribution.

Standard normal density

$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}, \quad -\infty < z < \infty$

• Bell-shaped

• Symmetric about 0

• Mean = 0, variance = SD = 1

• Well tabulated

From $N(0, 1)$ to $N(\mu, \sigma^2)$ and vice versa

• If $Z \sim N(0, 1)$, then $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$.

• Conversely, if $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$ (standardization).

Normally distributed random variables should be standardized before looking up the table.

$X$ = blood pressure $\sim N(\mu = 129 \text{ mm Hg},\ \sigma^2 = 19.8^2)$

$P(X > 150) = P\left(\frac{X - 129}{19.8} > \frac{150 - 129}{19.8}\right) = P(Z > 1.06) = 0.145$

Demo: http://www.isds.duke.edu/sites/java.html
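The blood pressure calculation can be reproduced in software instead of a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 129.0, 19.8  # blood pressure parameters from the example
threshold = 150.0

# Standardize: z = (x - mu) / sigma.
z = (threshold - mu) / sigma

# Upper-tail probability from the standard normal, using the unrounded z.
p_from_z = 1.0 - NormalDist().cdf(z)

# Same probability computed directly on the original scale.
p_direct = 1.0 - NormalDist(mu, sigma).cdf(threshold)

# Table-style lookup: round z to two decimals first, as a printed table requires.
z_table = round(z, 2)
p_table = 1.0 - NormalDist().cdf(z_table)

print(f"z = {z:.4f}, table z = {z_table}")
print(f"P(X > 150): from z = {p_from_z:.4f}, direct = {p_direct:.4f}, table = {p_table:.4f}")
```

Standardizing and working on the original scale give the same probability. Rounding z to two decimals before the lookup should give a value close to the 0.145 quoted in the example, while the unrounded z gives a marginally smaller one.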

The standard normal density curve

If $X \sim N(\mu, \sigma^2)$, then

$P(|X - \mu| \le 2\sigma) = P(|Z| \le 2) = 0.954$ (within 2 SD from the mean)

$P(|X - \mu| \le 3\sigma) = P(|Z| \le 3) = 0.997$ (within 3 SD from the mean)

This is the empirical rule
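These two probabilities are areas under the standard normal density, so they can be recovered by integrating the density numerically. The sketch below does this with a simple midpoint rule and, as a cross-check, with the closed form P(|Z| ≤ k) = erf(k / √2); the function names are illustrative.

```python
from math import erf, exp, pi, sqrt

def phi(z):
    """Standard normal density: exp(-z^2 / 2) / sqrt(2 * pi)."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def area_under_phi(a, b, steps=20_000):
    """Midpoint-rule approximation of the area under phi between a and b."""
    width = (b - a) / steps
    return width * sum(phi(a + (i + 0.5) * width) for i in range(steps))

def prob_within(k):
    """Closed form for P(|Z| <= k) when Z ~ N(0, 1)."""
    return erf(k / sqrt(2.0))

within_2_sd = area_under_phi(-2.0, 2.0)
within_3_sd = area_under_phi(-3.0, 3.0)

print(f"within 2 SD: {within_2_sd:.4f} (closed form {prob_within(2):.4f})")
print(f"within 3 SD: {within_3_sd:.4f} (closed form {prob_within(3):.4f})")
```

Because standardization maps any normal variable to Z, the same two areas apply whatever the mean and SD are.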

The normal distribution is often used to model continuous measurement data such as weight, height, blood pressure, etc.

The use of the normal distribution is often justified by the Central Limit Theorem, which says that the sum or average of a large number of independent and identically distributed variables is approximately normally distributed.

Measurement error = sum of many independent smaller errors

IQ: determined by many genetic & environmental factors

Demo: http://www.ruf.rice.edu/~lane/stat_sim/index.html
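A quick simulation along the lines of the demo can illustrate the Central Limit Theorem. The sketch below averages n = 30 Uniform(0, 1) draws, which are clearly not normal individually, and checks how often the standardized average falls within 2 SD. The choice of uniform draws, n = 30, 20,000 repetitions, and the seed are all assumptions of this illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Average of n independent Uniform(0, 1) draws."""
    return fmean(random.random() for _ in range(n))

n, reps = 30, 20_000
means = [sample_mean(n) for _ in range(reps)]

# A Uniform(0, 1) variable has mean 1/2 and variance 1/12,
# so the average of n draws has mean 1/2 and SD sqrt(1 / (12 n)).
mu = 0.5
se = sqrt(1.0 / (12.0 * n))
z_scores = [(m - mu) / se for m in means]

# If the averages are approximately normal, about 95.4% of the
# standardized values should lie within 2 SD of 0.
frac_within_2 = sum(abs(z) <= 2.0 for z in z_scores) / reps

print(f"mean of z = {fmean(z_scores):.3f}, SD of z = {pstdev(z_scores):.3f}")
print(f"fraction within 2 SD = {frac_within_2:.3f}")
```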