Transcript Descriptive Indices - National University of Singapore
Facts about Binomial & Poisson distribution
Fact 1: If $X \sim \mathrm{Bin}(n, p)$, then $E(X) = np$ and $\mathrm{var}(X) = np(1 - p)$.
Fact 2: If $X_1 \sim \mathrm{Bin}(n_1, p)$ and $X_2 \sim \mathrm{Bin}(n_2, p)$ independently, then $X_1 + X_2 \sim \mathrm{Bin}(n_1 + n_2, p)$.
Fact 3: If $X \sim \mathrm{Poisson}(\lambda)$, then $E(X) = \mathrm{var}(X) = \lambda$.
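The three facts above can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names (`binom_pmf`, `mean_var`, `convolve`) and the parameter values are illustrative assumptions, and the Poisson distribution is truncated at 100 terms.

```python
from math import comb, exp, factorial

def binom_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Bin(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_var(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ... given its pmf."""
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent variables with the given pmfs."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

# Illustrative parameter values (assumptions, not taken from the slides).
n1, n2, p, lam = 5, 7, 0.4, 3.0

# Fact 1: compare with (n1 * p, n1 * p * (1 - p)).
print(mean_var(binom_pmf(n1, p)), (n1 * p, n1 * p * (1 - p)))

# Fact 2: pmf of X1 + X2 versus the pmf of Bin(n1 + n2, p).
sum_pmf = convolve(binom_pmf(n1, p), binom_pmf(n2, p))
direct = binom_pmf(n1 + n2, p)
print(max(abs(a - b) for a, b in zip(sum_pmf, direct)))

# Fact 3: mean and variance of a truncated Poisson(lam) pmf, both close to lam.
poisson_pmf = [exp(-lam) * lam**k / factorial(k) for k in range(100)]
print(mean_var(poisson_pmf))
```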
Normal Approximation to Binomial Distribution
[Figure: binomial distributions for p = 0.4 with n = 5, n = 10 and n = 25]
$\mathrm{Bin}(n, p) \approx N(np,\ np(1 - p))$ if $n$ is large.
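To see how the quality of the approximation changes with n, the sketch below compares exact binomial probabilities P(X <= k) with the matching normal probabilities for p = 0.4. It is an illustration under stated assumptions: the helper names are hypothetical, and the normal CDF is evaluated at k + 0.5 (a continuity correction), which is a common choice but is not stated on the slide.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(x, mean, var):
    """P(Y <= x) for Y ~ N(mean, var), via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * var)))

def max_cdf_gap(n, p):
    """Largest |exact - approximate| value of P(X <= k) over k = 0..n."""
    mean, var = n * p, n * p * (1 - p)
    # k + 0.5 is a continuity correction: an added assumption, not on the slide.
    return max(abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mean, var))
               for k in range(n + 1))

# The gap shrinks as n grows (n = 100 is an extra value beyond the figure's panels).
for n in (5, 10, 25, 100):
    print(n, round(max_cdf_gap(n, 0.4), 4))
```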
Demo: http://www.ruf.rice.edu/~lane/stat_sim/index.html
Normal Approximation to Poisson Distribution
$\mathrm{Poisson}(\lambda) \approx N(\lambda, \lambda)$ if $\lambda$ is large.
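The same kind of check can be run for the Poisson case, where the matching normal distribution has mean and variance both equal to the Poisson parameter. This is a rough sketch with hypothetical helper names; the sum over k is cut off well beyond the mean, and the half-unit shift k + 0.5 is again an added continuity correction rather than something taken from the slide.

```python
from math import erf, exp, sqrt

def poisson_cdf_values(lam, kmax):
    """Return [P(X <= 0), ..., P(X <= kmax)] for X ~ Poisson(lam)."""
    term = exp(-lam)            # P(X = 0)
    total, out = 0.0, []
    for k in range(kmax + 1):
        if k > 0:
            term *= lam / k     # P(X = k) obtained from P(X = k - 1)
        total += term
        out.append(total)
    return out

def normal_cdf(x, mean, var):
    """P(Y <= x) for Y ~ N(mean, var), via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * var)))

def max_cdf_gap(lam):
    """Largest |exact - approximate| value of P(X <= k) over a wide range of k."""
    kmax = int(lam + 10 * sqrt(lam)) + 10   # cut-off far in the upper tail
    exact = poisson_cdf_values(lam, kmax)
    # k + 0.5 is a continuity correction: an added assumption, not on the slide.
    return max(abs(c - normal_cdf(k + 0.5, lam, lam))
               for k, c in enumerate(exact))

# Illustrative parameter values; the gap shrinks as the parameter grows.
for lam in (1, 10, 100):
    print(lam, round(max_cdf_gap(lam), 4))
```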
Topic 8: Normal Distribution
A continuous random variable is one that can take on any value within an interval.
The distribution of a continuous variable $X$ is given by a probability density function $f(x)$ satisfying
(i) $f(x) \ge 0$ for all $x \in (-\infty, \infty)$
(ii) $P(a \le X \le b) = \int_a^b f(x)\,dx$
(iii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Note that it is the integral of $f(x)$, i.e., the area under the density curve, and not $f(x)$ itself, that gives us the probability.
In particular, $P(X = x) = 0$ for all $x$, even where $f(x) \ne 0$.
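A small numerical illustration of these properties, using a made-up density $f(x) = 2x$ on $[0, 1]$ (an assumption chosen only because its integrals are easy to verify by hand) and a simple midpoint rule for the integrals:

```python
def f(x):
    """Hypothetical density for illustration only: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

total = integrate(f, 0.0, 1.0)   # total area under the curve (f is 0 outside [0, 1])
prob = integrate(f, 0.2, 0.5)    # P(0.2 <= X <= 0.5) = 0.5**2 - 0.2**2 = 0.21
point = integrate(f, 0.5, 0.5)   # P(X = 0.5): an interval of zero width

print(total, prob, point, f(0.5))
```

The last two printed values show the point made above: the probability of a single value is 0 even though the density at that value is 1.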
Can think of the probability density $f(x)$ as the relative frequency histogram of a very large sample.
Expectation is defined as in the discrete case, but with an integral instead of a summation:
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Can again interpret the expected value $E(X)$ as the long-run average of $X$ under repeated sampling.
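Both readings of the expected value, as an integral and as a long-run average, can be compared in a few lines. The sketch assumes a made-up density $f(x) = 2x$ on $[0, 1]$, for which $E(X) = 2/3$; samples are drawn by taking the square root of a uniform random number, which has this density because its CDF is $x^2$. The seed and sample size are arbitrary choices.

```python
import random
from math import sqrt

def f(x):
    """Hypothetical density for illustration only: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# E(X) as an integral of x * f(x); the exact value is 2/3.
expected = integrate(lambda x: x * f(x), 0.0, 1.0)

# E(X) as a long-run average: if U is uniform on (0, 1), sqrt(U) has density 2x.
random.seed(0)
draws = [sqrt(random.random()) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)

print(expected, sample_mean)
```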
The most well known continuous distribution is the normal distribution.
Standard normal density
$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}, \quad -\infty < z < \infty$
Bell-shaped
Symmetric about 0
Mean = 0, variance = SD = 1
Well tabulated
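The listed properties of the standard normal density can be confirmed numerically. This is a minimal sketch: the function name `phi`, the midpoint-rule integrator and the integration range of -10 to 10 (outside which the density is negligible) are choices made for the illustration.

```python
from math import exp, pi, sqrt

def phi(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def integrate(g, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

area = integrate(phi, -10.0, 10.0)                       # total area, close to 1
mean = integrate(lambda z: z * phi(z), -10.0, 10.0)      # close to 0
var = integrate(lambda z: z * z * phi(z), -10.0, 10.0)   # close to 1

print(area, mean, var)
print(phi(1.3) == phi(-1.3))   # symmetry about 0
```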
From $N(0, 1)$ to $N(\mu, \sigma^2)$ and vice versa
If $Z \sim N(0, 1)$, then $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$.
Conversely, if $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$ (Standardization).
Normally distributed random variables should be standardized before looking up the table.
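As a quick check of why standardization works, the sketch below computes a probability for a general normal variable directly and again through the standardized value. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function names and the values of the mean, SD and x are hypothetical.

```python
from statistics import NormalDist

def to_general(z, mu, sigma):
    """Map a standard normal value z to the N(mu, sigma^2) scale."""
    return mu + sigma * z

def standardize(x, mu, sigma):
    """Map a value x on the N(mu, sigma^2) scale to the standard normal scale."""
    return (x - mu) / sigma

mu, sigma = 10.0, 2.0    # hypothetical mean and SD
x = 13.0                 # hypothetical value of interest

z = standardize(x, mu, sigma)              # (13 - 10) / 2 = 1.5
p_direct = NormalDist(mu, sigma).cdf(x)    # P(X <= x) on the original scale
p_table = NormalDist().cdf(z)              # P(Z <= z), what a table lookup gives

print(z, p_direct, p_table)
print(to_general(z, mu, sigma))            # back to the original value
```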
$X$ = Blood Pressure $\sim N(129\ \text{mm Hg},\ 19.8^2)$
$P(X > 150) = P\left(\frac{X - 129}{19.8} > \frac{150 - 129}{19.8}\right) = P(Z > 1.06) = 0.145$
Demo: http://www.isds.duke.edu/sites/java.html
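The blood pressure calculation can be reproduced without a table by writing the standard normal CDF in terms of the error function. A minimal sketch (the function name `std_normal_cdf` is an illustrative choice); it computes the tail probability both with the unrounded z and with z rounded to two decimals as a table lookup would require.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 129.0, 19.8          # mean and SD from the blood pressure example
z = (150.0 - mu) / sigma         # standardized value, about 1.06

p_exact = 1.0 - std_normal_cdf(z)             # P(X > 150) with unrounded z
p_table = 1.0 - std_normal_cdf(round(z, 2))   # same, with z rounded as in a table

print(round(z, 2), p_exact, p_table)
```

Both values agree with the 0.145 in the example to within rounding of z.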
The standard normal density curve
If $X \sim N(\mu, \sigma^2)$, then
$P(|X - \mu| \le 2\sigma) = P(|Z| \le 2) = 0.954$ (within 2 SD from the mean)
$P(|X - \mu| \le 3\sigma) = P(|Z| \le 3) = 0.997$ (within 3 SD from the mean)
This is the empirical rule.
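The probabilities in the empirical rule follow from the identity $P(|Z| \le k) = 2\Phi(k) - 1 = \mathrm{erf}(k / \sqrt{2})$, where $\Phi$ is the standard normal CDF. A short sketch (the function name `prob_within` is an illustrative choice):

```python
from math import erf, sqrt

def prob_within(k):
    """P(|Z| <= k) for Z ~ N(0, 1): probability of lying within k SD of the mean."""
    return erf(k / sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```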
Normal distribution is often used to model continuous measurement data such as weight, height, blood pressure, etc.
The use of normal distribution is often justified by the Central Limit Theorem which says that the sum/average of a large number of independent and identically distributed variables is approximately normally distributed.
Measurement error = sum of many independent smaller errors
IQ: determined by many genetic & environmental factors
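The measurement-error picture can be simulated directly. In the sketch below each total error is the sum of 30 independent small errors, each modelled as uniform on (-0.5, 0.5); the uniform model, the number of components, the number of repetitions and the seed are all assumptions made for the illustration. If the total is approximately normal, the fractions of totals within 2 and 3 SD of zero should be close to 0.954 and 0.997.

```python
import random
from math import sqrt

random.seed(1)

n_errors = 30       # number of small errors per measurement (assumed)
reps = 20_000       # number of simulated measurements (assumed)

# Each uniform(-0.5, 0.5) error has mean 0 and variance 1/12,
# so the total has mean 0 and SD sqrt(n_errors / 12).
sd_total = sqrt(n_errors / 12.0)

def total_error():
    """Sum of n_errors independent uniform(-0.5, 0.5) errors."""
    return sum(random.uniform(-0.5, 0.5) for _ in range(n_errors))

totals = [total_error() for _ in range(reps)]

within_2sd = sum(abs(t) <= 2 * sd_total for t in totals) / reps
within_3sd = sum(abs(t) <= 3 * sd_total for t in totals) / reps

print(within_2sd, within_3sd)
```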
Demo: http://www.ruf.rice.edu/~lane/stat_sim/index.html