Transcript Chapter 7

Chapter 7: Continuous Distributions
Continuous Random Variables
• Values fall within an interval
• Measurements
• Described by density curves
Density Curve
• Always on or above the x-axis
• Area underneath it equals 1
• Shows what proportion of data
falls within an interval
Unusual Density Curves
• Generic continuous distributions
• Can be any shape
• Probability = area under the curve
How do we find the area of a triangle?
[Figure: density curve; vertical axis marked at .25 and .5, horizontal axis marked from 1 to 5]
P(X < 2) = (1/2)(base)(height) = (1/2)(2)(.25) = .25
What is the area of a line segment?
[Figure: density curve; vertical axis marked at .25 and .5, horizontal axis marked from 1 to 5]
P(X = 2) = 0
P(X < 2) = .25
In continuous distributions,
P(X < 2) & P(X ≤ 2)
are the same!
Hmmmm…
Is this different than
discrete distributions?
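A minimal Python sketch of "probability = area under the curve". The density used here, f(x) = x/8 on [0, 4], is a hypothetical example chosen for illustration and is not necessarily the curve drawn on the slide. It shows that a single point has zero area, so P(X < 2) and P(X ≤ 2) agree.

```python
def density(x):
    # Hypothetical density (an assumption): f(x) = x/8 on [0, 4], 0 elsewhere
    return x / 8 if 0 <= x <= 4 else 0.0

def area(lo, hi, steps=10_000):
    """Approximate the area under density() from lo to hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(density(lo + (i + 0.5) * width) for i in range(steps)) * width

p_less_than_2 = area(0, 2)   # triangle with base 2 and height .25
p_equals_2 = area(2, 2)      # a line segment has zero width, so zero area
total_area = area(0, 4)      # the whole density curve

print(round(p_less_than_2, 4), p_equals_2, round(total_area, 4))
```

In a discrete distribution the single value X = 2 can carry positive probability, so the two statements would generally differ there.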
[Figure: density curve; vertical axis marked at .25 and .5, horizontal axis marked from 1 to 5]
P(X > 3) = .4375
P(1 < X < 3) = .5
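P(X > 1) = .75
[Figure: density curve; vertical axis marked at 0.25 and 0.50, horizontal axis marked from 1 to 4]
Triangle: .5(2)(.25) = .25
Rectangle: (2)(.25) = .5
Total: .25 + .5 = .75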
P(0.5 < X < 1.5) = .28125
[Figure: density curve; vertical axis marked at 0.25 and 0.50, horizontal axis marked from 1 to 4]
Special Continuous Distributions
Uniform Distribution
• Evenly (uniformly) distributed
  → Every interval of the same width has equal probability
• Density curve: rectangle
• Probability = area under the curve
How do you find the area of a rectangle?
μ = (a + b)/2
σ² = (b − a)²/12
(a & b are the endpoints of the distribution)
Why 12?
Discrete: σ² = Σ (x − μ)² P(x)
Continuous: σ² = ∫ (x − μ)² f(x) dx, from x = a to x = b
f(x) = height = 1/range = 1/(b − a)
σ² = (1/(b − a)) ∫ (x − μ)² dx, from x = a to x = b, where μ = (a + b)/2
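The integral can be carried through to show where the 12 comes from. This is a worked sketch using only the symbols a, b and μ defined above:

```latex
\begin{align*}
\sigma^2 &= \frac{1}{b-a}\int_a^b (x-\mu)^2\,dx
          = \frac{1}{b-a}\left[\frac{(x-\mu)^3}{3}\right]_a^b
          = \frac{(b-\mu)^3-(a-\mu)^3}{3(b-a)} \\
\text{With } \mu = \frac{a+b}{2}:&\quad
  b-\mu = \frac{b-a}{2}, \qquad a-\mu = -\frac{b-a}{2} \\
\sigma^2 &= \frac{2\left(\frac{b-a}{2}\right)^3}{3(b-a)}
          = \frac{(b-a)^3/4}{3(b-a)}
          = \frac{(b-a)^2}{12}
\end{align*}
```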
The Citrus Sugar Company packs sugar in bags labeled 5 pounds. However, the packaging isn't perfect and the actual weights are uniformly distributed with a mean of 4.98 pounds and a range of .12 pounds.
a) Construct this uniform distribution.
What shape does a uniform distribution have?
How long is this rectangle?
What is the height of this rectangle?
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, centered at 4.98]
b) What is the probability that a randomly
selected bag will weigh more than 4.97 pounds?
What is the length of the shaded region?
P(X > 4.97) = .07(1/.12) = .5833
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, shaded from 4.97 to 5.04]
c) Find the probability that a randomly
selected bag weighs between 4.93
and 5.03 pounds.
What is the length of the shaded region?
P(4.93 < X < 5.03) = .1(1/.12) = .8333
[Figure: rectangle of height 1/.12 from 4.92 to 5.04, shaded from 4.93 to 5.03]
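The sugar example can be reproduced with a few lines of Python. This is a sketch only: `uniform_prob` is a hypothetical helper name, and the endpoints are rebuilt from the stated mean and range (mean ± range/2).

```python
def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b]: shaded length times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)          # clip the shaded region to the rectangle
    return max(hi - lo, 0.0) / (b - a)

mean, spread = 4.98, 0.12                    # mean and range from the sugar example
a, b = mean - spread / 2, mean + spread / 2  # endpoints of the rectangle

p_b = uniform_prob(4.97, b, a, b)            # part b: heavier than 4.97 pounds
p_c = uniform_prob(4.93, 5.03, a, b)         # part c: between 4.93 and 5.03 pounds
print(round(a, 2), round(b, 2), round(p_b, 4), round(p_c, 4))
```

Clipping the interval to [a, b] keeps the helper from counting area outside the rectangle, where the density is 0.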
The time it takes for students to drive to school
is evenly distributed with a minimum of 5
minutes and a range of 35 minutes.
a) Draw the distribution.
What is the height of the rectangle?
Where should the rectangle end?
[Figure: rectangle of height 1/35 from 5 to 40]
b) What is the probability that it takes less
than 20 minutes to drive to school?
P(X < 20) = (15)(1/35) = .4286
[Figure: rectangle of height 1/35 from 5 to 40, shaded from 5 to 20]
c) What are the mean and standard deviation
of this distribution?
μ = (5 + 40)/2 = 22.5
σ² = (40 − 5)²/12 = 102.083
σ = 10.104
Normal Distribution
• Symmetrical, bell-shaped density curve
• Parameters: μ, σ
• Probability = area under the curve
• As σ increases, curve flattens & spreads out
• As σ decreases, curve gets taller and thinner
How is this done mathematically?
Normal distributions occur frequently.
• Length of infants
• Height
• Weight
• ACT scores
• Intelligence
• Number of typing errors
• Velocities of ideal gas
molecules
• Yearly rainfall
• Quantum harmonic
oscillators
• Diffusion
• Thermal light
• Size of living tissue
• Blood pressure
• Compound interest
• Exchange rates
• Stock market indices
[Figure: two normal curves, A and B, both centered at 6]
Do these two normal curves have the same mean? If so, what is it? Yes, 6
Which normal curve has a standard deviation of 3? B
Which normal curve has a standard deviation of 1? A
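A short Python sketch of why the taller curve has the smaller standard deviation. It assumes Python 3.8+ for `statistics.NormalDist` and uses the mean of 6 and the standard deviations of 1 and 3 from the question above.

```python
from statistics import NormalDist

curve_a = NormalDist(mu=6, sigma=1)   # curve A: standard deviation 1
curve_b = NormalDist(mu=6, sigma=3)   # curve B: standard deviation 3

for name, curve in (("A", curve_a), ("B", curve_b)):
    peak = curve.pdf(curve.mean)                 # height of the curve at its center
    within_one = curve.cdf(7) - curve.cdf(5)     # area between 5 and 7
    print(name, "peak height:", round(peak, 4), "area from 5 to 7:", round(within_one, 4))
```

Both curves have total area 1, so the curve that spreads out more has to be lower at the center.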
Empirical Rule
• Approx. 68% of the data fall within 1σ of μ
• Approx. 95% of the data fall within 2σ of μ
• Approx. 99.7% of the data fall within 3σ of μ
Suppose the height of male GBHS
students is normally distributed with a
mean of 71 inches and standard deviation
of 2.5 inches. What is the probability that
a randomly selected male student is taller
than 73.5 inches?
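73.5 is 1σ above the mean, so about 68% of heights fall between 68.5 and 73.5.
1 − .68 = .32, split evenly between the two tails
P(X > 73.5) = .32/2 = 0.16
[Figure: normal curve centered at 71 with the middle 68% marked]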
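The Empirical Rule percentages and the height example can be checked numerically. A sketch assuming Python 3.8+ (`statistics.NormalDist`), with the mean and standard deviation taken from the example:

```python
from statistics import NormalDist

heights = NormalDist(mu=71, sigma=2.5)   # male student heights from the example

# Area within 1, 2 and 3 standard deviations of the mean
within = {}
for k in (1, 2, 3):
    within[k] = heights.cdf(71 + k * 2.5) - heights.cdf(71 - k * 2.5)
    print(f"within {k} sd: {within[k]:.4f}")

p_taller = 1 - heights.cdf(73.5)         # upper tail beyond 1 sd
print(f"P(X > 73.5) = {p_taller:.4f}")
```

The exact tail is slightly under 0.16 because 68% is itself a rounded figure.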
Standard Normal Density Curve
μ = 0 & σ = 1
To standardize any normal data:
z = (x − μ)/σ
Make your life easier: memorize this!
To find normal probabilities/proportions:
1. Write the probability statement
2. Draw a picture
3. Calculate the z-score
4. Look up the probability in the table
The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. What proportion of these batteries can be expected to last less than 220 hours?
Write the probability statement: P(X < 220)
Draw & shade the curve
Calculate z-score: z = (220 − 200)/15 = 1.33
Look up z-score in table: P(X < 220) = .9082
The lifetime of a certain type of battery is
normally distributed with a mean of 200 hours
and a standard deviation of 15 hours. What
proportion of these batteries can be expected to
last more than 220 hours?
z = (220 − 200)/15 = 1.33
P(X > 220) = 1 − .9082 = .0918
The lifetime of a certain type of battery is
normally distributed with a mean of 200 hours
and a standard deviation of 15 hours. How long
must a battery last to be in the top 5%?
P(X > ?) = .05
Look up 0.95 in table to find z-score: z = 1.645
1.645 = (x − 200)/15
x = 224.675
[Figure: normal curve with area .95 to the left of z = 1.645 and area .05 to the right]
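The three battery questions can be cross-checked without a table. A sketch assuming Python 3.8+ (`statistics.NormalDist`); the results differ slightly from the table answers because the table rounds z to 1.33.

```python
from statistics import NormalDist

battery = NormalDist(mu=200, sigma=15)   # lifetime in hours, from the example

z = (220 - 200) / 15                     # z-score for 220 hours
p_less = battery.cdf(220)                # P(X < 220), area to the left
p_more = 1 - p_less                      # P(X > 220), area to the right
cutoff = battery.inv_cdf(0.95)           # lifetime with 95% of the area below it

print(round(z, 2), round(p_less, 4), round(p_more, 4), round(cutoff, 2))
```

`inv_cdf` works from the area to the left, which is why the top 5% is looked up as 0.95.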
The heights of the female GBHS students are
normally distributed with a mean of 65
inches. What is the standard deviation of this
distribution if 18.5% of the female
students are shorter than 63 inches?
What is the z-score for 63?
P(X < 63) = .185
−.9 = (63 − 65)/σ
σ = −2/−.9 = 2.22
[Figure: normal curve with area .185 shaded to the left of 63, where z = −0.9]
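Working backwards from a probability to σ can be sketched the same way, assuming Python 3.8+ (`statistics.NormalDist`). The unrounded z-score gives a slightly different σ than the table value of −0.9.

```python
from statistics import NormalDist

mean_height = 65        # inches, from the example
cutoff = 63             # 18.5% of students are shorter than this
area_left = 0.185

z = NormalDist().inv_cdf(area_left)   # z-score with 18.5% of the area to its left
sigma = (cutoff - mean_height) / z    # rearranged from z = (x - mean) / sigma

print(round(z, 3), round(sigma, 2))
```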
The heights of female GBHS teachers are
normally distributed with mean 65.5 inches
and standard deviation 2.25 inches. The
heights of male GBHS teachers are normally
distributed with mean 70 inches and standard
deviation 2.5 inches.
• Describe the distribution of differences of
teacher heights (male – female)
Normal distribution with μ = 70 − 65.5 = 4.5, σ = √(2.5² + 2.25²) = 3.3634 (variances add when the two heights are independent)
What is the probability that a randomly
selected male teacher is shorter than a
randomly selected female teacher?
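z = (0 − 4.5)/3.3634 = −1.34
P(X < 0) = .0901, where X = male height − female height
[Figure: normal curve centered at 4.5 with the area to the left of 0 shaded]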
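A sketch of the difference of two independent normal variables, assuming Python 3.8+. Subtracting two `statistics.NormalDist` objects treats them as independent: the means subtract and the variances add.

```python
from statistics import NormalDist

male = NormalDist(mu=70, sigma=2.5)       # male teacher heights
female = NormalDist(mu=65.5, sigma=2.25)  # female teacher heights

diff = male - female                      # distribution of (male - female)
p_male_shorter = diff.cdf(0)              # P(male - female < 0)

print(round(diff.mean, 1), round(diff.stdev, 4), round(p_male_shorter, 4))
```

The probability comes out slightly above the table answer because the table rounds z to −1.34.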
Will my calculator do any of this stuff?
• Normalpdf: Doesn't make sense as a probability → P(X = x) = 0!
  → Used for graphing ONLY
• Normalcdf: Calculates probability
  → normalcdf(lower bound, upper bound, μ, σ)
• Invnorm (inverse normal): Finds z-score for a probability to the left
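For readers without the calculator, rough Python stand-ins for the three functions might look like this. The function names and argument order below are chosen to mirror the calculator and are an assumption of this sketch, not a standard Python API; the work is done by `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

def normalpdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x (useful for graphing, not a probability)."""
    return NormalDist(mu, sigma).pdf(x)

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area, mu=0.0, sigma=1.0):
    """Value with the given area to its left (a z-score when mu=0, sigma=1)."""
    return NormalDist(mu, sigma).inv_cdf(area)

p = normalcdf(float("-inf"), 220, 200, 15)   # P(X < 220) for the battery example
z_top_5 = invnorm(0.95)                      # z-score cutting off the top 5%
print(round(p, 4), round(z_top_5, 3))
```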
Ways to Assess Normality
• Dotplots, boxplots, histograms
• Normal Probability (Quantile)
Plot
Suppose we have the following observations of widths of contact windows in integrated circuit chips:
3.21 2.49 2.94 4.38 4.02 3.62 3.30 2.85 3.34 3.81
To construct a normal probability plot, we can use quantities called normal scores. The values of the normal scores depend on the sample size n. The normal scores when n = 10 are below:
-1.539 -1.001 -0.656 -0.376 -0.123 0.123 0.376 0.656 1.001 1.539
Think of selecting sample after sample of size 10 from a standard normal distribution. Then -1.539 is the average of the smallest value from each sample, -1.001 is the average of the next smallest value from each sample, etc.
Sketch a scatterplot by pairing the smallest normal score with the smallest data value, 2nd normal score with 2nd data value, and so on.
What should happen if our data set is normally distributed?
[Figure: normal probability plot of Normal Scores against Widths of Contact Windows]
Normal Probability (Quantile) Plots
• Plot data against known normal
z-scores
• Points form an approximately straight line → data are approximately normally distributed
• Stacks of points (repeat
observations): granularity
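The pairing step for the contact-window data can be sketched in Python. The data and the n = 10 normal scores are the ones given in this chapter; using the correlation coefficient as a rough numeric check of straightness is an addition of this sketch, not something the slides prescribe.

```python
import math

widths = [3.21, 2.49, 2.94, 4.38, 4.02, 3.62, 3.30, 2.85, 3.34, 3.81]
normal_scores = [-1.539, -1.001, -0.656, -0.376, -0.123,
                 0.123, 0.376, 0.656, 1.001, 1.539]

# Pair the smallest normal score with the smallest width, and so on
pairs = list(zip(normal_scores, sorted(widths)))

def pearson_r(points):
    """Correlation between the x and y coordinates of the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(pairs)
for score, width in pairs:
    print(score, width)
print("correlation:", round(r, 3))
```

A correlation close to 1 means the plotted points lie near a straight line.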
Are these approximately normally distributed?
50 52 53 48 51 50 54 48 47 47 48 49
51 54 50 52 55 56 46 57 53 53 45 52
The histogram/boxplot is approx. symmetrical, so the data are approx. normal.
The normal probability plot is approx. linear, so the data are approx. normal.
Premature babies are those born more than 3 weeks
early. Newsweek (May 16, 1988) reported that 10% of
the live births in the U.S. are premature. Suppose 250
live births are randomly selected and X = the number of
“preemies” is determined. What is the probability that
there are between 15 and 30 preemies, inclusive?
1) Find this probability using the binomial distribution.
P(15 ≤ X ≤ 30) = binomcdf(250, .1, 30) − binomcdf(250, .1, 14) = .866
2) What are the mean and standard deviation of this distribution?
μ = np = 250(.1) = 25, σ = √(np(1 − p)) = √22.5 = 4.743
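The binomial calculation can be reproduced with the standard library. In this sketch `binompdf` and `binomcdf` are hypothetical Python helpers named after the calculator functions, with the same argument order (n, p, k); `math.comb` needs Python 3.8+.

```python
import math

def binompdf(n, p, k):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k): add up the probabilities for 0, 1, ..., k."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

n, p = 250, 0.1
prob = binomcdf(n, p, 30) - binomcdf(n, p, 14)   # P(15 <= X <= 30)
mean = n * p
sd = math.sqrt(n * p * (1 - p))

print(round(prob, 3), mean, round(sd, 3))
```

Subtracting the cumulative probability at 14 (not 15) keeps X = 15 inside the interval.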
3) If we were to graph a histogram for the above binomial distribution, what shape would it have?
p is only 10% → should be skewed right
Let's graph this distribution:
• L1: seq(X, X, 0, 45)
• L2: use binompdf to find the binomial probabilities
• xmin = -0.5, xmax = 45, xscl = 1
• ymin = 0, ymax = 0.2, yscl = 1
4) What do you notice about the shape?
Overlay a normal curve on your histogram:
• Y1 = normalpdf(X, 25, 4.743)
We can estimate binomial probabilities
with the normal distribution IF…
1) p is close to .5
or
2) n is sufficiently large
  → np > 10 & n(1 − p) > 10
Why 10?
• Normal distributions extend infinitely in
both directions
• Binomial distributions go from 0 to n
• If we use normal to estimate binomial, we
have to cut off the tails of the normal
distribution
• This is okay if the mean of the normal distribution (for which we use the mean of the binomial) is at least three standard deviations (3σ) from 0 and from n
We require: μ − 3σ ≥ 0
Or: μ ≥ 3σ
As binomial: np ≥ 3√(np(1 − p))
Square it: n²p² ≥ 9np(1 − p)
Simplify: np ≥ 9(1 − p)
Since (1 − p) < 1, it is enough that: np ≥ 9
And since p < 1, the same argument measured from n gives: n(1 − p) ≥ 9
Therefore, np should be at least 10 and n(1 − p) should be at least 10.
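The 3σ requirement behind the rule of 10 can be tested directly. A sketch in Python, where `normal_fits_inside` is a hypothetical helper name and the n = 20 case is an added illustration, not from the slides:

```python
import math

def normal_fits_inside(n, p):
    """True when mean - 3 sd and mean + 3 sd both stay inside the range 0 to n."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return mu - 3 * sigma >= 0 and mu + 3 * sigma <= n

for n, p in ((250, 0.1), (20, 0.1), (30, 0.5)):
    large_enough = n * p >= 10 and n * (1 - p) >= 10   # the rule of 10
    print(n, p, "rule of 10:", large_enough, "3 sd fits:", normal_fits_inside(n, p))
```

Whenever the rule of 10 holds, the 3σ check holds as well, since 10 is a rounded-up version of the bound of 9.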

Continuity Correction
• Discrete histograms: Each bar is centered
over a discrete value
• Bar for "1" actually goes from 0.5 to 1.5,
bar for "2" goes from 1.5 to 2.5, etc.
• So if we want to estimate a discrete distribution with a continuous one…
  → Add or subtract 0.5 at each discrete endpoint so the whole bar is included
5) Since P(preemie) = .1 which is not close to .5, is n
large enough?
np = 250(.1) = 25 > 10
n(1 − p) = 250(.9) = 225 > 10
→ We can use normal to approximate binomial
6) Use a normal distribution to estimate the
probability that between 15 and 30 preemies,
inclusive, are born in the 250 randomly selected
babies.
Binomial: P(15 ≤ X ≤ 30)
→ Normal (w/ cont. correction): P(14.5 < X < 30.5)
P(14.5 < X < 30.5) = normalcdf(14.5, 30.5, 25, 4.743) = .8634
7) How does the normal answer compare to the
binomial answer?
Pretty darn close!
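A side-by-side check of the two answers, sketched in Python 3.8+ with `statistics.NormalDist` standing in for normalcdf and a direct sum standing in for binomcdf:

```python
import math
from statistics import NormalDist

n, p = 250, 0.1
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# Normal estimate with continuity correction: 15 to 30 inclusive becomes 14.5 to 30.5
normal_estimate = approx.cdf(30.5) - approx.cdf(14.5)

# Exact binomial probability: add P(X = k) for k = 15, ..., 30
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, 31))

print(round(normal_estimate, 4), round(exact, 4), round(abs(exact - normal_estimate), 4))
```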
Estimate each probability using the normal distribution:
a) What is the probability that less than 20 preemies are
born out of the 250 babies?
b) What is the probability that at least 30 preemies are
born out of the 250 babies?
c) What is the probability that less than 35 preemies but
more than 20 preemies are born out of the 250 babies?
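The main step in these exercises is turning the wording into whole-number counts before applying the continuity correction. A setup sketch in Python 3.8+ (the variable names are illustrative, and the answers are left for the code to print):

```python
import math
from statistics import NormalDist

n, p = 250, 0.1
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# a) "less than 20" means counts up to 19, so the bar for 19 ends at 19.5
p_a = approx.cdf(19.5)

# b) "at least 30" means counts from 30 up, so the bar for 30 starts at 29.5
p_b = 1 - approx.cdf(29.5)

# c) "more than 20 but less than 35" means counts 21 through 34: 20.5 to 34.5
p_c = approx.cdf(34.5) - approx.cdf(20.5)

for label, value in (("a", p_a), ("b", p_b), ("c", p_c)):
    print(label, round(value, 4))
```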