Transcript Probability Distributions
Normal Distribution
Introduction
Compare to Discrete Variables
• No. of Doctor’s Visits During the Year

No. of Visits   No. of Patients   P
0               400               0.14
1               950               0.34
2               850               0.30
3+              600               0.21
Total           2800              0.99
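The P column can be reproduced by dividing each patient count by the total. A minimal Python sketch, assuming the rows pair up as 0 visits = 400 patients, 1 = 950, 2 = 850 and 3+ = 600 (the variable names are illustrative only):

```python
# Patient counts by number of doctor's visits (row pairing is an assumption).
counts = {"0": 400, "1": 950, "2": 850, "3+": 600}

total = sum(counts.values())  # total number of patients

# Probability of each event = its count divided by the total count.
probs = {visits: n / total for visits, n in counts.items()}

for visits, p in probs.items():
    print(f"{visits:>2} visits: P = {p:.2f}")

# Unrounded, the probabilities add to 1.
print(f"sum of probabilities = {sum(probs.values()):.2f}")
```

The printed column adds to 0.99 only because each entry is rounded to two places; the unrounded probabilities add to 1.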
Histograms
• The height of each bar represents the probability of that event.
• If each bar is one unit in width, then the area also equals the probability.
• The total area under all the bars has to add to 1.
[Figure: bar chart "Doctor's Visits"; horizontal axis: No. of Visits (0, 1, 2, 3+); vertical axis ticks from 0 to 1000]
Continuous Variables
Patient’s Weight   Frequency
300                2
290                3
280                7
But… Can take on any value
• Can make the weight intervals as small as we want: every 10 lbs or 5 or 1, or … 0.5, 0.1, 0.001
• Histogram: As the intervals get smaller, the bars decrease in width
Line Graph
• Completely continuous, no width at all. Just connected points
[Figure: line graph; vertical axis ticks from 0 to 100]
Infinitesimally Small Intervals
• Then really just points on a smooth curve.
• We can also have n, the number of cases, increase to infinity.
• The total probability is still one.
Infinitesimally Small Intervals
Smoother curve. Area under the curve = 1.
Probabilities
• Can no longer read the probability of a single event.
• In a continuous distribution, can only measure the probability of a value falling within some range
Probability Within a Range
Probability of a value falling within the range is equal to the area under the curve.
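To make "probability = area under the curve" concrete, here is a rough Python sketch that adds up thin slices under a curve. It assumes the bell-shaped normal curve as the example; the function names and the number of slices are illustrative choices, not part of the lecture.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def area_under_curve(a, b, mean=0.0, sd=1.0, steps=10_000):
    """Approximate P(a < X < b) by summing thin trapezoids between a and b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mean, sd) + normal_pdf(b, mean, sd))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mean, sd)
    return total * h

# Area over a narrow range versus (nearly) the whole curve.
print(round(area_under_curve(-1, 1), 3))
print(round(area_under_curve(-8, 8), 3))
```

The second call covers practically the whole curve, so its area comes out at essentially 1.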
Bad News
• To calculate the area under the curve we would need to use calculus.
• But the news is not so bad: others have done the calculations and set up tables for us.
• Applause
Diversity of Continuous Distributions
• Lots of different distributions
• Lots of different shaped curves
• Would need lots of different tables, however…
The Most Important Distribution
Introducing the Normal Distribution: the “Bell-Shaped Curve”
What are its characteristics?
Normal Distribution
• First described in 1733, by Abraham de Moivre.
• Much of the relevant math was done by Carl Friedrich Gauss, hence the name “Gaussian Curve”
Properties
• Symmetrical about the mean
• Mean, Median & Mode are all equal
• Asymptotic: the height never reaches zero.
• What’s the total area under the curve?
Ranges & Probabilities
• 50% of all values fall above the mean & 50% below it.
• All probabilities depend on how far the values lie from the mean
• Distance is measured in number of standard deviations from the mean
Probabilities related to S.D.
One S.D. on either side of the mean Area =
Other Distances
• 1 S.D. on either side of the mean includes 68% of the cases
• 2 S.D. on either side of the mean includes 95% of the cases
• 3 S.D. on either side of the mean includes 99.7% of the cases
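These three percentages can be checked without a table. A short Python sketch, using the standard identity that the area within k standard deviations of the mean equals erf(k / √2); the function name is illustrative:

```python
import math

def prob_within(k):
    """P(mean - k*SD < X < mean + k*SD) for any normal distribution."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} S.D.: {prob_within(k):.3f}")
```

The result does not depend on the particular mean or S.D., which is why the same percentages apply to every normal distribution.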
Many Different Normal Distributions
• Determined by their mean and standard deviation
• Mean gives location.
• Standard Deviation gives shape: more or less dispersed.
Proportions remain Same
• Relationships between probability and standard deviation are the same in all Normal Distributions
• However, in order to use the tables provided, we have to convert to the “Standard Normal Distribution”
The Standard Normal Distribution
Mean = 0. Standard Deviation = 1.
Z-values
• Converts values in any normal distribution to the standard normal distribution.
• It’s a way to express the distance from the mean in units of S.D.
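• Z = (X – X̄) / S.D., where X̄ is the mean
• Compare this to 18 eggs. How many dozen? 18/12 = 1.5 dozen. In the same way, dividing the distance from the mean by the S.D. gives the distance in units of S.D.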
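The Z conversion is a single division, just like converting eggs to dozens. A minimal Python sketch; the function name and the example value of 165 are hypothetical, chosen only to mirror the egg arithmetic:

```python
def z_value(x, mean, sd):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mean) / sd

# Egg analogy: 18 eggs expressed in units of a dozen (12 eggs).
print(18 / 12)                # 1.5 dozen

# Hypothetical numbers: a value of 165 where the mean is 150 and the S.D. is 10.
print(z_value(165, 150, 10))  # 1.5 standard deviations above the mean
```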
From Z find Probabilities
Use Table A-3. Gives areas in the upper tail of the S.N.D.
What is the area above Z = 1.28?
• Go to the Table.
• Go to 1.2 in the left-hand column & across to 0.08.
• A = 0.10.
• The probability that a value will fall above Z = 1.28 is 10%.
S.N.D. mean = 0. S.D. = 1
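The table lookup can be cross-checked in Python. This sketch assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `upper_tail_area` is illustrative.

```python
from statistics import NormalDist

snd = NormalDist(mu=0.0, sigma=1.0)  # standard normal: mean 0, S.D. 1

def upper_tail_area(z):
    """Area under the standard normal curve above z (what the table lists)."""
    return 1.0 - snd.cdf(z)

# Row 1.2, column 0.08 of the table corresponds to Z = 1.28.
print(round(upper_tail_area(1.28), 3))  # about 0.100
```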
Test It
• Let’s look up the ones we already know.
• Range = 1 S.D. on either side of the mean
• Z = 1. Find 1.0 in the left-hand column
• Go across to 0.00
• Reads 0.159. So area in the tail is 0.159.
• What’s the area between Z = 1 and the mean?
Always draw the N.D.
A = .159
If Area above z = 1 is 0.159, what is the area between Z and the mean?
A = 0.500 – 0.159 = 0.341.
We need to add an equal area on the other side of the mean.
Total shaded area = 0.682.
You Try It
• What is the probability that a value will fall within 2 s.d. of the mean?
• Draw the N.D
• Look up area that corresponds to Z = 2.
• A = 0.023
• Find the area between mean & Z = 2.
• 0.500 – 0.023 = 0.477
• Double it. A = 0.954
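The same three steps (look up the tail, subtract from 0.500, double) can be sketched in Python. `statistics.NormalDist` stands in for the printed table here, and the function name is illustrative.

```python
from statistics import NormalDist

snd = NormalDist()  # standard normal: mean 0, S.D. 1

def prob_within_z(z):
    """Probability of falling within z standard deviations of the mean."""
    tail = 1.0 - snd.cdf(z)   # step 1: area in the upper tail
    half = 0.5 - tail         # step 2: area between the mean and z
    return 2.0 * half         # step 3: add the equal area on the other side

for z in (1, 2):
    print(f"within {z} S.D.: {prob_within_z(z):.3f}")
```

Because the table entries are rounded to three places, a hand calculation can differ from the computed value in the last digit.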
Try the Reverse
• I want to find the value above which 10% of the population falls.
• This time, area = 0.100
• Look in body of table for 0.100
• Read across and up. Z = 1.28
• Would have to use the formula for Z in reverse in order to get the value for X
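Reading the table "in reverse" corresponds to an inverse lookup. A small Python sketch, assuming Python 3.8+ for `statistics.NormalDist.inv_cdf`; the function name is illustrative.

```python
from statistics import NormalDist

def z_for_upper_tail(area):
    """Z above which the given fraction of the standard normal lies."""
    # inv_cdf takes the area BELOW the value, so convert the upper-tail area first.
    return NormalDist().inv_cdf(1.0 - area)

print(round(z_for_upper_tail(0.10), 2))  # 1.28
```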
Finding X
Z = (X – X̄) / S.D.
1.28 = (X – X̄) / S.D.
1.28 × S.D. = X – X̄
(1.28 × S.D.) + X̄ = X
To convert to X, have to know the mean (X̄) & S.D.
Example
• Weights of 40-yr old women are normally distributed with a mean of 150 and an S.D. of 10.
• What is the value above which the highest 10% of weights falls?
• X = (1.28 × S.D.) + mean = (1.28 × 10) + 150 = 162.8
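As a check on the arithmetic, the conversion from Z back to X can be sketched in Python using the example's mean of 150 and S.D. of 10. The cross-check with `statistics.NormalDist` (Python 3.8+) is an addition for illustration.

```python
from statistics import NormalDist

mean, sd = 150, 10
z = 1.28  # Z with 10% of the area above it, from the table

# Z formula in reverse: multiply by the S.D. first, then add the mean.
x = z * sd + mean
print(round(x, 1))  # 162.8

# Cross-check: the value with 90% of the distribution below it.
print(round(NormalDist(mean, sd).inv_cdf(0.90), 1))  # 162.8
```

The order of operations matters: the S.D. multiplies Z, and the mean is added afterwards.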
Application
• Studying a progressive neurological disorder. At autopsy, we weigh the brains. Find the wts are normally distributed with a mean of 1100 grams and an S.D. of 100 g.
• Find the probability that one of the brains weighs less than 800 g.
Draw the N.D.
[Figure: normal curve with 800 and 1100 marked on the horizontal axis]
Z = (800 – 1100)/100 = -3
P(X < 800) = Area = 0.001
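The lower-tail area for the 800 g cutoff used in the calculation above can be verified in Python. The sketch assumes Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

brains = NormalDist(mu=1100, sigma=100)  # brain weights in grams

cutoff = 800
z = (cutoff - 1100) / 100   # distance from the mean in S.D. units
p = brains.cdf(cutoff)      # area below the cutoff

print(z)            # -3.0
print(round(p, 4))  # 0.0013
```

By symmetry, the area below Z = -3 equals the upper-tail area that the table lists for Z = 3.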
The End
For Now
More Ranges
• The cholesterol levels for a certain population are approximately normally distributed with a mean of 200 mg/100 ml & an S.D. of 20 mg/100 ml.
• Find the probabilities for an individual picked at random to have cholesterol levels in the following ranges
Mean = 200 mg/100 ml. S.D. = 20 mg/100 ml.
A. Between 180 & 200
B. Greater than 225
C. Between 190 & 210
A. Between 180 & 200
• Z1 = 0. Z2 = (180 – 200)/20 = -1. So the area is from the mean to one S.D. below it.
If it were both sides, the area would be 0.68. Since it is only one side, the area is half of that. P = 0.34.
B. Greater than 225
• Z = (225 – 200)/20 = 1.25
• Look it up. Area = 0.106
• P(X>225) = 0.106
C. Between 190 & 210
• Z1 = (190 – 200)/20 = -0.5. Look up 0.5 in the table: area = 0.309. But that is the tail. What is the area from Z = -0.5 to the mean? 0.500 – 0.309 = 0.191
• Z2 = 0.5. Symmetrical. So Z2 to the mean is also 0.191.
• P = 2 times 0.191 = 0.382
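All three cholesterol ranges can be checked together in Python. This is a sketch assuming Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

chol = NormalDist(mu=200, sigma=20)  # cholesterol in mg/100 ml

# A. Between 180 & 200: area from one S.D. below the mean up to the mean.
p_a = chol.cdf(200) - chol.cdf(180)

# B. Greater than 225: upper-tail area above Z = 1.25.
p_b = 1.0 - chol.cdf(225)

# C. Between 190 & 210: area within half an S.D. on either side of the mean.
p_c = chol.cdf(210) - chol.cdf(190)

print(f"A: {p_a:.3f}")
print(f"B: {p_b:.3f}")
print(f"C: {p_c:.3f}")
```

Computed values can differ from the table-based answers in the third decimal place, because the table entries are themselves rounded.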