Standard Normal Calculations


What you’ll learn

Properties of the standard normal distribution

How to transform scores into standard normal distribution scores

Determine the proportion of observations above, below and between two stated numbers in a normal distribution.

Calculate the point for a variable with a normal distribution for which a stated proportion of values lie either above or below.

Comparing individuals from different distributions

Standard Normal Distribution

The Standard Normal Distribution (also known as the “z-distribution”) is N(0, 1).

[Figure: standard normal density curve N(0, 1), x-axis from -3 to 3, y-axis from 0 to 0.5]
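As a rough check on the shape of the curve, the height of the N(0, 1) density can be computed directly. This is a minimal sketch; the function name standard_normal_density is made up for illustration and is not part of the slides.

```python
import math


def standard_normal_density(x: float) -> float:
    """Height of the N(0, 1) density at x: exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Heights at the tick marks on the plot's x-axis (-3 through 3).
heights = {x: standard_normal_density(x) for x in range(-3, 4)}
for x, h in heights.items():
    print(f"x = {x:+d}: density = {h:.4f}")
```

The curve peaks at x = 0 with a height just under 0.4 and is symmetric about 0, which matches the plot.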

Standardizing Scores

We find that all normal distributions are the same if we measure in units of σ.

We

Using the Standard Normal Distribution

The level of cholesterol in the blood is important because high cholesterol levels may increase the risk of heart disease. We know that the distribution of blood cholesterol levels in a large population of people of the same age and sex is roughly normal. For 14-year-old boys, the mean is μ = 170 mg/dl and the standard deviation is σ = 30 mg/dl. Levels above 240 mg/dl may require medical attention.

Steps to solving a “normal” distribution problem.

Step 1: Write the question as a probability statement.

Step 2: Calculate a z-score. Draw a picture and shade the region.

Step 3: Find the appropriate region using a standard normal table.

Step 4: Write the answer in the context of the problem.

    

Question with Area below

What percent of 14-year-old boys have less than 160 mg/dl of cholesterol?

Step 1 (probability statement): P(X < 160)

Step 2: (z-score)

$z = \frac{X - \mu}{\sigma} = \frac{160 - 170}{30} \approx -0.33$

[Figure: standard normal density curve, x-axis from -3 to 3, with z = -0.33 marked]

Since we want the percent of boys whose cholesterol is less than 160, we will find the percent of boys whose cholesterol is 0.33σ or more below the mean.

Step 3: (Area from Table A) We can now use Table A to find the percent of observations below -0.33. (Remember that Table A always gives the area under the curve below a given value.)

Table A (excerpt)

Z       .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-0.5   .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
-0.4   .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
-0.3   .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
-0.2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
 0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359

 

Step 3 (cont.) The area under the curve (the proportion of observations) below -0.33σ is .3707.

Step 4: (Context) The percent of 14-year-old boys whose cholesterol level is less than 160 mg/dl is approximately 37.07%.
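The table lookup can be cross-checked in software. The sketch below assumes Python's standard-library statistics.NormalDist stands in for Table A; the variable names are illustrative only. The z-score is rounded to two decimals to mimic the table lookup.

```python
from statistics import NormalDist

MU, SIGMA = 170.0, 30.0  # mean and standard deviation from the example (mg/dl)

# Step 2: z-score, rounded to two decimals as when reading Table A.
z = round((160.0 - MU) / SIGMA, 2)

# Step 3: area under N(0, 1) to the left of z.
area_below = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(X < 160) is about {area_below:.4f}")
```

Using the unrounded z-score instead would give a slightly different area, because Table A only resolves z to two decimal places.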

   

Question with Area above

What percent of 14-year-old boys have more than 240 mg/dl of cholesterol?

Step 1 (probability statement): P(X > 240)

Step 2: (z-score)

$z = \frac{X - \mu}{\sigma} = \frac{240 - 170}{30} \approx 2.33$

[Figure: standard normal density curve, x-axis from -3 to 3, with z = 2.33 marked]

Since we want the percent of boys whose cholesterol is greater than 240, we will find the percent of boys whose cholesterol is 2.33σ or more above the mean.

Step 3: (Area from Table A) We can now use Table A to find the percent of observations below 2.33 (below because that’s what our table gives us).

Table A (excerpt)

Z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974

Step 3: Area (continued)

The value from the table is .9901. We need to remember that the table gives us the area below a value. Since the total area under the curve is 1, to find the area above we subtract the area from the table from 1. So 1 - .9901 = .0099

Step 4: (context)

The percent of 14-year-old boys whose cholesterol level is more than 240 mg/dl is approximately 0.99%.
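The "subtract from 1" step can be sketched as follows, again assuming the standard-library statistics.NormalDist in place of Table A and using the rounded z-score of 2.33 from Step 2. Variable names are illustrative.

```python
from statistics import NormalDist

z = 2.33  # z-score for 240 mg/dl, as found in Step 2

area_below = NormalDist().cdf(z)  # what Table A reports: area to the left of z
area_above = 1.0 - area_below     # complement, since the total area is 1

print(f"below: {area_below:.4f}, above: {area_above:.4f}")
```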

    

Question between two values

What percent of 14-year-old boys have cholesterol levels between 170 mg/dl and 240 mg/dl?

Step 1 (probability statement): P(170 < X < 240)

Step 2: (z-scores; we need to find z-scores for both endpoints)

$z = \frac{X - \mu}{\sigma} = \frac{170 - 170}{30} = 0$

$z = \frac{X - \mu}{\sigma} = \frac{240 - 170}{30} \approx 2.33$

[Figure: standard normal density curve, x-axis from -3 to 3, with z = 0 and z = 2.33 marked]

Since we want the percent of boys whose cholesterol is between 170 mg/dl and 240 mg/dl, we will find the percent of boys whose cholesterol is between 0σ and 2.33σ above the mean.

Step 3: (Area from Table A) We can now use Table A to find the area below z = 2.33 and the area below z = 0.00 (below because that’s what our table gives us).

Table A (excerpt)

Z       .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-0.2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
 0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
 0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
 ...
 2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
 2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
 2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
 2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
 2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
 2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964

Step 3: Area (continued) The values from the table are .9901 for the z-score of 2.33 and .5000 for the z-score of 0. We need to remember that the table gives us the area below a value. We can take the area below 2.33 (.9901) and subtract the area below 0 (.5000) to get the area between.

So: .9901 - .5000 = .4901

[Figure: standard normal density curve, x-axis from -3 to 3, with the region between z = 0 and z = 2.33 marked]

Step 4: (Context) The percent of 14-year-old boys whose cholesterol is between 170 mg/dl and 240 mg/dl is approximately 49.01%.
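The area between two z-scores is the difference of two "area below" values. A minimal sketch, assuming the standard-library statistics.NormalDist in place of Table A; the helper name area_between is made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)


def area_between(z_low: float, z_high: float) -> float:
    """Area under N(0, 1) between z_low and z_high (z_low <= z_high)."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# z-scores for 170 mg/dl and 240 mg/dl from Step 2.
area = area_between(0.0, 2.33)
print(f"P(170 < X < 240) is about {area:.4f}")
```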

Finding the value of the variable when we know the percent above or below

 What cholesterol level do the top 10% of 14 year-old boys have?

Step 1: Write a probability statement: P(X > x) = .10. This statement says: we want to find the value that separates the top 10% from the bottom 90% of the curve.

[Figure: standard normal density curve, x-axis from -3 to 3]

Since our table gives the area below a value, we will find a z-score that corresponds to 90% of the area.

Step 2: Find the z-score from the table. Remember that the area is located on the “inside” of the table. Since the z-score that we are looking for is above the mean, we know the z-score will be positive. We’ll look for a value close to .9000.

Standard Normal Probability Distribution

Z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633

The closest value is .8997, so we will use a z-score of 1.28

Step 3: Using the z-score found, use the standardizing formula, substituting the three known values.

$z = \frac{X - \mu}{\sigma}$

Now, using algebra, solve the equation for X:

$1.28 = \frac{X - 170}{30}$

$(30)(1.28) = X - 170$

$(30)(1.28) + 170 = X$

$208.40 = X$

Step 4: Write a statement back in context. A 14-year-old boy’s cholesterol level must be at least 208.40 mg/dl to be in the top 10% of cholesterol levels.
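Working backwards from an area to a value can also be sketched in code. The example below assumes the standard-library statistics.NormalDist, whose inv_cdf method plays the role of reading Table A "inside out"; it shows both the table-based answer (z = 1.28) and the unrounded one. Variable names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 170.0, 30.0  # mean and standard deviation from the example (mg/dl)

# z-score with 90% of the area below it (the table's closest entry gives 1.28).
z_exact = NormalDist().inv_cdf(0.90)

# Un-standardize: X = mu + sigma * z.
x_table = MU + SIGMA * 1.28     # using the z-score read from the table
x_exact = MU + SIGMA * z_exact  # using the unrounded z-score

print(f"table z = 1.28 -> X = {x_table:.2f} mg/dl")
print(f"exact z = {z_exact:.4f} -> X = {x_exact:.2f} mg/dl")
```

The two answers differ only in the second decimal place, because the table resolves z to two decimals.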

Comparing Individuals

  One of the best reasons to standardize values (find their corresponding z-scores) is to be able to compare individuals from different distributions.

Consider again the three baseball players that we looked at earlier in the year:

Player         Batting average
Ty Cobb        .420
Ted Williams   .406
George Brett   .390

How can we compare the batting averages of these players when they played in different eras under different conditions? Was Ty Cobb actually the best hitter of these three? Let’s find out.

Comparing Individuals (Cont.)

We know that batting averages are quite symmetric and reasonably normal with the following characteristics for each era:

Decade   Mean   Std Dev
1910s    .266   .0371
1940s    .267   .0326
1970s    .261   .0317

Now, using that information, find the corresponding z-score for each player.

Ty Cobb

$z = \frac{X - \mu}{\sigma} = \frac{.420 - .266}{.0371} \approx 4.15$

Ted Williams

$z = \frac{X - \mu}{\sigma} = \frac{.406 - .267}{.0326} \approx 4.26$

George Brett

$z = \frac{X - \mu}{\sigma} = \frac{.390 - .261}{.0317} \approx 4.07$

Now that we have standardized each score onto the standard normal curve, we can compare the scores of these three individuals. Since, in this case, a larger value indicates a better batting average, it appears that Ted Williams is the best batter of these three. 4.26 > 4.15 > 4.07
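The comparison can be reproduced with a few lines of code. This is a minimal sketch using only the averages, means and standard deviations given above; the dictionary layout and variable names are illustrative choices.

```python
# (batting average, era mean, era standard deviation) for each player
players = {
    "Ty Cobb": (0.420, 0.266, 0.0371),       # 1910s
    "Ted Williams": (0.406, 0.267, 0.0326),  # 1940s
    "George Brett": (0.390, 0.261, 0.0317),  # 1970s
}

# Standardize each average against its own era: z = (X - mu) / sigma.
z_scores = {
    name: (avg - mean) / sd for name, (avg, mean, sd) in players.items()
}

# The largest z-score is the average furthest above its era's mean.
best = max(z_scores, key=z_scores.get)

for name, z in sorted(z_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: z = {z:.2f}")
print(f"Highest standardized average: {best}")
```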

Additional Resources

Practice of Statistics, Pg 83-97