Transcript Slide 1

t-Tests

Interval Estimation and the t Distribution

Large Sample z-Test

• Sometimes we have reason to test hypotheses involving specific values for the mean.

– Example 1. Claim: On average, people sleep less than the often recommended eight hours per night.

– Example 2. Claim: On average, people drink more than the recommended 2 drinks per day.

– Example 3. Claim: On average, women take more than 4 hours to run the marathon.

• However, it is rare that we have a specific hypothesis about the standard deviation of the population under study.

• For these situations, we can use the sample standard deviation s as an estimator for the population standard deviation σ.

• If the sample size is pretty big (e.g., >100), then this estimate is pretty good, and we can just use the standard z test.

3 PSYC 6130, PROF. J. ELDER

Example: Canadian General Social Survey, Cycle 6 (1991)

But what if we don’t have such a large sample?


Student’s t Distribution

• Problem: for small n, s is not a very accurate estimator of σ.

• The result is that the computed z-score will not follow a standard normal distribution.

• Instead, the standardized score will follow what has become known as the Student’s t distribution:

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \quad \text{where } s_{\bar{X}} = \frac{s}{\sqrt{n}}$$
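A minimal Python sketch of the standardized score defined above. The sleep durations are made-up values for illustration only (they are not survey data), and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for testing H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample SD (divides by n - 1)
    se = s / math.sqrt(n)             # estimated standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical hours of sleep for 8 respondents (illustrative only).
hours = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 7.0, 6.5]
t, df = one_sample_t(hours, mu0=8.0)
print(f"t({df}) = {t:.2f}")
```

The resulting t would then be compared against a critical value from the t distribution with n - 1 degrees of freedom rather than against the standard normal table.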

Student’s t Distribution

[Figure: normal distribution compared with t distributions for n=2 (df=1), n=10 (df=9), and n=30 (df=29)]

How would you describe the difference between the normal and t distributions?

Student’s t distribution

• Student’s t distribution is leptokurtic
– More peaked (relative to a normal distribution with the same variance)
– Fatter tails

• What would happen if we were to ignore this difference, and use the standard normal table for small samples?

Student’s t Distribution

• Critical t values decrease as df increases

• As df → infinity, critical t values → critical z values

• Using the standard normal table for small samples would result in an inflated rate of Type I errors.
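The inflation of Type I errors can be checked with a small simulation. The sketch below is an illustration under stated assumptions: it draws samples from a standard normal population (so the null hypothesis is true), computes the t statistic for each, and counts how often it exceeds the two-tailed z critical value. The function name, trial count, and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def type1_rate(n, crit, trials=20000, seed=0):
    """Fraction of true-null samples of size n whose |t| exceeds crit."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 true: mu = 0
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t = mean / math.sqrt(var / n)
        if abs(t) > crit:
            rejections += 1
    return rejections / trials

z_crit = NormalDist().inv_cdf(0.975)   # two-tailed critical z for alpha = .05
for n in (5, 10, 30):
    print(f"n = {n:2d}: rejection rate = {type1_rate(n, z_crit):.3f}")
```

The rejection rate should sit well above the nominal .05 for the smallest sample and move toward .05 as n grows.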

One-Sample t Test: Example

Reporting Results

• Respondents who report being very forgetful sleep, on average, 7.11 hours/night, significantly less than the recommended 8 hours/night, t(37) = 2.25, p < .05, two-tailed.

Confidence Intervals

• NHT allows us to test specific hypotheses about the mean.

– e.g., is μ < 8 hours?

• Sometimes it is just as valuable, or more valuable, to know the range of plausible values.

• This range of plausible values is called a confidence interval.

Confidence Intervals

• The confidence interval (CI) of the mean is the interval of values, centred on the sample mean, that contains the population mean with specified probability.

• e.g., there is a 95% chance that the 95% confidence interval contains the population mean.

• NB: This assumes a flat prior on the population mean (non Bayesian).

[Figure: a confidence interval centred on the sample mean X̄]

Confidence Intervals

[Figure: sampling distribution with standard error $s_{\bar{X}}$, centred on X̄; the central region is the 95% confidence interval, with p = .025 in each tail]

Basic Procedure for Confidence Interval Estimation

1. Select the sample size (e.g., n = 38)
2. Select the level of confidence (e.g., 95%)
3. Select the sample and collect the data (Random sampling!)
4. Calculate the limits of the interval

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}} \quad\Rightarrow\quad \mu = \bar{X} \pm t\, s_{\bar{X}}$$

$$\bar{X} - t_{\alpha/2}\, s_{\bar{X}} \;\le\; \mu \;\le\; \bar{X} + t_{\alpha/2}\, s_{\bar{X}}$$

End of Lecture 4

Oct 8, 2008
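The basic procedure for confidence interval estimation from this lecture can be sketched in Python as follows. The sample size (38) and mean (7.11 hours) are taken from the sleep example. The standard deviation of 2.44 is an assumed value that does not appear in the slides, and the critical value 2.026 is the tabled two-tailed .05 value of t for df = 37.

```python
import math

def mean_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) limits: mean +/- t_crit * s / sqrt(n)."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin

# n = 38 and mean = 7.11 hours come from the slides.
# s = 2.44 is an ASSUMED standard deviation (not given in the slides).
# 2.026 is the tabled two-tailed .05 critical t for df = 37.
lower, upper = mean_confidence_interval(mean=7.11, s=2.44, n=38, t_crit=2.026)
print(f"95% CI: [{lower:.2f}, {upper:.2f}] hours")
```

With these assumed numbers the interval lies entirely below 8 hours, which is what one would expect when the two-tailed test rejects the null hypothesis at the .05 level.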

Selecting Sample Size

• Suppose that

1. You have a rough estimate s of the standard deviation of the population, and
2. You want to do an experiment to estimate the mean within some 95% confidence interval of size W.

• Taking W as the full width of the interval, W ≈ 2 × 1.96 × s/√n ≈ 4s/√n, so roughly

$$n \approx \left(\frac{4s}{W}\right)^2$$

Assumptions Underlying Use of the t Distribution for NHT and Interval Estimation

• Same as for z test:
– Random sampling
– Variable is normal
  • CLT: Deviations from normality ok as long as sample is large.
– Dispersion of sampled population is the same as for the comparison population

Sampling Distribution of the Variance

Sampling Distribution of the Variance

• We are sometimes interested in testing a hypothesis about the variance of a population.

– e.g., is IQ more diverse in university students than in the general population?

Suppose we measure the IQ of a random sample of 13 university students. We then calculate the sample variance:

$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = 400$$

Suppose that we know the variance σ₀² of IQs in the general population:

$$\sigma_0^2 = 15^2 = 225$$

Can we conclude that student IQs are more diverse?

To solve this problem we need to know the range of plausible values for the test statistic s² under the null hypothesis.

Sampling Distribution of the Variance

• What form does the sampling distribution of the variance assume?

• If the variable of interest (e.g., IQ) is normal, the sampling distribution of the variance takes the shape of a χ-squared distribution:

[Figure: sampling distribution of s², with E(s²) = σ² marked on the s² axis]

Sample Variances and the χ-Square Distribution

We first standardize the sample variance statistic by multiplying by the degrees of freedom and dividing by the population variance σ²:

$$\chi^2 = \frac{\nu s^2}{\sigma^2} = \frac{(n-1)\,s^2}{\sigma^2}$$

The resulting variable χ² follows a χ²(ν) distribution with ν = df = n - 1.

[Figure: χ² distributions for ν = 9, ν = 29, and ν = 99; horizontal axis χ² from 0 to 150]

Sample Variances and the χ-Square Distribution

• The χ-square distribution is:
– strictly positive.
– positively skewed.

• Since the sample variance is an unbiased estimator of the population variance: E(s²) = σ²

• Due to the positive skew, the mean of the distribution E(s²) is greater than the mode.

• As the sample size increases, the distribution approaches a normal distribution.

• If the original distribution is not normal and the sample size is not large, the sampling distribution of the variance may be far from χ-square, and tests based on this assumption may be flawed.
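As a worked illustration, the standardization can be applied to the IQ example (n = 13, s² = 400, σ₀² = 225). The sketch below assumes a one-tailed test at α = .05, which the slides do not state, and hard-codes the tabled critical value of χ² for df = 12 instead of computing it.

```python
def chi_square_statistic(sample_variance, null_variance, n):
    """Standardize a sample variance: chi2 = (n - 1) * s^2 / sigma0^2."""
    df = n - 1
    return df * sample_variance / null_variance, df

# Values from the IQ example: n = 13 students, s^2 = 400, sigma0^2 = 15^2 = 225.
chi2, df = chi_square_statistic(sample_variance=400, null_variance=15**2, n=13)

# Tabled upper-tail critical value of chi-square for df = 12, alpha = .05
# (one-tailed is an assumption: the question is whether students are MORE diverse).
CHI2_CRIT_05_DF12 = 21.026

print(f"chi2({df}) = {chi2:.2f}, critical value = {CHI2_CRIT_05_DF12}")
print("reject H0" if chi2 > CHI2_CRIT_05_DF12 else "retain H0")
```

Under these assumptions the statistic falls only just beyond the critical value, so the conclusion would be sensitive to the choice of α and to the normality assumption noted above.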

Example: Height of Female Psychology Graduate Students

2005 PSYC 6130A Students (Female)

Canadian Adult Female Population: μ = 63.937 in, σ = 2.7165 in

Canadian Adult Male Population: μ = 69.252 in, σ = 3.189 in

n = 131,110!

Source: Canadian Community Health Survey Cycle 3.1 (2005). Caution: self report!

Properties of Estimators

• We have now met two statistical estimators:

– X̄ is an estimator for μ.
– s² is an estimator for σ².

Both of these estimators are:

– Unbiased, i.e., E(X̄) = μ and E(s²) = σ²
– Consistent, i.e., the quality of the estimate improves as the sample size increases.
– Efficient, i.e., given a fixed sample size, the accuracy of these estimators is better than competing estimators.

NHT for Two Independent Sample Means

Conditions of Applicability

• Comparing two samples (treated differently)

• Don’t know means of either population

• Don’t know variances of either population

• Samples are independent of each other

Example: Height of Canadian Males by Income Category (Canadian Community Health Survey, 2004)

Sampling Distribution

To solve this problem we need to know the sampling distribution for the difference of the means, i.e., $\bar{X}_1 - \bar{X}_2$.

Under the null hypothesis, both samples come from the same distribution.

Suppose this distribution is normal.

Then we know that $\bar{X}_1$ and $\bar{X}_2$ are also normally distributed:

$$\bar{X}_1 \sim N(\mu_1, \sigma_{\bar{X}_1}) = N\!\left(\mu_1, \frac{\sigma_1}{\sqrt{n_1}}\right)$$

$$\bar{X}_2 \sim N(\mu_2, \sigma_{\bar{X}_2}) = N\!\left(\mu_2, \frac{\sigma_2}{\sqrt{n_2}}\right)$$

Sampling Distribution (cntd…)

Major Theorem of Probability: Any linear combination of normal variables is itself normal.

Thus $\bar{X}_1 - \bar{X}_2$ is also normal:

$$\bar{X}_1 - \bar{X}_2 \sim N(\mu_1 - \mu_2, \sigma_{\bar{X}_1 - \bar{X}_2})$$

What is the dispersion $\sigma_{\bar{X}_1 - \bar{X}_2}$?

Basic principle for independent normal distributions: variances add.

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2}$$

Knowing the standard error for the 2 distributions, we can calculate our sampling distribution.

$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

NHT for Two Large Samples

Recall: If sample is large (e.g., n > 100), can approximate population variance by sample variance:

$$\sigma^2_{\bar{X}_1} \approx s^2_{\bar{X}_1}, \qquad \sigma^2_{\bar{X}_2} \approx s^2_{\bar{X}_2}$$

And thus we can estimate

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} \approx s^2_{\bar{X}_1} + s^2_{\bar{X}_2}$$
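A short Python sketch of the large-sample test, applied to the summary statistics reported for the height-by-income example in these slides (means of 69.87 and 69.01 inches, standard deviations of 2.63 and 2.85 inches, sample sizes of 7586 and 7777). The function name `two_sample_z` is a hypothetical choice, and the null hypothesis is assumed to be μ₁ = μ₂.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """z for H0: mu1 == mu2, estimating each standard error from its sample."""
    se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)   # squared standard errors add
    return (mean1 - mean2) / se_diff

# Summary statistics (inches) from the height-by-income example.
z = two_sample_z(69.87, 2.63, 7586, 69.01, 2.85, 7777)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.3g}")
```

Because both samples are in the thousands, even a difference of less than one inch yields a very large z; the p value underflows to zero in floating point.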

Height of Canadian Males by Income Category (Canadian Community Health Survey, 2004)

X̄ = 69.87", s = 2.63", n = 7586

X̄ = 69.01", s = 2.85", n = 7777

NHT for Two Small Samples

Example: Social Factors in Psychological Well-Being Canadian Community Health Survey, 2004

Social Factors in Psychological Well Being (cntd…) Canadian Community Health Survey, 2004

Social Factors in Psychological Well Being (cntd…) Canadian Community Health Survey, 2004: Respondents who report never getting along with others

NHT for Two Small Independent Samples

By analogy with one-sample NHT, we might approximate the standard errors $\sigma_{\bar{X}_1}$ and $\sigma_{\bar{X}_2}$ by the sample standard errors $s_{\bar{X}_1}$ and $s_{\bar{X}_2}$.

Unfortunately, the resulting sampling distribution of the difference of the means is not straightforward to analyze.

So what do we do?

If we can assume homogeneity of variance (the two populations have the same variance), the sampling distribution of the difference is simple to analyze.

NHT for Two Small Independent Samples (cntd…)

If both populations have the same variance, we want to use both samples simultaneously to get the best possible estimate of this variance.

In general, recall that

$$s^2 = \frac{SS}{n - 1}$$

Thus our formula for the pooled variance is

$$s_p^2 = \frac{SS_1 + SS_2}{(n_1 - 1) + (n_2 - 1)} = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$$

And the (squared) sample standard error is

$$s^2_{\bar{X}_1 - \bar{X}_2} = \frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}$$

and

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

follows a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Pooled Variance

Pooled variance is

$$s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} = \frac{df_1\, s_1^2 + df_2\, s_2^2}{df_1 + df_2}$$

Note that the pooled variance is a weighted average of the sample variances.

The weights are proportional to the degrees of freedom, and hence essentially to the size, of each sample (Bigger samples are more reliable estimators of the common variance)

[Figure: number line showing $s_p^2$ lying between $s_1^2$ and $s_2^2$, with the two segments labelled $df_2$ and $df_1$]

Social Factors in Psychological Well Being (cntd…) Canadian Community Health Survey, 2004: Respondents who report never getting along with others

Women: X̄ = 36.59, s = 22.40, n = 37

Men: X̄ = 41.84, s = 23.87, n = 25

Reporting the Result

No significant difference was found between the psychological well-being of men (M = 41.8, SD = 23.9) and women (M = 36.6, SD = 22.4), t(60) = 0.88, p = .38.
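The reported t value can be reproduced from the summary statistics with a short Python sketch of the pooled-variance test. The function name `pooled_t` is hypothetical; the means, standard deviations, and sample sizes are the ones given for men and women in this example.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled-variance t statistic and df for H0: mu1 == mu2."""
    df1, df2 = n1 - 1, n2 - 1
    pooled_var = (df1 * s1**2 + df2 * s2**2) / (df1 + df2)   # df-weighted average
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se_diff, df1 + df2

# Men: M = 41.84, s = 23.87, n = 25; women: M = 36.59, s = 22.40, n = 37.
t, df = pooled_t(41.84, 23.87, 25, 36.59, 22.40, 37)
print(f"t({df}) = {t:.2f}")
```

The p value would still need to be looked up in a t table or obtained from statistical software, since the Python standard library has no t distribution.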

Confidence Intervals for the Difference Between Two Means

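$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} \quad\Rightarrow\quad \mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) - t\, s_{\bar{X}_1 - \bar{X}_2}$$

so the confidence interval is

$$\mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm t_{crit}\, s_{\bar{X}_1 - \bar{X}_2}$$

[Figure: distribution $p(\mu_1 - \mu_2)$ centred on $\bar{X}_1 - \bar{X}_2$ with standard error $s_{\bar{X}_1 - \bar{X}_2}$; the 95% confidence interval extends $t_{.025}$ standard errors to either side, leaving p = .025 in each tail]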

Underlying Assumptions

• Dependent variable measured on interval or ratio scale.

• Independent random sampling
– (independence within and between samples)
– In experimental work, often make do with random assignment.

• Normal distributions
– Moderate deviations ok due to CLT.

• Homogeneity of Variance
– Only critical when sample sizes are small and different.

End of Lecture 5

Oct 15, 2008


Separate Variances t Test

• If
– Population variances are different (suggested by substantially different sample variances) AND
– Samples are small AND
– Sample sizes are substantially different

• Then
– Pooled variance t statistic will not be correct.

• In this case, use separate variances t test

Separate Variances t Test

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \quad \text{where } s^2_{\bar{X}_1 - \bar{X}_2} = s^2_{\bar{X}_1} + s^2_{\bar{X}_2}$$

• This statistic is well-approximated by a t distribution.

• Unfortunately, calculating the appropriate df is difficult.

• SPSS will calculate the Welch-Satterthwaite approximation for df as part of a 2-sample t test:

$$df = \frac{\left(s^2_{\bar{X}_1} + s^2_{\bar{X}_2}\right)^2}{\dfrac{s^4_{\bar{X}_1}}{df_1} + \dfrac{s^4_{\bar{X}_2}}{df_2}}$$
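The separate-variances statistic and the Welch-Satterthwaite df can be computed by hand from summary statistics. The sketch below applies the two formulas above to the well-being example (men: M = 41.84, s = 23.87, n = 25; women: M = 36.59, s = 22.40, n = 37); `welch_t` is a hypothetical helper name, and the output is not claimed to match any particular SPSS printout.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Separate-variances t and Welch-Satterthwaite df for H0: mu1 == mu2."""
    v1 = s1**2 / n1            # squared standard error of mean 1
    v2 = s2**2 / n2            # squared standard error of mean 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t, df = welch_t(41.84, 23.87, 25, 36.59, 22.40, 37)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The approximate df is generally not an integer and is smaller than the pooled-variance value of n₁ + n₂ - 2 = 60, which makes the test slightly more conservative.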


Summary: t-Tests for 2 Independent Sample Means

n1, n2 > 100   n1 = n2   s1 ≈ s2   Test statistic   s²(X̄1 - X̄2)       df
no             no        no        t                s1²/n1 + s2²/n2    Welch-Satterthwaite
no             no        yes       t                sp²/n1 + sp²/n2    n1 + n2 - 2
no             yes       no        t                s1²/n1 + s2²/n2    n1 + n2 - 2
no             yes       yes       t                s1²/n1 + s2²/n2    n1 + n2 - 2
yes            no        no        z                s1²/n1 + s2²/n2    NA
yes            no        yes       z                s1²/n1 + s2²/n2    NA
yes            yes       no        z                s1²/n1 + s2²/n2    NA
yes            yes       yes       z                s1²/n1 + s2²/n2    NA

More on Homogeneity of Variance

• How do we decide if two sample variances are different enough to suggest different population variances?

• Need NHT for homogeneity of variance.

– F-test
  • Straightforward
  • Sensitive to deviations from normality
– Levene’s test
  • More robust to deviations from normality
  • Computed by SPSS

Levene’s Test: Basic Idea

1. Replace each score $X_{1i}$, $X_{2i}$ with its absolute deviation from the sample mean:

$$d_{1i} = |X_{1i} - \bar{X}_1|, \qquad d_{2i} = |X_{2i} - \bar{X}_2|$$

2. Now run an independent samples t-test on $d_{1i}$ and $d_{2i}$:

$$t = \frac{\bar{d}_1 - \bar{d}_2}{s_{\bar{d}_1 - \bar{d}_2}}$$

SPSS reports an F statistic for Levene’s test

• Allows the homogeneity of variance for two or more variables to be tested.

• We will introduce the F distribution later in the term.
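The two steps of Levene's test can be sketched in Python as follows. The scores are hypothetical, chosen so that the second group is visibly more spread out, and `levene_t` is an illustrative helper name. The sketch assumes the pooled-variance form of the t-test on the absolute deviations; for two groups, the square of that t is an F statistic with 1 numerator degree of freedom.

```python
import math
import statistics

def levene_t(group1, group2):
    """Pooled-variance t on absolute deviations from each group's own mean."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    d1 = [abs(x - m1) for x in group1]      # step 1: absolute deviations
    d2 = [abs(x - m2) for x in group2]
    n1, n2 = len(d1), len(d2)
    pooled_var = ((n1 - 1) * statistics.variance(d1)
                  + (n2 - 1) * statistics.variance(d2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    t = (statistics.mean(d1) - statistics.mean(d2)) / se_diff   # step 2
    return t, n1 + n2 - 2

# Hypothetical scores: group_b is far more variable than group_a.
group_a = [48, 50, 51, 49, 52, 50, 47, 53]
group_b = [30, 70, 45, 62, 38, 55, 25, 75]
t, df = levene_t(group_a, group_b)
print(f"t({df}) = {t:.2f}; F(1, {df}) = t^2 = {t * t:.2f}")
```

A large |t| here indicates that the typical deviation from the mean differs between the groups, which is evidence against homogeneity of variance.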

The Matched t Test

Independent or Matched?

• Application of the Independent-Groups t test depended on independence both within and between groups.

• There are many cases where it is wise, convenient or necessary to use a matched design, in which there is a 1:1 correspondence between scores in the two samples.

• In this case, you cannot assume independence between samples!

• Examples:
– Repeated-subject designs (same subjects in both samples).
– Matched-pairs designs (attempt to match possibly important attributes of subjects in two samples)

Example: Assignment Marks

Marks (%) for 23 students on assignments A3 and A4:

A3 A4 72 70 69 83 80 80 93 88 88 88 87 88 93 93 88 93 85 100 85 100 70 80 72 60 90 80 83 81 68 36 75 83 75 93 80 100 65 83 65 41 83 75 73 68 88 75

        A3   A4
Mean    73   86
SD      14    8
n       23   23

[Figure: scatter plot of the paired marks; axis label "Assignment 1 Mark (%)", range 40 to 100]

These scores are not independent!

Better alternative: The matched t-test using the direct difference method

        A3   A4   A4-A3
Mean    73   86   13
SD      14    8   13
n       23   23   23

A4-A3: 20 18 18 34 15 7 8 10 24 5 5 5 1 5 15 15 10 18 20 -8 2 7 57

$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \quad\rightarrow\quad t = \frac{\bar{D} - \mu_0}{s_D / \sqrt{n}}$$
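The direct difference method amounts to a one-sample t test on the differences, as the following Python sketch shows. The 23 differences are copied as they appear in the A4-A3 column of this transcript, so the result is illustrative and may differ slightly from a value based on the rounded summary figures (mean 13, SD 13). `matched_t` is a hypothetical helper name.

```python
import math
import statistics

def matched_t(differences, mu0=0.0):
    """Direct-difference matched t: (mean(D) - mu0) / (s_D / sqrt(n))."""
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)       # SD of the differences
    return (d_bar - mu0) / (s_d / math.sqrt(n)), n - 1

# A4 - A3 differences as listed in the transcript (23 students).
diffs = [20, 18, 18, 34, 15, 7, 8, 10, 24, 5, 5, 5,
         1, 5, 15, 15, 10, 18, 20, -8, 2, 7, 57]
t, df = matched_t(diffs)
print(f"t({df}) = {t:.2f}")
```

Note that the degrees of freedom are n - 1 = 22, the number of pairs minus one, not the 2(n - 1) of the independent-groups test.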

Matched vs Independent t-test

• Why does a matched t-test yield a higher t-score than an independent t-test in this example?

– The t-score is determined by the ratio of the difference between the groups and the variance within the groups.

– The matched t-test factors out the portion of the within-group variance due to differences between individuals.

The Matched t Test and Linear Correlation

• The degree to which the matched t value exceeds the independent groups t value depends on how highly correlated the two samples are.

• Alternate formula for matched standard error:

$$s^2_{\bar{D}} = \frac{s_1^2 + s_2^2}{n} - \frac{2\, r\, s_1 s_2}{n}$$

[Figure: scatter plot of the paired marks against Assignment 3 Mark (%), axes from 40 to 100; r² = 0.18]

Case 1: r = 0

• Independent t-test

$$s^2_{\bar{X}_1 - \bar{X}_2} = \frac{1}{n}\left(s_1^2 + s_2^2\right)$$

• Matched t-test

$$s^2_{\bar{D}} = \frac{s_1^2 + s_2^2}{n} - \frac{2\, r\, s_1 s_2}{n} = \frac{1}{n}\left(s_1^2 + s_2^2\right)$$

Thus the t-score will be the same. But note that

$$df_{independent} = 2(n - 1), \qquad df_{matched} = n - 1$$

Thus the critical t-values will be larger for the matched test.

Case 2: r > 0

• Independent t-test

$$s^2_{\bar{X}_1 - \bar{X}_2} = \frac{1}{n}\left(s_1^2 + s_2^2\right)$$

• Matched t-test

$$s^2_{\bar{D}} = \frac{s_1^2 + s_2^2}{n} - \frac{2\, r\, s_1 s_2}{n}$$

Now the t-score will be larger for the matched test. Although the critical t-values are larger, the net result is that the matched test will often be more powerful.
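The effect of the correlation on the standard error can be illustrated numerically. The sketch below uses the rounded summary values from the assignment-marks example (SDs of 14 and 8, n = 23, mean difference 13) and takes r as the positive square root of the plotted r² = 0.18; the sign of r is an assumption, and `matched_se_from_r` is a hypothetical helper name.

```python
import math

def matched_se_from_r(s1, s2, n, r):
    """Standard error of the mean difference: sqrt((s1^2 + s2^2)/n - 2*r*s1*s2/n)."""
    return math.sqrt((s1**2 + s2**2) / n - 2 * r * s1 * s2 / n)

# Rounded summary values from the assignment-marks example.
s1, s2, n = 14.0, 8.0, 23
mean_diff = 13.0
r = math.sqrt(0.18)   # ASSUMES the correlation is positive

se_independent = matched_se_from_r(s1, s2, n, r=0.0)   # r = 0 gives the independent-groups value
se_matched = matched_se_from_r(s1, s2, n, r=r)
print(f"independent: SE = {se_independent:.2f}, t = {mean_diff / se_independent:.2f}")
print(f"matched:     SE = {se_matched:.2f}, t = {mean_diff / se_matched:.2f}")
```

Even a modest positive correlation shrinks the standard error, which is why the matched t exceeds the independent-groups t here despite the smaller df.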

Confidence Intervals

• Just as for one-sample t test:

$$t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} \quad\Rightarrow\quad \mu_D = \bar{D} \pm t_{\alpha/2}\, s_{\bar{D}}$$

Repeated Measures Designs

• Many matched sample designs involve repeated measures of the same individuals.

• This can result in carry-over effects, including learning and fatigue.

• These effects can be minimized by counter-balancing the ordering of conditions across participants.


Assumptions of the Matched t Test

• Normality

• Independent random sampling (within samples)