Inferential Statistics
Parametric
Ch 5. Inferential Statistics
Random Samples
Estimate Population Statistics
Correlation of Two Variables
Experimental Methods
Standard Error of the Mean
Descriptive Statistics
Ch 1. The Mean, The Number of
Observations, & the Standard Deviation
N/Population/Parameters
Measures of Central Tendency –
Median, Mode, Mean
Measures of Variability – Range, Sum
of Squares, Variance, Standard Deviation
Ch 6. T Scores / T Curves
Estimates of Z scores
Computing t Scores
Critical Values
Degrees of Freedom
Ch 2. Frequency Distributions and
Histograms
Frequency Distributions
Bar Graphs / Histograms
Continuous vs. Discrete Variables
Ch 7. Correlation
Variable Relationships – linearity,
direction, strength
Correlation Coefficient
Scatter plots
Best Fitting lines
Ch 3. The Normal Curve
Z scores & percentiles
Least Squares, Unbiased estimates
Ch 8. Regression
Predicting using the regression equation
Generalizing – The null hypothesis
Degrees of freedom and statistical
significance
Ch 4. Translating To and From Z Scores
Normal Scores
Scale Scores
Raw Scores
Percentiles
Non Parametric
Other stuff to come
Ch 10. Two Way Factorial Analysis of
Variance
Three null hypotheses
Graphing the means
Factorial designs
Chapter 11.
A variety of t tests
Ch 9. Experimental Studies
Independent and dependent variables
The experimental hypothesis
The F test and the t test
Ch 12. Tukey’s Significant Difference
Testing differences in group means
Alpha for the whole experiment
HSD - Honestly Significant Difference
Ch 12. Power Analysis
Type 1 error and alpha
Type 2 error and beta
How many subjects do you need?
Ch 13. Assumptions Underlying Parametric Statistics
Sample means form a normal curve
Subjects are randomly selected from the population
Homogeneity of Variance
Experimental error is random across samples
Ch 14. Chi Square
Nominal Data
Chapter 11- Lecture 1
t tests: single sample,
repeated measures, and
two independent samples.
Conceptual overview
t and F tests: 2 approaches
to measuring the distance
between means
There are two ways to tell how far apart things are.
When there are only two things, you can directly
determine their distance from each other. If they are
two scores, as they usually are in Psychology, you
simply subtract one from the other to find their
difference. That is the approach used in the t test and
its variants.
F tests:
Alternatively, when you want to describe the distance
of three or more things from each other, the best way
to index their distance from each other is to find a
central point and talk about their average squared
distance (or average unsquared distance) from that
point.
The further apart things are from each other, the
further they will be, on the average, from that central
point. That is the approach you have used in the F
test (and when treating the t test as a special case of
the F test: t for one or two, F for more.)
One way or another: the two
methods will yield identical
results.
We can use either method, or a combination of
the two methods to ask the key question in this
part of the course, “Are two or more means
further apart than they are likely to be when
the null hypothesis is true?”
H0: It’s just sampling fluctuation.
If the only thing that makes the two means
different is random sampling fluctuation, the
means will be fairly close to the population
mean and to each other.
If an independent variable is pushing the
means apart, their distance from each other, or
from some central point, will tend to be too
great to be explained by the null hypothesis.
Generic formula for the t test
These ideas lead to a generic formula for
the t test:
t(dfW) = (actual difference between the 2 means) / (estimated average difference between the two means that should exist if H0 is correct)
Calculation and theory
As usual, we must work on calculation
and theory.
Again we’ll do calculation first.
The first of three types of
simple t tests - the single or
one sample t test
One sample t test
t test in which a sample mean is compared to a
population mean.
The population mean is almost always the value
postulated by the null hypothesis.
Since it is a mean obtained from a theory (H0 is a
theory), we call that mean “muT”.
To do the single sample t test, we divide the actual
difference between the sample mean and muT by the
estimated standard error of the mean, sX-bar.
Let’s do a problem:
You may recognize this problem. We used
it to set up confidence intervals in Ch. 6.
For example:
For example, let’s say that we had a
new antidepressant drug we wanted to
peddle. Before we can do that we must
show that the drug is safe.
Drugs like ours can cause problems
with body temperature. People can get
chills or fever.
We want to show that body temperature is
not affected by our new drug.
Testing a theory
“Everyone knows” that normal body
temperature for healthy adults is 98.6°F.
Therefore, it would be nice if we could show
that after taking our drug, healthy adults still
had an average body temperature of 98.6°F.
So we might test a sample of 16 healthy adults,
first giving them a standard dose of our drug
and, when enough time had passed, taking their
temperature to see whether it was 98.6°F on
the average.
Here’s the formula:
t ( df )  ( X  muT ) / sX  ( X  muT ) /(s / n )
Data for the one sample t test
We randomly select a group of 16 healthy
individuals from the population.
We administer a standard clinical dose of our
new drug for 3 days.
We carefully measure body temperature.
RESULTS: We find that the average body
temperature in our sample is 99.5°F with an
estimated standard deviation of 1.40° (s = 1.40).
In Chapter 7 we asked whether 99.5°F was in
the 95% CI around muT. It wasn’t. We should
get the same result with a t test.
Here’s the computation:
t (15)  (99.5  98.6) /(1.40 /
11/7/2015
16)  .9 / .35  2.57
17
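To make that arithmetic easy to check, here is a minimal Python sketch of the same computation (the code is ours, not the slides’; the summary numbers are the ones given above):

```python
import math

x_bar, mu_t = 99.5, 98.6   # sample mean; mean predicted by H0
s, n = 1.40, 16            # estimated standard deviation; sample size

se = s / math.sqrt(n)      # estimated standard error: 1.40/4 = .35
t = (x_bar - mu_t) / se    # .9/.35 = 2.57
print(f"t({n - 1}) = {t:.2f}")   # t(15) = 2.57
```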
Notice that the critical
value of t changes with the
number of degrees of
freedom for s, our estimate
of sigma, and must be
taken from the t table.
If n = 16 in a single sample,
dfW = n - k = 15.
df       .05       .01
1      12.706    63.657
2       4.303     9.925
3       3.182     5.841
4       2.776     4.604
5       2.571     4.032
6       2.447     3.707
7       2.365     3.499
8       2.306     3.355
9       2.262     3.250
10      2.228     3.169
11      2.201     3.106
12      2.179     3.055
13      2.160     3.012
14      2.145     2.997
15      2.131     2.947
16      2.120     2.921
17      2.110     2.898
18      2.101     2.878
19      2.093     2.861
20      2.086     2.845
21      2.080     2.831
22      2.074     2.819
23      2.069     2.807
24      2.064     2.797
25      2.060     2.787
26      2.056     2.779
27      2.052     2.771
28      2.048     2.763
29      2.045     2.756
30      2.042     2.750
40      2.021     2.704
60      2.000     2.660
100     1.984     2.626
200     1.972     2.601
500     1.965     2.586
1000    1.962     2.581
2000    1.961     2.578
10000   1.960     2.576
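A quick way to apply the table: compare the obtained t to the tabled critical value for its degrees of freedom. A minimal sketch (the lookup dictionary below holds just a few of the tabled .05 values, for illustration only):

```python
# A few two-tailed critical values of t at alpha = .05, copied from the table above.
T_CRIT_05 = {4: 2.776, 15: 2.131, 30: 2.042}

t_obs, df = 2.57, 15
if abs(t_obs) >= T_CRIT_05[df]:
    print(f"t({df}) = {t_obs} exceeds the critical value {T_CRIT_05[df]}: p < .05")
```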
We have falsified the null.
We would write the results as follows
t (15) = 2.57, p<.05
Since we have falsified the null, we reject 98.6°
as the population mean for people who have
taken the drug.
Instead, we would predict that the average person,
drawn from the same population as our sample,
would respond as did our sample. We would
predict they will have an average body
temperature of 99.5° after taking the drug. That
is, they would have a slight fever.
An obvious problem with
the one sample
experimental design: no
control group.
So, we can use a single
random sample of
participants as their own
controls if we measure
them two or more times.
If they are measured twice, we can use the
repeated measures t test.
Computation of the
repeated measures t test
Let’s say we measured 5 moderately
depressed inpatients, rating their
depression with the Hamilton rating scale
for depression. Then we treated them with
CBT for 10 sessions and again got
Hamilton scores. Lower scores = less
depression.
Here are pre, post and difference
scores showing post-treatment scores
subtracted from pretreatment scores
Al scored 28 before treatment and 18 after. Bill scored 22 before and 14 after. Carol scored 23 before and 14 after. Dora scored 38 before and 27 after. Ed scored 33 before and 21 after.

         Before   After   Difference
Al         28      18         10
Bill       22      14          8
Carol      23      14          9
Dora       38      27         11
Ed         33      21         12

Mean difference = 10.00
In this case, there are
5-1=4 degrees of freedom.
Now we can compute the
estimated standard error
of the difference scores:
t ( df )  ( X  muT ) / sX  ( X  muT ) /(s / n )
sDbar  1.58 / 5  .71
Now we are ready for the
formula for the repeated
measures t test:
t equals the actual average
difference between the
means minus their difference
under H0 divided by the
estimated standard error of
the difference scores
t (dfD)  ( XD  muT ) / sDbar
Here is the computation in
this case:
t ( 4)  ( XD  mu T ) / sDbar  10.00  0.00 / .71  14.14
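With the raw pre and post scores in hand, the same result can be checked with scipy’s paired t test (a sketch assuming Python with scipy installed; scipy.stats.ttest_rel is the library’s paired-samples test):

```python
from scipy import stats

before = [28, 22, 23, 38, 33]   # Hamilton scores before treatment
after  = [18, 14, 14, 27, 21]   # Hamilton scores after 10 sessions of CBT

# Paired (repeated measures) t test on the five difference scores
t, p = stats.ttest_rel(before, after)
print(f"t(4) = {t:.2f}, p = {p:.5f}")   # t(4) = 14.14, p < .01
```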
Here is how we would
write the result:
t (4) = 14.14, p<.01
In this case the means are 14.14
estimated standard errors apart.
We wrote the results as p<.01
But these are results so strong, they far
exceed any value in the table, even with
just a few degrees of freedom.
This treatment works!!!!
There are times a repeated
measures design is
appropriate and times
when it is not.
When it is not, we use a
two sample, independent
groups t test.
The t test for two
independent groups.
You already know a formula for the two
sample t test:
t(n - k) = sB/s
But now we want an alternative formula
that allows us to directly compare two
means by subtracting one from the other.
It takes a little algebra, but the formula is
pretty straightforward.
Three steps to computing the t
test for two independent groups:
First, we need to compute the
actual difference between the
two means.
(That’s easy; we just subtract one
from the other.)
Step 2:
Then we compare that difference to
the difference predicted by H0. That’s
also easy because the null, as usual,
predicts no difference between the
means.
H0: mu1 – mu2 = 0.00
That is, the null says that there is
actually no average difference
between the means of the two
populations represented by the two
groups.
Step 3 – this one is a little
harder. Here we compute the
standard error of the difference
between two means, s(X-bar1 - X-bar2).
Although the population means
may be identical, samples will
vary because of random sampling
fluctuation.
The amount of fluctuation is
determined by MSW and the sizes
of the two groups.
So we take the actual difference
between the two means: the mean
of the first group minus the mean
of the second.
Then we subtract the theoretical
difference (which is 0.00
according to the null).
Finally, we divide by the
estimated standard error of the
difference between the means of
two independent groups.
Let’s learn the conceptual
basis and computation of
the estimated standard
error of the difference
between 2 sample means.
The estimated average squared distance
between a sample mean and the
population mean due solely to sampling
fluctuation is MSW /n, where n is the size
of the sample.
The estimated average squared distance
between two sample means is their two
squared differences from mu added
together: MSW/n1 + MSW/n2.
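The additivity of the two squared distances can be checked by simulation. This is a sketch of ours, not part of the slides; it assumes numpy, and it uses a known sigma so the theoretical value is exact:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n1, n2 = 10.0, 8, 20

# Many pairs of independent sample means drawn from the same population
m1 = rng.normal(100, sigma, size=(100_000, n1)).mean(axis=1)
m2 = rng.normal(100, sigma, size=(100_000, n2)).mean(axis=1)

print((m1 - m2).var())                  # empirical: close to 17.5
print(sigma**2 / n1 + sigma**2 / n2)    # theoretical: 12.5 + 5.0 = 17.5
```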
So, if the samples are the same size, their
average squared distance from each other
equals MSW/n1 + MSW/n2 = 2MSW/n.
But if the samples have different numbers
of scores, we have to use the average size
of the two groups.
The problem is we can’t use the usual arithmetic
average; we need to use a different kind of average,
the harmonic mean, nH.
Then the average squared distance between
two independent sample means equals 2MSW/nH
The square root of that is the average
unsquared difference between the means of the
two samples, the denominator in the t test.
Here is the formula for the
estimated standard error
of the difference between
the means of two
independent samples
sX  X  2 MSW / nH
Here’s the formula for
the independent groups
t test:
t ( df )  [( X 1  X 2 )  ( mu1  mu 2 )] / sX
11/7/2015
 X
41
Where
sX  X  2 MSW / nH
So, to do that computation
we need to learn to
compute nH.
Calculating the Harmonic
Mean
nH = k / (1/n1 + 1/n2 + 1/n3 + ... + 1/nK)
Notice that this technique allows different
numbers of subjects in each group.
Oh No!! My rat died!
What is going to happen
to my experiment?
If the groups are the same size , the
harmonic and ordinary mean number
of participants is the same.
3 groups; 4 subjects each:
nH = 3 / (1/4 + 1/4 + 1/4) = 3/(3/4) = 3/.75 = 4
When groups do not have equal
numbers, harmonic mean is
smaller than ordinary mean.
4 groups; 6, 4, 8 and 4 participants.
Ordinary mean = 22/4 = 5.50 participants each.
nH = 4 / (1/6 + 1/4 + 1/8 + 1/4)
   = 4 / (8/48 + 12/48 + 6/48 + 12/48)
   = 4 / (38/48)
   = (4 × 48)/38 = 96/19 = 5.05
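Both computations follow directly from the definition of nH. A minimal Python sketch (our helper function, not from the slides):

```python
def harmonic_mean(*ns):
    """nH = k / (1/n1 + 1/n2 + ... + 1/nK)."""
    return len(ns) / sum(1 / n for n in ns)

print(harmonic_mean(4, 4, 4))                # 4.0  (equal groups: same as the ordinary mean)
print(round(harmonic_mean(6, 4, 8, 4), 2))   # 5.05 (unequal groups: below the ordinary mean 5.50)
```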
The theory part:
ZX-bar scores
As you know from Chapter 4, the Z score
of a sample mean is the number of
standard errors of the mean the sample
mean is from mu. Here is the formula.
ZX-bar = (X-bar - mu)/ sigmaX-bar
Confidence intervals with Z
As you learned in Chapter 4, if a sample differs from
its population mean solely because of sampling
fluctuation, 95% of the time it will fall somewhere in
a symmetrical interval that goes 1.96 standard errors
in both directions from mu.
That interval is, of course, the CI.95.
CI.95 = mu ± 1.960 sigmaX-bar
Or, for theoretical population means:
CI.95 = muT ± 1.960 sigmaX-bar
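As a sketch, here is that interval computed in Python, reusing the sigma and n from the body-temperature example and treating sigma as known purely for illustration:

```python
import math

mu_t, sigma, n = 98.6, 1.40, 16         # illustrative values; sigma treated as known
sigma_xbar = sigma / math.sqrt(n)       # standard error of the mean = .35

lo = mu_t - 1.960 * sigma_xbar
hi = mu_t + 1.960 * sigma_xbar
print(f"CI.95 = ({lo:.2f}, {hi:.2f})")  # (97.91, 99.29)
```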
muT, the CI.95, and H0.
Most of the time we don’t know mu, so
we are really talking about muT.
Most of the time, muT will be the value of
mu suggested by the null hypothesis.
If a sample falls outside the 95%
confidence interval around muT, we have
to assume that it has been pushed away
from mu by some factor other than
sampling fluctuation.
ZX-bar and the null hypothesis
If H0 says that the only reason that a sample
mean differs from muT is sampling fluctuation,
as H0 usually does, then the value of ZX-bar can
be used as a test of the null hypothesis.
If H0 is correct, X-bar should fall within the CI.95,
within 1.960 standard errors of muT.
If ZX-bar has an absolute value greater than
1.960, the sample mean falls outside the 95%
confidence interval around mu and falsifies the
null hypothesis.
The underlying logic of the Z test
Here is the formula for Zx-bar again.
ZX-bar = (X-bar - muT)/ sigmaX-bar
When used as a test of the null, most textbooks
identify ZX-bar simply as Z. We will follow
that lead and, when we use it in a test of the
null, call ZX-bar simply “Z.”
Here is the formula for the Z test.
Z = (X-bar - muT)/ sigmaX-bar
If the absolute value of Z equals or exceeds 1.960, Z
is significant at .05.
If the absolute value of Z equals or exceeds 2.576, Z
is significant at .01.
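The whole decision rule fits in a few lines. A sketch (our function, not from the slides), assuming sigma is known:

```python
import math

def z_test(x_bar, mu_t, sigma, n):
    """Z test: how many standard errors of the mean is X-bar from muT?"""
    z = (x_bar - mu_t) / (sigma / math.sqrt(n))
    if abs(z) >= 2.576:
        return z, "significant at .01"
    if abs(z) >= 1.960:
        return z, "significant at .05"
    return z, "not significant; H0 stands"

print(z_test(99.5, 98.6, 1.40, 16))   # (2.57..., 'significant at .05')
```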
In the Z test
You start with a random sample then expose it
to an IV.
You determine muT, the predicted mean if the
null hypothesis is true.
If the absolute value of Z > 1.960, X-bar falls
outside the CI.95 around muT.
The null hypothesis is probably not correct.
Since you have falsified the null, you must turn
to H1, the experimental hypothesis.
Also, since Z was significant, you conclude that
were other individuals from the population
treated the same way, they would respond
similarly to the sample you studied.
There are two problems
We seldom know sigma.
It would be nice to have a control group.
Let’s deal with those problems one at a
time.
We’ll deal first with the fact that we don’t
know sigma, and therefore can’t compute sigmaX-bar.
The first problem:
Since we don’t know sigma, we must use our best
estimate of sigma, s, the square root of MSW
and then estimate sigmaX-bar by dividing s by the
square root of n, the size of the sample.
We therefore must use the critical values of the
t distribution to determine the CI.95 and CI.99
around muT in which the null hypothesis
predicts that Xbar will fall.
The exact value will depend on the degrees of
freedom for s.
Since s is the square root of MSW, dfW=n-k.
t curves and degrees of
freedom revisited
[Figure: the Z curve compared with t curves for 5 df and 1 df, plotting frequency against standard deviations from -3 to +3; the t curves have heavier tails.]
To get 95% of the population in the body of the curve when
there are 5 degrees of freedom, you go out over 3 standard deviations.
To get 95% of the population in the body of the curve when there
is 1 degree of freedom, you go out over 12 standard deviations.
Critical values of the t curves
The following table defines t curves with
1 through 10,000 degrees of freedom
Each curve is defined by how many estimated
standard deviations you must go from the mean
to define a symmetrical interval that contains
proportions of .9500 and .9900 of the curve,
leaving proportions of .0500 and .0100 in the
two tails of the curve (combined).
Values for .9500/.0500 are shown in the .05
column; values for .9900/.0100 are shown in
the .01 column.
df       .05       .01
1      12.706    63.657
2       4.303     9.925
3       3.182     5.841
4       2.776     4.604
5       2.571     4.032
6       2.447     3.707
7       2.365     3.499
8       2.306     3.355
9       2.262     3.250
10      2.228     3.169
11      2.201     3.106
12      2.179     3.055
13      2.160     3.012
14      2.145     2.997
15      2.131     2.947
16      2.120     2.921
17      2.110     2.898
18      2.101     2.878
19      2.093     2.861
20      2.086     2.845
21      2.080     2.831
22      2.074     2.819
23      2.069     2.807
24      2.064     2.797
25      2.060     2.787
26      2.056     2.779
27      2.052     2.771
28      2.048     2.763
29      2.045     2.756
30      2.042     2.750
40      2.021     2.704
60      2.000     2.660
100     1.984     2.626
200     1.972     2.601
500     1.965     2.586
1000    1.962     2.581
2000    1.961     2.578
10000   1.960     2.576
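Every entry in this table can be reproduced from the t distribution’s inverse CDF. A sketch assuming scipy (stats.t.ppf is scipy’s percent-point function):

```python
from scipy import stats

# Two-tailed critical values: split alpha across the two tails,
# so the .05 column is the 97.5th percentile of each t curve.
for df in (1, 5, 15, 60, 10000):
    t05 = stats.t.ppf(1 - .05 / 2, df)
    t01 = stats.t.ppf(1 - .01 / 2, df)
    print(f"df={df:>5}   .05: {t05:7.3f}   .01: {t01:7.3f}")
# df=1 gives 12.706 and 63.657; df=10000 gives 1.960 and 2.576.
```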
Estimated distance of sample means
from mu: the estimated standard error
of the mean
We can compute the standard error of the mean
when we know sigma.
We just have to divide sigma by the square root of n,
the size of the sample.
Similarly, we can estimate the standard error
of the mean, the estimated average unsquared
distance of sample means from mu.
We just have to divide s by the square root of n, the
size of the sample in which we are interested.
Here’s the formula:
t (df )  ( Xbar  muT ) / sX  ( Xbar  muT ) /(s / n )
The one sample t test
t ( df )  ( Xbar  muT ) / sX  ( Xbar  muT ) /(s / n )
If the absolute value of t exceeds the
critical value at .05 in the t table, you
have falsified the null and must accept the
experimental hypothesis.
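An exact p-value, rather than a table comparison, can be had from the t distribution’s survival function. A sketch assuming scipy, reusing the body-temperature summary statistics from earlier:

```python
import math
from scipy import stats

x_bar, mu_t, s, n = 99.5, 98.6, 1.40, 16
df = n - 1

t = (x_bar - mu_t) / (s / math.sqrt(n))   # 2.57
p = 2 * stats.t.sf(abs(t), df)            # two-tailed p-value
print(f"t({df}) = {t:.2f}, p = {p:.3f}")  # t(15) = 2.57, p = .021 < .05
```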
The second problem: no
control group.
Participants as their own
controls: the repeated
measures t test
3 experimental designs:
First = unrelated groups
There are three basic ways to run
experiments.
The first is to create different groups each
of which contains different individuals
randomly selected from the population.
You then measure the groups once to
determine whether the differences among
their means exceed those expected from
sampling fluctuation.
That’s what we’ve done until now.
Second type of design–
repeated measures
The second is to create one random
sample from the population. You then
treat the group in different ways and
measure that group two or more times,
once for each different way the group is
treated.
Again, you want to determine whether the
differences among the group’s means,
taken at different times, exceed those
expected from sampling fluctuation.
Baseline vs. post-treatment
If the first measurement is done before
the start of the experiment, the result will
be a baseline measurement. This allows
participants to function as their own
controls.
In any event, the question is always
whether the change between conditions is
larger than you would expect from
sampling fluctuation alone.
From this point on, we look
only at the difference scores.
That is, we ignore the original pre and
post absolute scores altogether and only
look at the differences between time 1
and time 2.
Of course, our first computation is the
mean and estimated standard deviation of
the difference scores.
Here is the example we used in learning the
computation of the repeated measures t test
S#     X     (X - X-bar)   (X - X-bar)²
A     10        0.00          0.00
B      8       -2.00          4.00
C      9       -1.00          1.00
D     11        1.00          1.00
E     12        2.00          4.00

ΣX = 50; n = 5; X-barD = 10.00
Σ(X - X-bar) = 0.00
Σ(X - X-bar)² = 10.00 = SSW
MSW = SSW/(n - k) = 10.00/4 = 2.50
s = √MSW = 1.58
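The same table can be produced with numpy (a sketch of ours; the difference scores are the five from the table above):

```python
import numpy as np

d = np.array([10, 8, 9, 11, 12])     # the five difference scores
d_bar = d.mean()                      # X-barD = 10.00
ss_w = ((d - d_bar) ** 2).sum()       # SSW = 10.00
ms_w = ss_w / (len(d) - 1)            # MSW = SSW/(n - k) = 2.50
s = np.sqrt(ms_w)                     # s = 1.58
print(d_bar, ss_w, ms_w, round(s, 2))
```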
The null hypothesis in our
repeated measures t test
Theoretically, the null can predict any
difference.
Pragmatically, the null almost always
predicts that there will be no change at all
from the first to the second measurement,
that the average difference between time
1 and time 2 will be 0.00.
Mathematically H0: muD = 0.00, where
muD is the average difference score.
Does this look familiar?
We have a single set of difference scores
to compare to muT.
In the single sample t test, we compared
a set of scores to muT.
So the repeated measures t test is just
like the single sample t test.
Only this time our scores are difference
scores.
To do a t test, we need the
expected mean under the
null.
We have that, muT=0.00.
We also need the expected
amount of difference
between the two means
given random sampling
fluctuation.
sDbars/ nD
The expected fluctuation of the difference
scores is called the estimated standard error of
the difference scores, sD-bar.
The estimated standard error of the difference
scores = the estimated standard deviation of
the difference scores divided by the square root
of the number of differences scores.
It has nD-k = nD – 1 degrees of freedom, where
nD is the number of difference scores.
Here is the formula for sD-bar
sDbar  sD / nD
The repeated measures t is a
version of the single sample t
test:
t equals the actual average
difference between the
means minus their difference
under H0 divided by the
estimated standard error of
the difference scores
t (dfD )  ( Dbar  muT ) / sDbar
By the way
Repeated measures designs are the simplest
form of related measures designs, in which
each participant in each group is related to one
participant in each of the other groups.
The simplest way for participants across groups
to be related is to use the same
participants in each group.
But there are other ways. For example, each
mouse in a four-condition experiment could
have one litter-mate in each of the other
conditions.
But the commonest design is repeated
measures, and that is what we will study.