
Ch. 14 Nonparametric Statistical Methods
Ahmad Yusuf
Yoojung Chung
Malek Deib
Mihir Shah
Jung Yeon Lee
Hyun Yoon
Nadia Saleh
Se Hyun Ji
Chan Min Park
Kyunghyun Ma
Mi Jeong Kim
Wonchang Chio
Tarique Jawed
Hyun Keun Cho
What is NONPARAMETRIC Statistics?
• Normality does not hold for all data.
• Similarly, some data may not follow any particular fixed distribution, such as the binomial or Poisson.
• Methods for such data are called nonparametric or distribution-free.
• We use nonparametric tests for these populations.
When do we use NONPARAMETRIC Statistics?
• When the population distribution is highly skewed or very heavily tailed. The median is then a better measure of the center than the mean.
• When the sample size is small (usually less than 30) and not normal (which we can check using SAS or other statistical programs).
14.1.1 Sign Test and Confidence Interval
Sign test for a Single Sample
We want to test, at significance level $\alpha$, whether the true median $\tilde{\mu}$ is above a certain known value $\tilde{\mu}_0$.
14.1.1 Sign Test and Confidence Interval
Example:
THERMOSTAT DATA:
202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0
Perform the sign test to determine if the median setting is different from the design setting of 200°F.
14.1.1 Sign Test and Confidence Interval
STEP 1:
$H_0: \tilde{\mu} = 200$ vs. $H_a: \tilde{\mu} \neq 200$
STEP 2:
We find the sign of each sample value by comparing it with 200:
202.2 > 200, 203.4 > 200, 200.5 > 200, 202.5 > 200, 206.3 > 200, 198.0 < 200, 203.7 > 200, 200.8 > 200, 201.3 > 200, 199.0 < 200
$$s_+ = \#\{x_i > \tilde{\mu}_0\} = 8, \qquad s_- = \#\{x_i < \tilde{\mu}_0\} = 2$$
14.1.1 Sign Test and Confidence Interval
What do we do if there is a tie ($x_i = \tilde{\mu}_0$)?
1) We can break the tie at random, assigning it to either $s_+$ or $s_-$. For a large sample this makes little difference, but the result may vary significantly for a small sample.
2) We can count 1/2 toward each of $s_+$ and $s_-$. However, the binomial p-value cannot be calculated with fractional counts, so we should not do this.
3) We can exclude the ties. This reduces the sample size and hence the power of the test, but for a large sample it is not a big deal.
14.1.1 Sign Test and Confidence Interval
STEP 3:
Why binomial? $S_+$ and $S_-$ count the only two outcomes among the $n$ sample values, so $s_+ + s_- = n$ (here $n = 10$, with observed $s_+ = 8$ and $s_- = 2$). If $p = P(X_i > \tilde{\mu}_0)$, then
$$P(S_+ = s) = \binom{n}{s} p^s (1-p)^{n-s} \qquad \text{and} \qquad P(S_- = s) = \binom{n}{s} (1-p)^s p^{n-s},$$
so both $S_+$ and $S_-$ are binomially distributed, with success probabilities $p$ and $1-p$ respectively:
$$S_+ \sim \mathrm{Bin}(n, p) \quad \text{and} \quad S_- \sim \mathrm{Bin}(n, 1-p).$$
14.1.1 Sign Test and Confidence Interval
STEP 4:
When $H_0: \tilde{\mu} = \tilde{\mu}_0$ is true, $\tilde{\mu}_0$ is the true median, so the number of sample values above the median equals (in expectation) the number below the median, and $p = 1/2$; consequently $1 - p = 1/2$ too. Therefore
$$S_+ \sim \mathrm{Bin}(n, \tfrac{1}{2}) \quad \text{and} \quad S_- \sim \mathrm{Bin}(n, \tfrac{1}{2}).$$
Since $S_+$ and $S_-$ have the same binomial null distribution, we can denote a common r.v. $S \sim \mathrm{Bin}(n, \tfrac{1}{2})$.
14.1.1 Sign Test and Confidence Interval
Now we can calculate the p-value using the binomial distribution:
$$P\text{-value} = P(S \ge s_+) = \sum_{i=8}^{10} \binom{10}{i} \left(\frac{1}{2}\right)^{10} = 0.055,$$
or alternatively,
$$P\text{-value} = P(S \le s_-) = \sum_{i=0}^{2} \binom{10}{i} \left(\frac{1}{2}\right)^{10} = 0.055.$$
14.1.1 Sign Test and Confidence Interval
STEP 5:
We compare our p-value with the significance level. The tail probability is $P(S \ge 8) = 0.055 > \alpha = .05$; for the two-sided alternative $\tilde{\mu} \neq 200$ the p-value is $2(0.055) = 0.11$, which also exceeds .05.
We fail to reject the null hypothesis.
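(Aside, not from the original slides: a minimal Python sketch of this sign test, assuming scipy.stats.binomtest is available, i.e. SciPy 1.7 or later.)

Python sketch
from scipy.stats import binomtest

# Thermostat settings from the example; the design setting is mu0 = 200.
data = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
mu0 = 200
s_plus = sum(x > mu0 for x in data)   # 8
s_minus = sum(x < mu0 for x in data)  # 2
n = s_plus + s_minus                  # ties with mu0 (none here) are dropped

# Upper-tail probability P(S >= 8) under Bin(10, 1/2), as computed above: ~0.055.
print(binomtest(s_plus, n, p=0.5, alternative='greater').pvalue)
# Two-sided sign-test p-value for Ha: median != 200: ~0.11.
print(binomtest(s_plus, n, p=0.5, alternative='two-sided').pvalue)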
14.1.1 Sign Test and Confidence Interval
For large samples ($n \ge 20$):
We can also use the normal approximation. Under $H_0$,
$$E(S_+) = E(S_-) = \frac{n}{2} \qquad \text{and} \qquad \mathrm{Var}(S_+) = \mathrm{Var}(S_-) = \frac{n}{4},$$
so
$$z = \frac{s_+ - n/2 - 1/2}{\sqrt{n/4}},$$
where the $1/2$ is the continuity correction. We reject the null hypothesis if $z \ge z_\alpha$. Equivalently, after rearranging,
$$s_+ \ge \frac{n}{2} + \frac{1}{2} + z_\alpha \sqrt{\frac{n}{4}} = b_{n,\alpha}.$$
14.1.1 Sign Test for matched pairs
Sign test for Matched Pairs
When observations come in matched pairs, compute the within-pair differences and let:
- $S_+$ = the number of positive differences
- $S_-$ = the number of negative differences
Note: the magnitude of the differences is not used.
Under $H_0$ the two members of a pair are exchangeable, so $P(A > B) = P(B > A)$.
14.1.1 Sign Test for matched pairs
No. | Method A | Method B | Difference      No. | Method A | Method B | Difference
 i  |    xi    |    yi    |     di           i  |    xi    |    yi    |     di
 1  |   6.3    |   5.2    |    1.1          14  |   7.7    |   7.4    |    0.3
 2  |   6.3    |   6.6    |   -0.3          15  |   7.4    |   7.4    |    0
 3  |   3.5    |   2.3    |    1.2          16  |   5.6    |   4.9    |    0.7
 4  |   5.1    |   4.4    |    0.7          17  |   6.3    |   5.4    |    0.9
 5  |   5.5    |   4.1    |    1.4          18  |   8.4    |   8.4    |    0
 6  |   7.7    |   6.4    |    1.3          19  |   5.6    |   5.1    |    0.5
 7  |   6.3    |   5.7    |    0.6          20  |   4.8    |   4.4    |    0.4
 8  |   2.8    |   2.3    |    0.5          21  |   4.3    |   4.3    |    0
 9  |   3.4    |   3.2    |    0.2          22  |   4.2    |   4.1    |    0.1
10  |   5.7    |   5.2    |    0.5          23  |   3.3    |   2.2    |    1.1
11  |   5.6    |   4.9    |    0.7          24  |   3.8    |   4.0    |   -0.2
12  |   6.2    |   6.1    |    0.1          25  |   5.7    |   5.8    |   -0.1
13  |   6.6    |   6.3    |    0.3          26  |   4.1    |   4.0    |    0.1
14.1.1 Sign Test for matched pairs
Note that for the matched-pairs test all tied entries ($x_i = y_i$) are disregarded. Then
n = 23, since $x_i = y_i$ for i = 15, 18, 21
$S_+$ = 20
$S_-$ = 3
Using
$$z = \frac{s_+ - n/2 - 1/2}{\sqrt{n/4}}$$
14.1.1 Sign Test for matched pairs
$$z = \frac{20 - 23/2 - 1/2}{\sqrt{23/4}} = 3.336$$
Two-sided p-value:
$$2(1 - \Phi(3.336)) = 0.0008$$
This indicates a significant difference between Method A and Method B.
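(Aside, not from the original slides: the same matched-pairs sign test in Python, assuming SciPy is available; it gives both the exact binomial calculation and the normal approximation used above.)

Python sketch
from math import sqrt
from scipy.stats import binomtest, norm

s_plus, s_minus = 20, 3
n = s_plus + s_minus  # 23, after dropping the three zero differences

# Exact two-sided sign test: ~0.0005.
print(binomtest(s_plus, n, p=0.5).pvalue)

# Large-sample normal approximation with continuity correction, as above.
z = (s_plus - n / 2 - 0.5) / sqrt(n / 4)  # 3.336
print(2 * (1 - norm.cdf(z)))              # ~0.0008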
14.1.2 Wilcoxon Signed Rank Test
Who is Frank Wilcoxon?
Born: September 2, 1892
Wilcoxon was born to American parents in County Cork, Ireland. Frank Wilcoxon grew up in Catskill, New York, although he received part of his education in England. In 1917 he graduated from Pennsylvania Military College with a B.S. He then received his M.S. in Chemistry in 1921 from Rutgers University. In 1924 he received his PhD in Physical Chemistry from Cornell University.
In 1945 he published a paper containing the two tests he is most remembered for: the Wilcoxon signed rank test and the Wilcoxon rank sum test. His interest in statistics can be credited to R.A. Fisher's text, Statistical Methods for Research Workers (1925).
Over the course of his career Wilcoxon published 70 papers.
14.1.2 Wilcoxon Signed Rank Test
Alternative method to the Sign Test
The Wilcoxon Signed Rank Test improves on the sign test. Unlike the sign test, it looks not only at whether $x_i > \tilde{\mu}_0$ or $x_i < \tilde{\mu}_0$, but also at the magnitude of the difference $d_i = x_i - \tilde{\mu}_0$.
14.1.2 Wilcoxon Signed Rank Test
Note: the Wilcoxon Signed Rank Test assumes that the population distribution is symmetric. (Symmetry is not required for the sign test.)
14.1.2 Wilcoxon Signed Rank Test
Step 1:
Rank-order the differences $d_i$ in terms of their absolute values.
Step 2:
$w_+$ = sum of the ranks $r_i$ of the positive differences.
$w_-$ = sum of the ranks $r_i$ of the negative differences.
If we assume no ties, then
$$w_+ + w_- = r_1 + r_2 + \cdots + r_n = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$
14.1.2 Wilcoxon Signed Rank Test
Step 3:
Reject $H_0$ if $w_+$ is large or, equivalently, if $w_-$ is small.
14.1.2 Wilcoxon Signed Rank Test
The size of $w_+$ or $w_-$ needed to reject $H_0$ at level $\alpha$ is determined using the distributions of the corresponding r.v.'s $W_+$ and $W_-$ when $H_0$ is true. Since the null distributions are identical and symmetric, the common r.v. is denoted by $W$.
p-value: $P(W \ge w_+) = P(W \le w_-)$
Reject $H_0$ if the p-value is $\le \alpha$.
14.1.2 Wilcoxon Signed Rank Test
Define
$$Z_i = \begin{cases} 1 & \text{if the } i\text{th rank corresponds to a positive sign,} \\ 0 & \text{if the } i\text{th rank corresponds to a negative sign,} \end{cases}$$
so that $W_+ = \sum_{i=1}^{n} i Z_i$, where the $Z_i \sim \mathrm{Bernoulli}(p)$ with $p = P(X_i > \tilde{\mu}_0) = 1/2$ under $H_0$. Since $E(Z_1) = E(Z_2) = \cdots = E(Z_n)$,
$$E(W_+) = E\!\left(\sum_{i=1}^{n} i Z_i\right) = 1E(Z_1) + 2E(Z_2) + \cdots + nE(Z_n) = (1 + 2 + \cdots + n)\,E(Z_1) = \frac{n(n+1)}{2}\,p.$$
14.1.2 Wilcoxon Signed Rank Test
Similarly, since $\mathrm{Var}(Z_1) = \mathrm{Var}(Z_2) = \cdots = \mathrm{Var}(Z_n)$,
$$\mathrm{Var}(W_+) = \mathrm{Var}\!\left(\sum_{i=1}^{n} i Z_i\right) = 1^2\mathrm{Var}(Z_1) + 2^2\mathrm{Var}(Z_2) + \cdots + n^2\mathrm{Var}(Z_n) = (1^2 + 2^2 + \cdots + n^2)\,\mathrm{Var}(Z_1) = \frac{n(n+1)(2n+1)}{6}\,p(1-p).$$
14.1.2 Wilcoxon Signed Rank Test
Then a Z-test is based on the statistic:
$$z = \frac{w_+ - \frac{n(n+1)}{4} - 1/2}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}.$$
For $H_0: \tilde{\mu} = \tilde{\mu}_0$ vs. $H_a: \tilde{\mu} > \tilde{\mu}_0$, reject $H_0$ if $z \ge z_\alpha$.
14.1.2 Wilcoxon Signed Rank Test
For $H_0: \tilde{\mu} = \tilde{\mu}_0$ vs. $H_a: \tilde{\mu} < \tilde{\mu}_0$, reject $H_0$ if $z \le -z_\alpha$.
For $H_0: \tilde{\mu} = \tilde{\mu}_0$ vs. $H_a: \tilde{\mu} \neq \tilde{\mu}_0$, reject $H_0$ if
(1) $z \ge z_{\alpha/2}$, or
(2) $z \le -z_{\alpha/2}$.
The two-sided p-value is
$$2P(W \ge w_{\max}) = 2P(W \le w_{\min}).$$
14.1.2 Summary
Signed Rank Test vs. Sign Test
The signed rank test weighs each signed difference by its rank: if the positive differences are greater in magnitude than the negative differences, they receive higher ranks, resulting in a larger value of $w_+$. This improves the power of the signed rank test. But it also inflates the type I error if the population distribution is NOT symmetric, which you would not want to happen. The sign test, in contrast, only counts the number of positive and negative differences.
14.1.2 Summary
Signed Rank Test vs. Sign Test
And the winner is... the Wilcoxon Signed Rank Test, which is the preferred test (when its symmetry assumption holds).
14.1.2 Wilcoxon Signed Rank Test
No. | A   | B   | Diff | Rank       No. | A   | B   | Diff | Rank
 i  | xi  | yi  |  di  |  ri         i  | xi  | yi  |  di  |  ri
 1  | 6.3 | 5.2 |  1.1 | 19.5       14  | 7.7 | 7.4 |  0.3 |  8
 2  | 6.3 | 6.6 | -0.3 |  8         15  | 7.4 | 7.4 |  0   |  -
 3  | 3.5 | 2.3 |  1.2 | 21         16  | 5.6 | 4.9 |  0.7 | 16
 4  | 5.1 | 4.4 |  0.7 | 16         17  | 6.3 | 5.4 |  0.9 | 18
 5  | 5.5 | 4.1 |  1.4 | 23         18  | 8.4 | 8.4 |  0   |  -
 6  | 7.7 | 6.4 |  1.3 | 22         19  | 5.6 | 5.1 |  0.5 | 12
 7  | 6.3 | 5.7 |  0.6 | 14         20  | 4.8 | 4.4 |  0.4 | 10
 8  | 2.8 | 2.3 |  0.5 | 12         21  | 4.3 | 4.3 |  0   |  -
 9  | 3.4 | 3.2 |  0.2 |  5.5       22  | 4.2 | 4.1 |  0.1 |  2.5
10  | 5.7 | 5.2 |  0.5 | 12         23  | 3.3 | 2.2 |  1.1 | 19.5
11  | 5.6 | 4.9 |  0.7 | 16         24  | 3.8 | 4.0 | -0.2 |  5.5
12  | 6.2 | 6.1 |  0.1 |  2.5       25  | 5.7 | 5.8 | -0.1 |  2.5
13  | 6.6 | 6.3 |  0.3 |  8         26  | 4.1 | 4.0 |  0.1 |  2.5
14.1.2 Wilcoxon Signed Rank Test
$w_- = 8 + 5.5 + 2.5 = 16$, then
$$w_+ = \frac{23(24)}{2} - 16 = 260$$
$$z = \frac{260 - \frac{23(24)}{4} - 1/2}{\sqrt{\frac{23(24)(47)}{24}}} = 3.695$$
Two-sided p-value:
$$2(1 - \Phi(3.695)) = 0.0002$$
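(Aside, not from the original slides: the same analysis with scipy.stats.wilcoxon, which drops the zero differences and assigns midranks automatically; with zeros and ties present, SciPy falls back to a normal approximation.)

Python sketch
from scipy.stats import wilcoxon

method_a = [6.3, 6.3, 3.5, 5.1, 5.5, 7.7, 6.3, 2.8, 3.4, 5.7, 5.6, 6.2, 6.6,
            7.7, 7.4, 5.6, 6.3, 8.4, 5.6, 4.8, 4.3, 4.2, 3.3, 3.8, 5.7, 4.1]
method_b = [5.2, 6.6, 2.3, 4.4, 4.1, 6.4, 5.7, 2.3, 3.2, 5.2, 4.9, 6.1, 6.3,
            7.4, 7.4, 4.9, 5.4, 8.4, 5.1, 4.4, 4.3, 4.1, 2.2, 4.0, 5.8, 4.0]

# zero_method='wilcox' drops the zero differences (i = 15, 18, 21);
# correction=True applies the 1/2 continuity correction.
res = wilcoxon(method_a, method_b, zero_method='wilcox', correction=True)
print(res.statistic, res.pvalue)  # statistic = min(w+, w-) = 16, p ~ 0.0002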
14.1.2 Wilcoxon Signed Rank Test
• If $d_i = 0$, the observation is dropped and only the nonzero differences are retained.
• If several $|d_i|$ are tied for the same rank, each is assigned the average of the ranks they would occupy, called the midrank.
14.1.2 Wilcoxon Signed Rank Test
The same differences, ordered by $|d_i|$ with the zero differences listed first:

No. | A   | B   | Diff | Rank       No. | A   | B   | Diff | Rank
 i  | xi  | yi  |  di  |  ri         i  | xi  | yi  |  di  |  ri
15  | 7.4 | 7.4 |  0   |  -          8  | 2.8 | 2.3 |  0.5 | 12
18  | 8.4 | 8.4 |  0   |  -         10  | 5.7 | 5.2 |  0.5 | 12
21  | 4.3 | 4.3 |  0   |  -         19  | 5.6 | 5.1 |  0.5 | 12
12  | 6.2 | 6.1 |  0.1 |  2.5        7  | 6.3 | 5.7 |  0.6 | 14
22  | 4.2 | 4.1 |  0.1 |  2.5        4  | 5.1 | 4.4 |  0.7 | 16
25  | 5.7 | 5.8 | -0.1 |  2.5       11  | 5.6 | 4.9 |  0.7 | 16
26  | 4.1 | 4.0 |  0.1 |  2.5       16  | 5.6 | 4.9 |  0.7 | 16
 9  | 3.4 | 3.2 |  0.2 |  5.5       17  | 6.3 | 5.4 |  0.9 | 18
24  | 3.8 | 4.0 | -0.2 |  5.5        1  | 6.3 | 5.2 |  1.1 | 19.5
 2  | 6.3 | 6.6 | -0.3 |  8         23  | 3.3 | 2.2 |  1.1 | 19.5
13  | 6.6 | 6.3 |  0.3 |  8          3  | 3.5 | 2.3 |  1.2 | 21
14  | 7.7 | 7.4 |  0.3 |  8          6  | 7.7 | 6.4 |  1.3 | 22
20  | 4.8 | 4.4 |  0.4 | 10          5  | 5.5 | 4.1 |  1.4 | 23
14.1.2 Wilcoxon Signed Rank Test
In this table, the four smallest nonzero $|d_i|$ are all equal to 0.1 (for i = 12, 22, 25, 26). They would occupy ranks 1, 2, 3, 4, so each receives the midrank
$$\frac{1 + 2 + 3 + 4}{4} = \frac{10}{4} = 2.5.$$
Therefore the ranks of these four differences are not 1, 2, 3, 4, but rather 2.5 each.
14.2 Inferences for independent samples
1. Wilcoxon rank sum test
Assumption: there are no ties in the two samples.
Hypotheses: $H_0: \theta_1 = \theta_2$ vs. $H_a: \theta_1 \neq \theta_2$.
Step 1: Rank all the observations together.
Step 2: Sum the ranks of the two samples separately ($w_1$ = sum of the ranks of the x's, $w_2$ = sum of the ranks of the y's).
Step 3: Reject the null hypothesis if $w_1$ is large or if $w_2$ is small.
Problem: the distributions of $W_1$ and $W_2$ are not the same when $n_1 \neq n_2$.
14.2.1 Wilcoxon-Mann-Whitney Test
2. Mann-Whitney test
Step 1: Compare each $x_i$ with each $y_j$ ($u_1$ = # of pairs in which $x_i > y_j$; $u_2$ = # of pairs in which $x_i < y_j$).
Step 2: Reject $H_0$ if $u_1$ is large or $u_2$ is small.
Relation to the rank sum statistics:
$$u_1 = w_1 - \frac{n_1(n_1+1)}{2}, \qquad u_2 = w_2 - \frac{n_2(n_2+1)}{2}.$$
P-value: $P(U \ge u_1) = P(U \le u_2)$.
For large samples, we use the normal approximation with
$$E(U) = \frac{n_1 n_2}{2}, \qquad \mathrm{Var}(U) = \frac{n_1 n_2 (N+1)}{12}.$$
Rejection rule: reject $H_0$ if
$$z = \frac{u_1 - E(U) - \frac{1}{2}}{\sqrt{\mathrm{Var}(U)}} \ge z_\alpha.$$
14.2.1 Wilcoxon-Mann-Whitney Test
Example: Failure Times of Capacitors

Table 1: Times to Failure
Control Group:  5.2, 17.1, 8.5, 17.9, 9.8, 23.7, 12.3, 29.8
Stressed Group: 1.1, 7.2, 2.3, 9.1, 3.2, 15.2, 6.3, 18.3, 7.0, 21.1

Let $F_1$ be the c.d.f. of the control group and $F_2$ the c.d.f. of the stressed group.
$H_0: F_1 = F_2$ vs. $H_a: F_1 < F_2$ (the control capacitors tend to last longer).

Table 2: Ranks of Times to Failure
Control Group:  4, 13, 8, 14, 10, 17, 11, 18
Stressed Group: 1, 7, 2, 9, 3, 12, 5, 15, 6, 16

• T.S.: $w_1 = 95$, $w_2 = 76$, $u_1 = 59$, $u_2 = 21$
• P-value = .051 from Table A.11
• Compare with the large-sample normal approximation:
$$z = \frac{59 - \frac{(8)(10)}{2} - \frac{1}{2}}{\sqrt{\frac{(8)(10)(19)}{12}}} = 1.643$$
• P-value = .052
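(Aside, not from the original slides: the same test via scipy.stats.mannwhitneyu, whose statistic for (x, y) counts the pairs with $x_i > y_j$, i.e. $u_1 = 59$ here; method='exact' reproduces the Table A.11 p-value.)

Python sketch
from scipy.stats import mannwhitneyu

control  = [5.2, 17.1, 8.5, 17.9, 9.8, 23.7, 12.3, 29.8]
stressed = [1.1, 7.2, 2.3, 9.1, 3.2, 15.2, 6.3, 18.3, 7.0, 21.1]

# Exact null distribution: statistic u1 = 59, p ~ 0.051 (Table A.11).
print(mannwhitneyu(control, stressed, alternative='greater', method='exact'))
# Normal approximation with continuity correction: p ~ 0.05.
print(mannwhitneyu(control, stressed, alternative='greater',
                   method='asymptotic').pvalue)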
14.2.1 Wilcoxon-Mann-Whitney Test
Null Distribution of the Wilcoxon-Mann-Whitney Test Statistic
Assumption: under $H_0$, all $N = n_1 + n_2$ observations come from the common distribution $F_1 = F_2$, and all possible orderings of these observations, with $n_1$ coming from $F_1$ and $n_2$ coming from $F_2$, are equally likely.
14.2.1 Wilcoxon-Mann-Whitney Test
Example: Find the null distribution of $W_1$ and $U_1$ when $n_1 = 2$ and $n_2 = 2$.
Each of the $\binom{4}{2} = 6$ assignments of the ranks 1-4 to the two x's is equally likely:

Ranks: 1 2 3 4 | w1 | u1
       x x y y |  3 |  0
       x y x y |  4 |  1
       x y y x |  5 |  2
       y x x y |  5 |  2
       y x y x |  6 |  3
       y y x x |  7 |  4

Null distribution of W1 and U1:
w1 | u1 | p
 3 |  0 | 1/6
 4 |  1 | 1/6
 5 |  2 | 2/6
 6 |  3 | 1/6
 7 |  4 | 1/6
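(Aside, not from the original slides: the same enumeration done by brute force in Python; every choice of ranks for the x's is equally likely under $H_0$.)

Python sketch
from itertools import combinations
from collections import Counter

n1, n2 = 2, 2
N = n1 + n2
dist = Counter()
for x_ranks in combinations(range(1, N + 1), n1):  # C(4, 2) = 6 orderings
    dist[sum(x_ranks)] += 1

total = sum(dist.values())
for w1 in sorted(dist):
    u1 = w1 - n1 * (n1 + 1) // 2  # u1 = w1 - n1(n1+1)/2
    print(f"w1={w1}, u1={u1}, p={dist[w1]}/{total}")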
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
Suppose the distributions $F_1$ and $F_2$ belong to a location parameter family with location parameters $\theta_1$ and $\theta_2$, respectively:
$$F_1(x) = F(x - \theta_1) \qquad \text{and} \qquad F_2(y) = F(y - \theta_2),$$
where $F$ is a common unknown distribution function and $\theta_1$ and $\theta_2$ are the respective population medians.
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
A CI for $\theta_1 - \theta_2$ can be obtained by inverting the Mann-Whitney test. The procedure is as follows:
Step 1: Calculate all $N = n_1 n_2$ differences $d_{ij} = x_i - y_j$ ($1 \le i \le n_1$, $1 \le j \le n_2$) and rank them:
$$d_{(1)} \le d_{(2)} \le \cdots \le d_{(N)},$$
where $d_{(i)}$ denotes the $i$th ordered value of the differences.
Step 2: Let $u = u_{n_1, n_2, 1-\alpha/2}$ be the lower $\alpha/2$ critical point of the null distribution of the $U$-statistic. Then a $100(1-\alpha)\%$ CI for $\theta_1 - \theta_2$ is given by
$$d_{(u+1)} \le \theta_1 - \theta_2 \le d_{(N-u)}.$$
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
Example
Find a 95% CI for the difference between the median failure times of the control group and the thermally stressed group of capacitors, using the data from Example 14.7.
$n_1 = 8$, $n_2 = 10$, $N = n_1 n_2 = 8 \times 10 = 80$.
The lower 2.2% critical point of the distribution of $U$ is $u = 17$; by symmetry, the upper 2.2% critical point is $80 - 17 = 63$. Setting $\alpha/2 = 0.022$ gives $1 - \alpha = 1 - 0.044 = 0.956$, so a 95.6% CI for the difference between the median failure times is $[d_{(18)}, d_{(63)}]$.
The differences $d_{ij}$ are calculated in an array form in Table 14.7. Counting the 18th ordered difference in from each end, the 95.6% CI for the difference of the two medians is
$$[d_{(18)}, d_{(63)}] = [-1.1, 14.7].$$
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
Table A.11 (p. 684): upper tail probabilities $P(U \ge u_1)$ for $n_1 = 8$, $n_2 = 10$:

n1 | n2 | u1 = upper critical point (80 - u1 = lower critical point) | Upper Tail Probability
 8 | 10 | 59 (80 - 59 = 21) | 0.051
 8 | 10 | 62 (80 - 62 = 18) | 0.027
 8 | 10 | 63 (80 - 63 = 17) | 0.022
 8 | 10 | 66 (80 - 66 = 14) | 0.010
 8 | 10 | 68 (80 - 68 = 12) | 0.006
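(Aside, not from the original slides: a Python sketch of the confidence-interval construction, computing all 80 pairwise differences and reading off the 18th and 63rd order statistics.)

Python sketch
import numpy as np

control  = np.array([5.2, 17.1, 8.5, 17.9, 9.8, 23.7, 12.3, 29.8])
stressed = np.array([1.1, 7.2, 2.3, 9.1, 3.2, 15.2, 6.3, 18.3, 7.0, 21.1])

# All n1*n2 = 80 differences d_ij = x_i - y_j, sorted.
d = np.sort(np.subtract.outer(control, stressed).ravel())

u = 17  # lower 2.2% critical point of U
# 1-based order statistics d_(u+1) and d_(N-u): [-1.1, 14.7] per the example.
print(d[u], d[len(d) - u - 1])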
14.3 Inferences for Several Independent Samples
One-way layout experiment
The data are classified according to the level of a single treatment factor.
Completely Randomized Design
• Compares $a \ge 2$ treatments.
• The available experimental units are randomly assigned to each treatment.
• The number of experimental units in different treatment groups does not have to be the same.
14.3 Inferences for Several Independent Samples
Examples of one-way layout experiments:
• Comparing the effectiveness of different pills for migraine.
• Comparing the durability of different tires.
• etc.

Treatment:      1        2       ...    a
Observations:   x11      x21     ...    xa1
                x12      x22     ...    xa2
                ...      ...     ...    ...
                x1n1     x2n2    ...    xana
Sample median:  Θ1       Θ2      ...    Θa
Sample SD:      s1       s2      ...    sa
14.3 Inferences for Several Independent Samples
Assumptions
1. The data on each treatment form a random sample from a continuous c.d.f. $F_i$.
2. The random samples are independent.
3. $F_i(y) = F(y - \theta_i)$, where $\theta_i$ is the location parameter of $F_i$ and $\theta_i$ = median of $F_i$; that is, $F_1, F_2, \ldots, F_a$ are shifted copies of a common distribution with medians $\theta_1, \theta_2, \ldots, \theta_a$.
14.3 Inferences for Several Independent Samples
Hypotheses
$H_0: F_1 = F_2 = \cdots = F_a$ vs. $H_1: F_i < F_j$ for some $i \neq j$.
Under the location model, this can be restated as
$H_0: \theta_1 = \theta_2 = \cdots = \theta_a$ vs. $H_1: \theta_i > \theta_j$ for some $i \neq j$.
Can we say that all the $F_i$'s are the same?
14.3.1 Kruskal-Wallis Test
STEP 1:
Rank all $N = \sum_{i=1}^{a} n_i$ observations in ascending order, assigning midranks in case of ties: $r_{ij} = \mathrm{rank}(y_{ij})$. Note that
$$\sum_{i,j} r_{ij} = 1 + 2 + \cdots + N = \frac{N(N+1)}{2} \qquad \text{and} \qquad E(\bar{r}) = \frac{N+1}{2}.$$
STEP 2:
Calculate the rank sums $r_i = \sum_{j=1}^{n_i} r_{ij}$ and the averages $\bar{r}_i = r_i / n_i$, $i = 1, 2, \ldots, a$.
14.3.1 Kruskal-Wallis Test
STEP 3:
Calculate the Kruskal-Wallis test statistic
$$kw = \frac{12}{N(N+1)} \sum_{i=1}^{a} n_i \left(\bar{r}_i - \frac{N+1}{2}\right)^2 = \frac{12}{N(N+1)} \sum_{i=1}^{a} \frac{r_i^2}{n_i} - 3(N+1).$$
STEP 4:
Reject $H_0$ for large values of $kw$. If the $n_i$'s are large, $kw$ follows a chi-square distribution with $a - 1$ degrees of freedom under $H_0$.
14.3.1 Kruskal-Wallis Test
Example :
NRMA, the world’s biggest car insurance
company, has decided to test the durability of
tires from 4 major companies.
14.3.1 Kruskal-Wallis Test
Example:
Ranks and average test scores for the tires from the 4 major companies (rank | score):

Company 1       Company 2       Company 3        Company 4
 3 | 14.59       8 | 20.27      19   | 27.82     24 | 33.16
13 | 23.44      17 | 26.84      14.5 | 24.92     18 | 26.93
16 | 25.43       4 | 14.71      20   | 28.68     22 | 30.43
 5 | 18.15      10 | 22.34      11   | 23.32     27 | 36.43
 9 | 20.82       6 | 19.49      23   | 32.85     28 | 37.04
 1 | 14.06      14.5 | 24.92    26   | 33.90     21 | 29.76
 2 | 14.26       7 | 20.20      12   | 23.42     25 | 33.88
∑ of ranks: 49       66.5            125.5            165
14.3.1 Kruskal-Wallis Test
Example:
$$kw = \frac{12}{N(N+1)} \sum_{i=1}^{a} \frac{r_i^2}{n_i} - 3(N+1) = \frac{12}{28(29)} \left[\frac{(49)^2}{7} + \frac{(66.5)^2}{7} + \frac{(125.5)^2}{7} + \frac{(165)^2}{7}\right] - 3(29) = 18.134$$
14.3.1 Kruskal-Wallis Test
Example:
$kw = 18.134 > \chi^2_{3,.005} = 12.837$, so the differences among the four companies are highly significant.
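(Aside, not from the original slides: scipy.stats.kruskal applied to the four companies; SciPy uses midranks and a tie correction, so its statistic can differ slightly from the uncorrected kw = 18.134.)

Python sketch
from scipy.stats import kruskal

brand1 = [14.59, 23.44, 25.43, 18.15, 20.82, 14.06, 14.26]
brand2 = [20.27, 26.84, 14.71, 22.34, 19.49, 24.92, 20.20]
brand3 = [27.82, 24.92, 28.68, 23.32, 32.85, 33.90, 23.42]
brand4 = [33.16, 26.93, 30.43, 36.43, 37.04, 29.76, 33.88]

stat, p = kruskal(brand1, brand2, brand3, brand4)
print(stat, p)  # ~18.1 on a - 1 = 3 df, p well below .005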
14.3.2 Pairwise Comparisons
Comparing two groups among the treatments: under $H_0$,
$$E(\bar{R}_i - \bar{R}_j) = 0 \qquad \text{and} \qquad \mathrm{Var}(\bar{R}_i - \bar{R}_j) = \frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right).$$
For large $n_i$'s, $\bar{R}_i - \bar{R}_j$ is approximately normally distributed, giving
$$z_{ij} = \frac{\bar{r}_i - \bar{r}_j}{\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}.$$
14.3.2 Pairwise Comparisons
To control the familywise type I error rate at level $\alpha$, the $|z_{ij}|$ statistics should be referred to the appropriate Studentized range distribution, as in the Tukey method (Chapter 12).
14.3.2 Pairwise Comparisons
• Number of treatment groups compared = $a$.
• Degrees of freedom = $\infty$ (assumption: the samples are large).
Compare with the critical constant $q_{a,\infty,\alpha}$: declare treatments $i$ and $j$ different if
$$|z_{ij}| > \frac{q_{a,\infty,\alpha}}{\sqrt{2}}, \qquad \text{or equivalently} \qquad |\bar{r}_i - \bar{r}_j| > \frac{q_{a,\infty,\alpha}}{\sqrt{2}}\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}.$$
14.3.2 Pairwise Comparisons
Example:
Ranks of the average test scores (from the table above):
Company 1: 3, 13, 16, 5, 9, 1, 2 (∑ = 49)
Company 2: 8, 17, 4, 10, 6, 14.5, 7 (∑ = 66.5)
Company 3: 19, 14.5, 20, 11, 23, 26, 12 (∑ = 125.5)
Company 4: 24, 18, 22, 27, 28, 21, 25 (∑ = 165)
14.3.2 Pairwise Comparisons
Example: Let $\alpha$ be .05. Then $q_{4,\infty,.05} = 3.63$ and
$$\frac{q_{4,\infty,.05}}{\sqrt{2}}\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = \frac{3.63}{\sqrt{2}}\sqrt{\frac{(28)(29)}{12}\left(\frac{1}{7} + \frac{1}{7}\right)} = 11.29.$$
The mean ranks are $\bar{r}_1 = 7$, $\bar{r}_2 = 9.5$, $\bar{r}_3 = 17.93$, $\bar{r}_4 = 23.57$, so $|\bar{r}_1 - \bar{r}_4| = 16.57 > 11.29$ and $|\bar{r}_2 - \bar{r}_4| = 14.07 > 11.29$: companies 1 and 2 differ significantly from company 4 (Goodyear!).
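(Aside, not from the original slides: a Python sketch of these pairwise comparisons; scipy.stats.studentized_range (SciPy 1.7+) is assumed, with a large df standing in for df = ∞.)

Python sketch
import numpy as np
from scipy.stats import studentized_range

rank_sums = np.array([49.0, 66.5, 125.5, 165.0])
n_i, a, N = 7, 4, 28
mean_ranks = rank_sums / n_i  # 7.0, 9.5, ~17.93, ~23.57

q = studentized_range.ppf(0.95, a, 1000)  # ~3.63, approximating q_{4, inf, .05}
cutoff = q / np.sqrt(2) * np.sqrt(N * (N + 1) / 12 * (1 / n_i + 1 / n_i))
print(round(cutoff, 2))  # ~11.29

for i in range(a):
    for j in range(i + 1, a):
        diff = abs(mean_ranks[i] - mean_ranks[j])
        if diff > cutoff:
            print(f"companies {i + 1} and {j + 1} differ: {diff:.2f}")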
14.4 Inferences for Several Matched Samples
Randomized block design: $a \ge 2$ treatment groups and $b \ge 2$ blocks.
Friedman test: a distribution-free, rank-based test for comparing the treatments in a randomized block design.
Hypotheses:
$H_0: F_{1j} = F_{2j} = \cdots = F_{aj}$ (for all $j$) vs. $H_1: F_{ij} < F_{kj}$ for some $i \neq k$.
Under the location model, this can be restated as
$H_0: \theta_1 = \theta_2 = \cdots = \theta_a$ vs. $H_1: \theta_i > \theta_k$ for some $i \neq k$.
14.4.1 Friedman Test
STEP 1:
Within each block $j$, rank the $a$ observations in ascending order, assigning midranks in case of ties: $r_{ij} = \mathrm{rank}(y_{ij})$ within block $j$.
STEP 2:
Calculate the rank sums $r_i = \sum_{j=1}^{b} r_{ij}$, $i = 1, 2, \ldots, a$.
14.4.1 Friedman Test
STEP 3:
Calculate the Friedman test statistic
$$fr = \frac{12}{ab(a+1)} \sum_{i=1}^{a} \left(r_i - \frac{b(a+1)}{2}\right)^2 = \frac{12}{ab(a+1)} \sum_{i=1}^{a} r_i^2 - 3b(a+1).$$
STEP 4:
Reject $H_0$ for large values of $fr$. If $b$ is large, $fr$ follows a chi-square distribution with $a - 1$ degrees of freedom under $H_0$.
14.4.1 Friedman Test
Example: Drip Loss in Meat Loaves

Oven Position | Batch 1 | Rank | Batch 2 | Rank | Batch 3 | Rank | Rank Sum
      1       |  7.33   |  8   |  8.11   |  8   |  8.06   |  7   |   23
      2       |  3.22   |  1   |  3.72   |  1   |  4.28   |  1   |    3
      3       |  3.28   |  2.5 |  5.11   |  4   |  4.56   |  2   |    8.5
      4       |  6.44   |  7   |  5.78   |  6   |  8.61   |  8   |   21
      5       |  3.83   |  4   |  6.50   |  7   |  7.72   |  5   |   16
      6       |  3.28   |  2.5 |  5.11   |  4   |  5.56   |  3   |    9.5
      7       |  5.06   |  6   |  5.11   |  4   |  7.83   |  6   |   16
      8       |  4.44   |  5   |  4.28   |  2   |  6.33   |  4   |   11
14.4.1 Friedman Test
Example: the Friedman test statistic equals
$$fr = \frac{12}{ab(a+1)} \sum_{i=1}^{a} r_i^2 - 3b(a+1) = \frac{12}{8 \cdot 3 \cdot 9}\left[23^2 + 3^2 + 8.5^2 + 21^2 + 16^2 + 9.5^2 + 16^2 + 11^2\right] - 3 \cdot 3 \cdot 9 = 17.583.$$
Since $fr = 17.583 > \chi^2_{7,.025} = 16.012$, there are significant differences between the oven positions. However, the number of blocks is only 3, so the large-sample chi-square approximation may not be accurate.
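(Aside, not from the original slides: scipy.stats.friedmanchisquare on the same data, with one list per oven position and one entry per batch; SciPy applies a within-block tie correction, so its statistic comes out slightly above the uncorrected 17.583.)

Python sketch
from scipy.stats import friedmanchisquare

positions = [
    [7.33, 8.11, 8.06], [3.22, 3.72, 4.28], [3.28, 5.11, 4.56],
    [6.44, 5.78, 8.61], [3.83, 6.50, 7.72], [3.28, 5.11, 5.56],
    [5.06, 5.11, 7.83], [4.44, 4.28, 6.33],
]
stat, p = friedmanchisquare(*positions)
print(stat, p)  # chi-square statistic on a - 1 = 7 df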
14.4.2 Pairwise Comparisons
Comparing two groups among the treatments: under $H_0$,
$$E(\bar{R}_i - \bar{R}_j) = 0 \qquad \text{and} \qquad \mathrm{Var}(\bar{R}_i - \bar{R}_j) = \frac{a(a+1)}{6b}.$$
As in the case of the Kruskal-Wallis test, treatments $i$ and $j$ can be declared different at significance level $\alpha$ if
$$|\bar{r}_i - \bar{r}_j| \ge \frac{q_{a,\infty,\alpha}}{\sqrt{2}}\sqrt{\frac{a(a+1)}{6b}}.$$
14.5 Rank Correlation Methods
What is Correlation?
Correlation indicates the strength and direction of a linear relationship between two random variables. In general statistical usage, correlation refers to the departure of two variables from independence.
Correlation does not imply causation.
14.5.1 Spearman’s Rank Correlation Coefficient
Charles Edward Spearman
• Born September 10, 1863
• Died September 7, 1945
• An English psychologist known for his work in statistics, as a pioneer of factor analysis, and for Spearman's rank correlation coefficient.
14.5.1 Spearman’s Rank Correlation Coefficient
What are we correlating?
• Yearly alcohol consumption from wine
• Yearly heart disease deaths (per 100,000)
• A 19-country study
DATA

No. | Country     | Xi: Alcohol from Wine | Yi: Heart Disease Deaths | Ui: Rank X | Vi: Rank Y | Di = Ui - Vi
 1  | Australia   | 2.5 | 211 | 11   | 12.5 |  -1.5
 2  | Austria     | 3.9 | 167 | 15   |  6.5 |   8.5
 3  | Belgium     | 2.9 | 131 | 13.5 |  5   |   8.5
 4  | Canada      | 2.4 | 191 | 10   |  9   |   1
 5  | Denmark     | 2.9 | 220 | 13.5 | 14   |  -0.5
 6  | Finland     | 0.8 | 297 |  3   | 18   | -15
 7  | France      | 9.1 |  71 | 19   |  1   |  18
 8  | Iceland     | 0.8 | 211 |  3   | 12.5 |  -9.5
 9  | Ireland     | 0.7 | 300 |  1   | 19   | -18
10  | Italy       | 7.9 | 107 | 18   |  3   |  15
11  | Netherlands | 1.8 | 167 |  8   |  6.5 |   1.5
12  | New Zealand | 1.9 | 266 |  9   | 16   |  -7
13  | Norway      | 0.8 | 227 |  3   | 15   | -12
14  | Spain       | 6.5 |  86 | 17   |  2   |  15
15  | Sweden      | 1.6 | 207 |  7   | 11   |  -4
16  | Switzerland | 5.8 | 115 | 16   |  4   |  12
17  | UK          | 1.3 | 285 |  6   | 17   | -11
18  | US          | 1.2 | 199 |  5   | 10   |  -5
19  | W. Germany  | 2.7 | 172 | 12   |  8   |   4
14.5.1 Spearman’s Rank Correlation Coefficient
Spearman's Rank Correlation Coefficient
• A nonparametric (distribution-free) rank statistic proposed in 1904 as a measure of the strength of the association between two variables.
• The Spearman rank correlation coefficient gives a measure of monotone association; it is used when the distribution of the data makes Pearson's correlation coefficient undesirable.
14.5.1 Spearman’s Rank Correlation Coefficient
Relevant Formulas
$$r_s = \frac{\sum_{i=1}^{n}(u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\left(\sum_{i=1}^{n}(u_i - \bar{u})^2\right)\left(\sum_{i=1}^{n}(v_i - \bar{v})^2\right)}},$$
where the $u_i$ and $v_i$ are the ranks of the $x_i$ and $y_i$. If there are no ties (so that each $d_i = u_i - v_i$ is an integer), this simplifies to
$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}.$$
14.5.1 Spearman’s Rank Correlation Coefficient
Example
• From the previous data we calculate:
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{(6)(2081.5)}{(19)(360)} = -0.826.$$
14.5.1 Spearman’s Rank Correlation Coefficient
Hypothesis Testing Using Spearman
• $H_0$: X and Y are independent
• $H_a$: X and Y are associated (positively or negatively)
14.5.1 Spearman’s Rank Correlation Coefficient
• For large $n$ (> 10), $R_s$ is approximately normally distributed with
$$E(R_s) = 0, \qquad \mathrm{Var}(R_s) = \frac{1}{n-1},$$
using the test statistic
$$z = r_s\sqrt{n-1}.$$
14.5.1 Spearman’s Rank Correlation Coefficient
Example
• From the previous data we calculate:
$$z = r_s\sqrt{n-1} = -0.826\sqrt{18} = -3.504$$
• Two-sided P-value = 0.0004
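(Aside, not from the original slides: scipy.stats.spearmanr on the wine data; SciPy uses midranks for the ties, so it agrees closely with the hand computation.)

Python sketch
from scipy.stats import spearmanr

wine = [2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1, 0.8, 0.7, 7.9,
        1.8, 1.9, 0.8, 6.5, 1.6, 5.8, 1.3, 1.2, 2.7]
deaths = [211, 167, 131, 191, 220, 297, 71, 211, 300, 107,
          167, 266, 227, 86, 207, 115, 285, 199, 172]

rho, p = spearmanr(wine, deaths)
print(rho, p)  # rho ~ -0.83: more wine is associated with fewer deaths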
14.5.2 Kendall’s Rank Correlation Coefficient
• Born September 6, 1907
• Died March 29, 1983
• Maurice Kendall was born in Kettering, Northamptonshire.
• He studied mathematics at St. John's College, Cambridge, where he played cricket and chess.
• After graduating as a Mathematics Wrangler in 1929, he joined the British Civil Service in the Ministry of Agriculture; in this position he became increasingly interested in using statistics.
• Developed the rank correlation coefficient that bears his name, treated at length in his book Rank Correlation Methods (1948).
14.5.2 Kendall’s Rank Correlation Coefficient
Kendall’s Rank Correlation Coefficient
• Consider a pair of bivariate random variables $(X_i, Y_i)$ and $(X_j, Y_j)$.
• Concordant: $(X_i - X_j)(Y_i - Y_j) > 0$,
• which implies: $X_i > X_j$ and $Y_i > Y_j$, or $X_i < X_j$ and $Y_i < Y_j$.
14.5.2 Kendall’s Rank Correlation Coefficient
Kendall’s Rank Correlation Coefficient
• Discordant: $(X_i - X_j)(Y_i - Y_j) < 0$,
• which implies: $X_i > X_j$ and $Y_i < Y_j$, or $X_i < X_j$ and $Y_i > Y_j$.
14.5.2 Kendall’s Rank Correlation Coefficient
Kendall’s Rank Correlation Coefficient
• Tied pair: $(X_i - X_j)(Y_i - Y_j) = 0$,
• which implies: $X_i = X_j$, or $Y_i = Y_j$, or both.
14.5.2 Kendall’s Rank Correlation Coefficient
Relevant Formula
$$\pi_c = P(\text{Concordant}) = P\big((X_i - X_j)(Y_i - Y_j) > 0\big),$$
$$\pi_d = P(\text{Discordant}) = P\big((X_i - X_j)(Y_i - Y_j) < 0\big),$$
$$\tau = \pi_c - \pi_d, \qquad \hat{\tau} = \hat{\pi}_c - \hat{\pi}_d, \qquad -1 \le \tau \le 1.$$
14.5.2 Kendall’s Rank Correlation Coefficient
Relevant Formula
$$\hat{\tau} = \frac{N_c - N_d}{N},$$
where $N = \binom{n}{2}$ is the total number of pairs, $N_c$ = # of concordant pairs, and $N_d$ = # of discordant pairs.
14.5.2 Kendall’s Rank Correlation Coefficient
Formula Continued
If there are ties, the formula is modified. Suppose there are $g$ groups of tied $X_i$'s with $a_j$ tied observations in the $j$th group, and $h$ groups of tied $Y_i$'s with $b_j$ tied observations in the $j$th group. Let
$$T_x = \sum_{j=1}^{g} \binom{a_j}{2}, \qquad T_y = \sum_{j=1}^{h} \binom{b_j}{2}.$$
Then
$$\hat{\tau} = \frac{N_c - N_d}{\sqrt{(N - T_x)(N - T_y)}}.$$
14.5.2 Kendall’s Rank Correlation Coefficient
Formula Explanation
• Five pairs of observations: $(x, y)$ = (1,3), (1,4), (1,5), (2,5), (3,4).
• There is $g = 1$ group of $a_1 = 3$ tied x's equal to 1, and there are $h = 2$ groups of tied y's: group 1 has $b_1 = 2$ tied y's equal to 4, and group 2 has $b_2 = 2$ tied y's equal to 5.
14.5.2 Kendall’s Rank Correlation Coefficient
Formula Example continued:
$$T_x = \binom{3}{2} = 3, \qquad T_y = \binom{2}{2} + \binom{2}{2} = 1 + 1 = 2.$$
Data

 i | Country      | Xi  | Yi  | Nci | Ndi | Nti
 1 | Ireland      | 0.7 | 300 |  0  | 18  |  0
 2 | Iceland      | 0.8 | 211 |  3  | 11  |  3
 3 | Norway       | 0.8 | 227 |  2  | 13  |  1
 4 | Finland      | 0.8 | 297 |  0  | 15  |  0
 5 | US           | 1.2 | 199 |  5  |  9  |  0
 6 | UK           | 1.3 | 285 |  0  | 13  |  0
 7 | Sweden       | 1.6 | 207 |  3  |  9  |  0
 8 | Netherlands  | 1.8 | 167 |  5  |  5  |  1
 9 | New Zealand  | 1.9 | 266 |  0  | 10  |  0
10 | Canada       | 2.4 | 191 |  2  |  7  |  0
11 | Australia    | 2.5 | 211 |  1  |  7  |  0
12 | West Germany | 2.7 | 172 |  1  |  6  |  0
13 | Belgium      | 2.9 | 131 |  2  |  4  |  0
14 | Denmark      | 2.9 | 220 |  0  |  5  |  0
15 | Austria      | 3.9 | 167 |  0  |  4  |  0
16 | Switzerland  | 5.8 | 115 |  0  |  3  |  0
17 | Spain        | 6.5 |  86 |  1  |  1  |  0
18 | Italy        | 7.9 | 107 |  0  |  1  |  0
19 | France       | 9.1 |  71 |  0  |  0  |  0
Totals: Nc = 25, Nd = 141, Nt = 5
14.5.2 Kendall’s Rank Correlation Coefficient
Testing Example
For the wine data, $n = 19$, so $N = \binom{19}{2} = 171$; the tie groups give $T_x = \binom{3}{2} + \binom{2}{2} = 4$ ($x = 0.8$ three times and $x = 2.9$ twice) and $T_y = \binom{2}{2} + \binom{2}{2} = 2$ ($y = 211$ and $y = 167$ each twice). Hence
$$\hat{\tau} = \frac{N_c - N_d}{\sqrt{(N - T_x)(N - T_y)}} = \frac{25 - 141}{\sqrt{(171 - 4)(171 - 2)}} = -0.690.$$
14.5.2 Kendall’s Rank Correlation Coefficient
Hypothesis Testing
$$H_0: \tau = 0 \quad \text{vs.} \quad H_a: \tau \neq 0$$
Under $H_0$,
$$E(\hat{\tau}) = 0, \qquad \mathrm{Var}(\hat{\tau}) = \frac{2(2n+5)}{9n(n-1)},$$
giving the test statistic
$$z = \hat{\tau}\sqrt{\frac{9n(n-1)}{2(2n+5)}}.$$
14.5.2 Kendall’s Rank Correlation Coefficient
Testing Example
With $\hat{\tau} = -0.690$ and $n = 19$:
$$z = -0.690\sqrt{\frac{(9)(19)(18)}{2(43)}} = -4.128$$
• Two-sided P-value < 0.0001
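(Aside, not from the original slides: scipy.stats.kendalltau computes the tie-corrected tau-b, which is exactly the statistic used above.)

Python sketch
from scipy.stats import kendalltau

wine = [2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1, 0.8, 0.7, 7.9,
        1.8, 1.9, 0.8, 6.5, 1.6, 5.8, 1.3, 1.2, 2.7]
deaths = [211, 167, 131, 191, 220, 297, 71, 211, 300, 107,
          167, 266, 227, 86, 207, 115, 285, 199, 172]

tau, p = kendalltau(wine, deaths)
print(tau, p)  # tau ~ -0.69, two-sided p < 0.0001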
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
Q: Why do we need Kendall’s coefficient of concordance?
A: It is a measure of association between several matched samples.
Q: Why not use Kendall’s rank correlation coefficient instead?
A: Because it only works for two samples.
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
• How can you apply this to real life? A common and interesting example: a taste-testing experiment used four tasters to rank eight recipes, with the following results. Are the tasters in agreement? Let's find out!
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance

Recipe | Taster 1 | Taster 2 | Taster 3 | Taster 4 | Rank Sum
   1   |    5     |    4     |    5     |    4     |    18
   2   |    7     |    5     |    7     |    5     |    24
   3   |    1     |    2     |    1     |    3     |     7
   4   |    3     |    3     |    2     |    1     |     9
   5   |    4     |    6     |    4     |    6     |    20
   6   |    2     |    1     |    3     |    2     |     8
   7   |    8     |    7     |    8     |    8     |    31
   8   |    6     |    8     |    6     |    7     |    27
14.5.3 Kendall’s Coefficient of Concordance
How does it work?
• It is closely related to Friedman’s test statistic (Section 14.4).
• The “a” treatments are the candidates (recipes).
• The “b” blocks are the judges (tasters).
• Each judge ranks the “a” candidates.
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
• The discrepancy of the actual rank sums $r_i$ from their common value $b(a+1)/2$ under perfect disagreement,
$$d = \sum_{i=1}^{a}\left(r_i - \frac{b(a+1)}{2}\right)^2,$$
is a measure of agreement between the judges.
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
• The maximum value of this measure is attained when there is perfect agreement, in which case the rank sums are $b, 2b, \ldots, ab$ in some order. It is given by
$$d_{\max} = \sum_{i=1}^{a}\left(ib - \frac{b(a+1)}{2}\right)^2 = \frac{b^2 a(a^2 - 1)}{12}.$$
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
• Kendall’s “w” statistic is the variability of the rank sums $r_i$ divided by the maximum possible value this variability can take:
$$w = \frac{d}{d_{\max}} = \frac{12}{b^2 a(a^2 - 1)}\sum_{i=1}^{a}\left(r_i - \frac{b(a+1)}{2}\right)^2.$$
• The maximum, $w = 1$, occurs when all judges are in perfect agreement. Hence
$$0 \le w \le 1.$$
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
• What relationship do “w” and “fr”, Friedman’s statistic, have?
$$w = \frac{fr}{b(a-1)}$$
• Does Kendall’s “w” statistic relate to Spearman’s rank correlation coefficient? Only when there are $b = 2$ judges:
$$r_s = 2w - 1.$$
14.5.3 Kendall’s Coefficient of Concordance
Kendall’s Coefficient of Concordance
Q: How can we perform statistical tests? What distribution does it follow?
• To test “w” for statistical significance, refer $fr = b(a-1)w$ to the chi-square ($\chi^2$) distribution with $a - 1$ degrees of freedom.
Kendall’s Coefficient of Concordance
• To find out whether the tasters are in agreement, we calculate Kendall’s coefficient of concordance.
• Friedman’s statistic: fr = 24.667.
• Therefore, w = 24.667 / ((4)(7)) = 0.881.
• Comparing fr = 24.667 with $\chi^2_{7,.05} = 14.067$: since fr exceeds this critical value,
• we conclude that the tasters agree.
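(Aside, not from the original slides: Kendall's w computed from the taster-by-recipe rank matrix with NumPy, following the formulas above.)

Python sketch
import numpy as np

# ranks[j, i] = rank given by taster j (b = 4) to recipe i (a = 8)
ranks = np.array([
    [5, 7, 1, 3, 4, 2, 8, 6],
    [4, 5, 2, 3, 6, 1, 7, 8],
    [5, 7, 1, 2, 4, 3, 8, 6],
    [4, 5, 3, 1, 6, 2, 8, 7],
])
b, a = ranks.shape
r = ranks.sum(axis=0)  # rank sums: 18, 24, 7, 9, 20, 8, 31, 27

d = np.sum((r - b * (a + 1) / 2) ** 2)  # observed discrepancy
d_max = b**2 * a * (a**2 - 1) / 12      # discrepancy under perfect agreement
w = d / d_max
fr = b * (a - 1) * w
print(w, fr)  # ~0.881 and ~24.667, as in the example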
14.6.1 Permutation Tests
Permutation Test
1) General Idea
A permutation test is a type of statistical significance test in which a reference distribution is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. Confidence intervals can also be derived from such tests.
2) Inventor
The theory evolved from the works of R.A. Fisher and E.J.G. Pitman in the 1930s.
14.6.1 Permutation Tests
Major Theory & Derivation
• The permutation test finds a p-value as the proportion of regroupings that would lead to a test statistic as extreme as the one observed. We will consider the permutation test based on sample averages, although one could compute and compare other test statistics.
• We have two samples that we wish to compare.
• Hypotheses:
H0: the differences between the two samples are due to chance.
Ha: sample 2 tends to have higher values than sample 1, not due simply to chance.
Ha: sample 2 tends to have smaller values than sample 1, not due simply to chance.
Ha: there are differences between the two samples, not due simply to chance.
14.6.1 Permutation Tests
To see if the observed difference $\bar{d}$ from our data supports H0 or one of the selected alternatives, do the following steps of a permutation test:
1. Pool the two samples.
2. Form every regrouping of the pooled data into two groups of the original sizes (or a large random sample of such regroupings).
3. Compute the difference of the sample averages for each regrouping.
4. The p-value is the proportion of regroupings whose difference is at least as extreme as the observed $\bar{d}$.
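(Aside, not from the original slides: a self-contained Python sketch of the steps above for the two-sided alternative, with exact enumeration of all regroupings; the toy data are hypothetical, only to show the mechanics.)

Python sketch
from itertools import combinations
from statistics import mean

def permutation_pvalue(sample1, sample2):
    """Two-sided permutation test for a difference in sample averages."""
    pooled = sample1 + sample2
    n1 = len(sample1)
    observed = abs(mean(sample1) - mean(sample2))
    count = total = 0
    for idx in combinations(range(len(pooled)), n1):  # every regrouping
        group1 = [pooled[i] for i in idx]
        group2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        if abs(mean(group1) - mean(group2)) >= observed:
            count += 1
    return count / total  # proportion at least as extreme as observed

print(permutation_pvalue([1.2, 2.8, 3.1], [4.5, 5.0, 6.2]))  # 0.1 here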
14.6.2 Bootstrap Method
1) General Idea
Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter such as a mean, median, proportion, odds ratio, correlation coefficient, or regression coefficient.
2) Inventor
Bradley Efron (b. 1938), whose work has spanned both theoretical and applied topics, including empirical Bayes analysis, applications of differential geometry to statistical inference, the analysis of survival data, and inference for microarray gene expression data.
Homepage: http://stat.stanford.edu/~brad/
E-mail: [email protected]
14.6.2 Bootstrap Method
3) Major Theory and Derivation
Consider the case where a random sample of size n is drawn from an unspecified probability distribution. The basic steps in the bootstrap procedure are:
1. Draw a bootstrap sample of size n, with replacement, from the original sample.
2. Compute the estimator of interest on the bootstrap sample.
3. Repeat steps 1-2 a large number of times, B, and use the distribution of the B bootstrap replicates to estimate the standard error or a confidence interval.
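(Aside, not from the original slides: a minimal NumPy sketch of those steps, estimating the standard error of the sample median; B, the choice of estimator, and the reuse of the thermostat data are illustrative choices.)

Python sketch
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_se(data, estimator=np.median, B=2000):
    """Resample with replacement B times and take the SD of the replicates."""
    data = np.asarray(data)
    replicates = [estimator(rng.choice(data, size=len(data), replace=True))
                  for _ in range(B)]
    return np.std(replicates, ddof=1)

sample = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
print(bootstrap_se(sample))  # bootstrap SE of the median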
14.6.3 Jackknife Method
1) General Idea
The jackknife is a statistical method for estimating and compensating for bias and for deriving robust estimates of standard errors and confidence intervals. Jackknifed statistics are created by systematically dropping out subsets of the data one at a time and assessing the resulting variation in the studied parameter.
2) Inventor
The jackknife was introduced by Maurice Quenouille (1949) as a bias-reduction technique and was later extended, and named, by John Tukey (1958).
14.6.3 Jackknife Method
3) Major Theory and Derivation
We briefly describe how to obtain the standard deviation of a generic estimator using the jackknife method. For simplicity, consider the sample average. Let $\bar{x}$ be the sample average and $\bar{x}_{(i)}$ the sample average of the data set with the $i$th point deleted. Then we can define the average of the $\bar{x}_{(i)}$:
$$\bar{x}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\bar{x}_{(i)}.$$
The jackknife estimate of the standard deviation (standard error) is then defined as:
$$\widehat{SE}_{\text{jack}} = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left(\bar{x}_{(i)} - \bar{x}_{(\cdot)}\right)^2}.$$
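(Aside, not from the original slides: a direct NumPy translation of the formulas above; for the sample average, the jackknife SE reproduces s/sqrt(n) exactly.)

Python sketch
import numpy as np

def jackknife_se(data, estimator=np.mean):
    """Leave-one-out jackknife estimate of the standard error."""
    data = np.asarray(data)
    n = len(data)
    loo = np.array([estimator(np.delete(data, i)) for i in range(n)])  # x_(i)
    loo_mean = loo.mean()                                              # x_(.)
    return np.sqrt((n - 1) / n * np.sum((loo - loo_mean) ** 2))

sample = [202.2, 203.4, 200.5, 202.5, 206.3, 198.0, 203.7, 200.8, 201.3, 199.0]
print(jackknife_se(sample))  # equals np.std(sample, ddof=1) / np.sqrt(len(sample))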
SAS program
/** Kruskal-Wallis Test and Wilcoxon-Mann-Whitney Test **/
%macro _SASTASK_DROPDS(dsname);
%IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
DROP TABLE &dsname;
%END;
%IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
DROP VIEW &dsname;
%END;
%mend _SASTASK_DROPDS;
%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;
PROC SQL;
%_SASTASK_DROPDS(WORK.TMP0TempTableInput);
QUIT;
PROC SQL;
CREATE VIEW WORK.TMP0TempTableInput
AS SELECT PreTest, Gender FROM MIHIR.AMS572;
QUIT;
TITLE;
TITLE1 "Nonparametric One-Way ANOVA";
PROC NPAR1WAY DATA=WORK.TMP0TempTableInput WILCOXON
;
VAR PreTest;
CLASS Gender;
RUN; QUIT;
PROC SQL;
%_SASTASK_DROPDS(WORK.TMP0TempTableInput);
QUIT;
SAS output
Nonparametric One-Way ANOVA
The NPAR1WAY Procedure

Wilcoxon Scores (Rank Sums) for Variable PreTest, Classified by Variable Gender

Gender | N | Sum of Scores | Expected Under H0 | Std Dev Under H0 | Mean Score
  F    | 7 |     40.0      |       45.50       |     6.146877     |  5.714286
  M    | 5 |     38.0      |       32.50       |     6.146877     |  7.600000
Average scores were used for ties.

Wilcoxon Two-Sample Test
Statistic: 38.0000
Normal Approximation: Z = 0.8134, One-Sided Pr > Z = 0.2080, Two-Sided Pr > |Z| = 0.4160
t Approximation: One-Sided Pr > Z = 0.2166, Two-Sided Pr > |Z| = 0.4332
Z includes a continuity correction of 0.5.

Kruskal-Wallis Test
Chi-Square = 0.8006, DF = 1, Pr > Chi-Square = 0.3709
SAS program
/** Wilcoxon Signed Rank Test **/
%macro _SASTASK_DROPDS(dsname);
%IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
DROP TABLE &dsname;
%END;
%IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
DROP VIEW &dsname;
%END;
%mend _SASTASK_DROPDS;
%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
PROC SQL;
CREATE VIEW WORK.SORTTempTableSorted
AS SELECT ScoreChange FROM MIHIR.AMS572;
QUIT;
TITLE;
TITLE1 "Distribution analysis of: ScoreChange";
TITLE2 "Wilcoxon Signed Rank Test";
ODS EXCLUDE CIBASIC BASICMEASURES EXTREMEOBS MODES MOMENTS QUANTILES;
PROC UNIVARIATE DATA = WORK.SORTTempTableSorted
MU0=0
;
VAR ScoreChange;
HISTOGRAM / NOPLOT ;
RUN; QUIT;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
SAS output
Distribution analysis of: ScoreChange (Wilcoxon Signed Rank Test)
The UNIVARIATE Procedure
Variable: ScoreChange (Change in Test Scores)

Tests for Location: Mu0=0
Test        | Statistic    | p Value
Student's t | t = -0.80079 | Pr > |t| = 0.4402
Sign        | M = -1       | Pr >= |M| = 0.7744
Signed Rank | S = -8.5     | Pr >= |S| = 0.5278
SAS program
/** Friedman Test **/
%macro _SASTASK_DROPDS(dsname);
%IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
DROP TABLE &dsname;
%END;
%IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
DROP VIEW &dsname;
%END;
%mend _SASTASK_DROPDS;
%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
PROC SQL;
CREATE VIEW WORK.SORTTempTableSorted
AS SELECT Emotion, Subject, SkinResponse FROM WORK.HYPNOSIS1493;
QUIT;
TITLE; TITLE1 "Table Analysis";
TITLE2 "Results";
PROC FREQ DATA = WORK.SORTTempTableSorted
ORDER=INTERNAL
;
TABLES Subject * Emotion * SkinResponse /
NOROW
NOPERCENT
NOCUM
CMH
SCORES=RANK
ALPHA=0.05;
RUN; QUIT;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
SAS output
Table Analysis Results
The FREQ Procedure
Summary Statistics for Emotion by SkinResponse, Controlling for Subject

Cochran-Mantel-Haenszel Statistics (Based on Rank Scores)
Statistic | Alternative Hypothesis | DF | Value  | Prob
    1     | Nonzero Correlation    |  1 | 0.2400 | 0.6242
    2     | Row Mean Scores Differ |  3 | 6.4500 | 0.0917
    3     | General Association    | 84 |   .    |   .
At least 1 statistic not computed--singular covariance matrix.
Total Sample Size = 32
SAS program
/* Spearman correlation */
%macro _SASTASK_DROPDS(dsname);
%IF %SYSFUNC(EXIST(&dsname)) %THEN %DO;
DROP TABLE &dsname;
%END;
%IF %SYSFUNC(EXIST(&dsname, VIEW)) %THEN %DO;
DROP VIEW &dsname;
%END;
%mend _SASTASK_DROPDS;
%LET _EGCHARTWIDTH=0;
%LET _EGCHARTHEIGHT=0;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
PROC SQL;
CREATE VIEW WORK.SORTTempTableSorted
AS SELECT Arts, Economics FROM WORK.WESTERNRATES5171;
QUIT;
TITLE1 "Correlation Analysis";
/* Spearman Method */
PROC CORR DATA=WORK.SORTTempTableSorted
SPEARMAN
VARDEF=DF
NOSIMPLE
NOPROB
;
VAR Arts;
WITH Economics;
RUN;
/*Kendall Method */
PROC CORR DATA=WORK.SORTTempTableSorted
KENDALL
VARDEF=DF
NOSIMPLE
NOPROB
;
VAR Arts;
WITH Economics;
RUN; QUIT;
PROC SQL;
%_SASTASK_DROPDS(WORK.SORTTempTableSorted);
QUIT;
SAS output
Correlation Analysis
The CORR Procedure
1 With Variables: Economics
1 Variables: Arts

Spearman Correlation Coefficients, N = 52
Economics vs. Arts: 0.27926

Kendall Tau-b Correlation Coefficients, N = 52
Economics vs. Arts: 0.18854