1-Way ANOVA - Weighted Least Squares - Body Mass Indices among NHL, NBA, and EPL Athletes
Matrix form of 1-Way ANOVA
Cell Means Model
Weighted Least Squares
Body Mass Index for NHL, EPL, and NBA Athletes
Populations / Model
• All National Hockey League, English Premier League, and
National Basketball Association Players
• Y ≡ Body Mass Index = 703(weight/height²)
Weight (pounds), Height (inches)
NHL (2014): μ1 = 26.50, σ1 = 1.455, N1 = 717
NBA (2013-14): μ2 = 24.74, σ2 = 1.720, N2 = 505
EPL (2015): μ3 = 23.02, σ3 = 1.713, N3 = 526
Random Samples of Sizes n1, n2, n3:
$$y_{ij} = \mu_i + \varepsilon_{ij} \qquad \varepsilon_{ij} \sim NID\left(0, \sigma_i^2\right) \qquad i = 1, 2, 3; \; j = 1, \ldots, n_i$$
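The BMI definition on this slide can be written as a one-line function. This is only a minimal sketch; the function name `bmi` and the example athlete (200 pounds, 72 inches) are illustrative assumptions, not values from the slides.

```python
def bmi(weight_lb, height_in):
    """Body Mass Index from weight in pounds and height in inches."""
    return 703.0 * weight_lb / height_in ** 2

# Hypothetical athlete: 200 lb, 6 ft 0 in (72 in).
example_bmi = bmi(200.0, 72.0)
print(round(example_bmi, 2))
```

The weight enters linearly and the height is squared, so taller athletes of the same weight have lower BMI.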
[Figure: histograms of BMI by league, one panel each: NBA BMI Distribution, NHL BMI Distribution, EPL BMI Distribution. Horizontal axis: BMI. Vertical axis: Count (Frequency).]
[Figure: BMI - NBA, NHL, EPL Athletes. Normal density curves for the three leagues. Horizontal axis: BMI. Vertical axis: Normal Density.]
Matrix Form of the Model and OLS Estimation
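$$\mathbf{Y}_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{in_i} \end{bmatrix} \qquad
\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \\ \mathbf{Y}_3 \end{bmatrix} \qquad
\mathbf{X} = \begin{bmatrix} \mathbf{1}_{n_1} & \mathbf{0}_{n_1} & \mathbf{0}_{n_1} \\ \mathbf{0}_{n_2} & \mathbf{1}_{n_2} & \mathbf{0}_{n_2} \\ \mathbf{0}_{n_3} & \mathbf{0}_{n_3} & \mathbf{1}_{n_3} \end{bmatrix} \qquad
\boldsymbol{\beta} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix}$$

$$E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} = \begin{bmatrix} \mu_1 \mathbf{1}_{n_1} \\ \mu_2 \mathbf{1}_{n_2} \\ \mu_3 \mathbf{1}_{n_3} \end{bmatrix} \qquad
V\{\mathbf{Y}\} = \mathbf{V}_Y = \begin{bmatrix} \sigma_1^2 \mathbf{I}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \sigma_2^2 \mathbf{I}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \sigma_3^2 \mathbf{I}_{n_3 \times n_3} \end{bmatrix}$$

$$\mathbf{X}'\mathbf{X} = \begin{bmatrix} n_1 & 0 & 0 \\ 0 & n_2 & 0 \\ 0 & 0 & n_3 \end{bmatrix} \qquad
\left(\mathbf{X}'\mathbf{X}\right)^{-1} = \begin{bmatrix} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{n_3} \end{bmatrix} \qquad
\mathbf{X}'\mathbf{Y} = \begin{bmatrix} y_{1\cdot} \\ y_{2\cdot} \\ y_{3\cdot} \end{bmatrix} \qquad y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij}$$

Ordinary Least Squares (OLS) Estimator:
$$\hat{\boldsymbol{\beta}} = \left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{Y} = \begin{bmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{3\cdot} \end{bmatrix} \qquad \bar{y}_{i\cdot} = \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i}$$

$$E\{\hat{\boldsymbol{\beta}}\} = E\left\{\left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{Y}\right\} = \left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \boldsymbol{\beta}$$

$$V\{\hat{\boldsymbol{\beta}}\} = V\left\{\left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{Y}\right\} = \left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{V}_Y\mathbf{X}\left(\mathbf{X}'\mathbf{X}\right)^{-1}
= \begin{bmatrix} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{n_3} \end{bmatrix}
\begin{bmatrix} n_1\sigma_1^2 & 0 & 0 \\ 0 & n_2\sigma_2^2 & 0 \\ 0 & 0 & n_3\sigma_3^2 \end{bmatrix}
\begin{bmatrix} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{n_3} \end{bmatrix}
= \begin{bmatrix} \frac{\sigma_1^2}{n_1} & 0 & 0 \\ 0 & \frac{\sigma_2^2}{n_2} & 0 \\ 0 & 0 & \frac{\sigma_3^2}{n_3} \end{bmatrix}$$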
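Because X'X is diagonal in the cell means model, the OLS normal equations reduce to dividing each group total by its group size. The sketch below illustrates this with a small made-up data set; the function names and the toy observations are assumptions for illustration only, while the standard deviations and sample sizes (6, 3, 4) are taken from the slides.

```python
def ols_cell_means(groups):
    """OLS estimates for the cell means model: (X'X)^{-1} X'Y."""
    xtx_diag = [len(g) for g in groups]   # diagonal of X'X: n_1, n_2, n_3
    xty = [sum(g) for g in groups]        # X'Y: group totals y_i.
    return [total / n for total, n in zip(xty, xtx_diag)]

def ols_variances(sigma, n):
    """Diagonal of V{beta-hat}: sigma_i^2 / n_i."""
    return [s ** 2 / ni for s, ni in zip(sigma, n)]

# Hypothetical BMI observations for three groups (not real league data).
toy_groups = [[25.0, 27.0], [24.0], [22.0, 23.0, 24.0]]
beta_hat = ols_cell_means(toy_groups)

# Variances of the group means for sample sizes (6, 3, 4).
var_beta_hat = ols_variances((1.455, 1.720, 1.713), (6, 3, 4))
print(beta_hat, var_beta_hat)
```

Each estimate is simply the group sample mean, and its variance depends only on that group's own variance and sample size.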
OLS with all Means Being Equal
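$$\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \\ \mathbf{Y}_3 \end{bmatrix} \qquad
\mathbf{X}_0 = \begin{bmatrix} \mathbf{1}_{n_1} \\ \mathbf{1}_{n_2} \\ \mathbf{1}_{n_3} \end{bmatrix} \qquad
\boldsymbol{\beta}_0 = \mu \qquad N = n_1 + n_2 + n_3$$

$$E\{\mathbf{Y}\} = \mathbf{X}_0\boldsymbol{\beta}_0 = \begin{bmatrix} \mu \mathbf{1}_{n_1} \\ \mu \mathbf{1}_{n_2} \\ \mu \mathbf{1}_{n_3} \end{bmatrix} \qquad
V\{\mathbf{Y}\} = \mathbf{V}_Y = \begin{bmatrix} \sigma_1^2 \mathbf{I}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \sigma_2^2 \mathbf{I}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \sigma_3^2 \mathbf{I}_{n_3 \times n_3} \end{bmatrix}$$

$$\mathbf{X}_0'\mathbf{X}_0 = N \qquad \left(\mathbf{X}_0'\mathbf{X}_0\right)^{-1} = \frac{1}{N} \qquad \mathbf{X}_0'\mathbf{Y} = \sum_{i=1}^{3}\sum_{j=1}^{n_i} y_{ij}$$

$$\hat{\boldsymbol{\beta}}_0 = \left(\mathbf{X}_0'\mathbf{X}_0\right)^{-1}\mathbf{X}_0'\mathbf{Y} = \frac{1}{N}\sum_{i=1}^{3}\sum_{j=1}^{n_i} y_{ij} = \frac{1}{N}\sum_{i=1}^{3} n_i \bar{y}_{i\cdot} = \bar{y}_{\cdot\cdot} \qquad E\{\hat{\boldsymbol{\beta}}_0\} = \boldsymbol{\beta}_0$$

$$V\{\hat{\boldsymbol{\beta}}_0\} = \left(\mathbf{X}_0'\mathbf{X}_0\right)^{-1}\mathbf{X}_0'\mathbf{V}_Y\mathbf{X}_0\left(\mathbf{X}_0'\mathbf{X}_0\right)^{-1} = \frac{1}{N}\left(\sum_{i=1}^{3} n_i\sigma_i^2\right)\frac{1}{N} = \frac{\sum_{i=1}^{3} n_i\sigma_i^2}{N^2}$$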
Weighted Least Squares with Different Means
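$$\text{Let } \mathbf{W} = \begin{bmatrix} \frac{1}{\sigma_1} \mathbf{I}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \frac{1}{\sigma_2} \mathbf{I}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \frac{1}{\sigma_3} \mathbf{I}_{n_3 \times n_3} \end{bmatrix} \qquad
\mathbf{Y}^* = \mathbf{W}\mathbf{Y} \qquad
\mathbf{X}^* = \mathbf{W}\mathbf{X} = \begin{bmatrix} \frac{1}{\sigma_1}\mathbf{1}_{n_1} & \mathbf{0}_{n_1} & \mathbf{0}_{n_1} \\ \mathbf{0}_{n_2} & \frac{1}{\sigma_2}\mathbf{1}_{n_2} & \mathbf{0}_{n_2} \\ \mathbf{0}_{n_3} & \mathbf{0}_{n_3} & \frac{1}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix}$$

$$E\{\mathbf{Y}^*\} = E\{\mathbf{W}\mathbf{Y}\} = \mathbf{W}\mathbf{X}\boldsymbol{\beta} = \mathbf{X}^*\boldsymbol{\beta}$$

$$V\{\mathbf{Y}^*\} = V\{\mathbf{W}\mathbf{Y}\} = \mathbf{W}\mathbf{V}_Y\mathbf{W}' = \begin{bmatrix} \frac{1}{\sigma_1}\sigma_1^2\frac{1}{\sigma_1} \mathbf{I}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \frac{1}{\sigma_2}\sigma_2^2\frac{1}{\sigma_2} \mathbf{I}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \frac{1}{\sigma_3}\sigma_3^2\frac{1}{\sigma_3} \mathbf{I}_{n_3 \times n_3} \end{bmatrix} = \mathbf{I}_N$$

Weighted Least Squares Estimator:
$$\hat{\boldsymbol{\beta}}_W = \left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1}\mathbf{X}^{*\prime}\mathbf{Y}^* \quad \text{where:} \quad
\mathbf{X}^{*\prime}\mathbf{X}^* = \begin{bmatrix} \frac{n_1}{\sigma_1^2} & 0 & 0 \\ 0 & \frac{n_2}{\sigma_2^2} & 0 \\ 0 & 0 & \frac{n_3}{\sigma_3^2} \end{bmatrix} \qquad
\left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1} = \begin{bmatrix} \frac{\sigma_1^2}{n_1} & 0 & 0 \\ 0 & \frac{\sigma_2^2}{n_2} & 0 \\ 0 & 0 & \frac{\sigma_3^2}{n_3} \end{bmatrix}$$

$$E\{\hat{\boldsymbol{\beta}}_W\} = \left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1}\mathbf{X}^{*\prime}\mathbf{X}^*\boldsymbol{\beta} = \boldsymbol{\beta}$$

$$V\{\hat{\boldsymbol{\beta}}_W\} = \left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1}\mathbf{X}^{*\prime}V\{\mathbf{Y}^*\}\mathbf{X}^*\left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1} = \left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1}$$

Note: for this Model: $\hat{\boldsymbol{\beta}}_W = \hat{\boldsymbol{\beta}}$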
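The note that the WLS and OLS estimators coincide in this model can be checked in one step. The following is a short worked derivation under the slide's definitions of W, X, and Y (with $y_{i\cdot}$ denoting the total of group i); it adds no new assumptions beyond those.

$$\mathbf{X}^{*\prime}\mathbf{Y}^* = \mathbf{X}'\mathbf{W}'\mathbf{W}\mathbf{Y} = \begin{bmatrix} y_{1\cdot}/\sigma_1^2 \\ y_{2\cdot}/\sigma_2^2 \\ y_{3\cdot}/\sigma_3^2 \end{bmatrix}
\quad \Rightarrow \quad
\hat{\boldsymbol{\beta}}_W = \begin{bmatrix} \frac{\sigma_1^2}{n_1} & 0 & 0 \\ 0 & \frac{\sigma_2^2}{n_2} & 0 \\ 0 & 0 & \frac{\sigma_3^2}{n_3} \end{bmatrix}
\begin{bmatrix} y_{1\cdot}/\sigma_1^2 \\ y_{2\cdot}/\sigma_2^2 \\ y_{3\cdot}/\sigma_3^2 \end{bmatrix}
= \begin{bmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{3\cdot} \end{bmatrix} = \hat{\boldsymbol{\beta}}$$

The weights cancel within each group because every observation in group i receives the same weight, so weighting only matters once the groups are pooled under the equal-means model.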
WLS with all Means Being Equal
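$$\mathbf{Y}^* = \begin{bmatrix} \frac{1}{\sigma_1}\mathbf{Y}_1 \\ \frac{1}{\sigma_2}\mathbf{Y}_2 \\ \frac{1}{\sigma_3}\mathbf{Y}_3 \end{bmatrix} \qquad
\mathbf{X}_0^* = \begin{bmatrix} \frac{1}{\sigma_1}\mathbf{1}_{n_1} \\ \frac{1}{\sigma_2}\mathbf{1}_{n_2} \\ \frac{1}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix} \qquad
\boldsymbol{\beta}_0 = \mu \qquad N = n_1 + n_2 + n_3$$

$$E\{\mathbf{Y}^*\} = \mathbf{X}_0^*\boldsymbol{\beta}_0 = \begin{bmatrix} \frac{\mu}{\sigma_1}\mathbf{1}_{n_1} \\ \frac{\mu}{\sigma_2}\mathbf{1}_{n_2} \\ \frac{\mu}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix} \qquad V\{\mathbf{Y}^*\} = \mathbf{V}_{Y^*} = \mathbf{I}_N$$

$$\mathbf{X}_0^{*\prime}\mathbf{X}_0^* = \sum_{i=1}^{3}\frac{n_i}{\sigma_i^2} \qquad
\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1} = \frac{1}{\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}} \qquad
\mathbf{X}_0^{*\prime}\mathbf{Y}^* = \sum_{i=1}^{3}\frac{1}{\sigma_i^2}\sum_{j=1}^{n_i} y_{ij} = \sum_{i=1}^{3}\frac{n_i\bar{y}_{i\cdot}}{\sigma_i^2}$$

$$\hat{\boldsymbol{\beta}}_{W0} = \left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime}\mathbf{Y}^* = \frac{\sum_{i=1}^{3}\frac{n_i\bar{y}_{i\cdot}}{\sigma_i^2}}{\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}} \qquad
E\{\hat{\boldsymbol{\beta}}_{W0}\} = \frac{\sum_{i=1}^{3}\frac{n_i\mu}{\sigma_i^2}}{\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}} = \mu = \boldsymbol{\beta}_0$$

$$V\{\hat{\boldsymbol{\beta}}_{W0}\} = \left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime}\mathbf{V}_{Y^*}\mathbf{X}_0^*\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1} = \left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1} = \frac{1}{\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}}$$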
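Under the equal-means model the OLS and WLS estimators of the common mean differ, and so do their variances. The sketch below evaluates both variance formulas from these slides, using the league standard deviations and the first sample-size arrangement (6, 3, 4) that appears later in the deck. The function names are illustrative assumptions.

```python
SIGMA = (1.455, 1.720, 1.713)   # sigma_i for NHL, NBA, EPL (from the slides)
N_I = (6, 3, 4)                 # first sample-size arrangement in the slides

def pooled_mean_ols(ybar, n):
    """Unweighted pooled mean: sum(n_i * ybar_i) / N."""
    return sum(ni * yi for ni, yi in zip(n, ybar)) / sum(n)

def pooled_mean_wls(ybar, n, sigma):
    """Weighted pooled mean with weights w_i = n_i / sigma_i^2."""
    w = [ni / s ** 2 for ni, s in zip(n, sigma)]
    return sum(wi * yi for wi, yi in zip(w, ybar)) / sum(w)

def var_pooled_ols(n, sigma):
    """V{beta0-hat} under OLS: sum(n_i * sigma_i^2) / N^2."""
    return sum(ni * s ** 2 for ni, s in zip(n, sigma)) / sum(n) ** 2

def var_pooled_wls(n, sigma):
    """V{beta_W0-hat} under WLS: 1 / sum(n_i / sigma_i^2)."""
    return 1.0 / sum(ni / s ** 2 for ni, s in zip(n, sigma))

v_ols = var_pooled_ols(N_I, SIGMA)
v_wls = var_pooled_wls(N_I, SIGMA)
print(v_ols, v_wls)
```

For these inputs the WLS variance is slightly smaller than the OLS variance, which is the usual motivation for weighting by the inverse variances when they differ across groups.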
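I - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu \qquad H_A: \text{Not all } \mu_i \text{ are equal}$$

Under $H_0$:
$$\hat{\mathbf{Y}}_0^* = \mathbf{X}_0^*\hat{\boldsymbol{\beta}}_{W0} = \mathbf{P}_0^*\mathbf{Y}^* \qquad
SSE_0 = \left(\mathbf{Y}^* - \mathbf{P}_0^*\mathbf{Y}^*\right)'\left(\mathbf{Y}^* - \mathbf{P}_0^*\mathbf{Y}^*\right) = \mathbf{Y}^{*\prime}\left(\mathbf{I} - \mathbf{P}_0^*\right)\mathbf{Y}^*$$

$$\mathbf{P}_0^* = \mathbf{X}_0^*\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime} = \left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}
\begin{bmatrix} \frac{1}{\sigma_1^2}\mathbf{1}_{n_1 \times n_1} & \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_1 \times n_2} & \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_1 \times n_3} \\ \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_2 \times n_1} & \frac{1}{\sigma_2^2}\mathbf{1}_{n_2 \times n_2} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_2 \times n_3} \\ \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_3 \times n_1} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_3 \times n_2} & \frac{1}{\sigma_3^2}\mathbf{1}_{n_3 \times n_3} \end{bmatrix}$$

$$\mathbf{P}_0^*\mathbf{P}_0^* = \mathbf{X}_0^*\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime} = \mathbf{X}_0^*\left(\mathbf{X}_0^{*\prime}\mathbf{X}_0^*\right)^{-1}\mathbf{X}_0^{*\prime} = \mathbf{P}_0^*$$

Under $H_A$:
$$\hat{\mathbf{Y}}^* = \mathbf{X}^*\hat{\boldsymbol{\beta}}_W = \mathbf{P}^*\mathbf{Y}^* \qquad
SSE_1 = \left(\mathbf{Y}^* - \mathbf{P}^*\mathbf{Y}^*\right)'\left(\mathbf{Y}^* - \mathbf{P}^*\mathbf{Y}^*\right) = \mathbf{Y}^{*\prime}\left(\mathbf{I} - \mathbf{P}^*\right)\mathbf{Y}^*$$

$$\mathbf{P}^* = \mathbf{X}^*\left(\mathbf{X}^{*\prime}\mathbf{X}^*\right)^{-1}\mathbf{X}^{*\prime} =
\begin{bmatrix} \frac{1}{n_1}\mathbf{1}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \frac{1}{n_2}\mathbf{1}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \frac{1}{n_3}\mathbf{1}_{n_3 \times n_3} \end{bmatrix} \qquad \mathbf{P}^*\mathbf{P}^* = \mathbf{P}^*$$

Here $\mathbf{1}_{n_i \times n_j}$ denotes the $n_i \times n_j$ matrix of ones.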
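II - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
Note: $\mathbf{1}_{n_i \times n_j}\mathbf{1}_{n_j \times n_k} = n_j\mathbf{1}_{n_i \times n_k}$

$$\mathbf{P}_0^*\mathbf{P}^* = \left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}
\begin{bmatrix} \frac{1}{\sigma_1^2}\mathbf{1}_{n_1 \times n_1} & \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_1 \times n_2} & \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_1 \times n_3} \\ \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_2 \times n_1} & \frac{1}{\sigma_2^2}\mathbf{1}_{n_2 \times n_2} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_2 \times n_3} \\ \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_3 \times n_1} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_3 \times n_2} & \frac{1}{\sigma_3^2}\mathbf{1}_{n_3 \times n_3} \end{bmatrix}
\begin{bmatrix} \frac{1}{n_1}\mathbf{1}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \frac{1}{n_2}\mathbf{1}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \frac{1}{n_3}\mathbf{1}_{n_3 \times n_3} \end{bmatrix}$$

$$= \left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}
\begin{bmatrix} \frac{1}{\sigma_1^2}\frac{n_1}{n_1}\mathbf{1}_{n_1 \times n_1} & \frac{1}{\sigma_1\sigma_2}\frac{n_2}{n_2}\mathbf{1}_{n_1 \times n_2} & \frac{1}{\sigma_1\sigma_3}\frac{n_3}{n_3}\mathbf{1}_{n_1 \times n_3} \\ \frac{1}{\sigma_1\sigma_2}\frac{n_1}{n_1}\mathbf{1}_{n_2 \times n_1} & \frac{1}{\sigma_2^2}\frac{n_2}{n_2}\mathbf{1}_{n_2 \times n_2} & \frac{1}{\sigma_2\sigma_3}\frac{n_3}{n_3}\mathbf{1}_{n_2 \times n_3} \\ \frac{1}{\sigma_1\sigma_3}\frac{n_1}{n_1}\mathbf{1}_{n_3 \times n_1} & \frac{1}{\sigma_2\sigma_3}\frac{n_2}{n_2}\mathbf{1}_{n_3 \times n_2} & \frac{1}{\sigma_3^2}\frac{n_3}{n_3}\mathbf{1}_{n_3 \times n_3} \end{bmatrix} = \mathbf{P}_0^*$$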
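The projection-matrix identities on these slides can be checked numerically. The sketch below builds P* and P0* element by element for sample sizes (6, 3, 4) and the league standard deviations from the slides, then verifies idempotency, the product identity P0*P* = P0*, and the trace difference 3 - 1 = 2. The helper names are illustrative assumptions, and plain nested lists are used to avoid any external dependency.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def build_projections(n, sigma):
    """Return (P*, P0*) for group sizes n and known standard deviations sigma."""
    group = [i for i, ni in enumerate(n) for _ in range(ni)]  # group of each obs
    total_w = sum(ni / s ** 2 for ni, s in zip(n, sigma))     # sum of n_i/sigma_i^2
    # P*: block diagonal with blocks (1/n_i) * ones
    p_full = [[1.0 / n[g] if g == h else 0.0 for h in group] for g in group]
    # P0*: (g, h) block is ones / (sigma_g * sigma_h * total_w)
    p_red = [[1.0 / (sigma[g] * sigma[h] * total_w) for h in group] for g in group]
    return p_full, p_red

p_full, p_red = build_projections((6, 3, 4), (1.455, 1.720, 1.713))
print(max_abs_diff(matmul(p_red, p_full), p_red))   # should be near zero
print(trace(p_full) - trace(p_red))                 # should be near 2
```

The trace difference equals the treatment degrees of freedom, which is what the later chi-square result for the treatment sum of squares relies on.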
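III - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
Error Sum of Squares (Full Model): $SSE_1 = \left(\mathbf{Y}^* - \mathbf{P}^*\mathbf{Y}^*\right)'\left(\mathbf{Y}^* - \mathbf{P}^*\mathbf{Y}^*\right) = \mathbf{Y}^{*\prime}\left(\mathbf{I} - \mathbf{P}^*\right)\mathbf{Y}^*$
Error Sum of Squares (Reduced Model): $SSE_0 = \left(\mathbf{Y}^* - \mathbf{P}_0^*\mathbf{Y}^*\right)'\left(\mathbf{Y}^* - \mathbf{P}_0^*\mathbf{Y}^*\right) = \mathbf{Y}^{*\prime}\left(\mathbf{I} - \mathbf{P}_0^*\right)\mathbf{Y}^*$
Treatment Sum of Squares: $SS_{Trt} = SSE_0 - SSE_1 = \mathbf{Y}^{*\prime}\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\mathbf{Y}^*$

$$\mathbf{Y}^* \sim MVN\left(\boldsymbol{\mu}_{Y^*} = \begin{bmatrix} \frac{\mu_1}{\sigma_1}\mathbf{1}_{n_1} \\ \frac{\mu_2}{\sigma_2}\mathbf{1}_{n_2} \\ \frac{\mu_3}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix}, \; \mathbf{V}_{Y^*} = \mathbf{I}_N\right)$$

$$\left(\mathbf{I} - \mathbf{P}^*\right)\left(\mathbf{I} - \mathbf{P}^*\right) = \mathbf{I} - \mathbf{P}^* \quad \text{(see previous slides)}$$
$$\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\left(\mathbf{P}^* - \mathbf{P}_0^*\right) = \mathbf{P}^* - \mathbf{P}_0^* \quad \text{(see previous slides)}$$

$SSE = SSE_1 = \mathbf{Y}^{*\prime}\left(\mathbf{I} - \mathbf{P}^*\right)\mathbf{Y}^* \sim \chi^2$ with degrees of freedom: $df_E = r\left(\mathbf{I} - \mathbf{P}^*\right) = \text{trace}\left(\mathbf{I} - \mathbf{P}^*\right) = N - 3$
and non-centrality parameter: $\Omega_E = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{I} - \mathbf{P}^*\right)\boldsymbol{\mu}_{Y^*}$. Note: Software Packages do not divide by 2.

$SS_{Trt} = \mathbf{Y}^{*\prime}\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\mathbf{Y}^* \sim \chi^2$ with degrees of freedom: $df_{Trt} = r\left(\mathbf{P}^* - \mathbf{P}_0^*\right) = \text{trace}\left(\mathbf{P}^* - \mathbf{P}_0^*\right) = 3 - 1 = 2$
and non-centrality parameter: $\Omega_{Trt} = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\boldsymbol{\mu}_{Y^*}$. Note: Software Packages do not divide by 2.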
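IV - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
$$\boldsymbol{\mu}_{Y^*}' = \begin{bmatrix} \frac{\mu_1}{\sigma_1}\mathbf{1}_{n_1}' & \frac{\mu_2}{\sigma_2}\mathbf{1}_{n_2}' & \frac{\mu_3}{\sigma_3}\mathbf{1}_{n_3}' \end{bmatrix} \qquad
\boldsymbol{\mu}_{Y^*}'\mathbf{I}\boldsymbol{\mu}_{Y^*} = \sum_{i=1}^{3}\frac{n_i\mu_i^2}{\sigma_i^2}$$

$$\Omega_E = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{I} - \mathbf{P}^*\right)\boldsymbol{\mu}_{Y^*}$$

$$\boldsymbol{\mu}_{Y^*}'\mathbf{P}^*\boldsymbol{\mu}_{Y^*} = \boldsymbol{\mu}_{Y^*}'
\begin{bmatrix} \frac{1}{n_1}\mathbf{1}_{n_1 \times n_1} & \mathbf{0}_{n_1 \times n_2} & \mathbf{0}_{n_1 \times n_3} \\ \mathbf{0}_{n_2 \times n_1} & \frac{1}{n_2}\mathbf{1}_{n_2 \times n_2} & \mathbf{0}_{n_2 \times n_3} \\ \mathbf{0}_{n_3 \times n_1} & \mathbf{0}_{n_3 \times n_2} & \frac{1}{n_3}\mathbf{1}_{n_3 \times n_3} \end{bmatrix}\boldsymbol{\mu}_{Y^*}
= \begin{bmatrix} \frac{\mu_1 n_1}{\sigma_1 n_1}\mathbf{1}_{n_1}' & \frac{\mu_2 n_2}{\sigma_2 n_2}\mathbf{1}_{n_2}' & \frac{\mu_3 n_3}{\sigma_3 n_3}\mathbf{1}_{n_3}' \end{bmatrix}
\begin{bmatrix} \frac{\mu_1}{\sigma_1}\mathbf{1}_{n_1} \\ \frac{\mu_2}{\sigma_2}\mathbf{1}_{n_2} \\ \frac{\mu_3}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix}
= \sum_{i=1}^{3}\frac{n_i\mu_i^2}{\sigma_i^2}$$

$$\Omega_{Trt} = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\boldsymbol{\mu}_{Y^*}$$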
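V - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
$$\boldsymbol{\mu}_{Y^*}'\mathbf{P}_0^*\boldsymbol{\mu}_{Y^*} = \boldsymbol{\mu}_{Y^*}'\left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}
\begin{bmatrix} \frac{1}{\sigma_1^2}\mathbf{1}_{n_1 \times n_1} & \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_1 \times n_2} & \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_1 \times n_3} \\ \frac{1}{\sigma_1\sigma_2}\mathbf{1}_{n_2 \times n_1} & \frac{1}{\sigma_2^2}\mathbf{1}_{n_2 \times n_2} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_2 \times n_3} \\ \frac{1}{\sigma_1\sigma_3}\mathbf{1}_{n_3 \times n_1} & \frac{1}{\sigma_2\sigma_3}\mathbf{1}_{n_3 \times n_2} & \frac{1}{\sigma_3^2}\mathbf{1}_{n_3 \times n_3} \end{bmatrix}\boldsymbol{\mu}_{Y^*}$$

Element (1) of $\boldsymbol{\mu}_{Y^*}'\mathbf{P}_0^*$, with $w_i = \frac{n_i}{\sigma_i^2}$:
$$\left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}\left(\frac{\mu_1}{\sigma_1}\frac{n_1}{\sigma_1^2} + \frac{\mu_2}{\sigma_2}\frac{n_2}{\sigma_1\sigma_2} + \frac{\mu_3}{\sigma_3}\frac{n_3}{\sigma_1\sigma_3}\right)\mathbf{1}_{n_1}'
= \left(\sum_{i=1}^{3}\frac{n_i}{\sigma_i^2}\right)^{-1}\frac{1}{\sigma_1}\left(\sum_{i=1}^{3}\frac{n_i\mu_i}{\sigma_i^2}\right)\mathbf{1}_{n_1}'
= \frac{1}{\sigma_1}\frac{\sum_{i=1}^{3} w_i\mu_i}{\sum_{i=1}^{3} w_i}\mathbf{1}_{n_1}'$$

$$\boldsymbol{\mu}_{Y^*}'\mathbf{P}_0^*\boldsymbol{\mu}_{Y^*} = \frac{\sum_{i=1}^{3} w_i\mu_i}{\sum_{i=1}^{3} w_i}
\begin{bmatrix} \frac{1}{\sigma_1}\mathbf{1}_{n_1}' & \frac{1}{\sigma_2}\mathbf{1}_{n_2}' & \frac{1}{\sigma_3}\mathbf{1}_{n_3}' \end{bmatrix}
\begin{bmatrix} \frac{\mu_1}{\sigma_1}\mathbf{1}_{n_1} \\ \frac{\mu_2}{\sigma_2}\mathbf{1}_{n_2} \\ \frac{\mu_3}{\sigma_3}\mathbf{1}_{n_3} \end{bmatrix}
= \frac{\sum_{i=1}^{3} w_i\mu_i}{\sum_{i=1}^{3} w_i}\sum_{i=1}^{3}\frac{n_i\mu_i}{\sigma_i^2}
= \frac{\left(\sum_{i=1}^{3} w_i\mu_i\right)^2}{\sum_{i=1}^{3} w_i}
= \sum_{i=1}^{3} w_i\bar{\mu}_W^2$$

$$\text{where: } \bar{\mu}_W = \frac{\sum_{i=1}^{3} w_i\mu_i}{\sum_{i=1}^{3} w_i}$$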
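VI - Weighted Least Squares Testing H0: μ1 = μ2 = μ3
$$\boldsymbol{\mu}_{Y^*}' = \begin{bmatrix} \frac{\mu_1}{\sigma_1}\mathbf{1}_{n_1}' & \frac{\mu_2}{\sigma_2}\mathbf{1}_{n_2}' & \frac{\mu_3}{\sigma_3}\mathbf{1}_{n_3}' \end{bmatrix}$$

$$\boldsymbol{\mu}_{Y^*}'\mathbf{I}\boldsymbol{\mu}_{Y^*} = \sum_{i=1}^{3}\frac{n_i\mu_i^2}{\sigma_i^2} = \sum_{i=1}^{3} w_i\mu_i^2 \qquad
\boldsymbol{\mu}_{Y^*}'\mathbf{P}^*\boldsymbol{\mu}_{Y^*} = \sum_{i=1}^{3}\frac{n_i\mu_i^2}{\sigma_i^2} = \sum_{i=1}^{3} w_i\mu_i^2 \qquad
\boldsymbol{\mu}_{Y^*}'\mathbf{P}_0^*\boldsymbol{\mu}_{Y^*} = \sum_{i=1}^{3} w_i\bar{\mu}_W^2$$

$$\Omega_E = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{I} - \mathbf{P}^*\right)\boldsymbol{\mu}_{Y^*} = \frac{1}{2}\left(\sum_{i=1}^{3} w_i\mu_i^2 - \sum_{i=1}^{3} w_i\mu_i^2\right) = 0$$

$$\Omega_{Trt} = \frac{1}{2}\boldsymbol{\mu}_{Y^*}'\left(\mathbf{P}^* - \mathbf{P}_0^*\right)\boldsymbol{\mu}_{Y^*} = \frac{1}{2}\left(\sum_{i=1}^{3} w_i\mu_i^2 - \sum_{i=1}^{3} w_i\bar{\mu}_W^2\right) = \frac{1}{2}\sum_{i=1}^{3} w_i\left(\mu_i - \bar{\mu}_W\right)^2$$

Note: $\Omega_{Trt} = 0 \iff \mu_1 = \mu_2 = \mu_3 = \mu$

$$\left(\mathbf{I} - \mathbf{P}^*\right)\left(\mathbf{I} - \mathbf{P}^*\right) = \mathbf{I} - \mathbf{P}^* \quad \Rightarrow \quad SSE \sim \chi^2_{df_E} \qquad df_E = N - 3$$

$$\left(\mathbf{I} - \mathbf{P}^*\right)\left(\mathbf{P}^* - \mathbf{P}_0^*\right) = \mathbf{P}^* - \mathbf{P}_0^* - \mathbf{P}^*\mathbf{P}^* + \mathbf{P}^*\mathbf{P}_0^* = \mathbf{P}^* - \mathbf{P}_0^* - \mathbf{P}^* + \mathbf{P}_0^* = \mathbf{0} \quad \Rightarrow \quad SSE, \; SS_{Trt} \text{ independent}$$

$$SS_{Trt} \sim \chi^2\left(df_T = 3 - 1 = 2, \; \Omega_{Trt} = \frac{1}{2}\sum_{i=1}^{3} w_i\left(\mu_i - \bar{\mu}_W\right)^2\right)$$

$$F = \frac{SS_{Trt}/(3-1)}{SSE/(N-3)} = \frac{MS_{Trt}}{MSE} \sim F\left(df_1 = 3 - 1, \; df_2 = N - 3, \; \Omega_{Trt} = \frac{1}{2}\sum_{i=1}^{3} w_i\left(\mu_i - \bar{\mu}_W\right)^2\right)$$

$$\text{Under } H_0: \quad \frac{MS_{Trt}}{MSE} \sim F\left(df_1 = 3 - 1, \; df_2 = N - 3\right)$$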
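Written out in scalar form, the two quadratic forms reduce to SSE = Σ_i Σ_j (y_ij − ȳ_i)²/σ_i² and SSTrt = Σ_i w_i(ȳ_i − ȳ_W)², where ȳ_W is the weighted mean of the group means. The sketch below computes the resulting F-statistic, assuming the σ_i are known as on these slides. The function name and the tiny data set are illustrative assumptions, chosen so the arithmetic can be checked by hand.

```python
def wls_f_statistic(samples, sigma):
    """WLS F-statistic for equal means with known group standard deviations."""
    n = [len(s) for s in samples]
    k, total_n = len(samples), sum(n)
    ybar = [sum(s) / len(s) for s in samples]
    w = [ni / si ** 2 for ni, si in zip(n, sigma)]          # w_i = n_i / sigma_i^2
    ybar_w = sum(wi * yi for wi, yi in zip(w, ybar)) / sum(w)
    # SSTrt = Y*'(P* - P0*)Y* in scalar form
    ss_trt = sum(wi * (yi - ybar_w) ** 2 for wi, yi in zip(w, ybar))
    # SSE = Y*'(I - P*)Y* in scalar form
    ss_err = sum((y - yb) ** 2 / si ** 2
                 for s, yb, si in zip(samples, ybar, sigma) for y in s)
    return (ss_trt / (k - 1)) / (ss_err / (total_n - k))

# Hypothetical data: group means 2, 3, 7; sigma = (1, 1, 2).
toy_f = wls_f_statistic([[1.0, 3.0], [2.0, 4.0], [6.0, 8.0]], (1.0, 1.0, 2.0))
print(toy_f)
```

For the toy data the weights are (2, 2, 0.5), the weighted mean is 3, SSTrt is 10 and SSE is 4.5, so the statistic is (10/2)/(4.5/3) = 10/3.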
Testing for League Differences in Mean BMI Levels
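| League (i) | μ_i | σ_i | n_i (SampleSz1) | n_i (SampleSz2) | n_i (SampleSz3) | w_i (Weight1) | w_i (Weight2) | w_i (Weight3) |
|---|---|---|---|---|---|---|---|---|
| NHL (1) | 26.50 | 1.455 | 6 | 12 | 25 | 2.8342 | 5.6683 | 11.8090 |
| NBA (2) | 24.74 | 1.720 | 3 | 7 | 15 | 1.0141 | 2.3661 | 5.0703 |
| EPL (3) | 23.02 | 1.713 | 4 | 10 | 10 | 1.3632 | 3.4079 | 3.4079 |
| Sum(Wts) | | | | | | 5.2114 | 11.4424 | 20.2872 |
| WtdMean | | | | | | 25.2473 | 25.0996 | 25.4756 |
| 2Ω_Trt | | | | | | 11.4709 | 26.1605 | 35.6854 |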
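The weights, weighted means, and non-centrality values in the table follow directly from w_i = n_i/σ_i² and 2Ω_Trt = Σ w_i(μ_i − μ̄_W)². The sketch below recomputes them from the league parameters and the three sample-size arrangements given in the table; the function and variable names are illustrative assumptions.

```python
MU = (26.50, 24.74, 23.02)      # league means: NHL, NBA, EPL
SIGMA = (1.455, 1.720, 1.713)   # league standard deviations
SCENARIOS = {1: (6, 3, 4), 2: (12, 7, 10), 3: (25, 15, 10)}  # n_i per arrangement

def noncentrality(n, mu=MU, sigma=SIGMA):
    """Return (weights, sum of weights, weighted mean, 2*Omega_Trt)."""
    w = [ni / s ** 2 for ni, s in zip(n, sigma)]
    sum_w = sum(w)
    mu_w = sum(wi * mi for wi, mi in zip(w, mu)) / sum_w
    two_omega = sum(wi * (mi - mu_w) ** 2 for wi, mi in zip(w, mu))
    return w, sum_w, mu_w, two_omega

results = {label: noncentrality(n) for label, n in SCENARIOS.items()}
for label, (w, sum_w, mu_w, two_omega) in results.items():
    print(label, [round(x, 4) for x in w], round(sum_w, 4),
          round(mu_w, 4), round(two_omega, 4))
```

The returned value is 2Ω_Trt, the form that software packages use as the non-centrality parameter, so it can be passed on unchanged to a non-central F routine.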
For each Sample Size arrangement:
• Obtain the degrees of freedom for the F-Statistic: (df1 = 3-1 = 2, df2 = N-3)
• Obtain the Critical Value for an α = 0.05 level test based on the central F-distribution: F.05;df1,df2
• For the corresponding non-central F, with non-centrality parameter 2Ω_Trt, find the area above the Critical Value from the central F. This is the power of the test to detect these particular differences
• In R, use: 1-pf(qf(1-a,df1,df2),df1,df2,ncp), with a = α and ncp = 2Ω_Trt
| | Scenario1 | Scenario2 | Scenario3 |
|---|---|---|---|
| df1 | 2 | 2 | 2 |
| df2 | 10 | 26 | 47 |
| F(.05,df1,df2) | 4.103 | 3.369 | 3.195 |
| 2Ω_Trt | 11.4709 | 26.1605 | 35.6854 |
| P{NCF > F(.05,df1,df2)} | 0.7393 | 0.9940 | 0.9997 |
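The R expression above relies on R's built-in central and non-central F distributions. As a cross-check that needs only the Python standard library, the sketch below uses two facts that hold when df1 = 2, as it does here: the central F(2, df2) upper tail has the closed form (1 + 2f/df2)^(−df2/2), and the non-central F is a Poisson(ncp/2) mixture whose components have finite-sum tail probabilities. The function names are illustrative assumptions, and the code does not apply to other values of df1.

```python
import math

def f_crit_df1_2(alpha, df2):
    """Upper-alpha critical value of the central F(2, df2) distribution."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)

def power_df1_2(ncp, df2, alpha=0.05, tol=1e-12, max_terms=5000):
    """P{non-central F(2, df2, ncp) > central critical value}.

    ncp follows the software convention (2 * Omega_Trt in the slides).
    """
    crit = f_crit_df1_2(alpha, df2)
    b = df2 / 2.0
    x = 2.0 * crit / (df2 + 2.0 * crit)
    pois = math.exp(-ncp / 2.0)    # Poisson(ncp/2) probability at k = 0
    term = (1.0 - x) ** b          # j = 0 term of the finite tail sum
    tail = term                    # upper tail of mixture component k = 0
    power = pois * tail
    cum_pois = pois
    for k in range(1, max_terms):
        pois *= (ncp / 2.0) / k
        term *= (b + k - 1.0) * x / k   # next term of the finite tail sum
        tail += term                    # upper tail of mixture component k
        power += pois * tail
        cum_pois += pois
        if 1.0 - cum_pois < tol:        # remaining Poisson mass is negligible
            break
    return power

for ncp, df2 in ((11.4709, 10), (26.1605, 26), (35.6854, 47)):
    print(df2, round(f_crit_df1_2(0.05, df2), 3), round(power_df1_2(ncp, df2), 4))
```

The loop stops once the Poisson mass not yet included falls below the tolerance, which bounds the truncation error because every component tail probability is at most 1.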
Simulating the F-test from Actual Distributions
• For each Sample Size Scenario, sample ni individuals from league i, without replacement
• Compute the F-Statistic and save it
• Obtain summary statistics of the computed F-Statistics and observe how many exceed F.05;df1,df2
• Plot a Histogram of the computed statistics
• If possible, superimpose the theoretical non-central F-distribution over the Histogram
| | Scenario1 | Scenario2 | Scenario3 |
|---|---|---|---|
| df1 | 2 | 2 | 2 |
| df2 | 10 | 26 | 47 |
| F(.05,df1,df2) | 4.103 | 3.369 | 3.195 |
| 2Ω_Trt | 11.4709 | 26.1605 | 35.6854 |
| P{NCF > F(.05,df1,df2)} | 0.7393 | 0.9940 | 0.9997 |
| Empirical Rejection Rate (WLS) | 0.7454 | 0.9888 | 0.9993 |
| Empirical Rejection Rate (OLS) | 0.7564 | 0.9874 | 0.9993 |
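The simulation steps listed above can be sketched as follows. One important assumption: the actual league rosters are not part of this document, so the sketch draws from normal distributions with the league means and standard deviations instead of sampling without replacement from the real populations. Its rejection rates will therefore track the theoretical power rather than reproduce the empirical rates in the table. The function names, the seed, and the number of replications are also illustrative choices.

```python
import random

MU = (26.50, 24.74, 23.02)      # league means from the slides (NHL, NBA, EPL)
SIGMA = (1.455, 1.720, 1.713)   # league standard deviations from the slides

def f_statistics(samples, sigma):
    """Return (WLS F with known sigma_i, ordinary one-way ANOVA F)."""
    n = [len(s) for s in samples]
    k, total_n = len(samples), sum(n)
    ybar = [sum(s) / len(s) for s in samples]
    # WLS: weights w_i = n_i / sigma_i^2, residuals scaled by sigma_i
    w = [ni / si ** 2 for ni, si in zip(n, sigma)]
    ybar_w = sum(wi * yi for wi, yi in zip(w, ybar)) / sum(w)
    ss_trt_w = sum(wi * (yi - ybar_w) ** 2 for wi, yi in zip(w, ybar))
    ss_err_w = sum((y - yb) ** 2 / si ** 2
                   for s, yb, si in zip(samples, ybar, sigma) for y in s)
    f_wls = (ss_trt_w / (k - 1)) / (ss_err_w / (total_n - k))
    # OLS: unweighted sums of squares
    grand = sum(ni * yi for ni, yi in zip(n, ybar)) / total_n
    ss_trt = sum(ni * (yi - grand) ** 2 for ni, yi in zip(n, ybar))
    ss_err = sum((y - yb) ** 2 for s, yb in zip(samples, ybar) for y in s)
    f_ols = (ss_trt / (k - 1)) / (ss_err / (total_n - k))
    return f_wls, f_ols

def simulate(n, crit, reps=20000, seed=1):
    """Fraction of replications in which each F-statistic exceeds crit."""
    rng = random.Random(seed)
    reject_wls = reject_ols = 0
    for _ in range(reps):
        # Assumption: normal draws stand in for sampling the real rosters.
        samples = [[rng.gauss(m, s) for _ in range(ni)]
                   for m, s, ni in zip(MU, SIGMA, n)]
        f_wls, f_ols = f_statistics(samples, SIGMA)
        reject_wls += f_wls > crit
        reject_ols += f_ols > crit
    return reject_wls / reps, reject_ols / reps

# Scenario 1: n = (6, 3, 4), critical value F(.05; 2, 10) = 4.103 from the table.
rate_wls, rate_ols = simulate((6, 3, 4), 4.103)
print(rate_wls, rate_ols)
```

Under these normal draws the WLS statistic follows the non-central F exactly, so its rejection rate should land close to the theoretical power for Scenario 1. The OLS statistic has no such guarantee when the variances differ.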