Introduction to Linear Mixed Models

Download Report

Transcript Introduction to Linear Mixed Models

Introduction to Linear Mixed
Models
Tom Greene
Examples of Hierarchical or Clustered Data
Design Level Clusters:
• Observational studies relating health outcomes to
patient and center level predictor variables for patients
(level 1 units) nested within clinics (level 2 units)
• Cluster randomized trials (RCTs) in which centers are
randomized to different treatments (centers are level 2
units, patients are level 1 units)
• Multicenter RCTs in which patients are randomized to
treatments within each center, but it is thought that
the treatment effect may vary between centers
(centers are level 2 units, patients are level 1 units)
Examples of Hierarchical or Clustered Data
Design Level Clusters:
• Longitudinal studies relating repeated
measurements to predictor variables (patients
are level 2 units, measurements at different times
are level 1 units)
• Complex survey designs in which primary
sampling units (such as counties) are first
sampled (level 2 units), and then households are
sampled within the primary sampling units (level
1 units)
Examples of Hierarchical or Clustered Data
Naturally Occurring Clusters:
• Analyses of family members (level 1 units)
nested within families (level 2 units)
• Analyses of eyes, ears, lungs, etc. (level 1
units), nested within patients (level 2 units)
• Analyses of individual littermates (level 1
units) nested within litters (level 2 units)
Examples of Hierarchical or Clustered Data
Can have 3 or more levels of clustering:
• Longitudinal observations (level 1 units) from
patients (level 2 units) nested within centers
(level 3 units) of a multicenter RCT
• Complex surveys: Families (level 1) nested
within census tracts (level 2) nested within
counties (level 3)
Applications of Mixed Effects Models
• Account for correlated data when computing
p-values or confidence intervals
• Estimate level 2 quantities while borrowing
information from the full data set (shrinkage)
• Estimate the variance (or standard deviation)
of level 2 quantities while accounting for
“noise” due to sampling error
Applications of Mixed Effects Models
• Characterize the pattern of correlations (or
covariances) of measurements over time or
space
• Determine the optimal “weighting” to estimate
treatment effects when data are unbalanced
• Providing approximately unbiased analyses of
longitudinal data when data are missing at
random
Most Basic Example: 1-Way ANOVA
Most Basic Example: 1-Way ANOVA
Mixed Effects Formulation:
Primary goals are usually
a) to estimate the overall mean, applying inferences to a broader population of “groups”
(really level 2 units) from which the study groups are viewed as a random sample,
b) to estimate the individual group means while incorporating information from the other
groups, and
c) to estimate the variance of the distribution of the group means in the population from
which the sampled groups were drawn.
µ1
µ2
µg
.....
g Sampled
groups
• Hypothetical superpopulation from
which groups were
drawn
• This population has
an infinite # of µi .
• We’d like to know
the mean and
variance of this
distribution
Most Basic Example: 1-Way ANOVA
Example 1
• Research Objective: Estimate 6-month mean weight loss in
overweight diabetics resulting from 1-1 coaching program
• Randomly assign subjects to 8 different coaches who have
been certified in the program (6 subjects per coach)
• Yij = Observed weight loss for the jth patient assigned to the
ith coach
• β0 = overall mean weight loss across the super-population
of coaches
• β0 + bi = mean weight loss under the individual coaches
(without sampling error)
• εij = patient variation in weight loss
• Model: Yij = β0 + bi + εij, i = 1,2, .., 8; j = 1,2, …, 6
Weight Change Data: Change in Kg
Analysis Variable : Y
Coach #
(group)
1
2
3
4
5
6
7
8
N Obs
6
6
6
6
6
6
6
6
Mean
3.15
-7.20
-5.06
-19.61
-10.83
-9.41
-7.99
-9.26
Std Dev
8.84
2.58
4.80
5.76
4.00
5.89
5.76
4.24
PROC MIXED STATEMENTS
Statement
Function
PROC MIXED
Invokes linear mixed model procedure
MODEL
Specifies response variable and fixed effect
predictor variables
(e.g., model for E(Yij))
RANDOM
Specifies model for random effects (the bi)
REPEATED
Designates correlation (or covariance) structure in
the residuals (the εij)
CLASS
Specifies class variables
ESTIMATE
Designates linear functions of fixed and/or
random effects for estimation
CONTRAST
Designates general linear hypotheses in fixed
and/or random effects
• PROC MIXED specifies standard fixed effects ANOVA/Regression with
uncorrelated residuals if REPEATED and RANDOM statements are omitted.
Next time
with Rich
Kenward-Roger
degrees of freedom
proc mixed data=ydat;
(matches usual df for
class group;
fixed effects models)
model y=group / solution ddfm = kr noint;
estimate ‘Overall Mean' group 1 1 1 1
1 1 1 1 /divisor=8; run;
** Fixed Effects Analysis:
GROUP
1
2
3
4
5
6
7
8
Response
Coach #
Solution for Fixed Effects
Standard
Estimate
Error DF
t Value
3.1535
2.2496 40
1.40
-7.2040
2.2496 40
-3.20
-5.0556
2.2496 40
-2.25
-19.6076
2.2496 40
-8.72
-10.8269
2.2496 40
-4.81
-9.4063
2.2496 40
-4.18
-7.9894
2.2496 40
-3.55
-9.2603
2.2496 40
-4.12
Label
Overall Mean
Estimates
Standard
Estimate
Error
-8.2746
0.7954
Pr > |t|
0.1687
0.0027
0.0302
<.0001
<.0001
0.0002
0.0010
0.0002
DF t Value Pr > |t|
40 -10.40 <.0001
With intercept omitted,
fixed effects estimates
correspond to group means
What quantity does overall
mean ± SE refer to?
** Mixed Effects Analysis;
proc mixed data=ydat ;
Only fixed effect is overall mean β0
class group;
Random effects corresponding to
model y=/solution ddfm = kr;
each coach
random intercept/subject=group solution;
run;
Covariance Parameter Estimates
Cov Parm Subject
Estimate
Intercept
GROUP
34.8526
Residual
30.3639
Effect
Intercept
Intercept
Intercept
Intercept
Intercept
Intercept
Intercept
Intercept
Solution for Random Effects
Std Err
GROUP
Estimate
Pred DF
1
9.9791
2.9326 14.8
2
0.9348
2.9326 14.8
3
2.8109
2.9326 14.8
4
-9.8961
2.9326 14.8
5
-2.2287
2.9326 14.8
6
-0.9882
2.9326 14.8
7
0.2490
2.9326 14.8
8
-0.8607
2.9326 14.8
Effect
Intercept
Solution for Fixed Effects
Standard
Estimate
Error DF t Value Pr > |t|
-8.2746
2.2336 7
-3.70 0.0076
t Value Pr > |t|
3.40 0.0040
0.32 0.7544
0.96 0.3533
-3.37 0.0043
-0.76 0.4592
-0.34 0.7409
0.08 0.9335
-0.29 0.7732
= 34.85/(30.36+34.85)
= 0.534
is intra-class correlation
Comparison of Fixed Effects Estimates and BLUPs
GROUP
1
2
3
4
5
6
7
8
Fixed Effect
Estimate of Group
Mean
3.1535
-7.2040
-5.0556
-19.6076
-10.8269
-9.4063
-7.9894
-9.2603
eBLUP for
Overall
Mean
-8.2746
-8.2746
-8.2746
-8.2746
-8.2746
-8.2746
-8.2746
-8.2746
bi
11.43
1.07
3.22
-11.33
-2.55
-1.13
0.29
-0.99
9.9791
0.9348
2.8109
-9.8961
-2.2287
-0.9882
0.2490
-0.8607
eBLUP for
group
mean
1.7045
-7.3398
-5.4637
-18.1706
-10.5033
-9.2628
-8.0256
-9.1353
Example 2
• Research Objective: Compare effects of two 1-1 coaching
methods on 6-month mean weight loss in overweight
diabetics
• Randomly assign subjects to 8 coaches who have been
certified in both methods (6 subjects per coach)
• Randomly assign 8 coaches to 2 methods (4 coaches per
method)
• Cluster randomized trial
• Yij = Weight loss for the jth patient assigned to the ith coach
• Xi = indicator for assignment of ith coach to method B.
• Model: Yij = β0 + β1 Xi + bi + εij, i = 1,2, .., 8; j = 1,2, …, 6
Fixed effects
terms
Random effects
Weight Change Data: Change in Kg
Analysis Variable : Y
Coach #
Treatment
A
A
A
A
B
B
B
B
(GROUP)
1
2
3
4
5
6
7
8
N Obs
6
6
6
6
6
6
6
6
Mean
5.15
-5.20
-3.06
-17.61
-12.83
-11.41
-9.99
-11.26
Std Dev
8.84
2.58
4.80
5.76
4.00
5.89
5.76
4.24
** Fixed Effects Model Accounting for Group;
proc mixed data=ydat;
class group;
model y=group / solution ddfm = kr noint cl;
estimate ‘Method' group -1 -1 -1 -1 1 1 1 1
/divisor=4 cl;
GROUP
1
2
3
4
5
6
7
8
Solution for Fixed Effects
Standard
Estimate
Lower
Upper
Error DF
5.1535
2.2496 40
0.6069
9.7001
-5.2040
2.2496 40 -9.7506 -0.6574
-3.0556
2.2496 40 -7.6022
1.4910
-17.6076
2.2496 40 -22.1541 -13.0610
-12.8269
2.2496 40 -17.3735 -8.2803
-11.4063
2.2496 40 -15.9529 -6.8597
-9.9894
2.2496 40 -14.5360 -5.4428
-11.2603
2.2496 40 -15.8069 -6.7137
Produces fixed
effects inference
for Method effect
Type 3 Tests of Fixed Effects
Num Den
Effect
DF DF F Value Pr > F
GROUP
8 40 22.09 <.0001
What quantity do the estimate
and SE refer to?
Estimates
Label
Method
Estimate
-6.1923
Standard
Error DF
1.5907 40
t Value Pr > |t|
-3.89 0.0004
Alpha Lower Upper
0.05 -9.4072 -2.9774
** Fixed Effects Model Ignoring Group;
proc mixed data=ydat;
model y=method/ solution ddfm = kr cl;
Covariance Parameter Estimates
Cov Parm
Estimate
Residual
61.5922
Effect
Intercept
method
Estimate
-5.1784
-6.1923
Solution for Fixed Effects
Standard
Error DF t Value Pr > |t|
1.6020 46
-3.23 0.0023
2.2655 46
-2.73 0.0089
Does this standard error correspond to a
meaningful quantity?
Alpha
Lower Upper
0.05 -8.4030 -1.9538
0.05 -10.7526 -1.6320
** Fixed Effects Model Ignoring Group;
proc mixed data=ydat;
model y=method/ solution ddfm = kr cl;
Covariance Parameter Estimates
Cov Parm
Estimate
Residual
61.5922
Effect
Intercept
Method
Estimate
-5.1784
-6.1923
Muddles together σb2 and σ2
Solution for Fixed Effects
Standard
Error DF t Value Pr > |t|
1.6020 46
-3.23 0.0023
2.2655 46
-2.73 0.0089
Does this standard error correspond to a
meaningful quantity? NO!
Alpha
Lower Upper
0.05 -8.4030 -1.9538
0.05 -10.7526 -1.6320
** Mixed Effects Model with Treatment as Fixed Effect
** and Group as Random Effect;
proc mixed data=ydat;
model y=method/ solution ddfm = kr;
random group / solution;
Covariance Parameter Estimates
Cov Parm
Subject
Estimate
Intercept
GROUP
39.9027
Residual
30.3639
GROUP
1
2
3
4
5
6
7
8
Solution for Fixed Effects (from mixed model)
Standard
Pr >
Effect
Estimate
Error DF t Value
|t|
Intercept
-5.1784 3.3527 6 -1.54 0.1734
method
-6.1923 4.7415 6 -1.31 0.2394
Solution for Random Effects
Estimate Std Err Pred
DF
9.1690
3.6975 7.82
-0.02273
3.6975 7.82
1.8839
3.6975 7.82
-11.0302
3.6975 7.82
-1.2923
3.6975 7.82
-0.03155
3.6975 7.82
1.2258
3.6975 7.82
0.09801
3.6975 7.82
t Value
2.48
-0.01
0.51
-2.98
-0.35
-0.01
0.33
0.03
Pr > |t|
0.0388
0.9952
0.6244
0.0180
0.7359
0.9934
0.7489
0.9795
Comparison of Treatment Effect
Estimates
Fixed Effects Model with Group as Fixed Effect
Label
Method
Estimate
-6.1923
Standard
Error
1.5907
DF
40
t Value Pr > |t|
-3.89 0.0004
Fixed Effects Model Ignoring Group
Effect
Method
Estimate Standard Error
-6.1923
2.2655
DF t Value Pr > |t|
46 -2.73 0.0089
Solution for Fixed Effects (from mixed model)
Standard
Effect
Estimate
Error DF t Value Pr > |t|
Method
-6.1923
4.7415
6
-1.31 0.2394
Example 3
• Research Objective: Compare effects of two 1-1 coaching methods
on 6-month mean weight loss in overweight diabetics
• Randomly assign subjects to 8 coaches who have been certified in
both methods (6 subjects per coach)
• Randomly assign each coach’s 6 subjects to method A or method B
(3 subjects per method for each coach)
• Standard stratified randomized trial, with coaches as strata
• Yij = Weight loss for the jth patient assigned to the ith coach
• Xij = indicator for assignment of jth pt for the ith coach to method B.
• Model 1: Yij = β0 + β1 Xij + bi + εij, i = 1,2, .., 8; j = 1,2, …, 6
– Treatment effect assumed constant for all coaches
• Model 2: Yij = β0 + β1Xij + Xijb1i + (1-Xij)b2i + bi + εij, i = 1,2, .., 8; j = 1,2, …, 6
– Treatment effect assumed to vary between coaches
Weight Change Data: Change in Kg
Analysis Variable : Y
Coach #
(GROUP)
1
2
3
4
5
6
7
8
Method
N
Mean
Std Dev
A
B
A
B
A
B
A
B
A
B
A
B
A
B
A
B
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
-11.19
0.55
-11.03
-14.79
-27.43
-10.87
-19.95
-28.31
-22.45
-11.26
-11.31
-17.45
-8.76
-3.07
-24.21
-13.08
5.44
4.63
3.33
8.09
5.62
10.82
5.33
2.34
3.70
6.37
2.37
3.08
0.46
5.74
5.96
8.52
** Standard Fixed Effect Model for Randomized Block Design;
proc mixed data=ydat ;
class group;
model y= Method group/solution ddfm = kr;
Solution for Fixed Effects
Effect
Intercept
GROUP
Method
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
Estimate
-21.0240
4.7588
1
2
3
4
5
6
7
8
13.3267
5.7325
-0.5060
-5.4842
1.7910
4.2672
12.7287
0
Standard
Error DF t Value Pr > |t|
3.0957 39
-6.79 <.0001
2.0638 39
2.31 0.0265
4.1276
4.1276
4.1276
4.1276
4.1276
4.1276
4.1276
.
3.23
1.39
-0.12
-1.33
0.43
1.03
3.08
.
39
39
39
39
39
39
39
.
0.0025
0.1728
0.9031
0.1917
0.6667
0.3076
0.0037
.
Reference group since
Intercept included in
model
** Standard Mixed Effect Model for Randomized Block Design;
** Without Treatment x Group Interaction;
proc mixed data=ydat ;
class group;
model y= Method/solution ddfm = kr;
random group/ solution;
Effect
Intercept
Method
GROUP
1
2
3
4
5
6
7
8
Solution for Fixed Effects
Standard
Estimate
Error DF
-17.0421
2.5249 10
4.7588
2.0638
39
t Value Pr > |t|
-6.75 <.0001
2.31
Solution for Random Effects
Std Err
Estimate
DF
t Value
Pred
7.4710
3.3484 15.1
2.23
1.3995
3.3484 15.1
0.42
-3.5881
3.3484 15.1
-1.07
-7.5681
3.3484 15.1
-2.26
-1.7517
3.3484 15.1
-0.52
0.2280
3.3484 15.1
0.07
6.9929
3.3484 15.1
2.09
-3.1836
3.3484 15.1
-0.95
0.0265
Pr > |t|
0.0413
0.6819
0.3008
0.0390
0.6085
0.9466
0.0541
0.3567
Covariance Parameter
Estimates
Cov Parm
Estimate
GROUP
33.9648
Residual
51.1100
Estimate, SE, and DF are
identical to those of fixed
effects model. Hence, making
“coach” a random effect does
not influence the results
** Mixed Effect Model for Randomized Block Design;
** With Treatment x Group Interaction;
proc mixed data=ydat ;
class group Cmethod;
model y= Method/solution ddfm = kr;
random group CMethod*group/ solution;
Effect
Intercept
Method
Effect
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP
GROUP*Cmethod
GROUP*Cmethod
GROUP*Cmethod
GROUP*CMethod
Solution for Fixed Effects
Standard
Estimate
Error DF t Value Pr > |t|
-17.0421
2.8536 12.8
-5.97 <.0001
4.7588
3.3660
7
1.41 0.2003
Solution for Random Effects
Group x
Method
GROUP
Estimate
1
4.3603
2
0.8168
3
-2.0942
4
-4.4170
5
-1.0223
6
0.1331
7
4.0813
8
-1.8580
1
0
1.1347
1
1
6.4475
2
0
3.9507
2
1
-2.5303
Output truncated
Std Err
Pred
4.8802
4.8802
4.8802
4.8802
4.8802
4.8802
4.8802
4.8802
5.2238
5.2238
5.2238
5.2238
DF
3.57
3.57
3.57
3.57
3.57
3.57
3.57
3.57
13.4
13.4
13.4
13.4
t Val
ue
0.89
0.17
-0.43
-0.91
-0.21
0.03
0.84
-0.38
0.22
1.23
0.76
-0.48
Covariance Parameter Estimates
Cov Parm
Estimate
GROUP
19.8231
GROUP*Cmethod
34.4704
Residual
32.5490
Pr > |t|
0.4277
0.8761
0.6924
0.4223
0.8455
0.9797
0.4553
0.7249
0.8313
0.2383
0.4625
0.6359
Example 4: Evaluate effect of ethylene
glycol (EG) dose on fetal weight in mice
• EG administered at dose levels 0, 750, 1500, or 3000
mg/kg/day to 94 pregnant mice (dams) beginning
just after implantation
• 94 litters included 1028 live fetuses, litter sizes
ranges from 1 to 16
• Yij = fetal weight of the jth fetus from the ith litter
• Dose transformed as: Xi = Sqrt(Dose/750)
From Fitzmaurice, on the Web
Example 4: Evaluate effect of ethylene
glycol (EG) dose on fetal weight in mice
• Mixed model:
Yij = β0 + β1 Xi + bi + εij
• β0 and β1 are fixed effect regression coefficients, as in a
standard linear regression of birth weight on X =
sqrt(dose/750)
• bi is a random effect to account for clustering by litter, and
assumed to vary independently across litters with bi ~
N(0,σb2).
• The εij are random errors, assumed to vary independently
across fetus within and between litters, with εij ~ N(0, σ2).
• The amount of clustering or correlation among the fetal
weights within a litter is modeled by variation in the bi.
PROC MIXED code
Model statement specifies fixed effects
part of the mixed model: β0 + β1 Xi
Random statement specifies the random effects
part of the mixed model: bi
SAS Output:
Intra-class correlation
0.007256/(0.007256 + 0.005565)
= 0.57
SAS Output:
The REML estimate of the regression parameter for (transformed) dose
indicates that the mean fetal weight decreases with increasing dose.
What happens if we stupidly ignore the litter effect and run a standard
regression analysis (PROC REG or PROC GLM)?
Mixed effect result was – 0.134 ± 0.0124
Example 5
• Research Objective: Compare effects of two coaching
methods on mean weight loss over a 6 month period
in overweight diabetics.
• Randomly assign 48 subjects to 2 different weight loss
programs (24 per group)
• Standard 2-group randomized trial
• Yij = Weight loss at time j for the ith patient, j = 0, 2, 4,
and 6 months
• Nesting of repeated measurements within patients
Example 5
• Xi = indicator for assignment to Method B
Can be relaxed (Rich will discuss)
εij are i.i.d. N(0,σ2)
(b0i,b1i) ~ MVN(0,D)
Unstructured covariance matrix to allow
correlation between random Intercept and slope
• 1-Stage model formulation:
Yij = β00 + β01 Xi + β10 tj + β11 Xi tj + b0i + tj b1i + εij
Illustration of 1st Stage of the 2 stage Model for analogous
reaction time vs. days of sleep deprivation study
Weight Change Data: Change in Kg
Analysis Variable : Y
Method Time
N
Mean Std Dev
A
0
24
99.94
6.28
2
24
97.20
7.84
4
24
95.39
9.68
6
24
92.58
11.43
B
0
24 100.87
5.06
2
24
95.80
6.05
4
24
93.56
7.33
6
24
89.72
9.49
** Standard Random Intercept & Slope Model;
data ydat; set ydat; timec=time;
proc mixed data=ydat;
class id;
model y= Method timec Method*timec/solution ddfm = kr;
random intercept timec/type = un subject=id ;
Cov Parm
UN(1,1)
UN(2,1)
UN(2,2)
Residual
Covariance Parameter Estimates
Standard
Subject
Estimate
Error Z Value
ID
17.7925
6.3732
2.79
ID
1.3181
1.3369
0.99
ID
1.7033
0.5427
3.14
16.6901
2.4090
6.93
Effect
Intercept
Method
timec
Method*timec
Solution for Fixed Effects
Standard
Estimate
Error
99.8594
1.1082
0.4839
1.5673
-1.1943
0.3252
-0.5913
0.4599
DF
46
46
46
46
Pr Z
0.0026
0.3241
0.0008
<.0001
t Value Pr > |t|
90.11 <.0001
0.31 0.7589
-3.67 0.0006
-1.29 0.2049
*** Likelihood ratio test for linear vs. quadratic model;
*** Must use method = ml;
proc mixed data=ydat method=ml;
class id;
model y= Method timec Method*timec/solution ddfm = kr;
random intercept timec/type = un subject=id ;
proc mixed data=ydat method=ml ;
class group id;
model y= Method timec timec*timec Method*timec Method*timec*timec/
solution ddfm = kr;
random intercept timec/type = un subject=id ;
Fit Statistics
-2 Log Likelihood
AIC (smaller is better)
AICC (smaller is better)
BIC (smaller is better)
Fit Statistics
1227.5
1243.5
1244.3
1258.5
Solution for Fixed Effects
Standard
Effect
Estimate
Error
Intercept
99.8594
1.0849
Method
0.4839
1.5343
timec
-1.1943
0.3183
Method*timec
-0.5913
0.4502
Pr > |t|
<.0001
0.7538
0.0005
0.1952
-2 Log Likelihood
AIC (smaller is better)
AICC (smaller is better)
BIC (smaller is better)
Solution for Fixed Effects
Standard
Effect
Estimate
Error
Intercept
99.8417
1.1618
Method
0.8099
1.6431
timec
-1.1677
0.7002
timec*timec
-0.00443
0.1039
Method*timec
-1.0804
0.9902
Method*timec*timec
0.08151
0.1470
1227.0
1247.0
1248.2
1265.7
Pr >
|t|
<.0001
0.6238
0.0977
0.9661
0.2772
0.5805
** General Longitudinal Model Estimating Separate Means for Each Visit;
proc mixed
class id
model y=
repeated
estimate
estimate
estimate
estimate
data=ydat;
time;
time Method*time/solution ddfm = kr noint;
time/subject=id type=un;
'Month 2 Treatment Effect' Method*time -1 1 0 0;
'Month 6 Treatment Effect' Method*time -1 0 0 1;
'Mean Fup Treatment Effect' Method*time -3 1 1 1/divisor=3;
'Treatment Effect on Slp per 6 mo' Method*time -3 -1 1 3/divisor=3;
Covariance Parameter
Estimates
Cov Parm Subject Estimate
UN(1,1)
ID
32.4980
UN(2,1)
ID
20.7188
UN(2,2)
ID
48.9954
UN(3,1)
ID
24.3278
UN(3,2)
ID
38.2395
UN(3,3)
ID
73.7292
UN(4,1)
ID
22.7135
UN(4,2)
ID
51.9496
UN(4,3)
ID
70.8862
UN(4,4)
ID
110.36
Solution for Fixed Effects
Effect
Time Estimate
SE DF t Value
P|
Time
0
99.9395
1.1637 46 85.88 <.0001
Time
2
97.1954
1.4288 46 68.03 <.0001
Time
4
95.3934
1.7527 46 54.43 <.0001
Time
6
92.5784
2.1444 46 43.17 <.0001
Method*Time
0
0.9347
1.6456 46
0.57 0.5728
Method*Time
2
-1.3993
2.0206 46 -0.69 0.4921
Method*Time
4
-1.8331
2.4787 46 -0.74 0.4633
Method*Time
6
-2.8629
3.0327 46 -0.94 0.3501
Estimates
Label
Estimate
SE DF t Value
P
Month 2 Treatment Effect
-2.3340
1.8270 46 -1.28 0.2078
Month 6 Treatment Effect
-3.7976
2.8495 46 -1.33 0.1892
Mean Fup Treatment Effect
-2.9665
2.0211 46 -1.47 0.1490
Treatment Effect on Slp per 6 mo
-3.9423
3.0658 46 -1.29 0.2049
Basic Linear Mixed Model Formulation
(Laird & Ware 1982)
Yij = Xij1β1 + Xij2 β2 + … + Xijp βp
+ Zij1b1i + Zij2b2i + … + Zijqbqi + εij
(b1i, b2i, … bqi) ~ MVN with E(bri) = 0, r = 1, 2, … q,
Cov(bri,bsi) = Drs, r=1,2, … q; s=1,2,… q
(εi1, εi2,…, εin i)~ MVN with E(εir) = 0, r = 1, 2, … ni,
Cov(εir, εis) = Σrs, r=1,2, … ni; s=1,2,…, ni
(εi1, εi2,…, εini) and (b1i, b2i, … bqi) are independent between
different i, and are independent of each other.
Basic Linear Mixed Model Formulation
(Laird & Ware 1982)
px1
qx1
Yi = Xi β + Zi bi + εi
ni x 1
ni x p
ni x q
bi ~ MVN(0,D)
pxp
εi ~ MVN(0,Σi)
ni x ni
ni x 1
Yi = response for subject i
Xi, Zi = measured covariates
for subject i
β = fixed effects
bi = random effects for
subject i
εi =
residuals for subject i
b1, b2, …. bg, ε1, ε2,…, εg are independent
Marginal Model & Estimation Procedure
The linear mixed model
Yi = Xi β + Zi bi + εi, bi ~ MVN(0,D), εi ~ MVN(0,Σi)
Yi ~ MVN(Xi β, Zi D Zit + Σi).
Marginal Model & Estimation Procedure
Marginal Model & Estimation Procedure
Optimum Weighting of Data
(if the model is valid and data are MAR)
0
-20
-40
Mixed models give more weight to
these patients when computing a
group mean slope
-60
eGFR Slope (ml/min/1.73m2/yr)
GFR Slope vs. Total Follow-up Time in the AASK Study
2
4
6
8
10
Years of eGFR Follow-up From 3 Months After Randomization
Consequences of Missing Data
• Because a likelihood based approach is used, results of
correctly specified mixed models remain valid if data are
missing at random (so missingness is allowed to depend on
covariates included in the model, and nonmissing outcome
values)
• However, results may be biased if data are missing not at
random (informative missingness).
• The use of differential weighting can exacerbate this problem.
• Informative censoring due to termination of follow-up due to
competing risks can be addressed by using joint mixed models
incorporating both the longitudinal outcome and the time-toevent outcome defining the competing risk
Example: GFR trajectories in the MDRD Study
GFR Slope vs. Total Follow-up Time in the MDRD Study
Open circles indicate pts terminating
follow-up prior to scheduled EOS.
Schluchter, Greene, Beck, Stat Med 2001
Estimated Mean GFR Slope by Different Methods
Violations of Normality
• Two types of violations:
– Non-normal ԑij
– Non-normal bi
• Two types of inference:
– For fixed effects: Central limit theorem type phenomena
protect inferences with non-normal ԑij if either the ni or g
are large. If the bi are non-normal need large g.
– For random effects: Results are quite sensitive to
deviations from normality – large g does not help.
Two Most Common
Misconceptions
• Inclusion of a factor (such as center) as a random
effect does NOT control for confounding associated
with that factor !!!!
– E(Yij) = Xi β ignores the random effect terms
• Inclusion of center as a random effect in an RCT does
not extend the inference space for the treatment
effect unless a treatment x center random effect is
included.
References
• Fitzmaurice G, Laird N, Ware J. Applied Longitudinal
Analysis. Wiley, 2004.
• Verbeke G & Molenberghs G. Linear Mixed Models for
Longitudinal Data. Springer 2000.
• Littell R, Milliken G, Stroup W, Wolfinger R,
Schabenberger O, SAS for Mixed Models 2nd Ed, SAS
2006.
• Singer J, Willett. Applied Longitudinal Data Analysis,
Modeling Change and Event Occurrence. Oxford Press,
2003.
Next Time (Nov 3)
Rich Holubkov on Correlation
Structures