Hierarchical Linear Modeling


Transcript: Hierarchical Linear Modeling


Hierarchical Linear Modeling

David A. Hofmann, Kenan-Flagler Business School, University of North Carolina at Chapel Hill
Academy of Management, August 2007


Overview of Session

• Overview
– Why are multilevel methods critical for organizational research?
– What is HLM (at a conceptual level)?
– How does it differ from “traditional” methods of multilevel analysis?
– How do you do it (estimating models in HLM)?
– Critical decisions and other issues


Why Multilevel Methods


Why Multilevel Methods

• Hierarchical nature of organizational data
– Individuals nested in work groups
– Work groups in departments
– Departments in organizations
– Organizations in environments
• Consequently, we have constructs that describe:
– Individuals
– Work groups
– Departments
– Organizations
– Environments


Why Multilevel Methods

• Hierarchical nature of longitudinal data
– Time series nested within individuals
– Individuals
– Individuals nested in groups
• Consequently, we have constructs that describe:
– Individuals over time
– Individuals
– Work groups


Why Multilevel Methods

• Meso Paradigm
– Micro OB
– Macro OB
– Call for shifting focus:
• Contextual variables into micro theories
• Behavioral variables into macro theories
• Longitudinal Paradigm
– Intraindividual change
– Interindividual differences in individual change


What is HLM


What is HLM

• Hierarchical linear modeling
– The name of a software package
– Also used as a description of a broader class of models
• Random coefficient models
• Models designed for hierarchically nested data structures
• Typical application
– Hierarchically nested data structure
– Outcome at the lowest level
– Independent variables at the lowest and higher levels


What is HLM

• What might be some examples?
– Organizational climate predicting individual outcomes over and above individual factors
– Organizational climate as a moderator of individual-level processes
– Individual characteristics (e.g., personality) predicting differences in change over time (e.g., learning)
– Organizational structure moderating relationships between individual characteristics
– Industry characteristics moderating the relationship between corporate strategy and performance


What is HLM

• Yes, but what IS it?
– HLM models variance at two levels of analysis
– At a conceptual level:
• Step 1: Estimate a separate regression equation within each unit; this summarizes the within-unit relationships (an intercept and slopes for each unit)
• Step 2: Use these “summaries” of the within-unit relationships as outcome variables, regressing them on level-2 characteristics
– Mathematically it is not really a two-step process, but this helps in understanding what is going on
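The two conceptual steps can be sketched in code. This is only an intuition aid with made-up data and plain OLS formulas; the actual HLM estimator fits both levels simultaneously via maximum likelihood, not in two separate passes:

```python
# Conceptual two-step illustration of HLM (intuition only, toy data).

def ols(x, y):
    """Return (intercept, slope) from a simple one-predictor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return my - slope * mx, slope

# Step 1: estimate a separate regression within each unit (two toy groups).
groups = {
    "g1": ([1, 2, 3, 4], [2.1, 2.9, 4.2, 4.8]),
    "g2": ([1, 2, 3, 4], [1.0, 1.4, 1.9, 2.5]),
}
summaries = {g: ols(x, y) for g, (x, y) in groups.items()}

# Step 2 would then regress these per-group intercepts and slopes on
# level-2 (group) characteristics.
for g, (b0, b1) in summaries.items():
    print(g, round(b0, 2), round(b1, 2))
```

One reason the literal two-step recipe is not used in practice: OLS treats every unit's intercept and slope as equally reliable, whereas HLM weights units by the precision of their level-1 estimates.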


What is HLM

[Figure: within-unit regression lines of Y_ij on X_ij]
Level 1: Regression lines estimated separately for each unit
Level 2:
• Variance in intercepts predicted by between-unit variables
• Variance in slopes predicted by between-unit variables


What is HLM

• For those who like equations …
• Two-stage approach to multilevel modeling
– Level 1: within-unit relationships for each unit
– Level 2: models variance in level-1 parameters (intercepts and slopes) with between-unit variables

Level 1: Y_ij = ß0j + ß1j(X_ij) + r_ij
Level 2: ß0j = γ00 + γ01(Group_j) + U0j
         ß1j = γ10 + γ11(Group_j) + U1j

{the j subscript indicates parameters that vary across groups}


What is HLM

• Think about a simple example
– Individual-level variables
• Helping behavior (DV)
• Individual mood (IV)
– Group-level variable
• Proximity of group members


What is HLM

• Hypotheses
1. Mood is positively related to helping
2. Proximity is positively related to helping after controlling for mood
• On average, individuals who work in closer proximity are more likely to help: a group-level main effect for proximity after controlling for mood
3. Proximity moderates the mood–helping relationship
• The relationship between mood and helping behavior is stronger where group members are in closer proximity to one another

What is HLM

[Figure: helping regressed on mood, with separate regression lines for high- and low-proximity groups]
• Overall, positive relationship between mood and helping (average regression line across all groups)
• Overall, higher-proximity groups have more helping than low-proximity groups (average intercepts differ)
• Average slope is steeper for high proximity than for low proximity


What is HLM

• For those who like equations …
• Here are the equations for this model:

Level 1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
Level 2: ß0j = γ00 + γ01(Proximity_j) + U0j
         ß1j = γ10 + γ11(Proximity_j) + U1j


How is HLM Different


How is HLM Different

• OK, this all seems reasonable …
• And clearly this seems different from “traditional” regression approaches
– There are intercepts and slopes that vary across groups
– There are level-1 and level-2 models
• But why is all this increased complexity necessary?

How is HLM Different

• “Traditional” regression analysis for our example
– Compute the average proximity score for each group
– “Assign” this score down to everyone in the group

ID  Grp  Help  Mood  Prox
1   1    5     5     5
2   1    3     4     5
3   1    4     3     5
4   1    4     5     5
5   2    2     1     10
6   2    3     3     10
7   2    3     3     10
8   2    2     4     10


How is HLM Different

• Then you would run OLS regression
• Regress helping onto mood and proximity
• Equation: Help = b0 + b1(Mood) + b2(Prox) + b3(Mood×Prox) + e_ij
• Independence of the e_ij component is assumed
– But we know individuals are clustered into groups
– Individuals within a group are more similar to one another than to individuals in other groups
– Violation of the independence assumption
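The independence violation is easy to demonstrate with simulated nested data. The group structure and effect sizes below are invented for illustration:

```python
# Sketch of the independence violation: each group shares a random
# level-2 component (U_0j). After removing only the grand mean (what a
# single-level OLS with no group terms does), residuals from the same
# group still co-vary. All parameters here are made up.
import random
random.seed(1)

data = []  # (group, y)
for g in range(20):                      # 20 groups
    u0j = random.gauss(0, 2)             # shared group effect U_0j
    for _ in range(5):                   # 5 people per group
        data.append((g, u0j + random.gauss(0, 1)))  # + individual r_ij

grand_mean = sum(y for _, y in data) / len(data)
resid = [(g, y - grand_mean) for g, y in data]

# Average cross-product of residual pairs drawn from the SAME group:
same = [e1 * e2
        for i, (g1, e1) in enumerate(resid)
        for g2, e2 in resid[i + 1:]
        if g1 == g2]
print(sum(same) / len(same))  # clearly positive: the e_ij are not independent
```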


How is HLM Different

• OLS regression equation (main effect): Helping_ij = b0 + b1(Mood) + b2(Prox.) + e_ij
• The HLM equivalent model (ß1j is fixed across groups):

Level 1: Helping_ij = ß0j + ß1j(Mood) + r_ij
Level 2: ß0j = γ00 + γ01(Prox.) + U0j
         ß1j = γ10


How is HLM Different

• Form a single equation from the HLM models
– Simple algebra: take the level-1 formula and replace ß0j and ß1j with their level-2 equations

Help = [γ00 + γ01(Prox.) + U0j] + [γ10](Mood) + r_ij
     = γ00 + γ10(Mood) + γ01(Prox.) + U0j + r_ij
     = γ00 + γ10(Mood) + γ01(Prox.) + [U0j + r_ij]

OLS: Help = b0 + b1(Mood) + b2(Prox.) + e_ij

• Only difference
– Instead of e_ij you have [U0j + r_ij]
– No violation of independence, because the different error components are estimated instead of confounded


How is HLM Different

• HLM
– Models variance at multiple levels
– Analytically, variables remain at their theoretical level
– Controls for (accounts for) the complex error term implicit in nested data structures
– Weighted least squares estimates at level 2


Estimating Models in HLM

• OK, I have a sense of what HLM is
• I think I’m starting to understand that it is different from OLS regression … and more appropriate
• So how do I actually start modeling hypotheses in HLM?


Estimating Models in HLM


Estimating Models in HLM

Some preliminary definitions:
– Random coefficients/effects
• Coefficients/effects that are assumed to vary across units: within-unit intercepts, within-unit slopes, level-2 residuals
– Fixed effects
• Effects that do NOT vary across units: level-2 intercept, level-2 slope

Level 1: Helping_ij = ß0j + ß1j(Mood) + r_ij
Level 2: ß0j = γ00 + γ01(Prox.) + U0j
         ß1j = γ10


Estimating Models in HLM

• Estimates provided:
– Level-2 parameters (intercepts, slopes)**
– Variance of level-2 residuals***
– Level-1 parameters (intercepts, slopes)
– Variance of level-1 residuals
– Covariance of level-2 residuals
• Statistical tests:
– t-test for parameter estimates (level-2 fixed effects)**
– chi-square test for variance components (level-2 random effects)***


Estimating Models in HLM

• Hypotheses for our simple example
1. Mood is positively related to helping
2. Proximity is positively related to helping after controlling for mood
• On average, individuals who work in closer proximity are more likely to help: a group-level main effect for proximity after controlling for mood
3. Proximity moderates the mood–helping relationship
• The relationship between mood and helping behavior is stronger where group members are in closer proximity to one another

Estimating Models in HLM

• Necessary conditions
– Systematic within- and between-group variance in helping behavior
– Mean level-1 slope significantly different from zero (Hypothesis 1)
– Significant variance in level-1 intercepts (Hypothesis 2)
– Variance in intercepts significantly related to proximity (Hypothesis 2)
– Significant variance in level-1 slopes (Hypothesis 3)
– Variance in slopes significantly related to proximity (Hypothesis 3)

Estimating Models in HLM

[Figure: helping regressed on mood, with separate regression lines for high- and low-proximity groups]
• Overall, positive relationship between mood and helping (average regression line across all groups)
• Overall, higher-proximity groups have more helping than low-proximity groups (average intercepts differ)
• Average slope is steeper for high proximity than for low proximity


Estimating Models in HLM

• Pop quiz
– What do you get if you regress a variable onto a vector of 1s and nothing else? Equation: variable = b1(1s) + e
– The b-weight associated with the 1s equals the mean of the variable
– This is what regression programs do to model the intercept
• How much variance can the 1s account for?
– Zero: all variance is forced into the residual

Variable: 5 4 5 3 2 4 5   (mean = 4.0)
1s:       1 1 1 1 1 1 1
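The pop quiz can be verified directly with the slide's own numbers. With a single predictor and no separate intercept, OLS gives b = (x'y)/(x'x), which for a column of 1s reduces to the mean:

```python
# Regress the slide's variable on a column of 1s (no separate intercept).
y = [5, 4, 5, 3, 2, 4, 5]
ones = [1] * len(y)

# OLS with one predictor and no intercept: b = (x'y) / (x'x)
b = sum(o * yi for o, yi in zip(ones, y)) / sum(o * o for o in ones)
print(b)  # 4.0, the mean of y

# The 1s explain nothing: every bit of variance lands in the residual.
ss_resid = sum((yi - b * o) ** 2 for yi, o in zip(y, ones))
print(ss_resid)  # 8.0, the full sum of squares around the mean
```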


Estimating Models in HLM

• One-way ANOVA: no level-1 or level-2 predictors (the null model)

Level 1: Helping_ij = ß0j + r_ij
Level 2: ß0j = γ00 + U0j

• where:
ß0j = mean helping for group j
γ00 = grand mean helping
Var(r_ij) = σ² = within-group variance in helping
Var(U0j) = τ00 = between-group variance in helping
Var(Helping_ij) = Var(U0j + r_ij) = τ00 + σ²
ICC = τ00 / (τ00 + σ²)
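Given the two variance components from the null model, the ICC is a one-liner. The τ00 and σ² values below are hypothetical, not estimates from the deck:

```python
# ICC from the null (one-way ANOVA) model's variance components.
def icc(tau00, sigma2):
    """Proportion of total variance that lies between groups."""
    return tau00 / (tau00 + sigma2)

print(icc(2.0, 6.0))  # 0.25: 25% of the variance lies between groups
```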

Estimating Models in HLM

• Random coefficient regression model: add mood to the level-1 model (no level-2 predictors)

Level 1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
Level 2: ß0j = γ00 + U0j
         ß1j = γ10 + U1j

• where:
γ00 = mean (pooled) intercept (t-test)
γ10 = mean (pooled) slope (t-test; Hypothesis 1)
Var(r_ij) = σ² = level-1 residual variance (R², Hypothesis 1)
Var(U0j) = τ00 = variance in intercepts (related to Hypothesis 2)
Var(U1j) = variance in slopes (related to Hypothesis 3)

R² (level 1) = (σ²_owa − σ²_rrm) / σ²_owa
R² (total) = [(σ²_owa + τ00_owa) − (σ²_rrm + τ00_rrm)] / (σ²_owa + τ00_owa)
{owa = one-way ANOVA model; rrm = random regression model}
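The first R² formula above (the proportional reduction in level-1 residual variance) in code, with hypothetical variance components:

```python
# Level-1 pseudo-R^2: proportional reduction in residual variance when
# moving from the one-way ANOVA model (owa) to the random coefficient
# regression model (rrm). The inputs below are made-up values.
def pseudo_r2_level1(sigma2_owa, sigma2_rrm):
    return (sigma2_owa - sigma2_rrm) / sigma2_owa

print(pseudo_r2_level1(8.0, 6.0))  # 0.25: the level-1 predictor accounts
                                   # for 25% of within-group variance
```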


Estimating Models in HLM

• Intercepts-as-outcomes: model the level-2 intercept (Hypothesis 2) by adding proximity to the intercept model

Level 1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
Level 2: ß0j = γ00 + γ01(Proximity_j) + U0j
         ß1j = γ10 + U1j

• where:
γ00 = level-2 intercept (t-test)
γ01 = level-2 slope (t-test; Hypothesis 2)
γ10 = mean (pooled) slope (t-test; Hypothesis 1)
Var(r_ij) = level-1 residual variance
Var(U0j) = τ00 = residual intercept variance (R², Hypothesis 2)
Var(U1j) = variance in slopes (related to Hypothesis 3)

R² (intercepts) = (τ00_rrm − τ00_intercept) / τ00_rrm
R² (total) = [(σ²_rrm + τ00_rrm) − (σ²_inter + τ00_inter)] / (σ²_rrm + τ00_rrm)

Estimating Models in HLM

• Slopes-as-outcomes: model the level-2 slope (Hypothesis 3) by adding proximity to the slope model

Level 1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
Level 2: ß0j = γ00 + γ01(Proximity_j) + U0j
         ß1j = γ10 + γ11(Proximity_j) + U1j

• where:
γ00 = level-2 intercept (t-test)
γ01 = level-2 slope (t-test; Hypothesis 2)
γ10 = level-2 intercept of the slope model (t-test)
γ11 = level-2 slope (t-test; Hypothesis 3)
Var(r_ij) = level-1 residual variance
Var(U0j) = residual intercept variance
Var(U1j) = residual slope variance (R², Hypothesis 3)


Other Issues


Other Issues

• Assumptions
• Statistical power
• Centering level-1 predictors
• Additional resources


Other Issues

• Statistical assumptions
– Linear models
– Level-1 predictors are independent of the level-1 residuals
– Each r_ij is independent and normally distributed with mean zero and variance σ² for every level-1 unit i within each level-2 unit j (i.e., constant level-1 residual variance across units)
– The level-2 random elements are multivariate normal, each with mean zero, variance τqq, and covariance τqq′
– Level-2 predictors are independent of the level-2 residuals
– Level-1 and level-2 errors are independent


Other Issues

• Statistical power
– Kreft (1996) summarized several studies
– .90 power to detect cross-level interactions with 30 groups of 30
– Trade-off:
• Large number of groups, fewer individuals within each
• Small number of groups, more individuals per group
– A more recent paper (in Organizational Research Methods):
• The 30/30 rule still roughly holds
• Factors in the cost of sampling at level 1 vs. level 2
• Provides additional formulas for computing power


Other Issues

• Picture this scenario
– Fall of 1990
– The DOS version of HLM has just arrived in the mail
– A grad student computer lab at Penn State
– You finally get some multilevel data entered into the software and are ready to go
– Then you are confronted with …

Select your level-1 predictor(s):
  1 for (job satisfaction)
  2 for (pay satisfaction)
> 1

How would you like to center your level-1 predictor Job Satisfaction?
  1 for Raw Score
  2 for Group Mean Centering
  3 for Grand Mean Centering
Please indicate your centering choice: ___


Other Issues

• HLM forces you to make a choice about how to center your level-1 predictors
• This is a critical decision
– The wrong choice can result in testing theoretical models that are inconsistent with your hypotheses
– Incorrect centering choices can also produce spurious cross-level moderators
• The results indicate a level-2 variable predicting a level-1 slope
• But this is not really what is going on


Centering Decisions

• Level-1 parameters are used as outcome variables at level 2
• Thus, one needs to understand the MEANING of these parameters
• Intercept term: the expected value of Y when X is zero
• Slope term: the expected increase in Y for a unit increase in X
• In raw-metric form, X equal to zero might not be meaningful


Centering Decisions

• Three options
– Raw metric
– Grand-mean centering
– Group-mean centering
• Kreft et al. (1995): the raw-metric and grand-mean models are equivalent; the group-mean model is non-equivalent
• Raw metric / grand-mean centering: intercept variance = adjusted between-group variance in Y
• Group-mean centering: intercept variance = between-group variance in Y

[Kreft, I.G.G., de Leeuw, J., & Aiken, L.S. (1995). The effect of different forms of centering in hierarchical linear models. Multivariate Behavioral Research, 30, 1-21.]


Centering Decisions

• Bottom line
– Grand-mean centering and/or the raw metric estimate incremental models
• They control for variance in level-1 variables prior to assessing level-2 variables
– Group-mean centering
• Does NOT estimate incremental models: it does not control for level-1 variance before assessing level-2 variables
• Separately estimates the within-group regression and the between-group regression


Centering Decisions

• An illustration from Hofmann & Gavin (1998):
– 15 groups / 10 observations per group
– Individual variables: A, B, C, D
– Between-group variable: G_j
• G_j = f(A_j, B_j)
• Thus, if the between-group variance in A and B is accounted for, G_j should not significantly predict the outcome
– Run the model three ways:
• Grand-mean centered
• Group-mean centered
• Group-mean centered with the group means at level 2


Centering Decisions

• Grand-mean centering

Level-1 model            Level-2 model   Estimate for G_j
(Null)                   G_j             .445**
A_ij                     G_j             .209**
A_ij, B_ij               G_j             .064
A_ij, B_ij, C_ij         G_j             .007
A_ij, B_ij, C_ij, D_ij   G_j             -.028

• What do you see happening here … what can we conclude?

Centering Decisions

• Group-mean centering

Level-1 model                                   Level-2 model   Estimate for G_j
(Null)                                          G_j             .445**
A_ij−Ā_j                                        G_j             .445**
A_ij−Ā_j, B_ij−B̄_j                              G_j             .445**
A_ij−Ā_j, B_ij−B̄_j, C_ij−C̄_j                    G_j             .445**
A_ij−Ā_j, B_ij−B̄_j, C_ij−C̄_j, D_ij−D̄_j          G_j             .445**

• What do you see happening here … what can we conclude?


Centering Decisions

• Group-mean centering with the A, B, C, D means in the level-2 model

Level-1 model                                   Level-2 model              Estimate for G_j
(Null)                                          G_j                        .445**
A_ij−Ā_j                                        G_j, Ā_j                   .300*
A_ij−Ā_j, B_ij−B̄_j                              G_j, Ā_j, B̄_j              .132
A_ij−Ā_j, B_ij−B̄_j, C_ij−C̄_j                    G_j, Ā_j, B̄_j, C̄_j         .119
A_ij−Ā_j, B_ij−B̄_j, C_ij−C̄_j, D_ij−D̄_j          G_j, Ā_j, B̄_j, C̄_j, D̄_j    .099

• What do you see happening here … what can we conclude?


Centering Decisions

• Centering decisions are also important when investigating cross-level interactions
• Consider the following (grand-mean centered) model:

Level 1: Y_ij = ß0j + ß1j(X_ij − X̄..) + r_ij
Level 2: ß0j = γ00 + U0j
         ß1j = γ10

• ß1j does not provide an unbiased estimate of the pooled within-group slope
– It actually represents a mixture of both the within-group and between-group slopes
– Thus, you might not get an accurate picture of cross-level interactions
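A small constructed example shows the mixture, using single-level OLS as a stand-in for the pooled slope: within each group the slope is +1, but the group means slope downward, so the grand-mean-centered pooled slope is neither; group-mean centering recovers the within-group slope. All numbers are fabricated for illustration:

```python
# Within every group the slope of y on x is exactly +1, but the group
# means are arranged so the between-group slope is negative (-4/3).

def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

groups = [
    ([0, 1, 2], [10, 11, 12]),   # group means: x=1, y=11
    ([3, 4, 5], [6, 7, 8]),      # group means: x=4, y=7
    ([6, 7, 8], [2, 3, 4]),      # group means: x=7, y=3
]

# Grand-mean centering only shifts the origin, so the pooled slope is the
# same as for the raw data: a blend of within (+1) and between (-4/3).
all_x = [xi for x, _ in groups for xi in x]
all_y = [yi for _, y in groups for yi in y]
pooled = ols_slope(all_x, all_y)
print(pooled)  # -1.1: neither the within slope nor the between slope

# Group-mean centering removes the between-group part and recovers +1.
cx = [xi - sum(x) / len(x) for x, _ in groups for xi in x]
cy = [yi - sum(y) / len(y) for _, y in groups for yi in y]
print(ols_slope(cx, cy))  # 1.0
```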


Centering Decisions

• Bryk & Raudenbush distinguish between cross-level interactions and between-group interactions
– Cross-level: a group-level predictor of the level-1 slopes
– Between-group: two group-level predictors interacting to predict the level-2 intercept
• Only group-mean centering enables the investigation of both types of interaction
• Illustration (Hofmann & Gavin, 1999, Journal of Management)
– Two data sets were created:
• Cross-level interaction, no between-group interaction
• Between-group interaction, no cross-level interaction

Model 1 (grand-mean centered):
Level 1: Y_ij = ß0j + ß1j(X_ij − X̄..) + e_ij
Level 2: ß0j = γ00 + γ01(G_j) + u0j
         ß1j = γ10 + γ11(G_j) + u1j

Model 2 (group-mean centered, with the group mean X̄_j and its product with G_j at level 2):
Level 1: Y_ij = ß0j + ß1j(X_ij − X̄_j) + r_ij
Level 2: ß0j = γ00 + γ01(X̄_j) + γ02(G_j) + γ03(X̄_j × G_j) + u0j
         ß1j = γ10 + γ11(G_j) + u1j

[The slide's table of parameter estimates, t values, and p values for the two simulated data sets is too garbled in this transcript to reconstruct.]


Centering Decisions

• Incremental: the group variable adds incremental prediction over and above the individual variables
– Grand-mean centering
– Group-mean centering with the means added to the level-2 intercept model
• Mediational: individual perceptions mediate the relationship between contextual factors and individual outcomes
– Grand-mean centering
– Group-mean centering with the means added to the level-2 intercept model


Centering Decisions

• Moderational: a group-level variable moderates a level-1 relationship
– Group-mean centering provides a clean estimate of the within-group slope
– It separates the between-group interaction from the cross-level interaction
– Practical tip: if running grand-mean centered, check the final model group-mean centered
• Separate: group-mean centering produces separate within-group and between-group structural models


Do You Really Need HLM?

Alternatives for Estimating Hierarchical Models


SAS: Proc Mixed

• SAS Proc Mixed will estimate these models
• Key components of the Proc Mixed command language
– PROC MIXED statement
– CLASS: the group identifier
– MODEL: the regression equation, including individual, group, and interaction terms (if applicable)
– RANDOM: the specification of random effects (those allowed to vary across groups)


SAS: Proc Mixed

• Key components of the Proc Mixed command language
– Some options you might want to select
• PROC MIXED: noitprint (suppresses the iteration history)
• MODEL:
– solution (prints the fixed-effects estimates)
– ddfm=bw (specifies the “between/within” method for computing denominator degrees of freedom for tests of fixed effects)
• RANDOM:
– sub=id (specifies how level-1 units are divided into level-2 units)
– type=un (specifies an unstructured variance-covariance matrix of the intercepts and slopes, i.e., allows these parameters to be determined by the data)

SAS: Proc Mixed

proc means; run;

data; set;
  moodgrd = mood - 5.8388700;  /* grand-mean center mood */

data; set;

proc mixed noitprint; class id;
  model helping = / solution;
  random intercept / sub=id;

proc mixed noitprint; class id;
  model helping = moodgrd / solution ddfm=bw;
  random intercept moodgrd / sub=id type=un;

proc mixed noitprint; class id;
  model helping = moodgrd proxim / solution ddfm=bw;
  random intercept moodgrd / sub=id type=un;

proc mixed noitprint; class id;
  model helping = moodgrd proxim moodgrd*proxim / solution ddfm=bw;
  random intercept moodgrd / sub=id type=un;
run;


Model specifications (identical in HLM and Proc Mixed):

One-way ANOVA
  L1: Helping_ij = ß0j + r_ij
  L2: ß0j = γ00 + U0j
Random coefficient regression
  L1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
  L2: ß0j = γ00 + U0j;  ß1j = γ10 + U1j
Intercepts-as-outcomes
  L1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
  L2: ß0j = γ00 + γ01(Proximity_j) + U0j;  ß1j = γ10 + U1j
Slopes-as-outcomes
  L1: Helping_ij = ß0j + ß1j(Mood_ij) + r_ij
  L2: ß0j = γ00 + γ01(Proximity_j) + U0j;  ß1j = γ10 + γ11(Proximity_j) + U1j

Estimates (HLM / Proc Mixed):
        One-way ANOVA    Random coef.     Intercepts       Slopes
γ00     31.39 / 31.39    31.42 / 31.43    24.92 / 24.91    25.14 / 25.14
γ01     -- / --          -- / --          1.24 / 1.24      1.19 / 1.19
γ10     -- / --          3.01 / 3.01      3.01 / 3.01      2.06 / 2.06
γ11     -- / --          -- / --          -- / --          .18 / .18
σ²      31.76 / 31.76    5.61 / 5.61      5.61 / 5.61      5.61 / 5.61
τ00     99.82 / 99.81    45.63 / 45.64    41.68 / 41.68    42.95 / 42.94
τ11     -- / --          .13 / .13        .13 / .13        .02 / .02

SAS: Proc Mixed

• Key reference
– Singer, J. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics, 23, 323-355.
– Available on her homepage: http://hugse1.harvard.edu/~faculty/singer/

Resources

• http://www.unc.edu/~dhofmann/hlm.html
– PowerPoint slides
– Annotated output
– Raw data + system files
– Link to download the student version of HLM
– Follows the chapter:
• Hofmann, D.A., Griffin, M.A., & Gavin, M.B. (2000). The application of hierarchical linear modeling to management research. In K.J. Klein & S.W.J. Kozlowski (Eds.), Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions. Jossey-Bass.
– Proximity = Group Cohesion
• Also: http://www.ssicentral.com/hlm/hlmref.htm
