
Multiple Regression
Research Hypothesis Testing
• Looking for “the model”
• Kinds of multiple regression questions
• Ways of forming reduced models
• Comparing “nested” models
• Comparing “non-nested” models
Searching for “the model” with multiple regression
A common question is, “What is the best multiple regression model for this criterion?”
This certainly seems like an important question, because such a
model would tell us what variables must be considered to
predict or perhaps understand the criterion & what variables
can be safely ignored in our theory and practice.
A “best model” would have two important properties…
1. Every predictor in the model contributes to the model
(parsimony or necessity)
2. No other predictor would contribute to the model if it were
added.
Searching for “the model” with multiple regression
There are three things that routinely thwart our attempts to find
“the model”
1. Collinearity – because of the correlations among the predictors (which are sometimes stronger than the predictors’ correlations with the criterion), there are often alternative models that perform equally well
2. Underspecification – there’s just no way we can ever test that
“no other predictor” would contribute (one solution is to
decide theoretically on the set of predictors - almost cheating)
3. Also, again because of collinearity, it is possible to include a variable in a model that, while it doesn’t contribute to the model itself, does change the size or even the sign of other predictors in the model. If so, the more “parsimonious” model might not be the most accurate.
So, what are we to do?
Rather than telling “the model” we need to tell “the story”
(which also gives us the best chance of finding the
model if it is out there…)
“the story” is told from …
1. Each predictor’s correlation with the criterion and the
collinearities among predictors
2. Each predictor’s contribution to the full model (noting
likely reasons why variables don’t contribute and
suppressors)
3. Relative utility (R²) of alternative models and each predictor’s contribution to each
4. Building a story of which predictors contribute, to which models, and when
Research Hypotheses and Multiple Regression
When carefully considered, most research hypotheses or questions involving multiple predictors have one of four forms:
1. The impact of adding or subtracting one or more particular predictors to or from a specified model
• Whether adding one or more particular predictors to a specified model will “help” the model (i.e., increase R² significantly)
• Whether dropping one or more predictors from a specified model will “hurt” the model (i.e., decrease R² significantly)
• This involves comparing “nested models” (R²-change test)
2. The impact of substituting one or more predictors for one or
more others
• Whether the substitution “helps” or “hurts” the model (i.e., significantly changes the R²)
• This involves comparing “non-nested models” (t- or Z-test)
Research Hypotheses and Multiple Regression, cont.
3. The differential performance of a specific model across two or more groups (populations, treatments, etc.)
• Whether the model produces equivalent R² for the groups
• This involves comparing the models’ fit (Fisher’s Z-test) and the models’ structure (t- or Z-test)
4. The differential performance of a specific model for predicting two or more different criterion variables
• Whether the model produces equivalent R² for the criterion variables
• This involves comparing “correlated R²s” (t- or Z-test)
Tell the type of each model comparison (→ means “predicting”)

#1 age, gender, birth order & # siblings → social skills
#2 age, gender, birth order, # siblings, family type & height → social skills
Nested

#1 age, birth order & # siblings → social skills for females
#2 age, birth order & # siblings → social skills for males
Pop comp

#1 age, gender, birth order & # siblings → social skills
#2 age, gender, birth order, family type & height → social skills
Non-nested

#1 age, gender, birth order & # siblings → dyadic social skills
#2 age, gender, birth order & # siblings → group social skills
DV comp
About Full vs. Reduced (nested) models …
Full model -- model involving “all the variables”
• all that you “care about”- not every variable in the data set
Reduced model -- model involving “some subset” of the variables
Ways of forming reduced models:
Theory -- some variables are “more important” from a theoretical perspective, and the question is whether a subset of variables accounts for the criterion variable as well as the full model does (e.g., will adding MMPI scores improve a model of drinking behavior that is based only on demographic variables?)
Pragmatic -- will a subset of the “less costly” predictors do as well as a model that includes both them and the more expensive ones (e.g., will adding a full-scale IQ measure (which would cost us $250) to a model using GRE scores ($0 for us) improve our selection of graduate students?)
Summary of ways of constructing reduced models:
• only include variables with significant simple correlations
• nah -- ignores suppressor variables & is atheoretical
• only include variables with significant contributions to full model
• nah -- ignores collinearity patterns & is atheoretical
• use automated/statistical model construction techniques
• nah -- doesn’t work as intended & is atheoretical
• select a subset of variables based on theory or
availability/economics that might be “sufficient” (perform
equivalently to the full model)
• yeah !!!
Keep in mind that the hypothesis/question might involve
comparing two reduced models - one nested in the other.
Many hierarchical modeling efforts have three basic steps…
1. Enter the demographic variables
2. Enter the “known predictors” (based on earlier work)
3. Enter the “new variables” (the ones you are proposing make up
an important, but as yet unidentified, part of understanding
this criterion variable)
This provides a conservative test of the “new variables”, because
they must “compete” with all the other variables and each other in
order to become a contributing predictor in the model.
Showing that your “new variables” are correlated with the criterion often isn’t sufficient; if they don’t contribute beyond what’s accounted for by the “demo + old” model, your variables don’t add anything.
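As a concrete illustration, here is a minimal Python sketch of the three-step hierarchical approach using statsmodels; the column names and data are hypothetical stand-ins, and compare_f_test performs the R²-change F-test between successive steps.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: demographics, "known" predictors, and one "new" variable
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 5)),
                  columns=["age", "gender", "known1", "known2", "new1"])
df["crit"] = df[["age", "known1", "new1"]].sum(axis=1) + rng.normal(size=n)

blocks = [["age", "gender"],                              # 1. demographics
          ["age", "gender", "known1", "known2"],          # 2. + known predictors
          ["age", "gender", "known1", "known2", "new1"]]  # 3. + new variables

prev_fit, prev_preds = None, None
for preds in blocks:
    fit = sm.OLS(df["crit"], sm.add_constant(df[preds])).fit()
    if prev_fit is not None:
        f, p, _ = fit.compare_f_test(prev_fit)  # R²-change F-test vs. prior step
        print(f"adding {sorted(set(preds) - set(prev_preds))}: "
              f"F = {f:.2f}, p = {p:.4f}")
    prev_fit, prev_preds = fit, preds
```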
Comparing “nested models”
R²y.x1,x2,x3,x4  vs.  R²y.x1,x2

H0: R²y.x1,x2,x3,x4 = R²y.x1,x2

        (RL² - RS²) / (kL - kS)
F = -------------------------------
      (1 - RL²) / (N - kL - 1)

RL² = R² of the larger model
RS² = R² of the smaller model
kL = # predictors in the larger model
kS = # predictors in the smaller model
N = total number of subjects

Find F-critical using df = (kL - kS) & (N - kL - 1)
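This F-test is a one-liner to implement; the sketch below (with made-up R² values) also previews the “average contribution” logic discussed next -- dropping two trivial predictors produces the same small F as dropping one.

```python
# A sketch of the R²-change F-test for nested models; scipy supplies the p-value.
from scipy import stats

def rsq_change_F(r2_L, r2_S, k_L, k_S, N):
    """F-test for the R² change between nested regression models."""
    F = ((r2_L - r2_S) / (k_L - k_S)) / ((1 - r2_L) / (N - k_L - 1))
    return F, stats.f.sf(F, k_L - k_S, N - k_L - 1)

# Dropping one trivial predictor: F ≈ 1.57, p ≈ .21 (non-significant)
print(rsq_change_F(r2_L=.40, r2_S=.39, k_L=5, k_S=4, N=100))
# Dropping two trivial predictors: the averaged change gives the same F
print(rsq_change_F(r2_L=.40, r2_S=.38, k_L=5, k_S=3, N=100))
```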
Notice the numerator of the F-test
• The change in R² is divided by the number of predictors that are
added (or removed)
• This makes sense, because an R² change of .20 from adding or
dropping a single predictor is “more impressive” than the
same change of .20 from adding or dropping 4 predictors
• This test actually asks if the “average contribution to the R²” of
the particular variables that are added to or dropped from
the model is significantly different from 0.00.
• The impact of adding or dropping predictors from a model may depend upon how those predictors are “packaged”, how many are involved, etc.
Applying this knowledge of R²-change will allow us to consider
changes in multiple predictor models, for example…
We know …
• dropping a contributing predictor from a model will lower R²
significantly
• dropping a non-contributing predictor from a model will lower R²
numerically, but not significantly
• adding a contributing predictor to a model will raise R²
significantly
• adding a non-contributing predictor to a model will raise R²
numerically, but not significantly
Usually (in most texts) we are told that we can’t accurately
anticipate the results of adding or dropping more than one
variable at a time -- but this is not strictly correct !!!
Consider what would happen if we dropped 2 non-contributing
predictors from a multiple regression model
• we are told that when we’ve dropped one of the predictors, the other might now contribute (we’ve changed the collinearity mix)
• but consider this…
• neither of the predictors, if dropped by itself, will produce a significant R² change
• so the average R² change from dropping the two shouldn’t be significant either
• thus, we can drop both and expect that the R² change won’t be significant
• this logic is useful, but becomes increasingly precarious as
sample size drops, collinearity increases or the number of
predictors in the model or being dropped increases
Similarly…
Dropping two predictors that both contribute should produce an
average R² change that is significant (same logic in reverse)
However, things get “squirrelly” when considering dropping one
contributing and one non-contributing predictor
• we have no good way of anticipating whether the average R² change will or will not be significant
We will consider these issues, their applications and some
“variations” when we look at the workings of statistical/automated
modeling procedures.
The moral of the story…
• Because this R²-change test really tests the average R²-change of the set of added or dropped predictors, the apparent contribution of an added variable may depend upon the variables with which it is added or dropped
• Adding or dropping large sets of variables simultaneously can
make the results harder to interpret properly
Because of this...
• Good RHs usually call for arranging the additions or deletions of items in small, carefully considered sets
• Thus, most RHs of this type use the addition or removal of multiple sets (each with a separate R²-change test)
• This is called hierarchical modeling -- the systematic addition or
removal of hypothesized sets of variables
Comparing “non-nested models”
Another important type of hypothesis tested using multiple
regression is about the substitution of one or more predictors for
one or more others.
Common bases for substitution hypotheses :
• Often two collinear variables won’t both contribute to a model -- you might check if there is an advantage of one vs. the other being included in the model
• You might have a hypothesis that a variable commonly used in a model (theory) can be substituted with some other variable
• You might be interested in substituting one (or more)
inexpensive (or otherwise more available) predictors for
one that is currently used
Non-nested models are compared using either Hotelling’s t or
Steiger’s Z
Some basic principles of these tests . . .
• the more correlated the two models are to each other, the less
likely they are to be differentially correlated with the criterion
• obviously the more predictors two models share, the more
collinear they will be -- seldom does the substitution of a
single predictor within a multivariate model have a
significant effect
• the more collinear the substituted variables are, the more
collinear the models will be -- for this reason there can be
strong collinearity between two models that share no
predictors
• the weaker (lower R²) the two models are, the less likely they are
to be differentially correlated with the criterion
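One of the tests for correlated correlations can be sketched directly; this follows the Meng, Rosenthal & Rubin (1992) formula (the “Meng, et al.” Z-test cited later in these slides), comparing the criterion’s correlations with the two models’ predictions (r1, r2) given the correlation between the two prediction sets (rx). The usage values are made up.

```python
# A sketch of the Meng, Rosenthal & Rubin (1992) Z-test for two
# dependent correlations that share the criterion variable.
import numpy as np
from scipy import stats

def meng_z(r1, r2, rx, N):
    z1, z2 = np.arctanh(r1), np.arctanh(r2)      # Fisher's r-to-z
    r2bar = (r1**2 + r2**2) / 2                  # average squared correlation
    f = min((1 - rx) / (2 * (1 - r2bar)), 1.0)   # f is capped at 1
    h = (1 - f * r2bar) / (1 - r2bar)
    Z = (z1 - z2) * np.sqrt((N - 3) / (2 * (1 - rx) * h))
    return Z, 2 * stats.norm.sf(abs(Z))

# e.g., models with R² = .40 vs .30 whose predictions correlate .80
print(meng_z(r1=np.sqrt(.40), r2=np.sqrt(.30), rx=.80, N=100))
```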
Comparing model performance across groups
• this involves the same basic idea as comparing a bivariate
correlation across groups, only now we’re working with a
multivariate model
• this sort of analysis has important theoretical uses (differential
behavioral models for different groups) as well as
psychometric applications (part of evaluating if “measures”
are equivalent for different groups, such as gender, race,
across cultures or within cultures over time)
• there are two different “versions” of this question
• Does the R² of a predictor set differ between the groups?
• we’ll use Fisher’s Z-test (nil- or non-nil H0:) for this
• Does the model -- that is, the b weights of the predictors -- differ between the groups?
• this requires a somewhat more complicated test using Hotelling’s t-test or the Meng et al. Z-test, but often provides the more important information
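A minimal sketch of the Fisher’s Z-test for the first (“fit”) version, applied to the multiple correlation R = √R² from each of two independent groups (the nil-H0 form; the sample values are made up):

```python
# Fisher's Z-test comparing a correlation across two independent groups.
import numpy as np
from scipy import stats

def fisher_z(R1, n1, R2, n2):
    z1, z2 = np.arctanh(R1), np.arctanh(R2)      # Fisher's r-to-z transform
    Z = (z1 - z2) / np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return Z, 2 * stats.norm.sf(abs(Z))

# e.g., R² = .40 in group 1 (n = 120) vs R² = .25 in group 2 (n = 90)
print(fisher_z(np.sqrt(.40), 120, np.sqrt(.25), 90))
```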
Comparing multiple regression models across groups
Group #1 (larger n)
• “direct model”: y’1 = b1x1 + b2x2 + a → R²D1
• “crossed model”: y’1 = b1x1 + b2x2 + a → R²X1
(apply the model (bs & a) from group 2 to the data from group 1)

Group #2 (smaller n)
• “direct model”: y’2 = b1x1 + b2x2 + a → R²D2

Compare R²D1 & R²D2 using Fisher’s Z-test
Compare R²D1 & R²X1 using Hotelling’s t-test & Meng et al. Z-test
(will need rDX -- the correlation between the models)

Retaining the H0: for each suggests group comparability in terms of “fit” and “structure” of a single model for the two groups.
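Computing the “crossed” R² is straightforward; a sketch (with made-up data and hypothetical names) applies group 2’s raw weights and intercept to group 1’s data:

```python
import numpy as np

def crossed_r2(X1, y1, b2, a2):
    """R²X1: group 2's model (b2, a2) applied to group 1's data."""
    y_cross = X1 @ b2 + a2                  # crossed predicted scores
    r = np.corrcoef(y1, y_cross)[0, 1]      # correlate with actual criterion
    return r ** 2

rng = np.random.default_rng(1)
X1 = rng.normal(size=(150, 2))                         # group 1 predictors
y1 = X1 @ np.array([0.5, 0.3]) + rng.normal(size=150)  # group 1 criterion
print(crossed_r2(X1, y1, b2=np.array([0.4, 0.4]), a2=0.1))
```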
Comparing model performance across criteria
• same basic idea as comparing correlated correlations, but now what differs between the two correlations is the criterion, not the predictor (nil- or non-nil H0:)
• two important uses of this type of comparison
• theoretical/applied -- do we need separate models to predict
related behaviors?
• psychometric -- do different measures of the same construct
have equivalent models (i.e., measure the same thing) ?
• the process is similar to testing for group differences, but what
changes is the criterion that is used, rather than the group
that is used
• we’ll apply the Hotelling’s t-test and/or Meng et al. Z-test to compare the structure of the two models
Comparing multiple regression models across criterion variables
Criterion #1 “Y”
• “direct model”: y’ = b1x1 + b2x2 + a → R²DY
• “crossed model”: y’ = b1x1 + b2x2 + a → R²XY
(apply the model (bs & a) estimated for criterion #2 to the prediction of criterion #1)

Criterion #2 “Z”
• “direct model”: z’ = b1x1 + b2x2 + a → R²DZ

Compare R²DY & R²XY using Hotelling’s t-test & Meng et al. Z-test
(will need rDX -- the correlation between the models)

Retaining the H0: suggests comparability in terms of the “structure” of a single model for the two criterion variables -- there is no direct test of the differential “fit” of the two models to the two criteria.