Interpreting multivariate OLS and logit coefficients
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.

Overview
• What elements to report for coefficients
• Coefficients on
– Continuous independent variables (IVs; predictors)
– Categorical independent variables
• Ordinary least squares (OLS) and logit
coefficients
• Topic sentences for paragraphs reporting
multivariate coefficients
Report and interpret results
• Report detailed multivariate results in tables.
– Coefficients.
– Inferential statistical results:
• standard error or test statistic,
• p-value or symbol.
– Model goodness of fit statistics.
• Interpret coefficients in the text.
– Refer to associated table.
What to report for coefficients
• Topic
– Independent variable (IV)
– Dependent variable (DV)
• Direction (AKA “sign”)
• Magnitude (AKA “size”)
• Units or categories
• Statistical significance
– Most authors remember to report statistical
significance, so I have listed that element last!
Interpreting coefficients
• Poor: “The effect of public insurance was –7.2 (p <
0.05).”
– Reports the coefficient without interpreting it. Without units
or a reference group, the meaning of “–7.2” cannot be
interpreted.
• Better: “Children with private insurance stayed on
average 7.2 days longer than those with public
insurance (p < 0.05).”
– Interprets the β in intuitive terms, mentioning the topic, units,
categories, direction, magnitude and statistical significance.
More examples of interpreting βs
• Poor: “Insurance and length of stay were associated (p
< 0.05).”
– Topic and statistical significance, but not direction or size.
• Better: “Privately-insured children stayed longer than
publicly insured children (p < 0.05).”
– Statistical significance and direction, but not size.
• Best: “Children with private insurance stayed on
average 7.2 days longer than those with public
insurance (p < 0.05).”
– Topic, direction, magnitude, units, categories, and statistical
significance.
Interpretation of βs depends on
types of variables in your models
• The type of dependent variable:
– Continuous dependent variable
• Ordinary least squares (OLS)
– Categorical dependent variable
• Logistic (logit) regression model
• Type of independent variable:
– Continuous
– Categorical
Interpreting coefficients from
ordinary least squares (OLS) models
Coefficients for OLS models
• For ordinary least squares (OLS) models, the
coefficient (β) is a measure of difference in the
DV for a 1-unit increase in the IV.
– For unstandardized coefficients, difference in the
same units as the dependent variable.
• Can be explained using wording for results of
subtraction.
• For standardized coefficients, β measures difference in
standardized units (multiples of standard deviations).
– See podcast about resolving the Goldilocks problem using model
specification.
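The difference between unstandardized and standardized coefficients can be sketched with a small calculation. This is a minimal illustration, not the author's method: it assumes a model with a single predictor (so there are no other covariates to adjust for), and the data values and the helper name `ols_slope` are hypothetical.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Slope of a one-predictor OLS fit: sum of cross-products / sum of squares of x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data (not from the source): mother's age in years, birth weight in grams.
age = [20, 25, 30, 35, 40]
weight = [3100, 3180, 3220, 3290, 3310]

# Unstandardized coefficient: difference in grams for a 1-year increase in age.
b = ols_slope(age, weight)

# Standardized coefficient: difference in standard deviations of birth weight
# for a 1-standard-deviation increase in age.
b_std = b * stdev(age) / stdev(weight)

print(f"unstandardized: {b:.1f} grams per year")
print(f"standardized:   {b_std:.2f} SD of birth weight per SD of age")
```

The unstandardized value can be reported with subtraction wording ("x grams more per year"), while the standardized value has no natural units and needs the phrase "standard deviations" in the sentence.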
Interpreting βs: continuous predictors
• The unstandardized coefficient on a continuous
predictor in an OLS model measures
– The difference in the dependent variable for a one-unit increase in the independent variable.
– Effect size is in original units of the DV.
• Example topic: Mother’s age as a predictor of
birth weight:
– Dependent variable = birth weight in grams.
– Independent variable = mother’s age in years.
– Both are continuous variables.
Example: Mother’s age as a predictor of birth weight
• Poor: “Mother’s age and child’s birth weight are
correlated (p<0.01).”
– Names the dependent and independent variables and
conveys statistical significance, but not direction or
magnitude of the association.
• Better: “As mother’s age increases, her child’s birth
weight also increases (p<0.01).”
– Concepts, direction, and statistical significance, but not size.
• Best: “For each additional year of mother’s age at the
time of her child’s birth, the child’s birth weight
increases by 10.7 grams (p<0.01).”
– Concepts, units, direction, size, and statistical significance.
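Because the coefficient is a difference per one-unit increase, it can be rescaled to other contrasts by multiplication. The sketch below uses the 10.7 grams per year quoted in the "Best" example; the function name `predicted_difference` is hypothetical, and the calculation assumes mother's age enters the model as a simple linear term.

```python
def predicted_difference(beta, delta_iv):
    """Difference in the DV implied by an OLS coefficient for a delta_iv-unit change in the IV."""
    return beta * delta_iv

beta_age = 10.7  # grams of birth weight per year of mother's age, from the example above

for years in (1, 5, 10):
    grams = predicted_difference(beta_age, years)
    print(f"{years}-year difference in mother's age: {grams:.1f} grams of birth weight")
```

Reporting the difference for a contrast larger than one unit can make the effect size easier for readers to judge, as long as the sentence names the contrast being used.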
Interpreting βs: categorical predictors
• The β on a categorical IV in an OLS model
measures the difference in the DV for the
category of interest compared to the reference
category.
– A “1-unit increase” does NOT make sense.
• Example: gender
– Dummy variable (AKA “binary variable”) coded
• 1 = boy
• 0 = girl = omitted (reference) category
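To see why the coefficient on a dummy variable is a difference between categories, consider the simplest case: a model whose only predictor is the dummy. In that case the OLS coefficient equals the difference in group means. The data and the helper name `ols_slope` below are hypothetical; in a multivariate model the coefficient is the corresponding difference adjusted for the other variables.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of a one-predictor OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data (not from the source): 1 = boy, 0 = girl (reference category).
boy = [1, 1, 1, 0, 0, 0]
weight = [3400, 3300, 3500, 3250, 3300, 3200]  # birth weight in grams

beta_boy = ols_slope(boy, weight)

mean_boys = mean(w for b, w in zip(boy, weight) if b == 1)
mean_girls = mean(w for b, w in zip(boy, weight) if b == 0)

print(f"coefficient on 'boy':      {beta_boy:.1f} grams")
print(f"boys' mean - girls' mean:  {mean_boys - mean_girls:.1f} grams")
```

Both lines print the same number, which is why the prose should say "boys weigh x grams more than girls" rather than describing a "one-unit increase in gender."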
Example: Gender as a predictor of birth weight
• Poor: “The β for ‘BBBOY’ is 116.1 with an s.e.
of 12.3 (table 15.3).”
– Uses a cryptic acronym rather than naming the
independent variable or conveying that it is
categorical.
– Doesn’t convey the dependent variable.
– Reports the same information as the table (size of
coefficient and standard error), but does not
interpret them.
– The direction of the effect cannot be determined
because categories and units are not specified.
– To assess statistical significance, readers must
calculate test statistic and compare it against
critical value.
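The calculation that the "Poor" sentence leaves to readers looks roughly like this. It is a sketch that assumes a large sample, so the test statistic is compared against standard normal critical values; with a small sample a t distribution with the model's degrees of freedom would be used instead.

```python
from statistics import NormalDist

beta, se = 116.1, 12.3  # coefficient and standard error quoted in the example above

# Test statistic: coefficient divided by its standard error.
test_statistic = beta / se

# Two-sided critical values under the large-sample normal approximation (an assumption here).
critical_05 = NormalDist().inv_cdf(0.975)  # about 1.96 for p < 0.05
critical_01 = NormalDist().inv_cdf(0.995)  # about 2.58 for p < 0.01

print(f"test statistic = {test_statistic:.2f}")
print(f"exceeds p < 0.05 critical value ({critical_05:.2f}): {abs(test_statistic) > critical_05}")
print(f"exceeds p < 0.01 critical value ({critical_01:.2f}): {abs(test_statistic) > critical_01}")
```

Stating "p < 0.01" in the sentence spares readers this step.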
Gender as a predictor of birth weight, cont.
• Slightly better: “Gender is associated with a
difference of 116.1 grams in birth weight (p <
0.01).”
– Concepts, magnitude, units, and statistical
significance but not direction: Was birth weight
higher for boys or for girls?
• Best: “At birth, boys weigh on average 116
grams more than girls (p < 0.01).”
– Concepts, reference category and units, direction,
magnitude, and statistical significance.
Identifying the reference category
• For categorical variables, mention identity of
reference category.
– E.g., effect size is relative to whom?
• Example for 2-category comparison:
– “Boys weighed 116 grams more than girls.”
• Example for multicategory comparison:
– “Compared to white infants, black and Hispanic infants
weighed 62 and 16 grams less on average.”
– OR “Mean birth weight was 62 and 16 grams less, for
black and Hispanic infants, respectively, when each is
compared to white infants.”
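The reference category is whichever category gets no dummy variable of its own. A minimal sketch of that coding, with a hypothetical helper `dummy_code` and made-up observations, might look like this:

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator list per non-reference category.

    The reference category is omitted, so each coefficient estimated on
    these indicators is a comparison against that category.
    """
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical observations (not from the source).
race = ["white", "black", "Hispanic", "white", "black"]

dummies = dummy_code(race, reference="white")
for category, indicator in dummies.items():
    print(f"{category:<9} {indicator}")
```

With "white" omitted, the coefficients on the "black" and "Hispanic" indicators are each read as a difference from white infants, which is the comparison the example sentences spell out.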
Interpreting coefficients from
logistic regression models
Logit models for
categorical dependent variables
• Logit = log[p/(1 – p)] = log(odds of the category
you are modeling)
– p is the proportion of the sample in the modeled category
• β measures the log relative-odds of the
outcome for different values of the
independent variable
• Exponentiate the logit coefficient:
e^β = relative odds, or “odds ratio”
• Compares the odds of the outcome for different values of
the independent variable
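The chain from proportions to odds, log-odds, and the odds ratio can be traced in a few lines. The two proportions below are hypothetical and the function names are illustrative; the sketch covers only a two-group comparison with no other covariates.

```python
import math

def odds(p):
    """Odds of the modeled category: p / (1 - p)."""
    return p / (1 - p)

def logit(p):
    """Log-odds of the modeled category."""
    return math.log(odds(p))

# Hypothetical proportions with the outcome in two groups (not from the source).
p_group = 0.20      # category of interest
p_reference = 0.10  # reference category

# The logit coefficient is the difference in log-odds (the log relative odds).
beta = logit(p_group) - logit(p_reference)

# Exponentiating the coefficient recovers the odds ratio.
odds_ratio = math.exp(beta)

print(f"odds: {odds(p_group):.3f} vs. {odds(p_reference):.3f}")
print(f"beta (log relative odds) = {beta:.3f}")
print(f"odds ratio = e^beta = {odds_ratio:.2f}")
```

Note that the odds ratio here (2.25) is larger than the ratio of the two proportions (2.0), so the sentence should refer to "odds" rather than "likelihood" or "risk."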
Example: Logit model of LBW
• Low birth weight (LBW)
= birth weight <2,500 grams
• Log-odds = log[p_LBW/(1 – p_LBW)]
– Where p_LBW is the proportion of the sample that is LBW.
• Log relative odds of LBW = comparison of log-odds of LBW for different values of the
independent variable.
• e^β = relative odds of LBW for different values of
the independent variable.
Wording for odds ratios
• Exponentiated βs from logit models (odds ratios) are in the form of ratios.
• For suggestions on how to phrase descriptions of
ratios with minimal jargon, see
– Table 5.3 in The Chicago Guide to Writing about
Numbers
OR
– Table 8.3 in The Chicago Guide to Writing about
Multivariate Analysis
Phrases for ratios
• Type of ratio: < 1.0 (e.g., 0.x). % difference = ratio × 100.
– Ratio example: 0.80
– Rule of thumb: [Group] is only x% as ___ as the reference value.
– Writing suggestion: “Males were only 80% as likely as females to graduate from the program.”
• Type of ratio: close to 1.0
– Ratio example: 1.02
– Rule of thumb: Use phrasing to express similarity between the two groups.
– Writing suggestion: “Average test scores were similar for males and females (ratio = 1.02 for males vs. females).”
• Type of ratio: > 1.0 (e.g., 1.y). % difference = (ratio – 1) × 100.
– Ratio example: 1.20
– Rule of thumb: [Group] is 1.y times as ___ as the reference value, OR [Group] is y% ___er than the reference value.
– Writing suggestion: “On average, males were 1.20 times as tall as females.” OR “Males were on average 20% taller than females.”
– Ratio example: 2.34
– Rule of thumb: [Group] is (2.34 – 1) × 100, or 134% more ___ than the reference value.
– Writing suggestion: “Males’ incomes were 134% higher than those of females.”
• Type of ratio: close to a multiple of 1.0 (e.g., z.00)
– Ratio example: 2.96
– Rule of thumb: [Group] is (about) z times as ___.
– Writing suggestion: “Males were nearly three times as likely to commit a crime as their female peers.”
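The rules of thumb above can be expressed as a small helper that picks a phrasing template for a given ratio. This is only a sketch: the function name `describe_ratio` is hypothetical, and the `tolerance` that decides what counts as "close to" 1.0 or to a whole multiple is an assumption, since the table does not define a cutoff.

```python
def describe_ratio(ratio, tolerance=0.05):
    """Return a phrasing template for a ratio, following the rules of thumb above.

    `tolerance` (an assumed cutoff, not from the source) decides when a ratio
    counts as "close to" 1.0 or to a whole-number multiple.
    """
    nearest = round(ratio)
    if abs(ratio - 1.0) <= tolerance:
        return "similar to the reference value"
    if ratio < 1.0:
        return f"only {ratio * 100:.0f}% as ___ as the reference value"
    if nearest >= 2 and abs(ratio - nearest) <= tolerance:
        return f"about {nearest} times as ___ as the reference value"
    return f"{(ratio - 1) * 100:.0f}% more ___ than the reference value"

# The ratio examples from the table above.
for r in (0.80, 1.02, 1.20, 2.34, 2.96):
    print(f"{r:.2f}: [Group] is {describe_ratio(r)}")
```

The blank in each template still has to be filled with a word suited to the topic ("likely," "tall," "higher"), which is a writing decision the code cannot make.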
Odds ratios for
categorical independent variables
• Odds ratio of the outcome for the category of
interest compared to the reference category.
• “Infants born to smokers had 1.4 times the
odds of low birth weight (LBW) as those born
to nonsmokers (p < 0.01).”
– Concepts, reference category, direction,
magnitude, and statistical significance.
Odds ratios for
continuous independent variables
• Odds ratio of the outcome for a one-unit
increase in the independent variable.
• “Odds of LBW decreased by about 0.8% for
each 1-year increase in mother’s age (NS).”
– Concepts, units, direction, magnitude, and
statistical significance.
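For a continuous predictor, the odds ratio for a contrast larger than one unit comes from multiplying the coefficient, not the percentage. The sketch below starts from the 0.8% decrease per year in the example, which implies an odds ratio of about 0.992 per year; the function name `odds_ratio_for` is hypothetical, and the calculation assumes mother's age enters the logit model as a simple linear term.

```python
import math

or_per_year = 1 - 0.008        # odds ratio implied by "decreased by about 0.8%" per year
beta = math.log(or_per_year)   # the corresponding logit coefficient (log odds ratio)

def odds_ratio_for(delta_years, beta=beta):
    """Odds ratio for a delta_years increase: exponentiate beta times the contrast."""
    return math.exp(beta * delta_years)

or_10 = odds_ratio_for(10)
pct_change_10 = (or_10 - 1) * 100

print(f"odds ratio per 1 year:   {odds_ratio_for(1):.3f}")
print(f"odds ratio per 10 years: {or_10:.3f}")
print(f"percent change in odds per 10 years: {pct_change_10:.1f}%")
```

The 10-year figure is slightly smaller in size than ten times 0.8% because odds ratios compound multiplicatively across units.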
Topic sentences for paragraphs
reporting multivariate results
• Start each paragraph of the results section with a
restatement of topic addressed by analysis to be
reported in that paragraph.
– Can paraphrase title of table or chart that reports the detailed
statistical results.
• Topic sentence should mention:
– Dependent variable.
– Independent variable(s).
• Use summary phrase rather than long list of variables.
– Type of analysis.
Example topic sentences
• “Multivariate logistic regression results show
that insurance is a powerful predictor of length
of stay (table X).” [Next sentence goes into detail
about direction, size, and statistical significance.]
– Mentions type of analysis, dependent variable, and
independent variable.
• “As shown in figure Y, race and income level
interact in their effect on risk of asthma.”
Summary
• Report detailed multivariate results in tables.
• Interpret coefficients in prose.
• Specify direction, magnitude, and statistical
significance of associations.
– Units for continuous variables
– Categories for nominal or ordinal variables
• Write about concepts, not acronyms.
– Introduce concepts under study in topic sentences.
Suggested resources
• Miller, J. E. 2013. The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Chapter 5, on creating effective multivariate tables
– Chapter 8, on wording for results of
• subtraction (OLS βs)
• ratios (logit βs)
– Chapter 9, on writing about βs from OLS and logit
models
– Chapter 10, on the Goldilocks problem for choosing a
fitting contrast size for interpreting coefficients
Suggested online resources
• Podcasts on
– Comparing two numbers or series
– Choosing a reference category
– Defining the Goldilocks problem
– Resolving the Goldilocks problem: Presenting results
– Differentiating between statistical significance and
substantive importance
Suggested practice exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Problem sets for chapters 9 and 15
– Suggested course extensions for chapters 9 and 15
• “Reviewing,” “writing” and “revising” exercises.
Contact information
Jane E. Miller, PhD
[email protected]
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html