Multivariate Data Exploration with Stata: Evaluation and Wish List
Stephen Soldz
Boston Graduate School of Psychoanalysis
[email protected]
Principal Components Analysis
Purpose: Data exploration and data reduction
Available in Stata
• Base ado (pca)
• Built-in (factor, pcf)
• score will produce component scores
Issues/Limitations
• pca is just a wrapper for the (now undocumented) pc option to factor, which the user cannot access and modify
• Confusing documentation on the difference between pca and factor, pcf (i.e., scaling of eigenvectors)
• Does not directly allow pca of a correlation/covariance matrix; must use corr2data, introducing error
• Does not allow rotate, to “protect” the user; seems patronizing and uncharacteristic of Stata
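To make the two points about matrix input and eigenvector scaling concrete, here is a minimal sketch in Python with NumPy (not Stata, to avoid guessing at Stata syntax). It computes principal components directly from a correlation matrix and returns both scalings that are commonly reported: unit-length eigenvectors, and loadings (eigenvectors multiplied by the square root of their eigenvalues). The function name pca_from_corr and the matrix R are hypothetical, chosen only for illustration; which scaling each Stata command reports should be checked against the Stata documentation.

```python
import numpy as np

def pca_from_corr(R):
    """Principal components computed directly from a correlation
    (or covariance) matrix R, with no raw data required."""
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder: largest eigenvalue first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Eigenvector signs are arbitrary; make each column's largest entry positive.
    cols = np.arange(eigvecs.shape[1])
    signs = np.sign(eigvecs[np.abs(eigvecs).argmax(axis=0), cols])
    eigvecs = eigvecs * signs
    # Scaling 1: unit-length eigenvectors (each column has norm 1).
    # Scaling 2: loadings (each column's sum of squares equals its eigenvalue).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, eigvecs, loadings

# Hypothetical 3-variable correlation matrix, for illustration only.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
eigvals, eigvecs, loadings = pca_from_corr(R)
```

With all components retained, the loadings reproduce R exactly (loadings times their transpose), which is one quick way to tell the two scalings apart in any program's output.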
Exploratory Factor Analysis
Purpose: Data exploration and data reduction
Available in Stata
• Built-in factor allows principal factors (with and without iteration of communalities), maximum likelihood
• Built-in rotate allows varimax (with and without Horst correction) and promax
Issues/Limitations
• factor, pfi (principal factors with iteration) does not allow specification of the number of times to iterate; this directly conflicts with the Gorsuch (1983) recommendation that communalities be iterated only 3-4 times
• As factor is built-in, users cannot modify or build on it
• rotate options are very limited (only varimax and promax) and users cannot modify them, though they could access the eigenvectors (matrix get) and write their own
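The iteration-count issue can be illustrated with a short sketch of iterated principal factors in which the number of communality updates is an explicit argument. This is Python with NumPy, not Stata, and it is a generic textbook version of the method under stated assumptions (squared multiple correlations as starting communalities), not a reproduction of what factor, pfi does internally. The function name principal_factors, the n_iter parameter, and the example matrix are all hypothetical.

```python
import numpy as np

def principal_factors(R, n_factors, n_iter=3):
    """Principal-factor extraction with a user-chosen number of
    communality iterations (n_iter=0 gives the non-iterated solution)."""
    R = np.asarray(R, dtype=float)
    # Assumed starting values: squared multiple correlations (SMCs).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter + 1):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        h2 = (loadings ** 2).sum(axis=1)       # communalities implied by loadings
    return loadings, h2

# Hypothetical one-factor correlation matrix, for illustration only.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

# Stop after 3 communality updates, in the spirit of Gorsuch (1983).
loadings, h2 = principal_factors(R, n_factors=1, n_iter=3)
```

Exposing n_iter as an argument is the whole point of the sketch: the user, not the program, decides whether to stop after 3-4 updates or to iterate further.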
Exploratory Factor Analysis,
Continued
Issues/Limitations
• rotate is not well documented, so it is not clear if one could, e.g., rotate canonical correlations as suggested by Cliff & Krus (1976).
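A rotation routine that accepts any loading-type matrix would settle the question, since the algorithm itself does not care where the matrix came from. The sketch below is a common SVD-based formulation of raw varimax (no row normalization) in Python with NumPy; it is not Stata's rotate, and the function name varimax, its parameters, and the example matrix are hypothetical.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Raw varimax rotation of a (variables x factors) matrix L.
    Returns the rotated matrix and the orthogonal rotation matrix T."""
    L = np.asarray(L, dtype=float)
    p, k = L.shape
    T = np.eye(k)
    obj = 0.0
    for _ in range(max_iter):
        LR = L @ T
        # Gradient-like matrix of the varimax criterion.
        G = L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                             # nearest orthogonal matrix
        new_obj = s.sum()
        if new_obj < obj * (1.0 + tol):        # criterion stopped improving
            break
        obj = new_obj
    return L @ T, T

# Hypothetical unrotated loadings (or canonical weights), for illustration only.
L = np.array([[0.7,  0.4],
              [0.6,  0.5],
              [0.5, -0.5],
              [0.6, -0.4]])
rotated, T = varimax(L)
```

Because T is orthogonal, each row's sum of squared entries is unchanged by the rotation; only the distribution of the loadings across columns changes.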
Correspondence Analysis
Purpose: Data exploration and reduction of categorical data
Available in Stata
• User-written coranal (correspondence analysis)
• User-written mca (multiple correspondence analysis)
Issues/Limitations
• Graphics broken in Stata 8
• Statalist question as to whether mca is producing correct output
• Few variations implemented
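For readers who want to check output from a correspondence analysis program, the core computation is small enough to reproduce independently. The sketch below does simple (two-way) correspondence analysis through the singular value decomposition of the standardized residuals, in Python with NumPy. It is not the coranal or mca code, it covers only the basic variant, and the function name correspondence_analysis and the table of counts are hypothetical.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way table of counts.
    Returns row and column principal coordinates and principal inertias."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                            # correspondence matrix
    r = P.sum(axis=1)                          # row masses
    c = P.sum(axis=0)                          # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)     # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2                          # principal inertias
    return row_coords, col_coords, inertia

# Hypothetical 3 x 3 table of counts, for illustration only.
table = np.array([[20, 10,  5],
                  [10, 25, 10],
                  [ 5, 10, 30]])
row_coords, col_coords, inertia = correspondence_analysis(table)
```

A useful sanity check on any implementation is that the total inertia equals the Pearson chi-square statistic of the table divided by the total count.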
Optimal Scaling
Purpose: Data exploration, reduction, and transformation
Available in Stata
• None (that I’m aware of)
Issues/Limitations
Multidimensional Scaling
Purpose: Data exploration
Available in Stata
• None (that I’m aware of)
Issues/Limitations
Conclusion
• Stata is weak in “multivariate exploratory data analysis” procedures.
• Many existing procedures are inflexible and not extensible, or user-contributed and not currently maintained.
• Stata lags behind SPSS, SAS, S-Plus, and R in this area.