Principal Component Analysis - University of British Columbia
About Factor Analysis
February 5, 2007
Terminology
Factor analysis, as a generic term, refers to a family of statistical techniques concerned with reducing a set of observable variables to a small number of factors/components.
Factor Analysis
- Principal Component Analysis (PCA)
- Factor Analysis
  o Exploratory Factor Analysis (EFA)
  o Confirmatory Factor Analysis (CFA)
Principal Component Analysis
- A variable reduction technique
- Used when variables are highly correlated
- Reduces the number of observed variables to a smaller number of principal components which account for most of the variance of the observed variables
Exploratory Factor Analysis
- A variable reduction technique
- Identifies the number of underlying latent factors and the factor structure of a set of variables
- Provides an explanation of the relationships among observed variables in terms of factors
PCA and EFA
- Variable reduction methods
- Identify groups of observed variables that tend to hang together empirically
- Performed by PROC FACTOR
- Sometimes even provide very similar results
PCA vs. EFA
PCA:
- Identifies a smaller number of composite variables (principal components, artificial variables) to account for most of the variance present in the observed variables
- PCs retained account for a maximal amount of total variance of observed variables
EFA:
- Identifies the number of underlying latent factors (cannot be directly measured) and factor structure of a set of variables
- Factors account for common variance of the observed variables
PCA vs. EFA
PCA:
- Diagonals of the correlation matrix are equal to one
- Component scores are linear combinations of the observed variables weighted by eigenvectors
EFA:
- Diagonals of the correlation matrix are adjusted for unique factors (replaced by communalities)
- Observed variables are linear combinations of the underlying factors
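The contrast above can be sketched in equations. The notation is assumed here for illustration and is not taken from the slides: X_1, ..., X_p are the standardized observed variables, R is their correlation matrix with eigenvalue \ell_k and eigenvector e_k, b_{jm} are factor loadings, F_m are common factors, and U_j is the unique factor of variable j.

```latex
\begin{align*}
\text{PCA:}\quad & C_k = e_{1k}X_1 + e_{2k}X_2 + \cdots + e_{pk}X_p,
  \qquad R\,e_k = \ell_k\,e_k \\
\text{EFA:}\quad & X_j = b_{j1}F_1 + b_{j2}F_2 + \cdots + b_{jm}F_m + U_j
\end{align*}
```

In PCA the component is computed from the observed variables; in EFA the direction is reversed, and each observed variable is modeled as a combination of common factors plus a unique part.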
PCA vs. EFA
PCA:
    proc factor method=prin priors=one;
or
    proc princomp;
EFA:
    proc factor method=ml priors=smc;
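For reference, the statements above might look as follows once a data set and variable list are supplied. This is only a sketch: the data set name mydata and the variables x1-x10 are hypothetical placeholders, not names from the slides.

```sas
/* Sketch only: "mydata" and x1-x10 are hypothetical names. */

/* PCA through PROC FACTOR: principal-component extraction,
   prior communalities fixed at one */
proc factor data=mydata method=prin priors=one;
   var x1-x10;
run;

/* PCA through PROC PRINCOMP */
proc princomp data=mydata;
   var x1-x10;
run;

/* EFA: maximum likelihood extraction, squared multiple
   correlations as prior communality estimates */
proc factor data=mydata method=ml priors=smc;
   var x1-x10;
run;
```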
PRINCOMP vs. FACTOR
PROC PRINCOMP has the following advantages over PROC FACTOR:
- Slightly faster if a small number of components is requested.
- Can analyze somewhat larger problems in a fixed amount of memory.
- Can output scores from an analysis of a partial correlation or covariance matrix.
- Simpler to use.
PROC FACTOR has the following advantages over PROC PRINCOMP for PCA:
- Produces more output, including the scree (eigenvalue) plot, pattern matrix, and residual correlations.
- Does rotations.
Communalities
- Communality measures the proportion of the variance of an observed variable that is shared with the other variables;
- EFA begins by replacing the diagonal of the correlation matrix with what are called prior communality estimates;
- These are set by the PRIORS= option of the PROC FACTOR statement:
  o PRIORS=MAX uses the largest absolute correlation of a variable with any other variable as the communality estimate for that variable;
  o PRIORS=SMC uses the squared multiple correlation between the variable and all other variables;
  o With the default extraction method (METHOD=PRINCIPAL), SAS sets all prior communalities to 1.0, which is the same as requesting a PCA.
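Two standard identities may help make these quantities concrete. The notation is assumed for illustration: b_{jm} is the loading of variable j on factor m, and r^{jj} is the j-th diagonal element of the inverse correlation matrix. For uncorrelated factors, the final communality of a variable is the sum of its squared loadings, and the squared multiple correlation used by PRIORS=SMC can be read off the inverse correlation matrix. The numeric line uses the Q28 row of the VARIMAX solution in Table 3, so it holds only up to the rounding of the printed loadings.

```latex
\begin{align*}
h_j^2 &= \sum_{m=1}^{M} b_{jm}^2 \\
\mathrm{SMC}_j &= 1 - \frac{1}{r^{jj}}, \qquad r^{jj} = \left(R^{-1}\right)_{jj} \\
h_{28}^2 &\approx 0.76^2 + 0.05^2 + 0.21^2 + 0.27^2 = 0.6971
\end{align*}
```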
Confirmatory Factor Analysis
- Confirmatory factor analysis allows you to test very specific hypotheses regarding the number of factors, factor loadings, and factor intercorrelations;
- Performed by PROC CALIS;
- Specify the structure of three matrices a priori, before the data analysis:
  1) the factor loading matrix
  2) the factor intercorrelation matrix
  3) the unique variance matrix
- It is more complex to run than EFA;
- The assumptions underlying confirmatory factor analysis as well as the interpretation of the output can be exceedingly complex.
Proc Calis
    PROC CALIS METHOD = LSML ALL NOMOD ;
      VAR Item1-Item6;
      FACTOR HEYWOOD N = 2;
      /* N = 2 specifies a two factor solution ; */
      MATRIX _F_
        {1,1} = Item1F1 ( .80),  {1,2} = Item1F2 ( .20),
        {2,1} = Item2F1 ( .80),  {2,2} = Item2F2 ( .20),
        {3,1} = Item3F1 ( .80),  {3,2} = Item3F2 ( .20),
        {4,1} = Item4F1 ( .20),  {4,2} = Item4F2 ( .80),
        {5,1} = Item5F1 ( .20),  {5,2} = Item5F2 ( .80),
        {6,1} = Item6F1 ( .20),  {6,2} = Item6F2 ( .80) ;
      /* The matrix _F_, the factor loading matrix;*/
      Matrix _P_
        {1, 1} = 1.0,
        {2, 2} = 1.0,
        {2, 1} = .60 ;
      /* The _P_, the factor intercorrelation matrix; defaults to an identity matrix;*/
      Matrix _U_
        {1, 1} = Theta1-Theta6 6*.10 ;
      /* Matrix _U_ is the unique variance matrix; */
    RUN ;
Combining EFA and CFA
- Use EFA if you do not have strong theory about the underlying constructs (theory-generating); use CFA otherwise (theory-testing);
- On separate data sets: use an EFA to generate a theory about the underlying constructs, then test the structure of the extracted factors with a CFA;
- Example: The Coach-Athlete Relationship Questionnaire (CART-Q): development and initial validation.
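One way to obtain the separate data sets mentioned above is a random split of a single sample. The sketch below is an assumption about how this could be done, not the procedure used in the cited study; the names mydata, x1-x10, efa_half and cfa_half, the seed, and the choice of two factors are all hypothetical.

```sas
/* Sketch only: "mydata", x1-x10, the seed and nfactors=2 are hypothetical. */

/* Randomly flag half of the observations; OUTALL keeps every row
   and adds the indicator variable Selected (1 = sampled, 0 = not) */
proc surveyselect data=mydata out=split method=srs samprate=0.5
                  seed=20070205 outall;
run;

/* Route the two halves to separate data sets */
data efa_half cfa_half;
   set split;
   if selected = 1 then output efa_half;
   else output cfa_half;
run;

/* Theory-generating step on the first half */
proc factor data=efa_half method=ml priors=smc rotate=promax nfactors=2;
   var x1-x10;
run;

/* The structure suggested by the EFA would then be specified in
   PROC CALIS and fitted to cfa_half (theory-testing step). */
```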
Steps of PCA/EFA
1) Initial extraction of the components/factors;
2) Determining the number of “meaningful” components/factors to retain;
3) Rotation to a final solution;
4) Interpreting the rotated solution;
5) Creating factor scores or factor-based scores for further analysis;
6) Summarizing the results.
Determining the number of factors/components
- Kaiser-Guttman rule (eigenvalue-greater-than-one rule);
  o An eigenvalue represents the amount of variance that is accounted for by a given factor/component.
  o More appropriate for PCA than EFA. For EFA, it is adjusted by communalities.
  o PROC FACTOR has the MINEIGEN= option to specify the cutoff.
- Scree test;
  o Plot the eigenvalues against the corresponding factor numbers
  o Look for the “elbow”
- Cumulative proportion of variance explained, e.g., at least 70%;
- Interpretability of the factors/components extracted;
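The first three criteria map onto PROC FACTOR options. The following is a minimal sketch under assumed names (mydata and x1-x10 are hypothetical); interpretability still has to be judged by reading the output.

```sas
/* Sketch only: "mydata" and x1-x10 are hypothetical names. */

/* Kaiser-Guttman rule for a PCA: retain components whose eigenvalue
   exceeds 1, and request the scree plot to look for the elbow */
proc factor data=mydata method=prin priors=one mineigen=1 scree;
   var x1-x10;
run;

/* Alternative criterion: retain as many components as are needed
   to account for at least 70% of the variance */
proc factor data=mydata method=prin priors=one proportion=0.70;
   var x1-x10;
run;
```

When more than one of NFACTORS=, MINEIGEN= and PROPORTION= is given in the same run, PROC FACTOR retains the smallest number of factors that satisfies any of them, so it is usually clearer to apply the criteria in separate runs and compare.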
Interpretability
- Factor pattern matrix/factor loadings: in an orthogonal analysis, factor loadings are equivalent to bivariate correlations between the observed variables and the retained factors/components.
- A rule of thumb: factor loadings greater than .40 in absolute value are considered to be significant.
- Simple structure: each variable has relatively high factor loadings on only one component, and near zero loadings on the others.
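PROC FACTOR can make the .40 rule of thumb easier to apply when reading the printed loadings. A minimal sketch, assuming hypothetical names mydata and x1-x10 and an arbitrary choice of two factors:

```sas
/* Sketch only: "mydata", x1-x10 and nfactors=2 are hypothetical. */
proc factor data=mydata method=prin priors=one nfactors=2 rotate=varimax
            reorder      /* group variables by the factor they load on most */
            flag=.40;    /* mark loadings above .40 in absolute value with an asterisk */
   var x1-x10;
run;
```

With REORDER the variables that load mainly on the same factor are printed together, which makes departures from simple structure easier to spot.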
Factor Pattern Matrix: unrotated loadings of the questionnaire items on Factor1 to Factor4; the full matrix is given in Table 2.
Rotation to a Final Solution
- The factor pattern matrix is not unique.
- Rotating the reference axes of the factor solution simplifies the factor structure and makes the final solution easier to interpret.
- Orthogonal rotation: VARIMAX, EQUAMAX, ORTHOMAX…
- Oblique rotation: PROCRUSTES, PROMAX
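Why the pattern matrix is not unique can be sketched in matrix form. The symbols are assumptions for illustration: \Lambda is the loading matrix (variables by factors), \Psi is the diagonal matrix of unique variances, and T is any orthogonal matrix of matching size.

```latex
\begin{align*}
R &\approx \Lambda\Lambda^{\top} + \Psi \\
\Lambda^{*} &= \Lambda T, \qquad T T^{\top} = I \\
\Lambda^{*}\Lambda^{*\top} &= \Lambda T T^{\top} \Lambda^{\top} = \Lambda\Lambda^{\top}
\end{align*}
```

Every orthogonal T reproduces the correlations equally well, so a criterion such as VARIMAX is used to choose the T that makes the loadings easiest to read. Oblique methods such as PROMAX drop the orthogonality requirement, and the rotated factors are then allowed to correlate.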
Interpretation of Factors
- Identifying significant loadings, that is, determining what construct seems to be measured by factor 1, what construct seems to be measured by factor 2, …
  o Unique loadings of 0.40 and above, with cross-loading differences of at least 0.10
- Naming factors: what do these variables have in common?
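The loading criteria above can be checked mechanically once the loadings are in a data set. The sketch below is an illustration, not part of the original analysis: it types in three rows of the VARIMAX solution from Table 3 by hand, and the data set and variable names (loadings, flagged, primary, secondary, meets_rule) are hypothetical.

```sas
/* Sketch only: three rows copied by hand from Table 3; all names are hypothetical. */
data loadings;
   input item $ f1 f2 f3 f4;
   datalines;
Q3  0.28 0.11 0.28 0.67
Q8  0.49 0.14 0.25 0.44
Q20 0.14 0.92 0.13 0.11
;
run;

data flagged;
   set loadings;
   array f{4} f1-f4;
   array a{4} a1-a4;
   do k = 1 to 4;
      a{k} = abs(f{k});                     /* the rule uses absolute loadings */
   end;
   primary   = largest(1, of a1-a4);        /* highest absolute loading */
   secondary = largest(2, of a1-a4);        /* second highest absolute loading */
   /* 1 if the item loads at least .40 on one factor and the gap to the
      next loading is at least .10; ROUND guards against floating-point noise */
   meets_rule = (primary >= 0.40) and (round(primary - secondary, 0.01) >= 0.10);
   drop k a1-a4;
run;

proc print data=flagged noobs;
run;
```

Among these three rows, Q8 is the one that falls short of the cross-loading criterion (0.49 against 0.44), which is in line with Q8 being removed before Table 6.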
Factor Scores
- Assign scores to each subject to indicate where that subject stands on each retained factor/component for further analysis.
- A factor score is a linear composite of the optimally weighted observed variables. If requested, PROC FACTOR will compute each subject’s factor scores.
- A factor-based score is a linear composite of the variables that demonstrated meaningful loadings for the factor/component in question, i.e., their sum.
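The two kinds of score could be produced as follows. This is a sketch under assumed names: mydata, x1-x6, scores, based, fb1 and fb2 are hypothetical, and the assignment of x1-x3 to the first factor and x4-x6 to the second stands in for whatever the rotated solution suggests.

```sas
/* Sketch only: all data set and variable names are hypothetical. */

/* Factor scores: OUT= needs raw data and an explicit number of factors;
   the output data set holds the original variables plus Factor1 and Factor2 */
proc factor data=mydata method=ml priors=smc nfactors=2 rotate=varimax
            out=scores;
   var x1-x6;
run;

/* Factor-based scores: an unweighted sum of the items that showed
   meaningful loadings on each factor */
data based;
   set mydata;
   fb1 = sum(of x1-x3);
   fb2 = sum(of x4-x6);
run;
```

Note that the SUM function skips missing values, so a subject with unanswered items still receives a (lower) total; adding the items with the + operator instead would give a missing score in that case.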
Example: Childhood Trauma Questionnaire
- ARYS, N=438, 28 questions/measured variables;
- The scale ranged from 1 to 5:
  1 - never true
  2 - rarely true
  3 - sometimes true
  4 - often true
  5 - very often true
- Excluded Q 10, 16, 22;
- Reversed scores for Q 2, 5, 7, 13, 19, 26, 28;
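On a 1 to 5 scale, reverse scoring amounts to replacing each response x by 6 - x. A minimal sketch of that step follows; the variable names nq46_2 and so on are guessed from the labels in the loading tables, and trauma_rev is a hypothetical output name, so both should be read as assumptions.

```sas
/* Sketch only: variable names are guessed from the table labels;
   "trauma_rev" is a hypothetical data set name. */
data trauma_rev;
   set trauma;
   array rev{7} nq46_2 nq46_5 nq46_7 nq46_13 nq46_19 nq46_26 nq46_28;
   do i = 1 to 7;
      rev{i} = 6 - rev{i};   /* 1 <-> 5, 2 <-> 4, 3 stays 3; missing stays missing */
   end;
   drop i;
run;
```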
When I was growing up …
1. I didn’t have enough to eat.
2. I knew that there was someone to take care of me and protect me.
3. People in my family called me things like “stupid”, “lazy”, or “ugly”.
4. My parents were too drunk or high to take care of the family.
5. There was someone in my family who helped me feel that I was important or special.
6. I had to wear dirty clothes.
7. I felt loved.
8. I thought that my parents wished I had never been born.
9. I got hit so hard by someone in my family that I had to see a doctor or go to the hospital.
10. There was nothing I wanted to change about my family.
11. People in my family hit me so hard that it left me with bruises or marks.
12. …
26. There was someone to take me to the doctor if I needed it.
27. …
Correlation Matrix
Q#    1    2    3    4    5    6    7    8    9   11   12   13   14   15   17   18   19   20   21   23   24   25   26   27   28
 1 1.00
 2 0.43 1.00
 3 0.26 0.36 1.00
 4 0.36 0.32 0.30 1.00
 5 0.23 0.39 0.30 0.16 1.00
 6 0.49 0.29 0.30 0.40 0.21 1.00
 7 0.37 0.57 0.46 0.29 0.59 0.25 1.00
 8 0.29 0.41 0.53 0.29 0.37 0.26 0.60 1.00
 9 0.20 0.35 0.33 0.37 0.16 0.29 0.31 0.37 1.00
11 0.21 0.37 0.47 0.30 0.23 0.30 0.38 0.42 0.59 1.00
12 0.21 0.31 0.40 0.24 0.16 0.26 0.28 0.35 0.44 0.62 1.00
13 0.32 0.60 0.44 0.28 0.48 0.29 0.62 0.54 0.32 0.37 0.32 1.00
14 0.24 0.34 0.68 0.31 0.29 0.32 0.47 0.53 0.41 0.52 0.42 0.44 1.00
15 0.25 0.37 0.52 0.35 0.28 0.27 0.43 0.51 0.51 0.72 0.59 0.43 0.58 1.00
17 0.22 0.34 0.38 0.25 0.24 0.29 0.39 0.37 0.60 0.72 0.55 0.33 0.43 0.60 1.00
18 0.25 0.34 0.52 0.28 0.23 0.26 0.43 0.57 0.43 0.49 0.39 0.47 0.61 0.52 0.41 1.00
19 0.30 0.55 0.40 0.28 0.45 0.27 0.62 0.48 0.29 0.30 0.28 0.70 0.44 0.42 0.31 0.47 1.00
20 0.23 0.38 0.25 0.33 0.15 0.21 0.29 0.28 0.26 0.24 0.26 0.26 0.23 0.33 0.29 0.27 0.26 1.00
21 0.24 0.34 0.21 0.25 0.20 0.16 0.31 0.28 0.28 0.21 0.20 0.28 0.19 0.26 0.31 0.27 0.25 0.68 1.00
23 0.21 0.34 0.22 0.30 0.16 0.12 0.30 0.25 0.23 0.21 0.21 0.27 0.21 0.28 0.24 0.25 0.28 0.83 0.68 1.00
24 0.24 0.37 0.26 0.34 0.18 0.20 0.31 0.28 0.27 0.27 0.27 0.28 0.23 0.36 0.27 0.28 0.26 0.92 0.64 0.81 1.00
25 0.33 0.42 0.58 0.39 0.27 0.29 0.46 0.49 0.39 0.52 0.39 0.49 0.63 0.63 0.42 0.56 0.47 0.39 0.26 0.34 0.41 1.00
26 0.29 0.46 0.20 0.36 0.34 0.30 0.47 0.38 0.27 0.29 0.20 0.45 0.22 0.30 0.31 0.28 0.45 0.29 0.27 0.29 0.30 0.28 1.00
27 …
28 0.30 0.53 0.44 0.33 0.48 0.30 0.65 0.55 0.29 0.40 0.37 0.70 0.47 0.46 0.37 0.48 0.72 0.21 0.23 0.21 0.22 0.51 0.49 0.22 1.00
Method 1: EFA without Rotation
    PROC FACTOR DATA=trauma
        METHOD=ml      /* Method is maximum likelihood */
        SCREE          /* Scree plot of eigenvalues */
        PRIORS=smc;    /* Diagonals of the correlation matrix are equal to squared multiple correlations */
      VAR nq:;
    RUN;
Table 1: Eigenvalues of the Weighted Reduced Correlation Matrix.
Number  Eigenvalue  Difference  Proportion  Cumulative
     1       47.69       32.04        0.68        0.68
     2       15.65       10.99        0.22        0.90
     3        4.66        2.68        0.07        0.97
     4        1.99        1.16        0.03        1.00
     5        0.83        0.06        0.01        1.01
     6        0.77        0.20        0.01        1.02
     7        0.57        0.25        0.01        1.03
     8        0.32        0.12        0.00        1.04
     9        0.20        0.04        0.00        1.04
    10        0.17        0.04        0.00        1.04
    11        0.13        0.07        0.00        1.04
    12        0.06        0.03        0.00        1.04
    13        0.02        0.08        0.00        1.04
    14       -0.05        0.02        0.00        1.04
    15       -0.07        0.05        0.00        1.04
    16       -0.12        0.03        0.00        1.04
    17       -0.15        0.00        0.00        1.04
    18       -0.15        0.05        0.00        1.04
    19       -0.20        0.04        0.00        1.03
    20       -0.25        0.07        0.00        1.03
    21       -0.31        0.06        0.00        1.03
    22       -0.37        0.04       -0.01        1.02
    23       -0.41        0.04       -0.01        1.01
    24       -0.45        0.08       -0.01           …
    25       -0.53                   -0.01        1.00
Scree Plot
[Figure: Scree Plot of Eigenvalues; eigenvalues plotted against factor number]
Table 2: Unrotated Factor Pattern Matrix.
Label                 Factor1  Factor2  Factor3  Factor4
NQ46_1 LACK_FOOD         0.33     0.25    -0.15     0.05
NQ46_2 TAKE_CARE         0.52     0.37    -0.28     0.19
NQ46_3 FLY_TEASE         0.43     0.53     0.06    -0.37
NQ46_4 DRUNK_FLY         0.43     0.22    -0.01     0.01
NQ46_5 FELT_IMP          0.30     0.38    -0.33     0.10
NQ46_6 DIRTY_CLOTH       0.31     0.29     0.00     0.04
NQ46_7 LOVED             0.48     0.52    -0.35     0.07
NQ46_8 NVR_BORN          0.46     0.52    -0.13    -0.12
NQ46_9 HIT_HARD          0.42     0.41     0.31     0.17
NQ46_11 FLY_HIT          0.46     0.58     0.47     0.18
NQ46_12 PUNISH           0.41     0.43     0.34     0.10
NQ46_13 FLY_COOP         0.46     0.56    -0.39     0.08
NQ46_14 FLY_INSULT       0.43     0.60     0.11    -0.40
NQ46_15 PHY_ABUSE        0.54     0.54     0.30    -0.02
NQ46_17 NOTE_HURT        0.44     0.49     0.38     0.28
NQ46_18 FLY_HATE         0.45     0.52     0.04    -0.22
NQ46_19 FLY_CLOSE        0.44     0.53    -0.43     0.05
NQ46_20 TOUCH_SEX        0.91    -0.27    -0.01     0.00
NQ46_21 THREAT_SEX       0.67    -0.11    -0.06     0.10
NQ46_23 TRIED_SEX        0.81    -0.23    -0.06     0.02
NQ46_24 MOLEST           0.93    -0.28     0.01    -0.01
NQ46_25 EMOTION_AB       0.57     0.47     0.06    -0.26
NQ46_26 HELP_DOC         0.41     0.30    -0.26     0.27
NQ46_27 BELIEVE_AB       0.92    -0.26     0.02        …
NQ46_28 FLY_SUPPORT      0.42     0.62    -0.37    -0.02
Method 2: EFA with Orthogonal Rotation
    PROC FACTOR DATA=trauma
        METHOD=ml
        ROTATE=varimax   /* Orthogonal rotation method VARIMAX */
        N=4              /* 4 factors retained */
        OUT=factscore    /* Original data and factor scores output to dataset factscore */
        PRIORS=smc;
      VAR nq:;
    RUN;
Table 3: Rotated Factor Pattern matrix by orthogonal method VARIMAX (number of factors=4).
Label                 Factor1  Factor2  Factor3  Factor4
NQ46_1 LACK_FOOD         0.37     0.16     0.14     0.13
NQ46_2 TAKE_CARE         0.62     0.26     0.22     0.11
NQ46_3 FLY_TEASE         0.28     0.11     0.28     0.67
NQ46_4 DRUNK_FLY         0.26     0.26     0.23     0.20
NQ46_5 FELT_IMP          0.58     0.07     0.09     0.12
NQ46_6 DIRTY_CLOTH       0.27     0.12     0.25     0.17
NQ46_7 LOVED             0.71     0.16     0.19     0.26
NQ46_8 NVR_BORN          0.49     0.14     0.25     0.44
NQ46_9 HIT_HARD          0.19     0.15     0.62     0.17
NQ46_11 FLY_HIT          0.19     0.10     0.83     0.27
NQ46_12 PUNISH           0.16     0.13     0.61     0.24
NQ46_13 FLY_COOP         0.76     0.12     0.17     0.25
NQ46_14 FLY_INSULT       0.27     0.07     0.33     0.73
NQ46_15 PHY_ABUSE        0.25     0.20     0.63     0.42
NQ46_17 NOTE_HURT        0.23     0.13     0.76     0.13
NQ46_18 FLY_HATE         0.33     0.12     0.33     0.53
NQ46_19 FLY_CLOSE        0.76     0.12     0.11     0.26
NQ46_20 TOUCH_SEX        0.14     0.92     0.13     0.11
NQ46_21 THREAT_SEX       0.23     0.64     0.14     0.04
NQ46_23 TRIED_SEX        0.18     0.82     0.09     0.08
NQ46_24 MOLEST           0.14     0.95     0.14     0.12
NQ46_25 EMOTION_AB       0.32     0.26     0.33     0.58
NQ46_26 HELP_DOC         0.56     0.21     0.20    -0.02
NQ46_27 BELIEVE_AB       0.13     0.92     0.15        …
NQ46_28 FLY_SUPPORT      0.76     0.05     0.21     0.27
Table 4: Rotated Factor Pattern matrix by orthogonal method VARIMAX (number of factors=5).
Label                 Factor1  Factor2  Factor3  Factor4  Factor5
NQ46_1 LACK_FOOD         0.27     0.12     0.08     0.10     0.59
NQ46_2 TAKE_CARE         0.59     0.26     0.21     0.09     0.25
NQ46_3 FLY_TEASE         0.26     0.10     0.27     0.66     0.17
NQ46_4 DRUNK_FLY         0.19     0.24     0.19     0.18     0.44
NQ46_5 FELT_IMP          0.57     0.07     0.09     0.12     0.09
NQ46_6 DIRTY_CLOTH       0.16     0.07     0.18     0.14     0.68
NQ46_7 LOVED             0.70     0.16     0.19     0.25     0.13
NQ46_8 NVR_BORN          0.48     0.14     0.25     0.43     0.12
NQ46_9 HIT_HARD          0.16     0.14     0.60     0.16     0.20
NQ46_11 FLY_HIT          0.18     0.10     0.83     0.26     0.11
NQ46_12 PUNISH           0.16     0.13     0.61     0.23     0.10
NQ46_13 FLY_COOP         0.75     0.13     0.17     0.24     0.13
NQ46_14 FLY_INSULT       0.26     0.06     0.32     0.72     0.15
NQ46_15 PHY_ABUSE        0.25     0.20     0.63     0.42     0.08
NQ46_17 NOTE_HURT        0.21     0.13     0.75     0.13     0.13
NQ46_18 FLY_HATE         0.33     0.13     0.33     0.53     0.11
NQ46_19 FLY_CLOSE        0.76     0.13     0.11     0.25     0.11
NQ46_20 TOUCH_SEX        0.12     0.92     0.12     0.11     0.12
NQ46_21 THREAT_SEX       0.21     0.64     0.13     0.03     0.11
NQ46_23 TRIED_SEX        0.17     0.82     0.09     0.08     0.05
NQ46_24 MOLEST           0.12     0.94     0.14     0.12     0.11
NQ46_25 EMOTION_AB       0.31     0.25     0.32     0.57     0.16
NQ46_26 HELP_DOC         0.52     0.20     0.19    -0.04     0.26
NQ46_27 BELIEVE_AB       0.10     0.92     0.14     0.14        …
NQ46_28 FLY_SUPPORT      0.76     0.06     0.21     0.26     0.13
Method 3: EFA with Oblique Rotation
    PROC FACTOR DATA=trauma
        METHOD=ml
        ROTATE=promax    /* Oblique rotation method PROMAX */
        N=5              /* 5 factors retained */
        OUT=factscore
        PRIORS=smc;
      VAR nq:;
    RUN;
Table 5: Rotated Factor Pattern matrix by oblique method PROMAX.
Label                 Factor1  Factor2  Factor3  Factor4  Factor5
NQ46_1 LACK_FOOD         0.14     0.00    -0.08     0.03     0.62
NQ46_2 TAKE_CARE         0.59     0.14     0.08    -0.09     0.14
NQ46_3 FLY_TEASE         0.02    -0.02     0.01     0.74     0.07
NQ46_4 DRUNK_FLY         0.02     0.14     0.06     0.11     0.43
NQ46_5 FELT_IMP          0.63    -0.02    -0.03    -0.01    -0.01
NQ46_6 DIRTY_CLOTH      -0.04    -0.07     0.05     0.07     0.73
NQ46_7 LOVED             0.72     0.03     0.01     0.11    -0.02
NQ46_8 NVR_BORN          0.38     0.02     0.06     0.38    -0.01
NQ46_9 HIT_HARD         -0.01     0.02     0.63     0.00     0.11
NQ46_11 FLY_HIT         -0.03    -0.05     0.90     0.08    -0.03
NQ46_12 PUNISH          -0.02     0.02     0.63     0.10    -0.01
NQ46_13 FLY_COOP         0.79    -0.01    -0.01     0.09    -0.03
NQ46_14 FLY_INSULT       0.00    -0.07     0.06     0.81     0.04
NQ46_15 PHY_ABUSE        0.04     0.07     0.57     0.32    -0.08
NQ46_17 NOTE_HURT        0.06    -0.01     0.83    -0.09     0.01
NQ46_18 FLY_HATE         0.15     0.01     0.14     0.53    -0.01
NQ46_19 FLY_CLOSE        0.81     0.00    -0.08     0.12    -0.04
NQ46_20 TOUCH_SEX       -0.04     0.95    -0.02     0.02     0.02
NQ46_21 THREAT_SEX       0.13     0.64     0.05    -0.09     0.01
NQ46_23 TRIED_SEX        0.07     0.85    -0.03    -0.02    -0.06
NQ46_24 MOLEST          -0.04     0.98     0.00     0.03     0.00
NQ46_25 EMOTION_AB       0.08     0.14     0.10     0.59     0.04
NQ46_26 HELP_DOC         0.55     0.10     0.12    -0.24     0.18
NQ46_27 BELIEVE_AB      -0.06     0.95     0.01     0.05        …
NQ46_28 FLY_SUPPORT      0.79    -0.09     0.04     0.11    -0.03
Plots of Factor Pattern for Factor1 and Factor2 (horizontal axis: Factor1)
[Figure 1: without Rotation]
[Figure 2: with Orthogonal Rotation]
[Figure 3: with Oblique Rotation]
Table 6: Rotated Factor Pattern matrix by oblique method PROMAX after removing Q8.
Label                 Factor1  Factor2  Factor3  Factor4  Factor5
NQ46_1 LACK_FOOD         0.00     0.13    -0.08     0.03     0.63
NQ46_2 TAKE_CARE         0.14     0.58     0.08    -0.08     0.14
NQ46_3 FLY_TEASE        -0.02     0.03     0.03     0.72     0.07
NQ46_4 DRUNK_FLY         0.14     0.01     0.06     0.12     0.43
NQ46_5 FELT_IMP         -0.02     0.62    -0.03    -0.01    -0.01
NQ46_6 DIRTY_CLOTH      -0.07    -0.04     0.05     0.08     0.72
NQ46_7 LOVED             0.03     0.71     0.02     0.10    -0.01
NQ46_9 HIT_HARD          0.02    -0.01     0.64     0.00     0.11
NQ46_11 FLY_HIT         -0.05    -0.03     0.90     0.08    -0.04
NQ46_12 PUNISH           0.02    -0.01     0.63     0.10    -0.02
NQ46_13 FLY_COOP        -0.01     0.79    -0.01     0.09    -0.03
NQ46_14 FLY_INSULT      -0.07     0.01     0.07     0.80     0.04
NQ46_15 PHY_ABUSE        0.07     0.04     0.58     0.31    -0.08
NQ46_17 NOTE_HURT       -0.01     0.06     0.84    -0.09     0.01
NQ46_18 FLY_HATE         0.01     0.15     0.15     0.51    -0.01
NQ46_19 FLY_CLOSE        0.00     0.82    -0.09     0.13    -0.05
NQ46_20 TOUCH_SEX        0.95    -0.04    -0.01     0.02     0.02
NQ46_21 THREAT_SEX       0.64     0.13     0.05    -0.09     0.02
NQ46_23 TRIED_SEX        0.85     0.07    -0.03    -0.01    -0.06
NQ46_24 MOLEST           0.98    -0.04     0.00     0.03     0.00
NQ46_25 EMOTION_AB       0.14     0.09     0.10     0.59     0.04
NQ46_26 HELP_DOC         0.10     0.54     0.12    -0.24     0.18
NQ46_27 BELIEVE_AB       0.95    -0.06     0.01     0.05        …
NQ46_28 FLY_SUPPORT     -0.09     0.80     0.04     0.12    -0.03
Table 7: Inter-Factor Correlations by oblique method PROMAX after removing Q8.
Variable  Factor1  Factor2  Factor3  Factor4  Factor5
Factor1         1     0.37     0.37     0.33     0.36
Factor2                  1     0.49     0.56     0.48
Factor3                           1     0.61     0.42
Factor4                                    1     0.37
Factor5                                             1
Table 8: Scoring Direction for Childhood Trauma Questionnaire.
subscale           variable label  Q# assigned manually  Q# from EFA
Emotional Abuse    EA              3,8,14,18,25          FA4: 3,14,18,25
Physical Abuse     PA              9,11,12,15,17         FA3: 9,11,12,15,17
Sexual Abuse       SA              20,21,23,24,27        FA1: 20,21,23,24,27
Emotional Neglect  EN              5,7,13,19,28          FA2: 2,5,7,13,19,28,26
Physical Neglect   PN              1,2,4,6,26            FA5: 1,4,6
PCA or EFA
- PCA deals with correlated variables, with the purpose of reducing the number of variables and explaining a large amount of the variance with a few variables
- EFA estimates factors, underlying constructs that cannot be measured directly
- Do not run both. Select the appropriate analysis first.
Assumptions & Limitations
- No outliers;
- Variables have to be correlated, interval-scaled;
- Linearity;
- Normal distribution;
- Sufficient number of observed variables;
- Sufficient number of observations to provide reliable estimations of the correlations;
- Sometimes arbitrary and subjective decisions have to be made.
Related Topics
- Nonlinear factor analysis;
- Factor analysis of ordinal/categorical variables;
- Principal components of qualitative data (PROC PRINQUAL);
- Assess reliability by computing coefficient alpha: an index of internal consistency reliability.
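Coefficient alpha for one subscale can be requested from PROC CORR. The sketch below uses the Physical Abuse items listed in Table 8 (Q 9, 11, 12, 15, 17); the variable names are guessed from the labels in the loading tables and should be treated as assumptions.

```sas
/* Sketch only: variable names are guessed from the table labels. */
proc corr data=trauma alpha nomiss;   /* ALPHA prints Cronbach's coefficient alpha;
                                         NOMISS drops subjects with any missing item */
   var nq46_9 nq46_11 nq46_12 nq46_15 nq46_17;
run;
```

For subscales that contain reverse-worded items, the reversed versions of those items should be used, otherwise alpha is understated.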
References
- Factor Analysis Using SAS PROC FACTOR, by the University of Texas at Austin Statistical Services. http://www.ats.ucla.edu/stat/sas/library/factor_ut.htm
- Principal Component Analysis vs. Exploratory Factor Analysis, by Diana D. Suhr, University of Northern Colorado. http://support.sas.com/publishing/pubcat/chaps/55129.pdf
- A tutorial on Principal Components Analysis, by Lindsay I Smith.
- Basic concepts and procedures of confirmatory factor analysis, by Connie D. Stapleton.
- SAS/STAT OnlineDoc 9.1.3.
- Sophia Jowett, Nikos Ntoumanis (2004), The Coach-Athlete Relationship Questionnaire (CART-Q): development and initial validation. http://www.utexas.edu/its/rc/answers/sas/sas26.html
Thank you …
Most skiing accidents happen on sunny days on easy slopes. The percentage of head injuries in skiing has gone up. The current injury rate in Scotland is 2.24 injuries per 1000 skier/boarder days; 1.74 injuries per 1000 skier days; 3.55 injuries per 1000 boarder days.
Alpine skiers are three times more likely to be involved in a collision with other people than snowboarders. Both drivers and passengers in SUVs are more likely to die in accidents than those in compact cars.
Traffic accidents account for about 10,000 deaths a year in Japan compared to 30,000+ deaths due to suicide.