CATPAC & LIWC
Key output and findings
D.K. & B.L.
How CATPAC is Used
• Reads text to identify most important words
– Can determine patterns of similarity
– Produces simple frequency counts
• The neural network is self-organizing
– Finds patterns of usage between words
– Uses clustering algorithms
– Produces perceptual maps
TOTAL WORDS          300
TOTAL UNIQUE WORDS    25
TOTAL EPISODES       294
TOTAL LINES           60
THRESHOLD          0.000
RESTORING FORCE    0.100
CYCLES                 1
FUNCTION           Sigmoid (-1 - +1)
CLAMPING           Yes
DESCENDING FREQUENCY LIST

WORD       FREQ  PCNT  CASE FREQ  CASE PCNT
---------  ----  ----  ---------  ---------
I            47  15.7        201       68.4
A            28   9.3        153       52.0
MY           19   6.3         89       30.3
I'M          16   5.3         76       25.9
FOR          15   5.0         85       28.9
AM           14   4.7         86       29.3
BE           13   4.3         75       25.5
YOU          13   4.3         63       21.4
OUT          12   4.0         73       24.8
KNOW         10   3.3         62       21.1
HAVE          9   3.0         54       18.4
ME            9   3.0         51       17.3
ON            9   3.0         62       21.1
SOMEONE       9   3.0         59       20.1
WITH          9   3.0         58       19.7
LIFE          8   2.7         51       17.3
LOVE          8   2.7         46       15.6
NOT           8   2.7         45       15.3
SHOULD        7   2.3         42       14.3
SO            7   2.3         49       16.7
ABOUT         6   2.0         42       14.3
ALL           6   2.0         39       13.3
CAN           6   2.0         39       13.3
NO            6   2.0         41       13.9
WHAT          6   2.0         39       13.3
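The columns of the frequency list can be reproduced with a short sketch. The slides do not state the window settings; 300 words and 294 episodes would be consistent with a 7-word window slid one word at a time (300 - 7 + 1 = 294), so that is assumed here. The function name `window_stats` and its parameters are hypothetical, not CATPAC's own interface.

```python
from collections import Counter

def window_stats(tokens, window=7, slide=1):
    """Count word frequencies and per-episode ("case") frequencies.

    Assumption: an episode is a window of `window` consecutive words,
    moved forward `slide` words at a time.
    """
    episodes = [tokens[i:i + window]
                for i in range(0, len(tokens) - window + 1, slide)]
    freq = Counter(tokens)          # FREQ: raw occurrences of each word
    case_freq = Counter()           # CASE FREQ: episodes containing the word
    for episode in episodes:
        case_freq.update(set(episode))
    total = len(tokens)
    rows = []
    for word, f in freq.most_common():
        cf = case_freq[word]
        rows.append((word, f,
                     round(100 * f / total, 1),           # PCNT
                     cf,
                     round(100 * cf / len(episodes), 1)))  # CASE PCNT
    return len(episodes), rows

# Toy input (not the text analyzed in the slides), with a 3-word window.
n_episodes, rows = window_stats("i love my life i love you".split(), window=3)
for row in rows:
    print(row)
```

Under the same reading, the first row of the list checks out by hand: 47 / 300 gives 15.7 percent and 201 / 294 gives 68.4 percent.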
ALPHABETICALLY SORTED LIST

WORD       FREQ  PCNT  CASE FREQ  CASE PCNT
---------  ----  ----  ---------  ---------
A            28   9.3        153       52.0
ABOUT         6   2.0         42       14.3
ALL           6   2.0         39       13.3
AM           14   4.7         86       29.3
BE           13   4.3         75       25.5
CAN           6   2.0         39       13.3
FOR          15   5.0         85       28.9
HAVE          9   3.0         54       18.4
I            47  15.7        201       68.4
I'M          16   5.3         76       25.9
KNOW         10   3.3         62       21.1
LIFE          8   2.7         51       17.3
LOVE          8   2.7         46       15.6
ME            9   3.0         51       17.3
MY           19   6.3         89       30.3
NO            6   2.0         41       13.9
NOT           8   2.7         45       15.3
ON            9   3.0         62       21.1
OUT          12   4.0         73       24.8
SHOULD        7   2.3         42       14.3
SO            7   2.3         49       16.7
SOMEONE       9   3.0         59       20.1
WHAT          6   2.0         39       13.3
WITH          9   3.0         58       19.7
YOU          13   4.3         63       21.4
CATPAC frequencies
WARDS METHOD
A M H Y I N I A O S S W A N K W A M C L B S F L O
. Y A O ' O . B U O O I L O N H M E A O E H O I N
. . V U M T . O T . M T L . O A . . N V . O R F .
. . E . . . . U . . E H . . W T . . . E . U . E .
. . . . . . . T . . O . . . . . . . . . . L . . .
. . . . . . . . . . N . . . . . . . . . . D . . .
. . . . . . . . . . E . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
^^^ . . . . . . . . . . . . . . . . . . . . . . .
^^^^^ . . . . . . . . . . . . . . . . . . . . . .
^^^^^^^ . . . . . . . . . . . . . . . . . . . . .
^^^^^^^^^ . . . . . . . . . . . . . . . . . . . .
^^^^^^^^^^^ . . . . . . . . . . . . . . . . . . .
^^^^^^^^^^^^^ . . . . . . . . . . . . . . . . . .
^^^^^^^^^^^^^ . . . . . . . . . . . . . ^^^ . . .
^^^^^^^^^^^^^ . . . . . . . . . . . . . ^^^ . ^^^
^^^^^^^^^^^^^ . . . . . . . . . ^^^ . . ^^^ . ^^^
^^^^^^^^^^^^^ . . . . . . . ^^^ ^^^ . . ^^^ . ^^^
^^^^^^^^^^^^^ . . . . . ^^^ ^^^ ^^^ . . ^^^ . ^^^
^^^^^^^^^^^^^ ^^^ . . . ^^^ ^^^ ^^^ . . ^^^ . ^^^
^^^^^^^^^^^^^ ^^^ . . . ^^^ ^^^ ^^^ . . ^^^ ^^^^^
^^^^^^^^^^^^^ ^^^ . . . ^^^ ^^^ ^^^ ^^^ ^^^ ^^^^^
^^^^^^^^^^^^^ ^^^ . ^^^ ^^^ ^^^ ^^^ ^^^ ^^^ ^^^^^
^^^^^^^^^^^^^ ^^^^^ ^^^ ^^^ ^^^ ^^^ ^^^ ^^^ ^^^^^
^^^^^^^^^^^^^ ^^^^^ ^^^ ^^^ ^^^ ^^^^^^^ ^^^ ^^^^^
^^^^^^^^^^^^^ ^^^^^ ^^^ ^^^ ^^^ ^^^^^^^ ^^^^^^^^^
^^^^^^^^^^^^^ ^^^^^^^^^ ^^^ ^^^ ^^^^^^^ ^^^^^^^^^
^^^^^^^^^^^^^ ^^^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^^^
^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ ^^^^^^^ ^^^^^^^^^
^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dendrogram output
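The output above is labelled WARDS METHOD. As a rough illustration of how Ward's agglomerative clustering chooses which groups to join, here is a minimal pure-Python sketch. The slides do not say what numbers CATPAC feeds into the clustering, so the word vectors below are invented for illustration, and the function names are hypothetical. At each step the pair of clusters whose merge adds the least within-cluster variance is joined.

```python
from itertools import combinations

def centroid(rows):
    """Mean vector of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def ward_cost(a, b, vectors):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca = centroid([vectors[i] for i in a])
    cb = centroid([vectors[i] for i in b])
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_merges(labels, vectors):
    """Return the sequence of merges, cheapest first (naive search)."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(pair[0], pair[1], vectors))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append(([labels[i] for i in a], [labels[i] for i in b]))
    return merges

# Hypothetical profiles for four words (not taken from the slides).
labels = ["I", "MY", "LOVE", "LIFE"]
vectors = [[5, 4, 0, 1], [4, 5, 1, 0], [0, 1, 5, 4], [1, 0, 4, 5]]
merges = ward_merges(labels, vectors)
for left, right in merges:
    print(left, "+", right)
```

In a dendrogram such as the one above, words joined early (low merge cost) are connected furthest up the page, and the final merge spans the full width.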
CATPAC 3-D Perceptual Map
Operating Issues with CATPAC
• Exclude dictionary: the default must be amended and saved, or a new one created, in the correct format
• Text input: separating multiple texts requires inserting a slide barrier between them
• Refining the exclude list and analysis settings can be a long, incremental process
• The 3-D visualization becomes cluttered with larger numbers of terms
Linguistic Inquiry and Word Count (LIWC)
• Provides an effective method for studying the emotional, cognitive, structural, and process components present in individuals’ verbal and written speech
• Calculates the percentage of words that match each of up to 84 dimensions
• Generates output that is readable by SPSS or Excel
LIWC / output variables
• Text files, once formatted for entry, are processed
for up to 84 output variables, including:
– 17 standard linguistic dimensions (e.g., word count,
percentage of pronouns, articles)
– 25 word categories tapping psychological constructs
(e.g., affect, cognition)
– 10 dimensions related to "relativity" (time, space,
motion)
– 19 personal concern categories (e.g., work, home,
leisure activities)
LIWC / How to…
• For best results, prepare the text before analysis (correct misspellings, inappropriate words, abbreviations)
• Adjusting words can be tricky, e.g., US -> U.S.
• Sometimes used to analyze oral conversations/interviews: transcribe the speech to text first; the dictionary includes some “nonfluencies” (e.g., hm, uh, huh, um)
• Analyzes data one file at a time
• Files must be in TEXT or ASCII format; LIWC cannot read Word documents
• The longer the document, the better
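The preparation steps above could be scripted before a file is handed to LIWC. The following is only a sketch under assumptions: the replacement table and the function name `prepare_text` are hypothetical, and a real project would need its own list of corrections. It shows why an adjustment such as US -> U.S. is tricky: the pattern has to be case-sensitive so that the pronoun "us" is left alone.

```python
import re
import unicodedata

# Hypothetical correction table: regular expression -> replacement.
REPLACEMENTS = {
    r"\bUS\b": "U.S.",   # case-sensitive, so lowercase "us" is untouched
}

def prepare_text(raw, replacements=REPLACEMENTS):
    """Apply word-level corrections, then reduce the text to plain ASCII."""
    text = raw
    for pattern, replacement in replacements.items():
        text = re.sub(pattern, replacement, text)
    # Replace typographic quotes with their ASCII equivalents.
    text = (text.replace("\u2018", "'").replace("\u2019", "'")
                .replace("\u201c", '"').replace("\u201d", '"'))
    # Strip accents and drop anything that still is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    return text.encode("ascii", "ignore").decode("ascii")

cleaned = prepare_text("The US helped us, and I\u2019m glad.")
print(cleaned)
```

Text written entirely in capitals would still be ambiguous ("TELL US"), so corrections of this kind are worth reviewing by hand.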
LIWC / dictionaries
• Only counts words that are in the dictionaries
• Default dictionary: the internal Pennebaker dictionary (2300 words)
• But you can develop your own dictionary
• To use a dictionary you have created: choose “load new dictionary” from the “dictionary” menu
• Dictionaries have to be plain text files
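The dictionary-based counting described above can be sketched in a few lines. This is an illustration only: the two-category dictionary is invented, it is not in LIWC's own file format, and the treatment of a trailing asterisk as a word-stem wildcard is an assumption of this sketch. Words that appear in no category are simply not counted toward any dimension, and each score is the percentage of all words in the text that fall in that category.

```python
import re

# Hypothetical toy dictionary: category -> entries ("*" marks a stem).
TOY_DICTIONARY = {
    "pronoun": ["i", "my", "me", "you"],
    "affect": ["love", "happ*"],
}

def matches(word, entry):
    """Exact match, or prefix match when the entry ends with '*'."""
    if entry.endswith("*"):
        return word.startswith(entry[:-1])
    return word == entry

def category_percentages(text, dictionary):
    """Percentage of words in `text` that match each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = dict.fromkeys(dictionary, 0)
    for word in words:
        for category, entries in dictionary.items():
            if any(matches(word, entry) for entry in entries):
                counts[category] += 1
    total = len(words)
    return {category: round(100 * n / total, 1) if total else 0.0
            for category, n in counts.items()}

scores = category_percentages("I love my happy life", TOY_DICTIONARY)
print(scores)
```

Because scores are percentages of the total word count, very short texts produce coarse values, which fits the advice that longer documents work better.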
LIWC output with standard
linguistic dimensions