CS 424P/ LINGUIST 287 Extracting Social Meaning and Sentiment Dan Jurafsky Lecture 5: Romantic Interest and Personality.

CS 424P/ LINGUIST 287
Extracting Social Meaning and Sentiment
Dan Jurafsky
Lecture 5: Romantic Interest and Personality
Joint work with:
Rajesh Ranganath,
Dan McFarland
Dan Jurafsky, Rajesh Ranganath, and Dan McFarland. 2009.
Extracting Social Meaning: Identifying Interactional Style in
Spoken Conversation. Proceedings of NAACL HLT 2009.
Rajesh Ranganath, Dan Jurafsky, and Dan McFarland. 2009.
It's Not You, it's Me: Detecting Flirting and its Misperception
in Speed-Dates. EMNLP-2009
Detecting social meaning:
our study
 Given speech and text from a conversation
 Can we detect ‘styles’, like whether a speaker is
 Awkward?
 Flirtatious?
 Friendly?
 Can we tell if the speakers like each other?
 Dataset:
 991 4-minute “speed-dates”
 Each participant rated their partner and themselves for
these styles
Speed dating
Our speed date setup
What do you do for fun? Dance?
Uh, dance, uh, I like to go, like camping. Uh, snowboarding, but I'm not
good, but I like to go anyway.
You like boarding.
Yeah. I like to do anything. Like I, I'm up for anything.
Really?
Yeah.
Are you open-minded about most everything?
Not everything, but a lot of stuff.
What is not everything? [laugh]
I don't know. Think of something, and I'll say if I do it or not. [laugh]
Okay. [unintelligible].
Skydiving. I wouldn't do skydiving I don't think.
Yeah I'm afraid of heights.
F: Yeah, yeah, me too.
M: [laugh] Are you afraid of heights?
F: [laugh] Yeah [laugh]
The SpeedDate corpus
 991 4-minute dates
 3 events, each with ~20x20=400 dates, some data loss
 Participants: graduate student volunteers in 2005
 participated in return for the chance to date
 Speech
 ~60 hours, from shoulder sash recorders; high noise
 Transcripts
 ~800K words, hand-transcribed, w/turn boundary times
 Surveys
 (Pre-test surveys, event scorecards, post-test surveys)
 Date perceptions and follow-up interest
 General attitudes, preferences, demographics
 Largest experiment with audio, text, + survey info
What we attempted to predict
 Conversational style:
 How often did you behave in the following ways
on this date?
 How often did they behave in the following ways
on this date?
 On a scale of 1-10 (1=never, 10=constantly)
1. flirtatious
2. friendly
3. awkward
4. assertive
Features
 Prosodic
 pitch (min, mean, max, std)
 intensity (min, max, mean, std)
 duration of turn
 rate of speech (words per second)
 Dialog
 questions
 backchannels (“uh-huh”, “yeah”)
 appreciations (“Wow!”, “That’s great!”)
 Lexical
 negative emotion (bad, weird, crazy, hate) words
 storytelling words (past tense) + food words (eat, dinner)
 love and sexual/emotional words (love, passionate, screw)
 personal pronouns (I, you, we, us)
Features extracted within turns
[Figure: F0 max and F0 min marked within each turn]
Features: Pitch
 F0 min, max, mean
 Thus to compute, e.g., F0 min for a conversation side
 Take F0 min of each turn (not counting zero values)
 Average over all turns in the side
 “F0 min, F0 max, F0 mean”
 We also compute measures of variation
 Standard deviation, pitch range
 F0 min sd, F0 max sd, F0 mean sd
 pitch range = (f0 max – f0 min)
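The per-side pitch computation above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `side_pitch_features` and the input layout (one list of F0 values per turn, with 0.0 marking unvoiced frames) are hypothetical, and the standard deviation is taken as the population form, which the slides do not specify.

```python
from statistics import mean, pstdev

def side_pitch_features(turns):
    """turns: one list of F0 values (Hz) per turn; 0.0 marks unvoiced frames (assumed layout)."""
    mins, maxs, means = [], [], []
    for f0 in turns:
        voiced = [v for v in f0 if v > 0]  # do not count zero values
        if not voiced:
            continue  # skip turns with no voiced frames
        mins.append(min(voiced))
        maxs.append(max(voiced))
        means.append(mean(voiced))
    f0_min, f0_max = mean(mins), mean(maxs)  # average over all turns in the side
    return {
        "f0_min": f0_min,
        "f0_max": f0_max,
        "f0_mean": mean(means),
        "f0_min_sd": pstdev(mins),   # population sd is an assumption
        "f0_max_sd": pstdev(maxs),
        "f0_mean_sd": pstdev(means),
        "pitch_range": f0_max - f0_min,
    }

# Toy side with two turns
feats = side_pitch_features([[0.0, 110.0, 130.0], [100.0, 0.0, 150.0]])
```

Here the two turns have minima 110 and 100 and maxima 130 and 150, so the side-level F0 min is 105, F0 max is 140, and pitch range is 35.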
Features: Other Prosodic
 Intensity min, max, mean, std
 computed as for pitch
 Duration of turn
 Total time for conversation side
 Rate of speech (words per second)
Prosodic features
Dialog act features
 Questions: # of questions in side
 Laughter: # of instances of laughter in side
 Turns: total # of turns in a side
 Backchannels: # of backchannels in side
 Uh-huh. Yeah. Right. Oh, okay.
 Appreciations: # of appreciations in side
 Wow. That’s true. Oh, great! Oh, gosh!
 Regular expressions drawn from hand-labeled
Switchboard Dialogue Act Corpus (Jurafsky, Biasca,
Shriberg 1997)
Appreciations
Wow.
Oh, wow.
That's great.
That's good.
That's right.
Oh, no.
Oh, my goodness.
That's true.
Well, that's good.
Oh, that's great.
Oh, gosh.
Great.
Good.
Oh, my.
Oh, that's good.
Oh, great!
Oh, boy.
I know.
Oh, yeah.
Backchannels
Uh-huh
Yeah
Right
Oh
Yes
Huh
Oh, yeah
Okay
Sure
Really
Oh, really
I see
yep
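A regular-expression counter for these dialog acts might look like the sketch below. The patterns are abbreviated, hypothetical stand-ins built from a few of the phrases listed here, not the actual expressions drawn from Switchboard; treating any other turn that ends in "?" as a question, and counting "[laugh]" tokens as laughter, are likewise assumptions.

```python
import re

# Hypothetical, abbreviated patterns; the real expressions are not given in the slides.
BACKCHANNEL = re.compile(
    r"(uh-huh|yeah|right|oh|yes|huh|oh, yeah|okay|sure|really|oh, really|i see|yep)[.!?]?",
    re.IGNORECASE,
)
APPRECIATION = re.compile(
    r"(oh, )?(wow|great|good|gosh|that's (great|good|true))[.!]?",
    re.IGNORECASE,
)

def count_dialog_acts(turns):
    """Count dialog-act features over the turns of one conversation side."""
    counts = {"backchannels": 0, "appreciations": 0, "questions": 0, "laughter": 0}
    for turn in turns:
        text = turn.replace("’", "'").strip()  # normalize curly apostrophes
        counts["laughter"] += text.lower().count("[laugh]")
        if BACKCHANNEL.fullmatch(text):        # whole turn is a backchannel
            counts["backchannels"] += 1
        elif APPRECIATION.fullmatch(text):     # whole turn is an appreciation
            counts["appreciations"] += 1
        elif text.endswith("?"):               # crude question test (assumption)
            counts["questions"] += 1
    return counts

side = ["Uh-huh.", "Oh, wow.", "Really?", "That’s great!",
        "What do you do for fun?", "I like camping. [laugh]"]
acts = count_dialog_acts(side)
```

On this toy side the counter finds two backchannels, two appreciations, one question, and one laugh. Checking backchannels first means "Really?" is not also counted as a question.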
Clarifications
I’ve been goofing off big time.
You’ve been what?
I’ve been goofing off big time.
Collaborative Completion
 a turn where a speaker completes the utterance
begun by the alter (Lerner, 1991; Lerner, 1996).
And I’m wearing a yellow shirt
And black pants
Heuristic:
 the first word of sentence i is predictable from the
last two words of sentence i-1
 (using a trigram grammar trained on Switchboard)
Dialog feature:
Collaborative Completion
 Heuristic: first word of sentence i is predictable from
last two words of sentence i-1
 Result: Tends to find “locally coherent phrasal answers”
 M: What year did you graduate?
 F: From high school?
 F: What department are you in?
 M: The business school.
 But not:
 F: What department are you in?
 M: I’m in the teacher education program.
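The trigram heuristic can be illustrated with a small count-based model. Everything specific here is an assumption for illustration: the three-sentence toy corpus stands in for Switchboard, the maximum-likelihood estimate has no smoothing, and the 0.1 probability threshold is hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))

def train_trigrams(sentences):
    """Count trigrams and their two-word contexts."""
    tri, ctx = Counter(), Counter()
    for sentence in sentences:
        words = tokenize(sentence)
        for a, b, c in zip(words, words[1:], words[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1
    return tri, ctx

def is_completion(prev_turn, next_turn, tri, ctx, threshold=0.1):
    """True if the first word of next_turn is predictable from the last two words of prev_turn."""
    prev, nxt = tokenize(prev_turn), tokenize(next_turn)
    if len(prev) < 2 or not nxt:
        return False
    context = (prev[-2], prev[-1])
    if ctx[context] == 0:
        return False  # unseen context: no evidence either way
    return tri[context + (nxt[0],)] / ctx[context] >= threshold

# Toy stand-in for the Switchboard training data
toy_corpus = ["are you in the business school", "are you in the lab", "were you in the city"]
tri, ctx = train_trigrams(toy_corpus)

found = is_completion("What department are you in?", "The business school.", tri, ctx)
missed = is_completion("What department are you in?", "I'm in the teacher education program.", tri, ctx)
```

With this toy model, "The business school." is flagged because "the" always follows "you in" in the training text, while the full-sentence answer is not.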
Disfluency features
 UH/UM:
# of filled pauses (uh or um) in side
 M: Um, eventually, yeah, but right now I want to get some more experience,
uh, in research.
 F: Oh.
 M: Uh, so I will probably work for, uh, a research lab for, uh, big companies.
 RESTART:
# of disfluent restarts in side
 Uh, I–there’s a group of us that came in–
 OVERLAP: # of turns in side where speakers overlapped
 M: But-and also obviously–
F: It sounds bigger.
 M: –people in the CS school are not quite as social in general as other–
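The UH/UM count is the simplest of these to sketch. The pattern below is an assumption, not the one used in the study; the negative lookahead keeps the backchannel "uh-huh" from being counted as a filled pause.

```python
import re

# "uh" or "um" as a whole word, but not the first half of "uh-huh" (assumed pattern)
FILLED_PAUSE = re.compile(r"\b(?:uh|um)\b(?!-)", re.IGNORECASE)

def count_filled_pauses(turns):
    """Number of filled pauses (uh or um) across the turns of one side."""
    return sum(len(FILLED_PAUSE.findall(turn)) for turn in turns)

male_turns = [
    "Um, eventually, yeah, but right now I want to get some more experience, uh, in research.",
    "Uh, so I will probably work for, uh, a research lab for, uh, big companies.",
]
n_pauses = count_filled_pauses(male_turns)
```

For the two male turns from the example above, this counts five filled pauses.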

Livejournal.com:
I, me, my on or after Sep 11, 2001
Cohn, Mehl, Pennebaker. 2004. Linguistic markers of psychological change
surrounding September 11, 2001. Psychological Science 15, 10: 687-693.
Graph from Pennebaker slides
[Figure: y-axis ticks from 5.8 to 7.2; x-axis ticks B, s12 through s24, o2-o8, o16-o22, o30-n5]
September 11 LiveJournal.com study:
We, us, our
Cohn, Mehl, Pennebaker. 2004. Linguistic markers of psychological change
surrounding September 11, 2001. Psychological Science 15, 10: 687-693.
Graph from Pennebaker slides
[Figure: y-axis ticks from .5 to 1.1; x-axis ticks B, s12 through s24, o2-o8, o16-o22, o30-n5]
LiveJournal.com September 11, 2001 study:
Positive and negative emotion words
Cohn, Mehl, Pennebaker. 2004. Linguistic markers of psychological change
surrounding September 11, 2001. Psychological Science 15, 10: 687-693.
Graph from Pennebaker slides
LIWC
 Linguistic Inquiry and Word Count
 Pennebaker, Francis, & Booth, 2001
 dictionary of 2300 words grouped into > 70 classes
 negative emotion (bad, weird, hate, problem, tough)
 sexual (love, loves, lover, passion, passionate, sex, …)
 1st person pronouns (I me mine myself I’d I’ll I’m…)
 2nd person pronouns (you, you’d you’ll your you’ve…)
 ingest (food, eat, eats, cook, dinner, drink, restaurant…)
 swear (hell, sucks, damn, fuck,…)
 …
 after 9/11
 greater negative emotion
 more socially engaged
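LIWC-style features amount to counting how many tokens fall in each word class. The sketch below uses a tiny toy lexicon made only of words shown on the slide; the category keys and the exact-match lookup are assumptions (the real dictionary is much larger).

```python
import re

# Toy lexicon: a few words per class from the slide, not the real LIWC dictionary
LEXICON = {
    "negemo": {"bad", "weird", "hate", "problem", "tough"},
    "sexual": {"love", "loves", "lover", "passion", "passionate", "sex"},
    "i": {"i", "me", "mine", "myself", "i'd", "i'll", "i'm"},
    "you": {"you", "you'd", "you'll", "your", "you've"},
    "ingest": {"food", "eat", "eats", "cook", "dinner", "drink", "restaurant"},
}

def category_counts(text):
    """Return (count per category, percentage of tokens per category)."""
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    counts = {cat: sum(1 for t in tokens if t in words) for cat, words in LEXICON.items()}
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    percents = {cat: 100.0 * n / total for cat, n in counts.items()}
    return counts, percents

counts, percents = category_counts("Well, I love to cook. I hate bad food.")
```

The example sentence has nine tokens: two negative-emotion words, one "sexual" word (love), two first-person pronouns, and two ingest words.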
Lexical features
Domain-specific lexical features
via an autoencoder
 Our first paper showed lexical features help
 but not as much as prosodic or dialog features
 Better: data-driven lexical features?
 Pilot experiment: Using only Naïve Bayes with word existence
features works better than chance
 How do we extract lexical features that we can combine
with the previous features?
 Intuition:
 Create multinomial vector of all words with counts
 Use dimensionality reduction to create a 30-dimensional
vector
 Use these 30 dimensions as 30 features
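The first step of the intuition, a count vector over the most common words, could be built roughly as follows. This is a sketch under assumptions: the `S_`/`O_` prefixes are taken here to mean words said by the speaker and by the other participant, the helper names are hypothetical, and the reduction to 30 dimensions is not shown.

```python
from collections import Counter

def build_vocab(conversations, k):
    """k most common words over all sides of all conversations."""
    counts = Counter(w for conv in conversations for side in conv for w in side)
    return [w for w, _ in counts.most_common(k)]

def lexical_vector(speaker_words, other_words, vocab):
    """Word counts for the speaker (S_) followed by counts for the other (O_)."""
    s, o = Counter(speaker_words), Counter(other_words)
    names = [f"S_{w}" for w in vocab] + [f"O_{w}" for w in vocab]
    values = [s[w] for w in vocab] + [o[w] for w in vocab]
    return names, values

# One toy conversation: (speaker tokens, other tokens)
conversations = [(["i", "like", "dating", "i"], ["my", "advisor", "i"])]
vocab = build_vocab(conversations, k=5)
names, values = lexical_vector(*conversations[0], vocab)
features = dict(zip(names, values))
```

With a vocabulary of k words the vector has 2k entries, one block per participant, which is the input a dimensionality-reduction step would then compress.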
Dimensionality reduction:
autoencoders
 Goal: Reduce the lexical
information in the
document to a smaller
number of features.
 Autoencoders have
been shown to perform
better than other
compressive techniques
(G. E. Hinton and R. R.
Salakhutdinov, 2006).
Autoencoder
 A deep belief network (Hinton and Salakhutdinov
2006, Hinton 2007) used to form compact
representations of an input space
 The input space, for each conversation:
 multinomial distribution (1000 most common words) for
words used by each speaker x 2
 Two phases of training:
 Pretraining: Use contrastive divergence to train
hierarchical RBMs to find a good initial point
 Fine-tuning: Use backpropagation to fine-tune the
weights
Autoencoder stages
Pre-processing before classifier
training
 Standardized all variables to have zero mean and unit
variance
 Removed features correlated above .7 with another feature
 To remove collinearity from the regression so weights
could be interpreted
 To use fewer features, since # of training examples was
small
 Example: Male Flirtatious
 Removed f0 range (correlated with f0 max)
 Removed f0 min sd (correlated with f0 min)
 Removed Swear (correlated with Anger)
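A sketch of this pre-processing, under assumptions: features arrive as a dict from name to a list of values, standardization uses the population standard deviation, and of each correlated pair the feature listed first is the one kept (the slides do not say how the survivor was chosen). The feature values are toy numbers.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale to zero mean and unit variance (population sd; assumes a non-constant column)."""
    mu, sd = mean(column), pstdev(column)
    return [(x - mu) / sd for x in column]

def pearson(x, y):
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def preprocess(features, threshold=0.7):
    """Standardize every feature and drop any feature correlated above threshold with a kept one."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) <= threshold for other in kept.values()):
            kept[name] = standardize(column)
    return kept

toy = {
    "f0_max": [200, 220, 240, 260],
    "f0_range": [90, 115, 130, 160],  # nearly collinear with f0_max
    "rate": [3.1, 2.5, 3.4, 2.6],
}
kept = preprocess(toy)
```

In the toy data, `f0_range` is dropped because it tracks `f0_max` almost perfectly, mirroring the male flirtatious example, while `rate` survives.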
Architecture: 6 binary classifiers
 Female ±Awkward, Male ±Awkward,
 Female ±Friendly, Male ±Friendly,
 Female ±Flirtatious, Male ±Flirtatious,
 Multiple classifier experiments
 L1-regularized logistic regression
 SVM w/RBF kernel
 5-fold cross-validation
 tested on held-out test set of 10% highest and 10% lowest
 5 folds: 3 train, 1 validation, 1 test
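The 3 train / 1 validation / 1 test rotation could be generated as below. Which fold serves as validation in each rotation is an assumption (here, the fold after the test fold), as are the function name and the seeding scheme.

```python
import random

def five_fold_splits(n_items, seed=0):
    """Yield (train, validation, test) index lists for each of the 5 rotations."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)          # randomize the data ordering
    folds = [idx[i::5] for i in range(5)]
    for t in range(5):
        v = (t + 1) % 5                       # validation fold: the one after the test fold (assumption)
        train = [i for j in range(5) if j not in (t, v) for i in folds[j]]
        yield train, folds[v], folds[t]

splits = list(five_fold_splits(20))
```

Calling this with a different `seed` for each run would give a fresh random ordering per repetition.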
Experiments
 K-fold cross validation.
 5 folds: 3 train, 1 validation, 1 test
 Randomized the data ordering, repeated k-fold cross
validation 25 times.
 Feature weights (θ)
 We calculated a separate θ for each randomized run,
 resulting in a vector of weights for each feature.
 We kept a feature if the median of its weight vector
was non-zero
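The median rule for keeping features is easy to state in code. The weights below are made-up toy values and the feature names are only illustrative; each dict stands for the θ learned in one randomized run.

```python
from statistics import median

def select_features(weight_runs):
    """weight_runs: one dict per run, feature name -> weight. Keep features with non-zero median."""
    names = weight_runs[0].keys()
    medians = {n: median(run[n] for run in weight_runs) for n in names}
    return {n: m for n, m in medians.items() if m != 0}

# Toy weights from three runs (the study repeated cross-validation 25 times)
runs = [
    {"laugh": 0.4, "uh_um": 0.0, "f0_min": 0.2},
    {"laugh": 0.5, "uh_um": 0.0, "f0_min": 0.0},
    {"laugh": 0.3, "uh_um": 0.1, "f0_min": 0.1},
]
selected = select_features(runs)
```

Because L1 regularization drives many weights to exactly zero, a feature such as `uh_um` that is zero in most runs has a zero median and is dropped.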
Illustrating features: 10 most
significant features, 1 not
 For male flirtation intention
Results with SVM:
predicting flirt intention
 Using my speech to predict whether I say I am
flirting
                     Male speaker   Female speaker
I say I’m flirting   72%            76%
Results with SVM:
Predicting flirt perception
 Using my speech to predict whether partner says I
am flirting
                            Male speaker   Female speaker
Partner says I’m flirting   80%            68%
Summary: flirt detection
 Using my speech to predict whether I am flirting
                            Male speaker   Female speaker
I say I’m flirting          72%            76%
Partner says I’m flirting   80%            68%
Fine, but how good is 72 or 76?
 In speech we generally use human
performance as a “ceiling”
 Checking human performance:
 If John says Jane is flirting
 And Jane says Jane is flirting
 Then we say John is right.
Details of human experiment
 We converted the Likert values to a binary classification by
splitting the space around the mid-value
 John thinks Jane is flirting:
 If John’s Likert (1-10) value for “Jane flirting” is > 5
 We evaluate John
 By comparing John’s perception to Jane’s intention
 We used only the relatively certain cases of intention
 Computed by taking the top 10%/bottom 10% of intention ratings
 (We also tried other ways to derive binary classes, like
median, z-scores, etc.; this was the most generous to the
humans)
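The human evaluation can be sketched as follows. The input layout (pairs of perception and intention ratings), the function names, and the treatment of the top slice as "flirting" and the bottom slice as "not flirting" are assumptions; the toy ratings are invented, and the example uses 20% slices only because the toy set has ten dates.

```python
def says_flirting(likert, midpoint=5):
    """Binarize a 1-10 Likert rating by splitting around the mid-value."""
    return likert > midpoint

def human_accuracy(dates, fraction=0.10):
    """dates: list of (perceiver's rating of speaker, speaker's own intention rating)."""
    ranked = sorted(dates, key=lambda d: d[1])        # order by intention
    k = max(1, round(len(ranked) * fraction))
    bottom, top = ranked[:k], ranked[-k:]             # relatively certain cases only
    correct = sum(not says_flirting(p) for p, _ in bottom)
    correct += sum(says_flirting(p) for p, _ in top)
    return correct / (2 * k)

# Toy ratings: (John's perception of Jane, Jane's stated intention)
toy_dates = [(2, 1), (7, 2), (5, 3), (4, 4), (6, 5),
             (3, 6), (5, 7), (8, 8), (8, 9), (6, 10)]
accuracy = human_accuracy(toy_dates, fraction=0.20)
```

Here the perceiver is right on three of the four extreme cases, giving an accuracy of 0.75.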
Fine, but how good is 72 or 76?
 In NLP we use human performance as a “ceiling”
 Checking human performance:
 If John says Jane is flirting
 And Jane says Jane is flirting
 Then we say John is right.
Male speaker (female perceiver): 64%
Female speaker (male perceiver): 57%
Implication #1
 Females are better than males at
detecting flirting
 or males give off clearer flirting cues
Male speaker (female perceiver): 64%
Female speaker (male perceiver): 57%
Implication #2: Machines are better
than humans at detecting flirting
                    Overall   Male speaker   Female speaker
Computer detector   74%       72%            76%
Human detector      61%       64%            57%
How can this be?
 Why are humans so bad at detecting flirtation?
 (Busso and Narayanan 2008: similar result for emotion detection)
 Our Intuition:
                   I am flirting   Other is flirting
Male 101 says:     8               7
Female 127 says:   1               1
What correlates with my perception
of others’ flirting
 Pearson correlation coefficients
Variable                                                        ρ
How I see other flirting & how other sees themselves flirting   .15
How I see other flirting & how I see myself flirting            .73
What correlates with my perception
of others’ style
 Pearson correlation coefficients
Variable    My perception of other   My perception of other
            & self-intention         & other-intention
Flirting    .73                      .15
Friendly    .77                      .05
Awkward     .58                      .07
Assertive   .58                      .09
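For reference, each coefficient in the table is a Pearson correlation between two rating columns. The sketch below computes it directly; the three rating lists are invented toy values chosen to mimic the pattern in the table, not data from the corpus.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Toy 1-10 ratings for five dates (hypothetical)
my_view_of_other = [8, 2, 6, 3, 7]   # how I see the other flirting
my_self_intent = [7, 1, 6, 4, 8]     # how I see myself flirting
other_intent = [3, 5, 2, 6, 4]       # how the other sees themselves flirting

r_self = pearson(my_view_of_other, my_self_intent)
r_other = pearson(my_view_of_other, other_intent)
```

In the toy data, as in the table, my rating of my partner tracks my own self-rating far more closely than it tracks my partner's self-rating.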
“It’s not you, it’s me”
 My perception of whether my date is flirting
 Is the same as my perception of whether I am
flirting
 Why?
 Speakers aren’t very good at capturing
intentions of others in 4 minutes
 Speakers instead base judgments on their own
behavior/intentions
What about the features?
How much do autoencoders help?
                    SVM   +autoencoder
Male Intention      66%   72%
Female Intention    72%   76%
Male Perception     77%   80%
Female Perception   60%   68%
Likely (positive or negative) words from
one of the 30 autoencoder features
 More likely to flirt:
 S_phone
 O_phone
 S_party
 S_girl
 O_girl
 S_dating
 S_hate
 S_weird
 S_dating
 O_party
 Less likely to flirt:
 O_academia
 S_academia
 S_interview
 S_teacher
 O_phd
 O_advisor
 O_lab
 S_research
 S_management
 O_management
Intention regression weights: men
Intention regression weights: women
Gender differences in flirt intention
 Both genders when flirting:
 use words related to negative emotion
 especially men
 Women when flirting:
 use words related to love or sex
 use appreciations
 laugh, and use I
 Men when flirting:
 raise their pitch floor
 are more fluent
What are these “negative emotion”
words we use when flirting?
 M: “Oh wow, that’s terrible”
 M: “That is awful”
 M: “Wow, are you serious?”
 M: “Yeah, like, I hated it too”
 F: That’s crazy.
 M: It’s like kind of weird
Sympathy!
What are these “love/sex” words
women use when flirting?
 love, loved, loves, passion, passionate
 Well, I love to cook.
 I really love San Francisco.
 Oh, I love that show
 …my passion is teaching.
 …cooking is my passion.
 Um, right now I’m passionate about getting
through my first year of my PhD program.
Strong positive affect toward
hobbies or interests!
Missing the cues!!
 Men think women are flirting when women:
 use love/sex words,
 tell stories
 have higher pitch max,
 vary their loudness.
 But women who are flirting actually:
 use love/sex words [men get this right]
 use more I
 laugh more
 use more appreciations
Missing the cues!!
 Women think men are flirting when:
 men ask questions
 men speak faster.
 But men who are flirting actually:
 raise their pitch floor
 are sympathetic
 are more fluent
What about friendliness,
awkwardness, etc?
Detecting awkward and friendly
speakers
 Using what I do & what my date does to predict what
my date calls me
 Simpler (logistic regression) classifier
                             Awkward      Friendly
                             M     F      M     F
Using speaker words/speech   63%   51     72    68
+ partner words/speech       64    64     73    75
What makes someone seem friendly?
“Collaborative conversational style”
 Related to the “collaborative floor” of Edelsky (1981), Coates (1996)
 Collaborative completions (Lerner 1991, 1996)
 M: And I’m wearing a green shirt.
 F: And blue pants.
 Clarifications
 F: I'm working at Pottery Barn this summer.
 M: I'm sorry, who?
 Other questions
 You
 Laughter
 Plus perhaps
 Appreciations (for women)
 Overlaps (for men)
What makes a man seem awkward?
 More disfluent
 Increased uh/um and restarts
 Not collaborative conversationalists
 (no appreciations, repair questions, collab completions,
you)
 Take fewer turns
 Don’t overlap
 (Prosodically hard to characterize)
Work in progress:
Can we predict liking?
 That is, can we predict the binary variable:
 ‘willing to give this person my email’
 Either for a single speaker (baseline 53%=no)
 Or for a dyad (baseline 81% = no)
What you do when you like someone:
Preliminary results
 Men when they like their date
 use more appreciations (“Great!”, “Wow!”,
“That’s cool”)
 Women when they like their date
 vary their pitch and loudness more,
 raise their max pitch
 laugh
 tell stories
Who do you say yes to?
Preliminary results
 Men say yes to women who:
 show interest by asking clarification
questions (“excuse me?”)
 use “love” and “passion”
 talk about food
 Women say yes to men who:
 don’t use appreciations
 talk about food
 tell stories
 laugh
Current work: Accommodation
 In general, speakers change their behavior to match
(or not match) their interlocutor
Natale 1975, Giles, Mulac, Bradac, & Johnson 1987, Bilous & Krauss
1988, Giles, Coupland, and Coupland, 1991, Giles and Coupland
1992, Niederhoffer and Pennebaker 2002, Pardo 2006, Nenkova
and Hirschberg 2008, inter alia.
 Matching rate of speech
 Matching F0
 Matching intensity (loudness)
 Matching vocabulary and grammar
 Matching dialect
 Our question:
 Do we see more accommodation when people like each
other?
Future: New variables!
 “How would you rate the other person on each of the
following attributes? (1=not at all, 10=very much)”
 Attractive
 Sincere
 Intelligent
 Funny
 Ambitious
 Courteous
Conclusions – for daters
 Talking about your advisor is a bad idea
on a date
 Sympathy is a good idea, if you’re a guy
 Passion is good, if you’re a woman
 Food is good, if you eat
Conclusions – for psychology
Humans project their internal
state on others
Men and women (at least in 4
minutes) seem to focus on the
wrong verbal cues to flirtation
Conclusions – for computer science
 We can do automatic extraction of rich
social variables from speech and text.
 For at least this variable (“does speaker
intend to flirt”) we beat human
performance
Work in progress:
Flirting for fun and for real
 “Flirting but not interested” -> “For Fun Flirting”
 “Flirting and interested” -> “For Real Flirting”
 For fun flirters
 Men: raise min pitch
 Men: use more “we”
 Women: laugh
 For real flirters
 Men + Women: “love”, “passionate”, “sexy”
 Women: eating words
 Men: use less “we” and fewer hedges (“I think”)
 I think: softener, but also characteristic of formal situations and
middle class speech
Work in progress:
laughter and irony
more on hedges
 http://blog.okcupid.com/index.php/online-dating-advice-exactly-what-to-say-in-a-first-message/
Part II: Personality
Personality and Cultural Values
 Personality refers to the structures and propensities
inside a person that explain his or her characteristic
patterns of thought, emotion, and behavior.
 Personality captures what people are like.
 Traits are defined as recurring regularities or trends
in people’s responses to their environment.
 Cultural values, defined as shared beliefs about desirable
end states or modes of conduct in a given culture,
influence the expression of a person’s traits.
McGraw-Hill/Irwin Chapter 9
The Big Five Dimensions of
Personality
 Extraversion vs. Introversion
 (sociable, assertive, playful vs. aloof, reserved, shy)
 Emotional stability vs. Neuroticism
 (calm, unemotional vs. insecure, anxious)
 Agreeableness vs. Disagreeableness
 (friendly, cooperative vs. antagonistic, faultfinding)
 Conscientiousness vs. Unconscientiousness
 (self-disciplined, organised vs. inefficient, careless)
 Openness to experience
 (intellectual, insightful vs. shallow, unimaginative)
Aside: Do Animals Have
Personalities?
 Gosling (1998) studied spotted hyenas. He:
 had human observers use personality scales to
rate the different hyenas in the group
 did a factor analysis on these findings
 found five dimensions
 three closely resembled the Big Five traits of
neuroticism, openness to experience, and
agreeableness
Slide from Randall E. Osborne
 BFI – Big Five Inventory – John et al.
http://www.outofservice.com/bigfive/
The Big Five Personality Traits
 Conscientiousness - dependable, organized, reliable,
ambitious, hardworking, and persevering.
The Big Five Personality Traits,
Cont’d
 Agreeableness - warm, kind, cooperative, sympathetic,
helpful, and courteous.
 Prioritize communion striving, which reflects a strong desire
to obtain acceptance in personal relationships as a means of
expressing personality.
 Agreeable people focus on “getting along,” not necessarily
“getting ahead.”
The Big Five Personality Traits,
Cont’d
 Extraversion - talkative, sociable, passionate,
assertive, bold, and dominant.
 Easiest to judge in zero acquaintance situations —
situations in which two people have only just met.
 Prioritize status striving, which reflects a strong desire
to obtain power and influence within a social structure
as a means of expressing personality.
 Tend to be high in what’s called positive affectivity — a
dispositional tendency to experience pleasant, engaging
moods such as enthusiasm, excitement, and elation.
The Big Five Personality Traits,
Cont’d
 Neuroticism - nervous, moody, emotional,
insecure, and jealous.
 Synonymous with negative affectivity —a dispositional
tendency to experience unpleasant moods such as
hostility, nervousness, and annoyance.
 Associated with a differential exposure to stressors,
meaning that neurotic people are more likely to appraise
day-to-day situations as stressful.
 Associated with a differential reactivity to stressors,
meaning that neurotic people are less likely to believe
they can cope with the stressors that they experience.
The Big Five Personality Traits,
Cont’d
 Neuroticism, continued
 Neuroticism is also strongly related to locus of control,
which reflects whether people attribute the causes of
events to themselves or to the external environment.
 Tend to hold an external locus of control, meaning that they
often believe that the events that occur around them are driven
by luck, chance, or fate.
 Less neurotic people tend to hold an internal locus of control,
meaning that they believe that their own behavior dictates
events.
External and Internal Locus of
Control
The Big Five Personality Traits,
Cont’d
 Openness to experience - curious, imaginative, creative,
complex, refined, and sophisticated.
 Also called “Inquisitiveness” or “Intellectualness” or even
“Culture.”
 Openness to experience is also more likely to be valuable in
jobs that require high levels of creativity, defined as the
capacity to generate novel and useful ideas and solutions.
 Highly open individuals are more likely to migrate into artistic
and scientific fields.
Changes in Big Five Dimensions
Over the Life Span
Personality demo
 Demo:
 http://mi.eng.cam.ac.uk/~farm2/personality/demo.html: find
your personality type
Relationship between Dating and
Personality studies
 Observed versus self-reports
 Agreeableness (in Mairesse et al) and Friendliness (in
Jurafsky et al):
 as
Pickiness in Dating
 Finkel and Eastwick 2009, Psych Science
 Men are less selective than women in speed dating
 Novel explanation: act of physically approaching a
partner increases attraction to that partner
 in traditional events, men always rotate
 Ran 15 speed dating events
 in 8, men rotated: men less selective than women
 in 7, women rotated: men and women equally selective
 Conclusion?