Transcript caws4 13286

The Emergence of Conventions
in Online Social Networks
Farshad Kooti, Haeryun Yang, Meeyoung
Cha, Krishna Gummadi, and Winter Mason
1
Metric
Imperial
2
Linguistic conventions
Hey
Good day
Hello
How’s it
going
3
4
Why the retweeting convention?
o Information-sharing channels are explicit in
Twitter
o Specific to Twitter: exposures within the
community
o Contained in Twitter, hence capturing all usages
5
Twitter dataset
o Crawled near-complete data from 03-2006 to 09-2009
- 54 million users
- 1.9 billion tweets
- 1.7 billion follow links
o Follow links are a snapshot of the network
6
The retweeting variations
o Searched for syntax: token @username
o “Adopter” refers to a user using the variation at least once

Variation     # of adopters   # of retweets
RT            1,836 K         53,221 K
via           751 K           5,367 K
Retweeting    50 K            296 K
Retweet       36 K            110 K
HT            8 K             22 K
R/T           5 K             28 K
♻             3 K             18 K
Total         2,059 K         59,065 K
7
Origins
Early adopters
Majority
acceptance
8
What are the very first use cases?
Via: Mar’07
HT: Oct’07
Retweet: Nov’07
Retweeting: Jan’08
RT: Jan’08
R/T: Jun’08
♻: Sep’08
9
Via started from natural language
@JasonCalacanis (via @kosso) - new Nokia N-Series
phones will do Flash, Video and YouTube
10
HT started from blog communities
The Age Project: how old do I look?
http://tweetl.com/21b ( HT @technosailor )
11
The first Twitter-specific variation
Retweet @HealthyLaugh she is in the Boston
Globe today, for a Stand up show she’s doing
tonight. Add the funny lady on Tweeter!
12
RT was an adaptation to constraints
RT @BreakingNewsOn: "LV Fire Department: No
major injuries and the fire on the Monte Carlo west
wing contained east wing nearly contained."
13
Some start from explicit discussions
♻ @ev of @biz re: twitterkeys ★
http://twurl.nl/fc6trd
14
Origins
Early adopters
Majority
acceptance
15
Early adopters are more tech-savvy
(Figure: random users vs. early adopters)
16
Early adopters are more innovative
Groups: Early adopters, Random users
Features: Has Bio, Profile Pic, Changed profile theme, Has Location, Has Lists, Has URL
Values: 94%, 99%, 25%, 50%, 91%, 40%, 95%, 57%, 85%, 36%, 4%, 14%
17
Early adopters are more popular
• Much higher number of followers
• 80% of early adopters in top 1% based on PageRank
18
Defining the diffusion network
o Each adopter is a node in the graph.
o There is a link from A to B if A was exposed
to the variation by B.
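
The definition above can be sketched in Python. This is a minimal illustration rather than the authors' implementation. It assumes that A counts as "exposed by B" when A follows B and B used the variation before A's own first use. The function name `build_diffusion_network` and the input shapes are hypothetical.

```python
from collections import defaultdict


def build_diffusion_network(first_use, follows):
    """Return {adopter A: set of adopters B who exposed A to the variation}.

    first_use: dict mapping user -> time of that user's first use of the variation.
    follows:   dict mapping user -> set of users they follow.
    Assumption (not from the slides): B exposes A if A follows B and
    B used the variation strictly before A's first use.
    """
    edges = defaultdict(set)
    for a, t_a in first_use.items():
        edges[a]  # touch the key so every adopter appears as a node
        for b in follows.get(a, ()):
            t_b = first_use.get(b)
            if t_b is not None and t_b < t_a:
                edges[a].add(b)
    return dict(edges)


# Toy data: times are arbitrary integers, user names are made up.
first_use = {"alice": 1, "bob": 2, "carol": 3}
follows = {"bob": {"alice"}, "carol": {"alice", "bob"}, "alice": {"carol"}}
network = build_diffusion_network(first_use, follows)
```

Here `network["carol"]` holds both `alice` and `bob`, since carol follows both and adopted after them, while `alice` has no incoming exposure.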
19
Diffusion network of the first 500 adopters of Retweet
20
Diffusion network of the first 500 adopters of RT
21
Early adopter network
o Average number of exposures: 2.9 – 6.4
o Average clustering coefficient: 0.233 - 0.320
o Criticality (fraction of users who were exposed only because
of the most critical user): 0.5% - 4.9%
Early adopters’ diffusion networks are dense and
clustered. There is no single critical user.
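
Two of the measures above can be sketched as follows. The slides do not spell out how exposures are counted or how criticality is computed, so this sketch makes two assumptions: each exposing user counts as one exposure, and an adopter depends on a user only if that user is the adopter's sole source of exposure. The function names are hypothetical.

```python
from collections import Counter


def average_exposures(network):
    """Mean number of exposing users per adopter.

    network: dict mapping adopter -> set of users who exposed them.
    """
    return sum(len(sources) for sources in network.values()) / len(network)


def criticality(network):
    """Fraction of adopters whose only exposure came from the single
    most critical user (assumed reading of the slide's definition)."""
    sole_source = Counter(
        next(iter(sources)) for sources in network.values() if len(sources) == 1
    )
    if not sole_source:
        return 0.0
    return max(sole_source.values()) / len(network)


# Toy network: "a" is the only exposure for both "b" and "d".
network = {
    "a": set(),
    "b": {"a"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

On this toy network the most critical user is `a`, who is the sole source for two of the four adopters. A low value on real data, as reported on the slide, would mean that removing any one user leaves almost everyone still exposed.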
22
Conventions spread differently from URLs
o Early adopters of URLs are not necessarily core users
o The diffusion network of URLs is not dense and clustered
o There are critical users in the process
23
Origins
Early adopters
Majority
acceptance
24
25
Some variations are growing and some dying
Variations have different growth rates
Only two variations became dominant at the end (RT, via)
26
WHY DID RT BECOME THE
MOST POPULAR?
27
From micro- to macro
• Retweeting conventions are specific to
Twitter; therefore, the dominance of one
convention is likely a result of local decisions
• What are the factors that lead individuals
to choose one convention over another?
28
Convention prediction problem
Suppose we are given a social network with
records of users and times of adoptions, but
information about which variation was
adopted by user u at time t is hidden. How
reliably can we infer which variation v the
user chose to adopt?
29
Hypotheses
• First mover advantage
• Fastest-mover advantage
• Influencers
– RT was adopted by more popular users,
leading to a faster spread of the convention
• Tipping point
– People were exposed to RT more
– People had more friends who adopted RT
30
First-mover advantage
31
Fastest-mover advantage
32
Influencers
33
Tipping point: number of exposures
• Only 2.49% of the adopters were exposed to
exactly one variation
• However, 72.68% adopted only a single variation.
Most of the others (24.47%) switched from one variation
to another once or used two different variations back
and forth
• Of those who adopted RT, 80.40% had been
exposed to RT the most frequently.
• For the other variations, only 6.17%–30.39% of
adopters had been exposed to that variation more
than any other variation.
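
The last two bullets compare the adopted variation with the variation the adopter saw most often. A minimal sketch of that comparison, assuming per-adopter exposure counts are already available (the function name and the data layout are hypothetical):

```python
from collections import Counter


def share_adopting_most_exposed(adopters):
    """adopters: list of (adopted_variation, Counter of exposures per variation).

    Returns {variation: fraction of its adopters who had been exposed to it
    strictly more often than to any other variation}.
    """
    total = Counter()
    matched = Counter()
    for adopted, exposures in adopters:
        total[adopted] += 1
        own = exposures.get(adopted, 0)
        others = max((n for v, n in exposures.items() if v != adopted), default=0)
        if own > others:
            matched[adopted] += 1
    return {v: matched[v] / total[v] for v in total}


# Toy data, not taken from the Twitter dataset.
adopters = [
    ("RT", Counter(RT=5, via=1)),
    ("RT", Counter(RT=2, via=3)),
    ("via", Counter(RT=4, via=1)),
    ("via", Counter(via=2)),
]
shares = share_adopting_most_exposed(adopters)
```

Ties are counted as non-matches here, following the slide's wording "more than any other variation".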
34
Tipping point: number of friends
35
Multinomial modeling
Variation          Baseline  Accuracy  Precision  Recall  AUC
RT                 0.682     0.712     0.728      0.92    0.681
via                0.72      0.726     0.521      0.237   0.666
Retweeting         0.981     0.98      0.431      0.177   0.905
Retweet            0.986     0.985     0.343      0.084   0.801
HT                 0.996     0.997     0.505      0.039   0.849
R/T                0.998     0.998     0.19       0.001   0.815
recycle icon       0.998     0.999     0.359      0.009   0.823
Weighted average   0.704     0.726     0.657      0.698   0.683
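
Per-variation scores like these can be computed by treating one variation as the positive class and all others as negative. The sketch below shows that calculation on toy labels. It is not the authors' evaluation code, and the baseline definition (always guessing the larger of the two sides) is an assumption.

```python
def one_vs_rest_metrics(y_true, y_pred, label):
    """Treat `label` as the positive class and everything else as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    share = (tp + fn) / len(pairs)  # fraction of adopters who chose `label`
    return {
        "baseline": max(share, 1 - share),  # assumed: always guess the majority side
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels, not taken from the Twitter dataset.
y_true = ["RT", "RT", "RT", "via", "via", "HT"]
y_pred = ["RT", "RT", "via", "via", "RT", "RT"]
rt = one_vs_rest_metrics(y_true, y_pred, "RT")
```

For a rare variation the baseline is close to 1 simply because "not this variation" is almost always right, which is why accuracy alone says little for the less common variations and precision and recall are reported alongside it.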
36
Convention-specific predictions
Variation          Accuracy  Precision  Recall  AUC
RT                 0.613     0.607      0.631   0.662
via                0.607     0.606      0.601   0.65
Retweeting         0.591     0.589      0.618   0.641
Retweet            0.569     0.566      0.566   0.604
HT                 0.823     0.828      0.815   0.896
R/T                0.773     0.77       0.772   0.838
recycle symbol     0.815     0.831      0.802   0.898
Weighted average   0.61      0.607      0.615   0.657
37
Redefining friend or adoption
38
Predictive power:
Individual Features
Rank  Feature                           Type      χ2
1     date of adoption                  Global    300,666
2     # of exposures to RT              Social    106,627
3     # of posted URLs                  Personal  80,728
4     # of exposures to via             Social    64,160
5     join date of the adopter          Personal  48,071
6     # of posted tweets                Personal  44,523
7     # of RT-adopter friends           Social    44,079
8     # of exposures to Retweeting      Social    43,100
9     # of exposures to HT              Social    42,807
10    # of exposures to Retweet         Social    36,604
11    # of exposures to recycle         Social    32,889
12    in-degree of the adopter          Personal  30,762
13    # of HT-adopter friends           Social    24,338
14    out-degree of the adopter         Personal  24,338
15    # of exposures to R/T             Social    23,370
16    # of Retweeting-adopter friends   Social    8,141
17    # of Retweet-adopter friends      Social    7,507
18    country of the adopter            Personal  4,476
19    # of via-adopter friends          Social    4,429
20    # of R/T-adopter friends          Social    203
21    # of recycle-adopter friends      Social    67
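
The ranking above orders features by a χ2 score. As an illustration of what such a score measures, here is a sketch of the standard Pearson χ2 statistic for a contingency table of feature bins against adopted variations. The slides do not say how features were binned or which tool produced the scores, so the table layout and the function name are assumptions.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    table: list of rows of counts, e.g. rows = bins of one feature,
    columns = adopted variation (assumed layout).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected:
                stat += (observed - expected) ** 2 / expected
    return stat


# Toy tables: the first shows a strong association, the second none.
associated = chi_square([[30, 10], [10, 30]])
independent = chi_square([[10, 10], [20, 20]])
```

A larger statistic means the distribution of adopted variations differs more across the feature's bins, which is the sense in which "date of adoption" ranks first.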
39
Transition probabilities
40
Switch-out over time
(Figure: switch-out probability (%) over time, Oct’08 to Sep’09, for RT, via, Retweet, Retweeting, HT, recycle, and R/T; y-axis from 10 to 80)
41
Hypotheses, take 2
Global popularity is more important than
local activity in the decision to adopt
• People have access to global signals that
influence their decision to adopt
• RT was invented when new users were
joining most quickly, so edged out others
(Matthew effect)
• RT had cultural and ecological fit (inherent
advantage)
42
Testing Hypotheses, take 2
• Different methodologies to understand
factors related to adoption of social
conventions
• Additional data sets that capture
emergence of social conventions
• Simulations!
43
Summary
o Conventions emerged in an organic, bottom-up
manner
o Early adopters were core members of the
community: Active, tech-savvy, popular, and
innovative
o Social conventions start spreading through dense
and clustered networks and there is no critical user
o When variations got popular, they reached outside
of the core community
44
Summary
o The final reach of the retweeting variations is not
the direct result of either the amount of time each
variation had to grow or the rate at which it grew.
o Nearly all adopters had been exposed to the
convention through their friends on Twitter,
suggesting that social relations play an important
role in the adoption process.
o However, the decision to adopt a particular
variation has more to do with the global popularity
of the variation rather than its popularity in the local
neighborhood or personal preference.
45
Thank you!
@winteram
@farshadkt @nicesea @nekozzang
@kgummadi
46