A social network caught in the Web
Lada Adamic and Eytan Adar (HP Labs, Palo Alto, CA)
Orkut Buyukkokten (Google)

Outline
Intro to Club Nexus
Profiles
Nexus Net
Similarity and distance
Association by similarity
Nexus Karma
Conclusions
Profiles:
status (UG or G), year, major or department, residence, gender

Personality (choose exactly 3):
you: funny, kind, weird, …
friendship: honesty/trust, common interests, commitment, …
romance: (same as friendship)
freetime: socializing, getting outside, reading, …
support: unconditional accepters, comic-relief givers, eternal optimists

Interests (choose as many as apply):
books: mystery & thriller, science fiction, romance, …
movies: western, biography, horror, …
music: folk, jazz, techno, …
social activities: ballroom dancing, barbecuing, bar-hopping, …
land sports: soccer, tennis, golf, …
water sports: sailing, kayaking, swimming, …
other sports: sky diving, weightlifting, billiards, …
Finding correlations between user attributes
Are people who consider themselves funny also more likely to enjoy comedies?
518 funny users; 74% of users overall like comedies
416 (80%) of funny users like comedies; this is 3.4 standard deviations (1 standard deviation ≈ 10) above the expected 383
Z score = 3.4
Z scores with absolute value > 2 are significant at the p = 0.05 level.
3.4 is significant at the 0.0003 level
small differences (10%) can be significant.
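The comedy example can be reproduced with a short calculation. The sketch below assumes the number of funny users who like comedies follows a binomial distribution at the overall 74% rate, and uses a normal approximation for a one-sided p-value. The function names are illustrative and not taken from the original analysis. Because the percentages on the slide are rounded, the computed Z score (about 3.3) differs slightly from the 3.4 quoted above.

```python
import math

def binomial_z_score(n_group, p_overall, observed):
    """Z score of an observed count against a binomial expectation."""
    expected = n_group * p_overall
    std_dev = math.sqrt(n_group * p_overall * (1.0 - p_overall))
    return (observed - expected) / std_dev, expected, std_dev

def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Numbers from the slide: 518 funny users, 74% overall rate, 416 observed.
z, expected, std_dev = binomial_z_score(n_group=518, p_overall=0.74, observed=416)
p_value = one_sided_p_value(z)
print(f"expected = {expected:.0f}, std dev = {std_dev:.1f}, z = {z:.2f}, p = {p_value:.4f}")
```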
Personality and tastes (just a few examples)
creative
  book: art & photography, philosophy, fiction & literature, classics
  music: folk, bluegrass/rural, jazz
  movie: art, documentary, independent
successful
  book: business
  landsport: tennis
  other: weightlifting
  social: barbecuing
  watersport: boating, jet skiing, water skiing
  free time: fulfilling commitments, catching up on chores and things
not responsible
  book: sex
  movie: erotic & softcore, gay & lesbian, independent
  music: funk, jungle, reggae, trance
  other: skateboarding
  social: raving
Major and personality
personality (% of total) | major
free time: learning (17%) | Physics (46%), Philosophy (37%), Math (31%), EE (26%), CS (24%)
free time: reading (26%) | English (55%)
free time: staying at home (8%) | History (24%)
free time: doing anything exciting (52%) | undecided/undeclared (62%)
you: weird (12%) | Physics (34%), Math (28%), EE (18%)
you: intelligent (32%) | Philosophy (59%), CS (42%)
you: successful (4%) | CS (7%)
you: socially adaptable (14%) | STS (46%)
you: attractive (16%) | Political Science (29%), International Relations (25%)
you: lovable (12%) | Political Science (24%)
you: kind (25%) | Public Policy (45%)
you: funny (25%) | Philosophy (6%)
you: fun (26%) | Human Biology (38%)
you: creative (22%) | Product Design (62%), English (42%)
you: sexy (8%) | English (18%), EE (2%)
Gender Differences
Male users
  preference
    book: computers, science fiction, professional & technical, science, business, politics, philosophy, sports, adventure
    landsport: football, frisbee golfing, table tennis, golf, baseball, basketball, cricket, fencing, racquetball, squash, tennis, soccer, wrestling
    movie: science fiction, war, action, spy film, erotic & softcore, adventure, anime, sports, western
    music: heavy metal
    other: computer gaming, weightlifting, billiards, ultimate frisbee, mountain biking, paintballing, laser gaming, bicycling
    social: barbecuing, raving, hot tubbing
    watersport: fishing, sailing
  personality
    freetime: learning, doing physical challenging activities
    friendship: mutual friends, common interests, appearance/look, sex
    romance: appearance/look, sex, physical attraction
    support: the eternal optimists, the give-it-to-you-straight people, i've-been-down-and-dirty-a-few-times-myself people
    you: intelligent
Female users
  preference
    book: romance, fiction & literature, health, mind & body, cooking, art & photography, entertainment, mystery & thriller, psychology, classics
    landsport: gymnastics, field hockey, softball
    movie: romance, family, drama, musical, performing arts, comedy, independent
    music: soul/R&B, pop, country/western, rap/hip hop, folk, latin
    other: aerobics, ice skating, jogging
    social: hip-hop dancing, Latin dancing, clubbing
    watersport: swimming
  personality
    freetime: catching up on chores and things, socializing
    friendship: laughter, honesty/trust, communication
    romance: laughter, honesty/trust
    support: unconditional accepters, the listeners, chicken-soup people
    you: fun, lovable, friendly
Degree Distribution for Nexus Net
2469 users, average degree 8.2
[Figure: degree distribution, number of users vs. number of links, shown on linear and log-log axes]
Shortest paths between users
average distance = 4.0
[Figure: histogram of pairs of users (×10^5) vs. distance, for distances 1 to 13]
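An average distance like the one above can be obtained by running a breadth-first search from every user and averaging the hop counts over all reachable pairs. The sketch below is a minimal illustration on a made-up four-user chain; the adjacency-dict format and the function names are assumptions, not the original analysis code.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Hop counts from source to every user reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def average_distance(adjacency):
    """Mean shortest-path length over all connected ordered pairs."""
    total = 0
    pairs = 0
    for source in adjacency:
        for target, d in bfs_distances(adjacency, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")

# Hypothetical toy network: a chain a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
avg = average_distance(toy)
print(f"average distance = {avg:.2f}")
```

Running one search per user costs time proportional to users × links, which is manageable for a network of a few thousand users.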
Clustering and betweenness
Clustering or transitivity: how many of the user’s friends are friends themselves
C = (# links between friends) / [(# friends) * (# friends - 1) / 2]
C = 0.17 for Club Nexus
Other findings:
people who list more buddies list more preferences/activities
edges with high betweenness lie between dissimilar people (r = -0.2)
people with high betweenness have more links (r = 0.7)
people with high betweenness have lower clustering coefficients (r = -0.12)
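The clustering coefficient defined on this slide can be sketched in a few lines. The example assumes the network is stored as a dict mapping each user to a set of friends, that users with fewer than two friends get a coefficient of 0, and that the network-wide value is the average over users. These conventions and the function names are assumptions; the slide does not say which ones were used for the 0.17 figure.

```python
from itertools import combinations

def clustering_coefficient(adjacency, user):
    """Fraction of pairs of a user's friends who are friends themselves."""
    friends = adjacency[user]
    k = len(friends)
    if k < 2:
        return 0.0  # assumed convention: undefined case treated as 0
    links_between_friends = sum(
        1 for a, b in combinations(friends, 2) if b in adjacency[a]
    )
    return links_between_friends / (k * (k - 1) / 2)

def average_clustering(adjacency):
    """Mean of the per-user clustering coefficients."""
    return sum(clustering_coefficient(adjacency, u) for u in adjacency) / len(adjacency)

# Hypothetical toy network: u is friends with a, b, c; only a and b know each other.
toy = {
    "u": {"a", "b", "c"},
    "a": {"u", "b"},
    "b": {"u", "a"},
    "c": {"u"},
}
c_u = clustering_coefficient(toy, "u")
c_avg = average_clustering(toy)
print(f"C(u) = {c_u:.2f}, average C = {c_avg:.2f}")
```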
Similarity and distance
year is more important for undergrads
department is more important for grads
[Figure: fraction of similar users (0 to 1) vs. distance between users in hops (1 to 8), with separate curves for residence, department or major, year, and status for grads (G) and undergrads (UG)]
Association ratios
p = (# users who like A)/(total #users)
L = # connections A users have
m = expected number of links to other A users = L*p
r = (# links between A users)/m
[Figure: users who like A within all users]
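The association ratio defined above can be sketched as follows. The slide does not spell out how links are counted, so this example assumes one particular convention: L is the number of link endpoints held by A users, and a link between two A users is counted once from each side. Under that assumption r is about 1 when A users connect at random. The function name and the toy data are hypothetical.

```python
def association_ratio(edges, likes_a, n_users):
    """r = (# links between A users) / (L * p), counting link endpoints."""
    p = len(likes_a) / n_users      # fraction of users who like A
    total_endpoints = 0             # L: connections held by A users
    endpoints_to_a = 0              # connections from A users to other A users
    for u, v in edges:
        if u in likes_a:
            total_endpoints += 1
            if v in likes_a:
                endpoints_to_a += 1
        if v in likes_a:
            total_endpoints += 1
            if u in likes_a:
                endpoints_to_a += 1
    expected = total_endpoints * p  # m: expected links to other A users
    return endpoints_to_a / expected if expected else float("nan")

# Hypothetical toy network of 6 users; users 0, 1, 2 like A.
edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
r = association_ratio(edges, likes_a={0, 1, 2}, n_users=6)
print(f"association ratio = {r:.2f}")
```

A ratio above 1 means A users link to each other more often than chance would predict; a ratio below 1 means less often.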
Personality and association ratio
personality | association ratio | Z score | # users | # connections
sexy | 1.46 | 5.47 | 204 | 192
talented | 1.40 | 5.17 | 213 | 210
fun | 1.25 | 11.22 | 633 | 1852
weird | 1.25 | 4.32 | 286 | 332
lovable | 1.22 | 4.20 | 292 | 406
unique | 1.11 | 4.15 | 547 | 1194
funny | 1.10 | 4.06 | 619 | 1474
friendly | 1.10 | 7.55 | 1024 | 4024
socially adaptable | 1.09 | 2.12 | 342 | 482
attractive | 1.07 | 1.76 | 406 | 522
creative | 1.04 | 1.48 | 541 | 982
intelligent | 1.01 | 0.42 | 779 | 1848
responsible | 0.99 | -0.28 | 500 | 686
kind | 0.99 | -0.44 | 625 | 1226
competent | 0.92 | -1.40 | 294 | 226
successful | 0.70 | -1.57 | 99 | 18
Interests and association ratios
interest | high association | low association
book | gay & lesbian, professional & technical, computers, teen, sex, sports | history, fiction & literature, outdoor & nature
movie genres | gay & lesbian, performing arts, religion, erotic & softcore, sports | drama, mystery, documentary, comedy
music genres | gospel, jungle, bluegrass/rural, heavy metal, trance | pop, classical, rock
land sport | lacrosse, field hockey, wrestling, cricket | tennis, martial arts, bicycling, racquetball
water sport | synchronized swimming, diving, crew | swimming, fishing, windsurfing
social | raving, ballroom dancing, Latin dancing | partying, camping
Nexus Karma
Rank how ‘trusty’, ‘nice’, ‘cool’, and ‘sexy’ your buddies are on a scale of 1 to 4
446 users ranked 1735 different friends
correlations between scores given (users were ranked as ‘3,3,3,3’ more often than ‘1,4,2,3’)
average scores: nice (3.37), trusty (3.22), cool (3.13), sexy (2.83)
trusty--nice and cool--sexy more highly correlated (r = 0.7) vs. trusty--sexy and nice--sexy (r = 0.4)
no relationship between average score received and # of friends
negative correlation between average score given and # of friends
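Correlations like those above between two karma scores can be computed with a standard Pearson coefficient. The sketch below assumes r denotes Pearson's correlation, which the slide does not state, and uses made-up ratings purely for illustration; they are not Club Nexus data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical 'trusty' and 'nice' scores (1 to 4) given to six friends.
trusty = [3, 4, 2, 3, 4, 1]
nice = [3, 4, 3, 3, 4, 2]
r_toy = pearson_r(trusty, nice)
print(f"r(trusty, nice) = {r_toy:.2f}")
```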
How users view themselves vs. how others view them
scores received, with overall averages: trusty (3.22), nice (3.37), cool (3.13), sexy (2.83)
3.02
2.67
responsible
3.36
sexy
3.10
3.23
3.03
attractive
3.09
3.25
2.93
kind
3.34
3.46
friendly
3.44
weird
funny
2.67
3.31
Additional insights from Nexus Karma
Users receiving higher ‘nice’ scores give higher ‘trusty’, ‘nice’, and ‘cool’ scores (r = 0.14-0.17)
If one user gives another user a higher ‘trusty’ or ‘nice’ score than their other friends, that same friend is more likely to reciprocate.
Users who share friends are more likely to give each other high scores (r = 0.10-0.13)
Conclusions
Learn about real world social networks from online community
Less effort than traditional social network survey methods,
almost a side-effect of digital nature of interactions
Although most results not surprising, data is very rich
- opportunity to simulate search and information spread
Karma data can be used to study online reputation mechanisms
Longitudinal data can be used to study network evolution
To find out more:
Information dynamics group (IDL) at HP Labs:
http://www.hpl.hp.com/shl/
Paper at:
http://www.hpl.hp.com/shl/social/
Free time activity and association ratios
free time activity | association ratio | Z score | # users | # connections
fulfilling commitments | 1.34 | 9.30 | 398 | 826
socializing | 1.12 | 21.12 | 1660 | 11374
catching up on chores and things | 1.09 | 2.71 | 494 | 850
learning | 1.07 | 1.82 | 420 | 536
doing anything exciting | 1.07 | 8.05 | 1280 | 6278
watching TV | 1.07 | 1.85 | 415 | 602
reading | 1.02 | 0.66 | 631 | 1186
getting outside | 1.01 | 0.97 | 940 | 2882
staying at home | 0.97 | -0.32 | 209 | 126
alone | 0.96 | -0.93 | 380 | 398
doing physical challenging activities | 0.96 | -1.46 | 577 | 878