Hybrid Social Media Network

Dong Liu
Nara, Japan, October 2012
Joint work with Guangnan Ye (Columbia), Ching-Ting Chen (Columbia),
Shuicheng Yan and Shih-Fu Chang (Columbia)
Social Media Network
• Users actively create and exchange rich multimedia content.
• Extensive social interactions among the users.
digital video | multimedia lab
Characteristics of Social Media Networks
• Heterogeneous entities
  • user, concept, multimedia object
• Heterogeneous relations
  • User-user relations: friendship, group member, message communication, ...
  • Object-object relations: color similarity, semantic closeness, geo-location, ...
  • User-object relations: tagging, favorite, bookmarking, like, ...
Provide rich contextual information,
facilitating the design of novel multimedia applications.
Complexity of the Heterogeneous Relations
[Figure: example of heterogeneous relations, with labels "geo-location" and "statue of liberty"]

Challenges:
• Capture the heterogeneous relations and entities in a unified model.
• Select the most useful information to avoid potential information redundancy and computational burden.
Hybrid Social Media Network
[Figure: the hybrid network, made of three linked subnetworks]
• Content Subnetwork
• Concept Subnetwork
  • event: birthday, soccer, parade, wedding
  • place: city, home, park, lake
  • people: group, portrait, costume, bride
  • object: bike, flower, cake, balloon
• User Subnetwork: target users, with interactions such as "like", "comment", ...
• Links between subnetworks: mapping, tagging/browsing
• Task: "personalized album", "tour planning", "shopping"
• Task/user specific recommendation
Design goals:
• Preserve the heterogeneous types of entities and relations.
• Given a query, dynamically select the most informative edges and relation types.
• Recommend information in a personalized way for target applications like information/ads diffusion.
Use Scenario I : Personalized Recommendation
of Contents, Friends, and Topics
• Given User A + the Statue of Liberty as query.
• Propagate the information and predict the scores for all other nodes.
• Recommend the nodes with highest scores.
  • Content recommendation: (images in the figure)
  • Friend recommendation: D, E
  • Concept recommendation: New York, Manhattan
[Figure: hybrid network with users A to E and concepts Statue of Liberty, New York, Manhattan, USA. The query nodes A and Statue of Liberty start with score 1; all other nodes start with 0.]
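The query in this scenario can be encoded as an indicator vector over all nodes, matching the 1/0 labels in the figure. A minimal sketch, assuming a hypothetical node ordering and a helper name `build_query_vector` that does not appear in the slides:

```python
def build_query_vector(nodes, query_nodes):
    """Return y with 1.0 for each query node and 0.0 elsewhere."""
    query = set(query_nodes)
    missing = query - set(nodes)
    if missing:
        raise KeyError(f"unknown query nodes: {sorted(missing)}")
    return [1.0 if node in query else 0.0 for node in nodes]


# Hypothetical node ordering, using the labels visible in the figure.
nodes = ["A", "B", "C", "D", "E",
         "statue of liberty", "new york", "manhattan", "usa"]
y = build_query_vector(nodes, ["A", "statue of liberty"])
print(dict(zip(nodes, y)))
```

Propagation then starts from `y`, and every non-query node receives a predicted score.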
Use Scenario II : Target Media Ads
• Take a set of media objects or topical concepts as the initial query.
• Perform information propagation.
• The predicted utility scores can be used to identify the best group of users that may be interested in receiving such media objects or topics.
  • User recommendation: B, C
[Figure: hybrid network with users A to E and concepts ipad, iphone, NIKE, sneaker. Query nodes start with score 1; all other nodes start with 0.]
Social Media Data Crawling for Hybrid Network Construction
• We crawled 10 social photo groups from Flickr.
  • Topics range over cities, tourist sites, products and lifestyles.
  • We downloaded all users, tags and photos, plus all observable relations.
  • On average, there are 7,914 entities and 131,975 relations within each group.
• Groups: Beijing, Painted, New York, IKEA, Taipei, wildlife, Universal Studio, Korean food, sneaker, fashion
Hybrid Network Construction
• Node: user, concept and multimedia object
• Heterogeneous edges (number of relation types in parentheses)
  • User-User (2): Contact: 1 or 0; Favorite: real value
  • User-Image (3): Ownership: 1 or 0; Favorite: 1 or 0; Comment: 1 or 0
  • Concept-Image (1): Tagging: 1 or 0
  • Concept-Concept (1): Co-occurrence: real value
  • User-Image-Concept (1): Tagging: 1 or 0
  • User-Concept (1): Ownership: 1 or 0
  • Image-Image (2): SIFT similarity; Classeme similarity
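The node and edge inventory above maps naturally onto a store with one weighted edge set per relation type. The sketch below is only an illustration: the class name `HybridNetwork`, the relation labels and the sample nodes are assumptions, not part of the slides.

```python
from collections import defaultdict


class HybridNetwork:
    """Typed nodes plus one weighted edge set per relation type."""

    def __init__(self):
        self.node_type = {}             # node id -> "user", "concept" or "image"
        self.edges = defaultdict(dict)  # relation type -> {endpoints: weight}

    def add_node(self, node, kind):
        self.node_type[node] = kind

    def add_edge(self, relation, endpoints, weight=1.0):
        # Endpoints are a tuple, so the three-way tagging relation fits too.
        for node in endpoints:
            if node not in self.node_type:
                raise KeyError(f"unknown node: {node}")
        self.edges[relation][tuple(endpoints)] = float(weight)

    def relation_sizes(self):
        return {rel: len(edge_set) for rel, edge_set in self.edges.items()}


net = HybridNetwork()
for node, kind in [("u1", "user"), ("u2", "user"),
                   ("img1", "image"), ("liberty", "concept")]:
    net.add_node(node, kind)

net.add_edge("user-user/contact", ("u1", "u2"))        # binary relation
net.add_edge("user-user/favorite", ("u1", "u2"), 0.3)  # real-valued relation
net.add_edge("user-image/ownership", ("u1", "img1"))
net.add_edge("concept-image/tagging", ("liberty", "img1"))
net.add_edge("user-image-concept/tagging", ("u1", "img1", "liberty"))
print(net.relation_sizes())
```

Keeping each relation type in its own edge set is what later makes it possible to switch whole relation types on or off.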
Information Propagation Procedure
• Notation
  • Node set
  • Edge set between two nodes
  • Edge affinity weight
  • Edge selection parameter
• Edge strength function
  • The group of selection parameters for relation type g indicates whether relation type g is selected.
  • Each element in the group indicates selection of a specific edge.
[Figure: a conventional graph compared with the hybrid graph over user, concept and object nodes]
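One way to read the edge strength function is as a selection-weighted sum of the affinities of every relation type that links a node pair. The sketch below assumes that form. The name `edge_strength` and the dictionary layout are illustrative and not taken from the slides.

```python
def edge_strength(weights, alpha):
    """Combine per-relation affinities into one strength per node pair.

    weights[g][(i, j)] is the affinity of edge (i, j) under relation type g.
    alpha[g][(i, j)] is its selection parameter; 0 means "not selected".
    The weighted-sum form is an assumption made for illustration.
    """
    strength = {}
    for g, edge_set in weights.items():
        for pair, w in edge_set.items():
            a = alpha.get(g, {}).get(pair, 0.0)
            if a != 0.0:
                strength[pair] = strength.get(pair, 0.0) + a * w
    return strength


weights = {
    "contact": {("u1", "u2"): 1.0},
    "favorite": {("u1", "u2"): 0.5, ("u2", "u3"): 2.0},
}
alpha = {
    "contact": {("u1", "u2"): 1.0},
    "favorite": {("u1", "u2"): 0.0, ("u2", "u3"): 0.5},
}
strength = edge_strength(weights, alpha)
print(strength)
```

A relation type whose parameters are all zero contributes nothing, which corresponds to that relation type not being selected.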
Markov Random Walk Process
• The stationary distribution of f should satisfy the Markov equation defined by Q, the transition probability matrix.
• Establish the link between edge selective parameters and node scores.
  • Node score will influence edge selection and vice versa.
  • Enable the query specific edge selection.
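For intuition, a stationary distribution can be computed by power iteration. The sketch assumes the textbook fixed point f = Qᵀf with Q obtained by row-normalizing symmetric edge strengths; the slides may define Q differently, and the helper names are hypothetical.

```python
def row_normalize(strength, n):
    """Turn symmetric edge strengths into a row-stochastic matrix Q."""
    W = [[0.0] * n for _ in range(n)]
    for (i, j), s in strength.items():
        W[i][j] += s
        W[j][i] += s
    Q = []
    for row in W:
        total = sum(row)
        Q.append([v / total for v in row] if total > 0 else row[:])
    return Q


def stationary(Q, iters=1000, tol=1e-12):
    """Power iteration for f = Q^T f, starting from the uniform vector."""
    n = len(Q)
    f = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(f[i] * Q[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f


# Toy triangle graph; node indices and strengths are made up.
Q = row_normalize({(0, 1): 1.0, (1, 2): 1.0, (0, 2): 2.0}, n=3)
f = stationary(Q)
print([round(v, 3) for v in f])
```

On an undirected graph the result is proportional to each node's total edge strength, so changing which edges are selected changes the scores.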
Sparse Edge Subset Selection

Multi-level sparse penalty on the edges:
Group level selection
• L21 norm : group-wise sparsity
• Only a small number of relation types are selected.
• e.g., image similarity edges are useless for friendship
recommendation.
Edge level selection
• L1 norm: element-wise sparsity
• Only most useful individual edges are selected.
• Minimize the redundancy and computational burden.
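The two penalties can be computed directly from grouped selection parameters. This is a small sketch in which `alpha` maps each relation type to a list of per-edge parameters; the layout, names and values are assumptions for illustration.

```python
import math


def l1_norm(alpha):
    """Element-wise penalty: sum of absolute values over all edges."""
    return sum(abs(a) for group in alpha.values() for a in group)


def l21_norm(alpha):
    """Group-wise penalty: sum over relation types of each group's L2 norm."""
    return sum(math.sqrt(sum(a * a for a in group)) for group in alpha.values())


def active_relations(alpha):
    """Relation types that keep at least one non-zero edge parameter."""
    return sorted(g for g, group in alpha.items() if any(a != 0.0 for a in group))


alpha = {
    "image-image/sift": [0.0, 0.0],   # whole relation type switched off
    "user-user/contact": [3.0, 4.0],
    "user-user/favorite": [0.0, 1.0],  # one edge dropped inside the group
}
print(l1_norm(alpha), l21_norm(alpha), active_relations(alpha))
```

The L21 term pushes entire groups to zero, while the L1 term prunes individual edges inside the groups that survive.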
Problem Formulation
• Predict node utility scores and edge selective parameters through minimizing an objective with three terms:
  • Markov equation
  • Label fitness against the query vector
  • Multi-level sparse penalty
• Properties:
  • Semi-supervised graph ranking
  • Query specific edge selection: the edge selection is optimized for each query dynamically.
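As a reading aid, an objective with these three terms could take the following shape. This is an assumed form, not the formula from the slides: f is the node score vector, y the query vector, Q(α) the transition matrix induced by the edge selection parameters α, α_g the parameters of relation type g, and λ, β, γ are hypothetical trade-off weights.

```latex
\min_{f,\ \alpha \ge 0}\quad
\underbrace{\bigl\| f - Q(\alpha)^{\top} f \bigr\|_2^2}_{\text{Markov equation}}
\;+\; \lambda \underbrace{\| f - y \|_2^2}_{\text{label fitness}}
\;+\; \underbrace{\beta \sum_{g} \| \alpha_g \|_2 \;+\; \gamma \, \| \alpha \|_1}_{\text{multi-level sparse penalty}}
```

Because y enters the objective, the minimizing α differs from query to query, which is what makes the edge selection query specific.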
Information Propagation over Hybrid Social Media Network
• Offline: Build hybrid social media network based on nodes and observable relations.
• Online:
  • User issues a query vector y.
  • Perform edge selective information propagation.
  • Predict edge selection parameter & node score vector f.
  • Rank entities based on prediction scores and pick the top ones as recommendations.
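The final ranking step can be sketched as follows, assuming the propagation has already produced a score per node. The function name `recommend` and the score values are made up for illustration.

```python
def recommend(scores, node_type, query_nodes, top_k=2):
    """Rank non-query nodes by score, separately for each entity type."""
    ranked = {}
    # Highest score first; ties are broken by node name for determinism.
    for node, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if node in query_nodes:
            continue
        bucket = ranked.setdefault(node_type[node], [])
        if len(bucket) < top_k:
            bucket.append(node)
    return ranked


# Hypothetical scores after propagation for the "User A + Statue of Liberty" query.
scores = {"A": 1.0, "B": 0.1, "C": 0.05, "D": 0.4, "E": 0.3,
          "statue of liberty": 1.0, "new york": 0.6,
          "manhattan": 0.5, "usa": 0.2}
node_type = {n: "user" for n in "ABCDE"}
node_type.update({n: "concept" for n in
                  ["statue of liberty", "new york", "manhattan", "usa"]})

result = recommend(scores, node_type, {"A", "statue of liberty"})
print(result)
```

Ranking per entity type lets one propagation run serve content, friend and concept recommendation at the same time.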
Optimization Procedure
• Handle the non-smoothness of L21 norm and L1 norm.
  • Nesterov smooth approximation.
  • Gradient descent based iterative optimization.
• Iterative procedure converges quickly.
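To illustrate the smoothing idea on the simplest case, the sketch below applies Nesterov smoothing to the absolute value (which yields the Huber function) and runs gradient descent on a one-dimensional toy problem. The toy objective, step size and parameter values are assumptions, not the authors' setup.

```python
def smooth_abs(x, mu):
    """Nesterov-smoothed |x|: max over |u| <= 1 of (u*x - mu*u*u/2)."""
    if abs(x) <= mu:
        return x * x / (2.0 * mu)
    return abs(x) - mu / 2.0


def smooth_abs_grad(x, mu):
    """Gradient of smooth_abs: x/mu clipped to [-1, 1]."""
    return max(-1.0, min(1.0, x / mu))


def minimize_l1_toy(b, lam, mu, steps=5000):
    """Gradient descent on 0.5*(x - b)**2 + lam*smooth_abs(x, mu)."""
    step = 1.0 / (1.0 + lam / mu)  # 1 / Lipschitz constant of the gradient
    x = 0.0
    for _ in range(steps):
        grad = (x - b) + lam * smooth_abs_grad(x, mu)
        x -= step * grad
    return x


x_star = minimize_l1_toy(b=2.0, lam=1.0, mu=0.01)
print(round(x_star, 6))
```

For this toy problem the result agrees with the soft-thresholding solution b - lam of the unsmoothed problem, since the minimizer lies outside the smoothed region.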
Algorithmic Analysis
• Time complexity: O(m(n+J))
  • m: iteration number; n: node number; J: edge number.
• Convergence procedure is fast
  • Converges within 1.2 minutes on an Intel Xeon X5600 workstation with 3.2 GHz CPU and 18 GB memory.
• Parameter sensitivity
  • Relatively stable performance with a variation range of 10%.
Experiment Design
• Query Type-I
  • Query: a user.
  • Recommend the top ranked keywords/images/users.
• Query Type-II
  • Query: a user plus a few keywords.
  • Recommend the top ranked keywords/images/users.
• Query Type-III
  • Query: an image plus its associated keywords.
  • Recommend users who may be interested in receiving such media objects and topics.
Evaluation
• In each query, we randomly select 50% of the edges connected to the relevant entities as the test set for performance evaluation.
• We then check whether these entities can be successfully retrieved at the top positions of the propagation results.
[Figure: a query node with half of its ground truth edges removed]
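Retrieval "at the top positions" is commonly scored with average precision, and the results are reported as Average MAP. A minimal sketch of the standard definition, with made-up rankings and hypothetical function names:

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average the per-query AP over (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two toy queries: held-out entities are the "relevant" sets.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

Here the relevant set of a query would hold the entities whose ground truth edges were removed before propagation.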
Comparison Baselines
• EAIP: Edge Averaging based Information Propagation.
  • Simply averages the edges.
• ESIP: Edge Selective based Information Propagation.
  • Only involves L1 norm.
• RSIP: Relation Selective based Information Propagation.
  • Only involves L21 norm.
• ERSIP (Proposed): Edge and Relation Selective based Information Propagation.
  • Involves both L21 norm and L1 norm.
Performance Comparison
[Figure: three bar charts of Average MAP comparing EAIP, ESIP, RSIP and ERSIP]
• Input a user, recommend keywords/images/users (bars for keyword, image, user).
• Input a user plus keywords, recommend keywords/images/users (bars for keyword, image, user).
• Input an image plus keywords, recommend users (bars for user).
Recommendation Examples
• Query: User + keywords (yiheyuan, summerpalace, China)
• Query: User + keywords (taipei, buddhism, temple)
• Query: User + keywords (yum, meat, delicious)
• Images (orange borders): retrieved by the direct links.
• Images (green borders): retrieved by propagation via the proposed hybrid network.
Average Relation Type Contribution
[Figure: contribution of each relation type for each query type]
• Query types: Type-I: user; Type-II: user plus keyword; Type-III: image plus keyword.
• Relation types:
  • Image-image SIFT similarity
  • Image-image mid-level similarity
  • Concept-image tagging
  • User-concept ownership
  • User-image comment
  • User-image favorite
  • User-image ownership
  • User-user contact
  • User-user favorite
  • Concept-concept co-occurrence
  • User-image-concept tagging
• The edge selection strategy is able to identify the relations that are closely related to the query.
Conclusions
• A novel hybrid social media network model
  • Heterogeneous relations and entities can be seamlessly integrated within a unified framework.
  • An edge selective information propagation procedure.
  • Recommend information in a personalized way.
Future Work
• Scalability
  • Anchor entities and relations.
  • Scaling to a large network.
• More applications
  • Task specific social media community discovery.
  • Personalized multimedia content composition and manipulation.
Thank You
[email protected]
http://www.ee.columbia.edu/~dongliu/