Predicting YouTube Content Popularity via
Facebook Data: A Network Spread Model for
Optimizing Multimedia Delivery
Speaker : Yu-Hui Chen
Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C.
Au, and Amine Bermak
From : 2013 IEEE Symposium on Computational
Intelligence and Data Mining (CIDM)
Outline
1. Introduction
2. Methodology
3. Simulation results
4. Future work
5. Conclusion
1.Introduction
 Multimedia content is widely shared through websites such as
Facebook and YouTube. With limited network resources, providing
access to large amounts of multimedia data is a major challenge.
 This paper proposes a Fast Threshold Spread Model (FTSM)
to predict the future access pattern of multi-media content
based on the social information of its past viewers.
2.Methodology
An example infection process of the
Independent Cascade Model
A) Facebook Data Mining
Experimental setup: Requesting, downloading and analyzing
JSON objects from Facebook
B) YouTube Video Statistics Mining
 The video statistics are obtained through the YouTube API
C) Fast Threshold Spread Model
 The extracted Facebook social graph is modelled as an undirected
graph G = (V, E), with the vertices in V as the users in the network
and the edges in E as the relationships between individuals. For each
inter-user edge k, we evaluate the weight function W(m).
 W(m) combines two quantities for user m: the average number of posts
posted per week by user m, and the average number of shares plus the
number of comments plus the number of likes for each of that user's posts.
 The two quantities are averaged to yield W(m), which indicates the
social influence of a given user.
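The weight computation described above can be sketched as follows. This is a minimal illustration only: the transcript does not preserve the weight equation, so the plain arithmetic mean and the names `UserActivity` and `social_weight` are assumptions, not the authors' definitions.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    # Hypothetical container for the two per-user quantities named in the slides.
    posts_per_week: float         # average number of posts per week by user m
    interactions_per_post: float  # average shares + comments + likes per post


def social_weight(user: UserActivity) -> float:
    """Assumed form of W(m): the arithmetic mean of the two quantities."""
    return (user.posts_per_week + user.interactions_per_post) / 2.0


if __name__ == "__main__":
    m = UserActivity(posts_per_week=3.0, interactions_per_post=7.0)
    print(social_weight(m))  # 5.0
```

With the Threshold of 4.0 quoted later in the slides, this example user would count as influential under the assumed averaging.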
D) Complexity Analysis on a Small
Network vs a Large Network
 Each user node m will be either active or inactive.
 The total number of activated nodes, also known as the
influence spread, is denoted by NumActiveNodes.
 The Threshold value is chosen by simulation to be 4.0.
D) FTSM algorithm
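A minimal sketch of a threshold spread that is consistent with the description above. It assumes a hop-by-hop traversal in which a neighbor is activated when its own W(m) exceeds the Threshold and it has not been accessed before. The function name `threshold_spread`, the dictionary-based graph, and the seed set are hypothetical; the authors' exact algorithm may differ.

```python
THRESHOLD = 4.0  # value quoted in the slides


def threshold_spread(graph, weight, seeds, max_hops, threshold=THRESHOLD):
    """Return the set of active nodes after at most max_hops hops.

    graph  : dict mapping node -> list of neighbor nodes (undirected)
    weight : dict mapping node -> W(m)
    seeds  : nodes assumed active at hop 0 (e.g. past viewers)
    """
    active = set(seeds)
    visited = set(seeds)      # nodes already accessed, never re-evaluated
    frontier = list(seeds)
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor in visited:
                    continue  # skip nodes accessed in an earlier step
                visited.add(neighbor)
                if weight.get(neighbor, 0.0) > threshold:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        if not next_frontier:
            break             # the spread has died out
        frontier = next_frontier
    return active


if __name__ == "__main__":
    graph = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b"], "d": ["a"]}
    weight = {"a": 9.0, "b": 5.0, "c": 6.0, "d": 1.0}
    active = threshold_spread(graph, weight, seeds=["a"], max_hops=2)
    print(len(active))  # NumActiveNodes: 3
```

In the toy graph, node d is accessed but stays inactive because its weight is below the Threshold, so the influence spread is {a, b, c}.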
D) Complexity Analysis on a Small
Network vs a Large Network
 Let d_i be the number of neighbors of node i.
 P(W(m) > Threshold) is the probability that the computed W(m) in
Eq. 2 is greater than the Threshold in Eq. 3.
 P(notRepeat) is the probability that the node has not been accessed
or activated in the past.
 The computation complexity is expressed in terms of these quantities.

 Since the above equations follow a geometric
progression, the sum of all the m terms can be
calculated in closed form.
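For reference, the standard closed form for the sum of a geometric progression is given below. The first term a and common ratio r are placeholder symbols; the transcript does not preserve how the slide defines them in terms of the node degree and the two probabilities.

```latex
S_m \;=\; \sum_{k=0}^{m-1} a\,r^{k} \;=\; a\,\frac{1 - r^{m}}{1 - r}, \qquad r \neq 1
```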
 Let N be the number of hops (number of iterations).
 The number of computations increases as a power law of N
as N is increased.
 The value of P(notRepeat) decays as more and more nodes
get activated.
 It is more advantageous to simulate in a small network.
 The highlighted circle represents a small-network observation
that shows a spread pattern similar to that of the large network.
3.Simulation results
A) Determining Global Threshold
 Effect of changing the Threshold on NumActiveNodes
B) Power Law behavior of the Facebook
Dataset
 Plot of Node Degree vs Number of Nodes in linear scale
 Plot of Node Degree vs Number of Nodes in log scale
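One common way to check for power-law behavior is to fit a straight line to the degree histogram on log-log axes. The sketch below illustrates that check under stated assumptions: the helper names `degree_histogram` and `loglog_slope` are hypothetical, the data is synthetic, and a simple least-squares fit is used here for illustration rather than as the authors' method.

```python
import math
from collections import Counter


def degree_histogram(graph):
    """Map each node degree to the number of nodes having that degree."""
    return Counter(len(set(neighbors)) for neighbors in graph.values())


def loglog_slope(histogram):
    """Least-squares slope of log(count) against log(degree).

    Needs at least two distinct positive degrees with nonzero counts.
    """
    points = [(math.log(d), math.log(c))
              for d, c in histogram.items() if d > 0 and c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic histogram following count = 1000 * degree^-2.
    synthetic = {1: 1000.0, 2: 250.0, 4: 62.5, 8: 15.625}
    print(f"{loglog_slope(synthetic):.3f}")  # -2.000
```

A roughly constant negative slope on the log-scale plot is what distinguishes a power-law degree distribution from the sharply bent curve seen on the linear-scale plot.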
D) Transient spread simulation
compared with YouTube data
 Normalized view count for FTSM simulation (in red) and YouTube data
(in blue) for top 9 viral videos in the Facebook Dataset
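To put a simulated spread and measured view counts on the same axis, each series has to be normalized. The sketch below assumes normalization by the peak value of each series; the slides do not state which normalization was used, and the function name `normalize` and the sample numbers are illustrative only.

```python
def normalize(series):
    """Scale a series of counts so that its largest value becomes 1.0."""
    peak = max(series)
    if peak == 0:
        return [0.0 for _ in series]  # avoid division by zero on an empty spread
    return [value / peak for value in series]


if __name__ == "__main__":
    simulated_active_nodes = [0, 50, 100]      # made-up NumActiveNodes per time step
    measured_view_counts = [0, 4000, 10000]    # made-up cumulative views per time step
    print(normalize(simulated_active_nodes))   # [0.0, 0.5, 1.0]
    print(normalize(measured_view_counts))     # [0.0, 0.4, 1.0]
```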
4.Future work
 FTSM for a large network of a few million nodes results in
very long execution time.
 This paper is able to show that a small network’s spread pattern
is similar to that of a large network.
 A large network can be partitioned into multiple small
networks (e.g., Hong Kong).
5.Conclusion
 The Fast Threshold Spread Model (FTSM) was used to
perform fast prediction of multi-media content propagation
based on the social information of its past viewers.
 This can be a solution to the cache management challenges
when prioritizing.
Thank you