The Emergence of Conventions in Online Social Networks
Farshad Kooti, Haeryun Yang, Meeyoung Cha, Krishna Gummadi, and Winter Mason

Metric / Imperial

Linguistic conventions
Hey / Good day / Hello / How’s it going

Why retweeting convention?
o Information-sharing channels are explicit in Twitter
o Specific to Twitter: exposures within the community
o Contained in Twitter, hence capturing all usages

Twitter dataset
o Crawled near-complete data from 03-2006 to 09-2009
  - 54 million users
  - 1.9 billion tweets
  - 1.7 billion follow links
o Follow links are a snapshot of the network

The retweeting variations
o Searched for syntax: token @username
o “Adopter” refers to a user using the variation at least once

Variation           # of adopters   # of retweets
RT                  1,836 K         53,221 K
via                 751 K           5,367 K
Retweeting          50 K            296 K
Retweet             36 K            110 K
HT                  8 K             22 K
R/T                 5 K             28 K
♻ (recycle icon)    3 K             18 K
Total               2,059 K         59,065 K

Origins
Early adopters
Majority acceptance

What are the very first use cases?
[Timeline of first uses: Via Mar’07, HT Oct’07, Retweet Nov’07, Retweeting Jan’08, RT Jan’08, R/T Jun’08, ♻ Sep’08]

Via started from natural language
@JasonCalacanis (via @kosso) - new Nokia N-Series phones will do Flash, Video and YouTube

HT started from blog communities
The Age Project: how old do I look? http://tweetl.com/21b ( HT @technosailor )

The first Twitter-specific variation
Retweet @HealthyLaugh she is in the Boston Globe today, for a Stand up show she’s doing tonight. Add the funny lady on Tweeter!

RT was an adaption to constraints
RT @BreakingNewsOn: "LV Fire Department: No major injuries and the fire on the Monte Carlo west wing contained east wing nearly contained."
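
The token search described on the retweeting-variations slide (a variation token followed by @username) could be sketched roughly as below. This is only an illustration: the exact matching rules used for the dataset are not given in the slides, so the case-insensitive match, the optional colon, and the boundary check are all assumptions, and the names `VARIATIONS` and `find_variations` are hypothetical.

```python
import re

# Variation tokens taken from the slides; "\u267b" is the recycle icon.
VARIATIONS = ["RT", "via", "Retweeting", "Retweet", "HT", "R/T", "\u267b"]


def _pattern(token):
    """Build a regex for 'token @username' (assumed matching rule)."""
    core = re.escape(token)
    if token[0].isalnum():
        # Assumption: a word-like token must not be glued to a preceding
        # word character or slash, so "alert @x" is not counted as "RT".
        core = r"(?<![\w/])" + core
    return re.compile(core + r"\s*:?\s*@(\w+)", re.IGNORECASE)


PATTERNS = {v: _pattern(v) for v in VARIATIONS}


def find_variations(tweet):
    """Return (variation, username) pairs found in the tweet text."""
    hits = []
    for variation, pattern in PATTERNS.items():
        for match in pattern.finditer(tweet):
            hits.append((variation, match.group(1)))
    return hits


if __name__ == "__main__":
    print(find_variations("@JasonCalacanis (via @kosso) - new Nokia N-Series phones"))
    print(find_variations('RT @BreakingNewsOn: "LV Fire Department: ..."'))
```

Under these assumptions, a user would count as an adopter of a variation once `find_variations` returns that variation for at least one of their tweets.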

Some start from explicit discussions
♻ @ev of @biz re: twitterkeys ★ http://twurl.nl/fc6trd

Origins
Early adopters
Majority acceptance

Early adopters are more tech-savvy
[Figure: random users vs. early adopters]

Early adopters are more innovative
Early adopters   Random users
Has Bio, Profile Pic, Changed profile theme, Has Location, Has Lists, Has URL
94% 99% 25% 50% 91% 40% 95% 57% 85% 36% 4% 14%

Early adopters are more popular
• Much higher number of followers
• 80% of early adopters are in the top 1% based on PageRank

Defining the diffusion network
o Each adopter is a node in the graph.
o There is a link from A to B if A was exposed to the variation by B.

Diffusion network of first 500 adopters of Retweet

Diffusion network of first 500 adopters of RT

Early adopter network
o Average number of exposures: 2.9 – 6.4
o Average clustering coefficient: 0.233 – 0.320
o Criticality (fraction of users who were only exposed because of the most critical user): 0.5% – 4.9%
Early adopters’ diffusion networks are dense and clustered. There is no single critical user.

Conventions had different spread patterns from URLs
o URLs’ early adopters are not necessarily core users
o The diffusion network is not dense and clustered
o There are critical users in the process

Origins
Early adopters
Majority acceptance

Some variations are growing and some dying
Variations have different growth rates
Only two variations became dominant at the end
[Figure labels: RT, via]

WHY DID RT BECOME THE MOST POPULAR?

From micro- to macro-
• Retweeting conventions are specific to Twitter; therefore, the dominance of one convention is likely a result of local decisions
• What are the factors that lead individuals to choose one convention over another?
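
As a rough illustration of the diffusion-network slides above (each adopter is a node, with a link from A to B if A was exposed to the variation by B), here is a minimal Python sketch. It rests on assumptions the slides do not state: exposure is taken to mean "A follows B and B adopted earlier than A", exposures are counted as distinct exposing users, and criticality is simplified to direct links only. All function and variable names are hypothetical.

```python
from collections import defaultdict


def build_diffusion_network(follows, adoption_time):
    """Map each adopter to the set of adopters who exposed them.

    follows: iterable of (follower, followee) pairs.
    adoption_time: dict of user -> time of first use of the variation.
    Assumption: A is exposed by B if A follows B and B adopted before A.
    """
    exposed_by = {user: set() for user in adoption_time}
    for follower, followee in follows:
        if follower in adoption_time and followee in adoption_time:
            if adoption_time[followee] < adoption_time[follower]:
                exposed_by[follower].add(followee)
    return exposed_by


def average_exposures(exposed_by):
    """Mean number of distinct exposing users, over adopters with any exposure."""
    exposed = [sources for sources in exposed_by.values() if sources]
    if not exposed:
        return 0.0
    return sum(len(sources) for sources in exposed) / len(exposed)


def criticality(exposed_by):
    """Fraction of adopters whose only exposure came from the single most
    critical user (simplified: direct links only, no transitive effects)."""
    if not exposed_by:
        return 0.0
    sole_source_count = defaultdict(int)
    for sources in exposed_by.values():
        if len(sources) == 1:
            sole_source_count[next(iter(sources))] += 1
    return max(sole_source_count.values(), default=0) / len(exposed_by)


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    follows = {("b", "a"), ("c", "a"), ("c", "b"), ("d", "c")}
    adoption_time = {"a": 1, "b": 2, "c": 3, "d": 4}
    network = build_diffusion_network(follows, adoption_time)
    print(network, average_exposures(network), criticality(network))
```

A low criticality value in this sense would correspond to the slide's observation that there is no single critical user.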

Convention prediction problem
Suppose we are given a social network with records of users and times of adoptions, but information about which variation was adopted by user u at time t is hidden. How reliably can we infer which variation v the user chose to adopt?

Hypotheses
• First-mover advantage
• Fastest-mover advantage
• Influencers
  – RT was adopted by more popular users, leading to a faster spread of the convention
• Tipping point
  – People were exposed to RT more
  – People had more friends who adopted RT

First-mover advantage

Fastest-mover advantage

Influencers

Tipping point: number of exposures
• Only 2.49% of the adopters were exposed to exactly one variation.
• However, 72.68% adopted only a single variation. Most of the others (24.47%) switched from one variation to another once or used two different variations back and forth.
• Of those who adopted RT, 80.40% had been exposed to RT the most frequently.
• For the other variations, only 6.17%–30.39% of adopters had been exposed to that variation more than any other variation.
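
The exposure statistic in the bullets above (the share of a variation's adopters who had been exposed to that variation more than to any other) could be computed along these lines. This is a sketch under assumptions: the input layout, the function names, and the treatment of ties (a tie means no single most-exposed variation) are illustrative choices, not details from the slides, and the sample data is made up.

```python
from collections import Counter, defaultdict


def most_exposed(exposure_counts):
    """Return the variation with strictly the most exposures, else None.

    exposure_counts: dict of variation -> number of exposures for one adopter.
    """
    ranked = Counter(exposure_counts).most_common(2)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no variation was seen more than every other
    return ranked[0][0]


def share_most_exposed(adopters):
    """For each adopted variation, the fraction of its adopters for whom
    that variation was also the most frequently seen one.

    adopters: iterable of (adopted_variation, exposure_counts) pairs.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for adopted, counts in adopters:
        total[adopted] += 1
        if most_exposed(counts) == adopted:
            hits[adopted] += 1
    return {variation: hits[variation] / total[variation] for variation in total}


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    sample = [
        ("RT", {"RT": 5, "via": 1}),
        ("RT", {"RT": 2, "via": 2}),
        ("via", {"RT": 4, "via": 1}),
        ("via", {"via": 3}),
    ]
    print(share_most_exposed(sample))
```

The same `most_exposed` helper doubles as a naive predictor for the convention prediction problem: guess that each user adopts whichever variation they saw most.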

Tipping point: number of friends

Multinomial modeling

Variation          Baseline   Accuracy   Precision   Recall   AUC
RT                 0.682      0.712      0.728       0.92     0.681
via                0.72       0.726      0.521       0.237    0.666
Retweeting         0.981      0.98       0.431       0.177    0.905
Retweet            0.986      0.985      0.343       0.084    0.801
HT                 0.996      0.997      0.505       0.039    0.849
R/T                0.998      0.998      0.19        0.001    0.815
recycle icon       0.998      0.999      0.359       0.009    0.823
Weighted average   0.704      0.726      0.657       0.698    0.683

Convention-specific predictions

Variation          Accuracy   Precision   Recall   AUC
RT                 0.613      0.607       0.631    0.662
via                0.607      0.606       0.601    0.65
Retweeting         0.591      0.589       0.618    0.641
Retweet            0.569      0.566       0.566    0.604
HT                 0.823      0.828       0.815    0.896
R/T                0.773      0.77        0.772    0.838
recycle symbol     0.815      0.831       0.802    0.898
Weighted average   0.61       0.607       0.615    0.657

Redefining friend or adoption

Predictive power: Individual Features

Rank   Feature                           Type       χ2
1      date of adoption                  Global     300,666
2      # of exposures to RT              Social     106,627
3      # of posted URLs                  Personal   80,728
4      # of exposures to via             Social     64,160
5      join date of the adopter          Personal   48,071
6      # of posted tweets                Personal   44,523
7      # of RT-adopter friends           Social     44,079
8      # of exposures to Retweeting      Social     43,100
9      # of exposures to HT              Social     42,807
10     # of exposures to Retweet         Social     36,604
11     # of exposures to recycle         Social     32,889
12     in-degree of the adopter          Personal   30,762
13     # of HT-adopter friends           Social     24,338
14     out-degree of the adopter         Personal   24,338
15     # of exposures to R/T             Social     23,370
16     # of Retweeting-adopter friends   Social     8,141
17     # of Retweet-adopter friends      Social     7,507
18     country of the adopter            Personal   4,476
19     # of via-adopter friends          Social     4,429
20     # of R/T-adopter friends          Social     203
21     # of recycle-adopter friends      Social     67

Transition probabilities

Switch-out over time
[Figure: switch-out probability (%) over time, Oct’08 to Sep’09, for RT, via, Retweet, Retweeting, HT, recycle, and R/T]

Hypotheses, take 2
Global popularity is more important than local activity in the decision to adopt
• People have access to global signals that influence their decision to adopt
• RT was invented when new users were joining most quickly, so it edged out the others (Matthew effect)
• RT had cultural and ecological fit (inherent advantage)

Testing Hypotheses, take 2
• Different methodologies to understand factors related to adoption of social conventions
• Additional data sets that capture emergence of social conventions
• Simulations!

Summary
o Conventions emerged in an organic, bottom-up manner
o Early adopters were core members of the community: active, tech-savvy, popular, and innovative
o Social conventions start spreading through dense and clustered networks and there is no critical user
o When variations got popular, they reached outside of the core community

Summary
o The final reach of the retweeting variations is not the direct result of either the amount of time each variation had to grow or the rate at which it grew.
o Nearly all adopters had been exposed to the convention through their friends on Twitter, suggesting that social relations play an important role in the adoption process.
o However, the decision to adopt a particular variation has more to do with the global popularity of the variation than with its popularity in the local neighborhood or personal preference.

Thank you!
@winteram @farshadkt @nicesea @nekozzang @kgummadi