Measuring user influence in Twitter: The million follower fallacy

{{Summary
 * title=Measuring User Influence in Twitter: The Million Follower Fallacy
 * authors=Meeyoung Cha, Hamed Haddadi, Fabricio Benevenuto, Krishna P. Gummadi
 * url=http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewPaper/1538
 * tags=sociology, Twitter, influence, followers, mentions, retweets
 * summary=This paper empirically validates Avnit 2009's "million follower fallacy"--that follower count is not the best measure of influence in Twitter. Retweets and mentions, not just a large number of followers, indicate influence, and the top users by number of followers, retweets, and mentions do not overlap significantly. Influence is gained through concerted effort, such as covering a single topic, especially one related to a catastrophic event.

Through an empirical analysis of indegree (follower count), retweets, and mentions in a large Twitter corpus, the authors study the types and degrees of influence.

Corpus
The corpus contains 2 billion follow links among 54 million users, who produced a total of 1.7 billion tweets. The crawl began in August 2009 and covered all public accounts. "We gathered information about a user’s follow links and all tweets ever posted by each user since the early days of the service." Twitter cooperated with the crawl by whitelisting the authors' IP address range (58 servers).

The main analysis filtered these to a subset of 6,189,636 active users (10+ tweets) with valid screen names, from the large connected component of the network.

Different kinds of influentials
Further analysis looked at the "all time influentials": the top 100 users by each of followers, retweets, and mentions (233 distinct users, because the three lists overlap). Users are ranked under each measure, and Spearman’s rank correlation coefficient is used to compare the relative order of ranks across measures.
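
As a rough illustration of that comparison, the sketch below computes Spearman's rank correlation between two influence measures for the same set of users. It is a minimal sketch, not the authors' code: the function names `average_ranks` and `spearman`, the toy `followers` and `retweets` dictionaries, and the choice to give tied users their average rank are all assumptions made for this example.

```python
from typing import Dict


def average_ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank users so that rank 1 is the highest score; ties share the average rank."""
    ordered = sorted(scores, key=lambda user: scores[user], reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of users tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of the 1-based positions i+1..j+1
        for user in ordered[i:j + 1]:
            ranks[user] = shared
        i = j + 1
    return ranks


def spearman(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    users = sorted(set(a) & set(b))  # only users present under both measures
    if len(users) < 2:
        raise ValueError("need at least two users in common")
    ranks_a = average_ranks({u: a[u] for u in users})
    ranks_b = average_ranks({u: b[u] for u in users})
    xs = [ranks_a[u] for u in users]
    ys = [ranks_b[u] for u in users]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("all users are tied under one of the measures")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: follower counts and retweet counts for four users.
followers = {"a": 1000, "b": 500, "c": 100, "d": 10}
retweets = {"a": 5, "b": 50, "c": 20, "d": 1}
rho = spearman(followers, retweets)
print(f"Spearman correlation between follower and retweet ranks: {rho:.2f}")
```

A value near 1 means the two measures order users almost identically, while a value near 0 means that a user's rank under one measure says little about the rank under the other.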

They also looked at topical influence, based on three popular events of 2009: the Iranian presidential election, the outbreak of the H1N1 influenza, and the death of Michael Jackson.

The analysis of topical influence covered users who discussed all three topics (2% of all users, or 13,219 users). The authors also studied users who discussed just a single topic, deeming the 20 top users for each topic (by follower count) "topical influencers". Limiting tweets to a single topic was a good strategy for increasing influence, and catastrophic events increased retweets and mentions.
 * relevance=Relevant to social media marketing. For instance, they conclude that high and moderate influentials are the best choices for advertising, since they can be effective in spreading messages on multiple topics and since influence follows a power law.

Various facts about Twitter as of August 2009: }}
 * Private accounts were nearly 8% of accounts.
 * Number of registered Twitter users had quadrupled since the start.
 * The network consisted of a single large connected component (94.8% of crawled users), singletons (5%), and smaller components (0.2%).
 * "A single most popular influential user spawned up to 20,000 retweets and 50,000 mentions over a random 15 day period."
 * journal=ICWSM 2010
 * pub_date=2010
 * subject=Computer Science
 * pub_open_access=yes