SG elections – a Twitter snapshot

With election fever running high in Singapore, let’s take a snapshot of the popular tweets carrying the hashtag #ge2015 at this point in time.

library(twitteR)

consumerKey <- readLines("twitterkey.txt")
consumerSecret <- readLines("twittersecret.txt")
accessToken <- readLines("twitteraccesstoken.txt")
accessTokenSecret <- readLines("twitteraccesstokensecret.txt")

setup_twitter_oauth(consumerKey,consumerSecret,accessToken,accessTokenSecret)
## [1] "Using direct authentication"
# Request up to 100 popular tweets; the search may return fewer (31 in this run)
tweets <- searchTwitter("#ge2015", resultType="popular", n=100)

tweetsdf <- twListToDF(tweets)

library(dplyr)
tweetsdf <- tbl_df(tweetsdf)
glimpse(tweetsdf)
## Observations: 31
## Variables:
## $ text          (chr) "#GE2015: PAP candidate Koh Poh Koon performs CP...
## $ favorited     (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ favoriteCount (dbl) 113, 32, 58, 49, 7, 16, 19, 2, 8, 6, 8, 168, 17,...
## $ replyToSN     (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ created       (time) 2015-09-06 04:43:24, 2015-09-06 04:01:11, 2015-...
## $ truncated     (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ replyToSID    (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ id            (chr) "640384783080538112", "640374159881601025", "640...
## $ replyToUID    (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ statusSource  (chr) "<a href=\"https://about.twitter.com/products/tw...
## $ screenName    (chr) "STcom", "TODAYonline", "YamKeng", "YamKeng", "T...
## $ retweetCount  (dbl) 293, 67, 75, 92, 45, 24, 12, 11, 10, 17, 9, 476,...
## $ isRetweet     (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ retweeted     (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ longitude     (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ latitude      (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...

Let’s first look at the top contributors of these popular tweets.

# Count popular tweets per user; some users have several
tweetsdf %>% select(screenName) %>%
  group_by(screenName) %>%
  summarise(count=n()) %>%
  arrange(desc(count))
## Source: local data frame [7 x 2]
## 
##     screenName count
## 1      mrbrown     9
## 2        STcom     6
## 3  TODAYonline     6
## 4         wpsg     6
## 5      YamKeng     2
## 6 LizforLeader     1
## 7 PAPSingapore     1

From the counts, we can see that the largest share of popular tweets in this snapshot (9 of 31) comes from the user ‘mrbrown’, the Twitter handle of blogger Mr Brown.

The outlier here is the user ‘LizforLeader’, the official Twitter account of Liz Kendall’s campaign to become leader of the Labour Party in the UK. Since this account has nothing to do with the Singapore election, it is filtered out in the steps that follow.
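For readers without Twitter credentials, the per-user tally can be reproduced offline. The following is a minimal sketch, assuming a toy data frame (`toy`, a hypothetical stand-in for `tweetsdf`) rebuilt from the counts printed earlier. It uses dplyr's `count()` as a shorter equivalent of the `group_by()` and `summarise()` pair, then drops the unrelated account.

```r
library(dplyr)

# Toy stand-in for tweetsdf (hypothetical), rebuilt from the per-user
# counts printed earlier; the real data needs live Twitter credentials.
toy <- data.frame(
  screenName = rep(
    c("mrbrown", "STcom", "TODAYonline", "wpsg",
      "YamKeng", "LizforLeader", "PAPSingapore"),
    times = c(9, 6, 6, 6, 2, 1, 1)
  ),
  stringsAsFactors = FALSE
)

# count() groups by screenName and tallies the rows in a column named n;
# sort = TRUE orders the result by descending count.
userCounts <- toy %>% count(screenName, sort = TRUE)

# Drop the account that is unrelated to the Singapore election.
sgCounts <- userCounts %>% filter(screenName != "LizforLeader")
sgCounts
```

Note that `count()` names its tally column `n` rather than `count`, so any later step that refers to the column would need to use that name.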

# retweets of popular tweets per user
retweetsByUsers <- tweetsdf %>% 
  select(screenName, retweetCount) %>%
  filter(screenName != 'LizforLeader') %>%
  group_by(screenName) %>%
  summarise(retweetCount = sum(retweetCount)) %>%
  arrange(desc(retweetCount))

retweetsByUsers %>%
    mutate(percentage = round(retweetCount/sum(retweetCount)*100, digits=2))
## Source: local data frame [6 x 3]
## 
##     screenName retweetCount percentage
## 1      mrbrown         2118      61.34
## 2        STcom          449      13.00
## 3  TODAYonline          347      10.05
## 4 PAPSingapore          220       6.37
## 5      YamKeng          167       4.84
## 6         wpsg          152       4.40

If we look at the combined retweet counts per user, we can see that the majority of retweets in this snapshot (61.34%) are retweets of Mr Brown’s tweets.
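As a quick check of the arithmetic, the percentages can be recomputed in base R from the per-user totals in the table above. This is only a sketch: the vector below is typed in by hand from that table rather than pulled from Twitter.

```r
# Per-user retweet totals copied by hand from the table above.
retweetCount <- c(mrbrown = 2118, STcom = 449, TODAYonline = 347,
                  PAPSingapore = 220, YamKeng = 167, wpsg = 152)

# Total retweets across the six remaining users.
total <- sum(retweetCount)

# Each user's share of the total, as a percentage rounded to 2 decimals.
percentage <- round(retweetCount / total * 100, digits = 2)
percentage
```

The total works out to 3453 retweets, so Mr Brown’s share is 2118 / 3453, which rounds to 61.34%.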

Let’s now look at the top 5 tweets by retweet count, in descending order, at this point in time.

# Top 5 tweets by highest number of retweets 
top5Tweets <- tweetsdf %>% select(screenName,id,retweetCount) %>%
  filter(screenName != 'LizforLeader') %>%
  arrange(desc(retweetCount)) %>% 
  top_n(5)
## Selecting by retweetCount
top5Tweets
## Source: local data frame [5 x 3]
## 
##   screenName                 id retweetCount
## 1    mrbrown 639654658827423744          476
## 2    mrbrown 638647979021234176          415
## 3      STcom 640384783080538112          293
## 4    mrbrown 639105775214792706          274
## 5    mrbrown 638705015981379586          224
# Direct Links to the top 5 tweets
paste("http://twitter.com/",top5Tweets$screenName,"/status/", top5Tweets$id, sep="")
## [1] "http://twitter.com/mrbrown/status/639654658827423744"
## [2] "http://twitter.com/mrbrown/status/638647979021234176"
## [3] "http://twitter.com/STcom/status/640384783080538112"  
## [4] "http://twitter.com/mrbrown/status/639105775214792706"
## [5] "http://twitter.com/mrbrown/status/638705015981379586"
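The link construction can also be tried without a live `top5Tweets` object. Here is a minimal sketch, assuming the screen names and ids are typed in by hand from the top-5 table; it uses `paste0()`, which behaves like `paste(..., sep="")`.

```r
# Screen names and ids copied by hand from the top-5 table above.
# The ids stay as character strings, matching the (chr) type that
# glimpse() reported for the id column.
screenName <- c("mrbrown", "mrbrown", "STcom", "mrbrown", "mrbrown")
id <- c("639654658827423744", "638647979021234176", "640384783080538112",
        "639105775214792706", "638705015981379586")

# paste0() concatenates its arguments element-wise with no separator.
links <- paste0("http://twitter.com/", screenName, "/status/", id)
links
```

Keeping the ids as strings matters here: these 18-digit values are larger than 2^53, beyond which a double can no longer represent every integer exactly, so converting them to numbers could silently alter a link.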

Unsurprisingly, 4 of the top 5 tweets came from Mr Brown. With a few more days to go before election day, the rankings will probably change. For now, you can view the top 5 tweets in this snapshot below.