With election fever running high in Singapore, let’s take a snapshot of the popular tweets carrying the hashtag #ge2015 at this point in time.
library(twitteR)
consumerKey <- readLines("twitterkey.txt")
consumerSecret <- readLines("twittersecret.txt")
accessToken <- readLines("twitteraccesstoken.txt")
accessTokenSecret <- readLines("twitteraccesstokensecret.txt")
setup_twitter_oauth(consumerKey,consumerSecret,accessToken,accessTokenSecret)
## [1] "Using direct authentication"
tweets <- searchTwitter("#ge2015", resultType="popular", n=100)
tweetsdf <- twListToDF(tweets)
library(dplyr)
tweetsdf <- tbl_df(tweetsdf)
glimpse(tweetsdf)
## Observations: 31
## Variables:
## $ text (chr) "#GE2015: PAP candidate Koh Poh Koon performs CP...
## $ favorited (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ favoriteCount (dbl) 113, 32, 58, 49, 7, 16, 19, 2, 8, 6, 8, 168, 17,...
## $ replyToSN (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ created (time) 2015-09-06 04:43:24, 2015-09-06 04:01:11, 2015-...
## $ truncated (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ replyToSID (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ id (chr) "640384783080538112", "640374159881601025", "640...
## $ replyToUID (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ statusSource (chr) "<a href=\"https://about.twitter.com/products/tw...
## $ screenName (chr) "STcom", "TODAYonline", "YamKeng", "YamKeng", "T...
## $ retweetCount (dbl) 293, 67, 75, 92, 45, 24, 12, 11, 10, 17, 9, 476,...
## $ isRetweet (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ retweeted (lgl) FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,...
## $ longitude (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ latitude (lgl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
Let’s first look at the top contributors of these popular tweets.
# There are users with several popular tweets
tweetsdf %>% select(screenName) %>%
group_by(screenName) %>%
summarise(count=n()) %>%
arrange(desc(count))
## Source: local data frame [7 x 2]
##
## screenName count
## 1 mrbrown 9
## 2 STcom 6
## 3 TODAYonline 6
## 4 wpsg 6
## 5 YamKeng 2
## 6 LizforLeader 1
## 7 PAPSingapore 1
From the counts, we can see that amongst the popular tweets in this snapshot, the highest number comes from the user ‘mrbrown’, the Twitter handle of blogger Mr Brown.
An outlier here is the user ‘LizforLeader’, the official Twitter account of Liz Kendall’s campaign to become leader of the Labour Party in the UK. Since it has nothing to do with the Singapore election, it is filtered out in the steps that follow.
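As a quick cross-check on the tally above, the per-user counts can be rebuilt in base R from the printed output. This is only a sketch: the screen-name vector is reconstructed from the counts shown in this snapshot rather than fetched from Twitter, and the names `counts`, `screenNames`, `tally` and `kept` are illustrative.

```r
# Per-user counts copied from the printed output above
counts <- c(mrbrown = 9, STcom = 6, TODAYonline = 6, wpsg = 6,
            YamKeng = 2, LizforLeader = 1, PAPSingapore = 1)

# Rebuild one screen name per tweet (a reconstruction, not the live data)
screenNames <- rep(names(counts), times = counts)

# Base R counterpart of group_by() + summarise(count = n()) + arrange(desc(count))
tally <- sort(table(screenNames), decreasing = TRUE)
print(tally)

# Total number of tweets; should agree with the observations reported by glimpse()
print(length(screenNames))

# Drop the off-topic account, as the later steps do with filter()
kept <- screenNames[screenNames != "LizforLeader"]
print(length(kept))
```

The seven counts add up to 31, which matches the number of observations reported by glimpse(), and removing ‘LizforLeader’ leaves 30 tweets from 6 users.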
# retweets of popular tweets per user
retweetsByUsers <- tweetsdf %>%
select(screenName, retweetCount) %>%
filter(screenName != 'LizforLeader') %>%
group_by(screenName) %>%
summarise(retweetCount = sum(retweetCount)) %>%
arrange(desc(retweetCount))
retweetsByUsers %>%
mutate(percentage = round(retweetCount/sum(retweetCount)*100, digits=2))
## Source: local data frame [6 x 3]
##
## screenName retweetCount percentage
## 1 mrbrown 2118 61.34
## 2 STcom 449 13.00
## 3 TODAYonline 347 10.05
## 4 PAPSingapore 220 6.37
## 5 YamKeng 167 4.84
## 6 wpsg 152 4.40
Looking at the combined retweet counts per user, Mr Brown’s tweets account for the majority of retweets in this snapshot (61.34%).
Let’s now look at the top 5 tweets by retweet count, in descending order, at this point in time.
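The percentage column can be verified by hand from the printed retweet counts. The sketch below redoes the arithmetic in base R; the names `retweets`, `total` and `share` are illustrative, and the values are copied from the table above rather than recomputed from the tweets.

```r
# Retweet totals per user, copied from the printed table above
retweets <- c(mrbrown = 2118, STcom = 449, TODAYonline = 347,
              PAPSingapore = 220, YamKeng = 167, wpsg = 152)

# Total retweets across the six remaining users
total <- sum(retweets)
print(total)

# Share of each user, rounded to two decimals as in the mutate() call
share <- round(retweets / total * 100, digits = 2)
print(share)
```

The total comes to 3453 retweets, of which 2118 belong to Mr Brown, which is where the 61.34% figure comes from.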
# Top 5 tweets by highest number of retweets
top5Tweets <- tweetsdf %>% select(screenName,id,retweetCount) %>%
filter(screenName != 'LizforLeader') %>%
arrange(desc(retweetCount)) %>%
top_n(5)
## Selecting by retweetCount
top5Tweets
## Source: local data frame [5 x 3]
##
## screenName id retweetCount
## 1 mrbrown 639654658827423744 476
## 2 mrbrown 638647979021234176 415
## 3 STcom 640384783080538112 293
## 4 mrbrown 639105775214792706 274
## 5 mrbrown 638705015981379586 224
# Direct Links to the top 5 tweets
paste("http://twitter.com/",top5Tweets$screenName,"/status/", top5Tweets$id, sep="")
## [1] "http://twitter.com/mrbrown/status/639654658827423744"
## [2] "http://twitter.com/mrbrown/status/638647979021234176"
## [3] "http://twitter.com/STcom/status/640384783080538112"
## [4] "http://twitter.com/mrbrown/status/639105775214792706"
## [5] "http://twitter.com/mrbrown/status/638705015981379586"
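The link-building step works because paste() is vectorised over its arguments. A small standalone sketch, using the screen names and ids from the table above, shows the same construction with paste0(); the helper name `tweetUrl` and the data frame `top5` are hypothetical. The ids are kept as character strings, as in the glimpse() output, since values of this size lie beyond the range in which R's double-precision numbers can represent every integer exactly.

```r
# Hypothetical helper: build a status URL from a screen name and a tweet id
tweetUrl <- function(screenName, id) {
  paste0("http://twitter.com/", screenName, "/status/", id)
}

# Screen names and ids copied from the top-5 table above (ids as character)
top5 <- data.frame(
  screenName = c("mrbrown", "mrbrown", "STcom", "mrbrown", "mrbrown"),
  id = c("639654658827423744", "638647979021234176", "640384783080538112",
         "639105775214792706", "638705015981379586"),
  stringsAsFactors = FALSE
)

# One call builds all five links, because paste0() recycles the fixed parts
urls <- tweetUrl(top5$screenName, top5$id)
print(urls)
```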
Not surprisingly, 4 of the top 5 tweets came from Mr Brown. With a few more days to go before election day, the results will probably change. For now, you can view the top 5 tweets in this snapshot below.
"It was a joke. I did not mean it to be taken seriously," Cheo later said. No, YOU are the joke. #sgelections #GE2015 pic.twitter.com/OImbDcY1bB
— mrbrown (@mrbrown) September 4, 2015
Wait, vote for WHO?! #sgelections #GE2015 https://t.co/JkhOEus2l1
— mrbrown (@mrbrown) September 1, 2015
#GE2015: PAP candidate Koh Poh Koon performs CPR on elderly man who collapsed at a AMK market. http://t.co/64RIqySZTH pic.twitter.com/NXSq54o3uY
— The Straits Times (@STcom) September 6, 2015
Why all close-up shots one? You don't have wide-angle lens to film WP Rally ah? #sgelections #GE2015 pic.twitter.com/FDqxQ6s7bD
— mrbrown (@mrbrown) September 2, 2015
More #YouHadOneJob #sgelections #GE2015 pic.twitter.com/of58G6sQAY
— mrbrown (@mrbrown) September 1, 2015