Wednesday, July 24, 2013

Archival, Analysis, and Visualization of #ISMBECCB 2013 Tweets

As the 2013 ISMB/ECCB meeting is winding down, I archived and analyzed the 2000+ tweets from the meeting using a set of bash and R scripts I previously blogged about.

The archive of all the tweets tagged #ISMBECCB from July 19-24, 2013 is, and will forever remain, here on GitHub. In the same repository you'll find some R code to parse this text and run the analyses below; it is explained in more detail in my previous blog post.

Number of tweets by date:

Number of tweets by hour:

Most popular hashtags, other than #ismbeccb. With separate hashtags for each session, this really shows which other SIGs and sessions were well-attended. It also shows the popularity of the unofficial ISMB BINGO card.

Most prolific users. I'm not sure who or what kind of account @sciencstream is - seems like spam to me.

And the obligatory word cloud:


  1. I wonder if there is a way of filtering out generic words from the word cloud, like "will", "see", "get", "us", "just", etc.

    1. The 'tm' package in R has the function stopwords(), which returns a collection of words like these, but it obviously doesn't catch everything.

  2. Something like:

    corpus <- tm_map(corpus, removeWords, c("will", "see", "get", "us", "just"))

    But I think it's okay anyway.
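For anyone who wants to try the filtering idea from the comments without installing anything, here is a minimal sketch in base R. It is not the code from the repository: the sample tweets, the `generic_words` vector, and the tokenizing rule are all assumptions made up for illustration, and the real archive may need different parsing.

```r
# Hypothetical sample tweets; the real archive format is not assumed here.
tweets <- c(
  "Will see you at the #ISMBECCB poster session",
  "Just get us more coffee before the keynote",
  "Great keynote on genome assembly at #ISMBECCB"
)

# Generic words to drop (an illustrative list; extend as needed).
generic_words <- c("will", "see", "get", "us", "just",
                   "the", "at", "on", "you", "more", "before")

# Lowercase, then split on anything that is not a letter, digit, or '#'.
tokens <- unlist(strsplit(tolower(tweets), "[^a-z0-9#]+"))
tokens <- tokens[nzchar(tokens)]

# Keep only the tokens that are not in the generic word list.
kept <- tokens[!tokens %in% generic_words]

# Word frequencies, most common first; a word cloud would be drawn from these.
word_freq <- sort(table(kept), decreasing = TRUE)
print(word_freq)
```

With the tm package, the same custom vector could be concatenated with stopwords("english") and passed to removeWords, so the built-in list and the extra words are removed in one step.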


