GitHub - samuelwjlee/emojicloud: Twitter emoji cloud generator built with D3 and Ruby on Rails to visualize public emotion.

EmojiCloud is a data visualization of emoji usage from around the world. We utilize the Twitter API to collect tweets, analyze the emoji usage trends, and present the data in a friendly cloud format. Similar to a word cloud, more common emojis are depicted larger in size.

LIVE DEMO

Technologies

Twitter API, Data-Driven-Documents(D3), and Google Maps API

EmojiCloud was built with Ruby on Rails on the backend and JavaScript on the frontend. The Twitter API was used to collect tweets and save them to a PostgreSQL database. D3 was used to render the emojis, and the Google Maps API was used to locate the source of the tweets.

Twitter Streaming API

Twitter provides two different APIs to access tweet information. The REST API uses secure tokens obtained via OAuth to request tweet data, with a number of filter options (location, time, etc.). The Twitter Streaming API gives access to tweets in real time but is limited in its filtering capabilities. We use this stream for our data. World tweets are taken directly from the open stream, while continent-specific tweets are obtained by filtering the stream for a specific geographic location. Each location is a bounding box defined by two corners: the longitude and latitude of the southwest corner, followed by those of the northeast corner.

EUROPE = '-46.230469, 35.666222, 73.300781, 75.906829'
N_AMERICA = '-165.058594, 18.552532, -58.535156, 72.151523'
AFRICA = '-22.675781, -38.899583, 52.910156, 35.092945'
ASIA = '60.996094, -49.217597, 171.386719, 48.392738'
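
To make the coordinate order concrete, here is a minimal sketch that parses one of these strings and tests whether a point falls inside the box. `parse_bounding_box`, `in_box?`, and `EUROPE_BOX` are hypothetical names used only for illustration; they are not part of the project, and the sketch assumes the box does not cross the 180th meridian.

```ruby
# Hypothetical helper: parse a "sw_lon, sw_lat, ne_lon, ne_lat" string
# (the format of the constants above) into a hash of floats.
def parse_bounding_box(locations)
  sw_lon, sw_lat, ne_lon, ne_lat = locations.split(',').map { |v| Float(v.strip) }
  { sw_lon: sw_lon, sw_lat: sw_lat, ne_lon: ne_lon, ne_lat: ne_lat }
end

# True when the point lies inside the box (edges included).
# Assumes the box does not cross the 180th meridian.
def in_box?(box, lon, lat)
  lon.between?(box[:sw_lon], box[:ne_lon]) && lat.between?(box[:sw_lat], box[:ne_lat])
end

EUROPE_BOX = parse_bounding_box('-46.230469, 35.666222, 73.300781, 75.906829')
in_box?(EUROPE_BOX, 2.35, 48.85)   # Paris (lon, lat)    # => true
in_box?(EUROPE_BOX, -74.0, 40.7)   # New York (lon, lat) # => false
```

Note that longitude comes before latitude in every pair, which is the reverse of how coordinates are usually spoken.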

Data Sets

Each EmojiCloud is rendered from a sample stream of 1000 tweets. We found that this number returned about 200 tweets with emojis, giving a better range of emojis in the sample. Overall, in our limited datasets, emojis appear in roughly 20% of tweets. Interestingly, in our samples from Africa emoji usage is much higher, at roughly 30% of tweets.

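# Collects tweets from the stream, filtered to the given bounding box.
def self.data(region)
  tweets = []
  client.filter(locations: region) do |tweet|
    tweets << tweet
    # Returning from inside the block exits the method and stops reading the stream.
    return tweets if tweets.count > 500
  end
end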

The data returned from the Streaming API is in JSON format. A limiting factor in creating a large enough data set to be statistically significant was the infrequency of emoji usage in tweets. For example, in order to collect a set of 300 emojis used in tweets, the stream might have to be open 5-10 seconds. For our front end user, waiting this long for the emoji cloud to refresh would be unacceptable. To solve this we increased responsiveness by...
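
As a rough illustration of the emoji share discussed in this section, the sketch below counts how many tweet texts contain at least one emoji. `SAMPLE_EMOJI_REGEX` and `emoji_share` are names made up for this example; the regex covers only a few common emoji code point ranges and is not the project's `EMOJI_REGEX`.

```ruby
# Simplified stand-in for the project's EMOJI_REGEX (not shown in this README);
# it only covers a few common emoji code point ranges.
SAMPLE_EMOJI_REGEX = /[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]/

# Fraction of tweet texts that contain at least one emoji (0.0 for an empty sample).
def emoji_share(texts)
  return 0.0 if texts.empty?
  texts.count { |text| text.match?(SAMPLE_EMOJI_REGEX) }.to_f / texts.size
end

sample = ['good morning', 'so funny 😂', 'lunch time', 'rain again', 'on fire 🔥']
emoji_share(sample)   # => 0.4
```

With a share near 20%, a sample of 1000 tweets yields about 200 tweets with emojis, which matches the figures above.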

Parsing Data

We parse the emoji data with a regular expression that picks out the emoji in each tweet's text. We then tabulate emoji frequencies in a hash, along with sample tweet data for each emoji. We were surprised to find that the most commonly used emoji seems to be the tears of joy emoji. It appears in almost every dataset we receive.

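def self.sort(arr)
  coordinates = []
  word_cloud = {}
  count = 0

  arr.each do |tweet|
    # Keep the tweet's coordinates when it is geotagged.
    coordinates << tweet.attrs[:coordinates][:coordinates] if tweet.attrs[:coordinates]
    # First emoji matched in the tweet text ("" when there is none).
    tweet2 = (EMOJI_REGEX.match(tweet.text)).to_s

    ...
  end
end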
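
The tabulation step elided above might look something like the following sketch. It is an assumption about the general shape, not the project's code: `tally_emojis`, `most_common`, and `SAMPLE_EMOJI_REGEX` are hypothetical names, the regex is a simplified stand-in for `EMOJI_REGEX`, and plain strings stand in for tweet objects.

```ruby
# Simplified stand-in for the project's EMOJI_REGEX (not shown in this README).
SAMPLE_EMOJI_REGEX = /[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]/

# Hypothetical tally: maps each emoji to its count and one sample tweet text.
# Like the match call above, it only looks at the first emoji in each text.
def tally_emojis(texts)
  texts.each_with_object({}) do |text, cloud|
    emoji = text[SAMPLE_EMOJI_REGEX]
    next if emoji.nil? # skip tweets without emoji

    entry = (cloud[emoji] ||= { count: 0, sample: text })
    entry[:count] += 1
  end
end

# Emoji with the highest count, or nil for an empty tally.
def most_common(cloud)
  cloud.max_by { |_emoji, entry| entry[:count] }&.first
end

cloud = tally_emojis(['lol 😂', 'so true 😂', 'on fire 🔥', 'no emoji here'])
most_common(cloud)   # => "😂"
```

Storing the first tweet seen for each emoji keeps one sample per entry without holding on to the whole data set.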

API bottleneck

Initially we planned to make requests on the spot when a link was clicked. We soon learned that collecting enough tweets takes far too long to be responsive to user requests. Instead, we created scheduled tasks on Heroku, our application platform, to retrieve the data on an hourly basis.
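
The idea of separating the slow fetch from the page request can be sketched as follows. This is only an illustration under stated assumptions: `CloudStore` is a hypothetical in-memory class, whereas the project stores its data in PostgreSQL and runs the fetch from a scheduled task.

```ruby
# Hypothetical in-memory store; the project saves tweets to PostgreSQL instead.
class CloudStore
  def initialize
    @clouds = {}
  end

  # Meant to be called by the hourly job: runs the slow fetch (the block)
  # and keeps its result together with a timestamp.
  def refresh(region, fetched_at: Time.now)
    @clouds[region] = { data: yield, fetched_at: fetched_at }
  end

  # Meant to be called per page request: returns the stored result
  # (nil if the job has not run yet) without touching the API.
  def latest(region)
    @clouds[region]
  end
end

store = CloudStore.new
store.refresh(:europe) { { '😂' => 12, '🔥' => 3 } } # stand-in for a slow stream read
store.latest(:europe)[:data] # the stored counts, returned immediately
```

Because `latest` never opens the stream, response time no longer depends on how long it takes to collect a sample.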

Cloud Visualization using Data-Driven-Documents(D3)

The emoji cloud visualization was implemented using the force graph from the d3.js library. In the force graph each emoji is represented by a node with its own attraction and repulsion forces. The node forces allow the emojis to space evenly and form an attractive cloud pattern. The nodes can be dragged to any point on the canvas and positions are recalculated on every "tick" (approximately 60 times a second).

We chose emojis because we felt they would be visually appealing and would make for an interesting data analysis.

Handling the Data

During the planning stages we were unaware of how popular some emojis are. In fact, in any given sample we gathered, the most popular emoji was up to 20 times more frequent than the median. To make the cloud visually appealing, we wanted to maximize the size of the emojis while still representing popularity to scale. We accomplished this with a roughly logarithmic scale: an emoji that is ten times as popular is roughly double the size, and one that is one hundred times as popular is roughly triple the size.
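
One scale with the property described above (10 times as popular gives double the size, 100 times gives triple) is a factor of 1 + log10(ratio). The sketch below shows that rule in Ruby for illustration only; the actual sizing happens in the JavaScript frontend, and `emoji_size`, `baseline`, and the 16-pixel `base_size` are assumptions made for this example.

```ruby
# Hypothetical sizing rule: scale factor 1 + log10(count / baseline).
# baseline is the count drawn at base_size; counts at or below it keep base_size.
def emoji_size(count, baseline: 1, base_size: 16.0)
  ratio = [count.to_f / baseline, 1.0].max
  base_size * (1 + Math.log10(ratio))
end

emoji_size(1)     # => 16.0
emoji_size(10)    # => 32.0
emoji_size(100)   # => 48.0
```

With this rule, an emoji 20 times more frequent than the baseline is drawn at about 2.3 times the base size rather than 20 times, so rare emojis stay legible.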

Collision Detection

We were able to implement a simple collision detection with a bounding box which prevents the emoji nodes from flying off the canvas. Ensuring that the nodes don't overlap each other was a more difficult task. An approximation of node collision bounding was achieved by increasing each node's repulsion charge. Going forward we'd like to implement a more robust collision detection to prevent overlap.
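
A bounding-box check of this kind can be sketched as a clamp on each node's position. The code below is a Ruby illustration rather than the project's frontend code: `Node`, `keep_on_canvas`, and `overlap?` are hypothetical names, nodes are assumed to be circles, and the canvas is assumed to be wider and taller than any node. The `overlap?` check is the sort of pairwise test a more robust collision detection could start from.

```ruby
# Hypothetical node: a center (x, y) and a radius, all in canvas pixels.
Node = Struct.new(:x, :y, :radius)

# Clamp the node's center so the whole circle stays inside the canvas.
# Assumes the canvas is at least one node diameter wide and tall.
def keep_on_canvas(node, width, height)
  node.x = node.x.clamp(node.radius, width - node.radius)
  node.y = node.y.clamp(node.radius, height - node.radius)
  node
end

# Two circular nodes overlap when their centers are closer than the sum of their radii.
def overlap?(a, b)
  Math.hypot(a.x - b.x, a.y - b.y) < a.radius + b.radius
end

node = keep_on_canvas(Node.new(-5, 300, 20), 800, 600)
node.x   # => 20
node.y   # => 300
```

Running the clamp on every tick keeps dragged or repelled nodes inside the canvas regardless of the forces acting on them.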

Features

The Emoji Cloud

A visual representation of the emojis used in tweets. The emoji "cloud" displays the relative frequency of each emoji on a logarithmic scale.

The EmojiCloud Team

EmojiCloud was designed and implemented by Mark Noizumi, Peter Delfausse, and Samuel Lee.

Mark and Sam put together the Rails backend, tamed the Twitter API, and formatted the data. Peter transformed the data into an emoji cloud on the frontend.

Samuel's Portfolio: http://www.theleesamuel.com

Mark's Portfolio: http://www.marknoizumi.com/

Peter's Portfolio: https://delisauce.github.io/
