ninajansen edited this page Sep 13, 2010 · 5 revisions

Welcome. This is the wiki for the “cloud” gem. Currently, we have the following pages:

This is the first blogpost about “cloud”

My motivation for this gem is that I love making wordclouds with wordle. However, there are a bunch of things about wordle that I don’t love. It’s closed source, so you can’t take it apart and hack it. Also, it only runs in your browser as a java app. I wanted to be able to generate wordclouds as a background process, for example from my rails app. Eventually, I want to be able to make a cloud out of anything, like a table in a database. None of this will work with a closed source, in-browser java project.

Now, I am not, nor is anyone else, permitted to reverse engineer wordle, and I have certainly not done that. I don’t know what algorithm wordle uses, but it is certainly much faster than the one I came up with. Cloud is slow, can’t insert words inside other words, and can only generate pdf files, which must be post-processed by some other code to convert them into other formats. The only redeeming quality is that it is open source, and I am really hoping that there are other people out there who would like to spend some time making cloud much better.

I use a 2D binpacking algorithm for cloud. To start with, I use pdf:writer’s built-in pdf functions to compute the size of the box that each word takes up. I add up the total area, add some extra just in case, and find the smallest paper size available in pdf that has at least this area. Since pdf is scalable, you can always transform your clouds to some uniform size later, if you want.

Once I have all the boxes, I take the largest and place it at the center of the canvas. I then add all the corners and the centers of the sides of the box to an array. I look through this array of points to find the one that is closest to the “center” of the canvas. Closeness to the center is measured by one of several distance functions, so it is possible to make clouds of different shapes.

Once I have found the point closest to the center, I take the next box and place its opposite point there (if the point is at the center of the righthand edge of the first box, I place the center of the lefthand edge of the second box at the point). Then I check if this placement is okay, i.e. if it doesn’t overlap with any other boxes. If it is okay, great, I have placed the second box and I can add its points to my “possible locations” array. If not, I try to place it at another point. I keep going until all boxes are placed. I clean up points regularly, i.e. I try to place the smallest possible wordbox at each point, and if it cannot be placed I erase the point from my “possible locations” array.
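The placement loop described above can be sketched roughly as follows. This is a minimal illustration, not the gem's actual code: the names `Box`, `ANCHORS`, `DISTANCES` and `place_boxes` are hypothetical, the three distance functions are only examples of what "several distance functions" could look like, and the sketch assumes the canvas center is the origin and that box sizes are already known. It skips the paper-size step and the periodic cleanup of dead points.

```ruby
# Hypothetical sketch of the greedy placement; not the gem's real implementation.

# Anchor points as fractions of width and height: four corners, four edge midpoints.
ANCHORS = [[0, 0], [0.5, 0], [1, 0], [0, 0.5], [1, 0.5], [0, 1], [0.5, 1], [1, 1]].freeze

# Example distance functions (assumed); each gives the cloud a different outline.
DISTANCES = {
  euclidean: ->(dx, dy) { Math.hypot(dx, dy) },    # roughly round cloud
  manhattan: ->(dx, dy) { dx.abs + dy.abs },       # roughly diamond-shaped cloud
  chebyshev: ->(dx, dy) { [dx.abs, dy.abs].max }   # roughly square cloud
}.freeze

Box = Struct.new(:w, :h, :x, :y) do
  # Candidate points on this box, each tagged with its fractional anchor.
  def points
    ANCHORS.map { |fx, fy| [x + fx * w, y + fy * h, fx, fy] }
  end

  # Strict comparison, so boxes that merely touch do not count as overlapping.
  def overlaps?(other)
    x < other.x + other.w && other.x < x + w &&
      y < other.y + other.h && other.y < y + h
  end
end

# sizes: array of [width, height] pairs. The canvas center is taken to be (0, 0).
def place_boxes(sizes, distance: :euclidean)
  dist = DISTANCES.fetch(distance)
  boxes = sizes.sort_by { |w, h| -(w * h) }.map { |w, h| Box.new(w, h) }

  # The largest box is centered on the canvas.
  first = boxes.first
  first.x = -first.w / 2.0
  first.y = -first.h / 2.0
  placed = [first]
  points = first.points

  boxes.drop(1).each do |box|
    # Try the possible locations closest to the center first.
    candidates = points.sort_by { |px, py| dist.call(px, py) }
    spot = candidates.find do |px, py, fx, fy|
      # Put the new box's opposite anchor (1 - fx, 1 - fy) on the point.
      box.x = px - (1 - fx) * box.w
      box.y = py - (1 - fy) * box.h
      placed.none? { |other| box.overlaps?(other) }
    end
    raise "no free point for #{box.w}x#{box.h} box" unless spot

    placed << box
    points.concat(box.points)
  end
  placed
end

place_boxes([[40, 10], [20, 8], [30, 12]]).each do |b|
  puts format("%dx%d at (%.1f, %.1f)", b.w, b.h, b.x, b.y)
end
```

Every new box is checked against every box already placed, and the list of candidate points keeps growing, so each placement costs more than the one before it.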

This algorithm gets slower and slower the more boxes (words) there are. The first 100 words are definitely faster to place than the next 100 words. 100 words take about a minute. I’ve sped up the algorithm by using ruby inline and writing some of the functionality in C. Still, you wouldn’t want to generate very large clouds this way. But if you want to put a wordcloud with trending topics on the front page of your website, you could easily run cloud periodically in the background.

I hope that someone would like to use this gem and contribute to it. You could add more color palettes or distance functions, incorporate it into rails, or just use it and tell me what you think.
