A Ruby gem that generates image thumbnails from a given URL. It ranks the images it finds and returns an object containing the images and information about the website. It works like the Facebook link previewer.
A demo application is available, and the source code of the demo application is published as well.
- Dead simple.
- Supports the OpenGraph protocol.
- Finds and sorts the images that best represent what the page is about.
- Finds and rates the description that best represents what the page is about.
- Lets you plug in a custom class to sort the website descriptions yourself.
- Supports image URL blacklisting (advertisements).
- Works with and without Rails.
- Fully customizable.
- Fully tested.
Add this line to your application's Gemfile:
gem 'link_thumbnailer'
And then execute:
$ bundle
Or install it yourself as:
$ gem install link_thumbnailer
If you are using Rails, you can generate the configuration file with:
$ rails g link_thumbnailer:install
This will add link_thumbnailer.rb to config/initializers/.
Run irb and require the gem:
require 'link_thumbnailer'
The gem handles regular websites as well as websites that use the OpenGraph protocol.
object = LinkThumbnailer.generate('http://stackoverflow.com')
=> #<LinkThumbnailer::Models::Website:...>
object.title
=> "Stack Overflow"
object.favicon
=> "//cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830"
object.description
=> "Q&A for professional and enthusiast programmers"
object.images.first.src.to_s
=> "http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon@2.png?v=fde65a5a78c6"
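Note that the favicon above is a protocol-relative URL (it starts with //). If you need an absolute URL, one option is to resolve it against the page URL with Ruby's standard URI library. This is only a sketch: absolute_url is a hypothetical helper name, not something the gem provides.

```ruby
require 'uri'

# Hypothetical helper (not part of link_thumbnailer): resolves a relative or
# protocol-relative URL against the URL of the page it was found on.
def absolute_url(page_url, candidate)
  URI.join(page_url, candidate).to_s
end

favicon = absolute_url(
  'http://stackoverflow.com',
  '//cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830'
)
puts favicon
# prints "http://cdn.sstatic.net/stackoverflow/img/favicon.ico?v=038622610830"
```

A URL that is already absolute passes through URI.join unchanged, so the helper can be applied to every URL without checking its form first.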
The LinkThumbnailer.generate method returns an instance of LinkThumbnailer::Models::Website that responds to to_json and as_json as you would expect:
object.to_json
=> "{\"url\":\"http://stackoverflow.com\",\"title\":\"Stack Overflow\",\"description\":\"Q&A for professional and enthusiast programmers\",\"images\":[{\"src\":\"http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon@2.png?v=fde65a5a78c6\",\"size\":[316,316],\"type\":\"png\"}]}"
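If the JSON is stored or sent to another process, it can be read back with Ruby's standard JSON library. The sketch below uses the JSON string from the output above and does not need the gem. The variable names are illustrative.

```ruby
require 'json'

# JSON string copied from the to_json output shown above.
json = '{"url":"http://stackoverflow.com","title":"Stack Overflow",' \
       '"description":"Q&A for professional and enthusiast programmers",' \
       '"images":[{"src":"http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon@2.png?v=fde65a5a78c6",' \
       '"size":[316,316],"type":"png"}]}'

preview = JSON.parse(json)
first_image = preview['images'].first
width, height = first_image['size']

puts preview['title']        # prints "Stack Overflow"
puts first_image['type']     # prints "png"
puts "#{width}x#{height}"    # prints "316x316"
```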
LinkThumbnailer comes with default configuration values. You can change the defaults by overriding them in a Rails initializer:
In config/initializers/link_thumbnailer.rb
LinkThumbnailer.configure do |config|
# Number of redirects allowed before raising an exception when trying to parse the given url.
#
# config.redirect_limit = 3
# Set user agent
#
# config.user_agent = 'link_thumbnailer'
# Enable or disable SSL verification
#
# config.verify_ssl = true
# The amount of time in seconds to wait for a connection to be opened.
# If the HTTP object cannot open a connection in this many seconds,
# it raises a Net::OpenTimeout exception.
#
# See http://www.ruby-doc.org/stdlib-2.1.1/libdoc/net/http/rdoc/Net/HTTP.html#open_timeout
#
# config.http_open_timeout = 5
# List of blacklisted urls you want to skip when searching for images.
#
# config.blacklist_urls = [
# %r{^http://ad\.doubleclick\.net/},
# %r{^http://b\.scorecardresearch\.com/},
# %r{^http://pixel\.quantserve\.com/},
# %r{^http://s7\.addthis\.com/}
# ]
# List of attributes you want LinkThumbnailer to fetch on a website.
#
# config.attributes = [:title, :images, :description, :videos, :favicon]
# List of procedures used to rate the website description. Add your custom class
# here. See the wiki for more details on how to build your own graders.
#
# config.graders = [
# ->(description) { ::LinkThumbnailer::Graders::Length.new(description) },
# ->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :class) },
# ->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :id) },
# ->(description) { ::LinkThumbnailer::Graders::Position.new(description, weight: 3) },
# ->(description) { ::LinkThumbnailer::Graders::LinkDensity.new(description) }
# ]
# Minimum description length for a website.
#
# config.description_min_length = 25
# Regex of words considered positive to rate website description.
#
# config.positive_regex = /article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/i
# Regex of words considered negative to rate website description.
#
# config.negative_regex = /combx|comment|com-|contact|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|shoutbox|sidebar|sponsor|shopping|tags|tool|widget|modal/i
# Number of images to fetch. Fetching too many images will be slow.
# Note that LinkThumbnailer will only sort the fetched images among themselves,
# meaning that there could be a "better" image on the page.
#
# config.image_limit = 5
# Whether you want LinkThumbnailer to return image size and type or not.
# Setting this value to false will increase performance since, for each image, LinkThumbnailer
# does not have to fetch its size and type.
#
# config.image_stats = true
#
# Whether you want LinkThumbnailer to raise an exception if the Content-Type of the HTTP request
# is not html or xml.
#
# config.raise_on_invalid_format = false
#
# Sets the number of concurrent http connections that can be opened to fetch image information such as size and type.
#
# config.max_concurrency = 20
# Sets the default encoding.
#
# config.encoding = 'utf-8'
end
Or at runtime:
object = LinkThumbnailer.generate('http://stackoverflow.com', redirect_limit: 5, user_agent: 'foo')
Note that runtime options will override default global configuration.
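The precedence can be pictured as a hash merge in which runtime options win over the defaults. The sketch below only illustrates that rule with values from the initializer above. It is not the gem's internal implementation, and DEFAULTS and effective_options are hypothetical names.

```ruby
# Default values copied from the commented initializer above.
DEFAULTS = {
  redirect_limit: 3,
  user_agent: 'link_thumbnailer',
  verify_ssl: true
}.freeze

# Hypothetical helper showing precedence only: runtime options override
# defaults, and defaults that are not overridden are kept.
def effective_options(runtime = {})
  DEFAULTS.merge(runtime)
end

options = effective_options(redirect_limit: 5, user_agent: 'foo')
puts options[:redirect_limit]  # prints 5
puts options[:user_agent]      # prints "foo"
puts options[:verify_ssl]      # prints true
```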
See Configuration Options Explained for more details on each configuration option.
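To get a feel for what the blacklist_urls patterns match, you can try them against a few URLs in plain Ruby. This is a standalone sketch, not the gem's filtering code. The blacklisted? helper and the candidate URLs are made up for illustration.

```ruby
# Patterns copied from the blacklist_urls default shown in the initializer.
BLACKLIST = [
  %r{^http://ad\.doubleclick\.net/},
  %r{^http://b\.scorecardresearch\.com/},
  %r{^http://pixel\.quantserve\.com/},
  %r{^http://s7\.addthis\.com/}
].freeze

# Hypothetical helper: true when the URL matches any blacklist pattern.
def blacklisted?(url, patterns = BLACKLIST)
  patterns.any? { |pattern| pattern.match?(url) }
end

# Made-up candidate image URLs.
candidates = [
  'http://ad.doubleclick.net/banner.gif',
  'http://example.com/images/cover.png',
  'https://ad.doubleclick.net/banner.gif'
]

kept = candidates.reject { |url| blacklisted?(url) }
puts kept
```

The https URL is kept because the patterns, as written, are anchored to http://. If you add your own patterns, a prefix such as %r{^https?://} would cover both schemes.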
LinkThumbnailer defines a list of custom exceptions you may want to rescue in your code. All the following exceptions inherit from LinkThumbnailer::Exceptions:
- RedirectLimit -- raised when the redirection threshold defined in the config is reached
- BadUriFormat -- raised when the url given is not a valid HTTP url
- FormatNotSupported -- raised when the Content-Type of the HTTP request is not supported (not html)
You can rescue any LinkThumbnailer exception using the following code:
begin
LinkThumbnailer.generate('http://foo.com')
rescue LinkThumbnailer::Exceptions => e
# do something
end
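Rescuing the base class works because each specific exception inherits from it. The sketch below shows the mechanism with stand-in classes so it runs without the gem or network access. The Demo module and preview_or_nil are hypothetical names, not part of LinkThumbnailer.

```ruby
# Stand-in hierarchy for illustration only. These Demo classes are NOT the
# gem's classes; they mirror the shape described above: one base exception
# with more specific exceptions inheriting from it.
module Demo
  class Exceptions < StandardError; end
  class RedirectLimit < Exceptions; end
  class BadUriFormat < Exceptions; end
end

# Hypothetical wrapper: runs the block and returns nil instead of raising
# when any exception from the Demo hierarchy occurs.
def preview_or_nil
  yield
rescue Demo::Exceptions => e
  warn "preview failed: #{e.class}"
  nil
end

result = preview_or_nil { raise Demo::BadUriFormat, 'not a valid HTTP url' }
puts result.inspect  # prints "nil"
```

With the gem loaded, the same pattern would presumably rescue LinkThumbnailer::Exceptions around a call to LinkThumbnailer.generate, as in the snippet above. Exceptions outside the hierarchy are not rescued and still propagate.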
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Run the specs (bundle exec rspec spec)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request