SitemapGenerator Usage

sambit edited this page Jan 2, 2016 · 29 revisions

SitemapGenerator Usage Examples

Please add your name, your site, and how many links are in your sitemap, and, if you feel like it, a small snippet of cool code showing how SitemapGenerator made your life easier.


Produced sitemaps with more than 280M links in about 4.3 hours by running the sitemap generation in parallel (parallel gem) across 4 cores.


....
Sitemap stats: 35,622,979 links / 713 sitemaps / 129m58s
Sitemap stats: 35,622,979 links / 713 sitemaps / 130m04s
Sitemap stats: 35,622,979 links / 713 sitemaps / 130m11s
Sitemap stats: 35,622,979 links / 713 sitemaps / 131m18s
....
15460.43 real     39030.90 user     14347.09 sys

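require 'parallel'
require 'sitemap_generator'

# `domains` is the list of host names to build sitemaps for (defined elsewhere).
# Run up to 4 worker processes, each handling one domain at a time.
Parallel.each(domains, :in_processes => 4) do |domain|
  SitemapGenerator::Sitemap.default_host = "http://#{domain}"
  # Give each domain its own output directory so the files do not collide.
  SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/#{domain}"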
  SitemapGenerator::Sitemap.adapter = SitemapGenerator::FileAdapter.new
  SitemapGenerator::Sitemap.create do
    add '/', changefreq: 'monthly', priority: 1.0
    add '/signup', changefreq: 'monthly', priority: 0.8
    add '/login', changefreq: 'monthly', priority: 0.8
    add '/about', changefreq: 'monthly', priority: 0.8
    add '/contact', changefreq: 'monthly', priority: 0.8
    add '/faq', changefreq: 'monthly', priority: 0.8
    add '/careers', changefreq: 'monthly', priority: 0.8
    add '/privacy', changefreq: 'monthly', priority: 0.8
    add '/terms', changefreq: 'monthly', priority: 0.8
    add '/password_resets/new', changefreq: 'monthly', priority: 0.64
    ...
  end
end
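
The numbers in the output above can be cross-checked with a little arithmetic. The sketch below assumes that the `real`/`user`/`sys` line is `time`-style output in seconds and that each sitemap file holds at most 50,000 URLs (the sitemap protocol limit). The helper names are made up for illustration and are not part of the gem.

```ruby
# Maximum number of URLs the sitemap protocol allows in one sitemap file.
MAX_LINKS_PER_SITEMAP = 50_000

# Number of sitemap files needed for a link count (integer ceiling division).
def sitemap_files_needed(links, per_file = MAX_LINKS_PER_SITEMAP)
  (links + per_file - 1) / per_file
end

# Wall-clock hours from the `real` seconds.
def wall_clock_hours(real_seconds)
  real_seconds / 3600.0
end

# Average number of busy cores: CPU time (user + sys) divided by wall-clock time.
def average_cores_busy(real_seconds, user_seconds, sys_seconds)
  (user_seconds + sys_seconds) / real_seconds
end

puts sitemap_files_needed(35_622_979)                               # => 713
puts wall_clock_hours(15_460.43).round(1)                           # => 4.3
puts average_cores_busy(15_460.43, 39_030.90, 14_347.09).round(2)   # => 3.45
```

The link-count estimate is only a lower bound: a file can also be closed early for other reasons (file size, image entries), so a run may report more sitemap files than this formula gives.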

Andrew Cetinick, www.sherpi.com, 233,939 links, 4m40s


Sitemap stats: 233,939 links / 5 sitemaps / 4m40s

A Rake task on Heroku pushes the sitemaps to an S3 bucket. I also added this route to my Rails app so that requests for the sitemaps are redirected to S3:

get '/sitemaps/:filename.xml.gz' => 'pages#sitemap'
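
The controller action behind that route is not shown. A minimal sketch of what it might do, assuming a hypothetical bucket host (`example-bucket`) and a hypothetical helper name (`sitemap_s3_url`), is to build the S3 URL from the `:filename` parameter and redirect to it:

```ruby
# Hypothetical bucket host; replace with the real S3 bucket used by the app.
SITEMAP_BUCKET_HOST = "https://example-bucket.s3.amazonaws.com"

# Builds the S3 URL that a request for /sitemaps/<filename>.xml.gz is sent to.
# Rejects names with unexpected characters so the route cannot be used to
# redirect to arbitrary paths.
def sitemap_s3_url(filename)
  unless filename.match?(/\A[\w-]+\z/)
    raise ArgumentError, "invalid sitemap name: #{filename.inspect}"
  end
  "#{SITEMAP_BUCKET_HOST}/sitemaps/#{filename}.xml.gz"
end

puts sitemap_s3_url("sitemap1")

# In a Rails controller the action could then look roughly like this
# (redirect_to issues a 302 by default; newer Rails versions may also
# need `allow_other_host: true` for an off-site redirect):
#
#   class PagesController < ApplicationController
#     def sitemap
#       redirect_to sitemap_s3_url(params[:filename])
#     end
#   end
```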


Adam Salter, www.answermyoffice.com, 72,956 links, 2m03s


Zipcode.find(:all, :include => :city).each do |z|
  sitemap.add zipcode_path(:state => z.city.state, :city => z.city, :zipcode => z)
end

Rob Biedenharn, stylepath.com, Sitemap stats: 4,684,358 links, 6h21m31s

Old: Sitemap stats: 78,645 links, 3m37s

New: Sitemap stats: 4,684,358 links, 6h21m31s


  Category.find_in_order.each do |category|
    sitemap.add category_page_path(category), :changefreq => 'daily', :priority => 0.6
    Product.interesting_from_category(category.id, 0, nil, true).each do |product|
      sitemap.add details_id_path(product), :changefreq => 'weekly', :priority => 0.5
    end
  end

And running against a Rails 1.2.2 project. Only a few changes needed:

  • Need to provide a String#present? (which was easy since I already had String#nonblank?)
  • Cope with the change from app/controllers/application.rb to app/controllers/application_controller.rb by adding:
    • require 'app/controllers/application' to lib/sitemap_generator/helper.rb
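
For reference, a `String#present?` shim for a Rails version that lacks it could look like the sketch below. This is an approximation written for illustration, not the code used above (that version was built on an existing `String#nonblank?` helper that is not shown here).

```ruby
class String
  # Define the method only if nothing else has already provided it.
  unless method_defined?(:present?)
    # True when the string contains at least one non-whitespace character.
    def present?
      !strip.empty?
    end
  end
end

puts "sitemap".present?   # => true
puts "   ".present?       # => false
```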

mattmueller, 1.9 million urls

It took about 2 hours to generate on a very powerful production server without nicing it. If you decide to nice it (we tried a niceness of 15), that sort of load would take more than 8 hours.
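
One way to lower the priority from inside the generating process, rather than wrapping the command in `nice`, is Ruby's `Process.setpriority`. A minimal sketch for Unix-like systems, where `renice_self` is a made-up helper name:

```ruby
# Raises the niceness of the current process to at least `target`
# (a higher niceness means a lower scheduling priority).
# It never tries to lower the value, which would need extra privileges.
# Returns the niceness in effect afterwards.
def renice_self(target = 15)
  current = Process.getpriority(Process::PRIO_PROCESS, 0)  # 0 = this process
  Process.setpriority(Process::PRIO_PROCESS, 0, target) if current < target
  Process.getpriority(Process::PRIO_PROCESS, 0)
end

puts "niceness is now #{renice_self(15)}"
```

Calling this at the top of the Rake task or worker job, before the sitemap is built, applies the lower priority to the whole run.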


openc, 104+million urls for OpenCorporates

Takes several days to generate. Runs weekly on a worker server (which also processes Resque jobs); the files are then SCP’d to a shared folder on the app server, which is symlinked from production.


Since my main sitemap takes too long for Google to process, I take advantage of sitemap_generator’s multiple config option. I generate smaller sitemaps for rapidly changing content such as news.

I use Heroku and S3 (via the Wave Adapter). Because Google’s Webmaster Tools requires sitemaps to be submitted from the same domain, I use 302 redirects to point to the sitemaps in the S3 bucket. Google now indexes them beautifully!


Resque task


Sitemap stats: 1,242,638 links / 25 sitemaps / 17m45s

businessprofiles, ~130M pages indexed for the corporate registration directory, Business Profiles

We store the sitemap files, which take around a week to generate, on S3 and have Rails routes that direct requests for sitemap.xml on our primary app server to them. The gem allowed us to index the site much more efficiently and has resulted in improved indexation by Google of our many millions of pages.


A simple Sidekiq job runs daily and generates the sitemap of all “changes”.


Sitemap stats: 51,443 links / 2 sitemaps / 1m14s

Like many others, we run sitemaps as a worker job on a separate server. We currently generate over 10 million links, usually in under an hour.


Sitemap stats: 10,942,929 links / 219 sitemaps / 42m35s

The sitemap runs as a weekly cron job. We currently generate sitemaps with ~12M product links, using find_each with a batch size of 25k and including all images of each product. Great gem – we highly recommend it!


Sitemap stats: 12,886,563 links / 573 sitemaps / 497m55s (incl. ~5M images)
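
The batching idea behind `find_each` can be sketched in plain Ruby. In the sketch below, `Product`, `each_in_batches`, and the `fetch` lambda are stand-ins invented for illustration: a real Rails app would call `find_each` on an ActiveRecord model, and the entry hashes here are only collected in an array, not passed to the gem.

```ruby
# Stand-in for a database-backed model.
Product = Struct.new(:id, :slug, :image_urls)

# Yields every record in ascending id order while loading at most
# `batch_size` records at a time. `fetch` is a callable taking
# (after_id, limit) and returning the next batch, ordered by id.
def each_in_batches(fetch, batch_size: 25_000)
  last_id = 0
  loop do
    batch = fetch.call(last_id, batch_size)
    break if batch.empty?
    batch.each { |record| yield record }
    last_id = batch.last.id  # the next batch starts after this id
  end
end

# Ten in-memory products, one image each, standing in for the table.
PRODUCTS = (1..10).map do |i|
  Product.new(i, "product-#{i}", ["http://example.com/img/#{i}.jpg"])
end

# Simulates "WHERE id > after_id ORDER BY id LIMIT limit".
fetch = ->(after_id, limit) { PRODUCTS.select { |prod| prod.id > after_id }.first(limit) }

entries = []
each_in_batches(fetch, batch_size: 4) do |product|
  entries << {
    path: "/products/#{product.slug}",
    images: product.image_urls.map { |url| { loc: url } }
  }
end

puts entries.size  # => 10
```

Paging by id keeps memory use bounded by the batch size, which is what makes a run over millions of rows practical.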