
Generate Sitemaps on read-only filesystems like Heroku

kjvarga edited this page Jun 11, 2011 · 19 revisions

To generate sitemaps on read-only filesystems (like Heroku), we generate them into a temporary directory (or any directory with write access) and then upload them to a remote server.

Sitemap Generator uses CarrierWave for the upload, so it can store sitemaps on Amazon S3, Rackspace Cloud Files, MongoDB's GridFS, and basically anything else CarrierWave supports.

Include the CarrierWave gem

# Gemfile
gem 'sitemap_generator'
gem 'carrierwave'
gem 'fog' # if you're using S3

Configure Sitemap Generator

Here is an example sitemap file. It generates sitemaps into tmp/sitemaps/. Note that we set sitemaps_host to the hostname of the server that will be hosting our sitemaps. The full URL of each sitemap then becomes the remote host + the sitemaps path + the sitemap filename. We set the adapter to a WaveAdapter, which is a CarrierWave::Uploader::Base.

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.sitemaps_host = "http://s3.amazonaws.com/sitemap-generator/"
SitemapGenerator::Sitemap.public_path = 'tmp/'
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
SitemapGenerator::Sitemap.create do
  add 'hello_world!'
  add 'another'
end
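
To make the "remote host + sitemaps path + filename" rule concrete, here is a minimal sketch in plain Ruby. The sitemap_url helper is hypothetical and only illustrates the composition; it is not how Sitemap Generator builds the URL internally.

```ruby
# Hypothetical helper (not part of Sitemap Generator): joins the remote host,
# the sitemaps path and a filename into one URL, collapsing the slashes
# where the parts meet.
def sitemap_url(sitemaps_host, sitemaps_path, filename)
  [sitemaps_host, sitemaps_path, filename]
    .map { |part| part.gsub(%r{\A/+|/+\z}, '') } # trim leading/trailing slashes
    .join('/')
end

puts sitemap_url('http://s3.amazonaws.com/sitemap-generator/', 'sitemaps/', 'sitemap1.xml.gz')
# → http://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap1.xml.gz
```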

Configure CarrierWave

In this example we are uploading to S3 using Fog. (I didn't have any success using the s3 storage option.) The fog_directory is your S3 bucket name.

# config/initializers/carrierwave.rb
CarrierWave.configure do |config|
  config.cache_dir = "#{Rails.root}/tmp/"
  config.storage = :fog
  config.permissions = 0666
  config.fog_credentials = {
    :provider               => 'AWS',
    :aws_access_key_id      => 'your key',
    :aws_secret_access_key  => 'your secret',
  }
  config.fog_directory  = 'bucket name'
end
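
If you would rather not commit your keys, one option is to read them from environment variables (on Heroku, config vars are exposed that way). This is only a sketch: the helper name and the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are assumptions, so use whatever names you have actually set.

```ruby
# Hypothetical helper: builds the fog credentials hash from environment
# variables. The variable names are assumptions, not something required by
# CarrierWave or Sitemap Generator.
def fog_credentials_from_env(env = ENV)
  {
    :provider              => 'AWS',
    # fetch raises KeyError when a variable is missing, so a misconfigured
    # app fails at boot rather than at upload time.
    :aws_access_key_id     => env.fetch('AWS_ACCESS_KEY_ID'),
    :aws_secret_access_key => env.fetch('AWS_SECRET_ACCESS_KEY')
  }
end
```

In the initializer above you would then write config.fog_credentials = fog_credentials_from_env.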

With all that in place, you should be able to run rake sitemap:refresh and have your sitemaps generated and uploaded! If you encounter problems, check the generated sitemaps in tmp/sitemaps/ and make sure they look right. Also make sure that your bucket is public, and check for any response messages from CarrierWave.
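
Since the generated files are gzipped, a quick way to check them is to decompress them with Ruby's standard Zlib library. This is a rough sketch: read_sitemap is a hypothetical helper, and the tmp/sitemaps/ path assumes the configuration shown above and that you run it from the app root.

```ruby
require 'zlib'

# Hypothetical helper: returns the decompressed XML of one generated sitemap.
def read_sitemap(path)
  Zlib::GzipReader.open(path) { |gz| gz.read }
end

# Print the beginning of every sitemap found under tmp/sitemaps/.
# Prints nothing if no sitemaps have been generated yet.
Dir.glob('tmp/sitemaps/*.xml.gz').sort.each do |path|
  puts "== #{path}"
  puts read_sitemap(path)[0, 300]
end
```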

In my test with the bucket 'sitemap-generator', the sitemaps were uploaded successfully to https://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap1.xml.gz and https://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap_index.xml.gz.

And that should be it! This is still in beta and is not well tested at this time.