We need to serve robots.txt dynamically because production, staging, dev, test, and our other environments each have different requirements.
A bit of Google-fu turns up various ways to do this, some for Rails 2.x, some for Rails 3.x. Here is my version.
First, edit config/routes.rb and add this line:
match '/robots.txt' => RobotsGenerator
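The line above is written for Rails 3.x. As far as I know, Rails 4 and later reject a bare `match` without an HTTP verb, so on a newer app the route would probably look more like the fragment below. Treat it as a sketch under that assumption, not something tested against this app:

```ruby
# config/routes.rb (sketch for Rails 4+, where `match` needs `via:`)
Rails.application.routes.draw do
  # Any object that responds to `call(env)` can serve as a Rack endpoint.
  get '/robots.txt', to: RobotsGenerator
end
```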
Then add the following to app_root/lib/classes/robots_generator.rb.
NOTE: We have an old domain, foo.com, that redirects to our new domain, newfoo.com. We don’t want foo.com to get indexed, so production gives it special treatment: a disallow-all body plus an X-Robots-Tag header.
class RobotsGenerator
  DISALLOW_ALL = "User-agent: *\nDisallow: /"

  # Use the config/robots.txt in production.
  # Disallow everything for all other environments.
  def self.call(env)
    req = ActionDispatch::Request.new(env)
    headers = { 'Content-Type' => 'text/plain' }
    body = if Rails.env.production?
      # Escape the dot and anchor on a label boundary so newfoo.com does not match.
      if req.host.downcase =~ /(\A|\.)foo\.com\z/
        headers['X-Robots-Tag'] = 'noindex,nofollow'
        DISALLOW_ALL
      else
        File.read Rails.root.join('config', 'robots.txt')
      end
    else
      DISALLOW_ALL
    end
    [200, headers, [body]]
  rescue Errno::ENOENT
    # A Rack body must respond to #each, so wrap the string in an array.
    [404, headers, [DISALLOW_ALL]]
  end
end
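The host check deserves care, because newfoo.com itself ends in "foo.com". Here is a minimal Rails-free sketch of the same decision logic that can be tried in plain Ruby. The names `robots_response`, `OLD_DOMAIN`, and `DISALLOW_ALL` are hypothetical, and the environment and file path are passed in as arguments instead of coming from Rails:

```ruby
# Standalone sketch, no Rails required. All names here are illustrative.
DISALLOW_ALL = "User-agent: *\nDisallow: /"

# Matches foo.com and its subdomains, but not newfoo.com:
# the host must either start with "foo.com" or have a dot right before it.
OLD_DOMAIN = /(\A|\.)foo\.com\z/

# Returns a Rack-style triple: [status, headers, body].
# The body is an array because Rack expects it to respond to #each.
def robots_response(host, production, robots_path)
  headers = { 'Content-Type' => 'text/plain' }
  body =
    if production && host.downcase =~ OLD_DOMAIN
      headers['X-Robots-Tag'] = 'noindex,nofollow'
      DISALLOW_ALL
    elsif production
      File.read(robots_path)
    else
      DISALLOW_ALL
    end
  [200, headers, [body]]
rescue Errno::ENOENT
  # Missing robots file: fall back to disallowing everything.
  [404, headers, [DISALLOW_ALL]]
end
```

For example, `robots_response('newfoo.com', false, 'config/robots.txt')` never touches the file and returns a 200 with the disallow-all body, which is the behaviour wanted for staging, dev, and test.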
Finally, move public/robots.txt to config/robots.txt. If the file stays in public/, it is served as a static file and the route above is never reached.
I want to give credit to the people who inspired my version.