robots.txt: block Googlebot from crawling certain useless URLs.

Block Googlebot from crawling certain slow, useless URLs. Googlebot
sometimes tries to crawl old source:<url>, approver:<name>, and
ordfav:<name> searches in bulk, which tends to slow the site down:
searches like source:<url> are inherently slow, and Google spends hours
at a time crawling them in parallel. This happens even though these
links are already marked nofollow and noindex, and even though
source:<url> links were removed from posts long ago precisely to stop
Google from crawling them.
author evazion
date   2021-11-12 16:42:15 -06:00
parent c68043bf26
commit 91587aeb6b
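
As a rough illustration of how a Googlebot-style wildcard Disallow rule
matches these URLs, here is a minimal sketch in Ruby (not Danbooru code;
it ignores Allow rules, "$" anchors, and longest-match precedence):

    # "*" in a rule matches any run of characters; the rule is anchored
    # to the start of the path plus query string.
    def blocked?(rule, path)
      pattern = rule.split("*", -1).map { |part| Regexp.escape(part) }.join(".*")
      path.match?(/\A#{pattern}/)
    end

    blocked?("/posts?tags=source:*", "/posts?tags=source:https://example.com") # => true
    blocked?("/posts?tags=ordfav:*", "/posts?tags=ordfav:someuser")            # => true
    blocked?("/posts?tags=source:*", "/posts?tags=touhou")                     # => false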


@@ -5,6 +5,9 @@ Allow: /$
 <% if !Rails.env.production? || Danbooru.config.hostname == request.host %>
 Disallow: /*.atom
 Disallow: /*.json
+Disallow: /posts?tags=source:*
+Disallow: /posts?tags=ordfav:*
+Disallow: /posts?tags=approver:*
 Disallow: <%= artist_urls_path %>
 Disallow: <%= artist_versions_path %>
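
For reference, once rendered, this section of robots.txt would look
roughly as follows, assuming the artist_urls_path and
artist_versions_path route helpers resolve to /artist_urls and
/artist_versions (an assumption about the routes, not confirmed by this
diff):

    Allow: /$
    Disallow: /*.atom
    Disallow: /*.json
    Disallow: /posts?tags=source:*
    Disallow: /posts?tags=ordfav:*
    Disallow: /posts?tags=approver:*
    Disallow: /artist_urls
    Disallow: /artist_versions

Google applies these rules as prefix matches against the path plus query
string, so any search whose tags parameter begins with source:, ordfav:,
or approver: is excluded from crawling.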