Writing a Robots.txt file - page 3

Author: Steven Neiland


Crawl-delay

Several major crawlers support a Crawl-delay parameter, which tells a spider how long to wait after retrieving a page before it requests another page from the same site. It is a good measure to implement because it reduces the likelihood of a spider acting like a denial-of-service attack, overloading your site by requesting a large number of pages in quick succession.

The Crawl-delay directive takes a value representing the number of seconds a spider should wait between successive requests to the same server.[1][2][3] For example:

# Wait ten seconds between page requests
User-agent: *
Crawl-delay: 10
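
If you are writing your own spider, you can honour Crawl-delay programmatically. Below is a minimal sketch using Python's standard urllib.robotparser module; the site URL, user-agent string, and page paths are placeholders for illustration only.

import time
import urllib.robotparser

# Placeholder values for illustration
ROBOTS_URL = "http://www.mysite.com/robots.txt"
USER_AGENT = "MyCrawler"

parser = urllib.robotparser.RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # fetch and parse the robots.txt file

# crawl_delay() returns the Crawl-delay value (in seconds) for this
# user agent, or None if the directive is not present
delay = parser.crawl_delay(USER_AGENT) or 1

for path in ["/", "/blog/", "/contact/"]:  # placeholder paths
    if parser.can_fetch(USER_AGENT, path):
        print("fetching", path)  # a real spider would issue the HTTP request here
        time.sleep(delay)        # wait before the next request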

Sitemap

Another important function of the robots.txt file, particularly for SEO, is the Sitemap directive, which tells a web spider where your sitemap(s) are stored. For example:

Sitemap: http://www.mysite.com/sitemaps/profiles-sitemap.xml
Sitemap: http://www.mysite.com/sitemaps/blog-sitemap.xml
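
The same standard-library parser can also report any Sitemap directives it finds, which is a quick way to check that the file is being read the way you expect. A minimal sketch, assuming Python 3.8 or later (where site_maps() was added) and using the same placeholder URL:

import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.set_url("http://www.mysite.com/robots.txt")  # placeholder URL
parser.read()

# site_maps() returns the list of Sitemap URLs declared in robots.txt,
# or None if no Sitemap directive is present
for sitemap_url in parser.site_maps() or []:
    print(sitemap_url)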

