How Feedbot honors robots.txt rules

If you haven't already, you may want to read about what Feedbot does and how it works, or watch our Feedbot demo videos. Both explain Feedbot's purpose and how it behaves toward your content.

Why would I use Feedbot?

Feedbot helps content owners and publishers reach a wider audience. It also gives website owners an easy way to sort content on their own sites.

For web designers, bloggers and others with a web presence, Feedbot offers dynamic, up-to-date content that they know will be of interest to their users.

Can I use it now?

Feedbot is currently in limited-participation beta testing. Please sign up to be considered.

Does Feedbot respect robots.txt?

Yes. However, because Feedbot acts as both an RSS aggregator and a search engine, it treats robots.txt differently depending on which role it is playing when it crawls your site. Feedbot also respects the Feed Access Control standard when indexing your RSS.

HTML and content pages

Feedbot always honors robots.txt for URLs that serve anything other than RSS, such as HTML and other content pages.

RSS feeds

When a user submits an RSS feed to Feedbot, Feedbot fetches and aggregates the feed even if robots.txt disallows the directory containing the XML file. There are two exceptions: Feedbot does not fetch the feed if Feedbot is explicitly disallowed as a user-agent, or if access is explicitly disabled using the Feed Access Control standard.

To keep your RSS feeds from being aggregated by Feedbot, you must either explicitly disallow Feedbot as a user-agent in your robots.txt file or use the Feed Access Control standard suggested by Bloglines, which Feedbot respects in all cases.

If content referenced by an access-enabled RSS feed lives in a directory disallowed by robots.txt, the RSS will be indexed, but the referenced content will not.
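The rules for submitted feeds can be sketched in a few lines of Python. This is only an illustration of the policy described above, not Feedbot's actual code. The function names, the feed_access_allowed flag (standing in for a Feed Access Control check), and the first-matching-rule behavior are all assumptions made for the example.

```python
def explicit_rules(robots_txt, agent="feedbot"):
    """Return (directive, path) rules from records that name `agent`
    explicitly, or None if no record names it. Hypothetical helper."""
    rules = None
    group_agents = []        # user-agents named by the record being read
    reading_agents = False   # True while consecutive User-agent lines are seen
    for raw in robots_txt.splitlines():
        if not raw.strip():
            # A blank line ends the current record.
            group_agents, reading_agents = [], False
            continue
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                group_agents, reading_agents = [], True
            group_agents.append(value.lower())
            if value.lower() == agent and rules is None:
                rules = []
        elif field in ("allow", "disallow"):
            reading_agents = False
            if agent in group_agents:
                rules.append((field, value))
    return rules


def may_aggregate_submitted_feed(robots_txt, feed_path, feed_access_allowed=True):
    """Decide whether a user-submitted feed may be aggregated (sketch)."""
    if not feed_access_allowed:
        return False          # a Feed Access Control opt-out always wins
    rules = explicit_rules(robots_txt)
    if rules is None:
        return True           # generic (*) records are ignored for submitted feeds
    for directive, path in rules:
        # Assumption: the first rule whose path prefixes the feed path decides.
        if path and feed_path.startswith(path):
            return directive == "allow"
    return True
```

With this sketch, a file containing only "User-agent: *" and "Disallow: /feeds/" does not stop a submitted feed at /feeds/news.xml from being aggregated, while a record that names Feedbot and disallows /feeds/ does.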

Example entries for robots.txt files

The examples that follow assume that you know something about robots.txt files and how they work. Visit the Web Robots Pages for more information on the robots.txt standard.

How do I explicitly allow Feedbot while still disallowing other robots?

Explicitly allowing Feedbot to crawl your site or directory means Feedbot can crawl and index your site's content and RSS feeds. To explicitly allow Feedbot, add these two lines to your robots.txt as their own record, separated from any other records by a blank line. Other robots remain governed by the other records in your file.

User-agent: Feedbot
Allow: /path/to/dir
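If you want to check how such a record combines with a rule that disallows everyone else, Python's standard urllib.robotparser module can evaluate a robots.txt file locally. The sketch below is an assumption-laden illustration: the "User-agent: *" record, the SomeOtherBot name, and the index.html path are made up for the example, and the module models generic robots.txt matching rather than Feedbot's own parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow Feedbot, disallow every other robot.
ROBOTS_TXT = """\
User-agent: Feedbot
Allow: /path/to/dir

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Feedbot matches its own record, so the catch-all record does not apply to it.
feedbot_allowed = parser.can_fetch("Feedbot", "/path/to/dir/index.html")

# Any other robot falls through to the "*" record and is disallowed.
other_allowed = parser.can_fetch("SomeOtherBot", "/path/to/dir/index.html")

print(feedbot_allowed, other_allowed)
```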

How do I explicitly disallow Feedbot?

Explicitly disallowing Feedbot from crawling your site or directory stops Feedbot from indexing your content and RSS feeds there. To explicitly disallow Feedbot, add these two lines to your robots.txt as their own record, separated from any other records by a blank line:

User-agent: Feedbot
Disallow: /path/to/dir
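A quick local check of a Disallow record like this one might look as follows. It uses Python's standard urllib.robotparser module, which applies generic robots.txt matching and is not Feedbot's own parser. The feed.xml and /blog/post.html paths and the SomeOtherBot name are placeholders invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The Disallow record from above, on its own.
DISALLOW_TXT = """\
User-agent: Feedbot
Disallow: /path/to/dir
"""

checker = RobotFileParser()
checker.parse(DISALLOW_TXT.splitlines())

# Feedbot is blocked inside the disallowed directory...
blocked_in_dir = not checker.can_fetch("Feedbot", "/path/to/dir/feed.xml")

# ...but may still fetch paths outside it.
allowed_elsewhere = checker.can_fetch("Feedbot", "/blog/post.html")

# Robots that the record does not name are unaffected by it.
others_unaffected = checker.can_fetch("SomeOtherBot", "/path/to/dir/feed.xml")

print(blocked_in_dir, allowed_elsewhere, others_unaffected)
```

To keep Feedbot away from the whole site rather than one directory, the same record would use "Disallow: /".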

Report problems

If you feel that Feedbot has been crawling or indexing your site improperly, or that it has been crawling it correctly but too frequently, please use our contact form or email us directly at [at] and we will work with you as quickly as possible to find a solution.