robots.txt Validator

Parse and validate your robots.txt. Check syntax, view rules per user-agent, and test URLs against crawl policies.

What is robots.txt?

robots.txt is a text file placed at the root of a website that tells web crawlers (bots) which pages they can and cannot access. It follows the Robots Exclusion Standard (RFC 9309). Crawlers such as Googlebot and Bingbot check this file before crawling your site.
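A minimal example of the format (the paths and sitemap URL are placeholders, not recommendations):

    # Rules for all crawlers
    User-agent: *
    Disallow: /admin/
    Allow: /admin/help/

    # A separate group for one specific crawler
    User-agent: Bingbot
    Disallow: /search

    Sitemap: https://yourdomain.com/sitemap.xml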

A properly configured robots.txt helps you control which parts of your site appear in search results and prevents bots from wasting server resources on low-value pages.

How to Use This Validator

  1. Paste your robots.txt — Copy the full content of your robots.txt file into the editor.
  2. Review errors and warnings — Syntax issues are highlighted with line numbers.
  3. Browse rules by user-agent — Click each group to see which paths are allowed or disallowed.
  4. Test a URL — Enter any URL or path to check whether it would be blocked or allowed for the wildcard user-agent (*); the sketch after this list shows the same check in code.
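For reference, here is a minimal sketch of the same URL test using Python's standard-library urllib.robotparser. The rules and paths are placeholders; substitute the content of your own file:

    # Test paths against robots.txt rules, as in step 4 above.
    from urllib import robotparser

    # Placeholder rules. Note: urllib.robotparser applies rules in file
    # order (first match wins), while Google uses longest-path matching;
    # listing the more specific Allow line first keeps both
    # interpretations in agreement.
    ROBOTS_TXT = """\
    User-agent: *
    Allow: /admin/help/
    Disallow: /admin/
    """

    parser = robotparser.RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())

    # can_fetch(user_agent, url_or_path) returns True if that agent may
    # fetch the URL; "*" here hits the wildcard group.
    for path in ("/admin/secret", "/admin/help/faq", "/blog/post"):
        verdict = "allowed" if parser.can_fetch("*", path) else "blocked"
        print(f"{path}: {verdict}")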

Frequently Asked Questions

Does robots.txt actually prevent indexing?

robots.txt prevents crawling, not indexing. A page blocked by robots.txt can still appear in search results if other pages link to it. To prevent indexing, use a noindex meta tag or X-Robots-Tag header instead.
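Both signals are a single line. In the page's HTML head:

    <meta name="robots" content="noindex">

or as an HTTP response header:

    X-Robots-Tag: noindex

Note that the crawler must be able to fetch the page to see either directive, so a page you want deindexed should not also be blocked in robots.txt.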

What does "Disallow: " (empty) mean?

An empty Disallow directive means "allow everything." It's equivalent to not having any Disallow rules for that user-agent. This is often used to explicitly allow all bots while still declaring the user-agent block.
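For example, this group names every crawler but blocks nothing:

    User-agent: *
    Disallow: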

Where should robots.txt be placed?

robots.txt must be placed at the root of your domain: https://yourdomain.com/robots.txt. It only applies to the host it's served from, not to subdomains (which need their own robots.txt); rules at https://yourdomain.com/robots.txt do not cover https://blog.yourdomain.com/, for example.