robots.txt Validator
Parse and validate your robots.txt. Check syntax, view rules per user-agent, and test URLs against crawl policies.
robots.txt is a plain-text file placed at the root of a website that tells web crawlers (bots) which URLs they are allowed to crawl. It follows the Robots Exclusion Standard. Search engines like Googlebot, Bingbot, and others check this file before crawling your site.
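For example, a minimal robots.txt might look like this (the bot names and paths are purely illustrative):

```
# Allow every crawler, but keep it out of /admin/
User-agent: *
Disallow: /admin/

# An extra restriction for one specific bot
User-agent: Googlebot
Disallow: /search/

Sitemap: https://yourdomain.com/sitemap.xml
```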
A properly configured robots.txt helps you control which parts of your site appear in search results and prevents bots from wasting server resources on low-value pages.
robots.txt prevents crawling, not indexing. A page blocked by robots.txt can still appear in search results if other pages link to it. To prevent indexing, use a noindex meta tag or X-Robots-Tag header instead, and leave the page crawlable so bots can actually see that directive.
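For example, either of the following keeps a page out of the index once a crawler fetches it:

```
<!-- In the page's HTML <head> -->
<meta name="robots" content="noindex">
```

```
# Or sent as an HTTP response header
X-Robots-Tag: noindex
```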
An empty Disallow directive means "allow everything." It's equivalent to not having any Disallow rules for that user-agent. This is often used to explicitly allow all bots while still declaring a record for that user-agent.
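For example, this record allows every crawler to access the entire site:

```
# Empty Disallow value: nothing is blocked
User-agent: *
Disallow:
```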
robots.txt must be placed at the root of your domain: https://yourdomain.com/robots.txt. It applies only to the host it's served from; subdomains (e.g. blog.yourdomain.com) need their own robots.txt.
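If you'd rather test URLs against a live robots.txt programmatically instead of with the validator above, Python's standard library includes a parser. This is a minimal sketch; the domain, paths, and expected results are assumptions for illustration:

```python
from urllib.robotparser import RobotFileParser

# Fetch and parse robots.txt from the site root (hypothetical domain)
parser = RobotFileParser()
parser.set_url("https://yourdomain.com/robots.txt")
parser.read()

# Check whether a given user-agent may crawl a given URL
print(parser.can_fetch("Googlebot", "https://yourdomain.com/admin/settings"))  # e.g. False
print(parser.can_fetch("*", "https://yourdomain.com/blog/post"))               # e.g. True
```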