Robots.txt Generator
Visually build and test your robots.txt file — no coding required
1. Configure Rules
Sitemap URLs
2. robots.txt Output
User-agent: *
Disallow:
URL Test Simulator
What Is robots.txt?
A robots.txt file tells web crawlers which pages they can and cannot access on your website. It lives at the root of your domain (e.g. https://example.com/robots.txt) and is one of the first files crawlers request before crawling a site. It cannot force compliance, but major search engines like Google and Bing respect its directives.
How to Deploy
Generate & Copy
Build your rules using the form above, then click Copy to copy the generated text.
Create the File
Save the content as a plain text file named exactly robots.txt (all lowercase, with no extra extension such as robots.txt.txt).
Upload to Root
Upload the file to the root directory of your website so it's accessible at https://yourdomain.com/robots.txt. For WordPress, this is typically managed by SEO plugins like Yoast or Rank Math.
Verify
Open https://yourdomain.com/robots.txt in your browser to confirm it's accessible. Then use Google Search Console → Settings → Crawling → robots.txt to validate.
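If you prefer to script the first two steps instead of copying from the form, the following is a minimal Python sketch of the idea. The helper names build_robots_txt and save_robots_txt, and the shape of the groups argument, are assumptions made for this example; they are not part of the generator.

```python
from pathlib import Path

def build_robots_txt(groups, sitemaps=()):
    """Render rule groups and sitemap URLs as robots.txt text.

    groups: list of (user_agent, rules) pairs, where rules is a list of
    (directive, path) pairs such as ("Disallow", "/admin/").
    (Hypothetical structure chosen for this sketch.)
    """
    lines = []
    for user_agent, rules in groups:
        lines.append(f"User-agent: {user_agent}")
        if not rules:
            # An empty Disallow value means nothing is blocked for this group.
            lines.append("Disallow:")
        for directive, path in rules:
            lines.append(f"{directive}: {path}")
        lines.append("")  # a blank line separates groups
    for url in sitemaps:
        lines.append(f"Sitemap: {url}")
    return "\n".join(lines).rstrip("\n") + "\n"

def save_robots_txt(text, directory="."):
    """Write the text to <directory>/robots.txt as UTF-8 and return the path."""
    target = Path(directory) / "robots.txt"
    target.write_text(text, encoding="utf-8")
    return target

robots_text = build_robots_txt(
    [("*", [("Disallow", "/admin/"), ("Allow", "/admin/public/")])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(robots_text)
```

Calling save_robots_txt(robots_text) would write the file to the current directory, ready to upload to your site root.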
Directives Explained
User-agent
Specifies which crawler the rules apply to. Use * for all bots, or a specific name like Googlebot for targeted rules.
Disallow
Blocks crawlers from accessing the specified path. Disallow: /admin/ blocks everything under /admin/. An empty value means nothing is blocked.
Allow
Overrides a broader Disallow rule for a specific path. Allow: /admin/public/ re-enables access within a blocked /admin/ directory.
Sitemap
Tells crawlers where your XML sitemap is located, given as a full absolute URL. It is not tied to any User-agent block, so all crawlers will see it.
Crawl-delay
Requests that crawlers wait N seconds between requests. Supported by Bing and Yandex, but ignored by Google, which sets its own crawl rate automatically.
Wildcards
Use * to match any sequence of characters, and $ to anchor a pattern to the end of the URL. Example: Disallow: /*.pdf$ blocks every URL that ends in .pdf.
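To make the wildcard behaviour concrete, here is a small Python sketch that translates a robots.txt path pattern into a regular expression. The function names pattern_to_regex and matches are made up for this illustration, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    pattern to the end of the URL. Everything else is matched literally.
    (Hypothetical helper for this sketch.)
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def matches(pattern, path):
    """Return True if the URL path matches the robots.txt pattern."""
    return pattern_to_regex(pattern).match(path) is not None

print(matches("/*.pdf$", "/docs/report.pdf"))             # True
print(matches("/*.pdf$", "/docs/report.pdf?download=1"))  # False
print(matches("/admin/", "/admin/settings"))              # True
```

Without the trailing $, the pattern /*.pdf would also match /docs/report.pdf?download=1, because rules match from the start of the path and do not need to reach the end.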
Best Practices
- ✓ Always include a Sitemap directive pointing to your XML sitemap for faster discovery
- ✓ Don't use robots.txt to hide sensitive content; use authentication or noindex meta tags instead
- ✓ Test your robots.txt with Google Search Console before deploying changes
- ✓ Be careful with Disallow: / because it blocks your entire site from being crawled
- ✓ Use specific User-agent directives to block AI training bots without affecting search engine crawling
- ✓ Remember that robots.txt is publicly visible — don't put private paths in it that you want to keep hidden
Frequently Asked Questions
Does robots.txt prevent pages from appearing in Google?
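Not necessarily. While Disallow prevents Googlebot from crawling a page, Google may still index the URL if other pages link to it; it just won't know the content. To fully prevent indexing, use a noindex meta tag and leave the page crawlable: a crawler that is blocked by robots.txt never fetches the page, so it never sees the noindex instruction.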
Can robots.txt block all AI crawlers?
You can block known AI bots (GPTBot, ClaudeBot, CCBot, Google-Extended, Bytespider) by adding specific User-agent rules. However, not all AI crawlers identify themselves, so it's not a guarantee. The "Block AI Bots" preset above covers the major ones.
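As a rough illustration of what such rules look like, the Python sketch below generates one blocking group per bot and checks the result with the standard library's urllib.robotparser. The function name block_ai_bots is an assumption for this example, and the output is not necessarily identical to what the preset produces.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the answer above.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Bytespider"]

def block_ai_bots(bots=AI_BOTS):
    """Return robots.txt text that blocks the given bots and allows all others.

    (Hypothetical helper for this sketch.)
    """
    lines = []
    for bot in bots:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # Every other crawler gets an empty Disallow, i.e. nothing is blocked.
    lines += ["User-agent: *", "Disallow:"]
    return "\n".join(lines) + "\n"

parser = RobotFileParser()
parser.parse(block_ai_bots().splitlines())

print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True
```

The check only confirms what the file asks for; a crawler that does not identify itself, or that ignores robots.txt, is unaffected.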
What happens if a rule conflicts with another?
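When multiple rules match the same URL, Google and other crawlers that follow RFC 9309 apply the most specific (longest) rule. For example, if you have Disallow: /admin/ and Allow: /admin/public/, the /admin/public/ path will be allowed because its rule is longer. If an Allow and a Disallow rule are equally specific, the less restrictive Allow rule is used.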
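The precedence rule can be sketched in a few lines of Python. This is a simplified illustration that handles plain path prefixes only (no wildcards); the function name is_allowed and the rule format are assumptions for the example.

```python
def is_allowed(path, rules):
    """Apply longest-match precedence to a list of (directive, prefix) rules.

    Only plain prefixes are handled; wildcards are left out to keep the
    sketch short. A path that matches no rule is allowed.
    (Hypothetical helper for this sketch.)
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules have no effect
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on a tie, Allow (less restrictive) wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed

rules = [("Disallow", "/admin/"), ("Allow", "/admin/public/")]
print(is_allowed("/admin/public/page.html", rules))  # True
print(is_allowed("/admin/settings", rules))          # False
print(is_allowed("/blog/", rules))                   # True
```

Because the decision depends on rule length rather than position, the order of the Allow and Disallow lines within a group does not change the outcome in this model.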
Does Crawl-delay work with Google?
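No. Google ignores the Crawl-delay directive entirely and adjusts its crawl rate automatically based on how your server responds. The crawl rate limiter setting in Google Search Console was retired in early 2024; to slow Googlebot down temporarily, Google's documentation suggests returning 500, 503, or 429 status codes. Crawl-delay is supported by Bing, Yandex, and some other crawlers.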
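For crawlers that do honor the directive, reading it is straightforward. The sketch below uses Python's standard urllib.robotparser to look up the delay for a given user agent; the sample file, the delay_for helper, and the MyCrawler name are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content, invented for this sketch.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
Disallow:

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def delay_for(user_agent, default=1.0):
    """Seconds to wait between requests for this user agent.

    Falls back to `default` when no Crawl-delay applies.
    (Hypothetical helper for this sketch.)
    """
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default

print(delay_for("bingbot"))    # 10.0
print(delay_for("MyCrawler"))  # 1.0
```

A polite custom crawler could sleep for delay_for(...) seconds between requests to the same host.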
Where should I upload my robots.txt file?
It must be at the root of your domain: https://example.com/robots.txt. Placing it in a subdirectory (like /pages/robots.txt) will not work. Each subdomain needs its own robots.txt file.
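To see which robots.txt file governs a given page, you can derive its URL from the page URL. A minimal Python sketch follows; the function name robots_url is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL that governs the given page URL.

    (Hypothetical helper for this sketch.)
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any subdomain and port);
    # replace the path and drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://example.com/pages/about.html"))
# https://example.com/robots.txt
print(robots_url("https://blog.example.com/2024/post?id=7"))
# https://blog.example.com/robots.txt
```

The second call shows why each subdomain needs its own file: blog.example.com resolves to a different robots.txt URL than example.com.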