robots.txt for AI Agents — How to Allow GPTBot, ClaudeBot & PerplexityBot
Your robots.txt controls which bots can crawl your site. For decades, that meant search engines. Now it means AI agents too. Here's how to configure it so ChatGPT, Claude, and Perplexity can discover — and recommend — your business.
The AI Bots You Need to Know
Each major AI company runs its own crawler. Here are the ones you're most likely to see:
| User-Agent | Company | Purpose |
|---|---|---|
| GPTBot | OpenAI | Crawls for ChatGPT browsing + model training |
| ChatGPT-User | OpenAI | Real-time browsing when users ask ChatGPT to look something up |
| ClaudeBot | Anthropic | Crawls for Claude AI knowledge + web access |
| PerplexityBot | Perplexity | Powers Perplexity's AI search engine |
| Google-Extended | Google | AI/Gemini training (separate from Googlebot Search) |
| Bytespider | ByteDance | TikTok AI training data |
| CCBot | Common Crawl | Open dataset used by many AI models |
| cohere-ai | Cohere | Enterprise AI model training |
GPTBot crawls for training + knowledge. ChatGPT-User is the real-time browser — when a user says "look up X" in ChatGPT. You want both allowed if you want maximum AI visibility.
The Copy-Paste Template
Want maximum AI visibility? Use this robots.txt:
    # Standard search engines
    User-agent: *
    Allow: /

    # AI Agents: explicitly allowed
    User-agent: GPTBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    User-agent: Google-Extended
    Allow: /

    User-agent: cohere-ai
    Allow: /

    # Sitemap
    Sitemap: https://yourdomain.com/sitemap.xml
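Why explicit Allow matters: Under the Robots Exclusion Protocol (RFC 9309), a bot with no group of its own falls back to the User-agent: * group, so the named groups are technically redundant while the wildcard already allows everything. They still earn their place: they document your intent ("Yes, we want you here"), and they keep those bots allowed if someone later tightens the wildcard rule.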
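One way to see how a parser resolves named and unnamed bots is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: it uses a shortened version of the template, `SomeOtherBot` is a made-up name, and the parsers that real crawlers run may not behave exactly like Python's.

```python
from urllib.robotparser import RobotFileParser

# Shortened version of the template: a wildcard group plus two named groups.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""


def is_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Named bots match their own group. SomeOtherBot (hypothetical) has no group,
# so the parser falls back to the wildcard group.
results = {
    bot: is_allowed(ROBOTS_TXT, bot)
    for bot in ("GPTBot", "ClaudeBot", "SomeOtherBot")
}
print(results)  # all three map to True
```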
Selective Access: Allow Some, Block Others
Maybe you want ChatGPT and Claude to reference your content, but you don't want your data used for model training by everyone:
    # Allow AI search/browsing agents
    User-agent: GPTBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    # Block training-only crawlers
    User-agent: Google-Extended
    Disallow: /

    User-agent: Bytespider
    Disallow: /

    User-agent: CCBot
    Disallow: /

    # Standard search engines
    User-agent: *
    Allow: /

    Sitemap: https://yourdomain.com/sitemap.xml
Common Mistakes
Mistake #1: Blocking everything with a wildcard
    # ❌ This blocks ALL bots including AI agents
    User-agent: *
    Disallow: /
If you have a wildcard Disallow: /, AI bots without a specific rule will be blocked. Always add explicit rules for the AI bots you want to allow.
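If you do need a default-deny wildcard, named groups can opt specific bots back in. The following sketch checks that behavior with Python's standard-library parser. The file contents are an illustrative example, not a recommendation, and a given crawler's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# A site that blocks bots by default but opts two AI agents back in.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot and ClaudeBot match their own groups; PerplexityBot has no group
# and therefore inherits the wildcard Disallow.
for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    verdict = "allowed" if parser.can_fetch(bot, "/") else "blocked"
    print(f"{bot}: {verdict}")
```

Here the two named bots come back as allowed and PerplexityBot as blocked, which is the failure mode described above: any bot you forget to name inherits the wildcard rule.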
Mistake #2: Confusing Google-Extended with Googlebot
Googlebot handles Google Search indexing — blocking it kills your SEO. Google-Extended is specifically for Gemini/AI training. They're independent. Blocking Google-Extended does NOT affect your search ranking.
Block Google-Extended only if you don't want Google using your content for Gemini.
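The independence of the two tokens can be checked locally. This is a minimal sketch using Python's `urllib.robotparser` on an example file. It shows only how the rules are matched, not how Google's own systems process them.

```python
from urllib.robotparser import RobotFileParser

# Block the Gemini/AI training token, leave everything else open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

googlebot_ok = parser.can_fetch("Googlebot", "/")        # Search indexing
extended_ok = parser.can_fetch("Google-Extended", "/")   # AI training token
print(f"Googlebot allowed: {googlebot_ok}")
print(f"Google-Extended allowed: {extended_ok}")
```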
Mistake #3: Not testing after changes
After updating robots.txt, always verify:
- Visit yourdomain.com/robots.txt in your browser to confirm the content is correct
- Run your site through AEO Checker; the robots.txt check will show which bots are allowed/blocked
- Use Google Search Console's robots.txt tester if available for your property
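The same checks can be scripted. The sketch below assumes the bot names from the table earlier in this guide. `audit` and `fetch_robots_lines` are hypothetical helper names, and the result reflects Python's parser rather than each crawler's own.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
           "Google-Extended", "Bytespider", "CCBot", "cohere-ai"]


def audit(robots_lines, bots=AI_BOTS, path="/"):
    """Map each bot name to True (allowed) or False (blocked) for `path`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return {bot: parser.can_fetch(bot, path) for bot in bots}


def fetch_robots_lines(site, timeout=10):
    """Download https://<site>/robots.txt and return its lines."""
    with urlopen(f"https://{site}/robots.txt", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace").splitlines()


# Usage against a live site (needs network access):
#   for bot, ok in audit(fetch_robots_lines("yourdomain.com")).items():
#       print(f"{bot:16} {'allowed' if ok else 'BLOCKED'}")
```

Note that `urlopen` raises `urllib.error.HTTPError` when the file is missing (404), so a production script would want to catch that case.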
Mistake #4: Wrong bot name spelling
Bot names are case-sensitive in some parsers. Use the exact names:
- GPTBot (not gptbot or GptBot)
- ClaudeBot (not claudebot or Claude-Bot)
- Google-Extended (not GoogleExtended or Googlebot-Extended)
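A small linter can catch these slips before deployment. The sketch below is one possible approach, not a standard tool: `find_misspelled_agents` is a hypothetical helper that flags names differing from a known bot only in letter case or punctuation. It will not catch every typo (for example, Googlebot-Extended slips through).

```python
import re

KNOWN_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
              "Google-Extended", "Bytespider", "CCBot", "cohere-ai"]


def _squash(name):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Lookup from squashed spelling to the exact published name.
_CANONICAL = {_squash(bot): bot for bot in KNOWN_BOTS}


def find_misspelled_agents(robots_txt):
    """Return (found, expected) pairs for near-miss User-agent names."""
    problems = []
    for line in robots_txt.splitlines():
        match = re.match(r"\s*user-agent\s*:\s*([^#\s]+)", line, re.IGNORECASE)
        if not match:
            continue
        found = match.group(1)
        expected = _CANONICAL.get(_squash(found))
        if expected and found != expected:
            problems.append((found, expected))
    return problems


sample = (
    "User-agent: gptbot\nAllow: /\n\n"
    "User-agent: Claude-Bot\nAllow: /\n\n"
    "User-agent: CCBot\nDisallow: /\n"
)
print(find_misspelled_agents(sample))
# [('gptbot', 'GPTBot'), ('Claude-Bot', 'ClaudeBot')]
```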
The Strategic Question: Allow or Block?
This depends on your business model:
| Business Type | Recommendation | Why |
|---|---|---|
| SaaS / Product | Allow all | You want AI agents recommending your product |
| Service business | Allow all | AI referrals = free qualified leads |
| Content publisher | Selective | Allow browsing bots, consider blocking training bots |
| Paywalled content | Block training bots | Prevent free access to paid content |
| Proprietary data | Block all AI bots | Protect intellectual property |
For most businesses, the math is simple: AI agents are the new search engines. Being invisible to them is like blocking Googlebot in 2005. You might have your reasons, but you're choosing to be unfindable by a growing share of internet users.
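If you manage several sites, the recommendations in the table can be turned into a small generator. This is a sketch under stated assumptions: the policy names are invented for illustration, the browsing/training split follows the selective template earlier in this guide, and placing cohere-ai with the training crawlers is a guess based on its listed purpose.

```python
# Grouping follows this guide's selective template; cohere-ai's placement
# among the training crawlers is an assumption.
BROWSING_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]
TRAINING_BOTS = ["Google-Extended", "Bytespider", "CCBot", "cohere-ai"]

# Hypothetical policy names, one per recommendation in the table.
POLICIES = {
    "allow_all": set(),
    "block_training": set(TRAINING_BOTS),
    "block_all_ai": set(BROWSING_BOTS + TRAINING_BOTS),
}


def build_robots_txt(policy, sitemap_url=None):
    """Render a robots.txt string for one of the POLICIES keys."""
    blocked = POLICIES[policy]
    lines = []
    for bot in BROWSING_BOTS + TRAINING_BOTS:
        rule = "Disallow: /" if bot in blocked else "Allow: /"
        lines += [f"User-agent: {bot}", rule, ""]
    # Regular search engines stay allowed under every policy.
    lines += ["User-agent: *", "Allow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt("block_training", "https://yourdomain.com/sitemap.xml"))
```

A content publisher or paywalled site would use `block_training`, while a SaaS or service business would use `allow_all`.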
Beyond robots.txt: The Full AEO Stack
robots.txt is just one of six signals AI agents use to evaluate your site. The complete AEO (AI Engine Optimization) stack includes:
- Structured Data — JSON-LD schemas that AI agents can parse
- robots.txt — Explicit AI bot permissions (this guide)
- llms.txt — Structured context file for AI comprehension
- Content Structure — Clean H1-H3 hierarchy, FAQ schemas
- API Discoverability — OpenAPI specs, ai-plugin.json
- Performance — Response time under 2 seconds
Check all 6 signals at once
The free AEO Checker scans your site across all six AI discoverability signals and gives you a score out of 100.
FAQ
What is GPTBot and should I allow it?
GPTBot is OpenAI's web crawler. It collects data for model improvement and powers ChatGPT's browsing feature. Allowing it means ChatGPT can recommend your site. If you want AI visibility, allow it.
What is ClaudeBot?
ClaudeBot is Anthropic's crawler for Claude AI. It works like GPTBot — letting Claude understand and reference your content. Allowing it means Claude can recommend your products.
Should I block or allow AI bots?
If you want AI agents to recommend your site: allow them. If you have proprietary content: selectively block training bots while allowing browsing bots. Most businesses benefit from maximum AI visibility.
What is Google-Extended?
Google's user-agent for Gemini AI training. It's separate from Googlebot (Search indexing). Blocking it won't affect your Google ranking but prevents Gemini from training on your content.
Does blocking AI bots affect my SEO?
No. AI bots are separate from search engine crawlers. Blocking GPTBot or ClaudeBot has zero impact on your Google/Bing rankings. However, you become invisible in AI-powered answer engines — which is where users increasingly search.