llms.txt: The New robots.txt
In 1994, Martijn Koster proposed robots.txt — a simple text file that told web crawlers what not to do. It became one of the most important files on the internet. Nearly every website has one, and every major search engine honors it.
Thirty-two years later, the web has a new problem. AI agents don't just crawl your site — they try to understand it. And robots.txt was never designed for understanding. It was designed for blocking.
Enter llms.txt: a file that tells AI agents what your site is.
The Paradigm Shift: Blocking → Informing
Here's the fundamental difference between the two files:
🚫 robots.txt (1994)
"Here's what you can't access."
```
User-agent: *
Disallow: /admin/
Disallow: /private/
```
✅ llms.txt (2024)
"Here's what we are and what matters."
```
# Acme Corp
> B2B analytics platform

- Docs: /docs
- API: /api-reference
```
robots.txt is a bouncer. llms.txt is a tour guide.
This isn't a subtle shift. For three decades, the relationship between websites and bots was adversarial: "Stay away from these areas." Now, with AI agents making purchasing recommendations, answering questions, and writing code, the relationship needs to be collaborative: "Here's how to understand us."
Why robots.txt Isn't Enough for AI
Consider what happens when someone asks ChatGPT, Claude, or Perplexity: "What's the best project management tool for remote teams?"
The AI agent doesn't just crawl URLs. It needs to:
- Understand what your product does
- Categorize it against competitors
- Evaluate whether it's relevant to the query
- Cite specific capabilities with confidence
robots.txt tells the agent none of this. It only says which URLs are off-limits. An agent could crawl every allowed page on your site and still misunderstand what you do — because your homepage is 90% hero animations and your value prop is buried in a carousel.
The invisibility problem: A recent Hacker News post put it perfectly — "Most websites are invisible to AI agents." Not because they're blocked, but because they're incomprehensible. Your robots.txt says "come in." But nothing tells the agent what it's looking at.
What llms.txt Actually Contains
The llms.txt file sits at your site root (yoursite.com/llms.txt) and provides structured context in Markdown format:
```
# Your Company Name
> One-line description of what you do

## Docs
- [Getting Started](/docs/quickstart.md): Setup guide for new users
- [API Reference](/docs/api.md): Complete REST API documentation

## Features
- Real-time collaboration for remote teams
- SOC 2 Type II certified
- Integrates with Slack, Jira, and GitHub

## Pricing
- Free tier: up to 5 users
- Pro: $12/user/month
- Enterprise: custom pricing
```
Simple. Human-readable. Machine-parseable. No special tooling required.
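To show just how little tooling is required, here is a minimal sketch in Python that splits an llms.txt file into its title, summary, and sections. The parsing rules are an assumption based on the format shown above (H1 title, blockquote summary, H2 section headings, bulleted links), and the sample content is hypothetical:

```python
def parse_llms_txt(text: str) -> dict:
    """Split llms.txt Markdown into title, summary, and named sections."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:]          # first H1 is the site name
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:]        # blockquote is the one-liner
        elif line.startswith("## "):
            current = line[3:]                  # H2 starts a new section
            result["sections"][current] = []
        elif line.startswith("- ") and current:
            result["sections"][current].append(line[2:])
    return result

sample = """# Acme Corp
> B2B analytics platform

## Docs
- [Getting Started](/docs/quickstart.md): Setup guide
"""

parsed = parse_llms_txt(sample)
print(parsed["title"])    # Acme Corp
print(parsed["summary"])  # B2B analytics platform
```

A dozen lines of string handling, no Markdown library needed — which is exactly the point of the format.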
The Evolution: 1994 → 2026
robots.txt — "Don't crawl these URLs." Solved the problem of bots overwhelming servers.
sitemap.xml — "Here are all our URLs." Helped search engines discover content more efficiently.
Schema.org — "Here's structured data about our content." Enabled rich snippets and knowledge graphs.
llms.txt — "Here's what we are and what matters." Designed for AI agents that need to understand, not just crawl.
Each generation solved a different problem. robots.txt controlled access. sitemap.xml mapped structure. Schema.org added meaning. llms.txt adds context — the one thing AI agents need most and HTML provides least.
Side-by-Side Comparison
| Aspect | robots.txt | llms.txt |
|---|---|---|
| Purpose | Access control | Context & understanding |
| Approach | Deny-list (block) | Curated guide (inform) |
| Format | Custom syntax | Markdown |
| Audience | Web crawlers | AI agents & LLMs |
| Tells agents | Where NOT to go | What your site IS |
| Impact on visibility | Can reduce indexing | Increases AI comprehension |
| Complexity | Low (pattern matching) | Low (structured Markdown) |
| Standard age | 32 years | ~2 years (emerging) |
Who's Already Using llms.txt
The adoption curve is early but accelerating:
- Anthropic — the maker of Claude has a full llms.txt implementation
- GitAuto.ai — Implemented llms.txt and shared results on Hacker News (March 2026)
- Stripe — Structured documentation that AI agents can parse (scored 68/100 on our AEO Check)
- Developer documentation platforms — ReadMe, GitBook, and others are adding llms.txt generation
The pattern is clear: companies that serve developers and AI-native audiences are adopting first. But as AI agents handle more general consumer queries — restaurant recommendations, product comparisons, service evaluations — every business will need to be "AI-legible."
The Business Case: Invisible = Non-Existent
Here's the uncomfortable math:
- 40% of Gen Z now uses AI chatbots for product research instead of Google
- AI-generated answers produce $0 in ad revenue for websites — no click, no visit, no impression
- When an AI agent recommends your competitor because it understood their site better, you lost the deal before you knew it existed
Getting robots.txt right kept you visible in search results. Getting llms.txt right keeps you visible in AI recommendations. The stakes are identical — only the mechanism has changed.
How to Implement Both (5 Minutes)
1. Keep Your robots.txt (Still Essential)
```
User-agent: *
Allow: /

# AI-specific bot rules
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /
```
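If you want to sanity-check rules like these before deploying, Python's standard library ships a robots.txt parser. This sketch uses a hypothetical rules file (with a blocked area added so the per-bot difference is visible) and a placeholder domain:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is blocked from /admin/,
# but GPTBot gets its own group allowing everything.
rules = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# GPTBot matches its own group; unknown bots fall back to the * group.
print(parser.can_fetch("GPTBot", "https://example.com/admin/"))   # True
print(parser.can_fetch("SomeBot", "https://example.com/admin/"))  # False
print(parser.can_fetch("SomeBot", "https://example.com/docs"))    # True
```

The same parser can point at a live file via `RobotFileParser("https://yoursite.com/robots.txt")` plus `read()`, which is a quick way to confirm your deployed rules say what you think they say.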
2. Add llms.txt at Your Root
```
# Your Company
> What you do in one sentence

## Key Pages
- [Product](/product): Main product description
- [Docs](/docs): Technical documentation
- [Pricing](/pricing): Plans and pricing

## Quick Facts
- Founded: 2020
- Category: [your category]
- Key differentiator: [what makes you unique]
```
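Once both files are in place, it's worth confirming they are actually served from your root. A quick sketch using only the Python standard library — `example.com` is a placeholder for your own domain:

```python
import urllib.error
import urllib.request

def is_served(url: str) -> bool:
    """Return True if the URL answers a HEAD request with 200 OK."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False

for path in ("/robots.txt", "/llms.txt"):
    print(path, is_served("https://example.com" + path))
```

A HEAD request avoids downloading the body; watch out for frameworks that return a soft 200 with an HTML error page, which this simple check won't catch.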
3. Check Your Score
Use AEO Check to scan your site. It evaluates your llms.txt, robots.txt AI bot rules, structured data, and overall AI agent readability — all for free.
Is Your Site Visible to AI Agents?
Scan your website in 30 seconds. Free, unlimited, no signup.
Check Your AEO Score →
Frequently Asked Questions
What is the difference between robots.txt and llms.txt?
robots.txt tells crawlers what not to access (blocking rules). llms.txt tells AI agents what your site is (context and understanding). robots.txt is defensive; llms.txt is informative. You need both.
Do I need both robots.txt and llms.txt?
Yes. robots.txt controls crawler access (still essential for SEO). llms.txt provides context for AI agents that are already allowed to access your site. They serve complementary purposes — like having both a lock on your door and a welcome sign.
Will llms.txt replace robots.txt?
No. llms.txt adds a new layer on top of robots.txt. robots.txt handles access control. llms.txt handles understanding. Both are needed as the web shifts from pure search to AI-mediated discovery.
Which AI agents read llms.txt?
As of 2026, llms.txt is supported by Anthropic's Claude, various AI coding assistants, and a growing number of AI agents. ChatGPT and Perplexity primarily use traditional crawling but may adopt the standard as it matures.
How do I check if my site is optimized for AI agents?
Use AEO Check to scan your site for free. It checks llms.txt, robots.txt AI bot rules, structured data, and other factors that determine your visibility to AI agents.