llms.txt: The New robots.txt

March 9, 2026 · 8 min read · Updated March 2026

In 1994, Martijn Koster proposed robots.txt — a simple text file that told web crawlers which parts of a site to stay out of. It became one of the most important files on the internet. Nearly every website has one, and every major search engine respects it.

Thirty-two years later, the web has a new problem. AI agents don't just crawl your site — they try to understand it. And robots.txt was never designed for understanding. It was designed for blocking.

Enter llms.txt: a file that tells AI agents what your site is.

The Paradigm Shift: Blocking → Informing

Here's the fundamental difference between the two files:

🚫 robots.txt (1994)

"Here's what you can't access."

User-agent: *
Disallow: /admin/
Disallow: /private/

✅ llms.txt (2024)

"Here's what we are and what matters."

# Acme Corp
> B2B analytics platform
- Docs: /docs
- API: /api-reference

robots.txt is a bouncer. llms.txt is a tour guide.

This isn't a subtle shift. For three decades, the relationship between websites and bots was adversarial: "Stay away from these areas." Now, with AI agents making purchasing recommendations, answering questions, and writing code, the relationship needs to be collaborative: "Here's how to understand us."

Why robots.txt Isn't Enough for AI

Consider what happens when someone asks ChatGPT, Claude, or Perplexity: "What's the best project management tool for remote teams?"

The AI agent doesn't just crawl URLs. It needs to:

  1. Understand what your product does
  2. Categorize it against competitors
  3. Evaluate whether it's relevant to the query
  4. Cite specific capabilities with confidence

robots.txt tells the agent none of this. It only says which URLs are off-limits. An agent could crawl every allowed page on your site and still misunderstand what you do — because your homepage is 90% hero animations and your value prop is buried in a carousel.

The invisibility problem: A recent Hacker News post put it perfectly — "Most websites are invisible to AI agents." Not because they're blocked, but because they're incomprehensible. Your robots.txt says "come in." But nothing tells the agent what it's looking at.

What llms.txt Actually Contains

The llms.txt file sits at your site root (yoursite.com/llms.txt) and provides structured context in Markdown format:

# Your Company Name
> One-line description of what you do

## Docs
- [Getting Started](/docs/quickstart.md): Setup guide for new users
- [API Reference](/docs/api.md): Complete REST API documentation

## Features
- Real-time collaboration for remote teams
- SOC 2 Type II certified
- Integrates with Slack, Jira, and GitHub

## Pricing
- Free tier: up to 5 users
- Pro: $12/user/month
- Enterprise: custom pricing

Simple. Human-readable. Machine-parseable. No special tooling required.
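To see just how parseable that format is, here's a minimal Python sketch of how an agent might load an llms.txt file into structured data. The function name and dict layout are illustrative assumptions, not part of any spec:

```python
def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt Markdown into a title, summary, and named sections.

    Assumes the conventional layout: one '# ' title, an optional '> '
    summary line, then '## ' sections containing '- ' list items.
    """
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the '## ' section we are inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("- ") and current is not None:
            result["sections"][current].append(line[2:].strip())
    return result
```

Run it on the example above and you get the title "Your Company Name", the one-line summary, and three sections (Docs, Features, Pricing) — about fifteen lines of code, no dependencies.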

The Evolution: 1994 → 2026

1994

robots.txt — "Don't crawl these URLs." Solved the problem of bots overwhelming servers.

2005

sitemap.xml — "Here are all our URLs." Helped search engines discover content more efficiently.

2011

Schema.org — "Here's structured data about our content." Enabled rich snippets and knowledge graphs.

2024

llms.txt — "Here's what we are and what matters." Designed for AI agents that need to understand, not just crawl.

Each generation solved a different problem. robots.txt controlled access. sitemap.xml mapped structure. Schema.org added meaning. llms.txt adds context — the one thing AI agents need most and HTML provides least.

Side-by-Side Comparison

| Aspect | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Access control | Context & understanding |
| Approach | Deny-list (block) | Allow-list (inform) |
| Format | Custom syntax | Markdown |
| Audience | Web crawlers | AI agents & LLMs |
| Tells agents | Where NOT to go | What your site IS |
| Impact on visibility | Can reduce indexing | Increases AI comprehension |
| Complexity | Low (pattern matching) | Low (structured Markdown) |
| Standard age | 32 years | ~2 years (emerging) |

Who's Already Using llms.txt

The adoption curve is early but accelerating.

The pattern is clear: companies that serve developers and AI-native audiences are adopting first. But as AI agents handle more general consumer queries — restaurant recommendations, product comparisons, service evaluations — every business will need to be "AI-legible."

The Business Case: Invisible = Non-Existent

Here's the uncomfortable truth: to an AI agent, an incomprehensible site might as well not exist.

robots.txt ensured you could appear in search results. llms.txt ensures you can appear in AI recommendations. The stakes are identical — only the mechanism has changed.

How to Implement Both (5 Minutes)

1. Keep Your robots.txt (Still Essential)

User-agent: *
Allow: /

# AI-specific bot rules
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /
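Before deploying rules like these, you can sanity-check them with Python's standard-library robotparser. This sketch parses the text above locally and makes no network requests:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the example above
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /
"""

def bot_can_fetch(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if the given user agent may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

for bot in ("GPTBot", "ClaudeBot", "Google-Extended"):
    # All three should be allowed: the first two match their own
    # records, and Google-Extended falls back to the "*" record.
    print(bot, "allowed:", bot_can_fetch(ROBOTS_TXT, bot))
```

The same helper catches accidental blocks: swap in a file containing `Disallow: /` under `User-agent: GPTBot` and the check flips to False.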

2. Add llms.txt at Your Root

# Your Company
> What you do in one sentence

## Key Pages
- [Product](/product): Main product description
- [Docs](/docs): Technical documentation
- [Pricing](/pricing): Plans and pricing

## Quick Facts
- Founded: 2020
- Category: [your category]
- Key differentiator: [what makes you unique]
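Before publishing, you can lint your draft locally. A minimal sketch that checks for the conventional pieces (an H1 title, a `>` summary near the top, at least one `##` section) — the checks themselves are illustrative, not a formal validator:

```python
def lint_llms_txt(text: str) -> list[str]:
    """Return a list of problems in an llms.txt draft (empty list = looks fine)."""
    lines = [l.strip() for l in text.splitlines() if l.strip()]
    problems = []
    # Convention: the file opens with an H1 naming the company or project
    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be an H1 title, e.g. '# Your Company'")
    # Convention: a '>' blockquote one-liner sits near the top
    if not any(l.startswith("> ") for l in lines[:3]):
        problems.append("missing '>' one-line summary near the top")
    # Convention: content is grouped under '## ' sections
    if not any(l.startswith("## ") for l in lines):
        problems.append("no '## ' sections found")
    return problems
```

Run it against the template above and it returns an empty list; run it against a bare "Welcome to our site" page and it flags all three problems.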

3. Check Your Score

Use AEO Check to scan your site. It evaluates your llms.txt, robots.txt AI bot rules, structured data, and overall AI agent readability — all for free.


Frequently Asked Questions

What is the difference between robots.txt and llms.txt?

robots.txt tells crawlers what not to access (blocking rules). llms.txt tells AI agents what your site is (context and understanding). robots.txt is defensive; llms.txt is informative. You need both.

Do I need both robots.txt and llms.txt?

Yes. robots.txt controls crawler access (still essential for SEO). llms.txt provides context for AI agents that are already allowed to access your site. They serve complementary purposes — like having both a lock on your door and a welcome sign.

Will llms.txt replace robots.txt?

No. llms.txt adds a new layer on top of robots.txt. robots.txt handles access control. llms.txt handles understanding. Both are needed as the web shifts from pure search to AI-mediated discovery.

Which AI agents read llms.txt?

As of 2026, llms.txt is supported by Anthropic's Claude, various AI coding assistants, and a growing number of AI agents. ChatGPT and Perplexity primarily use traditional crawling but are expected to adopt the standard as it matures.

How do I check if my site is optimized for AI agents?

Use AEO Check to scan your site for free. It checks llms.txt, robots.txt AI bot rules, structured data, and other factors that determine your visibility to AI agents.