llms.txt Is a Good Idea That Nobody's Actually Reading
The new standard for telling AI what's on your site has a problem: the bots aren't checking it.
In 2024, Jeremy Howard of Answer.AI proposed llms.txt, a deceptively simple idea: a plain-text file sitting in your website's root directory, written in Markdown, that curates your most important pages specifically for AI systems. Think of it as a sitemap for LLMs — not for Google, but for ChatGPT, Claude, Perplexity, and other conversational AI platforms. Instead of hoping AI crawlers piece together your site from raw HTML and JavaScript, you hand them a clean map of what matters most.
The appeal is genuine. Websites are messy. Navigation menus repeat content, sidebar widgets add noise, ads clutter the signal. An AI system ingesting your entire site through raw HTML has to do a lot of work to figure out what's important and what's peripheral. Llms.txt sidesteps this problem. You tell the AI: "Here are my best resources. These are organized. These are authoritative. Start here."
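Howard's proposal specifies a simple Markdown structure: an H1 with the site or project name, a blockquote summary, then H2 sections containing annotated links. A minimal sketch for a hypothetical documentation site (all names and URLs here are placeholders, not part of any spec):

```markdown
# Example Widgets

> Example Widgets maintains an open-source widget library. This file lists
> the documentation most useful to AI systems reading this site.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): installation and first steps
- [API Reference](https://example.com/docs/api.md): complete endpoint listing

## Optional

- [Blog](https://example.com/blog): release notes and tutorials
```

The "Optional" section is part of the proposed format: it marks links an AI system can skip when context is limited.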
For site owners, the cost is near-zero. Writing a few hundred words to describe your content strategy and link your key pages takes maybe an hour. The upside could be real: better context for AI systems that ingest your work, higher quality citations when AI pulls your research, cleaner ingestion for future AI use cases we haven't imagined yet.
And then there's the problem: nobody's reading it.
The Data on Adoption
A 2025 study of 1,000 domains found zero visits from GPTBot, ClaudeBot, PerplexityBot, or Google-Extended to llms.txt pages over a 30-day observation period. Not a few visits. Not slow adoption. Zero. The researchers set up tracking pixels and monitored access logs across the 1,000 participating domains, all of which had implemented llms.txt files. The silence was complete.
That's the most direct signal available. But there's more supporting evidence. A separate analysis of 300,000 domains found no statistical correlation between having an llms.txt file and being cited by LLMs. If llms.txt were being honored by these systems, you'd expect a clear signal: domains with llms.txt files would show up more often in AI-generated citations, rank higher in AI search results, and see measurable traffic from AI applications.
Instead, the data shows flatness. A domain with a carefully curated llms.txt file and a domain without one appear indistinguishable to the major AI platforms.
Why? Start with adoption. Only about 10% of domains have implemented llms.txt at all. This creates a chicken-and-egg problem for AI companies. Why build crawler support for a file that 90% of the web doesn't have? Why prioritize honoring a format when the vast majority of your training data and real-time crawl targets don't use it? The incentive structure points the wrong direction.
None of the major AI labs have officially committed to honoring llms.txt. Not OpenAI. Not Anthropic. Not Google. In fact, Google explicitly stated that its crawlers don't support llms.txt. The silence from other platforms is louder than any public statement — if it mattered to them, they'd say so.
The robots.txt Analogy
The llms.txt proposal always included an implicit comparison to robots.txt, the now-ubiquitous file that tells web crawlers which parts of your site to crawl and which to skip. Robots.txt is over thirty years old and remains one of the internet's most respected standards, honored by Google, Bing, and thousands of third-party crawlers.
But there's a crucial detail that gets glossed over in the robots.txt parallel: robots.txt only became a standard because crawlers chose to respect it. It started as a convention, proposed by Martijn Koster in 1994. It worked because AltaVista, Lycos, and later Google and other search engines made deliberate engineering decisions to read it and honor it. The adoption was driven by the platforms that stood to benefit from a more efficient web.
Llms.txt needs the same commitment. It needs OpenAI, Anthropic, Perplexity, and Google to decide it's worth the engineering effort to read it and honor it during crawling. That decision hasn't happened. Until it does, llms.txt is a standard without a use.
Should You Still Add It?
The pragmatic answer is probably yes, but for hedging reasons, not for immediate impact.
Adding an llms.txt file costs almost nothing. If you already have a sitemap or a knowledge base organized by topics, you can adapt that into an llms.txt file in less time than this article took to read. The maintenance cost is minimal — you update it when you publish major new content, not constantly.
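If you have a sitemap, the adaptation can be largely mechanical. Here's a small sketch that turns a sitemap.xml string into an llms.txt skeleton for hand-curation; the function name and the TODO placeholders are my own, and you'd still want to prune and describe the links yourself:

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace from the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_to_llms_txt(sitemap_xml, site_name, summary):
    """Turn a sitemap.xml string into an llms.txt skeleton to curate by hand."""
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc")]
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Pages", ""]
    lines += [f"- [{url}]({url}): TODO describe this page" for url in urls]
    return "\n".join(lines)
```

Write the result to /llms.txt at your site root, then replace each TODO with a one-line description and delete anything that isn't worth an AI system's attention.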
The upside is speculative but not unreasonable. AI platforms may decide tomorrow to start honoring llms.txt. If and when they do, sites that have already published clean, well-organized llms.txt files will have an immediate advantage. You'll have essentially prepared your content for a format that could become standard.
That said, don't expect it to change what AI sees about your site today. If you're publishing llms.txt expecting a measurable bump in AI citations next month, you'll be disappointed. The adoption simply isn't there.
The Bigger Picture
There's a temptation, when discussing llms.txt, to treat it as a solved problem for "AI visibility." It's not. The real problem isn't that AI doesn't know where your content is. GPTBot, ClaudeBot, and PerplexityBot have already crawled your site multiple times if you have any meaningful traffic. They've found your pages. They've read them.
The problem is that they can't read them correctly when they get there.
Modern websites are built for browsers, not crawlers. Content is loaded dynamically via JavaScript — and that's a problem the standards layer can't fix on its own. Text is stripped from semantic context by DOM manipulation. Buttons and forms are invisible to systems that don't execute JavaScript. A product page might be fully rendered in your browser, complete with pricing, availability, reviews, and imagery. But to GPTBot, it's a pile of script tags and an empty div.
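The product-page scenario above looks something like this in practice. This is a toy illustration, not any real site's markup; the endpoint and fields are invented:

```html
<!-- What the browser renders: name, price, stock, reviews. -->
<!-- What a crawler that doesn't execute JavaScript sees: exactly this. -->
<div id="product"></div>
<script>
  // The content only exists after this script runs. A non-executing
  // crawler never sees the price or availability, just an empty div.
  fetch("/api/products/123")
    .then(function (r) { return r.json(); })
    .then(function (p) {
      document.getElementById("product").textContent =
        p.name + ": " + p.price + " (" + p.stock + " in stock)";
    });
</script>
```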
Llms.txt doesn't solve rendering. It doesn't solve semantic stripping. It doesn't solve the JavaScript invisibility problem. It's a pointer to content, not a fix for content that's being read incompletely or incorrectly.
That's where the real work lies. And that's a problem that a text file can't solve.
For now, llms.txt is worth adding as insurance. It's a signal that you respect this emerging layer of the internet. But the more pressing question for anyone concerned about AI visibility isn't how to tell AI about your content — it's how to make sure your content is actually readable when AI gets there. All of these metadata systems — llms.txt, Schema.org, robots.txt — share the same limitation: they only work if the underlying content is being read correctly. For a full picture of where the standards debate is heading, see The Race to Build Web Standards for AI.