llms.txt

llms.txt is a proposed plain-markdown file served at /llms.txt that gives large language models a curated, token-efficient map of a website's most important content.

Also known as: LLMs.txt, llms-full.txt, the llms.txt standard

Where llms.txt came from

llms.txt was proposed by Jeremy Howard of Answer.AI in September 2024, with the specification published at llmstxt.org. The motivating observation is simple: language models work inside a finite context window, and a typical HTML page spends most of its bytes on navigation, scripts, styling, and boilerplate rather than the content itself. A model (or an agent built on one) that wants to understand your site has to burn tokens wading through all of that.

The proposal borrows the shape of robots.txt and sitemap.xml: a single well-known file at a predictable root path. Instead of access rules or a URL dump, it holds a short, human-curated summary of what the site is and links to its most important pages, written in plain markdown so any model can parse it without an HTML pipeline.

The format

An llms.txt file is ordinary markdown with a fixed skeleton:

  • An H1 with the site or project name. This is the only required element.
  • A blockquote summary directly under the H1: one or two sentences describing what the site is and who it serves.
  • H2 sections containing markdown link lists. Each section groups related pages (for example “Docs”, “Guides”, “Pricing”), and each list item is a link followed by an optional one-line description of what a model will find at that URL.
  • An optional “Optional” section for secondary links a model can skip when its context budget is tight.
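
The skeleton above is regular enough to read with a few lines of code. The following is a minimal sketch rather than a reference parser: the function name `parse_llms_txt`, the sample file, and the example.com URLs are hypothetical, and it assumes each list item follows the `- [title](url): description` shape used at llmstxt.org. It keeps only the first blockquote line as the summary.

```python
import re

# One list item: "- [Title](https://url)" with an optional ": description".
# The ": " separator is an assumption based on the llmstxt.org convention.
LINK_ITEM = re.compile(
    r"^[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and H2 sections."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section we are currently inside
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith(">") and current is None and result["summary"] is None:
            result["summary"] = line.lstrip(">").strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                result["sections"][current].append(
                    (match.group("title"), match.group("url"), match.group("desc") or "")
                )
    if result["title"] is None:
        # The H1 is the only required element.
        raise ValueError("llms.txt has no H1 title")
    return result


# Hypothetical sample file used only for illustration.
SAMPLE = """# Example Co

> Example Co sells widgets to small businesses.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Install and first run
- [API](https://example.com/docs/api)

## Optional

- [Changelog](https://example.com/changelog): Release history
"""

parsed = parse_llms_txt(SAMPLE)
```

A consumer with a tight context budget could then drop `parsed["sections"]["Optional"]` before following any links.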

A companion convention, llms-full.txt, is an expanded single-file variant. Where llms.txt is a linked table of contents, llms-full.txt inlines the full content of the referenced pages into one large markdown document, so a model can ingest everything in a single fetch. Documentation sites are the most common adopters of the full variant.
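
To make the difference concrete, here is a rough sketch of how a full variant might be assembled once the page bodies are already available as markdown. The function name `build_llms_full`, the `Source:` line, and the sample pages are all assumptions made for illustration; the exact layout of llms-full.txt varies between sites.

```python
def build_llms_full(title, summary, pages):
    """Inline page bodies into one markdown document.

    pages is a list of (heading, url, markdown_body) tuples in reading order.
    """
    parts = [f"# {title}", "", f"> {summary}", ""]
    for heading, url, body in pages:
        parts.append(f"## {heading}")
        parts.append(f"Source: {url}")  # hypothetical provenance line
        parts.append("")
        parts.append(body.strip())
        parts.append("")
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical pages, already converted from HTML to markdown elsewhere.
pages = [
    ("Quickstart", "https://example.com/docs/quickstart",
     "Install the CLI, then run `widget init`.\n"),
    ("API", "https://example.com/docs/api", "All endpoints accept JSON."),
]

full_doc = build_llms_full("Example Co", "Example Co sells widgets.", pages)
```

The trade-off is size: the linked file stays a few kilobytes, while the inlined one grows with every page it includes.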

What llms.txt does and does not do

Honest status first: as of mid-2026, no major AI vendor has officially committed to fetching llms.txt as part of a published crawl or retrieval policy. OpenAI, Google, Anthropic, and Perplexity all document their crawlers and user agents; none of those documents name llms.txt as an input. Treat any claim that llms.txt directly improves your LLM visibility today with skepticism.

It also has no role in training control. Whether an AI company may crawl your site for training data is governed by robots.txt directives aimed at agents like GPTBot, along with whatever terms you publish. Adding an llms.txt file neither grants nor revokes that permission.
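
Because the permission question lives entirely in robots.txt, it can be checked without looking at llms.txt at all. The sketch below uses Python's standard `urllib.robotparser` against an inline, hypothetical robots.txt that blocks GPTBot and allows everyone else; the helper name `may_fetch` and the URLs are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot everywhere, allow all other agents.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


gptbot_allowed = may_fetch(ROBOTS_TXT, "GPTBot", "https://example.com/docs/")
other_allowed = may_fetch(ROBOTS_TXT, "SomeOtherBot", "https://example.com/docs/")
```

Here `gptbot_allowed` is `False` and `other_allowed` is `True`, and neither result would change if the site also published an llms.txt.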

What it does offer: cheap insurance. The file costs minutes to create, weighs a few kilobytes, and is immediately useful to any agent that does fetch it. That population is real and growing: coding assistants pointed at your docs, custom research agents, retrieval pipelines that check well-known paths, and developer tools that support the convention. If the major assistants adopt it later, early adopters are already done. If they never do, you spent minutes producing a clean summary of your own site, which tends to be useful anyway.

How it relates to robots.txt and sitemap.xml

The three root files answer three different questions, and llms.txt makes the most sense viewed alongside its two older siblings:

  • robots.txt is access control. It tells each crawler, by user agent, which paths it may fetch. Compliance is voluntary, but this is where AI-training directives live, via rules targeting agents such as GPTBot.
  • sitemap.xml is an exhaustive URL inventory. It lists every indexable page for search engine crawlers and, beyond optional hints such as priority, says nothing about which pages matter most or what any of them contain.
  • llms.txt is a curated reading list. It says: if you can only spend a few thousand tokens understanding this site, read these pages, in this order, and here is a one-line summary of each.

They complement rather than replace each other. A site can and usually should serve all three: robots.txt to set the rules, sitemap.xml for completeness, and llms.txt for the short version aimed at context-constrained readers.

How to create an llms.txt file

The manual route: open a text editor, write an H1 with your site name, add a blockquote describing the business in one or two sentences, then group your 10 to 30 most important URLs under H2 sections with a one-line description each. Save it as plain text and serve it at the root of your domain, at /llms.txt. Regenerate it when your key pages change, the same way you would keep a sitemap current.
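
If the list of key pages already lives somewhere structured, the manual route can be scripted. This is a small sketch under stated assumptions: `render_llms_txt`, the section names, and the example.com URLs are hypothetical, and the `- [title](url): description` item shape follows the convention shown at llmstxt.org.

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt document.

    sections maps an H2 heading to a list of (title, url, description) tuples;
    description may be an empty string.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            item = f"- [{title}]({url})"
            if description:
                item += f": {description}"
            lines.append(item)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data used only for illustration.
sections = {
    "Product": [
        ("Pricing", "https://example.com/pricing", "Plans and limits"),
    ],
    "Docs": [
        ("Quickstart", "https://example.com/docs/quickstart", "Install and first run"),
        ("API", "https://example.com/docs/api", ""),
    ],
}

llms_txt = render_llms_txt(
    "Example Co", "Example Co sells widgets to small businesses.", sections
)
```

Writing `llms_txt` to a file named llms.txt in the web root, and rerunning the script whenever the key pages change, covers the regeneration step described above.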

The faster route: our free llms.txt generator crawls your site, drafts the summary and the curated link list in the correct format, and gives you a file you can review and upload. No signup required.

We practice what we describe here: rank.ai serves its own file at https://www.rank.ai/llms.txt, and you can use it as a working reference for the format.

See it in the product

Check Your AI Ranking

llms.txt is one input; the output that matters is whether AI assistants actually mention and cite you. Track your brand across ChatGPT, Claude, Gemini, Perplexity, and more with daily re-checks.

Frequently asked.

Do ChatGPT, Google, or Claude actually read llms.txt?
As of mid-2026, no major AI vendor has officially committed to fetching llms.txt as part of a published crawl or retrieval policy. Some agents and developer tools do fetch it: coding assistants pointed at documentation, custom research agents, and retrieval pipelines that check well-known root paths. The honest framing is that llms.txt is cheap insurance with a real but currently niche audience, and its value grows if major assistants adopt the convention.
Does llms.txt stop AI companies from training on my content?
No. Training-crawl permissions are expressed in robots.txt through directives aimed at specific user agents such as GPTBot, plus whatever terms of use you publish. llms.txt has no access-control semantics at all. It is a content map for models that are already allowed to read your site, and adding one neither grants nor revokes crawl permission.
What is the difference between llms.txt and llms-full.txt?
llms.txt is a curated table of contents: an H1, a blockquote summary, and H2 sections of markdown links with short descriptions. llms-full.txt is the expanded variant that inlines the complete content of those referenced pages into one large markdown file, so a model can ingest everything in a single fetch instead of following links. Documentation sites commonly serve both.
Where should the file live and what format is it?
At the root of your domain, at /llms.txt, served as plain text. The content is ordinary markdown: an H1 with the site name (the only required element), a blockquote summary, and H2 sections containing link lists. The spec at llmstxt.org defines the full grammar, including the optional 'Optional' section for links a model may skip under a tight context budget.
What should I put in my llms.txt?
Your 10 to 30 highest-value pages, grouped into logical H2 sections, each with a one-line description of what a reader will find there. For a SaaS business that usually means the product pages, pricing, documentation, and a handful of cornerstone guides. Leave out tag archives, pagination, and thin pages; the whole point is curation, and the sitemap already covers completeness.
Is llms.txt an official web standard?
No. It is a community proposal by Jeremy Howard of Answer.AI, published in September 2024 at llmstxt.org. Adoption so far has been grassroots, strongest among developer-tool and documentation sites. robots.txt started in a similar way, as an informal convention that crawlers honored for decades before it was standardized as RFC 9309 in 2022. The difference is that search crawlers adopted robots.txt early, whereas llms.txt is still waiting on AI vendors to formalize support.
Will adding llms.txt improve my AI visibility?
Directly and immediately, probably not, since the major assistants have not committed to reading it. The reliable levers for AI visibility remain the ones covered in our answer engine optimization entry: answer-shaped content, authority, consistent brand mentions across credible sites, and clean crawlability. llms.txt is a low-cost complement to that work, useful today for agents that fetch it and positioned well if adoption spreads.
How do I generate one without writing it by hand?
Use our free llms.txt generator at /free-tools/llms-txt-generator. It crawls your site, drafts the summary and the curated link list in the correct markdown structure, and produces a file you can review, edit, and upload to your web root. You can also inspect rank.ai's own file at https://www.rank.ai/llms.txt as a working example of the format.
