What is llms.txt?

llms.txt is a Markdown file at a site's root that gives AI crawlers a curated map of what the site is about and which pages matter most.

llms.txt is an emerging standard - a simple Markdown file hosted at yourdomain.com/llms.txt - that tells large language models what your site is about, summarizes your business, and links to your most important pages.

It works like robots.txt, but instead of controlling crawling, it helps AI systems understand and represent your brand accurately. It's a low-effort machine-readability signal that complements (but doesn't replace) strong content and authority.

What exactly is llms.txt and what does the file look like?

llms.txt is a plain-text file, written in Markdown, that a website publishes to give large language models a clean, curated map of its most important content. Instead of forcing an AI crawler to wade through navigation menus, cookie banners, ads, and JavaScript, llms.txt hands it a concise, machine-readable summary of what the site is and where the good stuff lives. It was proposed by Jeremy Howard of Answer.AI in September 2024 as a way to make sites easier for LLMs to read and reason over.

The format is deliberately simple. It opens with a single H1 that names the project or site. Below that sits an optional blockquote that summarizes, in a sentence or two, what the site offers and who it serves. After the summary you list curated links grouped under H2 headings, where each link points to a key page and carries a short description of what that page covers. A common convention is an H2 named Optional that holds secondary links an LLM can skip when it needs to save tokens or stay focused.
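To make the layout concrete, here is a minimal Python sketch that holds a sample file as a string and splits it into its parts. The brand "Example Co", the example.com URLs, the descriptions, and the `parse_llms_txt` helper are all hypothetical placeholders, and the parser handles only the simple shape described above, not every Markdown variation.

```python
import re

# Hypothetical example file; the brand, URLs, and descriptions are placeholders.
EXAMPLE_LLMS_TXT = """\
# Example Co

> Example Co sells project-management software for small agencies.

## Guides

- [Getting started](https://example.com/guides/start): Setup walkthrough for new accounts
- [Pricing](https://example.com/pricing): Plans and billing details

## Optional

- [Company history](https://example.com/about/history): Background on the founding team
"""

# Matches "- [title](url): description", where the description is optional.
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$")


def parse_llms_txt(text):
    """Split an llms.txt document into its title, summary, and link sections."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()          # the single H1
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()        # the optional blockquote
        elif line.startswith("## "):
            current = line[3:].strip()               # a new H2 group
            doc["sections"][current] = []
        elif current is not None:
            match = LINK.match(line)
            if match:
                doc["sections"][current].append(match.groupdict())
    return doc


doc = parse_llms_txt(EXAMPLE_LLMS_TXT)
```

Here `doc["sections"]` keeps the H2 groups in file order, so a consumer that wants to save tokens could drop the "Optional" group and keep the rest.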

How is llms.txt like robots.txt but different?

The easiest mental model is robots.txt for LLMs, with one crucial difference. robots.txt tells crawlers what they are not allowed to access, so it is a file about restriction and exclusion. llms.txt does the opposite. It is a file about invitation and curation, telling AI systems "here is exactly what matters, and here is how to find it fast."

The two files also serve different audiences. robots.txt is a decades-old standard read mainly by search engine bots, which follow its crawl directives. llms.txt is aimed at the models behind ChatGPT, Perplexity, Claude, Gemini, and AI Overviews, which try to understand and summarize your content rather than simply index it. That comprehension layer is what generative engine optimization targets. The two files coexist happily. You keep robots.txt to manage crawl access and add llms.txt to guide comprehension.
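The contrast can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. The robots.txt rules, the `ExampleBot` agent name, the example.com paths, and the `CURATED` list standing in for an llms.txt are all hypothetical; the point is only that one file answers an access question and the other a recommendation question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the path and site are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical paths that an llms.txt on the same site chooses to highlight.
CURATED = ["/guides/start", "/pricing"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """robots.txt answers one question: may this agent fetch this URL?"""
    return parser.can_fetch(agent, "https://example.com" + path)


def recommended(path):
    """llms.txt answers a different one: is this page on the curated list?"""
    return path in CURATED


print(allowed("/admin/settings"), recommended("/admin/settings"))
print(allowed("/blog/old-post"), recommended("/blog/old-post"))
print(allowed("/pricing"), recommended("/pricing"))
```

A page such as `/blog/old-post` is crawlable but simply not highlighted, which is why leaving a page out of llms.txt does not block it.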

Where do you host llms.txt and what is llms-full.txt?

Host llms.txt at the root of your domain so it resolves at https://yourdomain.com/llms.txt. The root location is part of the convention, the same way robots.txt lives at the root, so AI crawlers and tools know exactly where to look without guessing. Serve it as plain text and keep the links absolute so a model can follow them without resolving relative paths.
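As a small illustration of the root-location and absolute-link conventions, the following Python sketch derives the expected llms.txt address from any page URL and expands relative links before they go into the file. The helper names `llms_txt_url` and `absolutize` and the yourdomain.com addresses are placeholders, not part of any tool.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def llms_txt_url(any_page_url):
    """Return the root-level llms.txt URL for the site hosting any_page_url."""
    parts = urlsplit(any_page_url)
    # Keep the scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def absolutize(site_root, link):
    """Expand a relative link against the site root; absolute links pass through."""
    return urljoin(site_root, link)


print(llms_txt_url("https://yourdomain.com/blog/post?ref=1"))
print(absolutize("https://yourdomain.com/", "/pricing"))
```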

There is a companion file called llms-full.txt. Where llms.txt is a lean index of links and descriptions, llms-full.txt inlines the actual content of those pages into one long Markdown document. The idea is that a model can ingest your full documentation or knowledge base in a single fetch rather than crawling page by page. It is popular with software and documentation sites. For most marketing sites the lighter llms.txt is the practical starting point, and you add llms-full.txt only when you genuinely want the entire corpus available in one file.
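One way to picture llms-full.txt is as a simple concatenation of page bodies. The sketch below assumes you already have each page as Markdown text; the `build_llms_full` helper, the "Source:" line, and the sample pages are illustrative choices, since the proposal does not prescribe a strict layout for this file.

```python
def build_llms_full(site_title, pages):
    """Concatenate page bodies into one long Markdown document.

    pages is a list of (title, url, markdown_body) tuples.
    """
    parts = [f"# {site_title}", ""]
    for title, url, body in pages:
        # Each page becomes its own H2 section with a pointer to the original.
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts)


# Hypothetical pages; in practice the bodies would come from your CMS or docs build.
pages = [
    ("Getting started", "https://example.com/guides/start", "Create an account, then invite your team."),
    ("Pricing", "https://example.com/pricing", "Plans are billed monthly per seat."),
]
full_text = build_llms_full("Example Co", pages)
```

Because every body is inlined, the output grows with the size of the site, which is the practical reason the lighter index is usually the better first step.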

Does llms.txt actually help, and is it a real standard yet?

Honest answer first. llms.txt is an emerging proposal, not an official standard, and as of 2025 the major AI providers have not publicly confirmed that their crawlers read it. Google's John Mueller has been openly skeptical, comparing it to the long-ignored keywords meta tag, and noted that no AI system he was aware of used it. So you should treat it as a low-cost bet rather than a guaranteed ranking lever.

That said, adoption is climbing. Documentation platforms like Mintlify generate llms.txt automatically, and a growing list of developer-focused companies, including Anthropic, Cloudflare, and Stripe, now publish one. Some crawlers and open-source agents already look for it. The realistic view is that llms.txt costs little to add, signals that your site is AI-aware, and positions you for a convention that may harden over the next year or two. It will not rescue thin content, but it removes friction for any model that chooses to use it.

How does llms.txt fit with schema and strong content?

llms.txt is one signal in a stack, not a replacement for the fundamentals. Structured data, meaning JSON-LD schema like Article, FAQPage, Organization, and Product, still does the heavy lifting of telling machines what each entity on a page means. Schema lives inside individual pages and describes them in detail. llms.txt sits above all of that and describes the site as a whole, pointing models toward the pages worth reading.
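For comparison with the site-level llms.txt, here is a minimal Python sketch of the page-level layer: it builds an Organization object and wraps it in the script tag that JSON-LD uses. The company details and the `json_ld_script` helper are hypothetical; only the `@context`, `@type`, and property names come from schema.org.

```python
import json

# Hypothetical company details; replace with your own.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "description": "Project-management software for small agencies.",
}


def json_ld_script(data):
    """Wrap a schema.org object in the script tag used to embed JSON-LD in a page."""
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


print(json_ld_script(organization))
```

The output belongs in the HTML of one specific page, whereas llms.txt is a single file that describes the whole site.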

Underneath both, content quality remains the deciding factor. AI engines cite sources that answer a question clearly, lead with a direct answer, and back claims with specifics. llms.txt and schema make that strong content easier for a machine to find and parse, but they cannot manufacture authority on their own. The winning combination is genuinely useful, answer-first pages, marked up with accurate schema, and surfaced through a clean llms.txt index.

How do you create an llms.txt file?

You can write one by hand in a few minutes or generate it. The manual route is straightforward and gives you the most control over what models see first.

Start with a single H1 line that names your site or brand, such as your company name.
Add a blockquote summary directly below it that states what you do and who you serve in one or two plain sentences.
Create H2 sections for your most important content groups, such as Services, Guides, Docs, or About.
Under each H2, list links as Markdown with a short description after each one, using absolute URLs that a crawler can follow directly.
Add an H2 named Optional for secondary links a model can safely skip when it needs to conserve tokens.
Save the file as llms.txt in plain text and upload it to the root of your domain so it resolves at /llms.txt.
Optionally generate the file with a tool or a docs platform like Mintlify, then review the output so the curation reflects your real priorities.
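The manual steps above can also be scripted. The following is a minimal sketch, assuming you keep your curated links in a small Python structure; the `render_llms_txt` helper, the section names, and the example.com URLs are placeholders rather than the output of any particular generator.

```python
from pathlib import Path


def render_llms_txt(title, summary, sections, optional=None):
    """Render an llms.txt document.

    sections maps an H2 heading to a list of (name, absolute_url, description).
    optional is a list of the same tuples for the "Optional" group.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    groups = list(sections.items())
    if optional:
        groups.append(("Optional", optional))   # secondary links go last
    for heading, links in groups:
        lines += [f"## {heading}", ""]
        for name, url, description in links:
            lines.append(f"- [{name}]({url}): {description}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical content; swap in your own pages and descriptions.
text = render_llms_txt(
    title="Example Co",
    summary="Example Co sells project-management software for small agencies.",
    sections={
        "Guides": [("Getting started", "https://example.com/guides/start", "Setup walkthrough")],
        "About": [("Company", "https://example.com/about", "Who we are and what we do")],
    },
    optional=[("Press kit", "https://example.com/press", "Logos and boilerplate")],
)

# Uncomment to write the file locally before uploading it to the site root.
# Path("llms.txt").write_text(text, encoding="utf-8")
```

Keeping the links in a structure like this makes the curation explicit, so a review before each upload is a matter of reading a short list.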

What are the most common llms.txt mistakes?

Most problems come from treating llms.txt like a sitemap dump or an afterthought rather than a curated guide.

Listing every URL on the site, which buries your best pages and defeats the point of curation.
Placing the file in a subfolder instead of the root, so tools that expect /llms.txt never find it.
Using relative links that a model cannot resolve, rather than full absolute URLs.
Writing vague or missing descriptions, which strips away the context that helps a model decide what to read.
Letting it go stale, so the links point to retired pages and the summary no longer matches the business.
Stuffing it with keywords, which adds noise and signals manipulation rather than clarity.
Assuming it replaces robots.txt, schema, or quality content, when it only complements them.
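Several of these mistakes can be caught mechanically before you publish. The sketch below is a hypothetical linter, not an existing tool: the `lint_llms_txt` name and the 50-link threshold are arbitrary assumptions, and it only checks what is visible in the file itself (missing H1, relative links, missing descriptions, link count). It cannot tell whether the file sits at the root or whether the linked pages still exist.

```python
import re
from urllib.parse import urlsplit

# Captures the link text, the URL, and whatever follows the link on the line.
LINK = re.compile(r"\[([^\]]*)\]\(([^)]*)\)(.*)")
MAX_LINKS = 50  # arbitrary threshold for "this looks like a sitemap dump"


def lint_llms_txt(text, max_links=MAX_LINKS):
    """Return a list of human-readable problems found in an llms.txt document."""
    problems = []
    lines = text.splitlines()
    if not any(line.startswith("# ") for line in lines):
        problems.append("missing H1 title")
    link_count = 0
    for number, line in enumerate(lines, start=1):
        match = LINK.search(line)
        if not match:
            continue
        link_count += 1
        url, rest = match.group(2), match.group(3)
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"line {number}: relative or non-HTTP link {url!r}")
        if not rest.lstrip(": ").strip():
            problems.append(f"line {number}: link has no description")
    if link_count > max_links:
        problems.append(f"{link_count} links listed; consider curating to {max_links} or fewer")
    return problems


# Hypothetical draft with two of the mistakes listed above.
draft = "## Guides\n- [Pricing](/pricing)\n- [Docs](https://example.com/docs): API reference\n"
for problem in lint_llms_txt(draft):
    print(problem)
```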

Questions, answered

Is llms.txt required for my website?

No. It is optional and not yet an official standard, so your site works fine without it. It is a low-cost, forward-looking addition that makes your content easier for AI crawlers to parse, so many teams add it as a small bet on where generative engine optimization is heading.

Will llms.txt improve my Google rankings?

Not directly. llms.txt is aimed at how language models read and summarize your site, not at traditional search ranking, and Google has said it does not currently use the file. Treat it as part of an answer engine optimization (AEO) and generative engine optimization (GEO) strategy rather than a classic SEO ranking factor.

What is the difference between llms.txt and llms-full.txt?

llms.txt is a short, curated index of links and descriptions pointing to your key pages. llms-full.txt inlines the actual page content into one long Markdown file so a model can read your entire corpus in a single fetch. Start with llms.txt and add the full version only when you want everything available at once.

How often should I update llms.txt?

Review it whenever you publish major new content, retire pages, or change what your business offers. A stale file that links to dead pages or misdescribes the site does more harm than good, so a quick check each quarter, plus updates after big content launches, keeps it accurate.

Do AI crawlers actually read llms.txt today?

Some tools and crawlers look for it, and a growing list of companies publish one, but the major AI providers have not publicly confirmed they rely on it. Adoption is real but early, so the practical stance is to publish a clean file now and benefit if and when more engines adopt the convention.
