mkdocs-llmstxt

MkDocs plugin to generate an /llms.txt file.

/llms.txt - A proposal to standardise on using an /llms.txt file to provide information to help LLMs use a website at inference time.

See our own dynamically generated /llms.txt as a demonstration.

Installation

pip install mkdocs-llmstxt

Usage

Enable the plugin in mkdocs.yml:

plugins:
- llmstxt:
    files:
    - output: llms.txt
      inputs:
      - file1.md
      - folder/file2.md

You can generate several files, each from its own set of input files.

File globbing is supported:

plugins:
- llmstxt:
    files:
    - output: llms.txt
      inputs:
      - file1.md
      - reference/*/*.md
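
To make the effect of a pattern such as reference/*/*.md concrete, here is a minimal sketch of how glob entries could be expanded against a list of page paths. It assumes fnmatch-style matching from the Python standard library, which is an assumption and not something this README states; the expand_inputs helper and the page paths are hypothetical.

```python
from fnmatch import fnmatchcase


def expand_inputs(inputs: list[str], pages: list[str]) -> list[str]:
    """Expand glob patterns in `inputs` against known page paths, keeping order."""
    expanded: list[str] = []
    for entry in inputs:
        if "*" in entry:
            # Assumption: fnmatch-style matching, where `*` matches any run of
            # characters, including "/".
            expanded.extend(page for page in pages if fnmatchcase(page, entry))
        else:
            # Entries without a wildcard are kept literally.
            expanded.append(entry)
    return expanded


# Hypothetical page paths, for illustration only.
pages = [
    "file1.md",
    "guide/intro.md",
    "reference/index.md",
    "reference/api/index.md",
    "reference/cli/usage.md",
]
selected = expand_inputs(["file1.md", "reference/*/*.md"], pages)
print(selected)
```

Under this assumption, reference/index.md is not selected because the pattern requires one more path segment, while a deeper path such as reference/a/b/c.md would be selected, since an fnmatch-style `*` also matches "/".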

The plugin concatenates the rendered HTML of these input pages, cleans it up (with BeautifulSoup), converts it back to Markdown (with Markdownify), and formats it (with Mdformat). Because it concatenates HTML rather than the Markdown sources, dynamically generated content (API documentation, executed code blocks, snippets from other files, Jinja macros, etc.) is included in the generated text files. Credits to Petyo Ivanov for the original idea ✨

You can disable auto-cleaning of the HTML:

plugins:
- llmstxt:
    autoclean: false

You can also pre-process the HTML before it is converted back to Markdown:

plugins:
- llmstxt:
    preprocess: path/to/script.py

The specified script.py must expose a preprocess function that accepts the soup and output arguments:

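# Postpone annotation evaluation so that the BeautifulSoup annotation below
# does not raise NameError at runtime (the import only runs for type checkers).
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from bs4 import BeautifulSoup

def preprocess(soup: BeautifulSoup, output: str) -> None:
    ...  # modify the soup in place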
The output argument lets you modify the soup depending on which file is being generated.
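
As an illustration of branching on output, here is a minimal sketch of a pre-processing script. It assumes that output holds the same value as the output key in mkdocs.yml (for example llms.txt), which this README does not state explicitly; the choice of which elements to remove is hypothetical.

```python
from __future__ import annotations

from bs4 import BeautifulSoup


def preprocess(soup: BeautifulSoup, output: str) -> None:
    """Strip elements that add little value in a text-only file."""
    # Images carry no information once the page is converted to plain text.
    for img in soup.find_all("img"):
        img.decompose()

    # Assumption: `output` equals the `output` value set in mkdocs.yml.
    if output == "llms.txt":
        # Hypothetical choice: drop collapsible blocks from this file only.
        for details in soup.find_all("details"):
            details.decompose()
```

The function modifies the soup in place and returns nothing, matching the signature shown above.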

Have a look at our own pre-processing function to get inspiration.