Token optimizer

MDPilot runs a 5-pass optimizer on every generated file, typically cutting token counts by 20–40% without losing meaning or information density.

Why it matters

AI agents read your generated files at the start of every conversation. A CLAUDE.md that is 300 tokens shorter means roughly 300 fewer tokens consumed per session — across hundreds of sessions, that is significant API cost and context budget saved.

More importantly, models perform better with denser, more precise context. Boilerplate and filler phrases dilute the signal — stripping them improves comprehension, not just efficiency.

The 5 passes

1

Boilerplate strip

Removes generic filler phrases that appear in AI-generated text but carry no information: phrases like "This project is a…", "In order to…", "Please note that…", "As mentioned above", and hedge words like "very", "basically", "essentially". These phrases are common in first-draft AI output and consume tokens without helping the reader.

Before

In order to install the dependencies, you need to first run `npm install`.

After

Run `npm install` to install dependencies.

2

Cross-file deduplication

When you generate multiple files at once (e.g. AGENTS.md + CLAUDE.md + README), sections that appear in more than one file are replaced in lower-priority files with a short cross-reference. Similarity is measured with bigram matching — two sections with >60% overlap are considered duplicates. README is canonical; AGENTS, CLAUDE, CONTRIBUTING, SECURITY follow in priority order.

Before

CLAUDE.md has a full "Local setup" section identical to README.md.

After

> See [README.md § Local setup](./README.md) for details on local setup.

3

Structure compression

Cleans code block formatting: removes inline comments from fenced code blocks, strips blank lines immediately after opening fences and before closing fences. Also collapses 3+ consecutive blank lines to 2 and removes trailing whitespace per line. This pass does not change prose — only formatting.

4

Verbose compression

Replaces wordy multi-word phrases with their concise equivalents. Examples: "in order to" → "to", "due to the fact that" → "because", "prior to" → "before", "has the ability to" → "can", "it is important to note that" → (removed). Also collapses redundant pairs like "each and every" → "every", "first and foremost" → "first".

Before

Prior to running the tests, you need to ensure that the database has the ability to accept connections.

After

Before running the tests, ensure the database can accept connections.

5

Line compression

Removes heading descriptions that just restate the heading (e.g. a paragraph immediately after "## Installation" that says "This section describes how to install the package"). Converts prose admonition prefixes ("Note: ", "Warning: ") to blockquote markers. Collapses multiple blank lines and removes trailing whitespace.

When it runs

The optimizer runs automatically on every file MDPilot generates — Task mode, Generate mode, and all Labs tools. The output view shows the before/after token count and a per-pass breakdown of tokens saved. You can switch between the original and optimized versions in the editor.

Via the MCP server, the optimize_markdown tool runs the same 5-pass pipeline on any existing markdown file or string.