Open source · LLM tooling

Stop paying to feed fluff to an LLM.

mdcompress strips the parts a model doesn't need (badges, boilerplate, hedging prose, dead tables of contents, repeated context) from your READMEs and agent-context files and writes the result to a hidden, token-optimized mirror. It ships 35 deterministic rules, an optional LLM rewriter, and a faithfulness audit.

Try it in your browser → · View on GitHub

zsh — mdcompress

# one-time setup in any repo
$ mdcompress init
✓ wrote .mdcompress/config.yaml (tier: aggressive)

# compress every tracked .md into a hidden mirror
$ mdcompress run --all
✓ 42 files · 18,204 → 15,610 tokens
  saved 2,594 tokens (−14.2%)

$ mdcompress status
cumulative: 128,400 tokens saved

✓ Open source · MIT
✓ Runs in your browser or your shell
✓ Nothing uploaded
✓ Deterministic by default

The problem

Markdown is the silent money pit of LLM workflows.

Every time you hand a model a README or a docs tree, you pay for the badges, the table of contents, the “it is worth noting that” padding — tokens that carry no meaning the model can use.

Agent-context files like CLAUDE.md and AGENTS.md are worse: they're re-sent on every call, so the same bloat is billed over and over, all session long.

mdcompress produces a meaning-preserving mirror that's cheaper to read — without you hand-editing a single doc.
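To make the re-send cost concrete, here is a back-of-the-envelope sketch. The file size, call count, and price per million tokens are hypothetical placeholders, not measurements. Only the 14.2% reduction is taken from the sample run at the top of this page.

```python
def resend_cost(tokens_per_call: int, calls: int, usd_per_million_tokens: float) -> float:
    """Cost of sending the same context file on every call of a session."""
    return tokens_per_call * calls * usd_per_million_tokens / 1_000_000


# Hypothetical inputs: adjust to your own files, usage, and provider pricing.
context_tokens = 4_000   # size of an agent-context file (assumed)
calls = 500              # calls in which the file is re-sent (assumed)
price = 3.00             # USD per million input tokens (assumed)
reduction = 0.142        # the -14.2% from the sample run

before = resend_cost(context_tokens, calls, price)
after = resend_cost(round(context_tokens * (1 - reduction)), calls, price)

print(f"before: ${before:.2f}  after: ${after:.2f}  saved: ${before - after:.2f}")
```

The saving scales linearly with the number of calls, which is why files that are re-sent on every call are the first ones worth compressing.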

How it works

Three tiers, escalating from safe to bold.

Pick how aggressive to be. Tiers stack: each includes everything below it. The default, Tier 2, is fully deterministic; a model is involved only if you opt into the Tier 3 LLM rewriter.

Tier 1 · safe

Safe

Deterministic, lossless-to-meaning rules: strip frontmatter, badges, HTML comments, dead tables of contents, tracking params, shell prompts in code fences. Nothing that changes what the document says.

Tier 2 · aggressive · default

Aggressive

Adds prose-simplification and cross-file work: strip hedging phrases and admonition prefixes, drop benchmark narration, factor repeated paragraphs and code blocks across the repo into back-references. The default.
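As an illustration of the cross-file idea, here is a minimal sketch that replaces a paragraph already seen earlier in the scan with a back-reference. It is not mdcompress's implementation: the back-reference format, the length threshold, and the whitespace normalization are all assumptions made for this example.

```python
import hashlib

MIN_CHARS = 80  # hypothetical threshold: short paragraphs are never factored


def factor_repeated_paragraphs(files: dict[str, str]) -> dict[str, str]:
    """Replace paragraphs already seen earlier in the scan with a back-reference."""
    first_seen: dict[str, str] = {}  # paragraph digest -> "file#pN" of first occurrence
    result: dict[str, str] = {}
    for name in sorted(files):  # sorted so the output does not depend on dict order
        out = []
        for index, para in enumerate(files[name].split("\n\n")):
            normalized = " ".join(para.split())  # ignore whitespace differences
            digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
            if len(normalized) >= MIN_CHARS and digest in first_seen:
                # Hypothetical back-reference syntax, for illustration only.
                out.append(f"[same as {first_seen[digest]}]")
            else:
                first_seen.setdefault(digest, f"{name}#p{index}")
                out.append(para)
        result[name] = "\n\n".join(out)
    return result
```

Hashing a normalized form keeps the comparison cheap and deterministic. A real implementation would also need to decide how a reader, or a model, resolves the back-reference.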

Tier 3 · llm

LLM

Section-level rewriting with a language model, each section gated by a faithfulness audit (a separate model answers questions about the original and the rewrite; rewrites that drift are rejected). CLI-only.
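The gate can be pictured roughly as follows. This is a sketch under stated assumptions, not the tool's code: the scoring method (the fraction of questions answered identically for the original and the rewrite) is an assumption, and `answer` is a hypothetical stand-in for whatever model backend does the answering. The 0.95 default matches the threshold quoted elsewhere on this page.

```python
from typing import Callable, Sequence

# (document, question) -> answer; a placeholder for a call to an audit model.
Answerer = Callable[[str, str], str]


def faithfulness_score(original: str, rewrite: str,
                       questions: Sequence[str], answer: Answerer) -> float:
    """Fraction of questions whose answers agree between original and rewrite."""
    if not questions:
        raise ValueError("at least one audit question is required")
    agreeing = sum(
        answer(original, q).strip().lower() == answer(rewrite, q).strip().lower()
        for q in questions
    )
    return agreeing / len(questions)


def gate_rewrite(original: str, rewrite: str, questions: Sequence[str],
                 answer: Answerer, threshold: float = 0.95) -> str:
    """Keep the rewrite only if it passes the audit; otherwise keep the original."""
    score = faithfulness_score(original, rewrite, questions, answer)
    return rewrite if score >= threshold else original
```

Falling back to the original section on failure means a rejected rewrite costs tokens but never changes meaning.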

The engine

35 deterministic rules, fixed order.

Most of the work is plain, predictable text transformation — no model required, no surprises. Rules run in a fixed sequence; four of the boldest ship opt-in even when their tier is active.
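A fixed-order pipeline with stacked tiers and opt-in rules could be modeled like this. The `Rule` type, the `enabled` and `disabled` parameters, and the `hypothetical-opt-in-rule` entry are illustrative assumptions. Only the two real rule names and the idea of a disabled list come from this page, and the rule bodies are toy stand-ins.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    tier: int                      # 1 = safe, 2 = aggressive
    apply: Callable[[str], str]
    opt_in: bool = False           # runs only when explicitly enabled


# List order is the execution order; it never changes between runs.
RULES = [
    Rule("strip-html-comments", 1,
         lambda t: re.sub(r"<!--.*?-->", "", t, flags=re.DOTALL)),
    Rule("strip-hedging-phrases", 2,
         lambda t: t.replace("It is worth noting that ", "")),
    Rule("hypothetical-opt-in-rule", 2,
         lambda t: t.replace("very ", ""), opt_in=True),
]


def run_rules(text: str, tier: int,
              enabled: frozenset = frozenset(),
              disabled: frozenset = frozenset()) -> str:
    """Apply every active rule at or below `tier`, in list order."""
    for rule in RULES:
        if rule.tier > tier or rule.name in disabled:
            continue
        if rule.opt_in and rule.name not in enabled:
            continue
        text = rule.apply(text)
    return text
```

Because each rule is a pure function of the text and the order is fixed, the same input and configuration always produce the same output.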

Safe (Tier 1) — a sample

strip-frontmatter
Remove YAML/TOML frontmatter blocks
strip-badges
Remove shields.io-style badge images and links
strip-toc
Remove auto-generated tables of contents
strip-html-comments
Remove <!-- … --> blocks
compress-code-blocks
Strip shell prompts + config comments from fences
strip-trailing-cta
Remove star/follow/sponsor sections at doc end
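To give a feel for what rules of this kind do, here is a minimal regex sketch of frontmatter and badge stripping. The patterns are assumptions written for this example, in particular the heuristic that a badge is an image whose URL contains `img.shields.io` or `badge`. They are not the patterns mdcompress ships.

```python
import re

# YAML (---) or TOML (+++) frontmatter at the very start of the document.
FRONTMATTER = re.compile(r"\A(---|\+\+\+)\r?\n.*?\r?\n\1\r?\n", re.DOTALL)

# Heuristic (assumed): an image whose URL mentions img.shields.io or "badge".
BADGE_IMG = r"!\[[^\]]*\]\([^)]*(?:img\.shields\.io|badge)[^)]*\)"
# Match a linked badge first, then a bare badge image.
BADGE = re.compile(rf"\[{BADGE_IMG}\]\([^)]*\)|{BADGE_IMG}")


def strip_frontmatter(text: str) -> str:
    """Drop a leading frontmatter block, if there is one."""
    return FRONTMATTER.sub("", text, count=1)


def strip_badges(text: str) -> str:
    """Remove badge images and links; drop lines that held nothing else."""
    kept = []
    for line in text.splitlines(keepends=True):
        stripped = BADGE.sub("", line)
        if line.strip() and not stripped.strip():
            continue  # the line consisted only of badges
        kept.append(stripped)
    return "".join(kept)
```

Ordinary images, such as an architecture diagram, do not match the badge heuristic and are left alone.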

Aggressive (Tier 2) — a sample

strip-hedging-phrases
Cut “it is worth noting that”, “in order to”, …
factor-cross-file-paragraphs
Replace repeated prose across files with a back-ref
dedup-cross-file-code-blocks
Collapse code blocks duplicated across files
strip-admonition-prefixes
Drop **Note:** / **Warning:** / **Tip:** prefixes
strip-benchmark-prose
Remove prose that only narrates an adjacent table
factor-phrase-dictionary
Factor repeated phrases into a short glossary preamble
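Two of the prose rules above can be approximated in a few lines. This sketch assumes that "in order to" collapses to "to" and that sentence case should be preserved after a cut; both are guesses about the intended behavior. Unlike a production rule, it does not skip fenced code blocks.

```python
import re

ADMONITION = re.compile(r"^\*\*(?:Note|Warning|Tip):\*\*\s*", re.MULTILINE)
WORTH_NOTING = re.compile(r"\b(i)t is worth noting that\s+(\w)", re.IGNORECASE)
IN_ORDER_TO = re.compile(r"\b(i)n order to\b", re.IGNORECASE)


def _promote_next_letter(match: re.Match) -> str:
    # If the removed phrase opened a sentence, capitalize what follows it.
    following = match.group(2)
    return following.upper() if match.group(1).isupper() else following


def _to(match: re.Match) -> str:
    return "To" if match.group(1).isupper() else "to"


def strip_admonition_prefixes(text: str) -> str:
    """Drop a leading **Note:**, **Warning:**, or **Tip:** from each line."""
    return ADMONITION.sub("", text)


def strip_hedging_phrases(text: str) -> str:
    """Cut two filler phrases while keeping sentence case intact."""
    text = WORTH_NOTING.sub(_promote_next_letter, text)
    return IN_ORDER_TO.sub(_to, text)


sample = "**Note:** It is worth noting that the cache is cold. Restart in order to warm it."
print(strip_hedging_phrases(strip_admonition_prefixes(sample)))
```

Each substitution is a plain pattern replacement, so the result for a given input never varies between runs.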

The full rule list lives in the README.

Results

Measured, not promised.

Numbers from a 20-repo benchmark corpus on Tier 2. mdcompress is strongest on marketing-heavy READMEs, repeated agent context, and generated command output — and honest about where it isn't.

~14%

Full Markdown tree at Tier 2, across the 20-repo corpus

~13%

Top-level READMEs

30–53%

Docs-heavy repositories

~11%

Code-dense repositories (honest floor)

It helps less on dense technical reference where most tokens are code, API names, or tables — expect single-digit savings there, and check per-rule diffs before enabling aggressive rules broadly. Full methodology in BENCHMARKS.md, or run the live benchmarks.

Try it

The engine runs right here, in your browser.

mdcompress's single-document rules are compiled to WebAssembly and shipped into this site's Lab — paste a README, pick a tier, and watch the tokens drop, with a per-rule breakdown. Your text never leaves the tab.

No install needed to see it work.

The browser build runs the single-document rules. Cross-file deduplication and the Tier-3 LLM rewriter need the CLI — that's what the quickstart below is for.

Open the live tool →

Get started

Quickstart.

Install the CLI, set it up once per repo, and compress. Nothing leaves your machine; the originals are never mutated — output goes to a hidden mirror.

Install · Shell

# one-line installer
curl -fsSL https://raw.githubusercontent.com/dhruv1794/mdcompress/main/install.sh | sh

# or with the Go toolchain
go install github.com/dhruv1794/mdcompress/cmd/mdcompress@latest

Compress a repo · Run

# set up config in any repo
mdcompress init

# compress every tracked .md
mdcompress run --all

# see cumulative savings
mdcompress status

Tune it · .mdcompress/config.yaml

version: 1
tier: aggressive
rules:
  disabled:
    - dedup-cross-section
    - collapse-example-output
eval:
  backend: ollama
  model: llama3.1:8b
  threshold: 0.95

More than a CLI

An MCP server, an LLM rewriter, a plugin API, and a local web UI.

mdcompress is v3.2: the deterministic engine is the core, but it reaches further when you want it to.

MCP server

Expose compression to an AI agent over the Model Context Protocol — the agent compresses context on demand instead of re-sending bloated docs.

LLM rewriter + audit

Tier-3 section rewriting guarded by a faithfulness check (default threshold 0.95) so aggressive prose edits can’t silently change meaning.

Plugin API

Local web UI

`mdcompress web` serves an interactive test page with per-rule diffs, token/byte stats, and a cost estimate — all on localhost.

Shrink your docs before the next prompt.

MIT licensed. Runs in your browser or your shell — nothing uploaded.

Try it live → · GitHub