Caveman Code · coming soon · 37,500 ★ across the stack

The token-efficient stack
for agent-native development.

[thesis] why many token when few do trick.

Caveman is a four-part ecosystem for builders who treat tokens as a resource worth designing around — a compression primitive, a spec-driven workflow, a persistent memory layer, and a coding agent CLI that stacks all three.

§ 01 · Ecosystem

Four projects. One thesis.

Caveman starts as a small compression primitive, becomes a workflow with Cavekit, grows a memory with Cavemem, and lands as a full coding agent CLI with Caveman Code. Each layer is useful on its own. Stacked, they compound.

01 primitive

Caveman

A compression skill that shrinks token footprint by roughly 75% without losing task fidelity. Portable, composable, boring in the best way.

  • reduction ~75%
  • stars 37,000
  • license MIT
Read the primitive →
02 workflow

Cavekit

Spec-driven development, pushed further. Turns a written spec into a structured plan, then into verifiable execution. Opinionated, not clever.

  • built on Caveman
  • stars 500
  • surface CLI + lib
See the workflow →
03 memory

Cavemem

A persistent, cross-agent memory layer. Local SQLite + FTS5 + vector search, Caveman-compressed, exposed via MCP. Your agents stop forgetting.

  • store local SQLite
  • protocol MCP
  • version v0.1.3
Explore the memory →
04 flagship

Caveman Code

A next-gen coding agent CLI built around four independent compression layers. Fewer tokens at every hop — prompt, commands, outputs, context.

  • layers 4
  • status preview
  • runtime local
Open the flagship →
§ 02 · Caveman

The primitive. A skill that eats tokens for breakfast.

Caveman is a small, focused compression skill. Give it a prompt, a system message, a long-form document, a CLAUDE.md — it returns something semantically equivalent and dramatically shorter. No retraining. No proprietary runtime. Plug it wherever tokens get spent.

  • ~75% fewer tokens on typical agent workloads.
  • Model-agnostic. Works upstream of whatever you're calling.
  • Deterministic compression, with a dictionary you control.
  • Composable. Pipe it. Chain it. Ignore it when you don't need it.
caveman · compress.ts
// before — 4,820 tokens
const prompt = `You are an expert software engineer.
When the user asks a question, think step
by step, consider edge cases, and then ...`;

// after — 1,204 tokens (-75%)
const compressed = caveman.compress(prompt, {
  dict: 'eng/v1',
  preserve: ['examples', 'schema']
});

// fidelity kept, bytes did not survive
before 4,820 tok · after 1,204 tok
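The "deterministic compression, with a dictionary you control" claim can be pictured with a minimal sketch. The dictionary entries and the `§`-prefixed codes below are illustrative assumptions, not Caveman's actual dictionary format:

```typescript
// A minimal deterministic dictionary compressor — a sketch of the idea,
// not the real Caveman implementation. Hypothetical dictionary:
// verbose boilerplate phrases map to short codes.
type Dict = Record<string, string>;

const dict: Dict = {
  "You are an expert software engineer": "§eng",
  "think step by step": "§cot",
  "consider edge cases": "§edge",
};

function compress(text: string, dict: Dict): string {
  // Replace longer phrases first so shorter ones can't shadow them.
  const entries = Object.entries(dict).sort((a, b) => b[0].length - a[0].length);
  return entries.reduce((t, [phrase, code]) => t.split(phrase).join(code), text);
}

function decompress(text: string, dict: Dict): string {
  // Expand longer codes first for the same reason.
  const entries = Object.entries(dict).sort((a, b) => b[1].length - a[1].length);
  return entries.reduce((t, [phrase, code]) => t.split(code).join(phrase), text);
}

const prompt =
  "You are an expert software engineer. Always think step by step and consider edge cases.";
const packed = compress(prompt, dict);
// Deterministic: same input + same dict → same output, and it round-trips.
```

Same input plus same dictionary always yields the same output, which is what makes the compression auditable rather than lossy-by-vibes.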
§ 03 · Cavekit

Spec-driven development, taken seriously.

Cavekit is the workflow layer. Write a spec in prose, let Cavekit turn it into a structured plan, then drive execution against it. It uses Caveman internally so the plan and the context both stay lean.

  • Specs, not vibes. Every change is anchored to a written goal.
  • Plan before code. Structured tasks with acceptance criteria.
  • Verifiable. Each task has a check. Nothing ships on a hunch.
  • Iterative. Specs evolve with the codebase, not against it.
SPEC-024 payments · refund flow ready
goal Users can refund a completed order within 30 days of purchase.
scope API + admin UI. Excludes partial refunds for v1.
plan
  1. migration: add refunds table
  2. endpoint: POST /orders/:id/refund
  3. admin: refund button + confirm modal
  4. webhook: refund.processed
  5. tests: edge cases + denial paths
verify 3/5 tasks · 60% · last build green
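The "every task has a check" idea can be sketched as a data structure: each task carries a machine-checkable acceptance criterion, and progress only counts tasks whose check passes. The types and field names here are assumptions for illustration, not Cavekit's real schema:

```typescript
// Sketch of a spec-driven plan where every task carries its own check.
// Hypothetical shape — not Cavekit's actual schema.
type Task = {
  id: number;
  title: string;
  done: boolean;
  check: () => boolean; // acceptance criterion: nothing ships on a hunch
};

type Spec = { goal: string; tasks: Task[] };

// Progress only counts tasks whose check actually passes.
function verify(spec: Spec): { passed: number; total: number; pct: number } {
  const passed = spec.tasks.filter((t) => t.done && t.check()).length;
  const total = spec.tasks.length;
  return { passed, total, pct: Math.round((passed / total) * 100) };
}

const spec: Spec = {
  goal: "Users can refund a completed order within 30 days of purchase.",
  tasks: [
    { id: 1, title: "migration: add refunds table", done: true, check: () => true },
    { id: 2, title: "endpoint: POST /orders/:id/refund", done: true, check: () => true },
    { id: 3, title: "admin: refund button + confirm modal", done: true, check: () => true },
    { id: 4, title: "webhook: refund.processed", done: false, check: () => false },
    { id: 5, title: "tests: edge cases + denial paths", done: false, check: () => false },
  ],
};
```

Marking a task `done` without a passing check contributes nothing to the percentage, which is the whole point of "verifiable" execution.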
§ 04 · Cavemem

Persistent memory. Cross-agent. Local-first.

Cavemem gives coding agents a memory that survives the session. It captures observations, stores them compressed in a local SQLite index, and serves them back through MCP — so Claude Code, Cursor, Codex, or Gemini can all recall what the last agent learned.

[thesis] why agent forget when agent can remember.

  • Persistent across sessions. Stop re-explaining the codebase every turn.
  • Cross-IDE. One memory, many agents. MCP tools: search, timeline, get_observations.
  • Compressed. ~75% smaller via Caveman before it ever hits disk.
  • Local & private. SQLite + FTS5 + vector. Nothing leaves the box. <private> tags get stripped.
MEM · RECALL · cavemem · search("refund flow") · indexed
query cavemem.search({ q: "refund flow", k: 4 })
hits
  1. SPEC-024 refund flow · 2d ago · 0.94
  2. orders.test.ts edge cases · 3d ago · 0.81
  3. CHANGELOG v0.8 refunds · 11d ago · 0.62
  4. chat stripe webhook · 3w ago · 0.55
store ~/.cavemem/db.sqlite · 4,812 obs · 1.2MB
mcp search · timeline · get_observations · viewer at :37777
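The recall above can be approximated with a toy in-memory sketch: observations scored by keyword overlap with a light recency decay, top-k returned. Cavemem itself uses SQLite with FTS5 plus vector search; this scorer, the field names, and the sample data are all assumptions for illustration:

```typescript
// Toy observation store with naive keyword-overlap ranking —
// a stand-in for Cavemem's SQLite + FTS5 + vector index.
type Obs = { id: string; text: string; ageDays: number };

const store: Obs[] = [
  { id: "SPEC-024", text: "refund flow spec for orders api", ageDays: 2 },
  { id: "orders.test", text: "edge cases for order refund tests", ageDays: 3 },
  { id: "changelog", text: "v0.8 shipped refunds", ageDays: 11 },
  { id: "chat", text: "stripe webhook discussion", ageDays: 21 },
];

function search(q: string, k: number): { id: string; score: number }[] {
  const terms = q.toLowerCase().split(/\s+/);
  return store
    .map((o) => {
      const words = new Set(o.text.toLowerCase().split(/\s+/));
      // Fraction of query terms present, lightly decayed by age.
      const overlap = terms.filter((t) => words.has(t)).length / terms.length;
      return { id: o.id, score: overlap / (1 + o.ageDays / 30) };
    })
    .filter((h) => h.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const hits = search("refund flow", 4);
```

A real FTS5 index handles stemming and phrase queries that this keyword sketch does not; the shape of the result — ranked hits with scores, like the recall panel above — is the part that carries over.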
§ 05 · Caveman Code · flagship

A coding agent CLI
with four compression layers.

shipping soon Not yet installable. The design is locked, the layers are wired, and the terminal session below is a preview of what lands when it ships.

Most coding agents pay the token tax everywhere: bloated prompts, verbose tool calls, chatty outputs, sprawling context files. Caveman Code squeezes each of those independently, then stacks them. The result is a CLI that feels fast, cheap, and deliberate.
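Because each layer squeezes a different token stream (prompt, tool calls, outputs, context), the savings add up across streams rather than fighting over one. A sketch of that accounting, with made-up stream sizes and keep-ratios rather than measured Caveman Code numbers:

```typescript
// Four layers, each compressing a different token stream independently.
// Stream sizes and keep-ratios below are illustrative assumptions.
type Stream = { name: string; tokens: number; keep: number };

const streams: Stream[] = [
  { name: "prompt (L01)", tokens: 1_200, keep: 0.25 },
  { name: "tool calls (L02)", tokens: 2_400, keep: 0.3 },
  { name: "outputs (L03)", tokens: 6_000, keep: 0.25 },
  { name: "context (L04)", tokens: 18_400, keep: 0.21 },
];

// Total spend before and after: each stream pays only its kept fraction.
const before = streams.reduce((sum, s) => sum + s.tokens, 0);
const after = streams.reduce((sum, s) => sum + s.tokens * s.keep, 0);
const savedPct = Math.round((1 - after / before) * 100);
```

Note the largest stream (long-lived context) dominates the total, which is why L04's one-time compile-and-cache pass matters more than any single-prompt trim.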

L01

Prompt input compression

User prompts are normalized and shrunk before they ever hit the model. Same intent, less surface area.

reduces user → model prompt
input
compressed
L02

RTK command compression

Tool and command calls are routed through a Reduced Token Kernel — a compact grammar for frequently used actions.

reduces tool call payloads
raw call
rtk
L03

Output compression via Caveman

Model outputs — plans, diffs, explanations — pass through the Caveman primitive before rendering or being fed back in.

reduces assistant outputs
output
compressed
L04

Context & CLAUDE.md compression

Long-form agent context — instructions, repo maps, style guides — is compiled once and cached as a dense Caveman artifact.

reduces long-lived context
CLAUDE.md
cached
~/projects/acme · caveman-code preview · coming soon
$ caveman-code "add a refund endpoint per SPEC-024"

▸ load context   CLAUDE.md → 18.4k → 3.9k (L04)
▸ compile prompt user → L01 → 0.3k
▸ plan           5 tasks · 62% covered by spec
▸ edit           apps/api/orders/refund.ts
▸ run            pnpm test --filter orders   ✓ 41 passing

 done in 00:47 · 4,812 tok (baseline: 21,340)
   saved 77% · fewer words, same work.
§ 06 · Why this matters

Tokens are a resource. Most stacks spend them like tap water.

The problem

  • 01 Prompts bloat. System messages grow by accretion, never by design.
  • 02 Context files sprawl. CLAUDE.md, style guides, repo maps — loaded every turn.
  • 03 Outputs are verbose. Every hop repeats the preamble.
  • 04 Tool calls are chatty. JSON for a one-word command.
  • 05 Cost compounds. Latency and bills scale with the waste, not the work.

The thesis

If tokens were free, bloat would be free. They aren't. Treat each token as a unit of intent and most systems reveal themselves as 75% noise.

Caveman is a bet that the right answer isn't a bigger context window — it's a smaller, sharper one. Compression isn't just a cost story. It's a control story: you gain precision, you gain speed, and a model that thinks inside a tighter frame tends to think better.

[aside] we said it before. why many token when few do trick.

§ 07 · Architecture

How the ecosystem composes.

Each project is independently useful. Caveman Code is what you get when you wire them together on purpose.

layers: app → workflow → primitive → model

  • caveman (standalone): caveman · compress() → any LLM
  • cavekit (spec workflow): spec → plan → verify → caveman → any LLM
  • cavemem (memory): SQLite · FTS5 · vector · MCP → caveman (store-time) → any MCP client
  • caveman-code (flagship): agent CLI · 4× compression → cavekit + cavemem embedded → caveman ×2 (L03, L04) → model of your choice
[note] You can adopt one layer at a time. Start with Caveman, graduate to Cavekit, bolt on Cavemem when your agents start forgetting, and move into Caveman Code when you're tired of paying the token tax.
§ 08 · Built in the open

Proof, not press.

37,000 ★ · caveman · the original primitive
500 ★ · cavekit · early but growing
v0.1.3 · cavemem · memory layer · april 2026
soon · caveman-code · shipping april 2026
MIT license across the stack

Fewer words.
Same work.

Caveman, Cavekit, and Cavemem are public today — read the source, send a patch. Caveman Code, the flagship CLI, ships soon. Watch the repo to be first on install day.

coming soon ★ watch on GitHub
classified · not yet released

Commands are sealed until ship day. Caveman Code is still in private development — no packages are live yet. Watch the repo to be first when it unlocks.