Spend tokens well: cheap for mechanics, expensive for judgment

The future is unbounded inference. Today you work to a budget — so spend it on judgment, not on mechanical reformatting. Everything else on this trunk is really one idea wearing different hats: tee up the expensive model with as little in its context as you can get away with.

The whole idea in one line: the costly model's attention is the scarce resource. Hand every mechanical job — reading documents in, formatting documents out, cleaning data, boilerplate — to a cheaper model or a deterministic tool, and save the expensive model for the reasoning and the verification only it should do.

The one rule

Sort every task into two piles: mechanical (a faithful transform with a right answer — transcribe, reformat, parse, rename) and judgment (a call that needs reasoning — is this right, which option wins, what does the physics say). Mechanical work is cheap work; it does not need your best model. Judgment is what you're actually paying the premium for.

The cleanest example is a PDF. Formatting a report to PDF should never touch Opus — that's pandoc + Typst, a deterministic typesetter that costs zero tokens. Reading a datasheet PDF in shouldn't either — that's a Sonnet worker transcribing to text. Opus's only job in the whole loop is the engineering call in the middle.
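
The typesetting half of that loop can be sketched as a plain subprocess call. This is a minimal sketch, assuming pandoc and Typst are both installed and on PATH and that the installed pandoc accepts `typst` as a PDF engine; the function names `build_pandoc_command` and `render_pdf` are hypothetical, not part of any tool.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(source: Path, output: Path) -> list[str]:
    """Build the pandoc invocation that typesets Markdown to PDF via Typst."""
    return ["pandoc", str(source), "-o", str(output), "--pdf-engine=typst"]


def render_pdf(source: Path, output: Path) -> None:
    """Run pandoc as a deterministic tool; no model is called, so no tokens are spent."""
    # Assumption: both binaries are on PATH. Fail early with a clear message if not.
    for tool in ("pandoc", "typst"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    # check=True raises CalledProcessError if pandoc exits with a non-zero status.
    subprocess.run(build_pandoc_command(source, output), check=True)
```

Calling `render_pdf(Path("report.md"), Path("report.pdf"))` would then produce the PDF, with the file names here being placeholders. Keeping the command builder separate makes the invocation easy to inspect without running anything.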

Who should do what

| Task | Hand it to | Why |
| --- | --- | --- |
| Transcribe a PDF / drawing to text | Sonnet worker | Mechanical reading; isolates the heavy read from your main context. |
| Typeset Markdown → PDF | pandoc + Typst | Deterministic; costs zero tokens. |
| Clean / parse a CSV | Sonnet, or a script | Mechanical; once Opus writes the script, reruns are free. |
| Bulk rename, boilerplate, reformat | a script | Repeatable transforms shouldn't spend tokens at all. |
| Judge whether the answer is right (the oracle) | Opus | Judgment: the thing you're paying the premium for. |
| Weigh tradeoffs, set weights, pick the design | Opus | Judgment. |
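
The table above can be read as a lookup from task to handler. The following is a minimal sketch of that idea; the task labels and handler strings are hypothetical names chosen for illustration, and raising on an unknown task (rather than defaulting to any model) is an assumption of this sketch.

```python
# Hypothetical task labels and handler names, mirroring the rows of the table.
ROUTES = {
    "transcribe_pdf": ("mechanical", "sonnet-worker"),
    "typeset_pdf": ("mechanical", "pandoc+typst"),
    "clean_csv": ("mechanical", "script"),
    "bulk_rename": ("mechanical", "script"),
    "judge_answer": ("judgment", "opus"),
    "weigh_tradeoffs": ("judgment", "opus"),
}


def route(task: str) -> str:
    """Return the handler for a task that has already been sorted into a pile."""
    try:
        _pile, handler = ROUTES[task]
    except KeyError:
        # Assumption: an unsorted task is an error, not a reason to reach for a model.
        raise ValueError(f"unclassified task: {task!r}") from None
    return handler


def piles() -> dict[str, list[str]]:
    """Group the task labels by pile, preserving table order."""
    grouped: dict[str, list[str]] = {"mechanical": [], "judgment": []}
    for task, (pile, _handler) in ROUTES.items():
        grouped[pile].append(task)
    return grouped
```

Only the two judgment rows map to the expensive model; everything else resolves to a cheaper worker or a tool.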

Try it: route the task

A task lands. Pick the cheapest model × effort that can actually do it — then reveal the suitability heatmap. The goal isn't “use the best model,” it's “use the smallest one that still clears the bar.”

[Interactive heatmap of model × effort. Each cell is rated right-sized, overpaying, or underpowered (fails).]
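
The selection rule and the three ratings above can be sketched in a few lines. This is only an illustration: the `Option` type, the function names, and every cost and capability number below are placeholders invented for the example, not measurements of any model.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    model: str
    effort: str
    cost: float        # relative cost per run (placeholder value)
    capability: float  # hardest task level it can handle (placeholder value)


# Placeholder numbers for illustration only; they are not benchmarks.
OPTIONS = [
    Option("sonnet", "low", cost=1.0, capability=2.0),
    Option("sonnet", "high", cost=2.0, capability=4.0),
    Option("opus", "low", cost=5.0, capability=6.0),
    Option("opus", "high", cost=10.0, capability=9.0),
]


def cheapest_that_clears(options: Sequence[Option], bar: float) -> Optional[Option]:
    """Return the lowest-cost option whose capability meets the bar, or None."""
    viable = [o for o in options if o.capability >= bar]
    return min(viable, key=lambda o: o.cost) if viable else None


def rate(choice: Option, options: Sequence[Option], bar: float) -> str:
    """Rate a choice as underpowered, right-sized, or overpaying for a given bar."""
    if choice.capability < bar:
        return "underpowered"
    best = cheapest_that_clears(options, bar)
    if best is None or choice.cost <= best.cost:
        return "right-sized"
    return "overpaying"
```

With these placeholder values and a bar of 3.0, the sonnet/high cell is right-sized, both opus cells are overpaying, and sonnet/low is underpowered.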

Keep the expensive model's context light

The trunk you climbed to get here is, read another way, a token-economy checklist.

You can also just tell the expensive model to economize up front:

Prompt: "Before we start: transcribe the datasheets with a Sonnet worker and keep only the extracted YAML in context, render any PDF output with pandoc + Typst (not by hand), and write running findings to NOTES.md. Spend your reasoning on the sizing decision and its verification, not on reading or formatting."
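
The "write running findings to NOTES.md" step in that prompt is itself mechanical. A minimal sketch, assuming one finding per bullet line; the helper name `append_finding` and the bullet format are hypothetical choices, not something the prompt prescribes.

```python
from pathlib import Path


def append_finding(notes_path: Path, finding: str) -> None:
    """Append one finding as a bullet so it lives on disk, not in the context window."""
    # Collapse internal whitespace so each finding stays on a single line.
    line = " ".join(finding.split())
    # Mode "a" creates the file if it is missing and never rewrites earlier notes.
    with notes_path.open("a", encoding="utf-8") as notes:
        notes.write(f"- {line}\n")
```

Because the file is append-only, earlier findings never need to be re-read into context just to add a new one.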

Why bother — for now

Honestly? Because today there's a budget. A lighter context is a longer effective session, a cheaper run, and — the part people miss — a sharper model: the expensive one reasons better when its window isn't clogged with page furniture and stale output it has to read past every turn. The day inference is unbounded, some of this stops mattering. Until then, routing the mechanical work elsewhere is free leverage, and it's most of what the other foundation pages are quietly teaching.

Where this fits

This is the capstone of the trunk and the habit every problem branch inherits: do the cheap things cheaply so the expensive model has room to do the one thing only it can. From here, pick a branch — the tech tree is back home.