The future is unbounded inference. Today you work to a budget — so spend it on judgment, not on mechanical reformatting. Everything else on this trunk is really one idea wearing different hats: tee up the expensive model with as little in its context as you can get away with.
Sort every task into two piles: mechanical (a faithful transform with a right answer — transcribe, reformat, parse, rename) and judgment (a call that needs reasoning — is this right, which option wins, what does the physics say). Mechanical work is cheap work; it does not need your best model. Judgment is what you're actually paying the premium for.
The cleanest example is a PDF. Formatting a report to PDF should
never touch Opus — that's pandoc + Typst, a
deterministic typesetter that costs zero tokens. Reading a datasheet PDF
in shouldn't either — that's a Sonnet worker transcribing to
text. Opus's only job in the whole loop is the engineering call in the
middle.
| Task | Hand it to | Why |
|---|---|---|
| Transcribe a PDF / drawing to text | Sonnet worker | Mechanical reading; isolates the heavy read from your main context. how |
| Typeset Markdown → PDF | pandoc + Typst | Deterministic; costs zero tokens. how |
| Clean / parse a CSV | Sonnet, or a script | Mechanical; once Opus writes the script, reruns are free. |
| Bulk rename, boilerplate, reformat | a script | Repeatable transforms shouldn't spend tokens at all. |
| Judge whether the answer is right (the oracle) | Opus | Judgment — the thing you're paying the premium for. |
| Weigh tradeoffs, set weights, pick the design | Opus | Judgment. |
A task lands. Pick the cheapest model × effort that can actually do it — then reveal the suitability heatmap. The goal isn't “use the best model,” it's “use the smallest one that still clears the bar.”
The trunk you climbed to get here is, read another way, a token-economy checklist:
/effort low
for transcription and formatting, high only where the reasoning is hard.NOTES.md so the next
session reads a note instead of re-deriving — persistent memory for a
few hundred tokens. notes/ folder.You can also just tell the expensive model to economize up front:
Honestly? Because today there's a budget. A lighter context is a longer effective session, a cheaper run, and — the part people miss — a sharper model: the expensive one reasons better when its window isn't clogged with page furniture and stale output it has to read past every turn. The day inference is unbounded, some of this stops mattering. Until then, routing the mechanical work elsewhere is free leverage, and it's most of what the other foundation pages are quietly teaching.
This is the capstone of the trunk and the habit every problem branch inherits: do the cheap things cheaply so the expensive model has room to do the one thing only it can. From here, pick a branch — the tech tree is back home.