Droplet sizing: a distribution from a slide, and why to doubt it

Spray a slide, photograph it under a scope, and count the droplets — a size distribution falls out. The catch: every number depends on one threshold you picked by eye. This explorer is an HTML slide deck of live widgets, and its last slide is the one that tells you whether to believe the rest.

1. The brief

A microscope image of droplets deposited on a slide. You want the drop-size distribution — the percentiles D10/D50/D90 and the Sauter mean D32 that spray engineers live by. The pipeline is short: calibrate pixels to microns, threshold the image to separate droplets from background, label the blobs, and turn each blob's area into an equivalent diameter. Simple — until you notice the whole distribution slides when you nudge the threshold.
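
The four pipeline steps fit in a few lines. What follows is a minimal sketch in plain Python, not the deck's own code: the function names, the tiny hand-made image, the 2 µm-per-pixel scale, and the assumption that droplets are brighter than the background are all illustrative.

```python
import math
from collections import deque

def label_blobs(image, threshold):
    """Return blob areas in pixels, using 4-connected labelling.

    Assumption: a pixel is droplet when its intensity is >= threshold.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited droplet pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas

def equivalent_diameter_um(area_px, um_per_px):
    """Diameter of the circle with the same area as the blob, in microns."""
    return 2.0 * math.sqrt(area_px / math.pi) * um_per_px

# Hand-made test image: two separate bright blobs on a dark background.
image = [
    [0,   0,   0, 0,   0, 0],
    [0, 200, 200, 0,   0, 0],
    [0, 200, 200, 0, 180, 0],
    [0,   0,   0, 0, 180, 0],
    [0,   0,   0, 0,   0, 0],
]
areas = label_blobs(image, threshold=100)
diameters = [equivalent_diameter_um(a, um_per_px=2.0) for a in areas]
```

On a real micrograph you would reach for an image library rather than a hand-rolled flood fill; the point here is only that every diameter is a function of `threshold` and `um_per_px`.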

2. Why this is an afternoon, not a week

Hand Claude the image and it writes the computer-vision pipeline — threshold, connected-components labelling, area→diameter, percentiles — in one pass, and renders it as an explorable deck. What it can't do for you is decide whether the measurement is trustworthy: whether the threshold is in a stable band, whether overlapping droplets merged into one fat blob, whether the deposited volume reconciles. That judgment is the whole point, and the deck is built to make it.

3. Explore it — the deck

Four slides, one shared dataset. Drive the threshold and step through with the arrows (or ←/→ keys). The image is synthetic, so the true diameters are known — which is what lets the later slides grade the measurement.

[Interactive deck: slide 1 of 4, “the slide”; synthetic image, 1600 × 1000 µm]

Slide 4 is the one that matters. A measurement you can only get with one magic threshold isn't a measurement — it's a coincidence. Drag the threshold across its range and watch D50: there's a band near the true value where the answer barely moves (trust it there), then it drifts low as a rising threshold shaves every droplet smaller — while at the very low end neighbours start to merge into one fat blob.

When merging bites, flip on “split touching drops”, a distance-transform watershed that breaks fused blobs back into separate droplets. Watch the count climb toward the dosed number and the inflated D90 tail come down. One honest caveat: it won't rescue the volume bar. Area-based diameters over-count a merged blob and then under-count the pieces a split carves it into, so conservation tracks the threshold, not the segmentation: a separate oracle, a separate fix. Splitting is the right tool for the count and the distribution; calibrating the threshold is the right tool for the volume.

4. Verification — the measurement is the oracle

Image measurements lie quietly: the code runs, a tidy histogram appears, and it can be off by half. Four independent checks keep it honest — and the synthetic slide lets you actually run them.

Rung 1 — calibration. Pixels mean nothing until a known length fixes the scale (a stage micrometer, or a feature of known size). Every diameter rides on that µm-per-pixel number.
Pass: the scale bar is set from a known reference; diameters are in microns, not pixels.
Fail here: a wrong calibration scales every result by the same factor — and nothing in the histogram looks wrong.
Rung 2 — ground truth. On the synthetic slide the true diameters are known. Overlay them (the checkbox) and the measured distribution should track the truth.
Pass: measured D50 sits within a few percent of true D50 at a sensible threshold.
Fail here: a systematic gap means the threshold is biasing every diameter — the soft droplet edge is being cut too tight or too loose.
Rung 3 — conservation. The droplet volumes must add up to the volume deposited. Sum πd³/6 over the blobs and compare to the dosed total.
Pass: measured volume reconciles with the dosed volume within tolerance.
Fail here: a too-low threshold merges neighbours and inflates the volume; a too-high one shrinks every blob and loses volume. The reconciliation catches both.
Rung 4 — robustness. Sweep the threshold and watch D50. A trustworthy reading lives on a plateau, not a slope.
Pass: D50 is flat across a band of thresholds — the result doesn't depend on your exact pick.
Fail here: if D50 rides a slope everywhere, there's no objective answer; report a range, or fix the segmentation (watershed the overlaps).
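
Rungs 1 and 3 are plain arithmetic, sketched below under the same spherical-droplet assumption the πd³/6 formula makes. The function names, the 10% tolerance, the reference length, and the dosed volume are hypothetical placeholders, not values from the deck.

```python
import math

def um_per_pixel(reference_length_um, reference_length_px):
    """Scale factor from a feature of known physical length (rung 1)."""
    return reference_length_um / reference_length_px

def total_volume_um3(diameters_um):
    """Sum of sphere volumes, pi * d^3 / 6, over all blobs (rung 3)."""
    return sum(math.pi * d ** 3 / 6.0 for d in diameters_um)

def reconciles(measured, dosed, rel_tol=0.10):
    """True when the measured volume is within rel_tol of the dosed volume."""
    return abs(measured - dosed) <= rel_tol * dosed

# Hypothetical numbers: a 100 um reference feature spans 250 px.
scale = um_per_pixel(reference_length_um=100.0, reference_length_px=250.0)
diameters_um = [d_px * scale for d_px in (100.0, 125.0, 150.0)]
measured = total_volume_um3(diameters_um)
dosed = 2.1e5  # hypothetical dosed volume, cubic microns
ok = reconciles(measured, dosed)
```

Because volume goes as d³, a calibration that is 10% off moves the summed volume by about 33%, so the conservation check is also a second line of defence for rung 1.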

5. Hints

Hint 1 — the threshold is the whole ballgame

Droplet edges are soft, so the threshold decides where each droplet “ends” — it sets every diameter and whether neighbours merge. Don't pick it by eye and move on; sweep it and find the band where the answer is stable. That band, not a single value, is your operating point.
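
One way to turn “find the band” into something you can run: given D50 at each swept threshold, pick the longest contiguous run over which D50 stays within a relative tolerance. This is a sketch of that idea, not the deck's method. The `stable_band` name, the 3% tolerance, and the sweep values are made up for illustration and only shaped like the curve described above.

```python
def stable_band(thresholds, d50s, rel_tol=0.03):
    """Longest contiguous threshold range over which D50 varies by <= rel_tol.

    Returns (low_threshold, high_threshold, mean_d50) for that range.
    """
    best = (0, 0)
    n = len(d50s)
    for start in range(n):
        lo = hi = d50s[start]
        end = start
        for j in range(start + 1, n):
            new_lo, new_hi = min(lo, d50s[j]), max(hi, d50s[j])
            if (new_hi - new_lo) / new_lo > rel_tol:
                break  # spread too wide; this run ends at j - 1
            lo, hi = new_lo, new_hi
            end = j
        if end - start > best[1] - best[0]:
            best = (start, end)
    i, j = best
    window = d50s[i:j + 1]
    return thresholds[i], thresholds[j], sum(window) / len(window)

# Illustrative sweep (invented values): inflated at the low end by merging,
# flat in the middle, drifting low as the threshold rises.
thresholds = list(range(30, 160, 10))
d50s = [61.0, 56.0, 53.0, 52.5, 52.2, 52.0, 51.8, 51.5,
        50.0, 48.0, 45.5, 42.0, 38.0]
low, high, d50_mean = stable_band(thresholds, d50s)
```

If the longest run is only one or two thresholds wide, that is the “slope everywhere” case: report a range rather than a number.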

Hint 2 — overlaps are the silent error

Two touching droplets label as one blob with a big combined area, and since volume goes as d³, one merged blob reads as more volume than the two it replaced: roughly 40% more for two equal, just-touching droplets. That's why conservation breaks before the histogram looks obviously wrong. Watershed segmentation (going further) splits them.
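
The size of that inflation is easy to check by hand. The sketch below assumes two equal droplets whose areas simply add when they are labelled as one blob (overlap ignored) and uses a hypothetical 50 µm diameter; the ratio does not depend on that choice.

```python
import math

def area_to_diameter(area):
    """Equivalent-circle diameter for a blob of the given area."""
    return 2.0 * math.sqrt(area / math.pi)

def sphere_volume(d):
    """Volume of a sphere of diameter d."""
    return math.pi * d ** 3 / 6.0

d = 50.0  # hypothetical droplet diameter, microns
area_one = math.pi * d ** 2 / 4.0
true_volume = 2 * sphere_volume(d)

# Two touching droplets labelled as one blob: the areas add.
merged_d = area_to_diameter(2 * area_one)  # sqrt(2) * d
merged_volume = sphere_volume(merged_d)
ratio = merged_volume / true_volume  # sqrt(2), about 1.41
```

The merged diameter is only √2 times the original, which can hide in the tail of a histogram, while the volume is already about 41% too high.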

Hint 3 — report the right average

There isn't one “mean diameter.” D50 is the median by count; the Sauter mean D32 = Σd³/Σd² weights by volume-to-surface and is what governs evaporation and combustion. Quote the statistic your physics cares about, and say which one it is.
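
A short sketch makes the difference concrete. It uses count-based percentiles, as the text defines D50, with linear interpolation between ranks (one common convention, assumed here), and hypothetical diameters.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    k = (len(sorted_values) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)

def sauter_mean(diameters):
    """D32 = sum(d^3) / sum(d^2)."""
    return sum(d ** 3 for d in diameters) / sum(d ** 2 for d in diameters)

# Hypothetical diameters in microns.
diameters = sorted([30.0, 40.0, 50.0, 60.0, 70.0])
d10, d50, d90 = (percentile(diameters, p) for p in (10, 50, 90))
d32 = sauter_mean(diameters)
```

For this symmetric set the median is 50 µm while D32 comes out near 57 µm, because the cubic weighting lets the largest droplets dominate.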

Hint 4 — what to ask for

Prompt:
From this slide image: calibrate px→µm from the scale bar, threshold to segment droplets, label connected components, and report the diameter distribution (D10/D50/D90 and Sauter D32). Then sweep the threshold and plot D50 vs threshold so I can see the stable band, and sum the droplet volumes to check against a dosed volume of X. Flag blobs touching the image border and any suspiciously large merged blobs.

Hint — tune the collaborator

Two free levers worth setting. Turn the reasoning effort up for the hard part — Claude Code's /effort (see Feed it documents for the model and effort controls); a transcription wants it low, an analysis like this one wants it high. And end your prompt with an explicit self-check — “before you finish, confirm D50 sits on the true line across a band of thresholds, not just one” — which is exactly why the prompt above asks Claude to verify itself. Naming the oracle is the highest-value line in the prompt. And keep the expensive model's context light — route transcription and formatting to cheaper tools (see Spend tokens well).

6. Where to draw the line

Let Claude write the vision pipeline and the deck — thresholding, labelling, percentiles, the live histogram, all of it is exactly the tireless plumbing it's good at. But you own the calls that decide whether the number is real: is the scale calibrated, is the threshold on a plateau, did overlaps merge, does the volume reconcile. A clean histogram from an uncalibrated, single-threshold, overlap-ridden image is a confident wrong answer — and only those checks catch it.

7. One worked solution

What good looks like

Park the threshold in the low-middle (around 70–80) where slide 4 shows D50 flat and sitting on the true line, flip on the ground-truth overlay, and confirm the measured median tracks the true one within a few percent. Read D10/D50/D90 and the Sauter D32 off slide 3, and confirm the summed droplet volume reconciles with the dosed total. Then state the result with its operating band: “D50 ≈ 52 µm, stable across threshold ~55–95.”

The honest footnote: push the threshold up and every blob is shaved smaller — D50, D90, and especially the summed volume all bias downward, because volume goes as d³. Push it too low and neighbours merge into single fat blobs. Neither extreme is the answer. The plateau near the true line is — and reporting the band instead of a single decimal is what separates a measurement from a number.

8. Going further

Ship it

This explorer is the deliverable: a self-contained HTML deck of live widgets, no install, e-mailable. See “Explore in HTML, deliver in PDF” for the pattern, and “Feed it documents” for getting your micrograph in efficiently.