notshot.ai labs
docs/handover/SUMMARY_TEMPLATE.md

verdict: IN-FLIGHT # PROCEED | REJECT | IN-FLIGHT | INCONCLUSIVE title: "" date: 2026-05-30 cost_usd: 0.00 duration: ""


What this run is

Two sentences. What hypothesis was being tested, what method was used. No jargon, no internal abbreviations the next operator might not recognize.

Verdict reasoning

If verdict is PROCEED or REJECT: one paragraph on why. Reference the judge axes that drove the call. Quote the threshold (eg "Q1 80%, Q4 80%").

If IN-FLIGHT: what is the script / judge / process waiting on right now? When will it be terminal?

How to continue this work

Copy-pasteable Slack prompt(s). Example:

in the labs channel: re-run with seeds [555, 999] to get more variance on the identity axis. Cost ~$0.15. Surface on the labs server when done.

Add more prompts for sibling explorations (different identity, different hyperparams, different vendor).

Artifacts

Inputs

Notes / gotchas

Anything an operator who didn't run this would need to know. Cost surprises, vendor quirks, environmental requirements.