What this Mac mini is doing right now

As of 2026-05-30. This is notshot.ai labs — the model-selection harness for the notshot.ai product. The notshot.ai webapp also runs on this box but is a separate concern (different Slack channel: C0B77C0MS49). Public URLs:

• labs: https://labs.notshot.ngrok.dev (this surface)
• webapp: https://webapp.notshot.ngrok.dev (Next.js)

Both are served via reserved-subdomain endpoints on the paid ngrok plan, multi-tunneled from a single agent (ngrok start --all).

What labs is for

Labs is a model-selection harness. For a given product task, labs probes candidate models/vendors/architectures, judges them against the real product bar, and produces a winner with evidence. The winning approach then ports to the webapp.

Today's task: generate the best variants from an uploaded portrait/face. See /docs/goals for status; current probe is at /run/20260529-cb-lora-v1.

Once labs proves an approach, it transfers to the webapp (separate concern, separate channel). The labs→webapp port follows the existing labs-first.md workflow.

Operator quick-reference

Question	Answer
Where do I find current goals?	`/docs/goals`
Where do I find handover docs?	`/docs/handover/TAKEOVER` and peers
How do I tell Claude to do new work?	`/docs/prompts` — copy-pasteable Slack prompts
Where are run results?	Listed below on the index (`/`)
How do I extend a specific run?	Open the run page → "How to continue this work" section
Where do new portraits go?	`ikatomi-labs/inputs/portraits/` (gitignored — drop locally OR commit to a private branch and pull)

Slack channel scopes

C0B75KPL50U — labs only. All labs work, all run results, all probe + sweep + train conversations.
C0B77C0MS49 — webapp / admin / ops. Database, AWS, dev server, /admin.

Cross-channel pointers OK; substance stays in the right channel.

Long-running processes on this box

Slack listener — bun slack-listener.ts in tmux slack:1. Receives Slack messages and pastes them into the Claude session.
ngrok — currently exposing port 3001 (webapp) at https://pinnatisect-hugh-selenographically.ngrok-free.dev. Labs server tunnel TBD.
Webapp dev server — port 3001. Separate concern.
Webapp worker — SQS poll loop. Separate concern.
Labs server (this thing) — port 8080.

Active goals

For the operator taking over: these are the explicit work directives in play. Each goal lists where it came from, what done looks like, and how to instruct Claude to continue.

Macro-framing

Per Thor 2026-05-30: "the overall goal of the labs is to figure out which model works best for a given task. Right now, its how to generate the best variants from an uploaded portrait. After figuring things out, we need to transfer this approach over to the webapp. These are the main tasks and challenges".

Every active goal below is a task in this sense — a model-selection problem to solve, with an evidence-backed winner as the deliverable. After a winner lands, the next step is the webapp port (out of scope on this channel; tracked separately).

Candidate vendor space (per Thor 2026-05-30)

For every probe, the candidate model space includes BOTH cloud APIs (Replicate, fal, OpenAI, Anthropic, etc.) AND HuggingFace / open-weight models. The Mac mini is the exploratory probe surface for open-weight models — bigger models won't fit locally, but that doesn't disqualify them: probe via a cheap cloud GPU instead. Once a winner is identified, the production deployment path is AWS (SageMaker endpoint or a directly-hosted EC2 instance). Mac mini hosting is NEVER the production path. See [[local-models-in-scope]] for constraints + the decision tree.

Concrete past example: a "lighting" task where commercial vendors didn't fit and a HuggingFace model was the answer — that's the kind of probe where this matters.

1. Realistic AI-based identity variants of an uploaded portrait

Source: Thor 2026-05-30 on C0B75KPL50U threads 1780128019.836289 / 1780131789.364859 — "goal is to ensure that after an uploaded portrait/face we can generate AI based variants of this face/portrait... very realistic AI generated pic variant so that comparing orig with variant, an amateur like me does not identify the variant as AI generated... models we generate only wear bare minimum, we dress them in another step".

Status: In flight. v1 LoRA on a 14-photo CB fixture landed at Q1=83%, Q4=83% with catalog-outfit prompts (wrong baseline for downstream use). v2 in training with stronger hyperparams (4000 steps, rank 64) but ALSO with the wrong outfit prompts. v3 will fix the prompt baseline.

Pipeline framing (Thor 2026-05-30):

1. This labs probe = identity LoRA → generates body variants of an identity in bare-minimum clothing (fitted black bodysuit / athletic basics), a clean replaceable baseline.
2. Downstream FASHN tryon (existing webapp) → puts garments on the LoRA output.

The LoRA's job is identity + body + framing; the dressing pipeline's job is clothing. See [[identity-lora-vs-dressing-pipeline]] for prompt-design rules.

Done looks like (customer bar, Thor 2026-05-30): Amateur side-by-side test — comparing the original portrait and an AI-generated variant of the same person, a non-expert cannot identify which is AI-generated. Turing-test gestalt, tighter than per-axis thresholds.

Operational bar (used by the judge):

• Per-cell: Q1 ≥ 0.7 AND Q4 ≥ 0.7.
• Aggregate: Q1 pass rate ≥ 95% AND Q4 pass rate ≥ 90%.
• No cell with Q1 < 0.3 (catastrophic identity loss is a deal-breaker even at low frequency).
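The operational bar above can be sketched as a small check. This is only an illustration, not the judge's actual code: the `Cell` type and `meets_bar` function are hypothetical names, and it assumes the aggregate pass rates are computed per axis, counting a cell as passing an axis when its score is at least 0.7.

```python
from dataclasses import dataclass

# Thresholds taken from the operational bar; the names are illustrative.
CELL_PASS = 0.7         # per-cell pass threshold for both Q1 and Q4
Q1_RATE_MIN = 0.95      # minimum share of cells passing Q1
Q4_RATE_MIN = 0.90      # minimum share of cells passing Q4
Q1_CATASTROPHIC = 0.3   # any cell below this on Q1 fails the whole run


@dataclass(frozen=True)
class Cell:
    """Judge scores for one cell of the inference matrix (hypothetical shape)."""
    q1: float
    q4: float


def meets_bar(cells: list[Cell]) -> bool:
    """Return True if the run clears the operational bar as read above."""
    if not cells:
        return False  # nothing judged, so nothing proven
    n = len(cells)
    q1_rate = sum(c.q1 >= CELL_PASS for c in cells) / n
    q4_rate = sum(c.q4 >= CELL_PASS for c in cells) / n
    no_catastrophe = all(c.q1 >= Q1_CATASTROPHIC for c in cells)
    return q1_rate >= Q1_RATE_MIN and q4_rate >= Q4_RATE_MIN and no_catastrophe
```

Under this reading, a 6-cell matrix with a single Q1 failure has a pass rate of 5/6 (about 83%), which is below 95%, so every cell in a 6-cell run has to pass Q1.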

To extend:

• See the run page for 20260529-cb-lora-v1; it has copy-pasteable next-step prompts.
• General pattern: "drop N photos into inputs/<identity>/, train LoRA via scripts/train_<identity>_lora.py, run 6-cell matrix, judge it."

2. Labs web surface (this thing)

Source: Thor 2026-05-30 on C0B75KPL50U thread 1780129450.301019 — "lightweight server that has the ability to display whatever you finish... key is to organise the labs web setup in a way that the one taken over my work knows where to find what and how to best instruct you to work on stuff".

Status: v0 shipped (you're looking at it).

Done looks like: Any takeover operator can land on this URL and within 30 seconds know what's in flight, where to find context, and how to ask Claude to continue. Verdict: see for yourself.

To extend: Add new run dirs under runs/. Drop a SUMMARY.md in each (template in /docs/handover/SUMMARY_TEMPLATE). The server picks them up automatically.
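How the server might pick up run directories can be sketched as below. This is a guess at the mechanism, not the labs server's actual code: `list_runs` is a hypothetical helper, and it assumes each run is a directory directly under runs/, that the title is the first non-empty line of SUMMARY.md, and that a directory without SUMMARY.md is listed as in-flight (as the index does with "no SUMMARY.md").

```python
from pathlib import Path


def list_runs(runs_dir: Path) -> list[dict]:
    """Scan runs_dir and return one entry per run directory, names descending."""
    runs = []
    for entry in sorted(runs_dir.iterdir(), reverse=True):
        if not entry.is_dir():
            continue  # ignore stray files next to the run directories
        summary = entry / "SUMMARY.md"
        if summary.is_file():
            lines = summary.read_text(encoding="utf-8").splitlines()
            # Assumption: the first non-empty line is the title, possibly a "# heading".
            title = next(
                (ln.lstrip("# ").strip() for ln in lines if ln.strip()),
                entry.name,
            )
            runs.append({"name": entry.name, "title": title, "has_summary": True})
        else:
            # No SUMMARY.md yet: fall back to the directory name.
            runs.append({"name": entry.name, "title": entry.name, "has_summary": False})
    return runs
```

Calling `list_runs(Path("runs"))` on each page load would be enough for new directories to appear without a restart, at the cost of re-reading every SUMMARY.md per request.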

Handover docs

→ Prompts library — copy-pasteable Slack messages to extend each workstream.

Runs (13)

❌ REJECT v9 — face-swap on v7 LoRA outputs (LoRA + face-swap composition)
20260530-cb-lora-v9-faceswap · 2026-05-30 · $1.10 · ~10min face-swap × 18 cells + ~3min judge
🛠 IN-FLIGHT 20260530-cb-lora-v8-hyperparam-scaling
20260530-cb-lora-v8-hyperparam-scaling · no SUMMARY.md
🛠 IN-FLIGHT 20260530-cb-lora-v7-ohwx-manualcaptions
20260530-cb-lora-v7-ohwx-manualcaptions · no SUMMARY.md
🛠 IN-FLIGHT 20260530-cb-lora-v6-trigger-CTBLNCHTT
20260530-cb-lora-v6-trigger-CTBLNCHTT · no SUMMARY.md
❌ REJECT CB LoRA v5 — same v1 catalog prompts, 6 NEW seeds, v1 LoRA (fluke-vs-fixture probe)
20260530-cb-lora-v5-newseeds-v1lora · 2026-05-30 · $1.65 · ~6min inference (18 cells) + ~3min judge
❌ REJECT CB LoRA v4 — bare-minimum prompts against v1 LoRA (isolates prompt question)
20260530-cb-lora-v4-bareminmin-v1lora · 2026-05-30 · $0.55 · ~2min inference + ~30s judge
❌ REJECT CB LoRA v3 — bare-minimum-clothing prompts against v2 LoRA
20260530-cb-lora-v3-bareminmin · 2026-05-30 · $0.55 · ~6min inference (one cell hit a slow Replicate slot) + ~30s judge
❌ REJECT CB LoRA v2 — stronger hyperparams to fix v1 identity drift
20260530-cb-lora-v2 · 2026-05-30 · $8.50 · ~50min training + ~2min inference + ~30s judge (in progress)
✅ PROCEED CB LoRA identity validation — 6-cell inference matrix
20260529-cb-lora-v1 · 2026-05-29 · $5.30 · ~35min training + ~2min inference (judge pending)
🛠 IN-FLIGHT 20260530-identity-body-multiface-v1
20260530-identity-body-multiface-v1 · no SUMMARY.md
🛠 IN-FLIGHT 20260530-replicate-body-multiface-v1
20260530-replicate-body-multiface-v1 · no SUMMARY.md
🛠 IN-FLIGHT _specs
_specs · no SUMMARY.md
🛠 IN-FLIGHT _aug_specs
_aug_specs · no SUMMARY.md