Cross-corpus comparison

Every case study at a glance

86 shipped case studies, sortable and filterable. Click any row to open the full study. The table reads frontmatter directly — when a study lands or its frontmatter updates, this page reflects it on the next deploy.
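
The frontmatter-to-table step described above can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the field names (`title`, `date`, `version`), the `StudyRow` shape, and the helpers `parseFrontmatter`, `toRow`, and `buildTable` are all hypothetical, and the parser only handles flat `key: value` lines rather than full YAML.

```typescript
// Hypothetical row shape; the real frontmatter schema is not shown in the source.
interface StudyRow {
  title: string;
  date: string; // ISO date such as "2026-04-20"
  version: string | null; // null when the study carries no framework version
}

// Parse a flat `key: value` block delimited by `---` lines.
// Assumption: no nested YAML, lists, or multi-line values.
function parseFrontmatter(source: string): Partial<Record<string, string>> {
  const fields: Partial<Record<string, string>> = {};
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  const body = match?.[1];
  if (body === undefined) return fields;
  for (const line of body.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(idx + 1).trim().replace(/^["']|["']$/g, "");
    if (key) fields[key] = value;
  }
  return fields;
}

// Turn one study file into one table row.
function toRow(source: string): StudyRow {
  const fm = parseFrontmatter(source);
  return {
    title: fm.title ?? "(untitled)",
    date: fm.date ?? "",
    version: fm.version ?? null,
  };
}

// Build the table newest-first; ISO dates compare correctly as strings.
function buildTable(sources: string[]): StudyRow[] {
  return sources
    .map(toRow)
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0));
}
```

Because rows are derived from the files at build time, a page built this way changes only when it is rebuilt, which matches the "next deploy" behaviour described above.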

Version	Title	Date			Overview	Headline
—	The Blank-Main Bug — Catching a Production SSR Regression After Promoting	2026-04-20	—	appendix	Production site shipped 251 bytes of HTML per page across 48 routes. Suspense + useSearchParams in the root layout susp…	251 — Bytes per page across 48 routes
—	Watching the Framework Build the Site That's Replaying It — The DispatchReplay Component	2026-04-21	—	appendix	Static blueprint diagrams tell you what a system is, not whether it's running. DispatchReplay plays a recorded trace of…	2 (Sprint I + fitme-story meta) — Live traces
—	The Lego Metaphor — Designing the PM-Flow Ecosystem Page	2026-04-22	—	appendix	How to render an 11-skill, 10-phase, 15-data-file framework on a single page without it looking like an org chart. The …	11 · 10 · 15 — Skills × phases × shared files
—	External Validation — Did Our Numbers Hold Up?	2026-04-16	—	appendix	Independent review of the normalization model, velocity claims, and measurement methodology — confirming what is solid,…	5 (normalization, velocity, cache, complexity, baselines) — Methodology dimensions reviewed
—	What If We Had Measurement From Day One? — A Retrospective ROI Analysis	2026-04-16	—	appendix	Counterfactual experiment: retroactively applying deterministic measurement infrastructure to all 24 features, then com…	24 — Features reanalysed under counterfactual
—	How We Normalized Complexity Across 16 Different Features	2026-04-16	—	appendix	Raw metrics like wall time and file count are meaningless without normalization. The Complexity Unit (CU) model — addit…	16 — Features normalised under CU model
7.10	Garmin Health Connection (Tier 1) — Source attribution without a backend	2026-06-12	—	light	A Garmin owner who installs FitMe gets a degraded readiness experience unless they already have "Garmin Connect → Apple…	0 / 0 / 0 — New backend / permissions / data egress
7.10	Control Room Live Feed — from build-time snapshot to fail-soft request-time data	2026-06-16	—	light	The Unified Control Center was complete and deployed, but every panel rendered a build-time snapshot — data only refres…	48 (16 probes / 16 data-source / 9 producer / 7 footer) — Unit tests across the live layer
7.10	Foundation Models Tier 3 — real on-device AI personalization, with an SDK-gated cloud-escalation path	2026-06-15	—	light	The AI engine's Tier-3 on-device personalization was a placeholder — a hardcoded confidence of 0.5 that never called a …	27 / 27 — Unit tests (incl. AB1–AB3 eval coverage)
7.10	T10 — AI Golden-Set Eval Harness (and why the FitMe AI is deterministic, not generative)	2026-06-10	—	light	T10 was spec'd as an "LLM golden-set eval harness — promptfoo-equivalent weekly run." Reading the actual code corrected…	24 — Golden cases
7.10	T3 — SignInService Passkey/WebAuthn Unit Tests (closing the highest-risk zero-coverage service)	2026-06-10	—	light	SignInService.swift (1244 lines) is on the CLAUDE.md high-risk list — it owns Apple Sign In, Google, Email/OTP, and pas…	10 (deterministic, device-free) — Tests
7.10	T5 — Mock-Protocol Drift Detection (a central conformance registry, reframed for the real codebase)	2026-06-10	—	light	T5 was spec'd as "wrap MockKeychainStorage, MockSupabaseClient, StubAIEngineClient, CountingAIEngineClient in a shared …	6 (AnalyticsProvider, GoogleAuthProviding, AppleAuthProviding, EmailAuthProviding, URLSessionProtocol, AIInputAdapter) — Protocol anchors
7.10	T13 Per-Gate `last_failed_at` Index — distinguishing "stopped running" from "running + catching" (v7.10)	2026-06-10	—	light	F17 materialized "when did this gate last run?" per gate in O(1). But "ran" and "caught a violation" are different sign…	6 — Cycle-time gates populated with failure history
7.10	Android Token Pipeline — making the design system real (AND-1)	2026-06-17	—	light	The android-design-system feature shipped 2026-04-04 as research-only — an iOS→Material-Design-3 mapping doc, an adapta…	3 — Generated artifacts
7.10	figma-design-architecture — honest closure of the design-system source-of-truth story	2026-06-18	—	light	After the 2026-06-15 Figma-mirror rebuild pivoted the design system to "code is canonical; Figma is a manually-maintain…	0/2 → 2/2 — Surfaces with a current, linked, no-false-claim arch doc
7.10	Contract-Fixture Consumer Adoption (E-15) — closing the W16 cross-repo silent-pass (v7.10)	2026-06-22	—	light	The /control-room/framework page threw a TypeError in production for 13 days with green CI the whole time: the FitTrack…	13 days — W16 silent-pass duration in production
7.10	Funnel Analysis Dashboards — running the canonical funnels against live GA4 (v7.10)	2026-06-23	—	light	The backlog framed this (RICE 6.0) as "just needs GA4 funnel definitions wired" — but a Phase 0.1 reality-check found t…	3 of 5 — Funnels with computable live drop-off
7.10	F18 — Mutation Testing on the Gate Dispatchers (v7.10)	2026-06-26	—	light	The framework's enforcement rests on two ~80 KB dispatcher files. F14/F15/F16 built a 3-layer test suite for them, but …	1,857 — Mutants enumerated across both dispatchers
7.9.1	F16 Try-repo Pre-commit Harness — The 3rd Gate-Test Layer (v7.9.1)	2026-06-04	—	light	The framework gained a 3rd layer of gate testing — try-repo. Where unit tests catch wrong-regex bugs and F14 dispatch t…	61 pass + 1 documented skip in 16.21s — Test count at ship
7.9.1	F17 Per-Gate `last_fired_at` Index — Derived Telemetry Materialization (v7.9.1)	2026-06-04	—	light	Mechanism A accumulated ~1800 gate-coverage.jsonl rows by v7.9. The planned v7.10 GATE_COVERAGE_ZERO meta-check would s…	under 1s for 1828 rows (PRD budget under 2s) — Refresh wall-clock
7.9.1	F2 Phase 0 Reality-Check — Mechanical Defense Against Post-Squash-Merge State Drift (v7.9.1)	2026-06-04	—	light	The post-squash-merge state-drift pattern repeated 5 times in 4 days (2026-06-01 → 2026-06-04). Each time, state.json::…	5 in 4 days (2026-06-01 to 2026-06-04) — Drift instances calibration baseline
7.9.1	Framework v7.9.1 — Single-day Build Window (8 ships, 14 PRs, 0 new gates)	2026-06-04	—	light	v7.9.1 was a single-day build window that opened at v7.9 Phase E exit (2026-06-04) and closed the same day. 8 ships lan…	8 — Ships in the build window
7.9.1	HADF Signature Expansion — empirical-first, with a real on-device calibration	2026-06-05	—	light	"Expand HADF into new chip families" naturally reads as "add more spec-sheet rows" — but Phase 2-bis changed the answer…	8 → 9 (real M4, n=80); target was ≥12 — Instrumented signatures
7.9.1	HADF Phase 3A — The Sensing Layer (detection-only observability over a validated dispatch signal)	2026-06-10	—	light	Phase 2-bis established that streaming TTFT/TPS signatures are real, provider-general, substrate-discriminating, and sh…	8 endpoints · 7197 valid records · min_n=50 · max_ttft 30s — Reference store
7.9.1	fitme-story Dual-Audience Redesign — a cookie-backed lens that re-narrates one site for two audiences	2026-06-09	—	light	The public site served one undifferentiated narrative to two distinct audiences (developers and product managers). GA4 …	76.2% (~3.5× content pages) — Home bounce rate (problem baseline)
7.9.1	T14 platforms_tested — Platform-test parity as a queryable state field	2026-06-07	—	light	Before T14, state.json recorded that a feature shipped and that it had a case study — but not which platforms its tests…	94 — Complete features backfilled
7.9	Framework v7.9 — Promotion Release: 3 Advisory Gates → Enforced via Single-Flag Flip After 14-Day Calibration	2026-05-21	—	light	v7.9 shipped 2026-05-21 via FT2 PR #417 (ea53ff4). A single-line edit at scripts/check-state-schema.py:132 (BRANCH_ISOL…	18 (Mode B) + 13 (Mode C) + 13 (FEATURE_CLOSURE_COMPLETENESS) = 44 — Mechanism A telemetry rows (14d)
7.9	When gh pr view Lies — The Stale-Base Branch Trap and a 33-Zombie Cleanup	2026-05-25	—	light	Two drafts that looked clean (~6 files each per gh API) would have reverted 23 files / -2157 lines on merge — including…	2 (#484 + #488) — Draft PRs near-miss
7.9	Orchid — From Dispatch Patterns to Silicon, Validated by Measurement	2026-06-05	—	standard	The capstone of the Orchid research arc: how a fitness app's PM-framework dispatch intelligence became a RISC-V acceler…	25,305 (0 invariant violations) — Orchid v1 behavioral runs
7.9	Readiness-Aware Training Alert — A Second Observer for Today's Training-Day Decision Aid (C2)	2026-06-01	—	light	C2 shipped 2026-06-01 in a single session as a 4-phase Enhancement on feature/readiness-aware-training-alert (merged vi…	5 (ReadinessAlertRecommendation + ReadinessAwareTrainingTrigger + TrainingStartTimeLearner + ReadinessAwareTrainingObserver + ReadinessAwareAlertStore) — New source files
7.9	Trend Alerts (HRV) — A Third Observer for the Multi-Day Sustained Pattern (C4)	2026-06-01	—	light	C4 shipped 2026-06-01 in a single session as a full 9-phase Feature on feature/trend-alerts-hrv (FT2 PR #564). Adds the…	6 (TrendAlertContext + TrendAlertTrigger + TrendAlertDispatchTimeLearner + TrendAlertObserver + TrendAlertStore + HRVTrendChart) — New source files
7.9	When Two Workflows Share a Name — The `${{ github.workflow }}` Concurrency Trap (W26)	2026-06-01	—	light	Every mixed-content PR (touching both FitTracker source AND docs) produced two Build and Test status checks — one CANCE…	4 (C2 #560, C4 #562 + #564, dev-guide #563 — all hit the cancellation race before the fix) — Affected PRs in the session
7.9	AI User Feedback Loop — Closing Audit UI-024 with a Per-User Reinforcement Cycle (C5)	2026-06-02	—	light	C5 shipped 2026-06-01/02 in a single session as a full 5-phase Feature on feature/ai-user-feedback-loop (FT2 PR #572 me…	UI-024 (FitTracker/Views/AI/AIInsightCard.swift:230-235; carried since PR #79 / 2026-04-10) — Closes deferred audit
7.9	Exercise Search/Filter — a discoverable read-only library on top of the 50-exercise catalog (C3)	2026-06-02	—	light	C3 closes backlog L347 — the longest-running "no in-app library" complaint. The training catalog already carried ~50 ex…	~50 (TrainingProgramData.allExercises) — Catalog exercises surfaced
7.9	Training Program Customization — custom splits on top of the fixed PPL (C6)	2026-06-02	—	light	C6 replaces the fixed 6-day Push/Pull/Legs split with per-user custom training programs. Users pick from 4 starter temp…	~2152 — Net LoC (9 standalone-buildable commits)
7.8.6	UCC Passkey Auth Security Hardening — closing 4 deferred gaps before the v7.9 freeze	2026-05-20	—	light	A single-day enhancement that closed 4 explicitly-deferred security gaps from the parent ucc-passkey-auth PRD: an email…	4 (allowlist + hybrid lockout + token audit + unlock CLI) — Deferred gaps closed
7.8.5	Analytics Observability — closing the measurement-debt loop (taxonomy cleanup + a live GA4 path)	2026-06-09	—	light	The project had shipped extensive analytics instrumentation but had no clean taxonomy and no live way to observe events…	56 → 0 — Analytics taxonomy drift (CSV-missing-rows)
7.8.3	Cross-Repo State Sync (v7.8.3) — 10 PRs Across 2 Repos, 3-Attempt Cutover Ceremony, Reverse-Sync Infrastructure Live	2026-05-11	—	light	v7.8.3 is the first FitMe PM framework release that spans two git repos as a single coordinated Feature. Shipped 2026-0…	10 total (FT2 #298-#302 + fitme-story #86-#90) — PRs shipped
7.8.3	Design System P2 Deferred — One Variant, One Migration, Ten Items Still Deferred	2026-05-12	—	light	Tightly-scoped enhancement closing one P2 item (P2-044) from the eleven deferred at the end of fitme-story-design-syste…	1 of 11 deferred — P2 items closed by this enhancement
7.8.3	iOS UI-Audit P1 Drift Cleanup — 44 → 14 (Option B Respected, 68% Reduction)	2026-05-12	—	light	Drift-cleanup follow-up to `ios-ui-audit-p1-burndown` (PRs #292+#294). Respects Option B's locked ceiling — tokenizes O…	44 → 14 (-68%) — ui-audit P1 reduction
7.8.3	fitme-story DS P2 Final Sweep — 3 Utility Classes, 5 P2s Closed, 5 Re-Deferred	2026-05-12	—	light	Final sweep on the 10 remaining deferred P2 items from the 2026-05-10 lens audit. Adds 3 utility classes to globals.css…	5 of 10 remaining (50%) — P2 items closed by this enhancement
7.8.3	UI/UX Final Sweep — iOS P1 to 0, fitme-story P2 to 93%, /case-studies browse-by-category	2026-05-12	—	light	Three-PR sweep in a single session. iOS reached 0 P1 findings (from baseline of 103 across three burndowns). fitme-stor…	14 → 0 (-100%) — iOS make ui-audit P1
7.8.3	HADF Phase 2-bis — Cross-Sub-Exp Synthesis (3 Sub-Experiments)	2026-05-27	—	light	Successor to HADF Phase 2 (slot 22b). Three pre-registered sub-experiments test orthogonal aspects of the HADF dispatch…	3 — Sub-experiments
7.8.2	fitme-story Website Design System — Public Showcase + Drift Detection + Heritage	2026-05-10	—	light	Single-session ship of the next-phase evolution of fitme-story's design system. 31-component typed manifest, public /de…	100% — Public Figma parity
7.8.2	iOS UI Audit P1 Burndown — 2-PR Token Substitution + Audit Hardening	2026-05-11	—	light	Single-session proactive burndown of the 103 P1 ui-audit findings the parent feature ui-audit-baseline-burndown deferre…	FT2 #292 + #294 — PRs merged
7.8.1	Framework v7.8.1 — Branch Isolation + Feature-Closure Completeness from One Full PM Cycle	2026-05-07	—	light	Two cooperating pre-commit gates shipped as one feature in advisory mode on 2026-05-07. BRANCH_ISOLATION_VIOLATION prev…	FT2 #244 + #245 + #246; fitme-story #53 — PRs merged
7.8.1	UCC Passkey Auth — Replacing basic-auth on the operator dashboard with WebAuthn	2026-05-07	—	light	Two-PR cross-repo ship that replaces shared HTTP basic-auth on /control-room/* with WebAuthn passkeys. Per-operator ide…	fitme-story #55 + FT2 #248 — PRs merged
7.8.1	Roadmap Stress Test — Compressing 10 Weeks of Backlog Through the v7.8.1 Protocol	2026-05-07	—	light	Single-session experiment measuring throughput + protocol overhead of the v7.8.1 framework against ~10 weeks of sequenc…	9 — Sub-features in scope
7.8	Bridge to v7.9 — How v7.8 Closed the v7.7 Silent-Pass	2026-05-03	—	light	v7.7 shipped a gate (CACHE_HITS_EMPTY_POST_V6) that ran on every commit but exercised data on 0 of 46 features — a text…	0/46 (silent-pass) — CACHE_HITS_EMPTY_POST_V6 effective coverage at v7.7 ship
7.8	Smart Reminders Behavioral Learning — PR-1 Shipped Across iOS + Backend	2026-05-04	—	light	Sub-feature of Smart Reminders. PR-1 shipped fully on 2026-05-04 in two halves — FT2 PR #190 (iOS data layer + Settings…	15 / 15 — PR-1 tasks: planned / complete
7.8	Unified Control Center — Migrating an Operator Dashboard Across Two Repos in 11 Days	2026-05-06	—	light	Retired the legacy Astro operator dashboard, migrated it inside the public showcase as a basic-auth-gated /control-room…	42 / 44 — Tasks done
7.8	Import Training Plan — Resume from Audit-Flagged Partial Ship to Full Phase 1 Ship in 14 Hours	2026-05-06	—	light	Resumed an audit-flagged partial-ship feature: rolled back to research mid-flight after discovering the original PRD cl…	4 — PRs landed
7.8	Push Notifications v2 — From v1 Partial-Ship to Platform-Layer Rebuild in a Single Day	2026-05-07	—	light	Reopened v1 push-notifications after audit UI-016 caught a substrate-built-but-never-wired partial-ship. Single-session…	FT2 #239 — PR landed
7.7	Case-Study Presentation Refactor — Locking Alt-A Across 25 Studies	2026-04-28	—	light	An 18-hour serial sprint locked a uniform presentation pattern across 25 case studies — every one now leads with a Summ…	25 of 25 — Case studies backfilled
7.7	Validity Closure — How v7.7 Closed the Last Closable Class B Gap	2026-04-28	—	light	v7.7 closes A1–A5 + B1–B2 + C1 from the post-v7.6 gap inventory: 5 new gates (4 write-time pre-commit hooks + 1 cycle-t…	2 of 9 — Post-v6 fully-adopted features
7.7	stats-v2 — Resume, Reconcile, and Ship Despite a Three-Layer Bug Stack	2026-04-30	—	light	stats-v2 was the first paused feature picked up after v7.7 shipped. A "small remaining tasks" job turned into a three-l…	~2.5 hours — Wall time (resume → ship)
7.7	When the Gate Read One Field and the Data Lived in Another	2026-05-01	—	light	The v7.7 case study claimed cache_hits was "100% gated on next write." A 2026-04-30 sweep found 43 of 46 state.json fil…	43 — state.json files migrated
7.7	HADF Phase 2 — Cloud Fingerprinting Measurement	2026-05-01	—	light	Pre-registered measurement experiment to test whether cloud inference endpoints cluster naturally by hardware class via…	0.5566 — Silhouette score (max across k=2..6)
7.6	Mechanical Enforcement — How v7.6 Closed the Class B Gap from Gemini's Audit	2026-04-25	—	light	v7.6 promoted four agent-attention checks into pre-commit failures and added two recurring CI defenses, closing the rem…	7 of 7 — Class B → A promotions
7.6	auth-polish-v2 — Three Workstreams Bundled, 18/18 Tasks Shipped	2026-05-01	—	light	auth-polish-v2 bundled three auth workstreams — forgot-password, biometric activation refinement, Google Sign-In SDK ac…	18 / 18 — PRD tasks: planned / shipped
7.1	UI-Audit Baseline Burndown — P0 27 → 0 and a Hard Gate	2026-04-24	—	light	The make ui-audit scanner had been running advisory-only since shipping — 27 P0 + 103 P1 across 44 files. This burndown…	27 → 0 — P0 findings
7.0	V7.0 HADF — Teaching the Framework to Detect Chip Architecture	2026-04-17	—	flagship	V7.0 HADF asks whether passive hardware fingerprinting can improve dispatch routing without requiring provider cooperat…	17 — Static chip profiles
6.1	185 Findings, 12 Critical — What a Full-System Audit Revealed	2026-04-18	—	light	The same AI that built the framework audited its own work across 22 features and 6 framework versions, using a 4-layer …	185 — Total findings surfaced
6.1	Building the Site That Tells the Story — A Two-Hour Meta-Build	2026-04-20	—	standard	PM framework built the website that hosts its own case studies. 37 commits over 2 hours of wall clock. 8 routes, 36 pre…	2h 2m — First commit → preview deploy
6.1	The Dual-Sync Race — Two Backends, One Last-Writer-Wins Silence	2026-04-18	—	light	v7.0 audit's top backend finding — two sync paths (CloudKit + Supabase) both pull on login with no merge coordination. …	13 / 3 — Sync findings / critical
6.1	The Stacked-PR Misfire — When "Merged" Didn't Mean "On Main"	2026-04-19	—	light	3 stacked PRs (M-2a → M-2c → M-2b). All marked "Merged" by GitHub. Main only got M-2a — downstream PRs merged into thei…	3 / 1 — PRs claimed merged / actually on main
6.1	The XCTWaiter Abort — Learning to Stop, Rollback, and Retry	2026-04-20	—	light	First M-series attempt-1 failure. XCTWaiter.wait(for: [a, b, c]) is wait-for-ALL, not wait-for-ANY — but the test code …	2 (attempt 1 aborted, attempt 2 passed) — Attempts to ship
6.0	When We Stopped Estimating and Started Measuring	2026-04-16	—	flagship	Through 16 features, every velocity claim rested on ±15–30 min wall-time estimates and narrative-inferred cache hit rat…	7 of 9 — Measurement DVs now deterministic
5.2	What Breaks When You Run 4 Features at Once — And How to Fix It	2026-04-15	—	flagship	A 4-feature parallel stress test exposed two bottlenecks (52% permission-routing denial, 23× agent variance) — context …	48% — Tool-use reduction with dispatch intelligence
5.2	From "Zero Conflicts by Luck" to "Zero Conflicts by Design"	2026-04-15	—	light	v5.1 stress test had 0 file conflicts across 15 same-file edits — by luck, not by design. v5.2 Parallel Write Safety (s…	15 across 3 files — Same-file edits without conflict
5.1	The Fastest Feature — 86% Velocity Improvement on Auth Flow	2026-04-13	—	flagship	Auth embedded into onboarding mid-flow (vs separate hub). Full PM lifecycle plus 3 design iterations in a single 100-mi…	86% (2.1 min/CU) — Velocity vs baseline
5.1	First Feature Under the New Architecture — AI Engine Adaptation	2026-04-13	—	flagship	First feature where 'how we build' and 'what we build' used the same architectural principles. Adapter protocol, valida…	45% — Cache hit rate (framework → product transfer)
5.1	Shipping 4 Features in 54 Minutes — The Parallel Stress Test	2026-04-14	—	flagship	4 features advanced through 8 lifecycle phases concurrently in 54 minutes. 0 build failures, 0 test failures, 0 merge c…	12.4× — Parallel throughput vs baseline
5.1	Smart Reminders — Six Reminder Types Designed and Shipped Inside a 12-Hour Stress Test	2026-04-20	—	light	Six reminder types, a reusable guest-lock overlay, and a frequency-cap engine — all designed, specified, implemented, a…	~12h — Wall-clock from init to complete
5.0	What If You Designed Software Like a Chip?	2026-04-12	—	flagship	7 hardware architecture principles applied to a PM framework — LoRA hot-swap, palettization, weight-stationary, UMA zer…	121,714 → 45,125 — Framework overhead tokens
5.0	SettingsView v2 — 1170 → 294 Lines via Phased Decomposition	2026-04-19	—	light	Closing audit finding UI-002: SettingsView.swift dropped 1170 → 294 lines (~75%) across 4 PRs (#122-#125) in a ~2-hour …	1170 → 294 — SettingsView.swift line count
5.0	Training Plan v2 — Biggest Surface in the App, Stress-Test for the v4.0 Cache	2026-04-10	—	light	The biggest surface in the app — 2,135 lines, 13 nested types, 32 audit findings — broken into 6 extracted views via th…	2,135 lines (largest in app) — v1 file size
4.4	Can You Test AI Output Quality the Same Way You Test Code?	2026-04-09	—	flagship	AI-output quality treated as a testable property. 2-layer eval design (golden I/O + heuristic XCTest evals + monitoring…	29 of 29 — Eval cases green on first run
4.4	The Most Complex Feature Completed at Refactor Speed	2026-04-10	—	light	First greenfield feature under v4.4 — new tab, new data model, 9 eval definitions, 5 views. Shipped in 2 hours end-to-e…	~2 hours — Wall time (research → Figma screen)
4.3	How 6 Screen Refactors Proved a 6.5x Speedup	2026-04-10	—	flagship	Six identical-scope screen refactors across four framework versions isolated framework improvement from practitioner le…	6.5× — Speedup across 6 refactors
2.0	The Pilot — Running the Full PM Lifecycle on Onboarding	2026-04-05	—	flagship	First feature run through the full 9-phase PM lifecycle. Retroactive UX/design-system alignment on a "finished" feature…	24 — Audit findings (Onboarding)
pre-v5.0	Home Today Screen v2 — The V2 Rule Pilot and Birth of Screen-Prefixed Analytics	2026-04-09	—	light	The second screen to go through a full UX Foundations alignment pass, and the first under the now-codified V2 Rule. Hom…	1029 → 703 lines (~32% reduction) — v1 → v2 line count
pre-v5.0	Backlog Roundup — Two Pre-Rule Features That Stay Roundup-Only	2026-04-20	—	light	Roundup of two features that pre-date the 2026-04-13 "every feature gets a case study" rule and have source material to…	2 (development-dashboard, ai-cohort-intelligence) — Features in this housekeeping roundup
pre-v5.0	Android Design System — A Documentation Deliverable That Skipped Six Phases on Purpose	2026-04-04	—	light	A 92-token iOS → Material Design 3 mapping that ships zero lines of Android code on purpose. Research + PRD execute nor…	92/92 — Tokens mapped (iOS → MD3)
pre-v5.0	GDPR Compliance — Two Hours End-to-End on a Legal-Blocker Feature	2026-04-04	—	light	The first feature in the project where kill criteria read "Legal requirement — cannot be killed." Shipped 8 files, +711…	2 hours — Wall time (init → complete)
pre-v5.0	Google Analytics — The Substrate Every Downstream Feature Depends On	2026-04-04	—	light	The feature that went from "11 shipped features, 40 defined metrics, zero analytics instrumentation" to a working GA4 p…	22 files / +1970 −39 — Files / lines (merge ac85c73)
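
Several headlines in the table pair a before/after count with a rounded percentage. The sketch below recomputes three of them from the counts given in the rows above; `percentReduction` and the `checks` array are illustrative names, not part of the site.

```typescript
// Percentage reduction from `before` to `after`, rounded to the nearest integer.
function percentReduction(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return Math.round(((before - after) / before) * 100);
}

// Counts copied from the table rows; `claimed` is the percentage each row states.
const checks = [
  { label: "iOS ui-audit P1 (44 -> 14)", before: 44, after: 14, claimed: 68 },
  { label: "SettingsView.swift lines (1170 -> 294)", before: 1170, after: 294, claimed: 75 },
  { label: "Home Today lines (1029 -> 703)", before: 1029, after: 703, claimed: 32 },
];

for (const c of checks) {
  const computed = percentReduction(c.before, c.after);
  const verdict = computed === c.claimed ? "matches" : "differs from";
  console.log(`${c.label}: ${computed}% ${verdict} the stated ${c.claimed}%`);
}
```

All three stated figures agree with the counts once rounded to the nearest percent.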

The Blank-Main Bug — Catching a Production SSR Regression After Promotingappendix
v—2026-04-20
Production site shipped 251 bytes of HTML per page across 48 routes. Suspense + useSearchParams in the root layout suspended the entire chi…
251 — Bytes per page across 48 routes
Watching the Framework Build the Site That's Replaying It — The DispatchReplay Componentappendix
v—2026-04-21
Static blueprint diagrams tell you what a system is, not whether it's running. DispatchReplay plays a recorded trace of a real feature flow…
2 (Sprint I + fitme-story meta) — Live traces
The Lego Metaphor — Designing the PM-Flow Ecosystem Pageappendix
v—2026-04-22
How to render an 11-skill, 10-phase, 15-data-file framework on a single page without it looking like an org chart. The answer: a Lego wall …
11 · 10 · 15 — Skills × phases × shared files
External Validation — Did Our Numbers Hold Up?appendix
v—2026-04-16
Independent review of the normalization model, velocity claims, and measurement methodology — confirming what is solid, flagging what is we…
5 (normalization, velocity, cache, complexity, baselines) — Methodology dimensions reviewed
What If We Had Measurement From Day One? — A Retrospective ROI Analysisappendix
v—2026-04-16
Counterfactual experiment: retroactively applying deterministic measurement infrastructure to all 24 features, then computing the cost, the…
24 — Features reanalysed under counterfactual
How We Normalized Complexity Across 16 Different Featuresappendix
v—2026-04-16
Raw metrics like wall time and file count are meaningless without normalization. The Complexity Unit (CU) model — additive factors for task…
16 — Features normalised under CU model
Garmin Health Connection (Tier 1) — Source attribution without a backendlight
v7.102026-06-12
A Garmin owner who installs FitMe gets a degraded readiness experience unless they already have "Garmin Connect → Apple Health" sync on — a…
0 / 0 / 0 — New backend / permissions / data egress
Control Room Live Feed — from build-time snapshot to fail-soft request-time datalight
v7.102026-06-16
The Unified Control Center was complete and deployed, but every panel rendered a build-time snapshot — data only refreshed on redeploy, so …
48 (16 probes / 16 data-source / 9 producer / 7 footer) — Unit tests across the live layer
Foundation Models Tier 3 — real on-device AI personalization, with an SDK-gated cloud-escalation pathlight
v7.102026-06-15
The AI engine's Tier-3 on-device personalization was a placeholder — a hardcoded confidence of 0.5 that never called a model, so the "on-de…
27 / 27 — Unit tests (incl. AB1–AB3 eval coverage)
T10 — AI Golden-Set Eval Harness (and why the FitMe AI is deterministic, not generative)light
v7.102026-06-10
T10 was spec'd as an "LLM golden-set eval harness — promptfoo-equivalent weekly run." Reading the actual code corrected the framing: the Fi…
24 — Golden cases
T3 — SignInService Passkey/WebAuthn Unit Tests (closing the highest-risk zero-coverage service)light
v7.102026-06-10
SignInService.swift (1244 lines) is on the CLAUDE.md high-risk list — it owns Apple Sign In, Google, Email/OTP, and passkey/WebAuthn — yet …
10 (deterministic, device-free) — Tests
T5 — Mock-Protocol Drift Detection (a central conformance registry, reframed for the real codebase)light
v7.102026-06-10
T5 was spec'd as "wrap MockKeychainStorage, MockSupabaseClient, StubAIEngineClient, CountingAIEngineClient in a shared MockValidation.swift…
6 (AnalyticsProvider, GoogleAuthProviding, AppleAuthProviding, EmailAuthProviding, URLSessionProtocol, AIInputAdapter) — Protocol anchors
T13 Per-Gate `last_failed_at` Index — distinguishing "stopped running" from "running + catching" (v7.10)light
v7.102026-06-10
F17 materialized "when did this gate last run?" per gate in O(1). But "ran" and "caught a violation" are different signals, and F17 only ha…
6 — Cycle-time gates populated with failure history
Android Token Pipeline — making the design system real (AND-1)light
v7.102026-06-17
The android-design-system feature shipped 2026-04-04 as research-only — an iOS→Material-Design-3 mapping doc, an adaptation strategy, and a…
3 — Generated artifacts
figma-design-architecture — honest closure of the design-system source-of-truth storylight
v7.102026-06-18
After the 2026-06-15 Figma-mirror rebuild pivoted the design system to "code is canonical; Figma is a manually-maintained mirror" (Code Con…
0/2 → 2/2 — Surfaces with a current, linked, no-false-claim arch doc
Contract-Fixture Consumer Adoption (E-15) — closing the W16 cross-repo silent-pass (v7.10)light
v7.102026-06-22
The /control-room/framework page threw a TypeError in production for 13 days with green CI the whole time: the FitTracker2 producer emitted…
13 days — W16 silent-pass duration in production
Funnel Analysis Dashboards — running the canonical funnels against live GA4 (v7.10)light
v7.102026-06-23
The backlog framed this (RICE 6.0) as "just needs GA4 funnel definitions wired" — but a Phase 0.1 reality-check found the 5 canonical funne…
3 of 5 — Funnels with computable live drop-off
F18 — Mutation Testing on the Gate Dispatchers (v7.10)light
v7.102026-06-26
The framework's enforcement rests on two ~80 KB dispatcher files. F14/F15/F16 built a 3-layer test suite for them, but a green suite can st…
1,857 — Mutants enumerated across both dispatchers
F16 Try-repo Pre-commit Harness — The 3rd Gate-Test Layer (v7.9.1)light
v7.9.12026-06-04
The framework gained a 3rd layer of gate testing — try-repo. Where unit tests catch wrong-regex bugs and F14 dispatch tests catch wrong-mai…
61 pass + 1 documented skip in 16.21s — Test count at ship
F17 Per-Gate `last_fired_at` Index — Derived Telemetry Materialization (v7.9.1)light
v7.9.12026-06-04
Mechanism A accumulated ~1800 gate-coverage.jsonl rows by v7.9. The planned v7.10 GATE_COVERAGE_ZERO meta-check would scan that stream for …
under 1s for 1828 rows (PRD budget under 2s) — Refresh wall-clock
F2 Phase 0 Reality-Check — Mechanical Defense Against Post-Squash-Merge State Drift (v7.9.1)light
v7.9.12026-06-04
The post-squash-merge state-drift pattern repeated 5 times in 4 days (2026-06-01 → 2026-06-04). Each time, state.json::tasks said pending b…
5 in 4 days (2026-06-01 to 2026-06-04) — Drift instances calibration baseline
Framework v7.9.1 — Single-day Build Window (8 ships, 14 PRs, 0 new gates)light
v7.9.12026-06-04
v7.9.1 was a single-day build window that opened at v7.9 Phase E exit (2026-06-04) and closed the same day. 8 ships landed across 14 PRs wi…
8 — Ships in the build window
HADF Signature Expansion — empirical-first, with a real on-device calibrationlight
v7.9.12026-06-05
"Expand HADF into new chip families" naturally reads as "add more spec-sheet rows" — but Phase 2-bis changed the answer: what HADF actually…
8 → 9 (real M4, n=80); target was ≥12 — Instrumented signatures
HADF Phase 3A — The Sensing Layer (detection-only observability over a validated dispatch signal)light
v7.9.12026-06-10
Phase 2-bis established that streaming TTFT/TPS signatures are real, provider-general, substrate-discriminating, and short-term-stable — a …
8 endpoints · 7197 valid records · min_n=50 · max_ttft 30s — Reference store
fitme-story Dual-Audience Redesign — a cookie-backed lens that re-narrates one site for two audienceslight
v7.9.12026-06-09
The public site served one undifferentiated narrative to two distinct audiences (developers and product managers). GA4 showed the home page…
76.2% (~3.5× content pages) — Home bounce rate (problem baseline)
T14 platforms_tested — Platform-test parity as a queryable state fieldlight
v7.9.12026-06-07
Before T14, state.json recorded that a feature shipped and that it had a case study — but not which platforms its tests actually exercised.…
94 — Complete features backfilled
Framework v7.9 — Promotion Release: 3 Advisory Gates → Enforced via Single-Flag Flip After 14-Day Calibrationlight
v7.92026-05-21
v7.9 shipped 2026-05-21 via FT2 PR #417 (ea53ff4). A single-line edit at scripts/check-state-schema.py:132 (BRANCH_ISOLATION_ADVISORY_MODE …
18 (Mode B) + 13 (Mode C) + 13 (FEATURE_CLOSURE_COMPLETENESS) = 44 — Mechanism A telemetry rows (14d)
When gh pr view Lies — The Stale-Base Branch Trap and a 33-Zombie Cleanuplight
v7.92026-05-25
Two drafts that looked clean (~6 files each per gh API) would have reverted 23 files / -2157 lines on merge — including the same session's …
2 (#484 + #488) — Draft PRs near-miss
Orchid — From Dispatch Patterns to Silicon, Validated by Measurement · standard
v7.9 · 2026-06-05
The capstone of the Orchid research arc: how a fitness app's PM-framework dispatch intelligence became a RISC-V accelerator design (Orchid …
25,305 (0 invariant violations) — Orchid v1 behavioral runs
Readiness-Aware Training Alert — A Second Observer for Today's Training-Day Decision Aid (C2) · light
v7.9 · 2026-06-01
C2 shipped 2026-06-01 in a single session as a 4-phase Enhancement on feature/readiness-aware-training-alert (merged via FT2 PR #560, squas…
5 (ReadinessAlertRecommendation + ReadinessAwareTrainingTrigger + TrainingStartTimeLearner + ReadinessAwareTrainingObserver + ReadinessAwareAlertStore) — New source files
Trend Alerts (HRV) — A Third Observer for the Multi-Day Sustained Pattern (C4) · light
v7.9 · 2026-06-01
C4 shipped 2026-06-01 in a single session as a full 9-phase Feature on feature/trend-alerts-hrv (FT2 PR #564). Adds the third notification …
6 (TrendAlertContext + TrendAlertTrigger + TrendAlertDispatchTimeLearner + TrendAlertObserver + TrendAlertStore + HRVTrendChart) — New source files
When Two Workflows Share a Name — The `${{ github.workflow }}` Concurrency Trap (W26) · light
v7.9 · 2026-06-01
Every mixed-content PR (touching both FitTracker source AND docs) produced two Build and Test status checks — one CANCELLED, one SUCCESS — …
4 (C2 #560, C4 #562 + #564, dev-guide #563 — all hit the cancellation race before the fix) — Affected PRs in the session
AI User Feedback Loop — Closing Audit UI-024 with a Per-User Reinforcement Cycle (C5) · light
v7.9 · 2026-06-02
C5 shipped 2026-06-01/02 in a single session as a full 5-phase Feature on feature/ai-user-feedback-loop (FT2 PR #572 merged 2026-06-02 ec5d…
UI-024 (FitTracker/Views/AI/AIInsightCard.swift:230-235; carried since PR #79 / 2026-04-10) — Closes deferred audit
Exercise Search/Filter — a discoverable read-only library on top of the 50-exercise catalog (C3) · light
v7.9 · 2026-06-02
C3 closes backlog L347 — the longest-running "no in-app library" complaint. The training catalog already carried ~50 exercises with rich me…
~50 (TrainingProgramData.allExercises) — Catalog exercises surfaced
Training Program Customization — custom splits on top of the fixed PPL (C6) · light
v7.9 · 2026-06-02
C6 replaces the fixed 6-day Push/Pull/Legs split with per-user custom training programs. Users pick from 4 starter templates (PPL 6-day / U…
~2152 — Net LoC (9 standalone-buildable commits)
UCC Passkey Auth Security Hardening — closing 4 deferred gaps before the v7.9 freeze · light
v7.8.6 · 2026-05-20
A single-day enhancement that closed 4 explicitly-deferred security gaps from the parent ucc-passkey-auth PRD: an email allowlist gate, a h…
4 (allowlist + hybrid lockout + token audit + unlock CLI) — Deferred gaps closed
Analytics Observability — closing the measurement-debt loop (taxonomy cleanup + a live GA4 path) · light
v7.8.5 · 2026-06-09
The project had shipped extensive analytics instrumentation but had no clean taxonomy and no live way to observe events landing — instrumen…
56 → 0 — Analytics taxonomy drift (CSV-missing-rows)
Cross-Repo State Sync (v7.8.3) — 10 PRs Across 2 Repos, 3-Attempt Cutover Ceremony, Reverse-Sync Infrastructure Live · light
v7.8.3 · 2026-05-11
v7.8.3 is the first FitMe PM framework release that spans two git repos as a single coordinated Feature. Shipped 2026-05-11 via 10 PRs (5 F…
10 total (FT2 #298-#302 + fitme-story #86-#90) — PRs shipped
Design System P2 Deferred — One Variant, One Migration, Ten Items Still Deferred · light
v7.8.3 · 2026-05-12
Tightly-scoped enhancement closing one P2 item (P2-044) from the eleven deferred at the end of fitme-story-design-system-p2-cleanup. Adds a…
1 of 11 deferred — P2 items closed by this enhancement
iOS UI-Audit P1 Drift Cleanup — 44 → 14 (Option B Respected, 68% Reduction) · light
v7.8.3 · 2026-05-12
Drift-cleanup follow-up to `ios-ui-audit-p1-burndown` (PRs #292+#294). Respects Option B's locked ceiling — tokenizes ONLY magic dims with …
44 → 14 (-68%) — ui-audit P1 reduction
fitme-story DS P2 Final Sweep — 3 Utility Classes, 5 P2s Closed, 5 Re-Deferred · light
v7.8.3 · 2026-05-12
Final sweep on the 10 remaining deferred P2 items from the 2026-05-10 lens audit. Adds 3 utility classes to globals.css (.term-label, .sect…
5 of 10 remaining (50%) — P2 items closed by this enhancement
UI/UX Final Sweep — iOS P1 to 0, fitme-story P2 to 93%, /case-studies browse-by-category · light
v7.8.3 · 2026-05-12
Three-PR sweep in a single session. iOS reached 0 P1 findings (from baseline of 103 across three burndowns). fitme-story closed 13 of 14 P2…
14 → 0 (-100%) — iOS make ui-audit P1
HADF Phase 2-bis — Cross-Sub-Exp Synthesis (3 Sub-Experiments) · light
v7.8.3 · 2026-05-27
Successor to HADF Phase 2 (slot 22b). Three pre-registered sub-experiments test orthogonal aspects of the HADF dispatch claim: Sub-exp 1 (c…
3 — Sub-experiments
fitme-story Website Design System — Public Showcase + Drift Detection + Heritage · light
v7.8.2 · 2026-05-10
Single-session ship of the next-phase evolution of fitme-story's design system. 31-component typed manifest, public /design-system Part 2 s…
100% — Public Figma parity
iOS UI Audit P1 Burndown — 2-PR Token Substitution + Audit Hardening · light
v7.8.2 · 2026-05-11
Single-session proactive burndown of the 103 P1 ui-audit findings the parent feature ui-audit-baseline-burndown deferred to fix-as-you-touc…
FT2 #292 + #294 — PRs merged
Framework v7.8.1 — Branch Isolation + Feature-Closure Completeness from One Full PM Cycle · light
v7.8.1 · 2026-05-07
Two cooperating pre-commit gates shipped as one feature in advisory mode on 2026-05-07. BRANCH_ISOLATION_VIOLATION prevents agents from mut…
FT2 #244 + #245 + #246; fitme-story #53 — PRs merged
UCC Passkey Auth — Replacing basic-auth on the operator dashboard with WebAuthn · light
v7.8.1 · 2026-05-07
Two-PR cross-repo ship that replaces shared HTTP basic-auth on /control-room/* with WebAuthn passkeys. Per-operator identity, per-device cr…
fitme-story #55 + FT2 #248 — PRs merged
Roadmap Stress Test — Compressing 10 Weeks of Backlog Through the v7.8.1 Protocol · light
v7.8.1 · 2026-05-07
Single-session experiment measuring throughput + protocol overhead of the v7.8.1 framework against ~10 weeks of sequenced backlog work. The…
9 — Sub-features in scope
Bridge to v7.9 — How v7.8 Closed the v7.7 Silent-Pass · light
v7.8 · 2026-05-03
v7.7 shipped a gate (CACHE_HITS_EMPTY_POST_V6) that ran on every commit but exercised data on 0 of 46 features — a textbook silent-pass. Th…
0/46 (silent-pass) — CACHE_HITS_EMPTY_POST_V6 effective coverage at v7.7 ship
Smart Reminders Behavioral Learning — PR-1 Shipped Across iOS + Backend · light
v7.8 · 2026-05-04
Sub-feature of Smart Reminders. PR-1 shipped fully on 2026-05-04 in two halves — FT2 PR #190 (iOS data layer + Settings toggle-off, squash …
15 / 15 — PR-1 tasks: planned / complete
Unified Control Center — Migrating an Operator Dashboard Across Two Repos in 11 Days · light
v7.8 · 2026-05-06
Retired the legacy Astro operator dashboard, migrated it inside the public showcase as a basic-auth-gated /control-room route, instrumented…
42 / 44 — Tasks done
Import Training Plan — Resume from Audit-Flagged Partial Ship to Full Phase 1 Ship in 14 Hours · light
v7.8 · 2026-05-06
Resumed an audit-flagged partial-ship feature: rolled back to research mid-flight after discovering the original PRD claimed an impossible …
4 — PRs landed
Push Notifications v2 — From v1 Partial-Ship to Platform-Layer Rebuild in a Single Day · light
v7.8 · 2026-05-07
Reopened v1 push-notifications after audit UI-016 caught a substrate-built-but-never-wired partial-ship. Single-session full PM cycle (Phas…
FT2 #239 — PR landed
Case-Study Presentation Refactor — Locking Alt-A Across 25 Studies · light
v7.7 · 2026-04-28
An 18-hour serial sprint locked a uniform presentation pattern across 25 case studies — every one now leads with a SummaryCard, a "how to r…
25 of 25 — Case studies backfilled
Validity Closure — How v7.7 Closed the Last Closable Class B Gap · light
v7.7 · 2026-04-28
v7.7 closes A1–A5 + B1–B2 + C1 from the post-v7.6 gap inventory: 5 new gates (4 write-time pre-commit hooks + 1 cycle-time check + 1 adviso…
2 of 9 — Post-v6 fully-adopted features
stats-v2 — Resume, Reconcile, and Ship Despite a Three-Layer Bug Stack · light
v7.7 · 2026-04-30
stats-v2 was the first paused feature picked up after v7.7 shipped. A "small remaining tasks" job turned into a three-layer reveal: the sta…
~2.5 hours — Wall time (resume → ship)
When the Gate Read One Field and the Data Lived in Another · light
v7.7 · 2026-05-01
The v7.7 case study claimed cache_hits was "100% gated on next write." A 2026-04-30 sweep found 43 of 46 state.json files used the legacy `…
43 — state.json files migrated
HADF Phase 2 — Cloud Fingerprinting Measurement · light
v7.7 · 2026-05-01
Pre-registered measurement experiment to test whether cloud inference endpoints cluster naturally by hardware class via TTFT/TPS alone. Pre…
0.5566 — Silhouette score (max across k=2..6)
Mechanical Enforcement — How v7.6 Closed the Class B Gap from Gemini's Audit · light
v7.6 · 2026-04-25
v7.6 promoted four agent-attention checks into pre-commit failures and added two recurring CI defenses, closing the remaining Class B → Cla…
7 of 7 — Class B → A promotions
auth-polish-v2 — Three Workstreams Bundled, 18/18 Tasks Shipped · light
v7.6 · 2026-05-01
auth-polish-v2 bundled three auth workstreams — forgot-password, biometric activation refinement, Google Sign-In SDK activation — into one …
18 / 18 — PRD tasks: planned / shipped
UI-Audit Baseline Burndown — P0 27 → 0 and a Hard Gate · light
v7.1 · 2026-04-24
The make ui-audit scanner had been running advisory-only since shipping — 27 P0 + 103 P1 across 44 files. This burndown migrated 12 view fi…
27 → 0 — P0 findings
V7.0 HADF — Teaching the Framework to Detect Chip Architecture · flagship
v7.0 · 2026-04-17
V7.0 HADF asks whether passive hardware fingerprinting can improve dispatch routing without requiring provider cooperation. 5-layer archite…
17 — Static chip profiles
185 Findings, 12 Critical — What a Full-System Audit Revealed · light
v6.1 · 2026-04-18
The same AI that built the framework audited its own work across 22 features and 6 framework versions, using a 4-layer methodology (paralle…
185 — Total findings surfaced
Building the Site That Tells the Story — A Two-Hour Meta-Build · standard
v6.1 · 2026-04-20
PM framework built the website that hosts its own case studies. 37 commits over 2 hours of wall clock. 8 routes, 36 pre-rendered pages, 12 …
2h 2m — First commit → preview deploy
The Dual-Sync Race — Two Backends, One Last-Writer-Wins Silence · light
v6.1 · 2026-04-18
v7.0 audit's top backend finding — two sync paths (CloudKit + Supabase) both pull on login with no merge coordination. Last writer wins. No…
13 / 3 — Sync findings / critical
The Stacked-PR Misfire — When "Merged" Didn't Mean "On Main" · light
v6.1 · 2026-04-19
3 stacked PRs (M-2a → M-2c → M-2b). All marked "Merged" by GitHub. Main only got M-2a — downstream PRs merged into their stacked parents, n…
3 / 1 — PRs claimed merged / actually on main
The XCTWaiter Abort — Learning to Stop, Rollback, and Retry · light
v6.1 · 2026-04-20
First M-series attempt-1 failure. XCTWaiter.wait(for: [a, b, c]) is wait-for-ALL, not wait-for-ANY — but the test code reads like 'any'. Ap…
2 (attempt 1 aborted, attempt 2 passed) — Attempts to ship
When We Stopped Estimating and Started Measuring · flagship
v6.0 · 2026-04-16
Through 16 features, every velocity claim rested on ±15–30 min wall-time estimates and narrative-inferred cache hit rates. v6.0 instrumente…
7 of 9 — Measurement DVs now deterministic
What Breaks When You Run 4 Features at Once — And How to Fix It · flagship
v5.2 · 2026-04-15
A 4-feature parallel stress test exposed two bottlenecks (52% permission-routing denial, 23× agent variance) — context window pressure was …
48% — Tool-use reduction with dispatch intelligence
From "Zero Conflicts by Luck" to "Zero Conflicts by Design" · light
v5.2 · 2026-04-15
v5.1 stress test had 0 file conflicts across 15 same-file edits — by luck, not by design. v5.2 Parallel Write Safety (snapshot/rollback + r…
15 across 3 files — Same-file edits without conflict
The Fastest Feature — 86% Velocity Improvement on Auth Flow · flagship
v5.1 · 2026-04-13
Auth embedded into onboarding mid-flow (vs separate hub). Full PM lifecycle plus 3 design iterations in a single 100-min session — first fe…
86% (2.1 min/CU) — Velocity vs baseline
First Feature Under the New Architecture — AI Engine Adaptation · flagship
v5.1 · 2026-04-13
First feature where 'how we build' and 'what we build' used the same architectural principles. Adapter protocol, validation gate, analytics…
45% — Cache hit rate (framework → product transfer)
Shipping 4 Features in 54 Minutes — The Parallel Stress Test · flagship
v5.1 · 2026-04-14
4 features advanced through 8 lifecycle phases concurrently in 54 minutes. 0 build failures, 0 test failures, 0 merge conflicts across 31 s…
12.4× — Parallel throughput vs baseline
Smart Reminders — Six Reminder Types Designed and Shipped Inside a 12-Hour Stress Test · light
v5.1 · 2026-04-20
Six reminder types, a reusable guest-lock overlay, and a frequency-cap engine — all designed, specified, implemented, and test-covered duri…
~12h — Wall-clock from init to complete
What If You Designed Software Like a Chip? · flagship
v5.0 · 2026-04-12
7 hardware architecture principles applied to a PM framework — LoRA hot-swap, palettization, weight-stationary, UMA zero-copy, mixed precis…
121,714 → 45,125 — Framework overhead tokens
SettingsView v2 — 1170 → 294 Lines via Phased Decomposition · light
v5.0 · 2026-04-19
Closing audit finding UI-002: SettingsView.swift dropped 1170 → 294 lines (~75%) across 4 PRs (#122-#125) in a ~2-hour single-session decom…
1170 → 294 — SettingsView.swift line count
Training Plan v2 — Biggest Surface in the App, Stress-Test for the v4.0 Cache · light
v5.0 · 2026-04-10
The biggest surface in the app — 2,135 lines, 13 nested types, 32 audit findings — broken into 6 extracted views via the V2 Rule. First v2 …
2,135 lines (largest in app) — v1 file size
Can You Test AI Output Quality the Same Way You Test Code? · flagship
v4.4 · 2026-04-09
AI-output quality treated as a testable property. 2-layer eval design (golden I/O + heuristic XCTest evals + monitoring schema) and a lifec…
29 of 29 — Eval cases green on first run
The Most Complex Feature Completed at Refactor Speed · light
v4.4 · 2026-04-10
First greenfield feature under v4.4 — new tab, new data model, 9 eval definitions, 5 views. Shipped in 2 hours end-to-end. Stress-tested wh…
~2 hours — Wall time (research → Figma screen)
How 6 Screen Refactors Proved a 6.5x Speedup · flagship
v4.3 · 2026-04-10
Six identical-scope screen refactors across four framework versions isolated framework improvement from practitioner learning. Wall time fe…
6.5× — Speedup across 6 refactors
The Pilot — Running the Full PM Lifecycle on Onboarding · flagship
v2.0 · 2026-04-05
First feature run through the full 9-phase PM lifecycle. Retroactive UX/design-system alignment on a "finished" feature surfaced 24 finding…
24 — Audit findings (Onboarding)
Home Today Screen v2 — The V2 Rule Pilot and Birth of Screen-Prefixed Analytics · light
pre-v5.0 · 2026-04-09
The second screen to go through a full UX Foundations alignment pass, and the first under the now-codified V2 Rule. Home v2 produced two pr…
1029 → 703 lines (~32% reduction) — v1 → v2 line count
Backlog Roundup — Two Pre-Rule Features That Stay Roundup-Only · light
pre-v5.0 · 2026-04-20
Roundup of two features that pre-date the 2026-04-13 "every feature gets a case study" rule and have source material too thin to support a …
2 (development-dashboard, ai-cohort-intelligence) — Features in this housekeeping roundup
Android Design System — A Documentation Deliverable That Skipped Six Phases on Purpose · light
pre-v5.0 · 2026-04-04
A 92-token iOS → Material Design 3 mapping that ships zero lines of Android code on purpose. Research + PRD execute normally; Tasks / UX / …
92/92 — Tokens mapped (iOS → MD3)
GDPR Compliance — Two Hours End-to-End on a Legal-Blocker Feature · light
pre-v5.0 · 2026-04-04
The first feature in the project where kill criteria read "Legal requirement — cannot be killed." Shipped 8 files, +711 lines, full 10-phas…
2 hours — Wall time (init → complete)
Google Analytics — The Substrate Every Downstream Feature Depends On · light
pre-v5.0 · 2026-04-04
The feature that went from "11 shipped features, 40 defined metrics, zero analytics instrumentation" to a working GA4 pipeline with protoco…
22 files / +1970 −39 — Files / lines (merge ac85c73)