v7.8.2 · 5 min read

iOS UI Audit P1 Burndown — 2-PR Token Substitution + Audit Hardening

Summary card · 60-second read
Version
v7.8.2
Date
2026-05-11
Tier
light

Single-session proactive burndown of the 103 P1 ui-audit findings that the parent feature ui-audit-baseline-burndown deferred to fix-as-you-touch. Frequency analysis revealed 4 high-recurrence magic frame values (44, 28, 80, 36) that warranted tokens. Two cross-cutting PRs landed. PR-1 added 4 AppSize tokens and 50 frame mass-substitutions across 21 views. PR-2 added 3 AppText tokens, 23 font mass-substitutions across 8 views, and 4 explicit a11y labels, and widened the DS-A11Y-BUTTON audit window from 20 to 60 lines after two false-positives in StatsView and SupplementItemRow showed that the heuristic missed labels placed after complex multi-line label blocks. Combined: ui-audit P1 103 → 44 (-59, 57% reduction), with 0 font and 0 a11y findings remaining. The original 5-PR area-bucketed plan was replaced by a 2-PR cross-area structure after the frequency analysis, which cut wall time to ~3h against an estimated ~10h. Option B strategy locked via /AskUserQuestion: high-frequency tokens only, with the 44 long-tail P1s accepted as documented design-specific values.

How to read this case study: T1/T2/T3 · ledger · kill criterion

  • T1 (Instrumented): Numbers come from a machine-generated ledger or commit. Reproducible. Highest reader trust.
  • T2 (Declared): Numbers stated by a structured declaration (PRD, plan, frontmatter) but not directly measured.
  • T3 (Narrative): Estimates and observations from session memory. Useful for context; not citable as evidence.
  • Ledger: Where to verify the claim: a file path, GitHub issue, or backlog entry. Anything labelled ledger: is the audit trail.
  • Kill criterion: The pre-registered threshold under which this work would have been killed mid-flight. Not fired = work shipped without hitting the threshold.
  • Deferred: Items intentionally not closed in this version. Each cites the ledger that tracks remaining work.

Visual aid · key numbers at a glance

  • PRs merged (T1): FT2 #292 + #294
  • ui-audit P1 reduction (T1): 103 → 44
  • P0 ui-audit findings (T1): 0 / 0
  • DS-RAW-FONT-SHORTHAND closed (T1): 26 / 26 (100%)
  • DS-A11Y-BUTTON closed (T1): 6 / 6 (100%)
  • New design tokens added (T1): 7 (4 AppSize + 3 AppText)
  • SwiftUI views touched (T1): 29 (21 in PR-1 + 8 in PR-2)
  • Mass substitutions (T1): 73 total (50 frame + 23 font)
  • Operator spot-checks (T2): 2 / 2
  • Kill criteria triggered (T1): No
  • Audit window widening (T1): 20 → 60

Context

The parent feature ui-audit-baseline-burndown shipped 2026-04-24 with the make ui-audit scanner promoted to a hard verify-local gate. At that moment the codebase sat at 0 P0 + 103 P1 across 42 files. The P0=0 gate ensures no NEW P0 ever lands; the 103 P1 findings stayed under "fix-as-you-touch" — addressed only when a PR happened to touch the file.
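The gate behaviour described above can be sketched as follows. This is a minimal illustration, not the actual scanner: the finding record shape, the function name gate_exit_code, and the rule name EXAMPLE-P0-RULE are all hypothetical, since the real output format of make ui-audit is not shown in this case study.

```python
from collections import Counter

def gate_exit_code(findings):
    """Return 0 when the gate passes, 1 when any P0 finding is present."""
    by_severity = Counter(f["severity"] for f in findings)
    return 1 if by_severity["P0"] > 0 else 0

# Hypothetical finding records; the real scanner's output format is an assumption.
findings = [
    {"rule": "DS-MAGIC-FRAME", "severity": "P1"},
    {"rule": "DS-RAW-FONT-SHORTHAND", "severity": "P1"},
]

# P1 findings are counted but never block the gate.
p1_only = gate_exit_code(findings)

# A single P0 finding is enough to fail it.
with_p0 = gate_exit_code(findings + [{"rule": "EXAMPLE-P0-RULE", "severity": "P0"}])
```

Under this model a growing P1 count never fails verify-local, which is consistent with the drift described in the next paragraph.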

By 2026-05-11 the P1 count had drifted to 108 (+5 from baseline) despite the fix-as-you-touch policy. This enhancement shipped a proactive burndown instead of waiting for organic touches.

The strategy decision

The initial plan (drafted 2026-05-10 at session pause) was 5 area-bucketed PRs, on the assumption that the work was "lightweight token substitution." When the session resumed the next morning, a frequency analysis of the 62 DS-MAGIC-FRAME findings revealed:

| Value | Occurrences | Pattern |
|---|---|---|
| 44 | 36 | iOS HIG tap target |
| 28 | 16 | icon container (distinct from iconBadge=26) |
| 80 | 13 | inline numeric field width |
| 36 | 7 | compact tap target |
| 23 other | 1–4 each | unique design-specific (chart 180, modal 260, divider 50, etc.) |

Tokenizing every magic number would force the invention of single-use tokens, which inflates the design system. The honest move: add 4 tokens for genuinely shared patterns, mass-substitute, and accept the long tail as documented design-specific P1s.
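The split between shared values and the long tail can be sketched as a simple frequency cut. This is an illustration only: the function name split_by_frequency and the threshold of 5 are assumptions (the case study states the counts, not the cutoff), and the long-tail counts for 180, 260, and 50 are placeholders within the stated 1–4 range.

```python
from collections import Counter

def split_by_frequency(values, min_occurrences):
    """Partition observed literals into token candidates and long-tail values."""
    counts = Counter(values)
    shared = {v: n for v, n in counts.items() if n >= min_occurrences}
    long_tail = {v: n for v, n in counts.items() if n < min_occurrences}
    return shared, long_tail

# The four high-recurrence counts come from the table above.
# The counts for 180, 260, and 50 are placeholders, not measured values.
observed = [44] * 36 + [28] * 16 + [80] * 13 + [36] * 7 + [180] * 2 + [260] + [50]

# min_occurrences=5 is an assumed cutoff that separates 7 from "1-4 each".
shared, long_tail = split_by_frequency(observed, min_occurrences=5)

print(sorted(shared))     # [28, 36, 44, 80]
print(sorted(long_tail))  # [50, 180, 260]
```

Any cutoff between 5 and 7 yields the same four token candidates for the counts in the table.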

Decision matrix locked via /AskUserQuestion 2026-05-11:

  • Strategy: Option B (Tokens for high-frequency only, ~78% target reduction)
  • PR structure: 2 PRs cross-area (tokens-first vs area-first)
  • Risk tolerance: Strict (operator iOS simulator spot-check per PR before merge)

This replaced the original 5-PR plan with 2 cross-cutting PRs. Compared with the area-bucketed alternative, the cross-cutting structure meant less review burden, a cleaner git history, and a shorter wall time (~3h vs estimated ~10h).

What shipped

PR #292 — PR-1: Frame tokens + mass substitution (squash 953908b)

  • 4 new AppSize tokens in FitTracker/Services/AppTheme.swift:
    • AppSize.tapTarget = 44 (iOS HIG minimum)
    • AppSize.tapTargetCompact = 36
    • AppSize.iconContainer = 28 (distinct from existing iconBadge = 26)
    • AppSize.fieldWidthCompact = 80
  • scripts/ui-audit.py APP_SIZE_VALUES allowlist extended to {28, 36, 44, 80}
  • 50 mass-substitutions of .frame(width: N, height: N) literals across 21 SwiftUI views
  • HISTORICAL v1 files automatically excluded (script-level skip)

Verification (T1): xcodebuild build PASSED; make ui-audit P0=0 maintained; P1 baseline 103 → 72 (-31).
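A mass substitution of this kind can be sketched with two regular expressions. The token names and values below come from the PR-1 list; everything else (the function name, the regex approach, and the sample Swift lines) is an assumption for illustration, and the case study does not say how the substitutions were actually performed.

```python
import re

# Token names and values as listed for PR-1.
TOKENS = {
    44: "AppSize.tapTarget",
    36: "AppSize.tapTargetCompact",
    28: "AppSize.iconContainer",
    80: "AppSize.fieldWidthCompact",
}

# A .frame(...) call whose argument list contains no nested parentheses.
FRAME_CALL = re.compile(r"\.frame\(([^()]*)\)")
# An integer width/height argument; values such as 44.5 are deliberately skipped.
DIMENSION = re.compile(r"\b(width|height):\s*(\d+)\b(?!\.)")

def substitute_frame_tokens(source):
    """Replace tokenized frame literals; return (new_source, replacement_count)."""
    replaced = 0

    def swap_dimension(match):
        nonlocal replaced
        token = TOKENS.get(int(match.group(2)))
        if token is None:
            return match.group(0)  # long-tail value: leave the literal alone
        replaced += 1
        return f"{match.group(1)}: {token}"

    def swap_call(match):
        return ".frame(" + DIMENSION.sub(swap_dimension, match.group(1)) + ")"

    return FRAME_CALL.sub(swap_call, source), replaced

# Hypothetical Swift lines, not taken from the FitTracker codebase.
sample = "\n".join([
    'Image(systemName: "plus").frame(width: 44, height: 44)',
    "ChartView().frame(height: 180)",
    'TextField("", text: $grams).frame(width: 80)',
])
updated, replaced = substitute_frame_tokens(sample)
print(replaced)  # 3
```

The chart height of 180 is left untouched because it has no token, which mirrors the decision to keep long-tail values as literals.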

PR #294 — PR-2: AppText tokens + a11y labels + audit-window widen (squash c0759f9)

  • 3 new AppText tokens in AppTheme.swift:
    • AppText.subheadingStrong (subheadline / rounded / semibold)
    • AppText.captionMicro (caption2 / rounded)
    • AppText.captionMicroMedium (caption2 / rounded / medium)
  • 23 font shorthand substitutions across 8 SwiftUI views (Auth × 2, Nutrition Tabs × 3, Onboarding × 1, Shared × 2)
  • 4 explicit .accessibilityLabel(...) modifiers added (AIFeedbackView × 2 thumbs, AIIntelligenceSheet dismiss, SignInView error dismiss)
  • DS-A11Y-BUTTON audit window widened 20 → 60 lines in scripts/ui-audit.py

The widening was prompted by two findings that the audit flagged in already-correct code: Stats/v2/StatsView.metricChip (a11y label at line 271, 40 lines past Button) and Nutrition/Components/SupplementItemRow.supplementToggle (comprehensive a11y chain 58 lines past Button). Both labels sat outside the 20-line window, so the heuristic reported false-positives. A 60-line window covers complex multi-line label blocks without significant cross-button false-negative risk.
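The effect of the window size can be sketched with a simplified look-ahead check. This is an assumed model of the heuristic, not the code in scripts/ui-audit.py: the function name, the plain substring matching, and the inclusive window are all illustrative choices.

```python
def unlabeled_buttons(lines, window):
    """Return 1-based line numbers of Button( occurrences that have no
    .accessibilityLabel( on the same line or within `window` lines after it."""
    flagged = []
    for index, line in enumerate(lines):
        if "Button(" not in line:
            continue
        lookahead = lines[index : index + window + 1]
        if not any(".accessibilityLabel(" in candidate for candidate in lookahead):
            flagged.append(index + 1)
    return flagged

# Hypothetical view body: a Button whose label modifier sits 40 lines later,
# matching the distance reported for StatsView.metricChip.
source_lines = (
    ["Button(action: toggle) {"]
    + ["    // label content"] * 39
    + ['.accessibilityLabel("Toggle metric")']
)

print(unlabeled_buttons(source_lines, window=20))  # [1]  flagged: label is out of range
print(unlabeled_buttons(source_lines, window=60))  # []   label found within the window
```

The trade-off is visible in the same sketch: a larger window can also pick up a label that belongs to the next Button, which is the cross-button false-negative risk mentioned above.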

Verification (T1): xcodebuild build PASSED; combined PR-1+PR-2 P1 baseline 103 → 44 (-59, 57% reduction).

Combined outcome

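| Metric | Baseline (2026-04-24) | Pre-burndown drift (2026-05-11) | Post-burndown | Reduction |
|---|---|---|---|---|
| P0 ui-audit findings | 0 | 0 | 0 | maintained |
| P1 ui-audit findings | 103 | 108 | 44 | -59 (57%) |
| Files with findings | 42 | | 26 | -16 |
| DS-MAGIC-FRAME | 62 | | 40 | -22 |
| DS-RAW-FONT-SHORTHAND | 26 | | 0 | -26 (100%) |
| DS-A11Y-BUTTON | 6 | | 0 | -6 (100%) |
| DS-MAGIC-PADDING | 6 | | 4 | -2 |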

All numbers T1 (instrumented via make ui-audit-baseline).
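The headline figures can be re-derived from the per-rule counts as a quick consistency check. The numbers are the ones reported in this case study; the variable names are illustrative.

```python
baseline_p1 = 103   # P1 count at the 2026-04-24 baseline
drifted_p1 = 108    # P1 count on 2026-05-11, before the burndown

# Post-burndown counts per rule, as reported above.
post_by_rule = {
    "DS-MAGIC-FRAME": 40,
    "DS-RAW-FONT-SHORTHAND": 0,
    "DS-A11Y-BUTTON": 0,
    "DS-MAGIC-PADDING": 4,
}

post_p1 = sum(post_by_rule.values())
closed_vs_baseline = baseline_p1 - post_p1
percent_vs_baseline = round(100 * closed_vs_baseline / baseline_p1)

print(post_p1)              # 44
print(closed_vs_baseline)   # 59
print(percent_vs_baseline)  # 57

# Measured against the drifted count instead of the baseline, the drop is larger.
closed_vs_drift = drifted_p1 - post_p1
print(closed_vs_drift)      # 64
```

The reported reduction of -59 (57%) is therefore measured against the 103 baseline, not against the drifted count of 108.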

The 44 long-tail P1s — by design, not omission

The remaining 40 DS-MAGIC-FRAME and 4 DS-MAGIC-PADDING findings are unique design-specific values that don't warrant single-use tokens. Examples: chart container height 180 (chart-specific; another chart legitimately uses 240), modal sheet height 260 (sheet height is a design decision, not a reusable value), divider visual 50 (single-use accent geometry).

These stay as honest P1s under CLAUDE.md fix-as-you-touch. Future PRs that touch these files SHOULD clear them as part of the change — but the audit doesn't block on them, and the design system doesn't grow to accommodate them.

Risk handling

Kill criterion (locked at task approval): Net visual regression on a critical user-path screen → revert the PR; investigate token; consider whether the audit rule's recommended token is actually wrong for that context.

Resolution (post-spot-check): not_triggered_no_visual_regression. All token substitutions are visually identical at the pixel level (same SwiftUI value, just named via AppSize.* / AppText.*). Operator iOS simulator spot-check on each PR before merge confirmed zero drift.

Audit-script hardening (incidental discovery)

PR-2's DS-A11Y-BUTTON closure surfaced two legitimate-but-flagged findings in already-correct code (StatsView.metricChip + SupplementItemRow.supplementToggle). Widening the 20-line audit window to 60 fixed both without significant cross-button false-negative risk. Honest framing: widening the audit window is not "the audit becoming weaker" — the heuristic produced false-positives that would have caused authors to add redundant inline labels (genuinely worse a11y). The widened window improves accuracy on complex multi-line label blocks.

Cross-references