
Skill bloat: SKILL.md eats ~22k tokens per session — split into core + lazy-loaded references #39

@unni-facets

Problem

The embedded internal/app/skill/SKILL.md is ~22k tokens (1939 lines). Because Claude Code loads the full skill body on activation, every flow-bound session pays this cost up front — before the user has typed anything.

Measured on a fresh flow do --here <slug> bootstrap (cl100k_base, ballpark):

Source                                                                               Tokens
flow SKILL.md                                                                        ~22,413
Other (CLAUDE.md, project/task briefs, harness reminders, tool boilerplate)          ~12k
Baseline before first user prompt                                                    ~34k+

In practice, the user saw context jump to ~90k tokens immediately on session start, well above the ~34k measured baseline. Of everything loaded up front, the skill is the single biggest controllable contributor.

Most of SKILL.md is workflow detail that only fires for specific intents:

  • §4.2 add-task interview (~200 lines)
  • §4.7 mark done (~94 lines)
  • §4.11 scope-creep detection (~88 lines)
  • §4.13 playbook run (~140 lines)
  • §7 brief format (~110 lines)
  • §8 anti-patterns (~92 lines)
  • §9 bootstrap contract (~85 lines)
  • …and more

A typical session triggers 0–2 of these. The rest is dead weight in context.

Proposed fix

Split SKILL.md into a small core + a references/ subdir that the model reads on demand (Read tool), following the same pattern superpowers and other skills already use.

Target layout

internal/app/skill/
  SKILL.md                          # core: intent triage, command cheat-sheet,
                                    # §4.10 scoop mode, §11 dispatch, pointer table
  references/
    add-task-interview.md           # §4.2
    add-project.md                  # §4.3
    work-dir.md                     # §6
    waiting.md                      # §4.6
    done.md                         # §4.7
    archive.md                      # §4.8
    weekly-review.md                # §4.9
    scope-creep.md                  # §4.11
    playbooks.md                    # §4.12 + §4.13
    substantive-unrelated.md        # §4.14
    upgrade.md                      # §4.15
    tagging.md                      # §4.16a
    bind-session.md                 # §4.16
    brief-format.md                 # §7
    anti-patterns.md                # §8
    bootstrap-contract.md           # §9

Core SKILL.md keeps only what fires every session: intent triage, the command cheat-sheet, scoop mode (§4.10, which is common), dispatch routing (§11), and a pointer table of the form "for workflow X, read references/X.md".

Code changes

  1. internal/app/skill.go:12 — change //go:embed skill/SKILL.md to //go:embed all:skill and store the embed as embed.FS instead of []byte.
  2. skillInstall — walk the embedded FS and write each file under ~/.claude/skills/flow/, preserving the references/ subdir.
  3. maybeAutoUpgradeSkill — same walk on refresh. Before writing, remove the existing skill dir so stale single-file installs (and any obsolete reference files) get cleaned up.
  4. internal/app/skill_test.go — assert references/ is installed and that auto-upgrade replaces the whole tree, not just SKILL.md.

Expected impact

  • Core SKILL.md target: ~5–6k tokens (down from ~22k).
  • Net saving: ~16k tokens per session.
  • References pulled only when their workflow triggers — usually 0–2 reads per session, each cheap relative to the baseline.
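
As a sanity check on these figures, the saving can be worked out from the measured ~22,413 tokens and the 5–6k target. The snippet below is plain arithmetic on numbers quoted in this issue; tokenRange, savings, and describe are illustrative names, and the bounds 5,000 and 6,000 are simply the two ends of the stated target.

```go
package main

import "fmt"

// tokenRange is an inclusive low/high estimate, in tokens.
type tokenRange struct{ low, high int }

// savings returns how many tokens are freed when a skill body of size
// current is replaced by a core whose size falls somewhere within core.
func savings(current int, core tokenRange) tokenRange {
	// The largest core gives the smallest saving, and vice versa.
	return tokenRange{low: current - core.high, high: current - core.low}
}

// describe formats a range for display.
func describe(r tokenRange) string {
	return fmt.Sprintf("%d-%d tokens", r.low, r.high)
}
```

With the measured size, describe(savings(22413, tokenRange{5000, 6000})) yields "16413-17413 tokens", so the ~16k figure sits at the conservative end of the range. This ignores the cost of any reference files read later in the session.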

Trade-offs

  • Reduced cohesion: workflow detail lives across multiple files. Mitigation: clear pointer table at the top of core SKILL.md so the model knows exactly which reference to load for each intent.
  • Auto-upgrade must handle migration from the old single-file install. Detect via missing references/ dir, remove the old install, write the new tree.
  • One extra Read call per workflow hit, which is negligible against the current ~22k-token up-front cost.

Suggested rollout

  1. Land the embed/install/upgrade code changes first with the current SKILL.md still in one piece (no behavior change, just embed.FS plumbing + tests).
  2. Then split the content in a follow-up PR so the diff is reviewable as a pure content move.
