Commit 6219fa2
refactor(wiki): replace closed enums with data-driven axis registry (#28)
User direction (2026-05-12): "having only 4 values for each is a band-aid
fix, we should be able to manage anything using regex and recognition."
The Phase 1 implementation (#27, merged) shipped with hardcoded
``frozenset`` constants for kinds, lifecycles, audiences, and provenances.
That violated the ADR-2244 design intent — extensibility. This refactor
replaces every closed enum with an open-world registry.
What changes
------------
**New module: ``mcp_server/core/wiki_axis_registry.py``.**
- ``AxisValue`` dataclass: name, axis, display_name, ``patterns`` (compiled
regex), ``tag_aliases``, ``default``, ``requires_generator``,
``applies_to_kinds``, free-form ``description``.
- ``AxisRegistry``: indexed lookup, registered-names listing, default
resolver per axis.
- ``build_default_registry()``: pure-function bootstrap. Seeds the 8
kinds, 9 lifecycles (5 universal + 4 ADR-specific), 5 audiences, 4
provenances with their detection patterns and tag aliases.
- ``load_axis_registry(wiki_root)``: scans ``wiki/_schema/<axis>/<name>.md``
files and merges them with the defaults. User files override defaults
of the same name. Malformed files are skipped and never raise.
- ``match_axis(content, tags, axis, registry, restrict_to_kind=...)``:
returns the value names whose ``patterns`` or ``tag_aliases`` hit.
Lifecycle filtering enforces ``applies_to_kinds`` (ADRs use only
proposed/accepted/rejected/superseded; non-ADRs use only universal
lifecycle values).
- ``did_you_mean(axis, unknown, registry)``: runs
``difflib.get_close_matches`` over the registered names. Validators
surface the results as suggestions.
- ``get_registry()`` / ``reset_registry()``: lazy process-wide singleton
cached against ``WIKI_ROOT``.
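
A minimal sketch of the matching step described above, not the shipped code. The ``AxisValue`` here is trimmed to the fields the example needs, ``match_axis`` takes a plain list instead of an ``AxisRegistry``, and the scoping rule (kind-specific values win, otherwise only values with an empty ``applies_to_kinds`` apply) is an assumption about how ``applies_to_kinds`` might be encoded. The ``draft`` lifecycle and all patterns are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisValue:
    """Hypothetical, trimmed-down stand-in for the registry entry."""
    name: str
    axis: str
    patterns: tuple = ()                       # compiled regexes
    tag_aliases: frozenset = frozenset()       # lowercase tag names
    applies_to_kinds: frozenset = frozenset()  # empty = universal (assumption)


def match_axis(content, tags, axis, values, restrict_to_kind=None):
    """Return names of values on `axis` whose patterns or tag aliases hit."""
    candidates = [v for v in values if v.axis == axis]
    if restrict_to_kind is not None:
        scoped = [v for v in candidates if restrict_to_kind in v.applies_to_kinds]
        # Assumed rule: kind-specific values win; otherwise only universal ones apply.
        candidates = scoped or [v for v in candidates if not v.applies_to_kinds]
    tag_set = {t.lower() for t in tags}
    return [
        v.name
        for v in candidates
        if tag_set & v.tag_aliases or any(p.search(content) for p in v.patterns)
    ]


# Illustrative values only; the real seeds live in the registry module.
VALUES = [
    AxisValue("draft", "lifecycle", patterns=(re.compile(r"\bdraft\b", re.I),)),
    AxisValue(
        "accepted",
        "lifecycle",
        patterns=(re.compile(r"status:\s*accepted", re.I),),
        applies_to_kinds=frozenset({"adr"}),
    ),
    AxisValue("runbook", "kind", tag_aliases=frozenset({"ops", "oncall"})),
]

TEXT = "Status: accepted. Earlier draft attached."
adr_hits = match_axis(TEXT, [], "lifecycle", VALUES, restrict_to_kind="adr")
howto_hits = match_axis(TEXT, [], "lifecycle", VALUES, restrict_to_kind="howto")
tag_hits = match_axis("", ["Ops"], "kind", VALUES)
```

Under these assumptions the same text yields a different lifecycle depending on ``restrict_to_kind``, which is the behavior the lifecycle filter is meant to enforce.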
**Refactor: ``mcp_server/shared/wiki_classification.py``.**
- ``Classification.validate()`` consults the registry rather than checking
membership against hardcoded frozensets.
- Unknown values raise ``ValueError`` with the closest matches in the
error message and the path of the file the user can write to register
the value (e.g. ``wiki/_schema/audiences/data-scientist.md``).
Implements the "reject + suggest" policy.
- The hardcoded ``KINDS`` / ``LIFECYCLES`` / ``AUDIENCES`` / ``PROVENANCES``
/ ``ADR_LIFECYCLES`` frozensets are removed. The Python defaults that
seed them now live in ``wiki_axis_registry._DEFAULT_*``.
- ``LEGACY_KINDS`` and ``LEGACY_KIND_TO_MODERN`` remain in this module
because they are a one-time backward-compat shim, not a configurable
axis. ``all_known_kinds()`` and ``normalize_legacy_kind()`` still work.
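
The "reject + suggest" policy can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in ``Classification.validate()``: ``validate_value`` is a hypothetical helper, the registered audience names are invented, and the exact wording of the error message is a guess. Only ``difflib.get_close_matches`` and the ``wiki/_schema/<axis>/<name>.md`` path layout come from the description above.

```python
import difflib


def did_you_mean(unknown, registered, n=3):
    """Closest registered names to `unknown`, best match first."""
    return difflib.get_close_matches(unknown, registered, n=n)


def validate_value(axis_dir, value, registered, schema_root="wiki/_schema"):
    """Raise ValueError with suggestions and a registration path if unknown.

    `axis_dir` is the directory name for the axis (for example "audiences").
    """
    if value in registered:
        return
    suggestions = did_you_mean(value, registered)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(
        f"Unknown value {value!r} for {axis_dir}.{hint} "
        f"To register it, write {schema_root}/{axis_dir}/{value}.md"
    )


# Hypothetical audience names, for illustration only.
AUDIENCES = ["developer", "operator", "researcher"]

try:
    validate_value("audiences", "develper", AUDIENCES)
except ValueError as exc:
    message = str(exc)
```

The error carries both halves of the policy: the nearest registered name for a typo, and the file to create if the value is genuinely new.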
**Refactor: ``mcp_server/core/wiki_classifier.py``.**
- ``_TUTORIAL_PATTERNS`` / ``_HOWTO_PATTERNS`` / ``_RUNBOOK_PATTERNS`` /
``_RFC_PATTERNS`` / ``_JOURNAL_PATTERNS`` deleted. Their patterns
moved into ``_DEFAULT_KINDS`` entries in the registry.
- ``_detect_modern_kind``, ``_detect_provenance``, and audience inference
are rewritten to call ``match_axis`` against the registry. They now
contain zero hardcoded value names beyond the legacy → modern shim.
- ``_pick_lifecycle(kind)`` asks the registry for the value flagged
``default=True`` for the appropriate kind scope (ADR vs not-ADR).
- Generator block is required iff the registered provenance's
``requires_generator`` flag is True — no hardcoded provenance-name
check.
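
A rough sketch of the two flag-driven decisions above, assuming a simplified entry type. ``Entry``, ``pick_lifecycle`` (as a free function over a list) and ``generator_required`` are hypothetical names, and the sample values, including which lifecycle is the default for each scope, are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical registry entry carrying only the flags used here."""
    name: str
    default: bool = False
    requires_generator: bool = False
    applies_to_kinds: frozenset = frozenset()  # empty = universal (assumption)


def pick_lifecycle(kind, lifecycles):
    """Return the lifecycle flagged default=True for this kind's scope."""
    scoped = [v for v in lifecycles if kind in v.applies_to_kinds]
    pool = scoped or [v for v in lifecycles if not v.applies_to_kinds]
    for value in pool:
        if value.default:
            return value.name
    raise LookupError(f"no default lifecycle registered for kind {kind!r}")


def generator_required(provenance, provenances):
    """Read the flag from the registered entry instead of comparing names."""
    by_name = {v.name: v for v in provenances}
    return by_name[provenance].requires_generator


# Invented sample data; the real defaults are seeded by the registry.
LIFECYCLES = [
    Entry("draft", default=True),
    Entry("published"),
    Entry("proposed", default=True, applies_to_kinds=frozenset({"adr"})),
    Entry("accepted", applies_to_kinds=frozenset({"adr"})),
]
PROVENANCES = [Entry("human"), Entry("generated", requires_generator=True)]

adr_default = pick_lifecycle("adr", LIFECYCLES)
howto_default = pick_lifecycle("howto", LIFECYCLES)
```

Because both functions read flags rather than names, registering a new provenance or lifecycle changes their behavior without touching the classifier.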
**Tests.**
- ``tests_py/core/test_wiki_axis_registry.py`` (new): 14 tests covering
the bootstrap seed, ``load_axis_registry`` against a tmp_path wiki,
the user-extension flow (adding a brand-new audience via a markdown
file), default-override semantics, malformed-file resilience,
``match_axis`` regex + tag detection, lifecycle filtering by kind,
``did_you_mean`` suggestions, and ``reset_registry`` cache invalidation.
- ``tests_py/shared/test_wiki_classification.py`` rewritten: closed-enum
membership tests replaced with reject + suggest tests verifying the
error message contains the closest match and the path of the file
the user can write to register the value.
- ``tests_py/core/test_wiki_classifier.py``: journal-detection fixture
cleaned so it doesn't accidentally trigger the ADR pattern.
How to add a new value
----------------------
Before:
1. Open ``mcp_server/shared/wiki_classification.py``.
2. Add to the ``AUDIENCES`` frozenset.
3. Update tests.
4. Open a PR. Wait for review.
After:
1. Write ``wiki/_schema/audiences/<name>.md`` with frontmatter
(name, axis, display_name, optional patterns, optional
tag_aliases, optional default flag).
2. Call ``reset_registry()`` (or restart the MCP server).
3. Done.
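
The "After" flow might look like the sketch below. The frontmatter syntax is an assumption (``---`` delimiters with flat ``key: value`` lines), ``parse_frontmatter`` and ``load_axis_files`` are hypothetical stand-ins for whatever ``load_axis_registry`` does internally, and ``wiki_root`` is assumed to be the ``wiki/`` directory itself. The example writes one valid and one malformed file into a temporary wiki to show that the malformed one is skipped.

```python
import tempfile
from pathlib import Path


def parse_frontmatter(text):
    """Parse flat `key: value` frontmatter; return None if it is malformed."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return None  # closing delimiter never found


def load_axis_files(wiki_root, axis_dir):
    """Collect user-defined values for one axis, skipping malformed files."""
    found = {}
    for path in sorted(Path(wiki_root, "_schema", axis_dir).glob("*.md")):
        fields = parse_frontmatter(path.read_text(encoding="utf-8"))
        if not fields or "name" not in fields:
            continue  # malformed: skip rather than raise
        found[fields["name"]] = fields
    return found


def demo():
    """Write two schema files into a temporary wiki and load them back."""
    with tempfile.TemporaryDirectory() as root:
        axis = Path(root, "_schema", "audiences")
        axis.mkdir(parents=True)
        (axis / "data-scientist.md").write_text(
            "---\n"
            "name: data-scientist\n"
            "axis: audience\n"
            "display_name: Data Scientist\n"
            "tag_aliases: ml, analytics\n"
            "---\n"
            "Readers who work with models and notebooks.\n",
            encoding="utf-8",
        )
        (axis / "broken.md").write_text("no frontmatter here\n", encoding="utf-8")
        return load_axis_files(root, "audiences")


loaded = demo()
```

In the real flow, step 2 (``reset_registry()``) would then make the new value visible to validators and the classifier.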
Test results
------------
* ADR-2244 + registry surface: 96 passed
* tests_py/core/ + tests_py/shared/: 1975 passed
* ruff format and ruff check both clean
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent e1a088a commit 6219fa2
6 files changed
Lines changed: 1228 additions & 358 deletions