Commit aa73817

author
jgstern-agent
committed
feat(entrypoints): serializer concept → SERIALIZER entrypoint (WI-gudob Phase 5)
Frameworks instantiate serializer / DTO classes from model data or request payloads and reflectively call serialize-like methods (to_representation, dump, to_internal_value, toArray, exposure). The class body therefore looks dead to the static call graph even when wired into a reachable route.

Classifying any symbol tagged concept: serializer as a SERIALIZER entrypoint with confidence 0.90 restores its reachability at class level. This mirrors Phase 4 (form).

Serializer producers are uniformly class-level base_class matches across 9 frameworks (django REST Serializer / ModelSerializer / HyperlinkedModelSerializer, flask Marshmallow Schema / SQLAlchemySchema / SQLAlchemyAutoSchema, grape Grape::Entity, laravel JsonResource / ResourceCollection, litestar AbstractDTO / DTOData, plus plumber / pyramid / quart / rails), so the scope is clean: no heterogeneous decorator / call-level noise of the kind that caused validator / auth to be deferred at Phase 4.

Per-method dispatch (to_representation → field-specific methods) is deferred to a per-framework dispatch-registry follow-up because each framework names its reflective methods differently.

docs/CONCEPTS.md regenerated via scripts/generate-concepts: serializer flipped inert → live (9 producers).

3 new tests in tests/test_entrypoints.py mirror the Phase 4 form tests: framework-labelled serializer, framework-less serializer ("Serializer" label), and multi-framework dedup (one entrypoint per symbol even when two framework patterns tag it).

Signed-off-by: jgstern-agent <josh-agent@iterabloom.com>
1 parent db5cbf5 commit aa73817
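
The reachability argument in the commit message can be illustrated with a minimal sketch. It assumes a toy call graph held in a plain dict and a hypothetical `reachable` helper; neither is part of hypergumbo. The framework calls the serializer reflectively, so no static edge points at the class, and it only enters the reachable set once it is listed as a root. Its methods stay out of the set, which matches the deferred per-method dispatch.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable(edges: Dict[str, List[str]], roots: Iterable[str]) -> Set[str]:
    """Return every node reachable from ``roots`` by following ``edges``."""
    seen: Set[str] = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(edges.get(node, []))
    return seen


# Hypothetical static call graph: the route handler never names
# UserSerializer, because the framework instantiates it at runtime.
CALL_GRAPH: Dict[str, List[str]] = {
    "route:list_users": ["UserRepo.all"],
}

# Roots before this commit: only the route.
without_entrypoint = reachable(CALL_GRAPH, ["route:list_users"])

# Roots after this commit: the serializer class is an entrypoint too.
with_entrypoint = reachable(CALL_GRAPH, ["route:list_users", "UserSerializer"])
```

In this sketch `UserSerializer` is reachable only in the second call, while `UserSerializer.to_representation` is reachable in neither, since no class-to-method edge exists yet.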

5 files changed

Lines changed: 97 additions & 6 deletions

.ci/affected-tests.txt

Lines changed: 6 additions & 4 deletions
@@ -1,13 +1,14 @@
 # Test selection manifest
-# Generated by smart-test at 2026-04-20T10:46:40-04:00
+# Generated by smart-test at 2026-04-20T11:27:57-04:00
 # Mode: targeted
 # Baseline: e82d3803685a1f005db6b87ba33d3b16e11b9178
-# Changed files: 46
-# Changed source files: 7
-# Selected tests: 75
+# Changed files: 51
+# Changed source files: 8
+# Selected tests: 76
 #
 # === CHANGED_SOURCE_FILES ===
 packages/hypergumbo-core/src/hypergumbo_core/cli.py
+packages/hypergumbo-core/src/hypergumbo_core/entrypoints.py
 packages/hypergumbo-core/src/hypergumbo_core/linkers/kafka_streams_dispatch.py
 packages/hypergumbo-tracker/src/hypergumbo_tracker/hotspot_markup.py
 packages/hypergumbo-tracker/src/hypergumbo_tracker/id_matching.py
@@ -32,6 +33,7 @@ packages/hypergumbo-core/tests/test_cli_symbols.py
 packages/hypergumbo-core/tests/test_cli_test_coverage.py
 packages/hypergumbo-core/tests/test_cli_verify_claims.py
 packages/hypergumbo-core/tests/test_crypto_flow_linker.py
+packages/hypergumbo-core/tests/test_entrypoints.py
 packages/hypergumbo-core/tests/test_file_excludes.py
 packages/hypergumbo-core/tests/test_frameworks_flag.py
 packages/hypergumbo-core/tests/test_gitleaks.py

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@ This changelog tracks the **tool version** (package releases). The **schema vers

 ### Added

+- **`serializer` concept → SERIALIZER entrypoint (WI-gudob Phase 5):** mirrors Phase 4 (form): any symbol tagged `concept: serializer` is classified as a `SERIALIZER` entrypoint with confidence 0.90 and a framework-derived label (`"Django serializer"`, `"Flask serializer"`, or plain `"Serializer"` when no framework attribution is present). Producers are uniformly class-level `base_class` matches across 9 frameworks (django DRF `Serializer` / `ModelSerializer` / `HyperlinkedModelSerializer`, flask Marshmallow `Schema` / `SQLAlchemySchema` / `SQLAlchemyAutoSchema`, grape `Grape::Entity`, laravel `JsonResource` / `ResourceCollection`, litestar `AbstractDTO` / `DTOData`, plus plumber/pyramid/quart/rails), so the scope is clean: no heterogeneous decorator/call-level noise of the kind that caused validator/auth to be deferred at Phase 4. Restores class-level reachability for serializer/DTO classes that previously looked dead to the static call graph because the framework instantiates them and reflectively calls `to_representation` / `dump` / `to_internal_value` / `toArray` / `exposure`. Per-method dispatch (`to_representation` → field-specific methods) is deferred to a per-framework dispatch-registry follow-up because each framework names its reflective methods differently. `docs/CONCEPTS.md` regenerated via `scripts/generate-concepts`: `serializer` flipped inert → live (9 producers). 3 new tests (framework label, no-framework fallback, multi-framework dedup) in `tests/test_entrypoints.py`.
+
- **Kafka Streams topology-callback dispatch linker (WI-lisov):** new Framework-subcategory linker `packages/hypergumbo-core/src/hypergumbo_core/linkers/kafka_streams_dispatch.py` that recovers reflective-dispatch edges the Kafka Streams runtime uses to invoke topology callbacks on a per-record basis. For every Java / Kotlin / Scala class whose declared `base_classes` or `interfaces` short-match one of 17 Kafka Streams callback types (`ValueMapper`, `ValueMapperWithKey`, `KeyValueMapper`, `Predicate`, `ForeachAction`, `Aggregator`, `Reducer`, `Initializer`, `Merger`, `ValueTransformer`, `ValueTransformerWithKey`, `Transformer`, `Processor`, and their four `*Supplier` factory forms), the linker emits `dispatches_to` edges from the class to the interface-specific framework-called methods (`apply`, `transform`, `process`, `get`, `init`, `close`, `test`) with confidence 0.90 and evidence `kafka_streams_dispatch`. Fully-qualified names like `org.apache.kafka.streams.kstream.ValueMapper<K, V, VR>` normalize to the short form before matching, so the Scala wrapper namespace (`org.apache.kafka.streams.scala.kstream.*`) is covered by the same interface table. Lifecycle methods for stateful transformers (`init` / `close`) are emitted alongside the per-record method so slices through a `Transformer` reach the setup and teardown as well. Registered at linker priority 21 (same tier as `jackson-dispatch`) and wired into `cli.py` alongside the other Framework-dispatch linkers. 
37 tests in `tests/test_kafka_streams_dispatch_linker.py` cover `_short_type_name` (bare / qualified / generic-parameter / qualified-with-generics), `_callback_interfaces_on` (non-dict meta, base_classes match, interfaces match, qualified-name normalization, unknown-interface reject, dedup across both keys, non-list entry skip, non-string entry skip, missing keys, multi-interface ordered), `_expected_method_names` (single, Transformer lifecycle, multi-interface union, empty-list), `_build_class_method_index` (class grouping, non-method skip, non-JVM-language skip, top-level-function skip), and the integration pass (ValueMapper → apply, Transformer → init/transform/close, qualified-name match, non-callback ignored, non-JVM class ignored, no-classes early return, Kotlin support, Scala support, interface-kind target, existing-edge dedup, ProcessorSupplier → get, multi-interface union, class-without-span emits `line=0`). Scope is intentionally limited to class→method edges (sufficient to lift the 2386 `kafka_streams_internal` dead-code candidates per WI-tubot aggregate-v5); call-site → impl edges are left as a Phase 2 follow-up in WI-lisov.

- **Auto-strip / regression-spawn for `awaits_bakeoff_validation` verdicts (WI-dolil slice 2b):** `scripts/bakeoff-deep-reflect aggregate` gains a `--apply-verdicts` flag that executes the tracker mutations implied by the slice-2 per-claim verdict distribution. On `plurality=moved` the `awaits_bakeoff_validation` tag is stripped from the item and a discuss note records the cohort + per-repo evidence; on `plurality=no_move` a regression sub-item is spawned under the original (kind `work_item`, tags `regression` + `awaits_bakeoff_validation`, priority 1, status `todo_soft`) and a discuss note is posted on the parent; on `plurality=tied` or `inconclusive` the cohort did not settle the claim and no mutation is performed. Default behaviour is dry-run: the plan is printed (one line per step) with `(dry-run; re-run with --apply-verdicts to execute)` so humans can preview before the mutations happen. New pure helpers `build_validation_mutation_plan(summary, cohort_label)` and `apply_mutation_plan(plan, runner=...)` are exercised via an injected runner in tests so no real tracker calls occur. Multi-cohort aggregate (`--all` / `--some`) invokes the surface helper per cohort so a batch re-aggregate can settle every open claim at once. 19 new tests in `tests/test_bakeoff_deep_reflect.py` cover the plan-builder (empty summary, moved → remove_tag+discuss, no_move → spawn_regression+discuss, tied/inconclusive → note-only, missing item_id/plurality skipped, evidence truncation at 120 chars, empty-evidence fallback, `+N more` roll-up when evidence exceeds max_items), the CLI-runner (remove_tag / discuss / spawn_regression argv shapes, note + unknown-action skip, non-zero exit marks failed, `OSError` runner exception marks failed with message captured), and the print/apply surface (no-plan returns None, dry-run prints plan without applying, apply returns stats dict and prints summary line, apply prints per-step failure lines, dry-run prints every action shape). 
Slice 2b closes the tooling half of WI-dolil; only slice 4 (acceptance demo — a real DEEP cycle exercising the end-to-end tag → validate → strip/regress round-trip) remains.
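
The name normalization and method lookup described in the Kafka Streams entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the linker's code: `short_type_name` and `expected_method_names` are hypothetical stand-ins for the private helpers the changelog mentions, and `CALLBACK_METHODS` holds only the three interface-to-method mappings the entry spells out, not the full 17-type table.

```python
from typing import Dict, Iterable, List

# Assumed subset of the callback table, limited to the cases the
# changelog entry names explicitly.
CALLBACK_METHODS: Dict[str, List[str]] = {
    "ValueMapper": ["apply"],
    "Transformer": ["init", "transform", "close"],
    "ProcessorSupplier": ["get"],
}


def short_type_name(type_name: str) -> str:
    """Reduce a declared type to its short form.

    Generic parameters are dropped first (everything from the first
    "<"), then only the last dotted segment is kept.
    """
    base = type_name.split("<", 1)[0].strip()
    return base.rsplit(".", 1)[-1]


def expected_method_names(declared_types: Iterable[str]) -> List[str]:
    """Ordered, de-duplicated union of framework-called method names."""
    names: List[str] = []
    for declared in declared_types:
        for method in CALLBACK_METHODS.get(short_type_name(declared), []):
            if method not in names:
                names.append(method)
    return names
```

Because matching happens on the short form, a Java declaration and a Scala wrapper declaration of the same interface resolve to the same table row in this sketch.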

docs/CONCEPTS.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ This file enumerates every concept string that the framework-YAML pattern layer
 - **inert**: producers exist but no consumer reads the concept. Candidates for either removing the producer pattern or writing a Framework-subcategory linker that consumes it.
 - **ghost**: a consumer mentions the concept by name but no YAML emits it. Likely dead code or a leftover reference to a removed pattern; investigate.

-Total concepts: **317** (live: 40, inert: 277, ghost: 0).
+Total concepts: **317** (live: 41, inert: 276, ghost: 0).

 ## Inventory

@@ -251,7 +251,7 @@ Total concepts: **317** (live: 40, inert: 277, ghost: 0).
 | `security` | inert | fastapi, flask-appbuilder | _(none)_ |
 | `seeder` | inert | adonisjs, cakephp, codeigniter, laravel, symfony | _(none)_ |
 | `serialization_callback` | live | go-encoding-callbacks | `entrypoints.py` |
-| `serializer` | inert | django, flask, grape, laravel, litestar, plumber, pyramid, quart, rails | _(none)_ |
+| `serializer` | live | django, flask, grape, laravel, litestar, plumber, pyramid, quart, rails | `entrypoints.py` |
 | `serializer_field` | inert | flask-restful | _(none)_ |
 | `server` | inert | cowboy, http4k, http4s, pedestal, plumber, restify, servant, shiny, vertx | _(none)_ |
 | `service` | inert | codeigniter, feathers, guice, hanami, jakarta-cdi, micronaut, nestjs, spring-boot | _(none)_ |
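
The status column in the table above follows from which side of a concept exists. A minimal sketch of that rule, assuming a hypothetical `classify_concept` helper (the real logic lives in `scripts/generate-concepts` and may differ); the meaning of "live" is inferred here from the table rows, where a concept has both producers and a consumer.

```python
from typing import List


def classify_concept(producers: List[str], consumers: List[str]) -> str:
    """Classify a concept by who emits it and who reads it."""
    if producers and consumers:
        return "live"   # emitted by YAML patterns and read by a consumer
    if producers:
        return "inert"  # emitted, but nothing reads it
    if consumers:
        return "ghost"  # read by name, but nothing emits it
    raise ValueError("a concept needs at least one producer or consumer")


SERIALIZER_PRODUCERS = [
    "django", "flask", "grape", "laravel", "litestar",
    "plumber", "pyramid", "quart", "rails",
]

# Before this commit no consumer read the concept.
status_before = classify_concept(SERIALIZER_PRODUCERS, [])

# After this commit entrypoints.py consumes it.
status_after = classify_concept(SERIALIZER_PRODUCERS, ["entrypoints.py"])
```

Under this rule the `serializer` row moves from inert to live without any change to its producer list, which is why the totals shift by exactly one in each column.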

packages/hypergumbo-core/src/hypergumbo_core/entrypoints.py

Lines changed: 37 additions & 0 deletions
@@ -194,6 +194,7 @@ class EntrypointKind(Enum):
     EVENT_HANDLER = "event_handler"  # Event/message handler
     ERROR_HANDLER = "error_handler"  # HTTP / middleware exception handler
     FORM = "form"  # Framework-reflected form class (is_valid/save/authorize)
+    SERIALIZER = "serializer"  # Framework-reflected serializer/DTO class (to_representation / dump / toArray)
     SCHEDULED_TASK = "scheduled_task"  # Cron/scheduled job
     # Library entry points (exported API)
     LIBRARY_EXPORT = "library_export"  # Exported function/class (library entry)
@@ -306,6 +307,7 @@ def _detect_from_concepts(symbols: List[Symbol]) -> List[Entrypoint]:
     - "event_handler" -> EVENT_HANDLER (event/message handler)
     - "error_handler" -> ERROR_HANDLER (HTTP / middleware exception handler)
     - "form" -> FORM (framework-reflected form class)
+    - "serializer" -> SERIALIZER (framework-reflected serializer/DTO class)
     - "command" -> CLI_COMMAND (CLI command handler)
     - "liveview" -> CONTROLLER (Phoenix LiveView - real-time UI)
     - "graphql_resolver" -> GRAPHQL_SERVER (GraphQL resolver)
@@ -537,6 +539,41 @@ def _detect_from_concepts(symbols: List[Symbol]) -> List[Entrypoint]:
                 ))
                 added_kinds.add(EntrypointKind.FORM)

+            # Serializer concept -> SERIALIZER (WI-gudob Phase 5):
+            # Framework-reflected serializer / DTO classes (Django REST
+            # Framework Serializer / ModelSerializer, Marshmallow Schema and
+            # SQLAlchemySchema, Grape Entity, Laravel JsonResource /
+            # ResourceCollection, Litestar AbstractDTO / DTOData, ...).
+            # The framework instantiates the class from a model / payload
+            # and reflectively calls a serialize-like method
+            # (``to_representation``, ``dump``, ``to_internal_value``,
+            # ``toArray``, ``exposure``), so the class body looks
+            # dead to the static call graph even when wired into a
+            # reachable route. Classifying the serializer class itself as
+            # an entrypoint restores its reachability at class level;
+            # per-method dispatch (``to_representation`` → field methods)
+            # is deferred to a per-framework dispatch-registry follow-up
+            # because each framework names its reflective methods
+            # differently. 9 framework YAML patterns emit this concept
+            # (django, flask, grape, laravel, litestar, plumber,
+            # pyramid, quart, rails per the WI-dajul concept registry
+            # audit) and the producers are uniformly class-level
+            # ``base_class`` matches, so the scope is clean.
+            elif concept_type == "serializer":
+                if EntrypointKind.SERIALIZER in added_kinds:
+                    continue
+                if framework:
+                    label = f"{framework.title()} serializer"
+                else:
+                    label = "Serializer"
+                entrypoints.append(Entrypoint(
+                    symbol_id=sym.id,
+                    kind=EntrypointKind.SERIALIZER,
+                    confidence=0.90,
+                    label=label,
+                ))
+                added_kinds.add(EntrypointKind.SERIALIZER)
+
             # Command concept -> CLI_COMMAND
             elif concept_type == "command":
                 if EntrypointKind.CLI_COMMAND in added_kinds:
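
The serializer branch above can be tried in isolation with a small stand-alone sketch. `Sym`, `Ep`, and `detect_serializers` are hypothetical stand-ins for the project's `Symbol`, `Entrypoint`, and `_detect_from_concepts`; the sketch also assumes the dedup state resets for each symbol, which the diff context does not show directly but the "one entrypoint per symbol" test suggests.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Sym:
    """Hypothetical stand-in for the project's Symbol."""
    id: str
    meta: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Ep:
    """Hypothetical stand-in for the project's Entrypoint."""
    symbol_id: str
    kind: str
    confidence: float
    label: str


def detect_serializers(symbols: List[Sym]) -> List[Ep]:
    found: List[Ep] = []
    for sym in symbols:
        added = False  # assumption: dedup is tracked per symbol
        for concept in sym.meta.get("concepts", []):
            if concept.get("concept") != "serializer" or added:
                continue
            framework: Optional[str] = concept.get("framework")
            # str.title() capitalizes the framework name: "flask" -> "Flask".
            label = f"{framework.title()} serializer" if framework else "Serializer"
            found.append(Ep(sym.id, "serializer", 0.90, label))
            added = True
    return found


symbols = [
    Sym("PolySchema", {"concepts": [
        {"concept": "serializer", "framework": "flask"},
        {"concept": "serializer", "framework": "grape"},
    ]}),
    Sym("BareDTO", {"concepts": [{"concept": "serializer"}]}),
]
found = detect_serializers(symbols)
```

In this sketch the first matching framework supplies the label, so a symbol tagged by both flask and grape patterns yields a single entry labelled after flask.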

packages/hypergumbo-core/tests/test_entrypoints.py

Lines changed: 50 additions & 0 deletions
@@ -718,6 +718,56 @@ def test_form_dedupe_per_symbol(self) -> None:
         form_eps = [e for e in entrypoints if e.kind == EntrypointKind.FORM]
         assert len(form_eps) == 1

+    def test_detect_serializer_concept(self) -> None:
+        """Symbol with serializer concept is detected as SERIALIZER entrypoint (WI-gudob Phase 5)."""
+        sym = make_symbol(
+            "UserSerializer",
+            path="myapp/serializers.py",
+            kind="class",
+            meta={
+                "concepts": [
+                    {"concept": "serializer", "framework": "django"}
+                ]
+            },
+        )
+        entrypoints = detect_entrypoints([sym], [])
+        ser_eps = [e for e in entrypoints if e.kind == EntrypointKind.SERIALIZER]
+        assert len(ser_eps) == 1
+        assert ser_eps[0].symbol_id == sym.id
+        assert ser_eps[0].confidence >= 0.85
+        assert "Django" in ser_eps[0].label or "serializer" in ser_eps[0].label.lower()
+
+    def test_detect_serializer_concept_without_framework(self) -> None:
+        """serializer concept without a framework label still produces an entrypoint."""
+        sym = make_symbol(
+            "BareDTO",
+            path="src/dtos.py",
+            kind="class",
+            meta={"concepts": [{"concept": "serializer"}]},
+        )
+        entrypoints = detect_entrypoints([sym], [])
+        ser_eps = [e for e in entrypoints if e.kind == EntrypointKind.SERIALIZER]
+        assert len(ser_eps) == 1
+        assert ser_eps[0].label == "Serializer"
+
+    def test_serializer_dedupe_per_symbol(self) -> None:
+        """A symbol tagged serializer twice (matched by two framework patterns)
+        emits at most one SERIALIZER entrypoint."""
+        sym = make_symbol(
+            "PolySchema",
+            path="src/schemas.py",
+            kind="class",
+            meta={
+                "concepts": [
+                    {"concept": "serializer", "framework": "flask"},
+                    {"concept": "serializer", "framework": "grape"},
+                ],
+            },
+        )
+        entrypoints = detect_entrypoints([sym], [])
+        ser_eps = [e for e in entrypoints if e.kind == EntrypointKind.SERIALIZER]
+        assert len(ser_eps) == 1
+
     def test_detect_command_concept(self) -> None:
         """Symbol with command concept is detected as CLI command entrypoint."""
         sym = make_symbol(
