feat(provenance): v2 per-function code-byte ranges (#143 DWARF Phase 2 inc 1) #200

Merged: avrabe merged 1 commit into main from feat/dwarf-phase2-inc1-code-offset-map on May 28, 2026
@avrabe commented May 28, 2026

Summary

First increment of #143 DWARF Phase 2 (address remap). Extends the component-provenance custom section to v2: each entry gains an optional code_range { start, end } giving the function body's byte span in the fused code section, rebased to the code-section content start (the WebAssembly-DWARF code-section-relative address convention). This is the anchor every later DWARF remap step builds on.
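
The rebasing described above amounts to subtracting the code-section content start from each body's absolute module offset. The following is a minimal std-only sketch of that arithmetic, not the PR's implementation: the PR obtains the offsets from wasmparser, whereas this sketch takes them as already-parsed inputs. The function name `rebase_ranges`, the `u64` field types, and the half-open (end-exclusive) interpretation are assumptions made for illustration.

```rust
use std::ops::Range;

/// Code-section-relative byte span of one function body.
/// Field types and the end-exclusive convention are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CodeRange {
    pub start: u64,
    pub end: u64,
}

/// Rebase absolute module offsets so that offset 0 is the first byte of
/// the code-section content. Returns `None` if any body starts or ends
/// before the section content (which would indicate inconsistent input).
pub fn rebase_ranges(
    code_content_start: usize,
    bodies: &[Range<usize>],
) -> Option<Vec<CodeRange>> {
    bodies
        .iter()
        .map(|body| {
            // checked_sub guards against underflow on malformed input.
            let start = body.start.checked_sub(code_content_start)?;
            let end = body.end.checked_sub(code_content_start)?;
            Some(CodeRange {
                start: start as u64,
                end: end as u64,
            })
        })
        .collect()
}
```

For instance, with the code-section content starting at module offset 10, a body occupying absolute bytes 12..20 would be recorded as `code_range { start: 2, end: 10 }`.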

User-authorized path: extend #192 to v2 + full-correctness DWARF. gimli enters in increment 3.

Changes

  • provenance::CodeRange struct + Entry::code_range: Option<CodeRange> (#[serde(default, skip_serializing_if = "Option::is_none")])
  • provenance::code_section_function_ranges(module_bytes) — re-parses the output code section via wasmparser, rebasing each FunctionBody::range() to CodeSectionStart.range.start
  • build() index-aligns ranges with merged.functions (meld emits merged functions before adapter trampolines, so the range at position i corresponds to merged.functions[i])
  • VERSION 1 → 2, additive: v1-shaped entries (no code_range key) round-trip unchanged; v1 consumers that check version first still parse the entries
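
The index alignment and the additive v1/v2 shape listed above can be sketched without any dependencies. This is an illustration under stated assumptions, not the PR's code: the `function` field, the helper names `attach_ranges` and `to_json`, and the JSON-like text output are hypothetical (the PR uses serde and does not state the wire format here). The hand-rolled serializer only mimics what `skip_serializing_if = "Option::is_none"` does, namely omitting the key entirely when no range is present.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CodeRange {
    pub start: u64,
    pub end: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    /// Hypothetical identifying field; the real Entry has other fields.
    pub function: String,
    /// Absent for v1-shaped entries.
    pub code_range: Option<CodeRange>,
}

/// Pair merged function i with range i. Ranges past the last merged
/// function (the adapter trampolines) are ignored; a merged function
/// without a matching range keeps `None`.
pub fn attach_ranges(merged: &[&str], ranges: &[CodeRange]) -> Vec<Entry> {
    merged
        .iter()
        .enumerate()
        .map(|(i, name)| Entry {
            function: name.to_string(),
            code_range: ranges.get(i).copied(),
        })
        .collect()
}

/// Illustration-only serializer (no string escaping): the `code_range`
/// key is omitted when the value is `None`, so a v1-shaped entry
/// serializes exactly as it did before the field existed.
pub fn to_json(entry: &Entry) -> String {
    match entry.code_range {
        Some(r) => format!(
            "{{\"function\":\"{}\",\"code_range\":{{\"start\":{},\"end\":{}}}}}",
            entry.function, r.start, r.end
        ),
        None => format!("{{\"function\":\"{}\"}}", entry.function),
    }
}
```

Because the key is dropped rather than written as null, a consumer that predates v2 sees the same entry shape it always did.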

Scope (honest)

Delivers accurate current byte spans. DWARF .debug_line remapping inside rewritten functions is NOT in this PR — meld's rewriter shifts intra-function offsets via LEB128 operand-length changes, which needs:

  • Increment 2: rewriter emits an instruction-level offset map (large Tier-5 change, user-confirmed appetite)
  • Increment 3: gimli-based DWARF rewrite of .debug_info + .debug_line using both maps
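
To see why function-level ranges alone cannot remap .debug_line, consider how an operand's LEB128 length depends on its value. The sketch below assumes, for illustration, that the shift comes from an index operand being renumbered during fusion; the helper names are hypothetical and this is not meld's rewriter.

```rust
/// Number of bytes the unsigned LEB128 encoding of `value` occupies.
/// Each byte carries 7 payload bits.
pub fn uleb128_len(mut value: u32) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Byte delta introduced when an index operand changes from `old_index`
/// to `new_index`. Every instruction after the operand inside the same
/// function body moves by this amount.
pub fn operand_shift(old_index: u32, new_index: u32) -> isize {
    uleb128_len(new_index) as isize - uleb128_len(old_index) as isize
}
```

An index that moves from 5 to 300 grows from one byte to two, so every later instruction in that body sits one byte further along. A line-table address pointing into the body is then off by one unless an instruction-level offset map records the change.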

Test plan

  • 5 new unit tests: range ordering, non-overlap, no-code-section path, rebasing cross-check, v1/v2 backward-compat
  • 1 new integration test (v2_code_ranges_are_populated_ordered_and_nonoverlapping) against the real fused fixture
  • 291 lib tests green, clippy clean, fmt clean
  • CI green + Mythos AI scan on provenance.rs (Tier-5)
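
The ordering and non-overlap properties named in the test plan can be expressed as a single pass over the ranges. This is a sketch of the kind of invariant such tests check, not the PR's test code; `first_violation` is a hypothetical helper, and treating an empty range as a violation is an assumption (a function body contains at least a locals count and an `end` opcode).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CodeRange {
    pub start: u64,
    pub end: u64,
}

/// Returns the index of the first range that is empty or inverted, or
/// that begins before the previous range ended. `None` means the ranges
/// are well-formed, ascending, and non-overlapping.
pub fn first_violation(ranges: &[CodeRange]) -> Option<usize> {
    let mut prev_end = 0u64;
    for (i, r) in ranges.iter().enumerate() {
        if r.start >= r.end || r.start < prev_end {
            return Some(i);
        }
        prev_end = r.end;
    }
    None
}
```

Returning the offending index rather than a bare boolean makes a failing assertion point directly at the function whose range is wrong.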

🤖 Generated with Claude Code

…2 inc 1)

First increment of DWARF Phase 2 (address remap). Extends the
component-provenance custom section to v2: each entry gains an
optional code_range { start, end } giving the function body's byte
span in the fused code section, rebased to the code-section content
start (the WebAssembly-DWARF code-section-relative address
convention). Anchor for every later DWARF remap step.

  - provenance::CodeRange + Entry::code_range: Option<_>
    (serde default + skip_serializing_if)
  - provenance::code_section_function_ranges re-parses the output
    code section, rebasing each FunctionBody range to
    CodeSectionStart.range.start
  - build() index-aligns ranges with merged.functions
  - VERSION 1->2, additive: v1-shaped entries round-trip unchanged

Tests: 5 unit + 1 integration. 291 lib tests green, clippy clean
(verified before commit). LS-M-6 updated with v2 surface + residual.

Scope: accurate current byte spans. .debug_line remapping inside
rewritten functions deferred to increment 2 (rewriter offset map) +
increment 3 (gimli rewrite).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

LS-N verification gate

⚠️ 35/37 verified — 2 missing regression tests

| Status | Count |
|---|---|
| Passed (≥1 test, all green) | 35 |
| Failed (≥1 test failure) | 0 |
| Missing (no ls_*_NN_* test found) | 2 |

Approved loss-scenarios.yaml entries are expected to have a regression test named ls_<letter>_<num>_* (e.g. LS-A-11 maps to ls_a_11_*). The gate runs each prefix via cargo test --lib --no-fail-fast and aggregates pass/fail/missing.
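
The naming convention can be sketched as a small mapping from a loss-scenario ID to the test-name prefix the gate looks for. The actual gate is the Python script named below in this comment; this Rust helper, `ls_test_prefix`, is hypothetical and only illustrates the ID-to-prefix rule as described, assuming IDs always have the form LS-<letters>-<digits>.

```rust
/// Map a loss-scenario ID such as "LS-A-11" to the expected test-name
/// prefix "ls_a_11_". Returns `None` for IDs that do not match the
/// assumed LS-<letters>-<digits> shape.
pub fn ls_test_prefix(id: &str) -> Option<String> {
    let mut parts = id.split('-');
    let tag = parts.next()?;
    let letter = parts.next()?;
    let num = parts.next()?;
    // Reject trailing segments and anything not starting with "LS".
    if parts.next().is_some() || tag != "LS" {
        return None;
    }
    if letter.is_empty() || !letter.chars().all(|c| c.is_ascii_alphabetic()) {
        return None;
    }
    if num.is_empty() || !num.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    Some(format!("ls_{}_{}_", letter.to_ascii_lowercase(), num))
}
```

Under this rule, an entry such as LS-M-6 counts as covered once a test whose name begins with `ls_m_6_` exists in the library.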

Failed LS entries

(none)

Missing regression tests
  • LS-R-13
  • LS-M-6

Updated automatically by tools/post_verification_comment.py.
Source of truth: safety/stpa/loss-scenarios.yaml.

@github-actions

Mythos delta-pass (auto)

NO FINDINGS across 1 Tier-5 file(s)

| File | Verdict | Hypothesis |
|---|---|---|
| `` | ✅ NO FINDINGS | |

Auto-run via anthropics/claude-code-action@v1
(SHA-pinned) on the touched Tier-5 files, using the
maintainer's Max-plan OAuth token. See
.github/workflows/mythos-auto.yml and
scripts/mythos/discover.md.

@github-actions (bot) added the mythos-pass-done label (Mythos delta-pass completed on Tier-5 file changes; findings (or NO FINDINGS) attached to PR) May 28, 2026
@avrabe merged commit f6f36ee into main May 28, 2026
13 of 14 checks passed
@avrabe deleted the feat/dwarf-phase2-inc1-code-offset-map branch May 28, 2026 17:08
@avrabe mentioned this pull request May 28, 2026
4 tasks
avrabe added a commit that referenced this pull request May 28, 2026
DWARF Phase 2 increment 1 (#143, #200): component-provenance section
v2 with per-function code-byte ranges — the anchor for DWARF address
remapping. Plus the LS-M-5 status correction (#199, already-mitigated
multiply-instantiated-module hazard).

Increments 2 (rewriter instruction-offset map) and 3 (gimli DWARF
rewrite) follow in later releases.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>