Commit eed01b7
dyninst/symdb: fix cross-CU inlined functions (#49860)
### What does this PR do?
Extends the `symdb` cross-compile-unit indexing scheme introduced in #48822 to cover **non-generic** abstract inline functions and every inline instance regardless of which compile unit holds it. Along the way, fixes a streaming-iteration bug where a package could be yielded more than once, with the trailing yield containing only "leftover" functions that symdb couldn't attribute to the first yield in time.
The patch series is:
1. **`dyninst/symdb: drive TestSymDBSnapshot through streaming PackagesIterator`**
Replace the existing `TestSymDBSnapshot` (which called `ExtractSymbols` and aggregated all yields into a single `Symbols` value before serialization) with a test that drives the streaming `PackagesIterator` API dire
2. **`dyninst/testprogs: add cross-CU inline test cases`**
Add `lib.InlinedInLaterCU`, a function defined in `lib` but called only from `lib.v2.FooV2`. Because `lib.v2`'s compile unit is emitted *after* `lib`'s CU in the binary, the Go compiler places the abstract-definition DIE in `lib.v2`'s CU. Regenerate the snapshot goldens with the existing (pre-fix) `symdb` code — the new streaming goldens clearly capture the bug described below.
3. **`dyninst/symdb: rename genericFuncIndex to funcOffsetByNameIndex`**
Pure rename of the sorted name-keyed index type to decouple it from its initial use for generic shape functions. No behaviour change.
4. **`dyninst/symdb: index inline instances across compile units`**
The actual fix. Adds two pre-pass indexes:
- `inlineDefIndex` — canonical qualified name → abstract-definition DWARF offset, for every `DW_TAG_subprogram` with `DW_AT_inline = DW_INL_inlined` (both generic and non-generic).
- `inlineInstanceIndex` — abstract-origin DWARF offset → instance DWARF offset, for every inline instance (both `DW_TAG_inlined_subroutine` and out-of-line `DW_TAG_subprogram` with `DW_AT_abstract_origin`).
At each package's CU emission, every instance of every abstract function owned by that package is replayed from `inlineInstanceIndex` so that `file`/`endLine`/`injectibleLines` and per-variable availability are populated regardless of which CU the abstract DIE or its instances live in. Emission is scoped to the defining CU: `b.abstractFunctions` is no longer cleared per CU, so instance data accumulates across CUs, and `addAbstractFunctions` only emits entries whose owning package matches. The `displacedFunctions` fallback is removed from the abstract-function path (and `addFunctionToPackage`) because the indexes cover that case by construction — verified with an empirical check that the fallback never fires on any of the sample programs.
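The two pre-pass indexes and the replay step can be sketched as follows. This is a minimal illustration, not the code in the patch: the `prepass` type, its methods, the `Pkg` field, and the offsets are all hypothetical, and plain maps stand in for the sorted index types that `symdb` uses.

```go
package main

import (
	"fmt"
	"sort"
)

// Offset stands in for a DWARF DIE offset (hypothetical type for this sketch).
type Offset uint64

// abstractDef describes one abstract inline definition found in the pre-pass.
type abstractDef struct {
	Pkg    string // owning package (assumed to be derived from the qualified name)
	Name   string // canonical qualified name
	Offset Offset // offset of the abstract-definition DIE
}

// prepass holds both indexes. Maps are an assumption made for brevity.
type prepass struct {
	defs      map[string]abstractDef // qualified name -> abstract definition
	instances map[Offset][]Offset    // abstract-origin offset -> instance offsets
}

func newPrepass() *prepass {
	return &prepass{
		defs:      make(map[string]abstractDef),
		instances: make(map[Offset][]Offset),
	}
}

func (p *prepass) addDef(d abstractDef) { p.defs[d.Name] = d }

func (p *prepass) addInstance(origin, instance Offset) {
	p.instances[origin] = append(p.instances[origin], instance)
}

// replay returns, for every abstract function owned by pkg, the instance
// offsets to revisit when pkg is emitted. Functions owned by other packages
// are skipped, whichever CU their DIEs were found in.
func (p *prepass) replay(pkg string) map[string][]Offset {
	out := make(map[string][]Offset)
	for name, d := range p.defs {
		if d.Pkg != pkg {
			continue
		}
		insts := append([]Offset(nil), p.instances[d.Offset]...)
		sort.Slice(insts, func(i, j int) bool { return insts[i] < insts[j] })
		out[name] = insts
	}
	return out
}

func main() {
	p := newPrepass()
	// The abstract DIE is found while scanning lib.v2's CU, but it is owned by lib.
	p.addDef(abstractDef{Pkg: "lib", Name: "lib.InlinedInLaterCU", Offset: 0x2000})
	p.addInstance(0x2000, 0x2100) // instance inlined into lib.v2.FooV2

	fmt.Println("lib:", p.replay("lib"))
	fmt.Println("lib.v2:", p.replay("lib.v2"))
}
```

Keying the instance index by abstract-origin offset means the lookup at emission time does not depend on which CU the instance was found in, which is the property the fix relies on.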
### Motivation
A pre-existing bug in the streaming `PackagesIterator` API: a package can be yielded **more than once**, with the second yield containing only the "leftover" functions that `symdb` couldn't attribute to the first yield in time. The trailing yields are assembled from `displacedFunctions`, a side channel that accumulates functions parsed from non-owner CUs for later merging.
Two concrete cases in the sample binary:
- **Non-generic, cross-CU abstract.** `lib.InlinedInLaterCU` is defined in `lib` and inlined only from `lib.v2.FooV2`. The Go compiler emits the abstract-definition DIE in `lib.v2`'s CU, which comes *after* `lib`'s CU in the binary. When `symdb` iterates CUs:
- `lib`'s CU is processed first and yields a `lib` package without `InlinedInLaterCU`.
- `lib.v2`'s CU is processed later; its `exploreSubprogram` parses `InlinedInLaterCU`'s abstract DIE and, because it belongs to `lib` not `lib.v2`, routes it through `displacedFunctions["lib"]`.
- At end-of-iteration, `displacedFunctions["lib"]` is yielded as a new `lib` package containing just this function.
`ExtractSymbols` hid this by aggregating both yields into one `Symbols` value before serialization. `PackagesIterator` (used by the production uploader, `pkg/dyninst/module/symdb.go`) does not — it passes both yields through as two separate uploads, so the server sees the same package twice with disjoint contents. Switching the test to the streaming API (patch 1) makes this visible.
- **Generic shape, cross-CU abstract.** `lib.NewImmutableSet[go.shape.string]`'s abstract DIE lands in `main`'s CU (the compiler's generic-shape placement behaviour). The existing `augmentWithDisplacedGenerics` path re-parses the abstract DIE while processing `lib`'s CU, but never replays its inline instance, so it produces a broken stub entry with empty `file`/`endLine`/`injectibleLines` alongside the correctly-populated entry. Both entries end up in the output.
Both problems have the same shape: `symdb`'s abstract-function handling was scoped to the CU currently being walked, with a best-effort side channel for everything that didn't fit. The fix is to precompute the two indexes (abstract definitions, inline instances) in the existing pre-pass and replay instances at each package's emission point, so every abstract function is emitted exactly once, on its owning package's yield, with complete per-instance data — regardless of the compiler's DIE placement choices.
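To see why aggregation hid the duplicate yield while streaming exposed it, consider this small sketch. The `pkgYield` type, the two helpers, and `lib.Foo` are hypothetical names invented for illustration; only `lib.InlinedInLaterCU` and `lib.v2.FooV2` come from the sample program.

```go
package main

import "fmt"

// pkgYield is a simplified stand-in for one package yielded by the iterator.
type pkgYield struct {
	Name      string
	Functions []string
}

// aggregate merges yields by package name, the way a consumer that collects
// everything into one value would. Repeated yields become invisible.
func aggregate(yields []pkgYield) map[string][]string {
	out := make(map[string][]string)
	for _, y := range yields {
		out[y.Name] = append(out[y.Name], y.Functions...)
	}
	return out
}

// duplicateYields reports packages that a streaming consumer would see more
// than once, in the order the repeats occur.
func duplicateYields(yields []pkgYield) []string {
	seen := make(map[string]int)
	var dups []string
	for _, y := range yields {
		seen[y.Name]++
		if seen[y.Name] == 2 {
			dups = append(dups, y.Name)
		}
	}
	return dups
}

func main() {
	preFix := []pkgYield{
		{Name: "lib", Functions: []string{"lib.Foo"}}, // lib.Foo is hypothetical
		{Name: "lib.v2", Functions: []string{"lib.v2.FooV2"}},
		{Name: "lib", Functions: []string{"lib.InlinedInLaterCU"}}, // trailing leftover yield
	}
	fmt.Println("aggregated:", aggregate(preFix))
	fmt.Println("yielded more than once:", duplicateYields(preFix))
}
```

A check along the lines of `duplicateYields` is the kind of assertion a streaming test can make and an aggregating test cannot.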
### Describe how you validated your changes
- New unit tests for the origin-keyed index (`TestFuncOffsetByOriginIndex`) covering empty / single / multi / duplicate origins / early-break iteration across both in-memory and on-disk (mmap-backed) variants.
- `TestSymDBSnapshot` goldens across 4 Go toolchains and 2 architectures. The golden diff across the commit chain visibly demonstrates:
- pre-fix streaming output: 5 `lib` yields for `sample`, with a trailing duplicate yield containing `InlinedInLaterCU` + a broken `NewImmutableSet` stub.
- post-fix: 4 yields, with `InlinedInLaterCU` attached to `lib`'s primary yield and the broken stub removed.
- `TestIRGenAndCompileSymDBProbes` passes across all toolchain/arch combinations.
- Dyninst E2E tests (`Dyn/sample/1.25`) pass.
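For readers unfamiliar with the shape of an origin-keyed index, the sketch below shows one way to support the lookup patterns the unit tests exercise (empty, single, multiple, duplicate origins, early break). It is an in-memory illustration under assumed names (`entry`, `originIndex`, `forEach`, `collect`); the actual index type, and its mmap-backed variant, may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs an abstract-origin offset with one instance offset.
type entry struct {
	Origin   uint64
	Instance uint64
}

// originIndex keeps entries sorted by (Origin, Instance) so that all
// instances of one origin form a contiguous run.
type originIndex struct {
	entries []entry
}

func newOriginIndex(entries []entry) *originIndex {
	sorted := append([]entry(nil), entries...) // copy so the input is not mutated
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Origin != sorted[j].Origin {
			return sorted[i].Origin < sorted[j].Origin
		}
		return sorted[i].Instance < sorted[j].Instance
	})
	return &originIndex{entries: sorted}
}

// forEach visits every instance of origin in ascending order and stops
// early as soon as visit returns false.
func (ix *originIndex) forEach(origin uint64, visit func(instance uint64) bool) {
	i := sort.Search(len(ix.entries), func(i int) bool {
		return ix.entries[i].Origin >= origin
	})
	for ; i < len(ix.entries) && ix.entries[i].Origin == origin; i++ {
		if !visit(ix.entries[i].Instance) {
			return
		}
	}
}

// collect gathers up to limit instances of origin; limit <= 0 means no limit.
func collect(ix *originIndex, origin uint64, limit int) []uint64 {
	var out []uint64
	ix.forEach(origin, func(instance uint64) bool {
		out = append(out, instance)
		return limit <= 0 || len(out) < limit
	})
	return out
}

func main() {
	ix := newOriginIndex([]entry{
		{Origin: 0x20, Instance: 0x300},
		{Origin: 0x10, Instance: 0x200},
		{Origin: 0x10, Instance: 0x100},
	})
	fmt.Println(collect(ix, 0x10, 0)) // all instances of origin 0x10
	fmt.Println(collect(ix, 0x10, 1)) // early break after the first instance
	fmt.Println(collect(ix, 0x99, 0)) // unknown origin
}
```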
### Additional Notes
https://datadoghq.atlassian.net/browse/DEBUG-5497
Co-authored-by: andrew.werner <andrew.werner@datadoghq.com>

1 parent 4b1f0fd commit eed01b7
174 files changed
Lines changed: 18289 additions & 17195 deletions
File tree
- pkg/dyninst
- gosymname
- gotype/testdata/snapshot
- irgen/testdata/snapshot
- symdb
- testdata/snapshot
- testdata/decoded/sample
- testprogs/progs/sample
- lib.v2
- lib