Path to `0.3`...

Notes and ideas on how to improve the library, in large part based on learnings from 0.2*, likely in a backwards-incompatible manner:

- [ ] 1. unify `Inspector` & `Symbolizer` types
  - because both may cache similar data, users may end up with increased memory usage and additional work being performed when symbolizing and inspecting from the same symbol source

- [ ] 2. have first class differentiation between container and non-container formats
  - the unification of symbolization using container formats (kernel, process, APK) with that of single sources (ELF, Gsym, ...) is not the best idea
  - for process symbolization, for example, it would be nice to report the binary that an address falls into, even if ultimately an address could not be symbolized, but this data makes little sense in other contexts
  - similarly, we may want to report more detailed "module" information (see https://github.com/libbpf/blazesym/pull/1183) for these container formats

- [ ] 3. consider keeping copies of cached data internally
  - right now we `mmap` symbol sources and effectively use zero-copy parsing and then hand out mmap'ed data
  - this works well and is performant, but it is troublesome if users modify symbol source data behind our backs
  - but because we tie everything we report to the `Symbolizer` instance anyway, it may be beneficial to just have a bump allocator inside the `Symbolizer` instance and hand out data allocated there
  - this could improve locality, would allow us to release the mappings, and would be safer in the case of modified data

- [ ] 4. rework programmable dispatch (this is related to point 2., as it affects container formats)
  - we likely need a way to decide whether to invoke the "default" dispatch path before or after the programmable one, as both can make sense in different contexts
  - perhaps we may want to work with data from the file system unconditionally, to integrate with the `FileCache`
  - right now, because of the API design, we support arbitrary "resolvers" that don't expose any file system paths to the core library (the upside is that things could conceivably be kept in memory, but use of that is probably rare)

- [ ] 5. The on-demand created `KernelResolver` stuff is...weird. Perhaps it would be better to set relevant kernel data once for the `Symbolizer` object and not on a per-request basis. That would open the door to caching `KernelResolver` objects, which would allow us to move more logic in there.

- [ ] 6. our caching story has holes that are hard to plug with the current design; e.g., because users are allowed to create `ElfResolver` objects which are independent of the `Symbolizer`, it is easy [to accidentally circumvent the `Symbolizer` cache](https://github.com/libbpf/blazesym/pull/1535#discussion_r3088483509), potentially resulting in memory bloat (I recall this being a problem elsewhere)
  - if we created resolvers via the `Symbolizer` they belong to, we wouldn't have to worry about that (we'd also be able to remove APIs such as [`Symbolizer::register_elf_resolver`](https://docs.rs/blazesym/latest/blazesym/symbolize/struct.Symbolizer.html#method.register_elf_resolver))
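To make point 6 a bit more concrete, here is a rough sketch of what "resolvers created via the `Symbolizer` they belong to" could look like. Everything in it is hypothetical: the `Symbolizer` and `ElfResolver` types below are minimal stand-ins, not the current blazesym types, and the `elf_resolver` and `cached_resolvers` methods are made-up names. The only thing it is meant to show is that if the symbolizer is the sole factory, every resolver necessarily passes through its cache.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::rc::Rc;

/// Hypothetical stand-in for a parsed ELF symbol source (not the blazesym type).
#[derive(Debug)]
struct ElfResolver {
    path: PathBuf,
}

/// Hypothetical symbolizer that is the only place resolvers are created.
#[derive(Default)]
struct Symbolizer {
    /// Resolvers keyed by the path they were created from.
    elf_cache: RefCell<HashMap<PathBuf, Rc<ElfResolver>>>,
}

impl Symbolizer {
    /// Return the cached resolver for `path`, creating it on first use.
    /// (Assumed method name; real parsing work is elided.)
    fn elf_resolver(&self, path: &Path) -> Rc<ElfResolver> {
        let mut cache = self.elf_cache.borrow_mut();
        let resolver = cache.entry(path.to_path_buf()).or_insert_with(|| {
            Rc::new(ElfResolver {
                path: path.to_path_buf(),
            })
        });
        Rc::clone(resolver)
    }

    /// Number of distinct resolvers currently held in the cache.
    fn cached_resolvers(&self) -> usize {
        self.elf_cache.borrow().len()
    }
}

fn main() {
    let symbolizer = Symbolizer::default();
    let first = symbolizer.elf_resolver(Path::new("/usr/lib/libc.so.6"));
    let second = symbolizer.elf_resolver(Path::new("/usr/lib/libc.so.6"));

    // Both handles point at the same cached object; nothing was duplicated.
    assert!(Rc::ptr_eq(&first, &second));
    assert_eq!(symbolizer.cached_resolvers(), 1);
    println!("shared resolver for {}", first.path.display());
}
```

With a shape like this there would be no way to construct a resolver that the cache does not know about, which is why a separate registration API would no longer be needed. Whether the handle should be an `Rc`, a borrow tied to the symbolizer, or something else is an open design question.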

Path to `0.3`... #1320