Notes and ideas on how to improve the library, in large part based on lessons learned from 0.2*, likely in a backwards-incompatible manner:
1. unify Inspector & Symbolizer types
because both may cache similar data, users may end up with increased memory usage and additional work being performed when symbolizing and inspecting from the same symbol source
2. have first class differentiation between container and non-container formats
the unification of symbolization using container formats (kernel, process, APK) with that of single sources (ELF, Gsym, ...) is not the best idea
for process symbolization, for example, it would be nice to report the binary that an address falls into, even if ultimately an address could not be symbolized, but this data makes little sense in other contexts
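To make point 2 a bit more concrete, here is a rough sketch of what a container-specific result could look like. All names (`ProcessSymbolized`, `Mapping`, `symbolize_in_process`) are made up for illustration and are not part of the current API; the symbol table is a plain vector standing in for real ELF/DWARF data.

```rust
/// Hypothetical result of symbolizing an address inside a container
/// (here: a process). Not part of the current API.
#[derive(Debug, PartialEq)]
enum ProcessSymbolized {
    /// The address was fully symbolized.
    Sym { binary: String, name: String },
    /// The address falls into a known binary, but no symbol was found.
    BinaryOnly { binary: String },
    /// The address does not fall into any mapping.
    Unmapped,
}

/// Hypothetical process mapping: addresses in `start..end` belong to `binary`.
struct Mapping {
    start: u64,
    end: u64,
    binary: String,
    /// (offset, size, name) triples standing in for a real symbol table.
    syms: Vec<(u64, u64, String)>,
}

fn symbolize_in_process(maps: &[Mapping], addr: u64) -> ProcessSymbolized {
    let map = match maps.iter().find(|m| (m.start..m.end).contains(&addr)) {
        Some(map) => map,
        None => return ProcessSymbolized::Unmapped,
    };
    // Simplification: treat the offset into the mapping as the symbol address.
    let offset = addr - map.start;
    let sym = map
        .syms
        .iter()
        .find(|(start, size, _)| (*start..*start + *size).contains(&offset));
    match sym {
        Some((_, _, name)) => ProcessSymbolized::Sym {
            binary: map.binary.clone(),
            name: name.clone(),
        },
        // Even without a symbol we can still report the containing binary.
        None => ProcessSymbolized::BinaryOnly {
            binary: map.binary.clone(),
        },
    }
}
```

A single-source API (ELF, Gsym, ...) would not need the `BinaryOnly` variant at all, which is the kind of difference a first-class split would let the types express.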
3. consider keeping copies of cached data internally
right now we mmap symbol sources and effectively use zero-copy parsing and then hand out mmap'ed data
this works well and is performant, but it becomes a problem if users modify symbol source data behind our backs
but because we tie everything we report to the Symbolizer instance anyway, it may be beneficial to just have a bump allocator inside the Symbolizer instance and hand out data allocated there
this could improve locality, would allow us to release the memory mappings, and would be safer if the underlying data is modified
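As a rough illustration of point 3, the sketch below copies names out of a byte buffer (standing in for mmap'ed data) into storage owned by the symbolizer. To stay in safe, std-only Rust it uses a single growing buffer plus range handles instead of a real chunked bump allocator that hands out references directly; `StrArena`, `StrRef`, and `copy_names` are hypothetical names.

```rust
use std::ops::Range;

/// Hypothetical append-only arena: strings are copied into one growing buffer.
#[derive(Default)]
struct StrArena {
    buf: String,
}

/// Handle to a string stored in a `StrArena`.
#[derive(Clone, Debug, PartialEq)]
struct StrRef(Range<usize>);

impl StrArena {
    /// Copy `s` to the end of the buffer and return a handle to it.
    fn alloc(&mut self, s: &str) -> StrRef {
        let start = self.buf.len();
        self.buf.push_str(s);
        StrRef(start..self.buf.len())
    }

    /// Resolve a handle previously returned by `alloc` on this arena.
    fn get(&self, r: &StrRef) -> &str {
        &self.buf[r.0.clone()]
    }
}

/// Copy NUL-separated names out of `mapped` (standing in for mmap'ed data).
/// Names that are not valid UTF-8 are skipped in this sketch.
fn copy_names(arena: &mut StrArena, mapped: &[u8]) -> Vec<StrRef> {
    mapped
        .split(|b| *b == 0)
        .filter(|name| !name.is_empty())
        .filter_map(|name| std::str::from_utf8(name).ok())
        .map(|name| arena.alloc(name))
        .collect()
}
```

Once the names are copied, the source buffer can be dropped (or the mapping released) and the handles stay valid, at the cost of one copy per reported string.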
4. rework programmable dispatch (this is related to point 2., as it affects container formats)
we likely need a way to decide whether to invoke the "default" dispatch path before or after the user-provided one, as both can make sense in different contexts
perhaps we may want to work with data from the file system unconditionally, to integrate with the FileCache
right now, because of the API design, we support arbitrary "resolvers" that don't expose any file system paths to the core library (the upside is that things could conceivably be kept in memory, but use of that is probably rare)
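One possible shape for the ordering question in point 4, as a minimal sketch: the caller picks which path runs first and the other one acts as the fallback. `DispatchOrder` and `resolve` are hypothetical, and both dispatchers are reduced to functions from a member name to an optional file system path.

```rust
/// Hypothetical knob deciding which dispatch path gets the first attempt.
#[derive(Clone, Copy)]
enum DispatchOrder {
    /// Ask the user-provided dispatcher first, fall back to the default.
    CustomFirst,
    /// Try the default path first, fall back to the user-provided dispatcher.
    DefaultFirst,
}

/// Resolve `member` to a path by trying both dispatchers in the given order.
fn resolve(
    member: &str,
    order: DispatchOrder,
    custom: impl Fn(&str) -> Option<String>,
    default: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    match order {
        DispatchOrder::CustomFirst => custom(member).or_else(|| default(member)),
        DispatchOrder::DefaultFirst => default(member).or_else(|| custom(member)),
    }
}
```

Returning a path rather than an opaque resolver is what would let the result be fed into something like the FileCache, at the price of giving up purely in-memory sources.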
5. The on-demand created KernelResolver stuff is...weird. Perhaps it would be better to set relevant kernel data once for the Symbolizer object and not on a per-request basis. That would open the door to caching KernelResolver objects, which would allow us to move more logic in there.
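A minimal sketch of what point 5 could look like, assuming kernel data is fixed at construction time. The types below are simplified stand-ins that only borrow the names `Symbolizer` and `KernelResolver`; `KernelConfig`, its `kallsyms` field, and the `resolvers_created` counter are invented for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical kernel settings, provided once when the symbolizer is built.
struct KernelConfig {
    kallsyms: PathBuf,
}

/// Stand-in for the real resolver; only remembers where its data came from.
struct KernelResolver {
    kallsyms: PathBuf,
}

/// Stand-in symbolizer that owns the kernel settings and the cached resolver.
struct Symbolizer {
    kernel_cfg: KernelConfig,
    kernel: Option<KernelResolver>,
    /// Counts resolver constructions, for illustration only.
    resolvers_created: usize,
}

impl Symbolizer {
    fn new(kernel_cfg: KernelConfig) -> Self {
        Self {
            kernel_cfg,
            kernel: None,
            resolvers_created: 0,
        }
    }

    /// Create the kernel resolver on first use and reuse it afterwards.
    fn kernel_resolver(&mut self) -> &KernelResolver {
        if self.kernel.is_none() {
            self.resolvers_created += 1;
            self.kernel = Some(KernelResolver {
                kallsyms: self.kernel_cfg.kallsyms.clone(),
            });
        }
        self.kernel.as_ref().expect("resolver was just created")
    }
}
```

Because the configuration can no longer change between requests, the cached resolver never has to be invalidated, which is what makes moving more logic into it plausible.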
6. our caching story has holes that are hard to plug with the current design; e.g., because users are allowed to create ElfResolver objects which are independent of the Symbolizer, it is easy to accidentally circumvent the Symbolizer cache, potentially resulting in memory bloat (I recall this being a problem elsewhere)
if we created resolvers via the Symbolizer they belong to, we wouldn't have to worry about that (we'd also be able to remove APIs such as Symbolizer::register_elf_resolver)
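For point 6, a sketch of resolvers being handed out by the symbolizer itself, so that every resolver for a given path goes through the same cache. The types are simplified stand-ins that reuse the names `Symbolizer` and `ElfResolver`; the `elf_resolver` method and the use of `Rc` are assumptions, not a proposal for the final API.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::rc::Rc;

/// Stand-in for the real resolver. In an actual design its constructor
/// would not be public, so users could not build one behind the cache's back.
struct ElfResolver {
    path: PathBuf,
}

/// Stand-in symbolizer that is the only place resolvers come from.
#[derive(Default)]
struct Symbolizer {
    elf_cache: HashMap<PathBuf, Rc<ElfResolver>>,
}

impl Symbolizer {
    /// Return the cached resolver for `path`, creating it if necessary.
    fn elf_resolver(&mut self, path: &Path) -> Rc<ElfResolver> {
        let resolver = self
            .elf_cache
            .entry(path.to_path_buf())
            .or_insert_with(|| {
                Rc::new(ElfResolver {
                    path: path.to_path_buf(),
                })
            });
        Rc::clone(resolver)
    }
}
```

With this shape there is nothing left to register after the fact, since a resolver cannot exist without already being in the cache.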