You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This finding was identified during an agentic unsafe Rust code review performed by Gemini AI, followed by human review and verification.
The Issue
V2Serializer::serialize reserves capacity in self.buf (Vec<u8>) using self.buf.reserve(max_size) and subsequently sets the length of the vector to max_size inside an unsafe block without initializing the newly allocated elements:
// want to treat the rest of the vec as a slice, and we've already reserved this
// space, so this way we don't have to resize() on a lot of dummy bytes.
self.buf.set_len(max_size);
}
While reserve(max_size) ensures sufficient allocation capacity, reserve allocates uninitialized memory from the global allocator without zeroing or writing valid byte values into it. Consequently, when self.buf.set_len(max_size) is executed, the elements from index V2_HEADER_SIZE (40) to max_size are completely uninitialized.
This itself is not UB, however taking a slice to uninitialized memory is:
let counts_len = encode_counts(h,&mutself.buf[V2_HEADER_SIZE..])?;
Finally, encode_counts only writes up to total_len <= max_size, leaving any trailing capacity total_len..max_size permanently uninitialized while included in self.buf.len().
Any caller invoking V2Serializer::serialize (or Serializer::serialize) on a histogram triggers this UB.
Attempt at Minimal Reproduction (Miri)
Unfortunately, miri does not currently transitively check slice validity (I guess it's expensive). The hacky way to repro this is to show that the vector is in an uninitialized state, but you need unsafe code to pull it out.
The UB has already happened in the .serialize() line, before the unsafe code; Miri just doesn't catch it.
use hdrhistogram::serialization::{Serializer,V2Serializer};use hdrhistogram::Histogram;fnmain(){letmut h = Histogram::<u64>::new(3).unwrap();
h.record(42).unwrap();
h.record(1000).unwrap();letmut buf = Vec::new();letmut serializer = V2Serializer::new();let total_len = serializer.serialize(&h,&mut buf).unwrap();// V2Serializer contains a single field `buf: Vec<u8>`.// We can cast a pointer to V2Serializer to a pointer to Vec<u8> to access it.let internal_buf:&Vec<u8> =
unsafe{&*(&serializer as*constV2Serializeras*constVec<u8>)};// Read the uninitialized bytes beyond total_len (which are within internal_buf.len())if internal_buf.len() > total_len {println!("Reading uninitialized byte: {}", internal_buf[total_len]);}}
Running under Miri produces the following trace:
error: Undefined Behavior: reading memory at alloc6709[0x2d..0x2e], but memory is uninitialized at [0x2d..0x2e], and this operation requires initialized memory
--> /usr/local/google/home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/num.rs:143:55
|
143 | unsafe { f.pad_integral(true, "", self._fmt(&mut buf)) }
| ^^^^ Undefined Behavior occurred here
...
598 | impl_Display!(i8, u8, i16, u16, i32, u32, i64, u64, isize, usize; as u64 into display_u64);
| ------------------------------------------------------------------------------------------ in this macro invocation
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: stack backtrace:
0: core::fmt::num::imp::<impl std::fmt::Display for u8>::fmt
at /usr/local/google/home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/num.rs:143:55: 143:59
1: core::fmt::rt::Argument::<'_>::fmt
at /usr/local/google/home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:152:76: 152:95
2: std::fmt::write
at /usr/local/google/home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1685:17: 1687:80
Note
The full audit report below also contains additional minor findings (such as missing safety comments or undocumented FFI assumptions) that are probably worth fixing as well but not the primary goal of this issue. The audit report has not been human-reviewed, it may contain misleading claims.
Full Audit Report
Unsafe Rust Review: hdrhistogram (v7)
Overall Safety Assessment
hdrhistogram (v7) is a Rust port of the high-dynamic-range histogram library originally developed in Java. The crate exhibits a very low density of unsafe Rust code overall, keeping almost all core histogram bucketing, indexing, recording, concurrency (SyncHistogram), and iteration logic within safe Rust boundaries.
The entire unsafe surface area of the crate is confined to a single file, src/serialization/v2_serializer.rs, containing exactly three unsafe blocks. However, auditing this localized unsafe code under strict Rust Reference and standard library proof-obligation semantics reveals significant issues:
One critical soundness vulnerability (Undefined Behavior) involving Vec::set_len on uninitialized heap capacity.
One fishy finding involving an off-by-one assertion intended to guard unchecked slice accesses (get_unchecked).
Complete absence of formal // SAFETY: proof comments across all three unsafe blocks.
Aside from the V2 serialization module, the crate's architecture is sound. Concurrency primitives (SyncHistogram and Recorder) build entirely on safe abstractions (Arc, Mutex, and crossbeam_channel) without manual Send or Sync implementations. Remediating the V2 serializer issues is straightforward and would bring the crate to complete soundness.
Description:
In V2Serializer::serialize, self.buf (Vec<u8>) is cleared to length 0, and self.buf.reserve(max_size) is called to ensure sufficient capacity for encoded histogram data. At line 96, the author calls self.buf.set_len(max_size) inside an unsafe block with the intention of avoiding the cost of zeroing memory during buffer resizing.
Proof of Soundness Violation:
According to the standard library documentation for std::vec::Vec::set_len, the caller must uphold two safety preconditions:
new_len must be less than or equal to capacity().
The elements at old_len..new_len must be initialized.
While reserve(max_size) satisfies the capacity precondition, reserve allocates uninitialized memory from the global allocator without zeroing or writing valid byte values into it. When self.buf.set_len(max_size) is executed, the elements from index V2_HEADER_SIZE (40) to max_size are completely uninitialized.
According to the Rust Reference section Behavior considered undefined:
Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation... An uninitialized integer (i*/u*), floating point value (f*), or raw pointer.
Because u8 does not permit uninitialized byte values (unlike MaybeUninit<u8>), setting the length of a Vec<u8> to include uninitialized elements immediately produces invalid values. Furthermore, passing &mut self.buf[V2_HEADER_SIZE..] to encode_counts creates a reference &mut [u8] pointing to uninitialized bytes, violating the fundamental Rust invariant that references must point to valid data. Finally, encode_counts only writes up to total_len <= max_size, leaving the trailing slice total_len..max_size permanently uninitialized while included in self.buf.len().
Remediation:
Replace self.buf.reserve(max_size) and unsafe { self.buf.set_len(max_size) } with safe buffer resizing: self.buf.resize(max_size, 0). Alternatively, if avoiding zeroing overhead is strictly necessary, self.buf should be converted to Vec<MaybeUninit<u8>> or written to via spare_capacity_mut(), calling set_len only after elements have been written and initialized.
Description:
In encode_counts, histogram counts are iterated using an index up to index_limit, which is computed as h.index_for(h.max()). The iteration loop while index <= index_limit accesses elements via unsafe { *(h.counts.get_unchecked(index)) } at lines 149 and 161.
Analysis:
According to slice::get_unchecked safety contract, calling get_unchecked with an out-of-bounds index is instant Undefined Behavior. Since the loop executes when index == index_limit, sound execution requires that index_limit < h.counts.len().
At line 145, the author wrote an assertion intended to establish this invariant:
assert!(index_limit <= h.counts.len());
Note the use of <= instead of <. If index_limit == h.counts.len(), this assertion passes, but subsequent evaluation of h.counts.get_unchecked(index_limit) accesses element h.counts.len(), causing out-of-bounds memory access and Undefined Behavior.
In practice, for any uncorrupted Histogram, counts is resized during recording to cover highest_trackable_value, and index_for(h.max()) is strictly less than h.counts.len(). The author included a helpful inline note explaining their intent: // index is inside h.counts because of the assert above. This shows good maintainer intent to document why the unchecked access is safe. However, because the assertion at line 145 uses <= instead of <, expanding this check to < and formalizing the inline note into an explicit // SAFETY: comment will make the safety invariant robust and self-documenting for future maintainers.
Remediation:
Correct the assertion to strictly check in-bounds indices: assert!(index_limit < h.counts.len());.
Missing Safety Comments
None of the three unsafe blocks in the crate possess formal // SAFETY: comments outlining why proof obligations are met. Informal comments explain the author's optimization and safety intent, but formalizing them into explicit // SAFETY: comments will document the preconditions for future maintainers.
**1. src/serialization/v2_serializer.rs:96**🟠
-Context: unsafe { self.buf.set_len(max_size); }
-Missing Proof Obligation: Must prove max_size <= self.buf.capacity() and that elements in 0..max_size are initialized. (Currently unsound as noted in Critical Findings).
-Proposed Proof Comment(if refactored to initialize memory prior to set_len):
// SAFETY: `self.buf` has capacity >= `max_size` due to `reserve(max_size)` on line 79. All elements in `0..max_size` have been properly initialized prior to setting the length.
(Note: If remediated using safe code like self.buf.resize(max_size, 0), the unsafe block and comment are completely removed).
**2. src/serialization/v2_serializer.rs:149**🟡
-Context: let count = unsafe { *(h.counts.get_unchecked(index)) };
-Missing Proof Obligation: Must prove index < h.counts.len().
-Proposed Proof Comment(assuming line 145 assertion is corrected to <):
// SAFETY: The outer loop condition ensures `index <= index_limit`. The assertion on line 145 guarantees `index_limit < h.counts.len()`, proving that `index < h.counts.len()` holds for every iteration.
**3. src/serialization/v2_serializer.rs:161**🟡
-Context: && (unsafe { *(h.counts.get_unchecked(index)) } == T::zero())
-Missing Proof Obligation: Must prove index < h.counts.len() during the inner zero-skipping loop.
-Proposed Proof Comment:
// SAFETY: The inner loop condition explicitly checks `index <= index_limit` before evaluating `get_unchecked(index)`. Since `index_limit < h.counts.len()` (guaranteed by line 145), `index` is strictly within bounds of `h.counts`.
Note
This finding was identified during an agentic unsafe Rust code review performed by Gemini AI, followed by human review and verification.
The Issue
V2Serializer::serializereserves capacity inself.buf(Vec<u8>) usingself.buf.reserve(max_size)and subsequently sets the length of the vector tomax_sizeinside anunsafeblock without initializing the newly allocated elements:HdrHistogram_rust/src/serialization/v2_serializer.rs
Lines 96 to 100 in a3818d6
While
reserve(max_size)ensures sufficient allocation capacity,reserveallocates uninitialized memory from the global allocator without zeroing or writing valid byte values into it. Consequently, whenself.buf.set_len(max_size)is executed, the elements from indexV2_HEADER_SIZE(40) tomax_sizeare completely uninitialized.This itself is not UB, however taking a slice to uninitialized memory is:
HdrHistogram_rust/src/serialization/v2_serializer.rs
Line 102 in a3818d6
Finally,
encode_countsonly writes up tototal_len <= max_size, leaving any trailing capacitytotal_len..max_sizepermanently uninitialized while included inself.buf.len().Any caller invoking
V2Serializer::serialize(orSerializer::serialize) on a histogram triggers this UB.Attempt at Minimal Reproduction (Miri)
Unfortunately, miri does not currently transitively check slice validity (I guess it's expensive). The hacky way to repro this is to show that the vector is in an uninitialized state, but you need unsafe code to pull it out.
The UB has already happened in the
.serialize()line, before theunsafecode; Miri just doesn't catch it.Running under Miri produces the following trace:
Note
The full audit report below also contains additional minor findings (such as missing safety comments or undocumented FFI assumptions) that are probably worth fixing as well but not the primary goal of this issue. The audit report has not been human-reviewed, it may contain misleading claims.
Full Audit Report
Unsafe Rust Review:
hdrhistogram(v7)Overall Safety Assessment
hdrhistogram(v7) is a Rust port of the high-dynamic-range histogram library originally developed in Java. The crate exhibits a very low density ofunsafeRust code overall, keeping almost all core histogram bucketing, indexing, recording, concurrency (SyncHistogram), and iteration logic within safe Rust boundaries.The entire
unsafesurface area of the crate is confined to a single file,src/serialization/v2_serializer.rs, containing exactly threeunsafeblocks. However, auditing this localizedunsafecode under strict Rust Reference and standard library proof-obligation semantics reveals significant issues:Vec::set_lenon uninitialized heap capacity.get_unchecked).// SAFETY:proof comments across all threeunsafeblocks.Aside from the V2 serialization module, the crate's architecture is sound. Concurrency primitives (
SyncHistogramandRecorder) build entirely on safe abstractions (Arc,Mutex, andcrossbeam_channel) without manualSendorSyncimplementations. Remediating the V2 serializer issues is straightforward and would bring the crate to complete soundness.Critical Findings
1. Unsound buffer initialization via⚠️
Vec::set_lenexposing uninitializedu8values 🔴Priority: 🔴 High
Threat Vector:⚠️ Accidental Misuse
Bug Type:
Uninitialized Memory ExposureLocation:
src/serialization/v2_serializer.rs:96Description:
In
V2Serializer::serialize,self.buf(Vec<u8>) is cleared to length 0, andself.buf.reserve(max_size)is called to ensure sufficient capacity for encoded histogram data. At line 96, the author callsself.buf.set_len(max_size)inside anunsafeblock with the intention of avoiding the cost of zeroing memory during buffer resizing.Proof of Soundness Violation:
According to the standard library documentation for
std::vec::Vec::set_len, the caller must uphold two safety preconditions:new_lenmust be less than or equal tocapacity().old_len..new_lenmust be initialized.While
reserve(max_size)satisfies the capacity precondition,reserveallocates uninitialized memory from the global allocator without zeroing or writing valid byte values into it. Whenself.buf.set_len(max_size)is executed, the elements from indexV2_HEADER_SIZE(40) tomax_sizeare completely uninitialized.According to the Rust Reference section Behavior considered undefined:
Because
u8does not permit uninitialized byte values (unlikeMaybeUninit<u8>), setting the length of aVec<u8>to include uninitialized elements immediately produces invalid values. Furthermore, passing&mut self.buf[V2_HEADER_SIZE..]toencode_countscreates a reference&mut [u8]pointing to uninitialized bytes, violating the fundamental Rust invariant that references must point to valid data. Finally,encode_countsonly writes up tototal_len <= max_size, leaving the trailing slicetotal_len..max_sizepermanently uninitialized while included inself.buf.len().Remediation:
Replace
self.buf.reserve(max_size)andunsafe { self.buf.set_len(max_size) }with safe buffer resizing:self.buf.resize(max_size, 0). Alternatively, if avoiding zeroing overhead is strictly necessary,self.bufshould be converted toVec<MaybeUninit<u8>>or written to viaspare_capacity_mut(), callingset_lenonly after elements have been written and initialized.Fishy Findings
1. Off-by-one assertion guarding⚠️
slice::get_uncheckedaccesses 🟠Priority: 🟠 Medium
Threat Vector:⚠️ Accidental Misuse
Bug Type:
Improper Bounds CheckLocation:
src/serialization/v2_serializer.rs:145Description:
In
encode_counts, histogram counts are iterated using an index up toindex_limit, which is computed ash.index_for(h.max()). The iteration loopwhile index <= index_limitaccesses elements viaunsafe { *(h.counts.get_unchecked(index)) }at lines 149 and 161.Analysis:
According to
slice::get_uncheckedsafety contract, callingget_uncheckedwith an out-of-bounds index is instant Undefined Behavior. Since the loop executes whenindex == index_limit, sound execution requires thatindex_limit < h.counts.len().At line 145, the author wrote an assertion intended to establish this invariant:
Note the use of
<=instead of<. Ifindex_limit == h.counts.len(), this assertion passes, but subsequent evaluation ofh.counts.get_unchecked(index_limit)accesses elementh.counts.len(), causing out-of-bounds memory access and Undefined Behavior.In practice, for any uncorrupted
Histogram,countsis resized during recording to coverhighest_trackable_value, andindex_for(h.max())is strictly less thanh.counts.len(). The author included a helpful inline note explaining their intent:// index is inside h.counts because of the assert above. This shows good maintainer intent to document why the unchecked access is safe. However, because the assertion at line 145 uses<=instead of<, expanding this check to<and formalizing the inline note into an explicit// SAFETY:comment will make the safety invariant robust and self-documenting for future maintainers.Remediation:
Correct the assertion to strictly check in-bounds indices:
assert!(index_limit < h.counts.len());.Missing Safety Comments
None of the three
unsafeblocks in the crate possess formal// SAFETY:comments outlining why proof obligations are met. Informal comments explain the author's optimization and safety intent, but formalizing them into explicit// SAFETY:comments will document the preconditions for future maintainers.**1.
src/serialization/v2_serializer.rs:96**🟠-Context:
unsafe { self.buf.set_len(max_size); }-Missing Proof Obligation: Must prove
max_size <= self.buf.capacity()and that elements in0..max_sizeare initialized. (Currently unsound as noted in Critical Findings).-Proposed Proof Comment(if refactored to initialize memory prior to
set_len):// SAFETY: `self.buf` has capacity >= `max_size` due to `reserve(max_size)` on line 79. All elements in `0..max_size` have been properly initialized prior to setting the length.(Note: If remediated using safe code like
self.buf.resize(max_size, 0), theunsafeblock and comment are completely removed).**2.
src/serialization/v2_serializer.rs:149**🟡-Context:
let count = unsafe { *(h.counts.get_unchecked(index)) };-Missing Proof Obligation: Must prove
index < h.counts.len().-Proposed Proof Comment(assuming line 145 assertion is corrected to
<):// SAFETY: The outer loop condition ensures `index <= index_limit`. The assertion on line 145 guarantees `index_limit < h.counts.len()`, proving that `index < h.counts.len()` holds for every iteration.**3.
src/serialization/v2_serializer.rs:161**🟡-Context:
&& (unsafe { *(h.counts.get_unchecked(index)) } == T::zero())-Missing Proof Obligation: Must prove
index < h.counts.len()during the inner zero-skipping loop.-Proposed Proof Comment:
// SAFETY: The inner loop condition explicitly checks `index <= index_limit` before evaluating `get_unchecked(index)`. Since `index_limit < h.counts.len()` (guaranteed by line 145), `index` is strictly within bounds of `h.counts`.