From 6f4826dfddacd1ba321acd4260acdac15543f225 Mon Sep 17 00:00:00 2001
From: Peter Lawrey
Date: Fri, 21 Nov 2025 20:59:28 +0000
Subject: [PATCH 01/18] Update documentation formatting and add decision log for hashing architecture

---
 src/main/docs/algorithm-profiles.adoc        | 31 ++++----
 src/main/docs/architecture-overview.adoc     | 14 ++--
 src/main/docs/change-log-template.adoc       |  5 +-
 src/main/docs/decision-log.adoc              | 78 ++++++++++++++++++++
 src/main/docs/invariants-and-contracts.adoc  | 15 ++--
 src/main/docs/performance-benchmarks.adoc    |  7 +-
 src/main/docs/project-requirements.adoc      | 62 ++++++++++++++++
 src/main/docs/specifications.adoc            |  7 +-
 src/main/docs/testing-strategy.adoc          |  8 +-
 src/main/docs/unsafe-and-platform-notes.adoc |  7 +-
 10 files changed, 191 insertions(+), 43 deletions(-)
 create mode 100644 src/main/docs/decision-log.adoc
 create mode 100644 src/main/docs/project-requirements.adoc

diff --git a/src/main/docs/algorithm-profiles.adoc b/src/main/docs/algorithm-profiles.adoc
index f62b785..bbe8d29 100644
--- a/src/main/docs/algorithm-profiles.adoc
+++ b/src/main/docs/algorithm-profiles.adoc
@@ -1,18 +1,19 @@
-== Algorithm Profiles
+= Algorithm Profiles
+:toc:
+:lang: en-GB
+:source-highlighter: rouge
 :pp: ++

 Chronicle Software

-toc::[]
-
 === CityHash 1.1

 Factories ::
 `LongHashFunction.city_1_1()`, `.city_1_1(long)`, `.city_1_1(long, long)` (`LongHashFunction.java:53-115`).
 Implementation ::
-`net.openhft.hashing.CityAndFarmHash_1_1` ports Google’s CityHash64 v1.1 (`CityAndFarmHash_1_1.java`).
+`net.openhft.hashing.CityAndFarmHash_1_1` ports Google's CityHash64 v1.1 (`CityAndFarmHash_1_1.java`).
 Key traits ::
-* Normalises inputs to little-endian and forwards short-length cases to specialised mix routines (1–3, 4–7, 8–16 byte fast paths).
+* Normalises inputs to little-endian and forwards short-length cases to specialised mix routines (1-3, 4-7, 8-16 byte fast paths).
 * Produces identical output across host endianness; big-endian incurs the expected byte swapping cost.
 * Provides seedless, single-seed, and dual-seed variants mirroring the upstream API.

@@ -31,17 +32,17 @@ Key traits ::
 Factories ::
 `LongHashFunction.farmUo()`, `.farmUo(long)`, `.farmUo(long, long)` (`LongHashFunction.java:181-243`).
 Implementation ::
-Also hosted in `CityAndFarmHash_1_1`, which covers the 1.1 update’s longer pipelines.
+Also hosted in `CityAndFarmHash_1_1`, which covers the 1.1 update's longer pipelines.
 Key traits ::
-* Maintains parity with Google’s C{pp} release for test vectors.
-* Endianness neutral: always routes through an `Access` view that matches the algorithm’s little-endian assumptions.
+* Maintains parity with Google's C{pp} release for test vectors.
+* Endianness neutral: always routes through an `Access` view that matches the algorithm's little-endian assumptions.

 === MurmurHash3

 Factories ::
 `LongHashFunction.murmur_3()`, `.murmur_3(long)` for 64-bit (`LongHashFunction.java:245-268`); `LongTupleHashFunction.murmur_3()`, `.murmur_3(long)` for 128-bit (`LongTupleHashFunction.java:35-69`).
 Implementation ::
-`net.openhft.hashing.MurmurHash_3` adapts Austin Appleby’s x64 variants.
+`net.openhft.hashing.MurmurHash_3` adapts Austin Appleby's x64 variants.
 It extends `DualHashFunction` so the 128-bit engine also exposes the low 64 bits through `LongHashFunction`.
 Key traits ::
 * Little-endian canonicalisation via `Access.byteOrder`.
@@ -54,7 +55,7 @@ Factories ::
 Implementation ::
 `net.openhft.hashing.XxHash` ports the official XXH64 reference and keeps the unsigned prime constants as signed Java longs.
 Key traits ::
-* Uses four-lane accumulation for ≥32 byte inputs, matching upstream behaviour bit-for-bit.
+* Uses four-lane accumulation for >= 32 byte inputs, matching upstream behaviour bit-for-bit.
 * Applies the canonical avalanche round in `XxHash.finalize` for all lengths.
 * Seeded and seedless instances differ only by the stored `seed()` override; serialisation preserves both forms.

@@ -67,7 +68,7 @@ Implementation ::
 `net.openhft.hashing.XXH3` keeps the FARSH-derived 192 byte secret and streaming logic.
 It defines distinct entry points for 64-bit, 128-bit, and low-64-bit projections.
 Key traits ::
-* Optimises for short messages with dedicated 1–3, 4–8, 9–16, 17–128, and 129–240 byte paths.
+* Optimises for short messages with dedicated 1-3, 4-8, 9-16, 17-128, and 129-240 byte paths.
 * Uses `UnsafeAccess.INSTANCE.byteOrder(null, LITTLE_ENDIAN)` once to avoid per-call adapter allocation.
 * The 128-bit variant reuses the same mixing core; exposing the low 64 bits avoids extra copies for callers that only need a single `long`.

@@ -76,10 +77,10 @@ Key traits ::
 Factories ::
 `LongHashFunction.wy_3()`, `.wy_3(long)` (`LongHashFunction.java:343-369`).
 Implementation ::
-`net.openhft.hashing.WyHash` mirrors Wang Yi’s version 3 reference, including the `_wymum` 128-bit multiply-fold helper built on `Maths.unsignedLongMulXorFold`.
+`net.openhft.hashing.WyHash` mirrors Wang Yi's version 3 reference, including the `_wymum` 128-bit multiply-fold helper built on `Maths.unsignedLongMulXorFold`.
 Key traits ::
 * Supports streaming chunks up to 256 bytes per loop iteration; beyond that it accumulates in 32 byte strides.
-* Handles ≤3, ≤8, ≤16, ≤24, ≤32 byte inputs with the same branching as the C code.
+* Handles <=3, <=8, <=16, <=24, <=32 byte inputs with the same branching as the C code.
 * Maintains deterministic output across architectures while acknowledging the performance hit on big-endian systems.

 === MetroHash (metrohash64_2)
@@ -87,8 +88,8 @@ Key traits ::
 Factories ::
 `LongHashFunction.metro()`, `.metro(long)` (`LongHashFunction.java:371-389`).
 Implementation ::
-`net.openhft.hashing.MetroHash` implements the 64-bit metrohash variant with the `_2` initialisation vector, matching the original author’s reference.
+`net.openhft.hashing.MetroHash` implements the 64-bit metrohash variant with the `_2` initialisation vector, matching the original author's reference.
 Key traits ::
-* Performs four-lane unrolled mixing for ≥32 byte inputs and cascades down to 16, 8, 4, 2, and 1 byte tails.
+* Performs four-lane unrolled mixing for >=32 byte inputs and cascades down to 16, 8, 4, 2, and 1 byte tails.
 * Uses deterministic finalisation (`MetroHash.finalize`) shared by scalar and streaming paths.
 * Seeded instances override `seed()` and cache the pre-hashed `hashVoid()` constant to avoid re-computation.
diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc
index 72775ef..b5e7e7a 100644
--- a/src/main/docs/architecture-overview.adoc
+++ b/src/main/docs/architecture-overview.adoc
@@ -1,21 +1,21 @@
-== Zero-Allocation Hashing Architecture Overview
-:pp: ++
+= Zero-Allocation Hashing Architecture Overview
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === Entry Points

 * `net.openhft.hashing.LongHashFunction` is the primary façade for 64-bit hashes.
-It exposes factory methods for CityHash 1.1, FarmHash (NA and UO variants), MurmurHash3, xxHash, XXH3 (64-bit), wyHash v3, and MetroHash (`LongHashFunction.java`).
+It exposes factory methods for CityHash 1.1, FarmHash (NA and UO variants), MurmurHash_3, xxHash, XXH3 (64-bit), wyHash v3, and MetroHash (`LongHashFunction.java`).
 * `net.openhft.hashing.LongTupleHashFunction` provides multi-word hash results.
-It currently delivers 128-bit MurmurHash3 and XXH3 outputs and mirrors the single-word API with reusable `long[]` buffers (`LongTupleHashFunction.java`).
+It currently delivers 128-bit MurmurHash_3 and XXH3 outputs and mirrors the single-word API with reusable `long[]` buffers (`LongTupleHashFunction.java`).
 * `net.openhft.hashing.DualHashFunction` bridges tuple implementations back into the `LongHashFunction` contract, ensuring seeded XXH128 and similar algorithms can expose both 64-bit and 128-bit variants without duplicating logic (`DualHashFunction.java`).

 === Memory Access Abstractions

-* All hashing flows rely on `net.openhft.hashing.Access` to read primitive values from arrays, direct buffers, off-heap memory, or custom structures. `Access.byteOrder(input, desiredOrder)` returns a view that matches the algorithm’s expected endianness (`Access.java:273-308`).
+* All hashing flows rely on `net.openhft.hashing.Access` to read primitive values from arrays, direct buffers, off-heap memory, or custom structures. `Access.byteOrder(input, desiredOrder)` returns a view that matches the algorithm's expected endianness (`Access.java:273-308`).
 * Concrete strategies cover heap arrays (`UnsafeAccess.INSTANCE`), `ByteBuffer` (`ByteBufferAccess`), `CharSequence` in native or explicit byte order (`CharSequenceAccess`), and compact Latin-1 backed strings (`CompactLatin1CharSequenceAccess`).
 * `UnsafeAccess` wraps `sun.misc.Unsafe` for zero-copy reads, falling back to legacy helpers when `getByte` or `getShort` are absent (e.g., pre-Nougat Android) (`UnsafeAccess.java:40-118`).
 * Reverse-order wrappers are generated automatically through `Access.newDefaultReverseAccess`, allowing algorithms to treat every source as little-endian while still accepting big-endian buffers (`Access.java:295-344`).
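
Reviewer note: the byte-order bullets in the hunk above can be illustrated without the library. The following is a minimal sketch, not the library's `Access` API. The class `LittleEndianReadSketch` and both method names are hypothetical, and only `java.nio.ByteBuffer` and `Long.reverseBytes` are used. It shows why a direct little-endian read and a big-endian read followed by a byte swap (the idea behind a reverse-order wrapper) give the same 64-bit word on any host.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not part of net.openhft.hashing.
final class LittleEndianReadSketch {

    // Reads 8 bytes at 'off' as a little-endian long, whatever the host byte order is.
    static long readLongLittleEndian(byte[] data, int off) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong(off);
    }

    // Emulates a reverse-order wrapper: read the word big-endian, then swap its bytes.
    static long readLongViaReverse(byte[] data, int off) {
        long bigEndian = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN).getLong(off);
        return Long.reverseBytes(bigEndian);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        // Both lines print the same hexadecimal value.
        System.out.println(Long.toHexString(readLongLittleEndian(data, 0)));
        System.out.println(Long.toHexString(readLongViaReverse(data, 0)));
    }
}
```

Because both paths agree, an algorithm written against the little-endian view sees identical words for identical input bytes; only the cost of the swap differs between platforms.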
diff --git a/src/main/docs/change-log-template.adoc b/src/main/docs/change-log-template.adoc
index da67691..52e8ae4 100644
--- a/src/main/docs/change-log-template.adoc
+++ b/src/main/docs/change-log-template.adoc
@@ -1,4 +1,7 @@
-== Change Log Template
+= Change Log Template
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

diff --git a/src/main/docs/decision-log.adoc b/src/main/docs/decision-log.adoc
new file mode 100644
index 0000000..9da323f
--- /dev/null
+++ b/src/main/docs/decision-log.adoc
@@ -0,0 +1,78 @@
+= Decision Log
+:toc:
+:lang: en-GB
+:source-highlighter: rouge
+
+Chronicle Software
+
+== ADR-001: Usage of `sun.misc.Unsafe` for Memory Access
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | To achieve the "Zero-Allocation" goal and maximize throughput, standard Java array access checks and byte-by-byte reading overhead are prohibitive.
+| Decision | Use `sun.misc.Unsafe` (`UnsafeAccess.java`) to perform raw memory reads (getLong, getInt) directly from heap arrays and off-heap addresses.
+| Consequences |
+* **Positive:** Maximal performance; enables "type punning" (reading bytes as longs) efficiently.
+* **Negative:** Depends on internal JDK APIs. Requires specific `--add-opens` flags on modularized JDKs (Java 9+).
+|===
+
+== ADR-002: Little-Endian Canonicalization
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | Different hardware architectures store multibyte primitives in different orders (Little-Endian vs Big-Endian). Hashing algorithms (often ported from C++) usually assume a specific order (mostly LE).
+| Decision | All implementations must normalize input to Little-Endian before processing (`Primitives.nativeToLittleEndian`, `Access.byteOrder`). The `Access` abstraction handles the detection and necessary byte swapping.
+| Consequences |
+* **Positive:** Guarantees identical hash results for the same byte sequence regardless of the hardware platform (x86 vs s390x).
+* **Negative:** Incurs a performance penalty on Big-Endian platforms due to the overhead of byte-swapping during reads.
+|===
+
+== ADR-003: Abstraction via `Access` Strategy pattern
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | The library needs to hash data residing in various formats: `byte[]`, `ByteBuffer`, `CharSequence`, and raw memory addresses, without duplicating the complex hashing logic for each source.
+| Decision | Implement an `Access` strategy pattern. The hashing algorithms are written against the `Access` interface, which defines how to read 1 to 8 bytes from a given object `T` at a specific offset.
+| Consequences |
+* **Positive:** Eliminates code duplication; a single algorithm implementation supports all input types. Allows users to implement custom `Access` for their own POJOs.
+* **Negative:** Virtual method dispatch could theoretically introduce overhead, though HotSpot inlining generally mitigates this.
+|===
+
+== ADR-004: Reflective String Value Access (Compact Strings)
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | Java 9 introduced "Compact Strings" (JEP 254), changing the internal representation of Strings from `char[]` (UTF-16) to `byte[]` (Latin-1 or UTF-16). Standard `CharSequence` methods involve copying or decoding overhead.
+| Decision | Use reflection and `Unsafe` to inspect the internal `value` field of `java.lang.String`. Detect if the JVM uses compact strings and, if the string is Latin-1, use `CompactLatin1CharSequenceAccess` to treat the backing byte array directly.
+| Consequences |
+* **Positive:** Zero-copy, zero-allocation hashing for Strings on modern JVMs.
+* **Negative:** Extremely brittle; depends on private implementation details of `java.lang.String`. Requires defensive fallbacks (`UnknownJvmStringHash`) for non-HotSpot or future JVMs.
+|===
+
+== ADR-005: Stateless Immutable Hash Functions
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | Hash functions are often used in concurrent environments (e.g., web servers, high-frequency trading).
+| Decision | `LongHashFunction` instances are fully stateless and immutable. Seeds are baked in at construction time.
+| Consequences |
+* **Positive:** Instances are inherently thread-safe and can be stored in `static final` fields or singletons. No need for synchronization or `ThreadLocal`.
+* **Negative:** To change a seed, a new object must be allocated (though this is typically a setup-time cost, not a runtime cost).
+|===
+
+== ADR-006: Handling 128-bit Hashes via `LongTupleHashFunction`
+
+[cols="1,3"]
+|===
+| Status | Accepted
+| Context | Java primitives are limited to 64 bits (`long`). Algorithms like MurmurHash3 and XXH128 produce 128-bit output.
+| Decision | Introduce `LongTupleHashFunction` which accepts a `long[]` buffer to write the results into. Provide `DualHashFunction` to allow viewing the 128-bit function as a 64-bit function (returning the lower 64 bits).
+| Consequences |
+* **Positive:** Supports >64-bit hashes without allocating an Object wrapper (like `BigInteger` or a custom class).
+* **Negative:** The `long[]` result array must be managed/reused by the caller to maintain the "zero-allocation" promise.
+|===
diff --git a/src/main/docs/invariants-and-contracts.adoc b/src/main/docs/invariants-and-contracts.adoc
index 57697e9..ae5624b 100644
--- a/src/main/docs/invariants-and-contracts.adoc
+++ b/src/main/docs/invariants-and-contracts.adoc
@@ -1,12 +1,13 @@
-== Invariants and Contracts
+= Invariants and Contracts
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === Hash Interface Guarantees

-* Every `LongHashFunction` and `LongTupleHashFunction` implementation treats primitives as if they were written to memory using the platform’s native byte order; the API therefore guarantees that `hashLong(v)` equals `hashLongs(new long[] {v})` and similar array forms (`LongHashFunction.java`, `LongTupleHashFunction.java`).
+* Every `LongHashFunction` and `LongTupleHashFunction` implementation treats primitives as if they were written to memory using the platform's native byte order; the API therefore guarantees that `hashLong(v)` equals `hashLongs(new long[] {v})` and similar array forms (`LongHashFunction.java`, `LongTupleHashFunction.java`).
 * All bundled algorithms normalise multi-byte reads to little-endian before mixing, so the same input bytes produce identical hashes on big- and little-endian machines.
 Performance may differ, but results must not (`CityAndFarmHash_1_1.java`, `XxHash.java`, `XXH3.java`, `WyHash.java`, `MetroHash.java`, `MurmurHash_3.java`).
 * `hash(Object, Access, long off, long len)` assumes the addressed region is contiguous and valid for the requested byte count.
@@ -27,7 +28,7 @@ Alternative `Access` implementations should document whether they permit null ba

 === Result Buffer Handling

-* `LongTupleHashFunction.hash*(…, long[] result)` requires a pre-sized buffer created via `newResultArray()`.
+* `LongTupleHashFunction.hash*(..., long[] result)` requires a pre-sized buffer created via `newResultArray()`.
 The method throws `NullPointerException` for null buffers and `IllegalArgumentException` for undersized buffers; the helper checks are centralised in `DualHashFunction` (`DualHashFunction.java:12-74`).
 * The allocation-free path is only honoured when callers reuse buffers.
 The overloads that return `long[]` will always allocate exactly one new array per call by design (`LongTupleHashFunction.java:70-118`).
@@ -41,7 +42,7 @@ Several implementations expose singleton seedless instances via `readResolve`, e

 === String Handling

-* `hashChars` and `hash(CharSequence…)` delegate to `Util.VALID_STRING_HASH`, which inspects the running JVM to choose the correct memory layout strategy.
+* `hashChars` and `hash(CharSequence...)` delegate to `Util.VALID_STRING_HASH`, which inspects the running JVM to choose the correct memory layout strategy.
 Altering char sequence hashing must preserve this runtime detection, or mixed HotSpot/OpenJ9 estates will diverge (`Util.java:29-63`, `ModernCompactStringHash.java`, `ModernHotSpotStringHash.java`, `HotSpotPrior7u6StringHash.java`).
 * Latin-1 compact strings are read through `CompactLatin1CharSequenceAccess`, which reinterprets the backing `byte[]` without allocating.
 Any change to string support must maintain zero-allocation access for both UTF-16 and compact encodings (`CompactLatin1CharSequenceAccess.java`).
@@ -50,7 +51,7 @@ Any change to string support must maintain zero-allocation access for both UTF-1

 * Methods that accept `byte[]` plus `off` and `len` use `Util.checkArrayOffs` for bounds validation.
 Negative lengths or offsets, or slices that extend past the array end, raise `IndexOutOfBoundsException` immediately (`Util.java:70-77`, `LongHashFunction.java:480-547`).
-* ByteBuffer hashing honours the buffer’s position, limit, and order.
+* ByteBuffer hashing honours the buffer's position, limit, and order.
 The implementation temporarily adjusts `Buffer` state to satisfy IBM JDK 7 quirks, then restores the original markers (`LongHashFunction.java:392-470`, `LongHashFunctionTest.java:120-176`).

 === Thread Safety
diff --git a/src/main/docs/performance-benchmarks.adoc b/src/main/docs/performance-benchmarks.adoc
index a3121a4..77a3771 100644
--- a/src/main/docs/performance-benchmarks.adoc
+++ b/src/main/docs/performance-benchmarks.adoc
@@ -1,9 +1,10 @@
-== Performance Benchmarks
+= Performance Benchmarks
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === Current Baseline

 * The published README table reports throughput (GB/s) and bootstrap latency (ns) per algorithm.
diff --git a/src/main/docs/project-requirements.adoc b/src/main/docs/project-requirements.adoc
new file mode 100644
index 0000000..9c1031b
--- /dev/null
+++ b/src/main/docs/project-requirements.adoc
@@ -0,0 +1,62 @@
+= Project Requirements
+:toc:
+:lang: en-GB
+:source-highlighter: rouge
+
+Chronicle Software
+
+== 1. Overview
+
+The **Zero-Allocation Hashing** project aims to provide a high-performance, Java-based API for hashing various input sources (byte arrays, buffers, memory addresses) without incurring garbage collection overhead during the hashing process. It serves as a drop-in, allocation-free alternative to libraries like Guava Hashing for performance-critical applications.
+
+== 2. Functional Requirements
+
+=== 2.1. Core Hashing Capabilities
+* **FR-001:** The library must provide implementations for 64-bit hashing via the `LongHashFunction` interface.
+* **FR-002:** The library must provide implementations for 128-bit (and potentially higher) hashing via the `LongTupleHashFunction` interface.
+* **FR-003:** The library must allow hashing of the following input types without intermediate object allocation:
+** Primitive values (`long`, `int`, `short`, `char`, `byte`).
+** Primitive arrays.
+** `java.nio.ByteBuffer` (both heap and direct).
+** `java.lang.CharSequence` (Strings, StringBuilders).
+** Raw memory addresses (off-heap memory).
+* **FR-004:** The library must support an empty input (Void), returning a deterministic constant for the specific algorithm.
+
+=== 2.2. Algorithms
+The library must implement the following non-cryptographic hash algorithms:
+* **FR-005:** CityHash (v1.1).
+* **FR-006:** FarmHash (variants `na` and `uo`).
+* **FR-007:** MurmurHash3 (128-bit and low 64-bit).
+* **FR-008:** xxHash (XXH64).
+* **FR-009:** XXH3 (64-bit and 128-bit).
+* **FR-010:** WyHash (v3).
+* **FR-011:** MetroHash (64-bit, variant 2).
+
+=== 2.3. Seeding
+* **FR-012:** Algorithms must provide factory methods for unseeded (default seed) instances.
+* **FR-013:** Algorithms supporting seeds must provide factory methods for single-seed and, where applicable, dual-seed instances.
+
+=== 2.4. Determinism and Endianness
+* **FR-014:** Hash outputs must be deterministic. The same input sequence must produce the same hash code on every invocation.
+* **FR-015:** Hash outputs must be cross-platform consistent. The library must normalize byte order internally (typically to Little-Endian) to ensure that Big-Endian architectures (e.g., s390x) produce the exact same hash values as Little-Endian architectures (e.g., x86_64).
+
+== 3. Non-Functional Requirements
+
+=== 3.1. Performance (Latency and Allocation)
+* **NFR-001:** **Zero Allocation:** Steady-state hashing operations must not allocate objects on the Java Heap.
+* **NFR-002:** **Throughput:** Algorithms should demonstrate multi-GB/s throughput on modern hardware (Baseline: Core i7-4870HQ benchmarks).
+* **NFR-003:** **Bootstrap:** Static initialization overhead should be minimized, though one-time allocation during class loading is permitted.
+
+=== 3.2. Reliability and Safety
+* **NFR-004:** **Thread Safety:** Hash function instances must be immutable and thread-safe.
+* **NFR-005:** **Memory Safety:** While `Unsafe` is used internally, the API must validate array offsets and lengths where standard Java arrays are used (via `Util.checkArrayOffs`) to prevent SIGSEGV/crashes on heap access. Note: Raw memory hashing assumes the caller validates addresses.
+
+=== 3.3. Compatibility
+* **NFR-006:** **JDK Version:** The library must support JDK 8 through the current LTS (JDK 21+).
+* **NFR-007:** **Module System:** The library must function in Java 9+ environments, providing documentation on required `--add-opens` or `--add-exports` flags for accessing `sun.misc.Unsafe` and `sun.nio.ch.DirectBuffer`.
+* **NFR-008:** **VM Support:** The library must support major JVMs including HotSpot, OpenJDK, and OpenJ9/IBM J9.
+
+== 4. Interface Constraints
+
+* **C-001:** The API must not depend on `ThreadLocal` variables to avoid memory leaks in containerized environments.
+* **C-002:** The `Access` pattern must be extensible to allow users to define custom access strategies for proprietary data structures.
diff --git a/src/main/docs/specifications.adoc b/src/main/docs/specifications.adoc
index 2b00814..df7537a 100644
--- a/src/main/docs/specifications.adoc
+++ b/src/main/docs/specifications.adoc
@@ -1,9 +1,10 @@
-== Zero-Allocation Hashing Specification
+= Zero-Allocation Hashing Specification
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === DOC-001 Scope

 * Provides zero-allocation hashing utilities for byte-oriented inputs in Java.
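
Reviewer note: NFR-005 in the project-requirements.adoc diff above asks for offset and length validation before any `Unsafe` read of a Java array. The following is a minimal sketch of such a check, assuming an `(arrayLength, off, len)` parameter shape. The class `SliceCheckSketch` and the method `checkSlice` are hypothetical names; the real helper is `Util.checkArrayOffs`, whose signature does not appear in this patch.

```java
// Hypothetical illustration only; the library's own check is Util.checkArrayOffs.
final class SliceCheckSketch {

    // Throws IndexOutOfBoundsException unless the slice [off, off + len)
    // lies entirely inside an array of the given length.
    static void checkSlice(int arrayLength, int off, int len) {
        // Comparing off against arrayLength - len avoids the int overflow
        // that computing off + len could produce for very large arguments.
        if (off < 0 || len < 0 || off > arrayLength - len) {
            throw new IndexOutOfBoundsException(
                    "off=" + off + ", len=" + len + ", arrayLength=" + arrayLength);
        }
    }

    public static void main(String[] args) {
        checkSlice(8, 0, 8); // whole array: accepted
        checkSlice(8, 8, 0); // empty slice at the end: accepted
        try {
            checkSlice(8, 4, 5); // runs one byte past the end: rejected
        } catch (IndexOutOfBoundsException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

A check of this kind runs once per call, before any raw read, so a bad slice surfaces as a Java exception rather than a crash.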
diff --git a/src/main/docs/testing-strategy.adoc b/src/main/docs/testing-strategy.adoc
index 8f277bc..b4edc3a 100644
--- a/src/main/docs/testing-strategy.adoc
+++ b/src/main/docs/testing-strategy.adoc
@@ -1,10 +1,10 @@
-== Testing Strategy
-:pp: ++
+= Testing Strategy
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === Regression Coverage

 * `LongHashFunctionTest.test` is the canonical harness for verifying that an algorithm produces identical values across the entire API surface (primitives, arrays, buffers, `Access`-backed inputs, and direct memory).
diff --git a/src/main/docs/unsafe-and-platform-notes.adoc b/src/main/docs/unsafe-and-platform-notes.adoc
index f14fbfc..492a6eb 100644
--- a/src/main/docs/unsafe-and-platform-notes.adoc
+++ b/src/main/docs/unsafe-and-platform-notes.adoc
@@ -1,9 +1,10 @@
-== Unsafe and Platform Notes
+= Unsafe and Platform Notes
+:toc:
+:lang: en-GB
+:source-highlighter: rouge

 Chronicle Software

-toc::[]
-
 === Internal API Usage

 * `UnsafeAccess` reflects on `sun.misc.Unsafe.theUnsafe` to obtain the singleton and uses it for raw array access, field offsets, and direct memory loads (`UnsafeAccess.java:40-118`).
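
Reviewer note: the last bullet above describes reflecting on `sun.misc.Unsafe.theUnsafe` and using the singleton for raw array reads. The following is a minimal sketch of that technique, not the library's `UnsafeAccess` class. The names `UnsafeReadSketch`, `lookupUnsafe` and `readLongNative` are hypothetical, and the sketch assumes a JDK that still ships `sun.misc.Unsafe` with its memory-access methods enabled; recent JDKs may print deprecation warnings.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical illustration only; not the library's UnsafeAccess class.
final class UnsafeReadSketch {

    // Obtains the Unsafe singleton by reflecting on the private static field "theUnsafe".
    static Unsafe lookupUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Reads 8 bytes starting at index 'off' of a byte[] as one long in native byte order.
    // No bounds check happens here: the caller must guarantee off >= 0 and off + 8 <= data.length.
    static long readLongNative(Unsafe unsafe, byte[] data, int off) {
        long base = unsafe.arrayBaseOffset(byte[].class);
        return unsafe.getLong(data, base + off);
    }
}
```

The read returns the word in the host's byte order, which is why the library pairs raw reads of this kind with the little-endian normalisation described in ADR-002.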
From 1f389dc18f47a592028ae898afd8c372d92e6709 Mon Sep 17 00:00:00 2001
From: Peter Lawrey
Date: Fri, 21 Nov 2025 20:59:36 +0000
Subject: [PATCH 02/18] Refactor access method visibility and improve variable declarations in hashing algorithms

---
 src/main/java/net/openhft/hashing/Access.java      |  2 +-
 .../net/openhft/hashing/CityAndFarmHash_1_1.java   | 14 ++++++++------
 .../java/net/openhft/hashing/MurmurHash_3.java     |  8 +++++---
 src/main/java/net/openhft/hashing/XxHash.java      |  2 +-
 4 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/main/java/net/openhft/hashing/Access.java b/src/main/java/net/openhft/hashing/Access.java
index dbe0f05..c87c6e7 100644
--- a/src/main/java/net/openhft/hashing/Access.java
+++ b/src/main/java/net/openhft/hashing/Access.java
@@ -119,9 +119,9 @@ public static <T extends CharSequence> Access<T> toNativeCharSequence() {
      *
      * @param backingOrder the byte order of {@code char} reads backing
      * {@code CharSequences} to access
+     * @param <T> the {@code CharSequence} subtype to access
      * @return the {@code Access} to {@link CharSequence}s backed by {@code char} reads made in
      * the specified byte order
-     * @param <T> the {@code CharSequence} subtype to access
      * @see #toNativeCharSequence()
      */
     @SuppressWarnings("unchecked")
diff --git a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java
index 9908f0c..3feacee 100644
--- a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java
+++ b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java
@@ -55,7 +55,7 @@ private static long hash8To16Bytes(long len, long first8Bytes, long last8Bytes)
         return hashLen16(c, d, mul);
     }

-    static private <T> long hashLen0To16(Access<T> access, T in, long off, long len) {
+    private static <T> long hashLen0To16(Access<T> access, T in, long off, long len) {
         if (len >= 8L) {
             long a = access.i64(in, off);
             long b = access.i64(in, off + len - 8L);
@@ -73,7 +73,7 @@ static private <T> long hashLen0To16(Access<T> access, T in, long off, long len)
         return
K2;
     }

-    static private <T> long hashLen17To32(Access<T> access, T in, long off, long len) {
+    private static <T> long hashLen17To32(Access<T> access, T in, long off, long len) {
         long mul = mul(len);
         long a = access.i64(in, off) * K1;
         long b = access.i64(in, off + 8L);
@@ -83,7 +83,7 @@ static private <T> long hashLen17To32(Access<T> access, T in, long off, long len
                 a + rotateRight(b + K2, 18) + c, mul);
     }

-    static private <T> long cityHashLen33To64(Access<T> access, T in, long off, long len) {
+    private static <T> long cityHashLen33To64(Access<T> access, T in, long off, long len) {
         long mul = mul(len);
         long a = access.i64(in, off) * K2;
         long b = access.i64(in, off + 8L);
@@ -105,6 +105,8 @@ static private <T> long cityHashLen33To64(Access<T> access, T in, long off, long
     }

     static <T> long cityHash64(Access<T> access, T in, long off, long len) {
+        // This method is a close translation of the upstream CityHash reference implementation.
+        // Variable declaration placement and naming are preserved for clarity against the original.
         if (len <= 32L) {
             if (len <= 16L) {
                 return hashLen0To16(access, in, off, len);
@@ -115,9 +117,9 @@ static <T> long cityHash64(Access<T> access, T in, long off, long len) {
             return cityHashLen33To64(access, in, off, len);
         }

-        long x = access.i64(in, off + len - 40L);
-        long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L);
-        long z = hashLen16(access.i64(in, off + len - 48L) + len,
+        final long x = access.i64(in, off + len - 40L);
+        final long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L);
+        final long z = hashLen16(access.i64(in, off + len - 48L) + len,
                 access.i64(in, off + len - 24L));

         long vFirst, vSecond, wFirst, wSecond;
diff --git a/src/main/java/net/openhft/hashing/MurmurHash_3.java b/src/main/java/net/openhft/hashing/MurmurHash_3.java
index a6c520f..2017076 100644
--- a/src/main/java/net/openhft/hashing/MurmurHash_3.java
+++ b/src/main/java/net/openhft/hashing/MurmurHash_3.java
@@ -5,6 +5,7 @@ import org.jetbrains.annotations.NotNull;
 import
org.jetbrains.annotations.Nullable;
+
 import javax.annotation.ParametersAreNonnullByDefault;

 import static java.nio.ByteOrder.LITTLE_ENDIAN;
@@ -26,15 +27,16 @@ private static <T> long hash(long seed, @Nullable T input, Access<T> access, lon
         long remaining = length;
         while (remaining >= 16L) {
             long k1 = access.i64(input, offset);
-            long k2 = access.i64(input, offset + 8L);
-            offset += 16L;
-            remaining -= 16L;
             h1 ^= mixK1(k1);

             h1 = Long.rotateLeft(h1, 27);
             h1 += h2;
             h1 = h1 * 5L + 0x52dce729L;

+            long k2 = access.i64(input, offset + 8L);
+            offset += 16L;
+            remaining -= 16L;
+
             h2 ^= mixK2(k2);

             h2 = Long.rotateLeft(h2, 31);
diff --git a/src/main/java/net/openhft/hashing/XxHash.java b/src/main/java/net/openhft/hashing/XxHash.java
index 4bc8455..aca092c 100644
--- a/src/main/java/net/openhft/hashing/XxHash.java
+++ b/src/main/java/net/openhft/hashing/XxHash.java
@@ -139,10 +139,10 @@ public long seed() {
     @Override
     public long hashLong(long input) {
         input = Primitives.nativeToLittleEndian(input);
-        long hash = seed() + P5 + 8;
         input *= P2;
         input = Long.rotateLeft(input, 31);
         input *= P1;
+        long hash = seed() + P5 + 8;
         hash ^= input;
         hash = Long.rotateLeft(hash, 27) * P1 + P4;
         return XxHash.finalize(hash);

From cc483cb90b9e757373da82ec0a23cfa5ff891ada Mon Sep 17 00:00:00 2001
From: Peter Lawrey
Date: Fri, 21 Nov 2025 20:59:43 +0000
Subject: [PATCH 03/18] Add documentation for AI agents and project guidelines

---
 AGENTS.md   | 161 +++++++++++++++++++++++++++++++++++++
 CLAUDE.md   | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 README.adoc |   3 +-
 3 files changed, 385 insertions(+), 2 deletions(-)
 create mode 100644 AGENTS.md
 create mode 100644 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..5c83c58
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,161 @@
+# Guidance for AI agents, bots, and humans contributing to Chronicle Software's OpenHFT projects.
+
+LLM-based agents can accelerate development only if they respect our house rules.
This file tells you:
+
+* how to run and verify the build;
+* what *not* to comment;
+* when to open pull requests.
+
+## Language & character-set policy
+
+| Requirement | Rationale |
+|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|
+| **British English** spelling (`organisation`, `licence`, *not* `organization`, `license`) except technical US spellings like `synchronized` | Keeps wording consistent with Chronicle's London HQ and existing docs. See the University of Oxford style guide for reference. |
+| **ISO-8859-1** (code-points 0-255), except in string literals. Avoid smart quotes, non-breaking spaces and accented characters. | ISO-8859-1 survives every toolchain Chronicle uses, incl. low-latency binary wire formats that expect the 8th bit to be 0. |
+| If a symbol is not available in ISO-8859-1, use a textual form such as `micro-second`, `>=`, `:alpha:`, `:yes:`. This is the preferred approach and Unicode must not be inserted. | Extended or '8-bit ASCII' variants are *not* portable and are therefore disallowed. |
+
+## Javadoc guidelines
+
+**Goal:** Every Javadoc block should add information you cannot glean from the method signature alone. Anything else is
+noise and slows readers down.
+
+| Do | Don't |
+|----|-------|
+| State *behavioural contracts*, edge-cases, thread-safety guarantees, units, performance characteristics and checked exceptions. | Restate the obvious ("Gets the value", "Sets the name"). |
+| Keep the first sentence short; it becomes the summary line in aggregated docs. | Duplicate parameter names/ types unless more explanation is needed. |
+| Prefer `@param` for *constraints* and `@throws` for *conditions*, following Oracle's style guide. | Pad comments to reach a line-length target. |
+| Remove or rewrite autogenerated Javadoc for trivial getters/setters.
| Leave stale comments that now contradict the code. | + +The principle that Javadoc should only explain what is *not* manifest from the signature is well-established in the +wider Java community. + +## Build & test commands + +Agents must verify that the project still compiles and all unit tests pass before opening a PR: + +```bash +# From repo root +mvn -q verify +``` + +## Commit-message & PR etiquette + +1. **Subject line <= 72 chars**, imperative mood: Fix roll-cycle offset in `ExcerptAppender`. +2. Reference the JIRA/GitHub issue if it exists. +3. In *body*: *root cause -> fix -> measurable impact* (latency, allocation, etc.). Use ASCII bullet points. +4. **Run `mvn verify`** again after rebasing. + +## What to ask the reviewers + +* *Is this AsciiDoc documentation precise enough for a clean-room re-implementation?* +* Does the Javadoc explain the code's *why* and *how* that a junior developer would not be expected to work out? +* Are the documentation, tests and code updated together so the change is clear? +* Does the commit point back to the relevant requirement or decision tag? +* Would an example or small diagram help future maintainers? + +## Project requirements + +See the [Decision Log](src/main/docs/decision-log.adoc) for the latest project decisions. +See the [Project Requirements](src/main/docs/project-requirements.adoc) for details on project requirements. + +## Elevating the Workflow with Real-Time Documentation + +Building upon our existing Iterative Workflow, the newest recommendation is to emphasise *real-time updates* to documentation. +Ensure the relevant `.adoc` files are updated when features, requirements, implementation details, or tests change. +This tight loop informs the AI accurately and creates immediate clarity for all team members. + +### Benefits of Real-Time Documentation + +* **Confidence in documentation**: Accurate docs prevent miscommunications that derail real-world outcomes. 
+* **Reduced drift**: Real-time updates keep requirements, tests and code aligned. +* **Faster feedback**: AI can quickly highlight inconsistencies when everything is in sync. +* **Better quality**: Frequent checks align the implementation with the specified behaviour. +* **Smoother onboarding**: Up-to-date AsciiDoc clarifies the system for new developers. +* **Incremental changes**: AIDE flags newly updated files so you can keep the documentation synchronised. + +### Best Practices + +* **Maintain Sync**: Keep documentation (AsciiDoc), tests, and code synchronised in version control. Changes in one area should prompt reviews and potential updates in the others. +* **Doc-First for New Work**: For *new* features or requirements, aim to update documentation first, then use AI to help produce or refine corresponding code and tests. For refactoring or initial bootstrapping, updates might flow from code/tests back to documentation, which should then be reviewed and finalised. +* **Small Commits**: Each commit should ideally relate to a single requirement or coherent change, making reviews easier for humans and AI analysis tools. +* **Team Buy-In**: Encourage everyone to review AI outputs critically and contribute to maintaining the synchronicity of all artefacts. + +## AI Agent Guidelines + +When using AI agents to assist with development, please adhere to the following guidelines: + +* **Respect the Language & Character-set Policy**: Ensure all AI-generated content follows the British English and ISO-8859-1 guidelines outlined above. +* **Focus on Clarity**: AI-generated documentation should be clear and concise and add value beyond what is already present in the code or existing documentation. +* **Avoid Redundancy**: Do not generate content that duplicates existing documentation or code comments unless it provides additional context or clarification. 
+* **Review AI Outputs**: Always review AI-generated content for accuracy, relevance, and adherence to the project's documentation standards before committing it to the repository. + +## Company-Wide Tagging + +This section records **company-wide** decisions that apply to *all* Chronicle projects. All identifiers use the --xxx prefix. The `xxx` values are unique within the same Scope even if the tags are different. Component-specific decisions live in their xxx-decision-log.adoc files. + +### Tag Taxonomy (Nine-Box Framework) + +To improve traceability, we adopt the Nine-Box taxonomy for requirement and decision identifiers. These tags are used in addition to the existing ALL prefix, which remains reserved for global decisions across every project. + +.Adopt a Nine-Box Requirement Taxonomy + +|Tag | Scope | Typical examples | +|----|-------|------------------| +|FN |Functional user-visible behaviour | Message routing, business rules | +|NF-P |Non-functional - Performance | Latency budgets, throughput targets | +|NF-S |Non-functional - Security | Authentication method, TLS version | +|NF-O |Non-functional - Operability | Logging, monitoring, health checks | +|TEST |Test / QA obligations | Chaos scenarios, benchmarking rigs | +|DOC |Documentation obligations | Sequence diagrams, user guides | +|OPS |Operational / DevOps concerns | Helm values, deployment checklist | +|UX |Operator or end-user experience | CLI ergonomics, dashboard layouts | +|RISK |Compliance / risk controls | GDPR retention, audit trail | + +`ALL-*` stays global, case-exact tags. Pick one primary tag if multiple apply. + +### Decision Record Template + +```asciidoc +=== [Identifier] Title of Decision + +Date:: YYYY-MM-DD +Context:: +* What is the issue that this decision addresses? +* What are the driving forces, constraints, and requirements? +Decision Statement:: +* What is the change that is being proposed or was decided? 
+Alternatives Considered:: +* [Alternative 1 Name/Type]: +** *Description:* Brief description of the alternative. +** *Pros:* ... +** *Cons:* ... +* [Alternative 2 Name/Type]: +** *Description:* Brief description of the alternative. +** *Pros:* ... +** *Cons:* ... +Rationale for Decision:: +* Why was the chosen decision selected? +* How does it address the context and outweigh the cons of alternatives? +Impact & Consequences:: +* What are the positive and negative consequences of this decision? +* How does this decision affect the system, developers, users, or operations? +* What are the trade-offs made? +Notes/Links:: +** (Optional: Links to relevant issues, discussions, documentation, proof-of-concepts) +``` + +## AsciiDoc formatting guidelines + +### List Indentation + +Do not rely on indentation for list items in AsciiDoc documents. Use the following pattern instead: + +```asciidoc +section:: Top Level Section +* first level + ** nested level +``` + +### Emphasis and Bold Text + +In AsciiDoc, an underscore `_` is _emphasis_; `*text*` is *bold*. diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..8fb8657 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,223 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Project Overview + +Zero-Allocation Hashing is a Java library providing fast, non-cryptographic hash functions that allocate zero objects during hash computation. It implements multiple algorithms (CityHash, FarmHash, MurmurHash3, xxHash, XXH3, wyHash, MetroHash) for hashing byte sequences from various sources (arrays, buffers, CharSequence, raw memory). 
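As a rough illustration of what "zero objects during hash computation" means in practice, the sketch below mirrors the shape of the `XxHash.hashLong` hunk earlier in this patch series using only primitive `long` arithmetic. The class `Xxh64LongSketch` and the helper name `avalanche` are hypothetical and not part of `net.openhft.hashing`; the prime constants are taken from the public XXH64 reference rather than from the repository. The sketch assumes the argument is already the little-endian interpretation of the eight input bytes, so the `Primitives.nativeToLittleEndian` step is omitted.

```java
/**
 * Standalone sketch of XXH64 applied to one 8-byte value.
 * Hypothetical class for illustration; not part of net.openhft.hashing.
 */
final class Xxh64LongSketch {
    // XXH64 primes from the public reference, kept as signed Java longs.
    private static final long P1 = 0x9E3779B185EBCA87L;
    private static final long P2 = 0xC2B2AE3D27D4EB4FL;
    private static final long P3 = 0x165667B19E3779F9L;
    private static final long P4 = 0x85EBCA77C2B2AE63L;
    private static final long P5 = 0x27D4EB2F165667C5L;

    private Xxh64LongSketch() {
    }

    /**
     * Hashes a single long that is assumed to hold the little-endian
     * view of the eight input bytes. No objects are allocated.
     */
    static long hashLong(long seed, long littleEndianInput) {
        // Mix the single 8-byte lane.
        long k = littleEndianInput * P2;
        k = Long.rotateLeft(k, 31);
        k *= P1;

        // Short-input start value: seed + P5 + input length in bytes.
        long hash = seed + P5 + 8;
        hash ^= k;
        hash = Long.rotateLeft(hash, 27) * P1 + P4;
        return avalanche(hash);
    }

    /** Final XXH64 avalanche round (hypothetical helper name). */
    static long avalanche(long hash) {
        hash ^= hash >>> 33;
        hash *= P2;
        hash ^= hash >>> 29;
        hash *= P3;
        hash ^= hash >>> 32;
        return hash;
    }

    public static void main(String[] args) {
        // Prints the hash of the value 42 with seed 0 in hexadecimal.
        System.out.println(Long.toHexString(hashLong(0L, 42L)));
    }
}
```

Every step is a multiplication by an odd constant, a rotation, an addition or an xor-shift, so for a fixed seed the mapping from input to hash is one-to-one. This is a property of the sketch, not a statement about the library's other algorithms.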
+ +Target: Java 8+ (supports JDK 8, 11, 17, 21+) + +## Build & Test Commands + +```bash +# Run full build with tests +mvn verify + +# Run tests only +mvn test + +# Run specific test class +mvn test -Dtest=LongHashFunctionTest + +# Run specific test method +mvn test -Dtest=LongHashFunctionTest#testHashBytes + +# Clean build +mvn clean install + +# Skip tests (not recommended before commits) +mvn install -DskipTests + +# Generate javadoc +mvn javadoc:javadoc +``` + +## Architecture & Key Concepts + +### Core Abstractions + +**`LongHashFunction`** (src/main/java/net/openhft/hashing/LongHashFunction.java) +- Primary facade for 64-bit hashes +- Factory methods: `city_1_1()`, `farmNa()`, `farmUo()`, `murmur_3()`, `xx()`, `xx3()`, `wy_3()`, `metro()` +- All instances are immutable and thread-safe +- Seeds are baked into instances at construction time + +**`LongTupleHashFunction`** (src/main/java/net/openhft/hashing/LongTupleHashFunction.java) +- Multi-word hash results (128-bit and beyond) +- Uses reusable `long[]` buffers to maintain zero-allocation guarantee +- Caller must manage/reuse result arrays + +**`Access`** (src/main/java/net/openhft/hashing/Access.java) +- Strategy pattern abstracting byte sequence reading from different sources +- Implementations: `UnsafeAccess` (heap arrays), `ByteBufferAccess`, `CharSequenceAccess`, `CompactLatin1CharSequenceAccess` +- Handles byte-order normalization via `Access.byteOrder(input, desiredOrder)` +- Algorithms are written once against `Access` interface, work with all input types + +**Byte Order Handling** +- All algorithms normalize to Little-Endian internally (ADR-002) +- Ensures cross-platform deterministic results (x86 vs s390x) +- Performance penalty on Big-Endian platforms due to byte-swapping + +### Memory Access + +**`UnsafeAccess`** (src/main/java/net/openhft/hashing/UnsafeAccess.java) +- Wraps `sun.misc.Unsafe` for zero-copy memory reads +- Enables "type punning" (reading bytes as longs) efficiently +- 
Requires `--add-opens java.base/jdk.internal.misc=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED` on Java 9+ + +**String Hashing Runtime Adaptation** (src/main/java/net/openhft/hashing/Util.java) +- `VALID_STRING_HASH` selects correct strategy at JVM initialization +- Handles: HotSpot (pre-compact, compact strings), OpenJ9, Zing, unknown VMs +- Uses reflection to access internal `String.value` field (ADR-004) +- Fallback: `UnknownJvmStringHash` for unrecognized JVMs + +### Algorithm Implementations + +All live in package-private classes with factory methods exposed via `LongHashFunction`: +- `CityAndFarmHash_1_1` - CityHash64 v1.1, FarmHash NA/UO +- `MurmurHash_3` - 64-bit and 128-bit variants +- `XxHash` - XXH64 +- `XXH3` - XXH3 64-bit and 128-bit +- `WyHash` - wyHash v3 +- `MetroHash` - metrohash64_2 + +## Project-Specific Guidelines + +### Language & Character Set (CRITICAL) +- Use **British English**: "organisation", "licence", "optimisation" (NOT "organization", "license", "optimization") +- Technical US spellings allowed: "synchronized", "byte" +- **ISO-8859-1 only** (code-points 0-255) except in string literals +- NO smart quotes, non-breaking spaces, accented characters in code/docs +- Use textual forms: "micro-second", ">=", ":alpha:" instead of Unicode + +### Javadoc Standards +**DO:** +- Document behavioural contracts, edge-cases, thread-safety, units, performance characteristics +- Explain constraints via `@param`, conditions via `@throws` +- Keep first sentence short (becomes summary line) + +**DON'T:** +- Restate the obvious ("Gets the value", "Sets the name") +- Duplicate parameter names/types without additional explanation +- Leave autogenerated Javadoc for trivial getters/setters + +### Code Conventions +- Zero-allocation in steady state (one-time allocation during class loading permitted) +- Validate array offsets/lengths via `Util.checkArrayOffs` to prevent crashes +- Assume raw memory addresses are pre-validated by caller +- No 
ThreadLocal usage (containerization requirement) +- Immutable, stateless hash function instances + +### Documentation Workflow (IMPORTANT) +- Keep AsciiDoc files synchronized with code/tests +- Update `src/main/docs/*.adoc` when changing features, requirements, or implementations +- See `AGENTS.md` for company-wide documentation standards +- Reference decisions in commit messages (e.g., "Implements ADR-003") + +### Testing Requirements +- Thoroughly tested on LTS JDKs: 8, 11, 17, 21 +- Test both Little-Endian and Big-Endian platforms +- JDK 9+ runs tests twice: with and without `-XX:-CompactStrings` (see pom.xml profiles) +- All tests use JUnit 4 (supports Java 7+) + +### Commit & PR Guidelines +- Subject line <= 72 chars, imperative mood: "Fix roll-cycle offset in ExcerptAppender" +- Body format: root cause -> fix -> measurable impact +- Run `mvn verify` before committing +- Reference JIRA/GitHub issues +- Use ASCII bullet points in commit bodies +- Main branch for PRs: **ea** (not master) + +## File Structure + +``` +src/main/java/net/openhft/hashing/ + Access.java - Core abstraction for byte sequence reading + LongHashFunction.java - Primary 64-bit hash facade + LongTupleHashFunction.java - Multi-word hash facade + DualHashFunction.java - Bridges 128-bit -> 64-bit views + UnsafeAccess.java - sun.misc.Unsafe wrapper + + [Algorithm Implementations] + CityAndFarmHash_1_1.java + MurmurHash_3.java + XxHash.java + XXH3.java + WyHash.java + MetroHash.java + + [Memory Access Strategies] + ByteBufferAccess.java + CharSequenceAccess.java + CompactLatin1CharSequenceAccess.java + + [String Hashing Adapters] + StringHash.java + ModernHotSpotStringHash.java + ModernCompactStringHash.java + HotSpotPrior7u6StringHash.java + UnknownJvmStringHash.java + + [Utilities] + Util.java - Runtime detection, validation + Primitives.java - Byte-order normalization + Maths.java - Low-level arithmetic helpers + +src/main/docs/ + project-requirements.adoc - Functional/non-functional 
requirements + decision-log.adoc - Architecture Decision Records (ADRs) + architecture-overview.adoc - System architecture details + algorithm-profiles.adoc - Per-algorithm characteristics + testing-strategy.adoc - Test coverage approach +``` + +## Common Development Patterns + +### Adding a New Hash Algorithm +1. Create package-private class implementing the algorithm +2. Extend `LongHashFunction` or `LongTupleHashFunction` +3. Use `Access` abstraction for all memory reads +4. Normalize to Little-Endian via `Access.byteOrder(input, LITTLE_ENDIAN)` +5. Add factory methods to `LongHashFunction` facade +6. Add tests validating against reference implementation +7. Update `algorithm-profiles.adoc` with characteristics + +### Implementing Custom Access Strategy +```java +// Example from Access.java documentation +class Pair { + long first, second; + + static final long pairDataOffset = + theUnsafe.objectFieldOffset(Pair.class.getDeclaredField("first")); + + static long hashPair(Pair pair, LongHashFunction hashFunction) { + return hashFunction.hash(pair, Access.unsafe(), pairDataOffset, 16L); + } +} +``` + +### Running on Java 9+ +Requires JVM flags for internal API access: +```bash +--add-opens java.base/jdk.internal.misc=ALL-UNNAMED +--add-opens java.base/sun.nio.ch=ALL-UNNAMED +``` + +## Key Decisions (from decision-log.adoc) + +- **ADR-001**: Use `sun.misc.Unsafe` for maximum performance despite JDK dependency +- **ADR-002**: Normalize all inputs to Little-Endian for cross-platform determinism +- **ADR-003**: `Access` strategy pattern eliminates algorithm duplication across input types +- **ADR-004**: Reflectively access `String.value` for zero-copy hashing on modern JVMs +- **ADR-005**: Stateless immutable hash functions (thread-safe, no ThreadLocal) +- **ADR-006**: Use `long[]` buffers for 128-bit hashes to maintain zero-allocation + +## References + +- Javadoc: http://javadoc.io/doc/net.openhft/zero-allocation-hashing/latest +- GitHub: 
https://github.com/OpenHFT/Zero-Allocation-Hashing +- Issues: https://github.com/OpenHFT/Zero-Allocation-Hashing/issues +- Release Notes: https://chronicle.software/release-notes/ +- Company Guidelines: See `AGENTS.md` for Chronicle Software standards \ No newline at end of file diff --git a/README.adoc b/README.adoc index 0a8458d..e09863e 100644 --- a/README.adoc +++ b/README.adoc @@ -1,7 +1,6 @@ == Zero-Allocation Hashing -:pp: ++ -Chronicle Software +Chronicle Software :lang: en-GB :source-highlighter: rouge :pp: ++ image:https://maven-badges.herokuapp.com/maven-central/net.openhft/zero-allocation-hashing/badge.svg[caption="",link=https://maven-badges.herokuapp.com/maven-central/net.openhft/zero-allocation-hashing] image:https://javadoc.io/badge2/net.openhft/zero-allocation-hashing/javadoc.svg[link="https://www.javadoc.io/doc/net.openhft/zero-allocation-hashing/latest/index.html"] From 78228848b0d5831972fd6e2ccfb82516280d3b5b Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 21 Nov 2025 21:04:41 +0000 Subject: [PATCH 04/18] Refactor access method visibility and improve variable declarations in hashing algorithms --- src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java index 3feacee..6296501 100644 --- a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java +++ b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java @@ -117,9 +117,9 @@ static long cityHash64(Access access, T in, long off, long len) { return cityHashLen33To64(access, in, off, len); } - final long x = access.i64(in, off + len - 40L); - final long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L); - final long z = hashLen16(access.i64(in, off + len - 48L) + len, + long x = access.i64(in, off + len - 40L); + long y = access.i64(in, off + len - 16L) + access.i64(in, off + 
len - 56L); + long z = hashLen16(access.i64(in, off + len - 48L) + len, access.i64(in, off + len - 24L)); long vFirst, vSecond, wFirst, wSecond; From ccc07d2ef44b998631373431876e145a7517d87e Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 21 Nov 2025 21:09:14 +0000 Subject: [PATCH 05/18] Update JDK profile in POM to target JDK 17 --- pom.xml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pom.xml b/pom.xml index 6ce4970..0de8679 100644 --- a/pom.xml +++ b/pom.xml @@ -345,9 +345,9 @@ - jdk21-profile + jdk17-profile - [21,) + [17,) 8 From c7b9a3d5d8aabef573411478ce16cba105b4ee69 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Wed, 12 Nov 2025 20:56:10 +0000 Subject: [PATCH 06/18] Apply checkstyle adjustments --- src/main/docs/algorithm-profiles.adoc | 2 +- src/main/docs/architecture-overview.adoc | 2 +- src/main/docs/invariants-and-contracts.adoc | 2 +- src/main/java/net/openhft/hashing/Access.java | 36 +- .../net/openhft/hashing/LongHashFunction.java | 53 ++- .../hashing/LongTupleHashFunction.java | 4 +- .../{MurmurHash_3.java => MurmurHash3.java} | 142 +++--- .../java/net/openhft/hashing/Primitives.java | 84 +++- src/main/java/net/openhft/hashing/WyHash.java | 82 ++-- src/main/java/net/openhft/hashing/XXH3.java | 440 +++++++++--------- 10 files changed, 464 insertions(+), 383 deletions(-) rename src/main/java/net/openhft/hashing/{MurmurHash_3.java => MurmurHash3.java} (66%) diff --git a/src/main/docs/algorithm-profiles.adoc b/src/main/docs/algorithm-profiles.adoc index bbe8d29..a49393d 100644 --- a/src/main/docs/algorithm-profiles.adoc +++ b/src/main/docs/algorithm-profiles.adoc @@ -42,7 +42,7 @@ Key traits :: Factories :: `LongHashFunction.murmur_3()`, `.murmur_3(long)` for 64-bit (`LongHashFunction.java:245-268`); `LongTupleHashFunction.murmur_3()`, `.murmur_3(long)` for 128-bit (`LongTupleHashFunction.java:35-69`). Implementation :: -`net.openhft.hashing.MurmurHash_3` adapts Austin Appleby's x64 variants. 
+`net.openhft.hashing.MurmurHash3` adapts Austin Appleby's x64 variants. It extends `DualHashFunction` so the 128-bit engine also exposes the low 64 bits through `LongHashFunction`. Key traits :: * Little-endian canonicalisation via `Access.byteOrder`. diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc index b5e7e7a..a7e7fc7 100644 --- a/src/main/docs/architecture-overview.adoc +++ b/src/main/docs/architecture-overview.adoc @@ -24,7 +24,7 @@ It currently delivers 128-bit MurmurHash_3 and XXH3 outputs and mirrors the sing * Each upstream hash family lives in its own package-private class and exposes seed-aware factories back to the public façade. ** `CityAndFarmHash_1_1` adapts CityHash64 1.1 plus FarmHash NA/UO variants, including the short-input specialisations from the original C{pp} sources. -** `MurmurHash_3` contains both 64-bit and 128-bit variants, reusing `DualHashFunction` to provide `LongHashFunction` and `LongTupleHashFunction` accessors. +** `MurmurHash3` contains both 64-bit and 128-bit variants, reusing `DualHashFunction` to provide `LongHashFunction` and `LongTupleHashFunction` accessors. ** `XxHash` implements XXH64 with the upstream prime constants and treats all inputs as little-endian via `Access.byteOrder` (`XxHash.java`). ** `XXH3` delivers XXH3 64-bit and 128-bit functions, including the FARSH-derived secret and block-stripe accumulation strategy (`XXH3.java`). ** `WyHash` ports wyHash v3, including the 256-byte streaming loop and `_wymum` mixing helper built on `Maths.unsignedLongMulXorFold` (`WyHash.java`). 
diff --git a/src/main/docs/invariants-and-contracts.adoc b/src/main/docs/invariants-and-contracts.adoc index ae5624b..3eca821 100644 --- a/src/main/docs/invariants-and-contracts.adoc +++ b/src/main/docs/invariants-and-contracts.adoc @@ -9,7 +9,7 @@ Chronicle Software * Every `LongHashFunction` and `LongTupleHashFunction` implementation treats primitives as if they were written to memory using the platform's native byte order; the API therefore guarantees that `hashLong(v)` equals `hashLongs(new long[] {v})` and similar array forms (`LongHashFunction.java`, `LongTupleHashFunction.java`). * All bundled algorithms normalise multi-byte reads to little-endian before mixing, so the same input bytes produce identical hashes on big- and little-endian machines. -Performance may differ, but results must not (`CityAndFarmHash_1_1.java`, `XxHash.java`, `XXH3.java`, `WyHash.java`, `MetroHash.java`, `MurmurHash_3.java`). +Performance may differ, but results must not (`CityAndFarmHash_1_1.java`, `XxHash.java`, `XXH3.java`, `WyHash.java`, `MetroHash.java`, `MurmurHash3.java`). * `hash(Object, Access, long off, long len)` assumes the addressed region is contiguous and valid for the requested byte count. Implementations do not insert bounds checks beyond those provided by the chosen `Access` strategy, so callers must uphold the contract (`LongHashFunction.java:548-612`). * `hashMemory(long address, long length)` treats the `address` as an absolute memory pointer. 
diff --git a/src/main/java/net/openhft/hashing/Access.java b/src/main/java/net/openhft/hashing/Access.java index c87c6e7..935224f 100644 --- a/src/main/java/net/openhft/hashing/Access.java +++ b/src/main/java/net/openhft/hashing/Access.java @@ -118,7 +118,7 @@ public static Access toNativeCharSequence() { * }} * * @param backingOrder the byte order of {@code char} reads backing - * {@code CharSequences} to access + * {@code CharSequences} to access * @param the {@code CharSequence} subtype to access * @return the {@code Access} to {@link CharSequence}s backed by {@code char} reads made in * the specified byte order @@ -240,13 +240,33 @@ public int getUnsignedByte(T input, long offset) { public abstract int getByte(T input, long offset); // short names - public long i64(final T input, final long offset) { return getLong(input, offset); } - public long u32(final T input, final long offset) { return getUnsignedInt(input, offset); } - public int i32(final T input, final long offset) { return getInt(input, offset); } - public int u16(final T input, final long offset) { return getUnsignedShort(input, offset); } - public int i16(final T input, final long offset) { return getShort(input, offset); } - public int u8(final T input, final long offset) { return getUnsignedByte(input, offset); } - public int i8(final T input, final long offset) { return getByte(input, offset); } + public long i64(final T input, final long offset) { + return getLong(input, offset); + } + + public long u32(final T input, final long offset) { + return getUnsignedInt(input, offset); + } + + public int i32(final T input, final long offset) { + return getInt(input, offset); + } + + public int u16(final T input, final long offset) { + return getUnsignedShort(input, offset); + } + + public int i16(final T input, final long offset) { + return getShort(input, offset); + } + + public int u8(final T input, final long offset) { + return getUnsignedByte(input, offset); + } + + public int i8(final T input, 
final long offset) { + return getByte(input, offset); + } /** * The byte order in which all multi-byte {@code getXXX()} reads from the given {@code input} diff --git a/src/main/java/net/openhft/hashing/LongHashFunction.java b/src/main/java/net/openhft/hashing/LongHashFunction.java index 6eb940d..2c60ee4 100644 --- a/src/main/java/net/openhft/hashing/LongHashFunction.java +++ b/src/main/java/net/openhft/hashing/LongHashFunction.java @@ -56,6 +56,7 @@ public abstract class LongHashFunction implements Serializable { private static final long serialVersionUID = 0L; + // CHECKSTYLE:OFF: MethodName /** * Returns a {@code LongHashFunction} that implements the * @@ -67,6 +68,7 @@ public abstract class LongHashFunction implements Serializable { * @see #city_1_1(long) * @see #city_1_1(long, long) */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction city_1_1() { return CityAndFarmHash_1_1.asLongHashFunctionWithoutSeed(); } @@ -83,6 +85,7 @@ public static LongHashFunction city_1_1() { * @see #city_1_1() * @see #city_1_1(long, long) */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction city_1_1(long seed) { return CityAndFarmHash_1_1.asLongHashFunctionWithSeed(seed); } @@ -100,6 +103,7 @@ public static LongHashFunction city_1_1(long seed) { * @see #city_1_1() * @see #city_1_1(long) */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction city_1_1(long seed0, long seed1) { return CityAndFarmHash_1_1.asLongHashFunctionWithTwoSeeds(seed0, seed1); } @@ -225,8 +229,9 @@ public static LongHashFunction farmUo(long seed0, long seed1) { * @return a {@code LongHashFunction} implementing the MurmurHash3 algorithm without seed values * @see #murmur_3(long) */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction murmur_3() { - return MurmurHash_3.asLongHashFunctionWithoutSeed(); + return MurmurHash3.asLongHashFunctionWithoutSeed(); } /** @@ -240,8 +245,9 @@ public static LongHashFunction 
murmur_3() { * @return a {@code LongHashFunction} implementing the MurmurHash3 algorithm with the given seed value * @see #murmur_3() */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction murmur_3(long seed) { - return MurmurHash_3.asLongHashFunctionWithSeed(seed); + return MurmurHash3.asLongHashFunctionWithSeed(seed); } /** @@ -335,6 +341,7 @@ public static LongHashFunction xx128low(final long seed) { * @return a {@code LongHashFunction} implementing the wyhash algorithm, version 3, without a seed value * @see #wy_3(long) */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction wy_3() { return WyHash.asLongHashFunctionWithoutSeed(); } @@ -350,9 +357,11 @@ public static LongHashFunction wy_3() { * @return a {@code LongHashFunction} implementing the wyhash algorithm, version 3, with the given seed value * @see #wy_3() */ + @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction wy_3(long seed) { return WyHash.asLongHashFunctionWithSeed(seed); } + // CHECKSTYLE:ON: MethodName /** * Returns a hash function implementing the 64 bit version of @@ -669,9 +678,9 @@ public long hashChars(@NotNull String input, int off, int len) { /** * Shortcut for {@link #hashChars(StringBuilder, int, int) hashChars(input, 0, input.length())}. - * - * @param input the StringBuilder to be hashed - * @return the hash code for the given StringBuilder + * + * @param input the StringBuilder to be hashed + * @return the hash code for the given StringBuilder */ public long hashChars(@NotNull StringBuilder input) { return hashNativeChars(input); @@ -697,33 +706,33 @@ public long hashChars(@NotNull StringBuilder input, int off, int len) { return hashNativeChars(input, off, len); } -/** - * Returns the hash code for the entire CharSequence. - * - * @param input the CharSequence to be hashed - * @return the hash code for the given CharSequence - */ + /** + * Returns the hash code for the entire CharSequence. 
+ * + * @param input the CharSequence to be hashed + * @return the hash code for the given CharSequence + */ long hashNativeChars(CharSequence input) { return hashNativeChars(input, 0, input.length()); } -/** - * Returns the hash code for a subsequence of the given CharSequence. - * - * @param input the CharSequence to be hashed - * @param off the index of the first char in the subsequence - * @param len the length of the subsequence - * @return the hash code for the specified subsequence of the given CharSequence - */ + /** + * Returns the hash code for a subsequence of the given CharSequence. + * + * @param input the CharSequence to be hashed + * @param off the index of the first char in the subsequence + * @param len the length of the subsequence + * @return the hash code for the specified subsequence of the given CharSequence + */ long hashNativeChars(CharSequence input, int off, int len) { return hash(input, nativeCharSequenceAccess(), off * 2L, len * 2L); } /** * Shortcut for {@link #hashShorts(short[], int, int) hashShorts(input, 0, input.length)}. 
- * - * @param input the short array to be hashed - * @return the hash code for the given short array + * + * @param input the short array to be hashed + * @return the hash code for the given short array */ public long hashShorts(@NotNull short[] input) { return unsafeHash(input, SHORT_BASE, input.length * 2L); diff --git a/src/main/java/net/openhft/hashing/LongTupleHashFunction.java b/src/main/java/net/openhft/hashing/LongTupleHashFunction.java index baa4ba9..72d22b9 100644 --- a/src/main/java/net/openhft/hashing/LongTupleHashFunction.java +++ b/src/main/java/net/openhft/hashing/LongTupleHashFunction.java @@ -77,7 +77,7 @@ public abstract class LongTupleHashFunction implements Serializable { */ @NotNull public static LongTupleHashFunction murmur_3() { - return MurmurHash_3.asLongTupleHashFunctionWithoutSeed(); + return MurmurHash3.asLongTupleHashFunctionWithoutSeed(); } /** @@ -91,7 +91,7 @@ public static LongTupleHashFunction murmur_3() { */ @NotNull public static LongTupleHashFunction murmur_3(final long seed) { - return MurmurHash_3.asLongTupleHashFunctionWithSeed(seed); + return MurmurHash3.asLongTupleHashFunctionWithSeed(seed); } /** diff --git a/src/main/java/net/openhft/hashing/MurmurHash_3.java b/src/main/java/net/openhft/hashing/MurmurHash3.java similarity index 66% rename from src/main/java/net/openhft/hashing/MurmurHash_3.java rename to src/main/java/net/openhft/hashing/MurmurHash3.java index 2017076..6aa3459 100644 --- a/src/main/java/net/openhft/hashing/MurmurHash_3.java +++ b/src/main/java/net/openhft/hashing/MurmurHash3.java @@ -17,7 +17,7 @@ * /guava/src/com/google/common/hash/Murmur3_128HashFunction.java */ @ParametersAreNonnullByDefault -class MurmurHash_3 { +class MurmurHash3 { private static final long C1 = 0x87c37b91114253d5L; private static final long C2 = 0x4cf5ad432745937fL; @@ -91,72 +91,72 @@ private static long hash(long seed, @Nullable T input, Access access, lon // This version appears to be working slower -// if (remaining > 0L) { -// 
long k1 = 0L; -// long k2 = 0L; -// megaSwitch: -// { -// fetch0_7: -// { -// fetch8_11: -// { -// fetch0_3: -// { -// switch ((int) remaining) { -// case 15: -// k2 ^= ((long) access.u8(input, offset + 14L)) << 48; -// case 14: -// k2 ^= ((long) Primitives.nativeToLittleEndian( -// access.u16(input, offset + 12L))) << 32; -// break fetch8_11; -// case 13: -// k2 ^= ((long) access.u8(input, offset + 12L)) << 32; -// case 12: -// break fetch8_11; -// case 11: -// k2 ^= ((long) access.u8(input, offset + 10L)) << 16; -// case 10: -// k2 ^= (long) Primitives.nativeToLittleEndian( -// access.u16(input, offset + 8L)); -// break fetch0_7; -// case 9: -// k2 ^= ((long) access.u8(input, offset + 8L)); -// case 8: -// break fetch0_7; -// case 7: -// k1 ^= ((long) access.u8(input, offset + 6L)) << 48; -// case 6: -// k1 ^= ((long) Primitives.nativeToLittleEndian( -// access.u16(input, offset + 4L))) << 32; -// break fetch0_3; -// case 5: -// k1 ^= ((long) access.u8(input, offset + 4L)) << 32; -// case 4: -// break fetch0_3; -// case 3: -// k1 ^= ((long) access.u8(input, offset + 2L)) << 16; -// case 2: -// k1 ^= (long) Primitives.nativeToLittleEndian( -// access.u16(input, offset)); -// break megaSwitch; -// case 1: -// k1 ^= ((long) access.u8(input, offset)); -// break megaSwitch; -// default: -// throw new AssertionError(); -// } -// } // fetch0_3 -// k1 ^= access.u32(input, offset); -// break megaSwitch; -// } // fetch8_11 -// k2 ^= access.u32(input, offset + 8L); -// } // fetch0_7 -// k1 ^= access.i64(input, offset); -// } // megaSwitch -// -// h1 ^= mixK1(k1); -// h2 ^= mixK2(k2); -// } + // if (remaining > 0L) { + // long k1 = 0L; + // long k2 = 0L; + // megaSwitch: + // { + // fetch0_7: + // { + // fetch8_11: + // { + // fetch0_3: + // { + // switch ((int) remaining) { + // case 15: + // k2 ^= ((long) access.u8(input, offset + 14L)) << 48; + // case 14: + // k2 ^= ((long) Primitives.nativeToLittleEndian( + // access.u16(input, offset + 12L))) << 32; + // break 
fetch8_11; + // case 13: + // k2 ^= ((long) access.u8(input, offset + 12L)) << 32; + // case 12: + // break fetch8_11; + // case 11: + // k2 ^= ((long) access.u8(input, offset + 10L)) << 16; + // case 10: + // k2 ^= (long) Primitives.nativeToLittleEndian( + // access.u16(input, offset + 8L)); + // break fetch0_7; + // case 9: + // k2 ^= ((long) access.u8(input, offset + 8L)); + // case 8: + // break fetch0_7; + // case 7: + // k1 ^= ((long) access.u8(input, offset + 6L)) << 48; + // case 6: + // k1 ^= ((long) Primitives.nativeToLittleEndian( + // access.u16(input, offset + 4L))) << 32; + // break fetch0_3; + // case 5: + // k1 ^= ((long) access.u8(input, offset + 4L)) << 32; + // case 4: + // break fetch0_3; + // case 3: + // k1 ^= ((long) access.u8(input, offset + 2L)) << 16; + // case 2: + // k1 ^= (long) Primitives.nativeToLittleEndian( + // access.u16(input, offset)); + // break megaSwitch; + // case 1: + // k1 ^= ((long) access.u8(input, offset)); + // break megaSwitch; + // default: + // throw new AssertionError(); + // } + // } // fetch0_3 + // k1 ^= access.u32(input, offset); + // break megaSwitch; + // } // fetch8_11 + // k2 ^= access.u32(input, offset + 8L); + // } // fetch0_7 + // k1 ^= access.i64(input, offset); + // } // megaSwitch + // + // h1 ^= mixK1(k1); + // h2 ^= mixK2(k2); + // } return finalize(length, h1, h2, result); } @@ -232,7 +232,7 @@ long seed() { protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result) { long h1 = mixK1(nativeLong); long h2 = 0L; - return MurmurHash_3.finalize(len, h1, h2, result); + return MurmurHash3.finalize(len, h1, h2, result); } @Override @@ -272,7 +272,7 @@ public long dualHashVoid(@Nullable long[] result) { @Override public long dualHash(@Nullable T input, Access access, long off, long len, @Nullable long[] result) { long seed = seed(); - return MurmurHash_3.hash(seed, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); + return MurmurHash3.hash(seed, input, 
access.byteOrder(input, LITTLE_ENDIAN), off, len, result); } } @@ -294,7 +294,7 @@ private static class AsLongTupleHashFunctionSeeded extends AsLongTupleHashFuncti private AsLongTupleHashFunctionSeeded(long seed) { this.seed = seed; - MurmurHash_3.finalize(0L, seed, seed, voidHash); + MurmurHash3.finalize(0L, seed, seed, voidHash); } @Override @@ -307,7 +307,7 @@ protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result long seed = this.seed; long h1 = seed ^ mixK1(nativeLong); long h2 = seed; - return MurmurHash_3.finalize(len, h1, h2, result); + return MurmurHash3.finalize(len, h1, h2, result); } @Override diff --git a/src/main/java/net/openhft/hashing/Primitives.java b/src/main/java/net/openhft/hashing/Primitives.java index cc91f62..b867e6c 100644 --- a/src/main/java/net/openhft/hashing/Primitives.java +++ b/src/main/java/net/openhft/hashing/Primitives.java @@ -8,7 +8,8 @@ final class Primitives { - private Primitives() {} + private Primitives() { + } static final boolean NATIVE_LITTLE_ENDIAN = nativeOrder() == LITTLE_ENDIAN; @@ -27,26 +28,75 @@ static int unsignedByte(int b) { private static final ByteOrderHelper H2LE = NATIVE_LITTLE_ENDIAN ? new ByteOrderHelper() : new ByteOrderHelperReverse(); private static final ByteOrderHelper H2BE = NATIVE_LITTLE_ENDIAN ? 
new ByteOrderHelperReverse() : new ByteOrderHelper(); - static long nativeToLittleEndian(final long v) { return H2LE.adjustByteOrder(v); } - static int nativeToLittleEndian(final int v) { return H2LE.adjustByteOrder(v); } - static short nativeToLittleEndian(final short v) { return H2LE.adjustByteOrder(v); } - static char nativeToLittleEndian(final char v) { return H2LE.adjustByteOrder(v); } + static long nativeToLittleEndian(final long v) { + return H2LE.adjustByteOrder(v); + } + + static int nativeToLittleEndian(final int v) { + return H2LE.adjustByteOrder(v); + } + + static short nativeToLittleEndian(final short v) { + return H2LE.adjustByteOrder(v); + } + + static char nativeToLittleEndian(final char v) { + return H2LE.adjustByteOrder(v); + } + + static long nativeToBigEndian(final long v) { + return H2BE.adjustByteOrder(v); + } + + static int nativeToBigEndian(final int v) { + return H2BE.adjustByteOrder(v); + } + + static short nativeToBigEndian(final short v) { + return H2BE.adjustByteOrder(v); + } - static long nativeToBigEndian(final long v) { return H2BE.adjustByteOrder(v); } - static int nativeToBigEndian(final int v) { return H2BE.adjustByteOrder(v); } - static short nativeToBigEndian(final short v) { return H2BE.adjustByteOrder(v); } - static char nativeToBigEndian(final char v) { return H2BE.adjustByteOrder(v); } + static char nativeToBigEndian(final char v) { + return H2BE.adjustByteOrder(v); + } private static class ByteOrderHelper { - long adjustByteOrder(final long v) { return v; } - int adjustByteOrder(final int v) { return v; } - short adjustByteOrder(final short v) { return v; } - char adjustByteOrder(final char v) { return v; } + long adjustByteOrder(final long v) { + return v; + } + + int adjustByteOrder(final int v) { + return v; + } + + short adjustByteOrder(final short v) { + return v; + } + + char adjustByteOrder(final char v) { + return v; + } } + private static class ByteOrderHelperReverse extends ByteOrderHelper { - long 
adjustByteOrder(final long v) { return Long.reverseBytes(v); } - int adjustByteOrder(final int v) { return Integer.reverseBytes(v); } - short adjustByteOrder(final short v) { return Short.reverseBytes(v); } - char adjustByteOrder(final char v) { return Character.reverseBytes(v); } + @Override + long adjustByteOrder(final long v) { + return Long.reverseBytes(v); + } + + @Override + int adjustByteOrder(final int v) { + return Integer.reverseBytes(v); + } + + @Override + short adjustByteOrder(final short v) { + return Short.reverseBytes(v); + } + + @Override + char adjustByteOrder(final char v) { + return Character.reverseBytes(v); + } } } diff --git a/src/main/java/net/openhft/hashing/WyHash.java b/src/main/java/net/openhft/hashing/WyHash.java index acdeb24..c1081b1 100644 --- a/src/main/java/net/openhft/hashing/WyHash.java +++ b/src/main/java/net/openhft/hashing/WyHash.java @@ -21,11 +21,11 @@ class WyHash { public static final long _wyp3 = 0x589965cc75374cc3L; public static final long _wyp4 = 0x1d8e4e27c47d124fL; - private static long _wymum(final long lhs, final long rhs) { + private static long wyMum(final long lhs, final long rhs) { return Maths.unsignedLongMulXorFold(lhs, rhs); } - private static long _wyr3(final Access access, T in, final long index, long k) { + private static long wyR3(final Access access, T in, final long index, long k) { return ((long) access.u8(in, index) << 16) | ((long) access.u8(in, index + (k >>> 1)) << 8) | ((long) access.u8(in, index + k - 1)); @@ -49,86 +49,88 @@ static long wyHash64(long seed, T input, Access access, long off, long le if(length <= 0) return 0; else if(length<4) - return _wymum(_wymum(_wyr3(access, input,off,length)^seed^_wyp0, + return wyMum(wyMum(wyR3(access, input,off,length)^seed^_wyp0, seed^_wyp1)^seed,length^_wyp4); else if(length<=8) - return _wymum(_wymum(access.u32(input, off) ^ seed ^ _wyp0, + return wyMum(wyMum(access.u32(input, off) ^ seed ^ _wyp0, access.u32(input, off + length - 4) ^ seed ^ _wyp1) ^ 
seed, length ^ _wyp4); else if(length<=16) - return _wymum(_wymum(u64Rorate32(access, input,off)^seed^_wyp0, + return wyMum(wyMum(u64Rorate32(access, input,off)^seed^_wyp0, u64Rorate32(access, input,off+length-8)^seed^_wyp1) ^seed,length^_wyp4); else if(length<=24) - return _wymum(_wymum(u64Rorate32(access, input,off)^seed^_wyp0, + return wyMum(wyMum(u64Rorate32(access, input,off)^seed^_wyp0, u64Rorate32(access, input,off+8)^seed^_wyp1)^ - _wymum(u64Rorate32(access, input,off+length-8) + wyMum(u64Rorate32(access, input,off+length-8) ^seed^_wyp2,seed^_wyp3),length^_wyp4); else if(length<=32) - return _wymum(_wymum(u64Rorate32(access, input,off)^seed^_wyp0, + return wyMum(wyMum(u64Rorate32(access, input,off)^seed^_wyp0, u64Rorate32(access, input,off+8)^seed^_wyp1) - ^_wymum(u64Rorate32(access, input,off+16)^seed^_wyp2, + ^wyMum(u64Rorate32(access, input,off+16)^seed^_wyp2, u64Rorate32(access, input,off+length-8)^seed^_wyp3),length^_wyp4); - long see1=seed; long i=length, p=off; + long see1 = seed; + long i = length; + long p = off; for(;i>256;i-=256,p+=256){ - seed = _wymum(access.i64(input, p) ^ seed ^ _wyp0, + seed = wyMum(access.i64(input, p) ^ seed ^ _wyp0, access.i64(input, p + 8) ^ seed ^ _wyp1) ^ - _wymum(access.i64(input, p + 16) ^ seed ^ _wyp2, + wyMum(access.i64(input, p + 16) ^ seed ^ _wyp2, access.i64(input, p + 24) ^ seed ^ _wyp3); - see1 = _wymum(access.i64(input, p + 32) ^ see1 ^ _wyp1, + see1 = wyMum(access.i64(input, p + 32) ^ see1 ^ _wyp1, access.i64(input, p + 40) ^ see1 ^ _wyp2) ^ - _wymum(access.i64(input, p + 48) ^ see1 ^ _wyp3, + wyMum(access.i64(input, p + 48) ^ see1 ^ _wyp3, access.i64(input, p + 56) ^ see1 ^ _wyp0); - seed = _wymum(access.i64(input, p + 64) ^ seed ^ _wyp0, + seed = wyMum(access.i64(input, p + 64) ^ seed ^ _wyp0, access.i64(input, p + 72) ^ seed ^ _wyp1) ^ - _wymum(access.i64(input, p + 80) ^ seed ^ _wyp2, + wyMum(access.i64(input, p + 80) ^ seed ^ _wyp2, access.i64(input, p + 88) ^ seed ^ _wyp3); - see1 = 
_wymum(access.i64(input, p + 96) ^ see1 ^ _wyp1, + see1 = wyMum(access.i64(input, p + 96) ^ see1 ^ _wyp1, access.i64(input, p + 104) ^ see1 ^ _wyp2) ^ - _wymum(access.i64(input, p + 112) ^ see1 ^ _wyp3, + wyMum(access.i64(input, p + 112) ^ see1 ^ _wyp3, access.i64(input, p + 120) ^ see1 ^ _wyp0); - seed = _wymum(access.i64(input, p + 128) ^ seed ^ _wyp0, + seed = wyMum(access.i64(input, p + 128) ^ seed ^ _wyp0, access.i64(input, p + 136) ^ seed ^ _wyp1) ^ - _wymum(access.i64(input, p + 144) ^ seed ^ _wyp2, + wyMum(access.i64(input, p + 144) ^ seed ^ _wyp2, access.i64(input, p + 152) ^ seed ^ _wyp3); - see1 = _wymum(access.i64(input, p + 160) ^ see1 ^ _wyp1, + see1 = wyMum(access.i64(input, p + 160) ^ see1 ^ _wyp1, access.i64(input, p + 168) ^ see1 ^ _wyp2) ^ - _wymum(access.i64(input, p + 176) ^ see1 ^ _wyp3, + wyMum(access.i64(input, p + 176) ^ see1 ^ _wyp3, access.i64(input, p + 184) ^ see1 ^ _wyp0); - seed = _wymum(access.i64(input, p + 192) ^ seed ^ _wyp0, + seed = wyMum(access.i64(input, p + 192) ^ seed ^ _wyp0, access.i64(input, p + 200) ^ seed ^ _wyp1) ^ - _wymum(access.i64(input, p + 208) ^ seed ^ _wyp2, + wyMum(access.i64(input, p + 208) ^ seed ^ _wyp2, access.i64(input, p + 216) ^ seed ^ _wyp3); - see1 = _wymum(access.i64(input, p + 224) ^ see1 ^ _wyp1, + see1 = wyMum(access.i64(input, p + 224) ^ see1 ^ _wyp1, access.i64(input, p + 232) ^ see1 ^ _wyp2) ^ - _wymum(access.i64(input, p + 240) ^ see1 ^ _wyp3, + wyMum(access.i64(input, p + 240) ^ see1 ^ _wyp3, access.i64(input, p + 248) ^ see1 ^ _wyp0); } for (; i > 32; i -= 32, p += 32) { - seed = _wymum(access.i64(input, p) ^ seed ^ _wyp0, + seed = wyMum(access.i64(input, p) ^ seed ^ _wyp0, access.i64(input, p + 8) ^ seed ^ _wyp1); - see1 = _wymum(access.i64(input, p + 16) ^ see1 ^ _wyp2, + see1 = wyMum(access.i64(input, p + 16) ^ see1 ^ _wyp2, access.i64(input, p + 24) ^ see1 ^ _wyp3); } if (i < 4) { - seed = _wymum(_wyr3(access, input, p, i) ^ seed ^ _wyp0, seed ^ _wyp1); + seed = wyMum(wyR3(access, input, 
p, i) ^ seed ^ _wyp0, seed ^ _wyp1); } else if (i <= 8) { - seed = _wymum(access.u32(input, p) ^ seed ^ _wyp0, + seed = wyMum(access.u32(input, p) ^ seed ^ _wyp0, access.u32(input, p + i - 4) ^ seed ^ _wyp1); } else if (i <= 16) { - seed = _wymum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, + seed = wyMum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, u64Rorate32(access, input, p + i - 8) ^ seed ^ _wyp1); } else if (i <= 24) { - seed = _wymum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, + seed = wyMum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, u64Rorate32(access, input, p + 8) ^ seed ^ _wyp1); - see1 = _wymum(u64Rorate32(access, input, p + i - 8) ^ see1 ^ _wyp2, see1 ^ _wyp3); + see1 = wyMum(u64Rorate32(access, input, p + i - 8) ^ see1 ^ _wyp2, see1 ^ _wyp3); } else { - seed = _wymum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, + seed = wyMum(u64Rorate32(access, input, p) ^ seed ^ _wyp0, u64Rorate32(access, input, p + 8) ^ seed ^ _wyp1); - see1 = _wymum(u64Rorate32(access, input, p + 16) ^ see1 ^ _wyp2, + see1 = wyMum(u64Rorate32(access, input, p + 16) ^ see1 ^ _wyp2, u64Rorate32(access, input, p + i - 8) ^ see1 ^ _wyp3); } - return _wymum(seed ^ see1, length ^ _wyp4); + return wyMum(seed ^ see1, length ^ _wyp4); } static LongHashFunction asLongHashFunctionWithoutSeed() { @@ -152,7 +154,7 @@ public long hashLong(long input) { input = Primitives.nativeToLittleEndian(input); long hi = input & 0xFFFFFFFFL; long lo = (input >>> 32) & 0xFFFFFFFFL; - return _wymum(_wymum(hi ^ seed() ^ _wyp0, + return wyMum(wyMum(hi ^ seed() ^ _wyp0, lo ^ seed() ^ _wyp1) ^ seed(), 8 ^ _wyp4); } @@ -161,7 +163,7 @@ public long hashLong(long input) { public long hashInt(int input) { input = Primitives.nativeToLittleEndian(input); long longInput = (input & 0xFFFFFFFFL); - return _wymum(_wymum(longInput ^ seed() ^ _wyp0, + return wyMum(wyMum(longInput ^ seed() ^ _wyp0, longInput ^ seed() ^ _wyp1) ^ seed(), 4 ^ _wyp4); } @@ -171,7 +173,7 @@ public long hashShort(short input) { input = 
Primitives.nativeToLittleEndian(input); long hi = (input >>> 8) & 0xFFL; long wyr3 = hi | hi << 8 | (input & 0xFFL) << 16; - return _wymum(_wymum(wyr3 ^ seed() ^ _wyp0, + return wyMum(wyMum(wyr3 ^ seed() ^ _wyp0, seed() ^ _wyp1) ^ seed(), 2 ^ _wyp4); } @@ -184,7 +186,7 @@ public long hashChar(final char input) { public long hashByte(final byte input) { long hi = input & 0xFFL; long wyr3 = hi | hi << 8 | hi << 16; - return _wymum(_wymum(wyr3 ^ seed() ^ _wyp0, + return wyMum(wyMum(wyr3 ^ seed() ^ _wyp0, seed() ^ _wyp1) ^ seed(), 1 ^ _wyp4); } diff --git a/src/main/java/net/openhft/hashing/XXH3.java b/src/main/java/net/openhft/hashing/XXH3.java index 4ff62a3..76eaf67 100644 --- a/src/main/java/net/openhft/hashing/XXH3.java +++ b/src/main/java/net/openhft/hashing/XXH3.java @@ -47,19 +47,19 @@ class XXH3 { private static final long nbStripesPerBlock = (192 - 64) / 8; private static final long block_len = 64 * nbStripesPerBlock; - private static long XXH64_avalanche(long h64) { + private static long xxh64Avalanche(long h64) { h64 ^= h64 >>> 33; h64 *= XXH_PRIME64_2; h64 ^= h64 >>> 29; h64 *= XXH_PRIME64_3; return h64 ^ (h64 >>> 32); } - private static long XXH3_avalanche(long h64) { + private static long xxh3Avalanche(long h64) { h64 ^= h64 >>> 37; h64 *= 0x165667919E3779F9L; return h64 ^ (h64 >>> 32); } - private static long XXH3_rrmxmx(long h64, final long length) { + private static long xxh3Rrmxmx(long h64, final long length) { h64 ^= Long.rotateLeft(h64, 49) ^ Long.rotateLeft(h64, 24); h64 *= 0x9FB21C651E98DF25L; h64 ^= (h64 >>> 35) + length; @@ -67,42 +67,42 @@ private static long XXH3_rrmxmx(long h64, final long length) { return h64 ^ (h64 >>> 28); } - private static long XXH3_mix16B(final long seed, final T input, final Access access, final long offIn, final long offSec) { - final long input_lo = access.i64(input, offIn); - final long input_hi = access.i64(input, offIn + 8); + private static long xxh3Mix16B(final long seed, final T input, final Access access, 
final long offIn, final long offSec) { + final long inputLo = access.i64(input, offIn); + final long inputHi = access.i64(input, offIn + 8); return unsignedLongMulXorFold( - input_lo ^ (unsafeLE.i64(XXH3_kSecret, offSec) + seed), - input_hi ^ (unsafeLE.i64(XXH3_kSecret, offSec+8) - seed) + inputLo ^ (unsafeLE.i64(XXH3_kSecret, offSec) + seed), + inputHi ^ (unsafeLE.i64(XXH3_kSecret, offSec + 8) - seed) ); } /* - * A bit slower than XXH3_mix16B, but handles multiply by zero better. + * A bit slower than xxh3Mix16B, but handles multiply by zero better. */ - private static long XXH128_mix32B_once(final long seed, final long offSec, long acc, final long input0, final long input1, final long input2, final long input3) { + private static long xxh128Mix32BOnce(final long seed, final long offSec, long acc, final long input0, final long input1, final long input2, final long input3) { acc += unsignedLongMulXorFold( - input0 ^ (unsafeLE.i64(XXH3_kSecret, offSec ) + seed), + input0 ^ (unsafeLE.i64(XXH3_kSecret, offSec) + seed), input1 ^ (unsafeLE.i64(XXH3_kSecret, offSec + 8) - seed)); return acc ^ (input2 + input3); } - private static long XXH3_mix2Accs(final long acc_lh, final long acc_rh, final byte[] secret, final long offSec) { + private static long xxh3Mix2Accs(final long accLeft, final long accRight, final byte[] secret, final long offSec) { return unsignedLongMulXorFold( - acc_lh ^ unsafeLE.i64(secret, offSec), - acc_rh ^ unsafeLE.i64(secret, offSec+8) ); + accLeft ^ unsafeLE.i64(secret, offSec), + accRight ^ unsafeLE.i64(secret, offSec + 8)); } - private static long XXH3_64bits_internal(final long seed, final byte[] secret, final T input, final Access access, final long off, final long length) { + private static long xxh364BitsInternal(final long seed, final byte[] secret, final T input, final Access access, final long off, final long length) { if (length <= 16) { // XXH3_len_0to16_64b if (length > 8) { // XXH3_len_9to16_64b final long bitflip1 = 
(unsafeLE.i64(XXH3_kSecret, 24+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 32+BYTE_BASE)) + seed; final long bitflip2 = (unsafeLE.i64(XXH3_kSecret, 40+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 48+BYTE_BASE)) - seed; - final long input_lo = access.i64(input, off) ^ bitflip1; - final long input_hi = access.i64(input, off + length - 8) ^ bitflip2; - final long acc = length + Long.reverseBytes(input_lo) + input_hi + unsignedLongMulXorFold(input_lo, input_hi); - return XXH3_avalanche(acc); + final long inputLo = access.i64(input, off) ^ bitflip1; + final long inputHi = access.i64(input, off + length - 8) ^ bitflip2; + final long acc = length + Long.reverseBytes(inputLo) + inputHi + unsignedLongMulXorFold(inputLo, inputHi); + return xxh3Avalanche(acc); } if (length >= 4) { // XXH3_len_4to8_64b @@ -111,7 +111,7 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long input2 = access.u32(input, off + length - 4); final long bitflip = (unsafeLE.i64(XXH3_kSecret, 8+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 16+BYTE_BASE)) - s; final long keyed = (input2 + (input1 << 32)) ^ bitflip; - return XXH3_rrmxmx(keyed, length); + return xxh3Rrmxmx(keyed, length); } if (length != 0) { // XXH3_len_1to3_64b @@ -120,9 +120,9 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final int c3 = access.u8(input, off + length - 1); final long combined = Primitives.unsignedInt((c1 << 16) | (c2 << 24) | c3 | ((int)length << 8)); final long bitflip = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE) ^ unsafeLE.i32(XXH3_kSecret, 4+BYTE_BASE)) + seed; - return XXH64_avalanche(combined ^ bitflip); + return xxh64Avalanche(combined ^ bitflip); } - return XXH64_avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, 56+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 64+BYTE_BASE)); + return xxh64Avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, 56+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 64+BYTE_BASE)); } if (length <= 128) { // XXH3_len_17to128_64b @@ -131,19 
+131,19 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre if (length > 32) { if (length > 64) { if (length > 96) { - acc += XXH3_mix16B(seed, input, access, off + 48, BYTE_BASE + 96); - acc += XXH3_mix16B(seed, input, access, off + length - 64, BYTE_BASE + 112); + acc += xxh3Mix16B(seed, input, access, off + 48, BYTE_BASE + 96); + acc += xxh3Mix16B(seed, input, access, off + length - 64, BYTE_BASE + 112); } - acc += XXH3_mix16B(seed, input, access, off + 32, BYTE_BASE + 64); - acc += XXH3_mix16B(seed, input, access, off + length - 48, BYTE_BASE + 80); + acc += xxh3Mix16B(seed, input, access, off + 32, BYTE_BASE + 64); + acc += xxh3Mix16B(seed, input, access, off + length - 48, BYTE_BASE + 80); } - acc += XXH3_mix16B(seed, input, access, off + 16, BYTE_BASE + 32); - acc += XXH3_mix16B(seed, input, access, off + length - 32, BYTE_BASE + 48); + acc += xxh3Mix16B(seed, input, access, off + 16, BYTE_BASE + 32); + acc += xxh3Mix16B(seed, input, access, off + length - 32, BYTE_BASE + 48); } - acc += XXH3_mix16B(seed, input, access, off, BYTE_BASE); - acc += XXH3_mix16B(seed, input, access, off + length - 16, BYTE_BASE + 16); + acc += xxh3Mix16B(seed, input, access, off, BYTE_BASE); + acc += xxh3Mix16B(seed, input, access, off + length - 16, BYTE_BASE + 16); - return XXH3_avalanche(acc); + return xxh3Avalanche(acc); } if (length <= 240) { // XXH3_len_129to240_64b @@ -151,35 +151,35 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final int nbRounds = (int)length / 16; int i = 0; for (; i < 8; ++i) { - acc += XXH3_mix16B(seed, input, access, off + 16*i, BYTE_BASE + 16*i); + acc += xxh3Mix16B(seed, input, access, off + 16*i, BYTE_BASE + 16*i); } - acc = XXH3_avalanche(acc); + acc = xxh3Avalanche(acc); for (; i < nbRounds; ++i) { - acc += XXH3_mix16B(seed, input, access, off + 16*i, BYTE_BASE + 16*(i-8) + 3); + acc += xxh3Mix16B(seed, input, access, off + 16*i, BYTE_BASE + 16*(i-8) + 3); } /* last bytes */ - acc += 
XXH3_mix16B(seed, input, access, off + length - 16, BYTE_BASE + 136 - 17); - return XXH3_avalanche(acc); + acc += xxh3Mix16B(seed, input, access, off + length - 16, BYTE_BASE + 136 - 17); + return xxh3Avalanche(acc); } // XXH3_hashLong_64b_internal - long acc_0 = XXH_PRIME32_3; - long acc_1 = XXH_PRIME64_1; - long acc_2 = XXH_PRIME64_2; - long acc_3 = XXH_PRIME64_3; - long acc_4 = XXH_PRIME64_4; - long acc_5 = XXH_PRIME32_2; - long acc_6 = XXH_PRIME64_5; - long acc_7 = XXH_PRIME32_1; + long acc0 = XXH_PRIME32_3; + long acc1 = XXH_PRIME64_1; + long acc2 = XXH_PRIME64_2; + long acc3 = XXH_PRIME64_3; + long acc4 = XXH_PRIME64_4; + long acc5 = XXH_PRIME32_2; + long acc6 = XXH_PRIME64_5; + long acc7 = XXH_PRIME32_1; // XXH3_hashLong_internal_loop final long nb_blocks = (length - 1) / block_len; for (long n = 0; n < nb_blocks; n++) { // XXH3_accumulate final long offBlock = off + n * block_len; - for (long s = 0; s < nbStripesPerBlock; s++ ) { + for (long s = 0; s < nbStripesPerBlock; s++) { // XXH3_accumulate_512 final long offStripe = offBlock + s * 64; final long offSec = s * 8; @@ -189,8 +189,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -198,8 +198,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + 
offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -207,8 +207,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*6); @@ -216,21 +216,21 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } } // XXH3_scrambleAcc_scalar final long offSec = BYTE_BASE + 192 - 64; - acc_0 = (acc_0 ^ (acc_0 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*0)) * XXH_PRIME32_1; - acc_1 = (acc_1 ^ (acc_1 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*1)) * XXH_PRIME32_1; - acc_2 = (acc_2 ^ (acc_2 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*2)) * XXH_PRIME32_1; - acc_3 = (acc_3 ^ 
(acc_3 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*3)) * XXH_PRIME32_1; - acc_4 = (acc_4 ^ (acc_4 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*4)) * XXH_PRIME32_1; - acc_5 = (acc_5 ^ (acc_5 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*5)) * XXH_PRIME32_1; - acc_6 = (acc_6 ^ (acc_6 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*6)) * XXH_PRIME32_1; - acc_7 = (acc_7 ^ (acc_7 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*7)) * XXH_PRIME32_1; + acc0 = (acc0 ^ (acc0 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*0)) * XXH_PRIME32_1; + acc1 = (acc1 ^ (acc1 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*1)) * XXH_PRIME32_1; + acc2 = (acc2 ^ (acc2 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*2)) * XXH_PRIME32_1; + acc3 = (acc3 ^ (acc3 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*3)) * XXH_PRIME32_1; + acc4 = (acc4 ^ (acc4 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*4)) * XXH_PRIME32_1; + acc5 = (acc5 ^ (acc5 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*5)) * XXH_PRIME32_1; + acc6 = (acc6 ^ (acc6 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*6)) * XXH_PRIME32_1; + acc7 = (acc7 ^ (acc7 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*7)) * XXH_PRIME32_1; } /* last partial block */ @@ -246,8 +246,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -255,8 +255,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = 
data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -264,8 +264,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*6); @@ -273,8 +273,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } } @@ -288,8 +288,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += 
data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -297,8 +297,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -306,8 +306,8 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*6); @@ -315,66 +315,66 @@ private static long XXH3_64bits_internal(final long seed, final byte[] secre final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * 
(data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } // XXH3_mergeAccs final long result64 = length * XXH_PRIME64_1 - + XXH3_mix2Accs(acc_0, acc_1, secret, BYTE_BASE + 11) - + XXH3_mix2Accs(acc_2, acc_3, secret, BYTE_BASE + 11 + 16) - + XXH3_mix2Accs(acc_4, acc_5, secret, BYTE_BASE + 11 + 16 * 2) - + XXH3_mix2Accs(acc_6, acc_7, secret, BYTE_BASE + 11 + 16 * 3); + + xxh3Mix2Accs(acc0, acc1, secret, BYTE_BASE + 11) + + xxh3Mix2Accs(acc2, acc3, secret, BYTE_BASE + 11 + 16) + + xxh3Mix2Accs(acc4, acc5, secret, BYTE_BASE + 11 + 16 * 2) + + xxh3Mix2Accs(acc6, acc7, secret, BYTE_BASE + 11 + 16 * 3); - return XXH3_avalanche(result64); + return xxh3Avalanche(result64); } - private static long XXH3_128bits_internal(final long seed, final byte[] secret, final T input, final Access access, final long off, final long length, final long[] result) { + private static long xxh3128BitsInternal(final long seed, final byte[] secret, final T input, final Access access, final long off, final long length, final long[] result) { if (length <= 16) { // XXH3_len_0to16_128b if (length > 8) { // XXH3_len_9to16_128b final long bitflipl = (unsafeLE.i64(XXH3_kSecret, 32+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 40+BYTE_BASE)) - seed; final long bitfliph = (unsafeLE.i64(XXH3_kSecret, 48+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 56+BYTE_BASE)) + seed; - long input_hi = access.i64(input, off + length - 8); - final long input_lo = access.i64(input, off) ^ input_hi ^ bitflipl; - long m128_lo = input_lo * XXH_PRIME64_1; - long m128_hi = Maths.unsignedLongMulHigh(input_lo, XXH_PRIME64_1); - m128_lo += (length - 1) << 54; - input_hi ^= bitfliph; - m128_hi += input_hi + Primitives.unsignedInt((int)input_hi) * (XXH_PRIME32_2 - 1); - m128_lo ^= Long.reverseBytes(m128_hi); - - final long low = XXH3_avalanche(m128_lo * 
XXH_PRIME64_2); + long inputHi = access.i64(input, off + length - 8); + final long inputLo = access.i64(input, off) ^ inputHi ^ bitflipl; + long m128Lo = inputLo * XXH_PRIME64_1; + long m128Hi = Maths.unsignedLongMulHigh(inputLo, XXH_PRIME64_1); + m128Lo += (length - 1) << 54; + inputHi ^= bitfliph; + m128Hi += inputHi + Primitives.unsignedInt((int)inputHi) * (XXH_PRIME32_2 - 1); + m128Lo ^= Long.reverseBytes(m128Hi); + + final long low = xxh3Avalanche(m128Lo * XXH_PRIME64_2); if (null != result) { result[0] = low; - result[1] = XXH3_avalanche(Maths.unsignedLongMulHigh(m128_lo, XXH_PRIME64_2) + m128_hi * XXH_PRIME64_2); + result[1] = xxh3Avalanche(Maths.unsignedLongMulHigh(m128Lo, XXH_PRIME64_2) + m128Hi * XXH_PRIME64_2); } return low; } if (length >= 4) { // XXH3_len_4to8_128b long s = seed ^ Long.reverseBytes(seed & 0xFFFFFFFFL); - final long input_lo = access.u32(input, off); - final long input_hi = (long)access.i32(input, off + length - 4); // high int will be shifted + final long inputLo = access.u32(input, off); + final long inputHi = (long)access.i32(input, off + length - 4); // high int will be shifted final long bitflip = (unsafeLE.i64(XXH3_kSecret, 16+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 24+BYTE_BASE)) + s; - final long keyed = (input_lo + (input_hi << 32)) ^ bitflip; + final long keyed = (inputLo + (inputHi << 32)) ^ bitflip; final long pl = XXH_PRIME64_1 + (length << 2); /* Shift len to the left to ensure it is even, this avoids even multiplies. 
*/ - long m128_lo = keyed * pl; - long m128_hi = Maths.unsignedLongMulHigh(keyed, pl); - m128_hi += (m128_lo << 1); - m128_lo ^= (m128_hi >>> 3); + long m128Lo = keyed * pl; + long m128Hi = Maths.unsignedLongMulHigh(keyed, pl); + m128Hi += (m128Lo << 1); + m128Lo ^= (m128Hi >>> 3); - m128_lo ^= m128_lo >>> 35; - m128_lo *= 0x9FB21C651E98DF25L; - m128_lo ^= m128_lo >>> 28; + m128Lo ^= m128Lo >>> 35; + m128Lo *= 0x9FB21C651E98DF25L; + m128Lo ^= m128Lo >>> 28; if (null != result) { - result[0] = m128_lo; - result[1] = XXH3_avalanche(m128_hi); + result[0] = m128Lo; + result[1] = xxh3Avalanche(m128Hi); } - return m128_lo; + return m128Lo; } if (length != 0) { // XXH3_len_1to3_128b @@ -386,17 +386,17 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long bitflipl = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+4)) + seed; final long bitfliph = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE+8) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+12)) - seed; - final long low = XXH64_avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); + final long low = xxh64Avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); if (null != result) { result[0] = low; - result[1] = XXH64_avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); + result[1] = xxh64Avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); } return low; } - final long low = XXH64_avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+64) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+72)); + final long low = xxh64Avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+64) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+72)); if (null != result) { result[0] = low; - result[1] = XXH64_avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+80) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+88)); + result[1] = xxh64Avalanche(seed ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+80) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+88)); } return low; } @@ -411,34 +411,34 
@@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long input1 = access.i64(input, off + 48 + 8); final long input2 = access.i64(input, off + length - 64); final long input3 = access.i64(input, off + length - 64 + 8); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE + 96, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 96 + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE + 96, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 96 + 16, acc1, input2, input3, input0, input1); } final long input0 = access.i64(input, off + 32); final long input1 = access.i64(input, off + 32 + 8); final long input2 = access.i64(input, off + length - 48); final long input3 = access.i64(input, off + length - 48 + 8); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE + 64, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 64 + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE + 64, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 64 + 16, acc1, input2, input3, input0, input1); } final long input0 = access.i64(input, off + 16); final long input1 = access.i64(input, off + 16 + 8); final long input2 = access.i64(input, off + length - 32); final long input3 = access.i64(input, off + length - 32 + 8); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE + 32, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 32 + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE + 32, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 32 + 16, acc1, input2, input3, input0, input1); } final long input0 = access.i64(input, off + 0); final long input1 = access.i64(input, off + 0 + 8); final long input2 = access.i64(input, off + length - 16); final long input3 = access.i64(input, off + length - 
16 + 8); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 16, acc1, input2, input3, input0, input1); - final long low = XXH3_avalanche(acc0 + acc1); + final long low = xxh3Avalanche(acc0 + acc1); if (null != result) { result[0] = low; - result[1] = -XXH3_avalanche(acc0*XXH_PRIME64_1 + acc1*XXH_PRIME64_4 + (length - seed)*XXH_PRIME64_2); + result[1] = -xxh3Avalanche(acc0*XXH_PRIME64_1 + acc1*XXH_PRIME64_4 + (length - seed)*XXH_PRIME64_2); } return low; } @@ -454,19 +454,19 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long input1 = access.i64(input, off + 32*i + 8); final long input2 = access.i64(input, off + 32*i + 16); final long input3 = access.i64(input, off + 32*i + 24); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE + 32*i, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 32*i + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE + 32*i, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 32*i + 16, acc1, input2, input3, input0, input1); } - acc0 = XXH3_avalanche(acc0); - acc1 = XXH3_avalanche(acc1); + acc0 = xxh3Avalanche(acc0); + acc1 = xxh3Avalanche(acc1); for (; i < nbRounds; ++i) { final long input0 = access.i64(input, off + 32*i); final long input1 = access.i64(input, off + 32*i + 8); final long input2 = access.i64(input, off + 32*i + 16); final long input3 = access.i64(input, off + 32*i + 24); - acc0 = XXH128_mix32B_once(seed, BYTE_BASE + 3 + 32*(i-4), acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(seed, BYTE_BASE + 3 + 32*(i-4) + 16, acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(seed, BYTE_BASE + 3 + 32*(i-4), acc0, input0, input1, 
input2, input3); + acc1 = xxh128Mix32BOnce(seed, BYTE_BASE + 3 + 32*(i-4) + 16, acc1, input2, input3, input0, input1); } /* last bytes */ @@ -474,33 +474,33 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long input1 = access.i64(input, off + length - 16 + 8); final long input2 = access.i64(input, off + length - 32); final long input3 = access.i64(input, off + length - 32 + 8); - acc0 = XXH128_mix32B_once(-seed, BYTE_BASE + 136 - 17 - 16, acc0, input0, input1, input2, input3); - acc1 = XXH128_mix32B_once(-seed, BYTE_BASE + 136 - 17 , acc1, input2, input3, input0, input1); + acc0 = xxh128Mix32BOnce(-seed, BYTE_BASE + 136 - 17 - 16, acc0, input0, input1, input2, input3); + acc1 = xxh128Mix32BOnce(-seed, BYTE_BASE + 136 - 17 , acc1, input2, input3, input0, input1); - final long low = XXH3_avalanche(acc0 + acc1); + final long low = xxh3Avalanche(acc0 + acc1); if (null != result) { result[0] = low; - result[1] = -XXH3_avalanche(acc0*XXH_PRIME64_1 + acc1*XXH_PRIME64_4 + (length - seed)*XXH_PRIME64_2); + result[1] = -xxh3Avalanche(acc0*XXH_PRIME64_1 + acc1*XXH_PRIME64_4 + (length - seed)*XXH_PRIME64_2); } return low; } // XXH3_hashLong_128b_internal - long acc_0 = XXH_PRIME32_3; - long acc_1 = XXH_PRIME64_1; - long acc_2 = XXH_PRIME64_2; - long acc_3 = XXH_PRIME64_3; - long acc_4 = XXH_PRIME64_4; - long acc_5 = XXH_PRIME32_2; - long acc_6 = XXH_PRIME64_5; - long acc_7 = XXH_PRIME32_1; + long acc0 = XXH_PRIME32_3; + long acc1 = XXH_PRIME64_1; + long acc2 = XXH_PRIME64_2; + long acc3 = XXH_PRIME64_3; + long acc4 = XXH_PRIME64_4; + long acc5 = XXH_PRIME32_2; + long acc6 = XXH_PRIME64_5; + long acc7 = XXH_PRIME32_1; // XXH3_hashLong_internal_loop final long nb_blocks = (length - 1) / block_len; for (long n = 0; n < nb_blocks; n++) { // XXH3_accumulate final long offBlock = off + n * block_len; - for (long s = 0; s < nbStripesPerBlock; s++ ) { + for (long s = 0; s < nbStripesPerBlock; s++) { // XXH3_accumulate_512 final long offStripe = 
offBlock + s * 64; final long offSec = s * 8; @@ -510,8 +510,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -519,8 +519,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -528,8 +528,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*6); 
@@ -537,21 +537,21 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } } // XXH3_scrambleAcc_scalar final long offSec = BYTE_BASE + 192 - 64; - acc_0 = (acc_0 ^ (acc_0 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*0)) * XXH_PRIME32_1; - acc_1 = (acc_1 ^ (acc_1 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*1)) * XXH_PRIME32_1; - acc_2 = (acc_2 ^ (acc_2 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*2)) * XXH_PRIME32_1; - acc_3 = (acc_3 ^ (acc_3 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*3)) * XXH_PRIME32_1; - acc_4 = (acc_4 ^ (acc_4 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*4)) * XXH_PRIME32_1; - acc_5 = (acc_5 ^ (acc_5 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*5)) * XXH_PRIME32_1; - acc_6 = (acc_6 ^ (acc_6 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*6)) * XXH_PRIME32_1; - acc_7 = (acc_7 ^ (acc_7 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*7)) * XXH_PRIME32_1; + acc0 = (acc0 ^ (acc0 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*0)) * XXH_PRIME32_1; + acc1 = (acc1 ^ (acc1 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*1)) * XXH_PRIME32_1; + acc2 = (acc2 ^ (acc2 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*2)) * XXH_PRIME32_1; + acc3 = (acc3 ^ (acc3 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*3)) * XXH_PRIME32_1; + acc4 = (acc4 ^ (acc4 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*4)) * XXH_PRIME32_1; + acc5 = (acc5 ^ (acc5 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*5)) * XXH_PRIME32_1; + acc6 = (acc6 ^ (acc6 >>> 47) ^ unsafeLE.i64(secret, offSec + 8*6)) * XXH_PRIME32_1; + acc7 = (acc7 ^ (acc7 
>>> 47) ^ unsafeLE.i64(secret, offSec + 8*7)) * XXH_PRIME32_1; } /* last partial block */ @@ -567,8 +567,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -576,8 +576,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -585,8 +585,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long 
data_val_0 = access.i64(input, offStripe + 8*6); @@ -594,8 +594,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } } @@ -609,8 +609,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*0); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*1); /* swap adjacent lanes */ - acc_0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc0 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc1 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*2); @@ -618,8 +618,8 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*2); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*3); /* swap adjacent lanes */ - acc_2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc2 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc3 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*4); @@ -627,8 +627,8 @@ private static long 
XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*4); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*5); /* swap adjacent lanes */ - acc_4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc4 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc5 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } { final long data_val_0 = access.i64(input, offStripe + 8*6); @@ -636,28 +636,28 @@ private static long XXH3_128bits_internal(final long seed, final byte[] secr final long data_key_0 = data_val_0 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*6); final long data_key_1 = data_val_1 ^ unsafeLE.i64(secret, BYTE_BASE + offSec + 8*7); /* swap adjacent lanes */ - acc_6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); - acc_7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); + acc6 += data_val_1 + (0xFFFFFFFFL & data_key_0) * (data_key_0 >>> 32); + acc7 += data_val_0 + (0xFFFFFFFFL & data_key_1) * (data_key_1 >>> 32); } // XXH3_mergeAccs - final long low = XXH3_avalanche(length * XXH_PRIME64_1 - + XXH3_mix2Accs(acc_0, acc_1, secret, BYTE_BASE + 11) - + XXH3_mix2Accs(acc_2, acc_3, secret, BYTE_BASE + 11 + 16) - + XXH3_mix2Accs(acc_4, acc_5, secret, BYTE_BASE + 11 + 16 * 2) - + XXH3_mix2Accs(acc_6, acc_7, secret, BYTE_BASE + 11 + 16 * 3)); + final long low = xxh3Avalanche(length * XXH_PRIME64_1 + + xxh3Mix2Accs(acc0, acc1, secret, BYTE_BASE + 11) + + xxh3Mix2Accs(acc2, acc3, secret, BYTE_BASE + 11 + 16) + + xxh3Mix2Accs(acc4, acc5, secret, BYTE_BASE + 11 + 16 * 2) + + xxh3Mix2Accs(acc6, acc7, secret, BYTE_BASE + 11 + 16 * 3)); if (null != result) { result[0] = low; - result[1] = XXH3_avalanche(~(length * XXH_PRIME64_2) - + XXH3_mix2Accs(acc_0, acc_1, secret, BYTE_BASE + 192 - 64 - 11) - + 
XXH3_mix2Accs(acc_2, acc_3, secret, BYTE_BASE + 192 - 64 - 11 + 16) - + XXH3_mix2Accs(acc_4, acc_5, secret, BYTE_BASE + 192 - 64 - 11 + 16 * 2) - + XXH3_mix2Accs(acc_6, acc_7, secret, BYTE_BASE + 192 - 64 - 11 + 16 * 3)); + result[1] = xxh3Avalanche(~(length * XXH_PRIME64_2) + + xxh3Mix2Accs(acc0, acc1, secret, BYTE_BASE + 192 - 64 - 11) + + xxh3Mix2Accs(acc2, acc3, secret, BYTE_BASE + 192 - 64 - 11 + 16) + + xxh3Mix2Accs(acc4, acc5, secret, BYTE_BASE + 192 - 64 - 11 + 16 * 2) + + xxh3Mix2Accs(acc6, acc7, secret, BYTE_BASE + 192 - 64 - 11 + 16 * 3)); } return low; } - private static void XXH3_initCustomSecret(final byte[] customSecret, final long seed64) { + private static void xxh3InitCustomSecret(final byte[] customSecret, final long seed64) { final int nbRounds = 192 / 16; final ByteBuffer bb = ByteBuffer.wrap(customSecret).order(LITTLE_ENDIAN); for (int i=0; i < nbRounds; i++) { @@ -686,7 +686,7 @@ public long hashLong(long input) { final long s = seed() ^ Long.reverseBytes(seed() & 0xFFFFFFFFL); final long bitflip = (unsafeLE.i64(XXH3.XXH3_kSecret, 8+BYTE_BASE) ^ unsafeLE.i64(XXH3.XXH3_kSecret, 16+BYTE_BASE)) - s; final long keyed = Long.rotateLeft(input, 32) ^ bitflip; - return XXH3_rrmxmx(keyed, 8); + return xxh3Rrmxmx(keyed, 8); } @Override @@ -695,7 +695,7 @@ public long hashInt(int input) { long s = seed() ^ Long.reverseBytes(seed() & 0xFFFFFFFFL); final long bitflip = (unsafeLE.i64(XXH3.XXH3_kSecret, 8+BYTE_BASE) ^ unsafeLE.i64(XXH3.XXH3_kSecret, 16+BYTE_BASE)) - s; final long keyed = (Primitives.unsignedInt(input) + (((long)input) << 32)) ^ bitflip; - return XXH3_rrmxmx(keyed, 4); + return xxh3Rrmxmx(keyed, 4); } @Override @@ -706,7 +706,7 @@ public long hashShort(short input) { final int c3 = c2; final long combined = Primitives.unsignedInt((c1 << 16) | (c2 << 24) | c3 | (2 << 8)); final long bitflip = (unsafeLE.u32(XXH3.XXH3_kSecret, BYTE_BASE) ^ unsafeLE.u32(XXH3.XXH3_kSecret, 4+BYTE_BASE)) + seed(); - return XXH64_avalanche(combined ^ bitflip); + 
return xxh64Avalanche(combined ^ bitflip); } @Override @@ -721,17 +721,17 @@ public long hashByte(byte input) { final int c3 = c1; final long combined = Primitives.unsignedInt((c1 << 16) | (c2 << 24) | c3 | (1 << 8)); final long bitflip = (unsafeLE.u32(XXH3.XXH3_kSecret, BYTE_BASE) ^ unsafeLE.u32(XXH3.XXH3_kSecret, 4+BYTE_BASE)) + seed(); - return XXH64_avalanche(combined ^ bitflip); + return xxh64Avalanche(combined ^ bitflip); } @Override public long hashVoid() { - return XXH64_avalanche(seed() ^ unsafeLE.i64(XXH3.XXH3_kSecret, 56+BYTE_BASE) ^ unsafeLE.i64(XXH3.XXH3_kSecret, 64+BYTE_BASE)); + return xxh64Avalanche(seed() ^ unsafeLE.i64(XXH3.XXH3_kSecret, 56+BYTE_BASE) ^ unsafeLE.i64(XXH3.XXH3_kSecret, 64+BYTE_BASE)); } @Override public long hash(final T input, final Access access, final long off, final long len) { - return XXH3.XXH3_64bits_internal(0, XXH3.XXH3_kSecret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len); + return XXH3.xxh364BitsInternal(0, XXH3.XXH3_kSecret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len); } } @@ -747,7 +747,7 @@ private static class AsLongHashFunctionSeeded extends AsLongHashFunction { private AsLongHashFunctionSeeded(final long seed) { this.seed = seed; - XXH3_initCustomSecret(this.secret, seed); + xxh3InitCustomSecret(this.secret, seed); } @Override @@ -757,7 +757,7 @@ public long seed() { @Override public long hash(final T input, final Access access, final long off, final long len) { - return XXH3.XXH3_64bits_internal(this.seed, this.secret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len); + return XXH3.xxh364BitsInternal(this.seed, this.secret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len); } } @@ -793,20 +793,20 @@ public long dualHashLong(long input, final long[] result) { final long bitflip = (unsafeLE.i64(XXH3_kSecret, 16+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 24+BYTE_BASE)) + s; final long keyed = input ^ bitflip; final long pl = XXH_PRIME64_1 + (8 << 2); /* Shift len to the left to 
ensure it is even, this avoids even multiplies. */ - long m128_lo = keyed * pl; - long m128_hi = Maths.unsignedLongMulHigh(keyed, pl); - m128_hi += (m128_lo << 1); - m128_lo ^= (m128_hi >>> 3); + long m128Lo = keyed * pl; + long m128Hi = Maths.unsignedLongMulHigh(keyed, pl); + m128Hi += (m128Lo << 1); + m128Lo ^= (m128Hi >>> 3); - m128_lo ^= m128_lo >>> 35; - m128_lo *= 0x9FB21C651E98DF25L; - m128_lo ^= m128_lo >>> 28; + m128Lo ^= m128Lo >>> 35; + m128Lo *= 0x9FB21C651E98DF25L; + m128Lo ^= m128Lo >>> 28; if (null != result) { - result[0] = m128_lo; - result[1] = XXH3_avalanche(m128_hi); + result[0] = m128Lo; + result[1] = xxh3Avalanche(m128Hi); } - return m128_lo; + return m128Lo; } @Override @@ -816,21 +816,21 @@ public long dualHashInt(final int input, final long[] result) { final long bitflip = (unsafeLE.i64(XXH3_kSecret, 16+BYTE_BASE) ^ unsafeLE.i64(XXH3_kSecret, 24+BYTE_BASE)) + s; final long keyed = (inputU + (inputU << 32)) ^ bitflip; final long pl = XXH_PRIME64_1 + (4 << 2); /* Shift len to the left to ensure it is even, this avoids even multiplies. 
*/ - long m128_lo = keyed * pl; - long m128_hi = Maths.unsignedLongMulHigh(keyed, pl); + long m128Lo = keyed * pl; + long m128Hi = Maths.unsignedLongMulHigh(keyed, pl); - m128_hi += (m128_lo << 1); - m128_lo ^= (m128_hi >>> 3); + m128Hi += (m128Lo << 1); + m128Lo ^= (m128Hi >>> 3); - m128_lo ^= m128_lo >>> 35; - m128_lo *= 0x9FB21C651E98DF25L; - m128_lo ^= m128_lo >>> 28; + m128Lo ^= m128Lo >>> 35; + m128Lo *= 0x9FB21C651E98DF25L; + m128Lo ^= m128Lo >>> 28; if (null != result) { - result[0] = m128_lo; - result[1] = XXH3_avalanche(m128_hi); + result[0] = m128Lo; + result[1] = xxh3Avalanche(m128Hi); } - return m128_lo; + return m128Lo; } @Override @@ -844,10 +844,10 @@ public long dualHashShort(short input, final long[] result) { final long bitflipl = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+4)) + seed(); final long bitfliph = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE+8) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+12)) - seed(); - final long low = XXH64_avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); + final long low = xxh64Avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); if (null != result) { result[0] = low; - result[1] = XXH64_avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); + result[1] = xxh64Avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); } return low; } @@ -868,27 +868,27 @@ public long dualHashByte(byte input, final long[] result) { final long bitflipl = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+4)) + seed(); final long bitfliph = Primitives.unsignedInt(unsafeLE.i32(XXH3_kSecret, BYTE_BASE+8) ^ unsafeLE.i32(XXH3_kSecret, BYTE_BASE+12)) - seed(); - final long low = XXH64_avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); + final long low = xxh64Avalanche(Primitives.unsignedInt(combinedl) ^ bitflipl); if (null != result) { result[0] = low; - result[1] = 
XXH64_avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); + result[1] = xxh64Avalanche(Primitives.unsignedInt(combinedh) ^ bitfliph); } return low; } @Override public long dualHashVoid(final long[] result) { - final long low = XXH64_avalanche(seed() ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+64) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+72)); + final long low = xxh64Avalanche(seed() ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+64) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+72)); if (null != result) { result[0] = low; - result[1] = XXH64_avalanche(seed() ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+80) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+88)); + result[1] = xxh64Avalanche(seed() ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+80) ^ unsafeLE.i64(XXH3_kSecret, BYTE_BASE+88)); } return low; } @Override public long dualHash(final T input, final Access access, final long off, final long len, final long[] result) { - return XXH3.XXH3_128bits_internal(0, XXH3.XXH3_kSecret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); + return XXH3.xxh3128BitsInternal(0, XXH3.XXH3_kSecret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); } } @@ -907,7 +907,7 @@ private static class AsLongTupleHashFunctionSeeded extends AsLongTupleHashFuncti private AsLongTupleHashFunctionSeeded(final long seed) { this.seed = seed; - XXH3_initCustomSecret(this.secret, seed); + xxh3InitCustomSecret(this.secret, seed); } @Override @@ -917,7 +917,7 @@ public long seed() { @Override public long dualHash(final T input, final Access access, final long off, final long len, final long[] result) { - return XXH3.XXH3_128bits_internal(seed, secret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); + return XXH3.xxh3128BitsInternal(seed, secret, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); } } } From a8490b1fbb3b97a0c7e5c4ad70795cd52c090095 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 14 Nov 2025 12:28:15 +0000 Subject: [PATCH 07/18] Add MurmurHash_3 
implementation for efficient hashing
---
 .../net/openhft/hashing/MurmurHash_3.java | 331 ++++++++++++++++++
 1 file changed, 331 insertions(+)
 create mode 100644 src/main/java/net/openhft/hashing/MurmurHash_3.java

diff --git a/src/main/java/net/openhft/hashing/MurmurHash_3.java b/src/main/java/net/openhft/hashing/MurmurHash_3.java
new file mode 100644
index 0000000..b428f5e
--- /dev/null
+++ b/src/main/java/net/openhft/hashing/MurmurHash_3.java
@@ -0,0 +1,331 @@
+/*
+ * Copyright 2013-2025 chronicle.software; SPDX-License-Identifier: Apache-2.0
+ */
+package net.openhft.hashing;
+
+import org.jetbrains.annotations.NotNull;
+import org.jetbrains.annotations.Nullable;
+
+import javax.annotation.ParametersAreNonnullByDefault;
+
+import static java.nio.ByteOrder.LITTLE_ENDIAN;
+import static net.openhft.hashing.Primitives.unsignedInt;
+import static net.openhft.hashing.Primitives.unsignedShort;
+
+/**
+ * Derived from https://github.com/google/guava/blob/fa95e381e665d8ee9639543b99ed38020c8de5ef
+ * /guava/src/com/google/common/hash/Murmur3_128HashFunction.java
+ */
+@ParametersAreNonnullByDefault
+class MurmurHash_3 {
+    private static final long C1 = 0x87c37b91114253d5L;
+    private static final long C2 = 0x4cf5ad432745937fL;
+
+    private static long hash(long seed, @Nullable T input, Access access, long offset, long length, @Nullable long[] result) {
+        long h1 = seed;
+        long h2 = seed;
+        long remaining = length;
+        while (remaining >= 16L) {
+            long k1 = access.i64(input, offset);
+            h1 ^= mixK1(k1);
+
+            h1 = Long.rotateLeft(h1, 27);
+            h1 += h2;
+            h1 = h1 * 5L + 0x52dce729L;
+
+            long k2 = access.i64(input, offset + 8L);
+            offset += 16L;
+            remaining -= 16L;
+
+            h2 ^= mixK2(k2);
+
+            h2 = Long.rotateLeft(h2, 31);
+            h2 += h1;
+            h2 = h2 * 5L + 0x38495ab5L;
+        }
+
+        if (remaining > 0L) {
+            long k1 = 0L;
+            long k2 = 0L;
+            switch ((int) remaining) {
+                case 15:
+                    k2 ^= ((long) access.u8(input, offset + 14L)) << 48;// fall through
+                case 14:
+                    k2 ^= ((long) access.u8(input, offset + 13L)) << 40;// fall through
+                case 13:
+                    k2 ^= ((long) access.u8(input, offset + 12L)) << 32;// fall through
+                case 12:
+                    k2 ^= ((long) access.u8(input, offset + 11L)) << 24;// fall through
+                case 11:
+                    k2 ^= ((long) access.u8(input, offset + 10L)) << 16;// fall through
+                case 10:
+                    k2 ^= ((long) access.u8(input, offset + 9L)) << 8; // fall through
+                case 9:
+                    k2 ^= ((long) access.u8(input, offset + 8L)); // fall through
+                case 8:
+                    k1 ^= access.i64(input, offset);
+                    break;
+                case 7:
+                    k1 ^= ((long) access.u8(input, offset + 6L)) << 48; // fall through
+                case 6:
+                    k1 ^= ((long) access.u8(input, offset + 5L)) << 40; // fall through
+                case 5:
+                    k1 ^= ((long) access.u8(input, offset + 4L)) << 32; // fall through
+                case 4:
+                    k1 ^= access.u32(input, offset);
+                    break;
+                case 3:
+                    k1 ^= ((long) access.u8(input, offset + 2L)) << 16; // fall through
+                case 2:
+                    k1 ^= ((long) access.u8(input, offset + 1L)) << 8; // fall through
+                case 1:
+                    k1 ^= ((long) access.u8(input, offset));
+                case 0:
+                    break;
+                default:
+                    throw new AssertionError("Should never get here.");
+            }
+            h1 ^= mixK1(k1);
+            h2 ^= mixK2(k2);
+        }
+
+        // This version appears to be working slower
+
+        // if (remaining > 0L) {
+        //     long k1 = 0L;
+        //     long k2 = 0L;
+        //     megaSwitch:
+        //     {
+        //         fetch0_7:
+        //         {
+        //             fetch8_11:
+        //             {
+        //                 fetch0_3:
+        //                 {
+        //                     switch ((int) remaining) {
+        //                         case 15:
+        //                             k2 ^= ((long) access.u8(input, offset + 14L)) << 48;
+        //                         case 14:
+        //                             k2 ^= ((long) Primitives.nativeToLittleEndian(
+        //                                     access.u16(input, offset + 12L))) << 32;
+        //                             break fetch8_11;
+        //                         case 13:
+        //                             k2 ^= ((long) access.u8(input, offset + 12L)) << 32;
+        //                         case 12:
+        //                             break fetch8_11;
+        //                         case 11:
+        //                             k2 ^= ((long) access.u8(input, offset + 10L)) << 16;
+        //                         case 10:
+        //                             k2 ^= (long) Primitives.nativeToLittleEndian(
+        //                                     access.u16(input, offset + 8L));
+        //                             break fetch0_7;
+        //                         case 9:
+        //                             k2 ^= ((long) access.u8(input, offset + 8L));
+        //                         case 8:
+        //                             break fetch0_7;
+        //                         case 7:
+        //                             k1 ^= ((long) access.u8(input, offset + 6L)) << 48;
+        //                         case 6:
+        //                             k1 ^= ((long) Primitives.nativeToLittleEndian(
+        //                                     access.u16(input, offset + 4L))) << 32;
+        //                             break fetch0_3;
+        //                         case 5:
+        //                             k1 ^= ((long) access.u8(input, offset + 4L)) << 32;
+        //                         case 4:
+        //                             break fetch0_3;
+        //                         case 3:
+        //                             k1 ^= ((long) access.u8(input, offset + 2L)) << 16;
+        //                         case 2:
+        //                             k1 ^= (long) Primitives.nativeToLittleEndian(
+        //                                     access.u16(input, offset));
+        //                             break megaSwitch;
+        //                         case 1:
+        //                             k1 ^= ((long) access.u8(input, offset));
+        //                             break megaSwitch;
+        //                         default:
+        //                             throw new AssertionError();
+        //                     }
+        //                 } // fetch0_3
+        //                 k1 ^= access.u32(input, offset);
+        //                 break megaSwitch;
+        //             } // fetch8_11
+        //             k2 ^= access.u32(input, offset + 8L);
+        //         } // fetch0_7
+        //         k1 ^= access.i64(input, offset);
+        //     } // megaSwitch
+        //
+        //     h1 ^= mixK1(k1);
+        //     h2 ^= mixK2(k2);
+        // }
+
+        return finalize(length, h1, h2, result);
+    }
+
+    private static long finalize(long length, long h1, long h2, @Nullable long[] result) {
+        h1 ^= length;
+        h2 ^= length;
+
+        h1 += h2;
+        h2 += h1;
+
+        h1 = fmix64(h1);
+        h2 = fmix64(h2);
+
+        if (null != result) {
+            h1 += h2;
+            result[0] = h1;
+            result[1] = h1 + h2;
+            return h1;
+        } else {
+            return h1 + h2;
+        }
+    }
+
+    private static long fmix64(long k) {
+        k ^= k >>> 33;
+        k *= 0xff51afd7ed558ccdL;
+        k ^= k >>> 33;
+        k *= 0xc4ceb9fe1a85ec53L;
+        k ^= k >>> 33;
+        return k;
+    }
+
+    private static long mixK1(long k1) {
+        k1 *= C1;
+        k1 = Long.rotateLeft(k1, 31);
+        k1 *= C2;
+        return k1;
+    }
+
+    private static long mixK2(long k2) {
+        k2 *= C2;
+        k2 = Long.rotateLeft(k2, 33);
+        k2 *= C1;
+        return k2;
+    }
+
+    private static class AsLongTupleHashFunction extends DualHashFunction {
+        private static final long serialVersionUID = 0L;
+        @NotNull
+        private static final AsLongTupleHashFunction SEEDLESS_INSTANCE = new AsLongTupleHashFunction();
+        @NotNull
+        private static final LongHashFunction SEEDLESS_INSTANCE_LONG = SEEDLESS_INSTANCE.asLongHashFunction();
+
+        private 
Object readResolve() { + return SEEDLESS_INSTANCE; + } + + @Override + public int bitsLength() { + return 128; + } + @Override + @NotNull + public long[] newResultArray() { + return new long[2]; // override for a little performance + } + + long seed() { + return 0L; + } + + protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result) { + long h1 = mixK1(nativeLong); + long h2 = 0L; + return MurmurHash_3.finalize(len, h1, h2, result); + } + + @Override + public long dualHashLong(long input, @Nullable long[] result) { + return hashNativeLong(Primitives.nativeToLittleEndian(input), 8L, result); + } + + @Override + public long dualHashInt(int input, @Nullable long[] result) { + return hashNativeLong(unsignedInt(Primitives.nativeToLittleEndian(input)), 4L, result); + } + + @Override + public long dualHashShort(short input, @Nullable long[] result) { + return hashNativeLong(unsignedShort(Primitives.nativeToLittleEndian(input)), 2L, result); + } + + @Override + public long dualHashChar(char input, @Nullable long[] result) { + return hashNativeLong(unsignedShort(Primitives.nativeToLittleEndian(input)), 2L, result); + } + + @Override + public long dualHashByte(byte input, @Nullable long[] result) { + return hashNativeLong(Primitives.unsignedByte((int) input), 1L, result); + } + + @Override + public long dualHashVoid(@Nullable long[] result) { + if (null != result) { + result[0] = 0; + result[1] = 0; + } + return 0; + } + + @Override + public <T> long dualHash(@Nullable T input, Access<T> access, long off, long len, @Nullable long[] result) { + long seed = seed(); + return MurmurHash_3.hash(seed, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); + } + } + + @NotNull + static LongTupleHashFunction asLongTupleHashFunctionWithoutSeed() { + return AsLongTupleHashFunction.SEEDLESS_INSTANCE; + } + @NotNull + static LongHashFunction asLongHashFunctionWithoutSeed() { + return AsLongTupleHashFunction.SEEDLESS_INSTANCE_LONG; + } + + private static class 
AsLongTupleHashFunctionSeeded extends AsLongTupleHashFunction { + private static final long serialVersionUID = 0L; + + private final long seed; + @NotNull + private final transient long[] voidHash = newResultArray(); + + private AsLongTupleHashFunctionSeeded(long seed) { + this.seed = seed; + MurmurHash_3.finalize(0L, seed, seed, voidHash); + } + + @Override + long seed() { + return seed; + } + + @Override + protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result) { + long seed = this.seed; + long h1 = seed ^ mixK1(nativeLong); + long h2 = seed; + return MurmurHash_3.finalize(len, h1, h2, result); + } + + @Override + public long dualHashVoid(@Nullable long[] result) { + if (null != result) { + result[0] = voidHash[0]; + result[1] = voidHash[1]; + } + return voidHash[0]; + } + } + + @NotNull + static LongTupleHashFunction asLongTupleHashFunctionWithSeed(long seed) { + return new AsLongTupleHashFunctionSeeded(seed); + } + @NotNull + static LongHashFunction asLongHashFunctionWithSeed(long seed) { + return new AsLongTupleHashFunctionSeeded(seed).asLongHashFunction(); + } +} From e6031d8e78c5d2c58dc4af436686b752d5aa6645 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 14 Nov 2025 12:34:43 +0000 Subject: [PATCH 08/18] Rename MurmurHash3 references to MurmurHash_3 for consistency --- src/main/docs/algorithm-profiles.adoc | 2 +- src/main/docs/architecture-overview.adoc | 2 +- src/main/docs/invariants-and-contracts.adoc | 2 +- .../net/openhft/hashing/LongHashFunction.java | 4 +- .../hashing/LongTupleHashFunction.java | 4 +- .../java/net/openhft/hashing/MurmurHash3.java | 331 ------------------ 6 files changed, 7 insertions(+), 338 deletions(-) delete mode 100644 src/main/java/net/openhft/hashing/MurmurHash3.java diff --git a/src/main/docs/algorithm-profiles.adoc b/src/main/docs/algorithm-profiles.adoc index a49393d..bbe8d29 100644 --- a/src/main/docs/algorithm-profiles.adoc +++ b/src/main/docs/algorithm-profiles.adoc @@ -42,7 +42,7 @@ Key 
traits :: Factories :: `LongHashFunction.murmur_3()`, `.murmur_3(long)` for 64-bit (`LongHashFunction.java:245-268`); `LongTupleHashFunction.murmur_3()`, `.murmur_3(long)` for 128-bit (`LongTupleHashFunction.java:35-69`). Implementation :: -`net.openhft.hashing.MurmurHash3` adapts Austin Appleby's x64 variants. +`net.openhft.hashing.MurmurHash_3` adapts Austin Appleby's x64 variants. It extends `DualHashFunction` so the 128-bit engine also exposes the low 64 bits through `LongHashFunction`. Key traits :: * Little-endian canonicalisation via `Access.byteOrder`. diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc index a7e7fc7..b5e7e7a 100644 --- a/src/main/docs/architecture-overview.adoc +++ b/src/main/docs/architecture-overview.adoc @@ -24,7 +24,7 @@ It currently delivers 128-bit MurmurHash_3 and XXH3 outputs and mirrors the sing * Each upstream hash family lives in its own package-private class and exposes seed-aware factories back to the public façade. ** `CityAndFarmHash_1_1` adapts CityHash64 1.1 plus FarmHash NA/UO variants, including the short-input specialisations from the original C{pp} sources. -** `MurmurHash3` contains both 64-bit and 128-bit variants, reusing `DualHashFunction` to provide `LongHashFunction` and `LongTupleHashFunction` accessors. +** `MurmurHash_3` contains both 64-bit and 128-bit variants, reusing `DualHashFunction` to provide `LongHashFunction` and `LongTupleHashFunction` accessors. ** `XxHash` implements XXH64 with the upstream prime constants and treats all inputs as little-endian via `Access.byteOrder` (`XxHash.java`). ** `XXH3` delivers XXH3 64-bit and 128-bit functions, including the FARSH-derived secret and block-stripe accumulation strategy (`XXH3.java`). ** `WyHash` ports wyHash v3, including the 256-byte streaming loop and `_wymum` mixing helper built on `Maths.unsignedLongMulXorFold` (`WyHash.java`). 
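The profile text above says the 128-bit MurmurHash_3 engine also serves the 64-bit `LongHashFunction` view. A minimal, standalone sketch of the finalisation step from the patch may make that concrete. The class name `Murmur3FinalizeSketch` and the method name `finalizeHash` are hypothetical and chosen for illustration; the arithmetic follows `fmix64` and `finalize` as they appear in `MurmurHash_3.java`, restructured slightly so that both branches share one return path.

```java
/** Illustrative sketch only; not part of the library. */
final class Murmur3FinalizeSketch {

    /** 64-bit avalanche mix, with the same constants as fmix64 in the patch. */
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    /**
     * Finalises the two lanes. If result is non-null it receives both 64-bit
     * halves; in either case the returned value equals the first half.
     */
    static long finalizeHash(long length, long h1, long h2, long[] result) {
        h1 ^= length;
        h2 ^= length;
        h1 += h2;
        h2 += h1;
        h1 = fmix64(h1);
        h2 = fmix64(h2);
        long first = h1 + h2;          // value returned on both paths
        if (result != null) {
            result[0] = first;
            result[1] = first + h2;    // second half of the 128-bit output
        }
        return first;
    }

    public static void main(String[] args) {
        long[] tuple = new long[2];
        long viaTuple = finalizeHash(8L, 1L, 2L, tuple);
        long viaLong = finalizeHash(8L, 1L, 2L, null);
        // Both views agree on the 64-bit value.
        System.out.println(Long.toHexString(viaTuple) + " " + Long.toHexString(viaLong));
    }
}
```

Because the single-value path returns the same quantity that the tuple path stores in `result[0]`, a wrapper can expose the 64-bit hash without allocating a result array.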
diff --git a/src/main/docs/invariants-and-contracts.adoc b/src/main/docs/invariants-and-contracts.adoc index 3eca821..ae5624b 100644 --- a/src/main/docs/invariants-and-contracts.adoc +++ b/src/main/docs/invariants-and-contracts.adoc @@ -9,7 +9,7 @@ Chronicle Software * Every `LongHashFunction` and `LongTupleHashFunction` implementation treats primitives as if they were written to memory using the platform's native byte order; the API therefore guarantees that `hashLong(v)` equals `hashLongs(new long[] {v})` and similar array forms (`LongHashFunction.java`, `LongTupleHashFunction.java`). * All bundled algorithms normalise multi-byte reads to little-endian before mixing, so the same input bytes produce identical hashes on big- and little-endian machines. -Performance may differ, but results must not (`CityAndFarmHash_1_1.java`, `XxHash.java`, `XXH3.java`, `WyHash.java`, `MetroHash.java`, `MurmurHash3.java`). +Performance may differ, but results must not (`CityAndFarmHash_1_1.java`, `XxHash.java`, `XXH3.java`, `WyHash.java`, `MetroHash.java`, `MurmurHash_3.java`). * `hash(Object, Access, long off, long len)` assumes the addressed region is contiguous and valid for the requested byte count. Implementations do not insert bounds checks beyond those provided by the chosen `Access` strategy, so callers must uphold the contract (`LongHashFunction.java:548-612`). * `hashMemory(long address, long length)` treats the `address` as an absolute memory pointer. 
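The invariant above, that multi-byte reads are normalised to little-endian before mixing, can be sketched with standard JDK classes only. This is an assumption-laden illustration: `LittleEndianReadSketch` and its methods are hypothetical stand-ins and do not reproduce the library's `Access` or `Primitives` code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative sketch only; not part of the library. */
final class LittleEndianReadSketch {

    /** Hypothetical stand-in: converts a natively read long to its little-endian interpretation. */
    static long nativeToLittleEndian(long v) {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? v : Long.reverseBytes(v);
    }

    /** Reads 8 bytes in the platform's native order, then normalises the value. */
    static long readLongLittleEndian(byte[] bytes, int offset) {
        long nativeValue = ByteBuffer.wrap(bytes)
                .order(ByteOrder.nativeOrder())
                .getLong(offset);
        return nativeToLittleEndian(nativeValue);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        // The first byte lands in the least significant position on any host.
        System.out.println(Long.toHexString(readLongLittleEndian(data, 0)));
    }
}
```

On a little-endian host the conversion is a no-op, while a big-endian host pays for one byte swap per read, which matches the note that performance may differ but results must not.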
diff --git a/src/main/java/net/openhft/hashing/LongHashFunction.java b/src/main/java/net/openhft/hashing/LongHashFunction.java index 2c60ee4..9dcad87 100644 --- a/src/main/java/net/openhft/hashing/LongHashFunction.java +++ b/src/main/java/net/openhft/hashing/LongHashFunction.java @@ -231,7 +231,7 @@ public static LongHashFunction farmUo(long seed0, long seed1) { */ @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction murmur_3() { - return MurmurHash3.asLongHashFunctionWithoutSeed(); + return MurmurHash_3.asLongHashFunctionWithoutSeed(); } /** @@ -247,7 +247,7 @@ public static LongHashFunction murmur_3() { */ @SuppressWarnings("checkstyle:MethodName") public static LongHashFunction murmur_3(long seed) { - return MurmurHash3.asLongHashFunctionWithSeed(seed); + return MurmurHash_3.asLongHashFunctionWithSeed(seed); } /** diff --git a/src/main/java/net/openhft/hashing/LongTupleHashFunction.java b/src/main/java/net/openhft/hashing/LongTupleHashFunction.java index 72d22b9..baa4ba9 100644 --- a/src/main/java/net/openhft/hashing/LongTupleHashFunction.java +++ b/src/main/java/net/openhft/hashing/LongTupleHashFunction.java @@ -77,7 +77,7 @@ public abstract class LongTupleHashFunction implements Serializable { */ @NotNull public static LongTupleHashFunction murmur_3() { - return MurmurHash3.asLongTupleHashFunctionWithoutSeed(); + return MurmurHash_3.asLongTupleHashFunctionWithoutSeed(); } /** @@ -91,7 +91,7 @@ public static LongTupleHashFunction murmur_3() { */ @NotNull public static LongTupleHashFunction murmur_3(final long seed) { - return MurmurHash3.asLongTupleHashFunctionWithSeed(seed); + return MurmurHash_3.asLongTupleHashFunctionWithSeed(seed); } /** diff --git a/src/main/java/net/openhft/hashing/MurmurHash3.java b/src/main/java/net/openhft/hashing/MurmurHash3.java deleted file mode 100644 index 6aa3459..0000000 --- a/src/main/java/net/openhft/hashing/MurmurHash3.java +++ /dev/null @@ -1,331 +0,0 @@ -/* - * Copyright 2013-2025 chronicle.software; 
SPDX-License-Identifier: Apache-2.0 - */ -package net.openhft.hashing; - -import org.jetbrains.annotations.NotNull; -import org.jetbrains.annotations.Nullable; - -import javax.annotation.ParametersAreNonnullByDefault; - -import static java.nio.ByteOrder.LITTLE_ENDIAN; -import static net.openhft.hashing.Primitives.unsignedInt; -import static net.openhft.hashing.Primitives.unsignedShort; - -/** - * Derived from https://github.com/google/guava/blob/fa95e381e665d8ee9639543b99ed38020c8de5ef - * /guava/src/com/google/common/hash/Murmur3_128HashFunction.java - */ -@ParametersAreNonnullByDefault -class MurmurHash3 { - private static final long C1 = 0x87c37b91114253d5L; - private static final long C2 = 0x4cf5ad432745937fL; - - private static <T> long hash(long seed, @Nullable T input, Access<T> access, long offset, long length, @Nullable long[] result) { - long h1 = seed; - long h2 = seed; - long remaining = length; - while (remaining >= 16L) { - long k1 = access.i64(input, offset); - h1 ^= mixK1(k1); - - h1 = Long.rotateLeft(h1, 27); - h1 += h2; - h1 = h1 * 5L + 0x52dce729L; - - long k2 = access.i64(input, offset + 8L); - offset += 16L; - remaining -= 16L; - - h2 ^= mixK2(k2); - - h2 = Long.rotateLeft(h2, 31); - h2 += h1; - h2 = h2 * 5L + 0x38495ab5L; - } - - if (remaining > 0L) { - long k1 = 0L; - long k2 = 0L; - switch ((int) remaining) { - case 15: - k2 ^= ((long) access.u8(input, offset + 14L)) << 48;// fall through - case 14: - k2 ^= ((long) access.u8(input, offset + 13L)) << 40;// fall through - case 13: - k2 ^= ((long) access.u8(input, offset + 12L)) << 32;// fall through - case 12: - k2 ^= ((long) access.u8(input, offset + 11L)) << 24;// fall through - case 11: - k2 ^= ((long) access.u8(input, offset + 10L)) << 16;// fall through - case 10: - k2 ^= ((long) access.u8(input, offset + 9L)) << 8; // fall through - case 9: - k2 ^= ((long) access.u8(input, offset + 8L)); // fall through - case 8: - k1 ^= access.i64(input, offset); - break; - case 7: - k1 ^= ((long) 
access.u8(input, offset + 6L)) << 48; // fall through - case 6: - k1 ^= ((long) access.u8(input, offset + 5L)) << 40; // fall through - case 5: - k1 ^= ((long) access.u8(input, offset + 4L)) << 32; // fall through - case 4: - k1 ^= access.u32(input, offset); - break; - case 3: - k1 ^= ((long) access.u8(input, offset + 2L)) << 16; // fall through - case 2: - k1 ^= ((long) access.u8(input, offset + 1L)) << 8; // fall through - case 1: - k1 ^= ((long) access.u8(input, offset)); - case 0: - break; - default: - throw new AssertionError("Should never get here."); - } - h1 ^= mixK1(k1); - h2 ^= mixK2(k2); - } - - // This version appears to be working slower - - // if (remaining > 0L) { - // long k1 = 0L; - // long k2 = 0L; - // megaSwitch: - // { - // fetch0_7: - // { - // fetch8_11: - // { - // fetch0_3: - // { - // switch ((int) remaining) { - // case 15: - // k2 ^= ((long) access.u8(input, offset + 14L)) << 48; - // case 14: - // k2 ^= ((long) Primitives.nativeToLittleEndian( - // access.u16(input, offset + 12L))) << 32; - // break fetch8_11; - // case 13: - // k2 ^= ((long) access.u8(input, offset + 12L)) << 32; - // case 12: - // break fetch8_11; - // case 11: - // k2 ^= ((long) access.u8(input, offset + 10L)) << 16; - // case 10: - // k2 ^= (long) Primitives.nativeToLittleEndian( - // access.u16(input, offset + 8L)); - // break fetch0_7; - // case 9: - // k2 ^= ((long) access.u8(input, offset + 8L)); - // case 8: - // break fetch0_7; - // case 7: - // k1 ^= ((long) access.u8(input, offset + 6L)) << 48; - // case 6: - // k1 ^= ((long) Primitives.nativeToLittleEndian( - // access.u16(input, offset + 4L))) << 32; - // break fetch0_3; - // case 5: - // k1 ^= ((long) access.u8(input, offset + 4L)) << 32; - // case 4: - // break fetch0_3; - // case 3: - // k1 ^= ((long) access.u8(input, offset + 2L)) << 16; - // case 2: - // k1 ^= (long) Primitives.nativeToLittleEndian( - // access.u16(input, offset)); - // break megaSwitch; - // case 1: - // k1 ^= ((long) 
access.u8(input, offset)); - // break megaSwitch; - // default: - // throw new AssertionError(); - // } - // } // fetch0_3 - // k1 ^= access.u32(input, offset); - // break megaSwitch; - // } // fetch8_11 - // k2 ^= access.u32(input, offset + 8L); - // } // fetch0_7 - // k1 ^= access.i64(input, offset); - // } // megaSwitch - // - // h1 ^= mixK1(k1); - // h2 ^= mixK2(k2); - // } - - return finalize(length, h1, h2, result); - } - - private static long finalize(long length, long h1, long h2, @Nullable long[] result) { - h1 ^= length; - h2 ^= length; - - h1 += h2; - h2 += h1; - - h1 = fmix64(h1); - h2 = fmix64(h2); - - if (null != result) { - h1 += h2; - result[0] = h1; - result[1] = h1 + h2; - return h1; - } else { - return h1 + h2; - } - } - - private static long fmix64(long k) { - k ^= k >>> 33; - k *= 0xff51afd7ed558ccdL; - k ^= k >>> 33; - k *= 0xc4ceb9fe1a85ec53L; - k ^= k >>> 33; - return k; - } - - private static long mixK1(long k1) { - k1 *= C1; - k1 = Long.rotateLeft(k1, 31); - k1 *= C2; - return k1; - } - - private static long mixK2(long k2) { - k2 *= C2; - k2 = Long.rotateLeft(k2, 33); - k2 *= C1; - return k2; - } - - private static class AsLongTupleHashFunction extends DualHashFunction { - private static final long serialVersionUID = 0L; - @NotNull - private static final AsLongTupleHashFunction SEEDLESS_INSTANCE = new AsLongTupleHashFunction(); - @NotNull - private static final LongHashFunction SEEDLESS_INSTANCE_LONG = SEEDLESS_INSTANCE.asLongHashFunction(); - - private Object readResolve() { - return SEEDLESS_INSTANCE; - } - - @Override - public int bitsLength() { - return 128; - } - @Override - @NotNull - public long[] newResultArray() { - return new long[2]; // override for a little performance - } - - long seed() { - return 0L; - } - - protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result) { - long h1 = mixK1(nativeLong); - long h2 = 0L; - return MurmurHash3.finalize(len, h1, h2, result); - } - - @Override - public long 
dualHashLong(long input, @Nullable long[] result) { - return hashNativeLong(Primitives.nativeToLittleEndian(input), 8L, result); - } - - @Override - public long dualHashInt(int input, @Nullable long[] result) { - return hashNativeLong(unsignedInt(Primitives.nativeToLittleEndian(input)), 4L, result); - } - - @Override - public long dualHashShort(short input, @Nullable long[] result) { - return hashNativeLong(unsignedShort(Primitives.nativeToLittleEndian(input)), 2L, result); - } - - @Override - public long dualHashChar(char input, @Nullable long[] result) { - return hashNativeLong(unsignedShort(Primitives.nativeToLittleEndian(input)), 2L, result); - } - - @Override - public long dualHashByte(byte input, @Nullable long[] result) { - return hashNativeLong(Primitives.unsignedByte((int) input), 1L, result); - } - - @Override - public long dualHashVoid(@Nullable long[] result) { - if (null != result) { - result[0] = 0; - result[1] = 0; - } - return 0; - } - - @Override - public <T> long dualHash(@Nullable T input, Access<T> access, long off, long len, @Nullable long[] result) { - long seed = seed(); - return MurmurHash3.hash(seed, input, access.byteOrder(input, LITTLE_ENDIAN), off, len, result); - } - } - - @NotNull - static LongTupleHashFunction asLongTupleHashFunctionWithoutSeed() { - return AsLongTupleHashFunction.SEEDLESS_INSTANCE; - } - @NotNull - static LongHashFunction asLongHashFunctionWithoutSeed() { - return AsLongTupleHashFunction.SEEDLESS_INSTANCE_LONG; - } - - private static class AsLongTupleHashFunctionSeeded extends AsLongTupleHashFunction { - private static final long serialVersionUID = 0L; - - private final long seed; - @NotNull - private final transient long[] voidHash = newResultArray(); - - private AsLongTupleHashFunctionSeeded(long seed) { - this.seed = seed; - MurmurHash3.finalize(0L, seed, seed, voidHash); - } - - @Override - long seed() { - return seed; - } - - @Override - protected long hashNativeLong(long nativeLong, long len, @Nullable long[] result) 
{ - long seed = this.seed; - long h1 = seed ^ mixK1(nativeLong); - long h2 = seed; - return MurmurHash3.finalize(len, h1, h2, result); - } - - @Override - public long dualHashVoid(@Nullable long[] result) { - if (null != result) { - result[0] = voidHash[0]; - result[1] = voidHash[1]; - } - return voidHash[0]; - } - } - - @NotNull - static LongTupleHashFunction asLongTupleHashFunctionWithSeed(long seed) { - return new AsLongTupleHashFunctionSeeded(seed); - } - @NotNull - static LongHashFunction asLongHashFunctionWithSeed(long seed) { - return new AsLongTupleHashFunctionSeeded(seed).asLongHashFunction(); - } -} From 8d93bc270e2bef36a871f23e5d5d580eea42c0a7 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 14 Nov 2025 14:03:48 +0000 Subject: [PATCH 09/18] Updated documentation --- src/main/docs/algorithm-profiles.adoc | 2 +- src/main/docs/architecture-overview.adoc | 1 + src/main/docs/testing-strategy.adoc | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/main/docs/algorithm-profiles.adoc b/src/main/docs/algorithm-profiles.adoc index bbe8d29..61503f2 100644 --- a/src/main/docs/algorithm-profiles.adoc +++ b/src/main/docs/algorithm-profiles.adoc @@ -1,8 +1,8 @@ = Algorithm Profiles +:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge -:pp: ++ Chronicle Software diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc index b5e7e7a..795d7c4 100644 --- a/src/main/docs/architecture-overview.adoc +++ b/src/main/docs/architecture-overview.adoc @@ -1,4 +1,5 @@ = Zero-Allocation Hashing Architecture Overview +:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge diff --git a/src/main/docs/testing-strategy.adoc b/src/main/docs/testing-strategy.adoc index b4edc3a..9b49547 100644 --- a/src/main/docs/testing-strategy.adoc +++ b/src/main/docs/testing-strategy.adoc @@ -1,4 +1,5 @@ = Testing Strategy +:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge From 86874e7e7fedc48232be2121417ec19e62a15bf1 Mon Sep 17 
00:00:00 2001 From: Peter Lawrey Date: Fri, 14 Nov 2025 15:09:03 +0000 Subject: [PATCH 10/18] Enhance documentation with British English conventions and formatting updates --- src/main/docs/algorithm-profiles.adoc | 3 +-- src/main/docs/architecture-overview.adoc | 1 - src/main/docs/testing-strategy.adoc | 1 - 3 files changed, 1 insertion(+), 4 deletions(-) diff --git a/src/main/docs/algorithm-profiles.adoc b/src/main/docs/algorithm-profiles.adoc index 61503f2..b46679b 100644 --- a/src/main/docs/algorithm-profiles.adoc +++ b/src/main/docs/algorithm-profiles.adoc @@ -1,5 +1,4 @@ = Algorithm Profiles -:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge @@ -55,7 +54,7 @@ Factories :: Implementation :: `net.openhft.hashing.XxHash` ports the official XXH64 reference and keeps the unsigned prime constants as signed Java longs. Key traits :: -* Uses four-lane accumulation for >= 32 byte inputs, matching upstream behaviour bit-for-bit. +* Uses four-lane accumulation for >=32 byte inputs, matching upstream behaviour bit-for-bit. * Applies the canonical avalanche round in `XxHash.finalize` for all lengths. * Seeded and seedless instances differ only by the stored `seed()` override; serialisation preserves both forms. 
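The xxHash profile above notes that unsigned prime constants are kept as signed Java longs. The following sketch, under stated assumptions, shows why that is safe for wrapping multiplication. `UnsignedPrimeSketch` and `mulLow64ViaBigInteger` are hypothetical names; the constant value is the one declared as `XXH_PRIME64_1` in the `XXH3.java` context elsewhere in this patch series.

```java
import java.math.BigInteger;

/** Illustrative sketch only; not part of the library. */
final class UnsignedPrimeSketch {

    // The literal exceeds Long.MAX_VALUE as an unsigned number, so Java stores it as a negative long.
    static final long PRIME64_1 = 0x9E3779B185EBCA87L;

    /** Reference result: multiply as unsigned big integers and keep the low 64 bits. */
    static long mulLow64ViaBigInteger(long a, long b) {
        BigInteger ua = new BigInteger(Long.toUnsignedString(a));
        BigInteger ub = new BigInteger(Long.toUnsignedString(b));
        return ua.multiply(ub).longValue(); // longValue() keeps the low-order 64 bits
    }

    public static void main(String[] args) {
        long input = 0x0123456789ABCDEFL;
        long wrapped = input * PRIME64_1;   // ordinary signed multiplication with overflow
        System.out.println(PRIME64_1 < 0);  // the sign bit is set
        System.out.println(wrapped == mulLow64ViaBigInteger(input, PRIME64_1));
    }
}
```

Two's-complement multiplication yields the same low 64 bits regardless of whether the operands are read as signed or unsigned, so only comparisons, divisions and right shifts need unsigned-aware handling such as `Long.compareUnsigned` or `>>>`.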
diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc index 795d7c4..b5e7e7a 100644 --- a/src/main/docs/architecture-overview.adoc +++ b/src/main/docs/architecture-overview.adoc @@ -1,5 +1,4 @@ = Zero-Allocation Hashing Architecture Overview -:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge diff --git a/src/main/docs/testing-strategy.adoc b/src/main/docs/testing-strategy.adoc index 9b49547..b4edc3a 100644 --- a/src/main/docs/testing-strategy.adoc +++ b/src/main/docs/testing-strategy.adoc @@ -1,5 +1,4 @@ = Testing Strategy -:pp: ++ :toc: :lang: en-GB :source-highlighter: rouge From d6c2e8492d1bf1e7f9d8b45a30f59fed1ac60581 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Sun, 16 Nov 2025 15:45:11 +0000 Subject: [PATCH 11/18] Update documentation --- README.adoc | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.adoc b/README.adoc index e09863e..8a9f221 100644 --- a/README.adoc +++ b/README.adoc @@ -1,6 +1,10 @@ == Zero-Allocation Hashing -Chronicle Software :lang: en-GB :source-highlighter: rouge :pp: ++ +:pp: ++ + +Chronicle Software +:lang: en-GB +:source-highlighter: rouge image:https://maven-badges.herokuapp.com/maven-central/net.openhft/zero-allocation-hashing/badge.svg[caption="",link=https://maven-badges.herokuapp.com/maven-central/net.openhft/zero-allocation-hashing] image:https://javadoc.io/badge2/net.openhft/zero-allocation-hashing/javadoc.svg[link="https://www.javadoc.io/doc/net.openhft/zero-allocation-hashing/latest/index.html"] From ed9d22a7f8f8a0daa4691c8da29ed238394afff5 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Tue, 18 Nov 2025 13:12:48 +0000 Subject: [PATCH 12/18] Checkpoint --- pom.xml | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 76 insertions(+), 1 deletion(-) diff --git a/pom.xml b/pom.xml index 0de8679..da4cc67 100644 --- a/pom.xml +++ b/pom.xml @@ -225,7 +225,6 @@ - maven-release-plugin 3.0.0-M4 @@ -377,5 +376,81 @@ + + quality + 
+ + [11,) + + + + + org.apache.maven.plugins + maven-checkstyle-plugin + 3.6.0 + + + validate + validate + + check + + + + + ${checkstyle.config.location} + true + true + true + ${checkstyle.violationSeverity} + + + + com.puppycrawl.tools + checkstyle + 10.26.1 + + + net.openhft + chronicle-quality-rules + 1.27.0-SNAPSHOT + + + + + com.github.spotbugs + spotbugs-maven-plugin + + 4.9.8.1 + + Max + Low + true + true + + net/openhft/quality/spotbugs27/chronicle-spotbugs-include.xml + net/openhft/quality/spotbugs27/chronicle-spotbugs-exclude.xml + + + + net.openhft + chronicle-quality-rules + 1.27.0-SNAPSHOT + + + + + spotbugs-main + + process-test-classes + + check + + + + + + + From d9828c979efca516250ed9c1ef8cc26ceea9d1e9 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Wed, 19 Nov 2025 09:58:57 +0000 Subject: [PATCH 13/18] Refactor access modifiers in hashing methods for consistency and clarity --- .../net/openhft/hashing/CityAndFarmHash_1_1.java | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java index 6296501..a9531c3 100644 --- a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java +++ b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java @@ -105,6 +105,7 @@ private static <T> long cityHashLen33To64(Access<T> access, T in, long off, long } static <T> long cityHash64(Access<T> access, T in, long off, long len) { + // CHECKSTYLE:OFF // This method is a close translation of the upstream CityHash reference implementation. // Variable declaration placement and naming are preserved for clarity against the original. 
if (len <= 32L) { @@ -117,12 +118,15 @@ static <T> long cityHash64(Access<T> access, T in, long off, long len) { return cityHashLen33To64(access, in, off, len); } - long x = access.i64(in, off + len - 40L); - long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L); - long z = hashLen16(access.i64(in, off + len - 48L) + len, + final long x = access.i64(in, off + len - 40L); + final long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L); + final long z = hashLen16(access.i64(in, off + len - 48L) + len, access.i64(in, off + len - 24L)); - long vFirst, vSecond, wFirst, wSecond; + long vFirst; + long vSecond; + long wFirst; + long wSecond; // This and following 3 blocks are produced by a single-click inline-function refactoring. // IntelliJ IDEA ftw @@ -205,6 +209,7 @@ static <T> long cityHash64(Access<T> access, T in, long off, long len) { } while (len != 0); return hashLen16(hashLen16(vFirst, wFirst) + shiftMix(y) * K1 + z, hashLen16(vSecond, wSecond) + x); + // CHECKSTYLE:ON } private static class AsLongHashFunction extends LongHashFunction { From 020ca070865d11857447e0f2aa30a9150df251c6 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 21 Nov 2025 10:29:39 +0000 Subject: [PATCH 14/18] Code Analysis fixes --- src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java | 2 -- src/main/java/net/openhft/hashing/LongHashFunction.java | 2 -- 2 files changed, 4 deletions(-) diff --git a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java index a9531c3..a0c943a 100644 --- a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java +++ b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java @@ -105,7 +105,6 @@ private static <T> long cityHashLen33To64(Access<T> access, T in, long off, long } static <T> long cityHash64(Access<T> access, T in, long off, long len) { - // CHECKSTYLE:OFF // This method is a close translation of the upstream CityHash reference implementation. 
// Variable declaration placement and naming are preserved for clarity against the original. if (len <= 32L) { @@ -209,7 +208,6 @@ static <T> long cityHash64(Access<T> access, T in, long off, long len) { } while (len != 0); return hashLen16(hashLen16(vFirst, wFirst) + shiftMix(y) * K1 + z, hashLen16(vSecond, wSecond) + x); - // CHECKSTYLE:ON } private static class AsLongHashFunction extends LongHashFunction { diff --git a/src/main/java/net/openhft/hashing/LongHashFunction.java b/src/main/java/net/openhft/hashing/LongHashFunction.java index 9dcad87..969efbb 100644 --- a/src/main/java/net/openhft/hashing/LongHashFunction.java +++ b/src/main/java/net/openhft/hashing/LongHashFunction.java @@ -56,7 +56,6 @@ public abstract class LongHashFunction implements Serializable { private static final long serialVersionUID = 0L; - // CHECKSTYLE:OFF: MethodName /** * Returns a {@code LongHashFunction} that implements the * @@ -361,7 +360,6 @@ public static LongHashFunction wy_3() { public static LongHashFunction wy_3(long seed) { return WyHash.asLongHashFunctionWithSeed(seed); } - // CHECKSTYLE:ON: MethodName /** * Returns a hash function implementing the 64 bit version of From a60f311d53927d7837cec7bcbf8ed81df2314b3d Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Mon, 24 Nov 2025 10:22:42 +0000 Subject: [PATCH 15/18] Remove unnecessary blank lines in various classes for improved code readability --- .gitignore | 2 -- README.adoc | 4 +++- src/main/java-stub/java/lang/Math.java | 1 - src/main/java-stub/sun/misc/Unsafe.java | 1 - src/main/java/net/openhft/hashing/DualHashFunction.java | 8 ++++++-- src/main/java/net/openhft/hashing/XXH3.java | 2 -- src/main/java/sun/nio/ch/DirectBuffer.java | 1 - src/test/java/net/openhft/hashing/MetroHashTest.java | 1 - src/test/java/net/openhft/hashing/XXH128Test.java | 1 - src/test/java/net/openhft/hashing/XXH3Test.java | 1 - src/test/java/net/openhft/hashing/XxHashTest.java | 1 - 11 files changed, 9 insertions(+), 14 deletions(-) diff --git 
a/.gitignore b/.gitignore index 9604c77..d2ce16d 100644 --- a/.gitignore +++ b/.gitignore @@ -119,7 +119,6 @@ flycheck_*.el # network security /network-security.data - ### Intellij+all ### # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 @@ -257,7 +256,6 @@ hs_err_pid* # Icon must end with two \r Icon - # Thumbnails ._* diff --git a/README.adoc b/README.adoc index 8a9f221..eb1048a 100644 --- a/README.adoc +++ b/README.adoc @@ -1,6 +1,8 @@ == Zero-Allocation Hashing - +:toc: :pp: ++ +:lang: en-GB +:source-highlighter: rouge Chronicle Software :lang: en-GB diff --git a/src/main/java-stub/java/lang/Math.java b/src/main/java-stub/java/lang/Math.java index 87a0780..91cddce 100644 --- a/src/main/java-stub/java/lang/Math.java +++ b/src/main/java-stub/java/lang/Math.java @@ -11,7 +11,6 @@ * - Only used methods are exported. * - In test and production runtime, the real class is loaded from boot classpath. */ - public class Math { public static long multiplyHigh(long x, long y) { throw new UnsupportedOperationException(); } } diff --git a/src/main/java-stub/sun/misc/Unsafe.java b/src/main/java-stub/sun/misc/Unsafe.java index b255309..d0eab45 100644 --- a/src/main/java-stub/sun/misc/Unsafe.java +++ b/src/main/java-stub/sun/misc/Unsafe.java @@ -11,7 +11,6 @@ * - Only used methods are exported. * - In test and production runtime, the real class is loaded from boot classpath. 
*/ - public final class Unsafe { public native Object getObject( Object o, long offset); public native int getInt( Object o, long offset); diff --git a/src/main/java/net/openhft/hashing/DualHashFunction.java b/src/main/java/net/openhft/hashing/DualHashFunction.java index 259bddd..f923b95 100644 --- a/src/main/java/net/openhft/hashing/DualHashFunction.java +++ b/src/main/java/net/openhft/hashing/DualHashFunction.java @@ -7,9 +7,13 @@ import org.jetbrains.annotations.NotNull; import javax.annotation.ParametersAreNonnullByDefault; -// An internal helper class for casting LongTupleHashFunction as LongHashFunction - @ParametersAreNonnullByDefault +/** + * Internal base class that exposes a tuple hash as both tuple and single-value + * {@link LongHashFunction}. Subclasses implement the dualHash* variants; this + * wrapper handles result-array checks and caches a single-value view to avoid + * repeated allocation. + */ abstract class DualHashFunction extends LongTupleHashFunction { private static final long serialVersionUID = 0L; diff --git a/src/main/java/net/openhft/hashing/XXH3.java b/src/main/java/net/openhft/hashing/XXH3.java index 76eaf67..c6a6ca5 100644 --- a/src/main/java/net/openhft/hashing/XXH3.java +++ b/src/main/java/net/openhft/hashing/XXH3.java @@ -36,13 +36,11 @@ class XXH3 { private static final long XXH_PRIME32_1 = 0x9E3779B1L; /*!< 0b10011110001101110111100110110001 */ private static final long XXH_PRIME32_2 = 0x85EBCA77L; /*!< 0b10000101111010111100101001110111 */ private static final long XXH_PRIME32_3 = 0xC2B2AE3DL; /*!< 0b11000010101100101010111000111101 */ - private static final long XXH_PRIME64_1 = 0x9E3779B185EBCA87L; /*!< 0b1001111000110111011110011011000110000101111010111100101010000111 */ private static final long XXH_PRIME64_2 = 0xC2B2AE3D27D4EB4FL; /*!< 0b1100001010110010101011100011110100100111110101001110101101001111 */ private static final long XXH_PRIME64_3 = 0x165667B19E3779F9L; /*!< 
0b0001011001010110011001111011000110011110001101110111100111111001 */ private static final long XXH_PRIME64_4 = 0x85EBCA77C2B2AE63L; /*!< 0b1000010111101011110010100111011111000010101100101010111001100011 */ private static final long XXH_PRIME64_5 = 0x27D4EB2F165667C5L; /*!< 0b0010011111010100111010110010111100010110010101100110011111000101 */ - // only support fixed size secret private static final long nbStripesPerBlock = (192 - 64) / 8; private static final long block_len = 64 * nbStripesPerBlock; diff --git a/src/main/java/sun/nio/ch/DirectBuffer.java b/src/main/java/sun/nio/ch/DirectBuffer.java index 01b65e5..f5998f3 100644 --- a/src/main/java/sun/nio/ch/DirectBuffer.java +++ b/src/main/java/sun/nio/ch/DirectBuffer.java @@ -14,7 +14,6 @@ * - Only used methods are exported. * - In test and production runtime, the real class is loaded from boot classpath. */ - public interface DirectBuffer { public long address(); } diff --git a/src/test/java/net/openhft/hashing/MetroHashTest.java b/src/test/java/net/openhft/hashing/MetroHashTest.java index 3569291..3f40555 100644 --- a/src/test/java/net/openhft/hashing/MetroHashTest.java +++ b/src/test/java/net/openhft/hashing/MetroHashTest.java @@ -69,7 +69,6 @@ private void test(LongHashFunction metro, long[] hashesOfLoopingBytes) { * } * } */ - private static final long[] HASHES_OF_LOOPING_BYTES_WITHOUT_SEED = { 8097384203561113213L, 1044577374344929784L, diff --git a/src/test/java/net/openhft/hashing/XXH128Test.java b/src/test/java/net/openhft/hashing/XXH128Test.java index 7ddb48e..0806105 100644 --- a/src/test/java/net/openhft/hashing/XXH128Test.java +++ b/src/test/java/net/openhft/hashing/XXH128Test.java @@ -80,7 +80,6 @@ int main() printf("}\n"); } */ - class XXH128Test_HASHES { public static final long[][] HASHES_OF_LOOPING_BYTES_WITHOUT_SEED = { { 6918025063187695999L, -7374073936536430376L }, diff --git a/src/test/java/net/openhft/hashing/XXH3Test.java b/src/test/java/net/openhft/hashing/XXH3Test.java index 
fb8a540..f222c6d 100644 --- a/src/test/java/net/openhft/hashing/XXH3Test.java +++ b/src/test/java/net/openhft/hashing/XXH3Test.java @@ -77,7 +77,6 @@ int main() printf("}\n"); } */ - class XXH3Test_HASHES { public static final long[] HASHES_OF_LOOPING_BYTES_WITHOUT_SEED = { 3244421341483603138L, diff --git a/src/test/java/net/openhft/hashing/XxHashTest.java b/src/test/java/net/openhft/hashing/XxHashTest.java index 2323feb..1c7def8 100644 --- a/src/test/java/net/openhft/hashing/XxHashTest.java +++ b/src/test/java/net/openhft/hashing/XxHashTest.java @@ -69,7 +69,6 @@ private void test(LongHashFunction city, long[] hashesOfLoopingBytes) { * } * } */ - private static final long[] HASHES_OF_LOOPING_BYTES_WITHOUT_SEED = { -1205034819632174695L, -1642502924627794072L, From db0e56ad490e0ca76e9f07f239c5a7194f16c8bb Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Mon, 24 Nov 2025 10:53:48 +0000 Subject: [PATCH 16/18] Remove unnecessary blank lines in various classes for improved code readability --- AGENTS.md | 9 +++++++++ src/main/java/sun/nio/ch/package-info.java | 12 ++++++++++++ 2 files changed, 21 insertions(+) create mode 100644 src/main/java/sun/nio/ch/package-info.java diff --git a/AGENTS.md b/AGENTS.md index 5c83c58..03617d6 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -159,3 +159,12 @@ section:: Top Level Section ### Emphasis and Bold Text In AsciiDoc, an underscore `_` is _emphasis_; `*text*` is *bold*. + +## Zero-Allocation Hashing module specifics + +- Follow repository `AGENTS.md` as the base rules; this section adds module notes. Durable docs live in `src/main/docs/` with the landing page at `README.adoc`. +- Module purpose: zero-allocation hashing primitives and utilities for bytes/arrays/buffers with stable, cross-platform output. +- Build commands: full build `mvn -q clean verify`; module-only without tests `mvn -pl zero-allocation-hashing -am -DskipTests install`. 
+- Quality gates: keep Checkstyle/SpotBugs clean; preserve deterministic hash outputs across platforms; avoid hidden allocations in hot paths. +- Documentation: maintain Nine-Box IDs in `src/main/docs/specifications.adoc`/`project-requirements` if added, and link decisions/tests accordingly; British English, ASCII/ISO-8859-1, `:source-highlighter: rouge`. +- Guardrails: changes to hashing outputs are breaking; document any algorithm/version bumps; call out platform-specific behaviour in `unsafe-and-platform-notes.adoc`. diff --git a/src/main/java/sun/nio/ch/package-info.java b/src/main/java/sun/nio/ch/package-info.java new file mode 100644 index 0000000..f04b454 --- /dev/null +++ b/src/main/java/sun/nio/ch/package-info.java @@ -0,0 +1,12 @@ +/* + * Copyright 2013-2025 chronicle.software; SPDX-License-Identifier: Apache-2.0 + */ +/** + * Minimal stubs for {@code sun.nio.ch} used by Zero-Allocation-Hashing. + *

+ * This package provides the narrow subset of the JDK's internal + * {@code sun.nio.ch} types required to build and run the hashing + * library across different Java versions without depending on the + * actual JDK internals at runtime. + */ +package sun.nio.ch; From 177fed38e05f3f19b129ed879375e9321aa65b2a Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Mon, 24 Nov 2025 11:57:37 +0000 Subject: [PATCH 17/18] Remove JDK activation from quality profile in pom.xml --- pom.xml | 4 ---- 1 file changed, 4 deletions(-) diff --git a/pom.xml b/pom.xml index da4cc67..268d977 100644 --- a/pom.xml +++ b/pom.xml @@ -378,10 +378,6 @@ quality - - - [11,) - From 1a10d21c33431b57ec21b85cb2c789ec93a64e73 Mon Sep 17 00:00:00 2001 From: Peter Lawrey Date: Fri, 28 Nov 2025 10:55:25 +0000 Subject: [PATCH 18/18] Revert adding final local var --- src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java index a0c943a..5133081 100644 --- a/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java +++ b/src/main/java/net/openhft/hashing/CityAndFarmHash_1_1.java @@ -117,9 +117,9 @@ static <T> long cityHash64(Access<T> access, T in, long off, long len) { return cityHashLen33To64(access, in, off, len); } - final long x = access.i64(in, off + len - 40L); - final long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L); - final long z = hashLen16(access.i64(in, off + len - 48L) + len, + long x = access.i64(in, off + len - 40L); + long y = access.i64(in, off + len - 16L) + access.i64(in, off + len - 56L); + long z = hashLen16(access.i64(in, off + len - 48L) + len, access.i64(in, off + len - 24L)); long vFirst;