GitHub Issue Proposal for RootInteractive (JavaScript Browser Part)
Title: Feature: Implement Generalized LUT-Based Lossy Compression for Browser-Side Data (JS/WASM)
Labels: enhancement, javascript, performance, data-compression, browser
Assignees: (
Milestone: (Optional, depends on project planning)
🚀 Problem Statement
As RootInteractive pushes the boundaries of interactive data exploration in the browser, handling increasingly large numeric datasets becomes a critical challenge. Shipping and processing high-precision floating-point arrays can lead to:
- Increased Network Latency: Large data payloads slow down initial load times and subsequent updates.
- Higher Memory Consumption: Storing full-precision data in the browser consumes significant RAM, impacting performance, especially on mobile devices or for complex dashboards.
- Limited Interactivity: Slower data transfer and processing can hinder smooth zooming, panning, and real-time updates of visualizations.
While exact floating-point precision is often unnecessary for interactive visualization and approximate analysis (e.g., track residuals, cluster response), current browser-side solutions lack a generalized, efficient lossy compression mechanism that accounts for intrinsic data resolution and ensures predictable approximation errors.
✨ Proposed Solution: Generalized LUT-Based Lossy Compression
We propose integrating a generalized Lookup Table (LUT)-based lossy compression framework directly into the RootInteractive JavaScript browser environment, leveraging WebAssembly (WASM) where beneficial for performance-critical decoding.
This approach focuses on:
- Significant Data Reduction: Aims to achieve 4–8x compression ratios for numeric columns.
- Predictable Approximation: Guarantees bounded residuals and preserves statistical properties of the data through intelligent inverse mapping and dithering.
- Optimized for Browser Performance: Designed to exploit cache locality for rapid decompression, making it ideal for interactive dashboards.
🔧 Key Components & Browser-Specific Implementation Details
-
Forward LUT Compression (Server-side/Pre-processing):
- This component would primarily live on the server-side (e.g., Python backend generating data for RootInteractive).
- It applies a monotonic transform
f(x) \rightarrow y (e.g., log, sqrt, sinh) and quantizes y to discrete bins (e.g., 256 or 65536 bins for Uint8Array or Uint16Array storage).
- Optional preprocessing like
x' = (x - offset) / scale for data normalization.
- The generated LUT (mapping quantized
y back to x via finv(y)) and relevant metadata (offset, scale, function type) would be stored alongside the compressed data.
-
Inverse LUT Decompression (Browser-side - JavaScript/WASM):
- The core of the browser-side implementation.
- Decompression would involve:
- Loading the compressed
Uint8Array or Uint16Array data.
- Loading the corresponding LUT (as a
Float32Array or Float64Array).
- Applying the inverse LUT mapping
x̂ = finv(y) using direct array indexing.
-
Performance Advantage: Crucially, if the LUT size is small (e.g., $\approx$ 10-20 kB for 256 or 65536 entries of
Float32), it will almost certainly fit into the L1/L2 CPU cache. This enables significantly faster decoding in JavaScript/WASM than re-calculating analytical functions on the fly for each data point, which is paramount for smooth interactivity.
-
WASM Potential: Performance-critical decoding loops could be implemented in WebAssembly for near-native speed, especially for larger arrays, leveraging Rust/C++ compiled to WASM.
-
Dithering (Dequantization Smearing) (Browser-side - JavaScript):
- To counteract visual artifacts (aliasing, histogram spikes) introduced by quantization, a small amount of carefully calculated randomness is added during decompression.
- Formula: $x̂ \in [\text{finv}(y - 0.5), \text{finv}(y + 0.5)]$, where a random number is drawn within this reconstructed bin range.
- This is a pure JavaScript operation, preserving the original data's density and smoothness for visualizations and downstream statistical processing or ML models without increasing transmitted data size.
-
Parameterization & Metadata (JSON/Typed Arrays):
- Metadata (e.g.,
offset, scale, lutReference, transformType) would be stored as JSON objects alongside the compressed TypedArray data.
- This allows for per-column or per-branch configuration within RootInteractive's data structures.
- The
lutReference could point to a shared LUT for common transforms, or an inline LUT for unique cases.
🎯 Use Cases for RootInteractive
- Interactive Visualizations:
- Faster loading and smoother interaction for large 1D/2D histograms, scatter plots, and event displays.
- Example: Visualizing track residuals or cluster charges with minimal visual distortion but dramatically reduced data size.
- Machine Learning Feature Visualization: Displaying quantized ML features efficiently, while ensuring the underlying distribution shape is preserved for qualitative inspection.
- Low-Memory Environments: Enabling RootInteractive to handle larger datasets effectively on devices with limited memory (e.g., tablets, older laptops).
- Remote Data Access (WASM/Lazy Decoding): Efficiently fetch and decode only necessary data segments from a remote server, improving responsiveness for "drill-down" analysis.
🗓️ Current Status (RootInteractive Context)
SqrtScaling with LUT + inverse + smearing: ✅ A working prototype exists (likely in the Python backend and a conceptual JS equivalent).
- Generic LUT codec abstraction: ⌛ In progress (designing the generalized API).
- Metadata schema for
scale/offset: ⌛ In progress (defining how these parameters will be passed to and used by the browser).
- Multi-language decoder support (C++/Python/JS): ⏳ Planned (this proposal focuses on the JS part).
- Lazy decoding (WASM/ONNX/in browser): ⏳ Planned (this feature directly supports and benefits from this compression).
🛣️ Next Steps for RootInteractive JavaScript
- Define Browser-Side API: Design the JavaScript API for a
MonotonicLUTQuantizer or LUTLossyCodec that can:
- Accept compressed
TypedArray and metadata.
- Perform efficient inverse LUT lookup and dithering.
- Provide configuration options for different transform types.
- Implement Core Decompression Logic: Develop the initial JavaScript implementation for the inverse LUT lookup and dithering.
- Explore WASM Integration if needed Prototype a WASM module for the most performance-critical parts of the decompression (e.g., large array processing loops) and benchmark against pure JavaScript.
- Integrate with RootInteractive Data Flow: Hook the new decompression logic into RootInteractive's data loading and rendering pipeline
- Benchmarking: Conduct thorough browser-side benchmarks comparing:
- Raw
Float32Array transfer/processing.
- Compressed
Uint8Array/Uint16Array + JS decompression.
- Compressed
Uint8Array/Uint16Array + WASM decompression.
- Metrics: data transfer size, decode time, rendering frame rate.
- Documentation: Document the new compression scheme, its benefits, usage, and API for RootInteractive developers and users.
GitHub Issue Proposal for RootInteractive (JavaScript Browser Part)
Title: Feature: Implement Generalized LUT-Based Lossy Compression for Browser-Side Data (JS/WASM)
Labels:
enhancement,javascript,performance,data-compression,browserAssignees: (
Milestone: (Optional, depends on project planning)
🚀 Problem Statement
As RootInteractive pushes the boundaries of interactive data exploration in the browser, handling increasingly large numeric datasets becomes a critical challenge. Shipping and processing high-precision floating-point arrays can lead to:
While exact floating-point precision is often unnecessary for interactive visualization and approximate analysis (e.g., track residuals, cluster response), current browser-side solutions lack a generalized, efficient lossy compression mechanism that accounts for intrinsic data resolution and ensures predictable approximation errors.
✨ Proposed Solution: Generalized LUT-Based Lossy Compression
We propose integrating a generalized Lookup Table (LUT)-based lossy compression framework directly into the RootInteractive JavaScript browser environment, leveraging WebAssembly (WASM) where beneficial for performance-critical decoding.
This approach focuses on:
🔧 Key Components & Browser-Specific Implementation Details
Forward LUT Compression (Server-side/Pre-processing):
f(x) \rightarrow y(e.g.,log,sqrt,sinh) and quantizesyto discrete bins (e.g., 256 or 65536 bins forUint8ArrayorUint16Arraystorage).x' = (x - offset) / scalefor data normalization.yback toxviafinv(y)) and relevant metadata (offset,scale, function type) would be stored alongside the compressed data.Inverse LUT Decompression (Browser-side - JavaScript/WASM):
Uint8ArrayorUint16Arraydata.Float32ArrayorFloat64Array).x̂ = finv(y)using direct array indexing.Float32), it will almost certainly fit into the L1/L2 CPU cache. This enables significantly faster decoding in JavaScript/WASM than re-calculating analytical functions on the fly for each data point, which is paramount for smooth interactivity.Dithering (Dequantization Smearing) (Browser-side - JavaScript):
Parameterization & Metadata (JSON/Typed Arrays):
offset,scale,lutReference,transformType) would be stored as JSON objects alongside the compressedTypedArraydata.lutReferencecould point to a shared LUT for common transforms, or an inline LUT for unique cases.🎯 Use Cases for RootInteractive
🗓️ Current Status (RootInteractive Context)
SqrtScalingwith LUT + inverse + smearing: ✅ A working prototype exists (likely in the Python backend and a conceptual JS equivalent).scale/offset: ⌛ In progress (defining how these parameters will be passed to and used by the browser).🛣️ Next Steps for RootInteractive JavaScript
MonotonicLUTQuantizerorLUTLossyCodecthat can:TypedArrayand metadata.Float32Arraytransfer/processing.Uint8Array/Uint16Array+ JS decompression.Uint8Array/Uint16Array+ WASM decompression.