Commit fe85bf2

LuciferYang authored and MaxGekk committed
[MINOR][CORE] Fix swapped depth/width formulas in CountMinSketch class doc
### What changes were proposed in this pull request?

This PR fixes the class-level Javadoc of `org.apache.spark.util.sketch.CountMinSketch`, which assigned the two table dimensions to the wrong symbols. The doc previously read:

```
d = ceil(2 / eps)
w = ceil(-log(1 - confidence) / log(2))
```

but the implementation (`CountMinSketchImpl`, the `(eps, confidence, seed)` constructor) actually sets:

```java
this.width = (int) Math.ceil(2 / eps);
this.depth = (int) Math.ceil(-Math.log1p(-confidence) / Math.log(2));
```

i.e. `width` is derived from `eps` and `depth` from `confidence`, the opposite of what the doc stated. The fix swaps the two lines so the doc matches the code:

```
w = ceil(2 / eps)
d = ceil(-log(1 - confidence) / log(2))
```

### Why are the changes needed?

Doc fix

### Does this PR introduce _any_ user-facing change?

No. Documentation-only change; no behavior, serialization, or API impact.

### How was this patch tested?

Pass Github Actions

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

Closes #56897 from LuciferYang/minor-countminsketch-doc-fix.

Authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 1409d88)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
1 parent 12983be commit fe85bf2

1 file changed

Lines changed: 2 additions & 2 deletions

common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java

@@ -45,8 +45,8 @@
  * Under the cover, a {@link CountMinSketch} is essentially a two-dimensional {@code long} array
  * with depth {@code d} and width {@code w}, where
  * <ul>
- * <li>{@code d = ceil(2 / eps)}</li>
- * <li>{@code w = ceil(-log(1 - confidence) / log(2))}</li>
+ * <li>{@code w = ceil(2 / eps)}</li>
+ * <li>{@code d = ceil(-log(1 - confidence) / log(2))}</li>
  * </ul>
  *
  * This implementation is largely based on the {@code CountMinSketch} class from stream-lib.
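
To make the corrected mapping concrete, here is a minimal standalone sketch that evaluates the two formulas outside of Spark. The class name `SketchDimensions` and its methods are hypothetical and exist only for illustration. The arithmetic mirrors the two constructor assignments quoted in the commit message, but this is not the `CountMinSketchImpl` code itself and it omits any argument validation the real constructor may perform.

```java
// Hypothetical helper, not part of Spark: evaluates the corrected Javadoc formulas.
final class SketchDimensions {
    private SketchDimensions() {}

    // w = ceil(2 / eps): the width depends only on the relative error.
    static int width(double eps) {
        return (int) Math.ceil(2 / eps);
    }

    // d = ceil(-log(1 - confidence) / log(2)): the depth depends only on the confidence.
    // log1p(-confidence) computes log(1 - confidence), as in the quoted constructor.
    static int depth(double confidence) {
        return (int) Math.ceil(-Math.log1p(-confidence) / Math.log(2));
    }

    public static void main(String[] args) {
        double eps = 0.125;        // example value chosen for illustration
        double confidence = 0.99;  // example value chosen for illustration
        System.out.println("width = " + width(eps));
        System.out.println("depth = " + depth(confidence));
    }
}
```

With `eps = 0.125` and `confidence = 0.99` this gives a width of 16 and a depth of 7. Tightening `eps` grows only the width, and raising `confidence` grows only the depth, which is the relationship the corrected doc now states.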
