
Speed up length of text strings #417

Merged
01mf02 merged 1 commit into main from length-str2 on Mar 26, 2026

01mf02 (Owner) commented Mar 26, 2026

This is an improved version of #394. Instead of using `core::str::from_utf8`, it implements the same technique as BurntSushi/bstr#223. That should give much more stable results for strings that are mostly valid UTF-8 but contain a few invalid UTF-8 sequences.
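The code from this PR and from the bstr PR is not reproduced here. As a rough illustration only, the sketch below shows one way to count characters in mostly valid UTF-8 without validating the whole string first: ASCII runs are counted in bulk, and each non-ASCII sequence, valid or not, is handled on its own, so an invalid byte only affects the sequence it occurs in. The names `count_chars_lossy` and `seq_len` are hypothetical, and counting every invalid sequence as a single character (the way `String::from_utf8_lossy` would replace it) is an assumption, not necessarily what jaq or bstr do.

```rust
/// Hypothetical helper: byte length of the UTF-8 sequence (or of the maximal
/// invalid prefix) that starts at `b[0]`. Expects a non-empty slice.
fn seq_len(b: &[u8]) -> usize {
    // Reading past the end yields 0, which is never a continuation byte.
    let get = |i: usize| b.get(i).copied().unwrap_or(0);
    let is_cont = |x: u8| (0x80..=0xBF).contains(&x);
    // For each lead byte: is the second byte acceptable, and how long is the
    // complete sequence? Ranges follow the well-formed UTF-8 table.
    let (second_ok, total) = match b[0] {
        0xC2..=0xDF => (is_cont(get(1)), 2),
        0xE0 => ((0xA0..=0xBF).contains(&get(1)), 3),
        0xE1..=0xEC | 0xEE..=0xEF => (is_cont(get(1)), 3),
        0xED => ((0x80..=0x9F).contains(&get(1)), 3),
        0xF0 => ((0x90..=0xBF).contains(&get(1)), 4),
        0xF1..=0xF3 => (is_cont(get(1)), 4),
        0xF4 => ((0x80..=0x8F).contains(&get(1)), 4),
        // Stray continuation byte or a byte that can never start a sequence.
        _ => return 1,
    };
    if !second_ok {
        return 1;
    }
    for i in 2..total {
        if !is_cont(get(i)) {
            // Truncated sequence: the bytes seen so far form one invalid unit.
            return i;
        }
    }
    total
}

/// Hypothetical sketch: number of characters in `bytes`, where every invalid
/// sequence counts as one character (assumed semantics).
fn count_chars_lossy(mut bytes: &[u8]) -> usize {
    let mut n = 0;
    while !bytes.is_empty() {
        // Fast path: every ASCII byte is exactly one character.
        let ascii = bytes.iter().take_while(|b| b.is_ascii()).count();
        n += ascii;
        bytes = &bytes[ascii..];
        if bytes.is_empty() {
            break;
        }
        // Slow path: skip one valid or invalid non-ASCII sequence.
        n += 1;
        bytes = &bytes[seq_len(bytes)..];
    }
    n
}
```

Under these assumptions, `count_chars_lossy(bytes)` agrees with `String::from_utf8_lossy(bytes).chars().count()` while never allocating or re-scanning the input.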

Furthermore, this removes a few casts in the length calculation that could previously truncate the length of very large strings.
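To illustrate the kind of truncation meant here, the following minimal sketch contrasts an `as` cast with a checked conversion. Which integer types jaq actually cast between is not stated in this description, so the `u64` to `u32` pair and both function names are assumptions chosen for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical: a length squeezed through a narrowing `as` cast.
/// Values above `u32::MAX` silently wrap around.
fn len_truncating(len: u64) -> u32 {
    len as u32
}

/// Hypothetical: the checked alternative reports the overflow instead.
fn len_checked(len: u64) -> Option<u32> {
    u32::try_from(len).ok()
}
```

With these definitions, a length of exactly 2^32 becomes 0 under `len_truncating`, whereas `len_checked` returns `None`.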

Results:

$ hyperfine -L v main,length-str2 "target/release/jaq-{v} -n '[\"\" | limit(100000; recurse(. + \"a\")) | length] | add'"
Benchmark 1: target/release/jaq-main -n '["" | limit(100000; recurse(. + "a")) | length] | add'
  Time (mean ± σ):      2.495 s ±  0.009 s    [User: 2.481 s, System: 0.003 s]
  Range (min … max):    2.487 s …  2.512 s    10 runs
 
Benchmark 2: target/release/jaq-length-str2 -n '["" | limit(100000; recurse(. + "a")) | length] | add'
  Time (mean ± σ):     156.4 ms ±   1.6 ms    [User: 152.4 ms, System: 3.1 ms]
  Range (min … max):   154.6 ms … 162.1 ms    19 runs
 
Summary
  target/release/jaq-length-str2 -n '["" | limit(100000; recurse(. + "a")) | length] | add' ran
   15.96 ± 0.17 times faster than target/release/jaq-main -n '["" | limit(100000; recurse(. + "a")) | length] | add'

In the pure ASCII case, calculating string length is now up to 16 times faster than in main.

01mf02 changed the title from "Speed up length of valid UTF-8 text strings, take 2." to "Speed up length of text strings" on Mar 26, 2026
01mf02 merged commit c0da08b into main on Mar 26, 2026
4 checks passed
01mf02 deleted the length-str2 branch on March 26, 2026 16:08