Commit dfd3d2e
Update tokenizers submodule to the linear-time HF encode path
Bump extension/llm/tokenizers to pick up the linear-time HFTokenizer
encode work: merge_all in O(n log n) and ReplaceNormalizer::normalize as a
single O(N) forward pass. The bump also brings in a targeted encode-latency
benchmark. This cuts long-prompt prefill tokenization time in the
gemma4 / eagle3 runners. Token ids and greedy output are unchanged, verified
end to end on the gemma4-31B target (identical 18-token encode + decode
after the bump).
ghstack-source-id: 525dc35
ghstack-comment-id: 4734208425
Pull-Request: #203491
parent d9f3278, commit dfd3d2e
1 file changed
Lines changed: 1 addition & 1 deletion
Submodule tokenizers updated 17 files
- BUCK (+1 -1)
- benchmark/.gitignore (+1)
- benchmark/CMakeLists.txt (+21)
- benchmark/hf_tokenizer_encode_latency.cpp (+141)
- benchmark/run.sh (+15)
- include/pytorch/tokenizers/hf_tokenizer.h (+9 -110)
- include/pytorch/tokenizers/pre_tokenizer.h (+16 -1)
- include/pytorch/tokenizers/regex.h (+5 -1)
- include/pytorch/tokenizers/string_integer_map.h (+5 -1)
- src/hf_tokenizer.cpp (+101 -15)
- src/normalizer.cpp (+29 -6)
- src/pre_tokenizer.cpp (+36 -12)
- test/test_hf_tokenizer.cpp (+129)
- test/test_normalizer.cpp (+32)
- test/test_pre_tokenizer.cpp (+196)
- test/test_string_integer_map.cpp (+38)
- third-party/llama.cpp-unicode/src/unicode.cpp (+65 -31)
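
The commit message describes ReplaceNormalizer::normalize as a single O(N) forward pass. The following is a minimal sketch of that general idea, not the submodule's actual code: the function name replace_all_single_pass and its signature are hypothetical, and it handles only a literal (non-regex) pattern. It scans the input once and appends to a fresh output buffer, rather than rewriting the string in place and shifting the tail after every match.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper for illustration; NOT the ReplaceNormalizer API.
// Replaces every non-overlapping occurrence of `pattern` in `input` with
// `replacement`, scanning left to right and never revisiting earlier bytes.
std::string replace_all_single_pass(std::string_view input,
                                    std::string_view pattern,
                                    std::string_view replacement) {
  if (pattern.empty()) {
    // An empty pattern would match at every position; return input unchanged.
    return std::string(input);
  }
  std::string out;
  out.reserve(input.size());  // a guess; grows if replacement is longer
  std::size_t pos = 0;
  while (true) {
    const std::size_t hit = input.find(pattern, pos);
    if (hit == std::string_view::npos) {
      out.append(input.substr(pos));  // copy the unmatched tail
      break;
    }
    out.append(input.substr(pos, hit - pos));  // copy text before the match
    out.append(replacement);                   // emit the replacement
    pos = hit + pattern.size();                // resume after the match
  }
  return out;
}
```

Each input byte is copied to the output at most once, so for a short fixed pattern the work grows linearly with the input length. An in-place loop that calls std::string::replace per match can instead move the remaining tail on every hit, which becomes quadratic on long prompts with many matches.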