@@ -7,6 +7,96 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.9.0](https://github.com/ModelTC/mtc-inc-bpe/releases/tag/v0.9.0) - 2026-05-07
+
+### Added
+
+- add a script to properize tokenizers
+- add more assertions when checking properness
+- check if dict is proper before normalization
+- expose heap implementation
+- use canonical tokens for automaton and add a shortcut byte to token lookup method
+- [**breaking**] add eager tokenization
+- expose byte and char lookup operations
+- support empty tokens
+- add built-in byte and char lookup tables to `Vocab`
+- heavy light decomposition
+- [**breaking**] rename functions and expose more for `IncBpeTokenization`
+- [**breaking**] expose position in `IncBpeTokenChainIter`
+- [**breaking**] make `NormalizedDict::new` return `Result`, adjust several interfaces
+- expose IncBpeTokenChainIter
+- [**breaking**] fetch token chain using iterator instead of vec for performance
+- [**breaking**] expose more context when checking if a token is single
+- init
+
+### Fixed
+
+- use usize for accumulated u16
+- *(eager)* add additional capacity for last flushing token
+- *(tests)* handle errors when improper
+- *(heap)* fix bugs in heap implementation
+- [**breaking**] refine naming, update interface and remove redundant code
+- extract essential forest data for tokenization, keep memory usage minimized
+- allow setting different repr_id in SufSucNode
+- inline more functions
+- remove `EagerBpeToken`, `feed_len` is useless for external users
+- use rapidhash instead of default hash
+- move byte to token id table to heap
+- use the roots of the subtrees as indicators of parents, fix #19
+- expose priority of a token in normalized dict, fix panic when token id exceeded vocab size
+- use `LinkedList` for suffix chain
+- check token length explicitly
+- add `non_exhaustive` to errors
+- use u16 as `skip_len`
+- expose `NormalizedDictBuildError`
+
+### Other
+
+- *(docs)* update repo url in changelog
+- *(release)* rename to `mtc-inc-bpe` and release 0.9.0
+- *(deps)* update `rand`
+- *(deps)* bump dependabot/fetch-metadata from 2 to 3 ([#48](https://github.com/ModelTC/mtc-inc-bpe/pull/48))
+- update dependencies
+- bump version to v0.8.1
+- *(cargo)* exclude tools
+- update thiserror
+- bump version to v0.8.0
+- use a big node pool for centroid decomposition
+- *(cargo)* update dependencies and metadata
+- release v0.7.1
+- *(format)* format the code
+- disable default features of dependencies
+- release v0.7.0
+- release v0.6.0
+- make use of type inference
+- remove `two_diff_mut` from `TypedVec`
+- remove authors field in Cargo.toml
+- *(tests)* move heap bpe into test utils
+- *(deps)* update `derive_more`
+- keep transition table in the order of heavy chains
+- *(aho_corasik)* use sqrt decomposition to reduce memory footprint
+- release v0.5.0
+- use `tinyvec` replacing `smallvec` to reduce memory footprint
+- *(tests)* add tests on repeated characters
+- unify integer literals
+- release v0.4.1 ([#18](https://github.com/ModelTC/mtc-inc-bpe/pull/18))
+- optimize constructors in debug mode ([#17](https://github.com/ModelTC/mtc-inc-bpe/pull/17))
+- *(ci)* add auto-merge dependabot PR
+- *(deps)* bump actions/checkout from 5 to 6
+- release v0.4.0
+- release v0.3.1
+- release v0.3.0
+- rename parameters for clarity
+- release v0.2.1
+- release v0.2.0
+- clean up code
+- optimize validation to reduce execution time
+- pre-allocate vector whenever possible
+- reorder functions
+- update package name ([#4](https://github.com/ModelTC/mtc-inc-bpe/pull/4))
+- release v0.1.0 ([#3](https://github.com/ModelTC/mtc-inc-bpe/pull/3))
+- add more events to trigger build and test ([#2](https://github.com/ModelTC/mtc-inc-bpe/pull/2))
+
 ## [0.7.1](https://github.com/ModelTC/mtc-inc-bpe/compare/v0.7.0...v0.7.1) - 2026-01-05
 
 ### Fixed