Commit c3e6bd8

github-actions[bot] authored and ChieloNewctle committed
chore: release v0.8.1
1 parent 44e983a commit c3e6bd8

1 file changed

CHANGELOG.md

Lines changed: 90 additions & 0 deletions
@@ -7,6 +7,96 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.9.0](https://github.com/ModelTC/mtc-inc-bpe/releases/tag/v0.9.0) - 2026-05-07
+
+### Added
+
+- add a script to properize tokenizers
+- add more assertions when checking properness
+- check if dict is proper before normalization
+- expose heap implementation
+- use canonical tokens for automaton and add a shortcut byte to token lookup method
+- [**breaking**] add eager tokenization
+- expose byte and char lookup operations
+- support empty tokens
+- add built-in byte and char lookup tables to `Vocab`
+- heavy light decomposition
+- [**breaking**] rename functions and expose more for `IncBpeTokenization`
+- [**breaking**] expose position in `IncBpeTokenChainIter`
+- [**breaking**] make `NormalizedDict::new` return `Result`, adjust several interfaces
+- expose IncBpeTokenChainIter
+- [**breaking**] fetch token chain using iterator instead of vec for performance
+- [**breaking**] expose more context when checking if a token is single
+- init
+
+### Fixed
+
+- use usize for accumulated u16
+- *(eager)* add additional capacity for last flushing token
+- *(tests)* handle errors when improper
+- *(heap)* fix bugs in heap implementation
+- [**breaking**] refine naming, update interface and remove redundant code
+- extract essential forest data for tokenization, keep memory usage minimized
+- allow setting different repr_id in SufSucNode
+- inline more functions
+- remove `EagerBpeToken`, `feed_len` is useless for external users
+- use rapidhash instead of default hash
+- move byte to token id table to heap
+- use the roots of the subtrees as indicators of parents, fix #19
+- expose priority of a token in normalized dict, fix panic when token id exceeded vocab size
+- use `LinkedList` for suffix chain
+- check token length explicitly
+- add `non_exhaustive` to errors
+- use u16 as `skip_len`
+- expose `NormalizedDictBuildError`
+
+### Other
+
+- *(docs)* update repo url in changelog
+- *(release)* rename to `mtc-inc-bpe` and release 0.9.0
+- *(deps)* update `rand`
+- *(deps)* bump dependabot/fetch-metadata from 2 to 3 ([#48](https://github.com/ModelTC/mtc-inc-bpe/pull/48))
+- update dependencies
+- bump version to v0.8.1
+- *(cargo)* exclude tools
+- update thiserror
+- bump version to v0.8.0
+- use a big node pool for centroid decomposition
+- *(cargo)* update dependencies and metadata
+- release v0.7.1
+- *(format)* format the code
+- disable default features of dependencies
+- release v0.7.0
+- release v0.6.0
+- make use of type inference
+- remove `two_diff_mut` from `TypedVec`
+- remove authors field in Cargo.toml
+- *(tests)* move heap bpe into test utils
+- *(deps)* update `derive_more`
+- keep transition table in the order of heavy chains
+- *(aho_corasik)* use sqrt decomposition to reduce memory footprint
+- release v0.5.0
+- use `tinyvec` replacing `smallvec` to reduce memory footprint
+- *(tests)* add tests on repeated characters
+- unify integer literals
+- release v0.4.1 ([#18](https://github.com/ModelTC/mtc-inc-bpe/pull/18))
+- optimize constructors in debug mode ([#17](https://github.com/ModelTC/mtc-inc-bpe/pull/17))
+- *(ci)* add auto-merge dependabot PR
+- *(deps)* bump actions/checkout from 5 to 6
+- release v0.4.0
+- release v0.3.1
+- release v0.3.0
+- rename parameters for clarity
+- release v0.2.1
+- release v0.2.0
+- clean up code
+- optimize validation to reduce execution time
+- pre-allocate vector whenever possible
+- reorder functions
+- update package name ([#4](https://github.com/ModelTC/mtc-inc-bpe/pull/4))
+- release v0.1.0 ([#3](https://github.com/ModelTC/mtc-inc-bpe/pull/3))
+- add more events to trigger build and test ([#2](https://github.com/ModelTC/mtc-inc-bpe/pull/2))
+
 ## [0.7.1](https://github.com/ModelTC/mtc-inc-bpe/compare/v0.7.0...v0.7.1) - 2026-01-05
 
 ### Fixed
