All notable user-facing changes to CompressKit are tracked here.
The project follows Keep a Changelog style categories and uses semantic versioning for releases.
- BREAKING: Project refactored to C++17-only. Go and Rust implementations removed.
- BREAKING: Build system migrated from raw g++ Makefile to CMake.
- Removed OpenSpec, Cursor, and Claude skill meta-tooling directories.
- Removed cross-language conformance matrix and streaming API contract tests.
- Buffer layer rewritten to use in-memory transforms instead of temporary files.
- Huffman encoding now uses `uint64_t` code words instead of `std::string` for 8x density.
- Huffman decoding now uses 8-bit lookup table per internal node (~8x faster than bit-by-bit tree walk).
- Range Coder now uses shared `frequency_table` utilities (was duplicating scale/cumulative/header logic).
- `scale_frequencies` rewritten from O(N×M) decrement loop to O(N log N) proportional reduction.
- `build_cumulative` now signals all-zero tables via empty result instead of silent fallback.
- Centralized constants (`SYMBOL_LIMIT`, `EOF_SYMBOL`, magic bytes, size limits) in `constants.hpp`.
- VitePress documentation pruned of Go/Rust content; bilingual (en/zh) structure retained.
- Issue templates pruned of Go/Rust/OpenSpec/cross-language references; Language scope reduced to C++17 / Python scripts / Docs (feature template adds CI).
- `CMakeLists.txt` with static library target, 4 algorithm executables, and CTest integration.
- `algorithms/shared/cpp/include/compresskit/constants.hpp` for shared constants.
- Shared `count_frequencies`, `scale_frequencies`, `build_cumulative` utilities in `frequency_table.hpp`.
- Architecture Decision Records 0001, 0003, 0004, 0005 under `docs/adr/` (validation metadata module shape, range coder corpus cap policy, RLE buffered streaming stance, range C++ bench mode migration path).
- `tests/metadata.py` validation metadata module (C++17-only: `LANGUAGE_ORDER = ("cpp",)`).
- Merged remote `master` (PR #9 architecture-deepening) into the C++17-only tree.
- Dropped ADR 0002 (cross-language semantic error alignment) and `docs/architecture/contract-inventory.md`: both depend on the removed Go/Rust implementations.
- `CONTEXT.md` "参考资料" (References) section rebuilt: keeps ADR 0001/0003/0004/0005 references, drops the 0002 link and the deleted `openspec/` spec links.
- `docs/en/architecture/index.md` CLI section kept on the C++17-only single-binary contract (`./build/<algo>_cpp`); dropped multi-language `--lang` and `<algo>_<lang>` invocations.
- Extracted in-memory little-endian serialization helpers (`write_u32_le`, `write_magic`, `write_frequency_header`, `read_frequency_header`) to new `compresskit/serialization.hpp`. Eliminates duplicated `push_u32` lambdas and `read_frequencies` functions across huffman/arithmetic/range (~90 lines removed).
- Extracted `BitWriter` and `BitReader` to new `compresskit/bit_io.hpp`. Eliminates duplicated `BitWriter` class across huffman/arithmetic (~28 lines removed).
- RLE encode now uses shared `write_magic` and `write_u32_le` instead of inline copies.
- Removed unused `name` parameter from `compresskit::cli::run` (and all 4 algorithm call sites).
- Renamed `kInitialEncodeOverhead` (Google-style) to `INITIAL_ENCODE_OVERHEAD` (matches codebase UPPER_CASE convention for local constants).
- Added shared binary-format constants to `constants.hpp`: `MAGIC_SIZE`, `U32_SIZE`, `BITS_PER_BYTE`, `BYTE_VALUES`, `RLE_PAIR_SIZE`, `STREAM_READ_BUFFER_SIZE`, `INITIAL_ENCODE_OVERHEAD`, `INITIAL_DECODE_OVERHEAD`.
- Replaced bare `4` (magic size) with `MAGIC_SIZE` across huffman/arithmetic/range/rle decode paths and `serialization.hpp`.
- Replaced bare `4` (uint32 LE size) with `U32_SIZE` in `serialization.hpp`, `frequency_table.cpp`, and rle inline count decode.
- Replaced bare `8`/`7` (bits per byte) with `BITS_PER_BYTE` in `bit_io.hpp`, huffman decode table, and `buffer_api.cpp` encode-limit calculation.
- Replaced bare `256` (byte value count) with `BYTE_VALUES` in huffman 8-bit decode table.
- Replaced bare `5` (RLE count+value pair size) with `RLE_PAIR_SIZE` in rle decode.
- Replaced bare `32 * 1024` (stream read buffer) with `STREAM_READ_BUFFER_SIZE` in `frequency_table.cpp`.
- Promoted `INITIAL_ENCODE_OVERHEAD` from `buffer_api.cpp` anonymous namespace to `constants.hpp`; replaced bare `2048` reserve overhead in huffman/range encode.
- Added `INITIAL_DECODE_OVERHEAD` constant; replaced bare `1024` decode buffer overhead in `buffer_api.cpp`.
- Replaced `0xFFFFFFFFu` with `UINT32_MAX` in range coder encoder/decoder state.
- Named `STATE_BYTES = 4` (range coder 32-bit state width) in range main.
- Named `MAX_TREE_NODES = 2 * SYMBOL_LIMIT` (huffman worst-case node count) in huffman main.
- Named `EXPECTED_ARGC = 4` (program + mode + input + output) in `cli_launcher.cpp`.
- Refactored `write_u32_le`/`write_magic`/`read_frequency_header` and rle count decode from unrolled byte shifts to `U32_SIZE`/`MAGIC_SIZE` loops.
- Converted stream-based `write_u32_le`/`read_u32_le` in `frequency_table.cpp` from unrolled byte shifts to `U32_SIZE` loops (matches `serialization.hpp` style).
- Replaced bare `<<8`/`>>24` in range coder byte-renormalisation with `<< BITS_PER_BYTE`/`>> TOP_BYTE_SHIFT`; added local `TOP_BYTE_SHIFT = (STATE_BYTES - 1) * BITS_PER_BYTE` constant.
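
For readers unfamiliar with the "unrolled byte shifts to `U32_SIZE` loops" change, the following is a minimal sketch of what a loop-based little-endian helper could look like. The constant values and both function signatures are assumptions made for illustration (in particular, `read_u32_le` is shown here reading from a raw byte pointer rather than a stream); the actual declarations in `constants.hpp`, `serialization.hpp`, and `frequency_table.cpp` may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed values; they mirror the constant names in this changelog,
// not the real contents of constants.hpp.
constexpr std::size_t U32_SIZE = 4;
constexpr std::size_t BITS_PER_BYTE = 8;
constexpr std::size_t STATE_BYTES = 4;
constexpr std::size_t TOP_BYTE_SHIFT = (STATE_BYTES - 1) * BITS_PER_BYTE;

// Hypothetical signature: append `value` to `out`, least significant byte first.
inline void write_u32_le(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (std::size_t i = 0; i < U32_SIZE; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (i * BITS_PER_BYTE)) & 0xFFu));
    }
}

// Hypothetical signature: rebuild a 32-bit value from U32_SIZE little-endian bytes.
inline std::uint32_t read_u32_le(const std::uint8_t* data) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < U32_SIZE; ++i) {
        value |= static_cast<std::uint32_t>(data[i]) << (i * BITS_PER_BYTE);
    }
    return value;
}
```

With these assumed values, `TOP_BYTE_SHIFT` evaluates to 24, which is the bare `>>24` the named constant replaces.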
1.0.0 - 2026-01-07
- Huffman Coding, Arithmetic Coding, Range Coder, and Run-Length Encoding implementations.
- C++17, Go, and Rust command-line tools for all four algorithms.
- Unified CLI shape: `<binary> <encode|decode> <input> <output>`.
- Cross-language file compatibility goals for educational verification.
- Test data generation scripts and benchmark scripts.
- VitePress documentation site with English and Chinese content.
- MIT license, contribution guide, code of conduct, security policy, issue templates, and pull request template.
- Documented maximum input size of 4 GiB.
- Documented maximum decoded output size of 1 GiB for decompression-bomb protection.