This document describes the public integration APIs exposed by Kanzi C++:
- The C ABI in `src/api/Compressor.hpp` and `src/api/Decompressor.hpp`
- The C++ stream API in `src/io/CompressedOutputStream.hpp` and `src/io/CompressedInputStream.hpp`
- Shared C++ extension interfaces for transforms, entropy codecs, bitstreams, contexts, events, and errors
- The Python `ctypes` wrapper in `src/api/kanzi.py`
Kanzi is a compressor, not an archive format. The APIs below operate on byte streams or blocks; callers remain responsible for file naming, metadata, multi-file packaging, and application-level lifecycle decisions.
- Compatibility
- Common Concepts
- C API
- C++ Stream API
- Context Keys
- Transforms
- Entropy Codecs
- Bitstream API
- Events and Listeners
- Errors and Exceptions
- Python Wrapper
- Integration Notes
The core library is written in portable C++ and supports multiple C++ standards. Some APIs expose extra functionality when the compiler supports it:
- C++98 and later: core C++ stream and codec APIs
- C++11 and later: move constructors/assignments in several value types
- C++17 and later: `Context` stores values internally with `std::variant`
- `CONCURRENCY_ENABLED`: multi-job compression/decompression and `ThreadPool` integration
- MSVC 2008: some seek helpers are unavailable because the bitstream classes do not inherit `Seekable` on that compiler path
The C API is exported with extern "C" and uses opaque contexts plus FILE*.
Most C++ APIs live in namespace kanzi.
Common integer aliases are defined in src/types.hpp:
| Alias | Meaning |
|---|---|
| `byte` | Byte type. `std::byte` when enabled with C++17 and `KNZ_BYTE_AS_STD_BYTE`, otherwise `uint8_t`. |
| `int8`, `uint8`, `int16`, `uint16`, `int32`, `uint32` | Fixed-width integer aliases. |
| `uint` | `uint32_t`. |
| `int64`, `uint64` | 64-bit signed and unsigned aliases. |
Names are case-insensitive when parsed through factories or the C API.
Supported transform names:
| Name | Description |
|---|---|
| `NONE` | No transform. |
| `PACK` | Alias packing transform. |
| `DNA` | DNA-specialized alias packing. |
| `BWT` | Burrows-Wheeler transform block codec. |
| `BWTS` | Burrows-Wheeler Scott transform. |
| `LZ` | Default Lempel-Ziv transform. |
| `LZX` | Lempel-Ziv extra variant. |
| `LZP` | Lempel-Ziv predictive transform. |
| `ROLZ` | Reduced-offset Lempel-Ziv transform. |
| `ROLZX` | Extra ROLZ variant. |
| `RLT` | Run-length transform. |
| `ZRLT` | Zero run-length transform. |
| `MTFT` | Move-to-front family transform. |
| `RANK` | Rank transform. |
| `SRT` | Sorted-rank transform. |
| `TEXT` | Text/dictionary transform. |
| `MM` | Multimedia/FSD transform. |
| `EXE` | Executable-code transform. |
| `UTF` | UTF transform. |
Transforms can be chained with +, for example BWT+RANK+ZRLT. Up to 8 transform tokens are accepted.
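The chaining rule above can be illustrated with a small tokenizer. This is a minimal sketch, not Kanzi's parser: `splitTransformChain` is a hypothetical helper that only splits on `+`, uppercases each token, and enforces the 8-token limit. It does not check that a token is one of the supported names.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Kanzi): splits a chain such as
// "bwt+rank+zrlt" into uppercase tokens and rejects more than 8 tokens.
std::vector<std::string> splitTransformChain(const std::string& chain) {
    std::vector<std::string> tokens;
    std::string current;

    for (std::string::size_type i = 0; i <= chain.size(); i++) {
        if (i == chain.size() || chain[i] == '+') {
            // An empty token means a leading, trailing, or doubled '+'.
            if (current.empty())
                throw std::invalid_argument("empty transform token");

            tokens.push_back(current);
            current.clear();
        } else {
            const unsigned char c = static_cast<unsigned char>(chain[i]);
            current += static_cast<char>(std::toupper(c));
        }
    }

    if (tokens.size() > 8)
        throw std::invalid_argument("more than 8 transform tokens");

    return tokens;
}
```

A caller could run this kind of check before handing the name to a factory or to the C API, which perform their own validation.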
Supported entropy names:
| Name | Description |
|---|---|
| `NONE` | No entropy coding. |
| `HUFFMAN` | Huffman coding. |
| `ANS0` | ANS range coding, order 0. |
| `ANS1` | ANS range coding, order 1. |
| `RANGE` | Range coding. |
| `FPAQ` | Fast PAQ-style bit coding. |
| `CM` | Context model. |
| `TPAQ` | Tangelo PAQ. |
| `TPAQX` | Tangelo PAQ extra. |
Kanzi compresses data in blocks.
General stream constraints:
| Parameter | Constraint |
|---|---|
| Block size | 1024 to 1024 * 1024 * 1024 bytes |
| Block size alignment | Multiple of 16 for compression streams |
| Checksum size | 0, 32, or 64 |
| Jobs | 1..64 when concurrency is enabled, exactly 1 otherwise |
The C API rounds blockSize up to the next multiple of 16 during initialization.
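The constraints in the table can be checked before a stream is created. The helpers below are hypothetical and only restate the documented limits. The bit-mask rounding is one common way to reach the next multiple of 16, not necessarily the expression Kanzi uses.

```cpp
#include <cstddef>

// Documented block size limits, in bytes.
const std::size_t MIN_BLOCK_SIZE = 1024;
const std::size_t MAX_BLOCK_SIZE = 1024u * 1024u * 1024u;

// Rounds up to the next multiple of 16; aligned values are unchanged.
std::size_t roundUpTo16(std::size_t blockSize) {
    return (blockSize + 15) & ~static_cast<std::size_t>(15);
}

// True when the size falls inside the documented range.
bool isValidBlockSize(std::size_t blockSize) {
    return blockSize >= MIN_BLOCK_SIZE && blockSize <= MAX_BLOCK_SIZE;
}

// True for the three documented checksum sizes.
bool isValidChecksum(int checksum) {
    return checksum == 0 || checksum == 32 || checksum == 64;
}
```

For example, a requested block size of 1025 bytes would become 1040 after rounding.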
Default streams are headered. The compressed stream contains the transform, entropy codec, checksum size, block size, and bitstream version.
Headerless mode omits this metadata. Callers must provide matching parameters to both the compressor and decompressor.
Use headerless mode only when an outer protocol already carries the Kanzi stream parameters.
Headers:
```c
#include "api/Compressor.hpp"
#include "api/Decompressor.hpp"
```
The C API returns 0 on success and a Kanzi error code on failure. Error codes are listed in Errors and Exceptions.
```c
struct cContext;

struct cData {
    char transform[64];
    char entropy[16];
    size_t blockSize;
    unsigned int jobs;
    int checksum;
    int headerless;
};
```
`cContext` is opaque. Do not allocate or inspect it directly.
cData fields:
| Field | Direction | Description |
|---|---|---|
| `transform` | in/out | Transform name or chain. Validated and rewritten to canonical uppercase form by `initCompressor()`. |
| `entropy` | in/out | Entropy codec name. Validated and rewritten to canonical uppercase form by `initCompressor()`. |
| `blockSize` | in/out | Maximum input bytes accepted per `compress()` call. Rounded up to a multiple of 16 by `initCompressor()`. |
| `jobs` | in | Maximum concurrent jobs. |
| `checksum` | in | 0, 32, or 64. |
| `headerless` | in | Non-zero writes a headerless bitstream. |
```c
unsigned int getCompressorVersion(void);
```
Returns the compressor C API version packed as:
```c
(major << 16) | (minor << 8) | patch
```
Current header constants are KANZI_COMP_VERSION_MAJOR, KANZI_COMP_VERSION_MINOR, and KANZI_COMP_VERSION_PATCH.
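The packed layout can be split back into its parts with shifts and masks. The sketch below uses hypothetical names (`ApiVersion`, `packVersion`, `unpackVersion`) and assumes that the minor and patch numbers each fit in 8 bits, which the packing formula implies but the text does not state.

```cpp
// Hypothetical holder for the three version components.
struct ApiVersion {
    unsigned int majorVer;
    unsigned int minorVer;
    unsigned int patchVer;
};

// Packs the components using the documented formula.
unsigned int packVersion(unsigned int majorVer, unsigned int minorVer, unsigned int patchVer) {
    return (majorVer << 16) | (minorVer << 8) | patchVer;
}

// Reverses packVersion(); assumes minor and patch are below 256.
ApiVersion unpackVersion(unsigned int packed) {
    ApiVersion v;
    v.majorVer = (packed >> 16) & 0xFFFFu;
    v.minorVer = (packed >> 8) & 0xFFu;
    v.patchVer = packed & 0xFFu;
    return v;
}
```

The same decoding applies to any value that follows this packing, so a caller could compare the unpacked fields against the header constants at startup.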
```c
int initCompressor(struct cData* cParam, FILE* dst, struct cContext** ctx);
```
Initializes a compressor writing to `dst`.
Rules:
- `cParam`, `dst`, and `ctx` must be non-null.
- `transform` and `entropy` must be null-terminated inside their fixed arrays.
- `transform` and `entropy` are validated and normalized in place.
- `blockSize` is rounded up to a multiple of 16.
- On success, `*ctx` receives a new compressor context.
- On failure, no compressor context should be used.
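The null-termination rule is easy to violate with an unchecked copy. The sketch below shows one way to fill the fixed arrays safely. `cDataMirror` is a local stand-in that copies the documented `cData` layout so the example compiles on its own, and `copyName` and `fillNames` are hypothetical helpers. Real code would include `api/Compressor.hpp` and use `cData` directly.

```cpp
#include <cstddef>
#include <cstring>

// Local mirror of the documented cData layout (assumption: for illustration only).
struct cDataMirror {
    char transform[64];
    char entropy[16];
    std::size_t blockSize;
    unsigned int jobs;
    int checksum;
    int headerless;
};

// Copies src into a fixed-size array only if it fits with its terminator.
template <std::size_t N>
bool copyName(char (&dst)[N], const char* src) {
    if (src == NULL)
        return false;

    const std::size_t len = std::strlen(src);

    if (len >= N)
        return false; // no room left for the trailing '\0'

    std::memcpy(dst, src, len + 1);
    return true;
}

// Zeroes the struct, then sets both names; returns false if either is too long.
bool fillNames(cDataMirror& p, const char* transform, const char* entropy) {
    std::memset(&p, 0, sizeof(p));
    return copyName(p.transform, transform) && copyName(p.entropy, entropy);
}
```

Rejecting an over-long name up front avoids both a buffer overrun and a truncated name that might still be accepted as a different transform.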
Common return codes:
| Code | Meaning |
|---|---|
| `0` | Success. |
| `ERR_INVALID_PARAM` | Null argument, unterminated string, invalid parameter. |
| `ERR_CREATE_COMPRESSOR` | Could not create the internal compressor stream. |
```c
int compress(struct cContext* ctx,
             const unsigned char* src,
             size_t inSize,
             size_t* outSize);
```
Compresses one input block and writes compressed bytes to the destination `FILE*` provided at initialization.
Rules:
- `ctx` and `outSize` must be non-null.
- `src` may be null only when `inSize == 0`.
- `inSize` must not exceed the initialized `blockSize`.
- `*outSize` is set to the number of bytes written during this call.
- Zero-length input is accepted and reports zero output.
The compressor owns no input memory. The caller may reuse or free src after the call returns.
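Because `inSize` must not exceed the initialized `blockSize`, a larger buffer has to be fed in several calls. The sketch below only plans the offsets and lengths. `Chunk` and `planChunks` are hypothetical names, and the actual `compress()` call is left to the caller.

```cpp
#include <cstddef>
#include <vector>

// One slice of the input, small enough for a single compress() call.
struct Chunk {
    std::size_t offset;
    std::size_t length;
};

// Splits totalSize bytes into consecutive chunks of at most blockSize bytes.
std::vector<Chunk> planChunks(std::size_t totalSize, std::size_t blockSize) {
    std::vector<Chunk> chunks;

    if (blockSize == 0)
        return chunks; // nothing sensible can be planned

    std::size_t offset = 0;
    std::size_t remaining = totalSize;

    while (remaining > 0) {
        Chunk c;
        c.offset = offset;
        c.length = (remaining < blockSize) ? remaining : blockSize;
        chunks.push_back(c);
        offset += c.length;
        remaining -= c.length;
    }

    return chunks;
}
```

Each planned chunk would then be passed as `compress(ctx, data + c.offset, c.length, &written)`, using the rounded `blockSize` left in `cData` after `initCompressor()`.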
```c
int disposeCompressor(struct cContext** ctx, size_t* outSize);
```
Closes the compressor, flushes remaining internal data, releases resources, and sets `*ctx` to null.
Rules:
- `ctx`, `*ctx`, and `outSize` must be non-null.
- `*outSize` receives bytes written during final flush/close.
- Always call this after successful `initCompressor()`.
```c
struct dContext;

struct dData {
    size_t bufferSize;
    unsigned int jobs;
    int headerless;
    char transform[64];
    char entropy[16];
    unsigned int blockSize;
    size_t originalSize;
    int checksum;
    int bsVersion;
};
```
`dContext` is opaque.
dData fields:
| Field | Direction | Required | Description |
|---|---|---|---|
| `bufferSize` | in | Always | Maximum output bytes accepted in one `decompress()` call. Must not exceed 2 GiB. |
| `jobs` | in | Always | Maximum concurrent jobs. |
| `headerless` | in | Always | Non-zero means the compressed stream has no header. |
| `transform` | in/out | Headerless | Transform name or chain. Validated and normalized in headerless mode. |
| `entropy` | in/out | Headerless | Entropy codec name. Validated and normalized in headerless mode. |
| `blockSize` | in/out | Headerless | Original compression block size. Rounded up to a multiple of 16 by `initDecompressor()`. |
| `originalSize` | in | Headerless | Original stream size, when known. |
| `checksum` | in | Headerless | 0, 32, or 64. Must match compression. |
| `bsVersion` | in | Headerless | Bitstream version to decode. |
For headered streams, only bufferSize, jobs, and headerless = 0 are required. Stream metadata is read from the bitstream.
```c
unsigned int getDecompressorVersion(void);
```
Returns the decompressor C API version packed as:
```c
(major << 16) | (minor << 8) | patch
```
Current header constants are KANZI_DECOMP_VERSION_MAJOR, KANZI_DECOMP_VERSION_MINOR, and KANZI_DECOMP_VERSION_PATCH.
```c
int initDecompressor(struct dData* dParam, FILE* src, struct dContext** ctx);
```
Initializes a decompressor reading from `src`.
Rules:
- `dParam`, `src`, and `ctx` must be non-null.
- `bufferSize` must be at most 2 GiB.
- In headerless mode, `transform` and `entropy` must be null-terminated inside their fixed arrays.
- In headerless mode, `transform` and `entropy` are validated and normalized in place.
- On success, `*ctx` receives a new decompressor context.
```c
int decompress(struct dContext* ctx,
               unsigned char* dst,
               size_t* inSize,
               size_t* outSize);
```
Decompresses bytes into `dst`.
Rules:
- `ctx` and `outSize` must be non-null.
- `*outSize` is the destination capacity on input.
- `*outSize` must be less than or equal to the initialized `bufferSize`.
- If input `*outSize == 0`, the call returns success and writes nothing.
- `dst` must be non-null when input `*outSize != 0`.
- On success, `*outSize` is replaced with the number of bytes decoded.
- If `inSize` is non-null, it receives the number of compressed bytes read during the call.
End of stream is reported as a successful call with *outSize == 0.
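The in/out use of `*outSize` and the end-of-stream convention lead to a simple read loop. The sketch below is written against a stand-in callable rather than the real `decompress()`, so it can be compiled without the library. `DecodeFn` and `drain` are hypothetical names, and the callable is assumed to follow the rules listed above.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for decompress(ctx, dst, NULL, &outSize): returns 0 on success
// and replaces *outSize (capacity) with the number of bytes decoded.
typedef std::function<int(unsigned char* dst, std::size_t* outSize)> DecodeFn;

// Calls decode repeatedly until it reports end of stream or an error code.
int drain(const DecodeFn& decode, std::size_t bufferSize, std::vector<unsigned char>& out) {
    if (bufferSize == 0)
        return 0; // a zero capacity would be indistinguishable from end of stream

    std::vector<unsigned char> buf(bufferSize);

    for (;;) {
        std::size_t n = buf.size();         // capacity on input
        const int rc = decode(&buf[0], &n); // bytes decoded on output

        if (rc != 0)
            return rc; // propagate the error code

        if (n == 0)
            return 0; // successful call with zero bytes: end of stream

        out.insert(out.end(), buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(n));
    }
}
```

With the real API, the callable would wrap `decompress(ctx, dst, NULL, outSize)`, and `bufferSize` would match the value given in `dData`.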
```c
int disposeDecompressor(struct dContext** ctx);
```
Closes the decompressor, releases resources, and sets `*ctx` to null.
Rules:
- `ctx` and `*ctx` must be non-null.
- Always call this after successful `initDecompressor()`.
```c
#include <stdio.h>
#include <string.h>
#include "api/Compressor.hpp"
#include "api/Decompressor.hpp"

/* Compresses one buffer into a headered Kanzi file. */
int write_kanzi_file(const char* path, const unsigned char* data, size_t len) {
    FILE* out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    struct cData params;
    memset(&params, 0, sizeof(params));
    strcpy(params.transform, "LZ");
    strcpy(params.entropy, "ANS0");
    params.blockSize = 1 << 20;
    params.jobs = 1;
    params.checksum = 32;
    params.headerless = 0;

    struct cContext* ctx = NULL;
    int rc = initCompressor(&params, out, &ctx);
    if (rc != 0) {
        fclose(out);
        return rc;
    }

    /* len must not exceed params.blockSize; larger inputs need one call per block. */
    size_t written = 0;
    rc = compress(ctx, data, len, &written);

    /* Dispose even if compress() failed so that resources are released. */
    size_t flushed = 0;
    int close_rc = disposeCompressor(&ctx, &flushed);
    fclose(out);
    return (rc != 0) ? rc : close_rc;
}
```
Headers:
```cpp
#include "io/CompressedOutputStream.hpp"
#include "io/CompressedInputStream.hpp"
```
`kanzi::InputStream` is an alias for `std::istream`.
kanzi::OutputStream is an alias for std::ostream.
The compressed streams derive from those standard stream types by using the wrapped stream's rdbuf(). The caller owns the underlying stream and must keep it alive for the lifetime of the Kanzi stream.
```cpp
class kanzi::CompressedOutputStream : public kanzi::OutputStream {
public:
    CompressedOutputStream(OutputStream& os,
                           int jobs = 1,
                           const std::string& entropy = "NONE",
                           const std::string& transform = "NONE",
                           int blockSize = 4 * 1024 * 1024,
                           int checksum = 0,
                           uint64 originalSize = 0,
#ifdef CONCURRENCY_ENABLED
                           ThreadPool* pool = nullptr,
#endif
                           bool headerless = false);

    CompressedOutputStream(OutputStream& os, Context& ctx, bool headerless = false);
    ~CompressedOutputStream();

    bool addListener(Listener<Event>& listener);
    bool removeListener(Listener<Event>& listener);

    std::ostream& write(const char* data, std::streamsize length);
    std::ostream& put(char c);
    std::ostream& flush();
    std::streampos tellp();
    std::ostream& seekp(std::streampos pos);

    void close();
    uint64 getWritten() const;
};
```
Constructor parameters:
| Parameter | Description |
|---|---|
| `os` | Destination stream. Must remain alive. |
| `jobs` | Number of concurrent jobs. |
| `entropy` | Entropy codec name. |
| `transform` | Transform name or chain. |
| `blockSize` | Compression block size in bytes. Must be 1024..1 GiB and a multiple of 16. |
| `checksum` | 0, 32, or 64. |
| `originalSize` | Optional original input size stored in the header when known. |
| `pool` | Optional caller-owned thread pool when concurrency is enabled. |
| `headerless` | If true, omit stream header. |
Context constructor keys are listed in Context Keys.
Methods:
| Method | Description |
|---|---|
| `write(data, length)` | Compresses `length` bytes from `data`. Throws on negative length, closed stream, write failure, or codec failure. |
| `put(c)` | Writes one byte. |
| `flush()` | No-op for Kanzi buffering. Underlying stream flushing remains caller-controlled. |
| `close()` | Finishes all pending blocks, closes the compressed bitstream, and releases internal resources. |
| `getWritten()` | Returns compressed bytes written so far. |
| `addListener(listener)` | Registers an event listener. |
| `removeListener(listener)` | Unregisters an event listener. Returns false if not registered. |
| `tellp()` | Not supported. Throws `std::ios_base::failure`. |
| `seekp(pos)` | Not supported. Throws `std::ios_base::failure`. |
Call close() before destroying or reading the generated compressed data. The destructor also cleans up, but explicit close makes error handling deterministic.
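One way to make the explicit close deterministic is to route it through a helper that turns a thrown exception into a return value. The sketch below is generic over any type with a `close()` method. `closeChecked` is a hypothetical helper, and it assumes that close failures surface as exceptions derived from `std::exception`.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Calls stream.close() and reports failure through the return value.
// On failure, error receives the exception message.
template <class Stream>
bool closeChecked(Stream& stream, std::string& error) {
    try {
        stream.close();
        return true;
    } catch (const std::exception& e) {
        error = e.what();
        return false;
    }
}
```

A caller would invoke `closeChecked(cos, message)` before the compressed stream goes out of scope and treat a `false` result as an incomplete output file.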
```cpp
class kanzi::CompressedInputStream : public kanzi::InputStream {
public:
    CompressedInputStream(InputStream& is,
                          int jobs = 1,
                          const std::string& entropy = "NONE",
                          const std::string& transform = "NONE",
                          int blockSize = 4 * 1024 * 1024,
                          int checksum = 0,
                          uint64 originalSize = 0,
#ifdef CONCURRENCY_ENABLED
                          ThreadPool* pool = nullptr,
#endif
                          bool headerless = false,
                          int bsVersion = BITSTREAM_FORMAT_VERSION);

    CompressedInputStream(InputStream& is, Context& ctx, bool headerless = false);
    ~CompressedInputStream();

    bool addListener(Listener<Event>& listener);
    bool removeListener(Listener<Event>& listener);

    std::istream& read(char* data, std::streamsize length);
    std::streamsize gcount() const;
    int get();
    int peek();

    void close();
    uint64 getRead() const;

#if !defined(_MSC_VER) || _MSC_VER > 1500
    bool seek(int64 bitPos);
    int64 tell();
#endif

    std::streampos tellg();
    std::istream& seekg(std::streampos pos);
    std::istream& putback(char c);
    std::istream& unget();
};
```
Constructor behavior:
- For headered streams (`headerless == false`), metadata is read from the compressed stream. Constructor `entropy`, `transform`, `blockSize`, `checksum`, and `originalSize` are not authoritative after header decoding.
- For headerless streams (`headerless == true`), the caller must provide matching `entropy`, `transform`, `blockSize`, `checksum`, and bitstream version.
Methods:
| Method | Description |
|---|---|
| `read(data, length)` | Decodes up to `length` bytes into `data`. Sets `gcount()`. Throws on negative length, closed stream, invalid stream, checksum failure, or codec failure. |
| `gcount()` | Returns bytes decoded by the last `read()` or `get()`. |
| `get()` | Decodes and returns one byte, or EOF. |
| `peek()` | Returns next byte without consuming it, or EOF. |
| `close()` | Closes the compressed input stream and releases internal resources. |
| `getRead()` | Returns compressed bytes consumed so far. |
| `seek(bitPos)` | Seeks to a bit position. Valid positions are block boundaries. Returns false on invalid/closed stream or failed underlying seek. |
| `tell()` | Returns current bit position from the underlying bitstream. |
| `tellg()` | Not supported. Throws `std::ios_base::failure`. |
| `seekg(pos)` | Not supported. Throws `std::ios_base::failure`. |
| `putback(c)` | Not supported. Sets `badbit` and throws. |
| `unget()` | Not supported. Sets `badbit` and throws. |
```cpp
#include <fstream>
#include "io/CompressedOutputStream.hpp"
#include "io/CompressedInputStream.hpp"

void round_trip(const char* input_path, const char* compressed_path, const char* output_path) {
    {
        // Compress input_path into compressed_path.
        std::ifstream in(input_path, std::ios::binary);
        std::ofstream out(compressed_path, std::ios::binary);
        kanzi::CompressedOutputStream cos(out, 1, "ANS0", "LZ", 1 << 20, 32);
        char buffer[65536];

        while (in.good()) {
            in.read(buffer, sizeof(buffer));
            std::streamsize n = in.gcount();
            if (n > 0)
                cos.write(buffer, n);
        }

        cos.close();
    }
    {
        // Decompress compressed_path into output_path; the header supplies the parameters.
        std::ifstream in(compressed_path, std::ios::binary);
        std::ofstream out(output_path, std::ios::binary);
        kanzi::CompressedInputStream cis(in, 1);
        char buffer[65536];

        while (true) {
            cis.read(buffer, sizeof(buffer));
            std::streamsize n = cis.gcount();
            if (n <= 0)
                break;
            out.write(buffer, n);
        }

        cis.close();
    }
}
```
`kanzi::Context` is a string-keyed parameter map used by stream constructors, factories, transforms, and entropy codecs.
```cpp
class kanzi::Context {
public:
    bool has(const std::string& key) const;
    int getInt(const std::string& key, int defValue = 0) const;
    int64 getLong(const std::string& key, int64 defValue = 0) const;
    std::string getString(const std::string& key,
                          const std::string& defValue = "") const;

    void putInt(const std::string& key, int value);
    void putLong(const std::string& key, int64 value);
    void putString(const std::string& key, const std::string& value);

#ifdef CONCURRENCY_ENABLED
    ThreadPool* getPool() const;
#endif
};
```
Common stream keys:
| Key | Type | Used by | Description |
|---|---|---|---|
| `jobs` | int | Input/output streams, BWT, codecs | Concurrent jobs. |
| `blockSize` | int | Input/output streams, transforms, entropy predictors | Block size in bytes. |
| `checksum` | int | Input/output streams | 0, 32, or 64. |
| `entropy` | string | Input/output streams, factories, transforms | Entropy codec name. |
| `transform` | string | Input/output streams, factories, transforms | Transform name or chain. |
| `bsVersion` | int | Input stream, output stream context, version-sensitive codecs | Bitstream version. Defaults to current version in headerless input mode when omitted. |
| `outputSize` | int64 | Headerless input stream | Optional original decoded size. |
| `size` | int | Some entropy predictors | Current block size hint. |
| `dataType` | int | Transforms | Internal detected data type passed between transforms. |
| `textcodec` | int | `TEXT` transform | Internal text codec selection. |
| `packOnlyDNA` | int | `DNA` transform | Internal flag for DNA-only packing. |
Only the stream keys should be considered stable integration inputs. Other keys are implementation details used by codec pipelines.
Context constructor requirements:
```cpp
kanzi::Context ctx;
ctx.putInt("jobs", 1);
ctx.putString("entropy", "ANS0");
ctx.putString("transform", "LZ");
ctx.putInt("blockSize", 1 << 20);
ctx.putInt("checksum", 32);
kanzi::CompressedOutputStream cos(out, ctx);
```
For headerless input:
```cpp
kanzi::Context ctx;
ctx.putInt("jobs", 1);
ctx.putString("entropy", "ANS0");
ctx.putString("transform", "LZ");
ctx.putInt("blockSize", 1 << 20);
ctx.putInt("checksum", 32);
ctx.putInt("bsVersion", 6);      // optional; defaults to current version
ctx.putLong("outputSize", size); // optional
kanzi::CompressedInputStream cis(in, ctx, true);
```
Header:
```cpp
#include "Transform.hpp"
#include "transform/TransformFactory.hpp"
```
```cpp
template <class T>
class kanzi::Transform {
public:
    virtual bool forward(SliceArray<T>& src, SliceArray<T>& dst, int length) = 0;
    virtual bool inverse(SliceArray<T>& src, SliceArray<T>& dst, int length) = 0;
    virtual int getMaxEncodedLength(int srcLen) const = 0;
    virtual ~Transform();
};
```
Contract:
- `forward()` transforms `length` elements from `src` into `dst`.
- `inverse()` reverses `forward()`.
- Both methods update `SliceArray::_index` on success according to consumed/produced elements.
- Return `false` when the transform elects not to apply or cannot encode/decode the provided data.
- Throw `std::invalid_argument` for invalid slice arguments.
- Implementations must not retain cross-call state that changes output depending on prior blocks.
```cpp
template <class T>
class kanzi::SliceArray {
public:
    T* _array;
    int _length;
    int _index;

    SliceArray(T* arr, int len, int index = 0);
    static bool isValid(const SliceArray& sa);
};
```
`_length` is the total buffer capacity. `_index` is the current offset. `SliceArray` does not own the buffer memory.
```cpp
template <class T>
class kanzi::TransformFactory {
public:
    static uint64 getType(const char* name);
    static uint64 getTypeToken(const char* name);
    static std::string getName(uint64 functionType);
    static TransformSequence<T>* newTransform(Context& ctx, uint64 functionType);
};
```
Methods:
| Method | Description |
|---|---|
| `getType(name)` | Parses a transform chain such as `BWT+RANK+ZRLT` into an encoded `uint64`. Throws on unknown transform or more than 8 transforms. |
| `getTypeToken(name)` | Parses a single transform token into its type id. |
| `getName(functionType)` | Converts an encoded transform chain back to canonical names. |
| `newTransform(ctx, functionType)` | Allocates a `TransformSequence<T>` for the encoded chain. Caller owns the returned pointer. |
Header:
```cpp
#include "transform/TransformSequence.hpp"
```
`TransformSequence<T>` is itself a `Transform<T>` that applies up to 8 transforms in order for `forward()` and in reverse order for `inverse()`. When constructed with `deallocate = true`, it owns and deletes the transform pointers.
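The ordering rule matters because stages generally do not commute. The sketch below models it with plain function pairs instead of Kanzi classes. `Stage`, `applyForward`, and `applyInverse` are hypothetical names used only to show why the inverse pass walks the stages backwards.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Hypothetical stand-in for a transform: two functions that undo each other.
struct Stage {
    std::function<Bytes(const Bytes&)> forward;
    std::function<Bytes(const Bytes&)> inverse;
};

// Applies every stage in declaration order.
Bytes applyForward(const std::vector<Stage>& stages, Bytes data) {
    for (std::size_t i = 0; i < stages.size(); i++)
        data = stages[i].forward(data);

    return data;
}

// Undoes the stages from last to first, so each inverse sees the
// output of the stage it is meant to reverse.
Bytes applyInverse(const std::vector<Stage>& stages, Bytes data) {
    for (std::size_t i = stages.size(); i > 0; i--)
        data = stages[i - 1].inverse(data);

    return data;
}
```

Running the inverses in forward order would feed each one data produced by a different stage, which in general does not restore the input.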
Headers:
```cpp
#include "EntropyEncoder.hpp"
#include "EntropyDecoder.hpp"
#include "entropy/EntropyEncoderFactory.hpp"
#include "entropy/EntropyDecoderFactory.hpp"
```
```cpp
class kanzi::EntropyEncoder {
public:
    virtual int encode(const byte block[], uint blkptr, uint len) = 0;
    virtual OutputBitStream& getBitStream() const = 0;
    virtual void dispose() = 0;
    virtual ~EntropyEncoder();
};
```
`encode()` writes `len` bytes from `block + blkptr` and returns the number of bytes processed. Call `dispose()` before destroying or abandoning an encoder.
```cpp
class kanzi::EntropyDecoder {
public:
    virtual int decode(byte block[], uint blkptr, uint len) = 0;
    virtual InputBitStream& getBitStream() const = 0;
    virtual void dispose() = 0;
    virtual ~EntropyDecoder();
};
```
`decode()` writes `len` decoded bytes to `block + blkptr` and returns the number of bytes processed. Call `dispose()` before destroying or abandoning a decoder.
```cpp
class kanzi::EntropyEncoderFactory {
public:
    static EntropyEncoder* newEncoder(OutputBitStream& obs, Context& ctx, short entropyType);
    static const char* getName(short entropyType);
    static short getType(const char* name);
};

class kanzi::EntropyDecoderFactory {
public:
    static EntropyDecoder* newDecoder(InputBitStream& ibs, Context& ctx, short entropyType);
    static const char* getName(short entropyType);
    static short getType(const char* name);
};
```
The caller owns codecs returned by `newEncoder()` and `newDecoder()`.
Factory type constants:
| Constant | Name |
|---|---|
| `NONE_TYPE` | `NONE` |
| `HUFFMAN_TYPE` | `HUFFMAN` |
| `FPAQ_TYPE` | `FPAQ` |
| `RANGE_TYPE` | `RANGE` |
| `ANS0_TYPE` | `ANS0` |
| `ANS1_TYPE` | `ANS1` |
| `CM_TYPE` | `CM` |
| `TPAQ_TYPE` | `TPAQ` |
| `TPAQX_TYPE` | `TPAQX` |
Headers:
```cpp
#include "InputBitStream.hpp"
#include "OutputBitStream.hpp"
#include "bitstream/DefaultInputBitStream.hpp"
#include "bitstream/DefaultOutputBitStream.hpp"
```
```cpp
class kanzi::InputBitStream {
public:
    virtual int readBit() = 0;
    virtual uint64 readBits(uint length) = 0;
    virtual uint readBits(byte bits[], uint length) = 0;
    virtual void close() = 0;
    virtual uint64 read() const = 0;
    virtual bool hasMoreToRead() = 0;
    virtual ~InputBitStream();
};
```
`readBits(uint length)` accepts lengths in [1..64].
```cpp
class kanzi::OutputBitStream {
public:
    virtual void writeBit(int bit) = 0;
    virtual uint writeBits(uint64 bits, uint length) = 0;
    virtual uint writeBits(const byte bits[], uint length) = 0;
    virtual void close() = 0;
    virtual uint64 written() const = 0;
    virtual ~OutputBitStream();
};
```
`writeBits(uint64 bits, uint length)` accepts lengths in [1..64]; invalid lengths return 0 in the default implementation.
```cpp
class kanzi::DefaultInputBitStream : public InputBitStream, public Seekable {
public:
    DefaultInputBitStream(InputStream& is, uint bufferSize = 65536);
    int readBit();
    uint64 readBits(uint length);
    uint readBits(byte bits[], uint count);
    void close();
    uint64 read() const;
    bool hasMoreToRead();
    bool isClosed() const;
    int64 tell();
    bool seek(int64 pos);
};

class kanzi::DefaultOutputBitStream : public OutputBitStream, public Seekable {
public:
    DefaultOutputBitStream(OutputStream& os, uint bufferSize = 65536);
    void writeBit(int bit);
    uint writeBits(uint64 bits, uint length);
    uint writeBits(const byte bits[], uint length);
    void close();
    uint64 written() const;
    bool isClosed() const;
    int64 tell();
    bool seek(int64 pos);
};
```
Default bitstream buffer sizes must be:
- At least 1024 bytes
- At most 536,870,912 bytes
- A multiple of 8
DefaultInputBitStream::seek() accepts any non-negative bit position and consumes partial-byte bits after seeking. DefaultOutputBitStream::seek() only accepts byte-aligned bit positions.
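The buffer size limits and the byte-alignment rule for output seeks can be checked before constructing or seeking a bitstream. The helpers below are hypothetical and only encode the constraints stated above. How the real classes react to an invalid value is not covered here.

```cpp
// True when bufferSize satisfies the three documented constraints.
bool isValidBitStreamBufferSize(unsigned int bufferSize) {
    return bufferSize >= 1024u &&
           bufferSize <= 536870912u &&
           (bufferSize % 8u) == 0;
}

// True when a bit position is non-negative and falls on a byte boundary,
// which is what DefaultOutputBitStream::seek() is documented to require.
bool isByteAlignedBitPos(long long bitPos) {
    return bitPos >= 0 && (bitPos % 8) == 0;
}

// Converts a byte offset into the bit position expected by seek().
long long byteOffsetToBitPos(long long byteOffset) {
    return byteOffset * 8;
}
```

Positions are expressed in bits, so a byte offset taken from an index has to be multiplied by 8 before it is passed to `seek()`.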
Headers:
```cpp
#include "Event.hpp"
#include "Listener.hpp"
```
Streams can notify caller-provided listeners during compression and decompression.
```cpp
template <class T>
class kanzi::Listener {
public:
    virtual void processEvent(const T& evt) = 0;
    virtual ~Listener();
};
```
Register with:
```cpp
cos.addListener(listener);
cis.addListener(listener);
```
```cpp
class kanzi::Event {
public:
    enum Type {
        COMPRESSION_START,
        COMPRESSION_END,
        BEFORE_TRANSFORM,
        AFTER_TRANSFORM,
        BEFORE_ENTROPY,
        AFTER_ENTROPY,
        DECOMPRESSION_START,
        DECOMPRESSION_END,
        AFTER_HEADER_DECODING,
        BLOCK_INFO
    };

    enum HashType {
        NO_HASH,
        SIZE_32,
        SIZE_64
    };

    struct HeaderInfo {
        std::string inputName;
        int bsVersion;
        int checksumSize;
        int blockSize;
        std::string entropyType;
        std::string transformType;
        int64 originalSize;
        int64 fileSize;
    };

    int getId() const;
    int64 getSize() const;
    Type getType() const;
    WallTimer::TimeData getTime() const;
    uint64 getHash() const;
    int64 getOffset() const;
    HashType getHashType() const;
    HeaderInfo* getInfo() const;
    std::string toString() const;
    std::string getTypeAsString() const;
};
```
Common event data:
| Getter | Meaning |
|---|---|
| `getType()` | Event kind. |
| `getId()` | Block id or event id, depending on type. |
| `getSize()` | Block size or stage size. |
| `getHash()` | Block checksum when enabled, otherwise 0. |
| `getHashType()` | No hash, 32-bit hash, or 64-bit hash. |
| `getOffset()` | Compressed block offset when available. |
| `getInfo()` | Header metadata for `AFTER_HEADER_DECODING`. |
Minimal listener:
```cpp
class MyListener : public kanzi::Listener<kanzi::Event> {
public:
    void processEvent(const kanzi::Event& evt) {
        std::cout << evt.toString() << std::endl;
    }
};
```
`src/Error.hpp` defines:
| Code | Name |
|---|---|
| 1 | ERR_MISSING_PARAM |
| 2 | ERR_BLOCK_SIZE |
| 3 | ERR_INVALID_CODEC |
| 4 | ERR_CREATE_COMPRESSOR |
| 5 | ERR_CREATE_DECOMPRESSOR |
| 6 | ERR_OUTPUT_IS_DIR |
| 7 | ERR_OVERWRITE_FILE |
| 8 | ERR_CREATE_FILE |
| 9 | ERR_CREATE_BITSTREAM |
| 10 | ERR_OPEN_FILE |
| 11 | ERR_READ_FILE |
| 12 | ERR_WRITE_FILE |
| 13 | ERR_PROCESS_BLOCK |
| 14 | ERR_CREATE_CODEC |
| 15 | ERR_INVALID_FILE |
| 16 | ERR_STREAM_VERSION |
| 17 | ERR_CREATE_STREAM |
| 18 | ERR_INVALID_PARAM |
| 19 | ERR_CRC_CHECK |
| 20 | ERR_RESERVED_NAME |
| 127 | ERR_UNKNOWN |
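For logging, the numeric codes in the table can be mapped back to their names. `errorName` below is a hypothetical helper built only from the table above. The labels for 0 and for unlisted values are placeholders chosen for this sketch, not Kanzi identifiers.

```cpp
#include <string>

// Maps a Kanzi error code to the name listed in the table above.
std::string errorName(int code) {
    static const char* const NAMES[] = {
        "(success)",               // 0 is the C API success value
        "ERR_MISSING_PARAM",       // 1
        "ERR_BLOCK_SIZE",          // 2
        "ERR_INVALID_CODEC",       // 3
        "ERR_CREATE_COMPRESSOR",   // 4
        "ERR_CREATE_DECOMPRESSOR", // 5
        "ERR_OUTPUT_IS_DIR",       // 6
        "ERR_OVERWRITE_FILE",      // 7
        "ERR_CREATE_FILE",         // 8
        "ERR_CREATE_BITSTREAM",    // 9
        "ERR_OPEN_FILE",           // 10
        "ERR_READ_FILE",           // 11
        "ERR_WRITE_FILE",          // 12
        "ERR_PROCESS_BLOCK",       // 13
        "ERR_CREATE_CODEC",        // 14
        "ERR_INVALID_FILE",        // 15
        "ERR_STREAM_VERSION",      // 16
        "ERR_CREATE_STREAM",       // 17
        "ERR_INVALID_PARAM",       // 18
        "ERR_CRC_CHECK",           // 19
        "ERR_RESERVED_NAME"        // 20
    };

    if (code >= 0 && code <= 20)
        return NAMES[code];

    if (code == 127)
        return "ERR_UNKNOWN";

    return "(unrecognized)";
}
```

In real code the enumerators from `src/Error.hpp` would be preferable to literal numbers, since the header is the authoritative list.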
Header:
```cpp
#include "io/IOException.hpp"
```
```cpp
class kanzi::IOException : public std::runtime_error {
public:
    IOException(const std::string& msg);
    IOException(const std::string& msg, int error);
    int error() const;
};
```
`error()` returns a value from `kanzi::Error::ErrorCode`.
Header:
```cpp
#include "BitStreamException.hpp"
```
```cpp
class kanzi::BitStreamException : public std::runtime_error {
public:
    enum BitStreamStatus {
        UNDEFINED = 0,
        INPUT_OUTPUT = 1,
        END_OF_STREAM = 2,
        INVALID_STREAM = 3,
        STREAM_CLOSED = 4
    };

    BitStreamException(const std::string& msg);
    BitStreamException(const std::string& msg, int code);
    int error() const;
};
```
Files:
- `src/api/kanzi_c_api.py`: low-level `ctypes` declarations
- `src/api/kanzi.py`: higher-level Python classes
The wrapper expects the Kanzi shared library to be loadable by name:
| Platform | Library name |
|---|---|
| Windows | kanzi.dll |
| macOS | libkanzi.dylib |
| Linux/Unix | libkanzi.so |
Ensure the shared library is in the dynamic loader path before importing.
```python
class KanziError(RuntimeError):
    pass
```
Raised when a wrapped C function returns a non-zero Kanzi error code.
```python
class Compressor:
    def __init__(
        self,
        dst_path,
        transform=b"LZ",
        entropy=b"Huffman",
        block_size=1 << 20,
        jobs=1,
        checksum=0,
        headerless=0,
    )
    def compress(self, data: bytes) -> int
    def close(self) -> int
```
Behavior:
- Opens `dst_path` for binary writing.
- Initializes the C compressor.
- `compress(data)` writes one block and returns bytes written by that call.
- `close()` disposes the C compressor, closes the file, and returns bytes flushed during close.
- Supports context-manager use.
Example:
```python
from kanzi import Compressor

with Compressor("sample.knz", transform=b"LZ", entropy=b"ANS0", checksum=32) as c:
    c.compress(b"hello kanzi")
```
```python
class Decompressor:
    def __init__(
        self,
        src_path,
        buffer_size,
        jobs=1,
        headerless=0,
        **headerless_params,
    )
    def decompress_block(self, max_output: int) -> bytes
    def close(self)
```
Behavior:
- Opens `src_path` for binary reading.
- Initializes the C decompressor.
- `decompress_block(max_output)` returns up to `max_output` decoded bytes.
- An empty result indicates end of stream.
- Supports context-manager use.
Headerless parameters, when `headerless=1`:
| Python keyword | C field |
|---|---|
| `transform` | `dData.transform` |
| `entropy` | `dData.entropy` |
| `blockSize` | `dData.blockSize` |
| `originalSize` | `dData.originalSize` |
| `checksum` | `dData.checksum` |
| `bsVersion` | `dData.bsVersion` |
Example:
```python
from kanzi import Decompressor

out = bytearray()
with Decompressor("sample.knz", buffer_size=1 << 20) as d:
    while True:
        block = d.decompress_block(1 << 20)
        if not block:
            break
        out.extend(block)
```
- Prefer the C API for ABI-stable dynamic-library integration.
- Prefer the C++ stream API when embedding directly in C++ applications.
- Keep the underlying `FILE*`, `std::istream`, or `std::ostream` open while the Kanzi context/stream exists.
- Always dispose or close compressors to flush final bytes.
- Use headered streams unless another protocol stores Kanzi parameters.
- Match headerless decompression parameters exactly to compression parameters.
- Do not share one stream instance concurrently between threads.
- Transform and entropy objects returned by factories are heap-allocated; delete them when no longer needed unless ownership is transferred to `TransformSequence`.
- Check return codes in the C API and catch `kanzi::IOException`, `kanzi::BitStreamException`, `std::invalid_argument`, and `std::ios_base::failure` in C++ integrations.