wolfCrypt on TI C2000 C28x (LAUNCHXL-F28P55X) by dgarske · Pull Request #10724 · wolfSSL/wolfssl

dgarske · 2026-06-18T00:15:18Z

Summary

Adds WOLFSSL_WIDE_BYTE support so wolfCrypt builds and runs correctly on word-addressed targets where CHAR_BIT != 8 - specifically the TI C2000 C28x DSP family, where a C char/unsigned char (wolfSSL's byte) is 16 bits and is the smallest addressable unit. All changes are gated and are a no-op on normal 8-bit-byte targets.

The work was validated end-to-end on a TI LAUNCHXL-F28P55X (TMS320F28P550SJ, C28x, 150 MHz) using the bare-metal example added in the companion wolfssl-examples PR. Every algorithm below passes known-answer tests on hardware, and the standard host wolfcrypt_test continues to pass (no 8-bit regression).

Validated algorithms (on C28x hardware)

SHA-224/256, SHA-384/512, SHA-512/224, SHA-512/256
SHA3-224/256/384/512, SHAKE128/256 (with a 32-bit split Keccak permutation for WC_16BIT_CPU that emits native instructions instead of compiler 64-bit helper calls - ~53% faster SHAKE/SHA3 on this target)
ML-DSA-87 (Dilithium) verify and full keygen/sign/verify; ML-KEM-768 (FIPS 203)
AES-128/192/256 CBC/CTR/CFB/GCM; AES-CMAC, AES-CCM, AES-GMAC
HMAC + HKDF; ChaCha20-Poly1305; Poly1305
X25519 + Ed25519; ECDSA + ECDH (SECP256R1, SP math)
RSA-2048 PKCS#1 v1.5 verify (SP math)

What the `CHAR_BIT != 8` fixes address

All fixes are gated behind WOLFSSL_WIDE_BYTE (auto-enabled when CHAR_BIT != 8 or when a known 16-bit-char TI toolchain macro is defined), and each is a no-op on 8-bit targets:

Byte/word aliasing. Serializing a word32/word64 by casting to byte* moves addressable cells, not octets. These casts are replaced with explicit shift-based octet I/O via shared helpers in misc.c (WordsFromBytesBE32/BytesFromWordsBE32, BytesFromWordsLE32, the 64-bit variants, and octet-correct readUnalignedWord32/readUnalignedWord64). In sp_int.c, sp_read_unsigned_bin now uses an endian- and CHAR_BIT-agnostic shift loop for its leftover bytes (a 3-byte RSA exponent was previously loaded as 1 instead of 65537).
(byte)x does not truncate to an octet; on this target it keeps all 16 bits. Such values are now masked with WC_OCTET(x) = (byte)((x) & 0xFF), which is used across the ML-KEM/ML-DSA encoders, the SP *_to_bin serializers, AES GETBYTE, base64, and the DRBG.
Integer promotion. 1U << n is 16-bit on C28x (use 1UL); a bit width written as sizeof(t) * 8 is wrong when CHAR_BIT != 8 (use CHAR_BIT * sizeof(t)); byte operands promote to a 16-bit int.
sizeof counts cells, not octets. For example, CHACHA_CHUNK_BYTES must be 16 * 4, not 16 * sizeof(word32), which evaluates to 32 on C28x, halving the ChaCha block and desyncing the counter.
xorbuf word stride. A mismatch between WOLFSSL_WORD_SIZE_LOG2 and sizeof(word) left half of each buffer un-XORed on a 16-bit-cell target; this is corrected for the WC_16BIT_CPU word16 path.
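
As a rough illustration of the first two items, the sketch below stores and loads integers one octet per cell using shifts and a mask instead of pointer casts. It is not the code from misc.c or sp_int.c: `cell_t`, `OCTET`, `store_be32`, `load_be`, and `octet_io_selfcheck` are hypothetical names, and `uint16_t` is used only to simulate a 16-bit `byte` on an ordinary 8-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for wolfSSL's `byte` on a 16-bit-char target.
 * uint16_t lets an 8-bit host simulate a cell wider than an octet. */
typedef uint16_t cell_t;

/* Same shape as the WC_OCTET(x) mask quoted above, applied to cell_t. */
#define OCTET(x) ((cell_t)((x) & 0xFF))

/* Store a 32-bit value as four big-endian octets, one octet per cell. */
static void store_be32(cell_t out[4], uint32_t v)
{
    out[0] = OCTET(v >> 24);
    out[1] = OCTET(v >> 16);
    out[2] = OCTET(v >> 8);
    out[3] = OCTET(v);
}

/* Load a big-endian unsigned integer of n octets (n <= 4) with a shift
 * loop. Each cell is masked so stray upper bits cannot leak in. */
static uint32_t load_be(const cell_t* in, size_t n)
{
    uint32_t v = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        v = (v << 8) | (uint32_t)(in[i] & 0xFF);
    }
    return v;
}

/* Small self-check; returns 0 when every assertion holds. */
int octet_io_selfcheck(void)
{
    cell_t buf[4];
    /* Octets 01 00 01 with junk in the upper half of two cells. */
    const cell_t e[3] = { 0xAB01, 0x0000, 0xCD01 };

    store_be32(buf, 0x12345678UL);
    assert(buf[0] == 0x12 && buf[1] == 0x34);
    assert(buf[2] == 0x56 && buf[3] == 0x78);
    assert(load_be(buf, 4) == 0x12345678UL);
    assert(load_be(e, 3) == 65537UL);   /* RSA exponent 65537 */
    assert(OCTET(0x1234) == 0x34);      /* a plain cast would keep 0x1234 */
    return 0;
}
```

Because every access goes through a shift and a mask, the same source gives the same octet stream whether a cell is 8 or 16 bits wide, which is the property the gated helpers are described as providing.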

It also adds WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM (streams the signature z vector per-row), which combined with WOLFSSL_MLDSA_ASSIGN_KEY brings ML-DSA-87 verify to ~10.7 KB RAM with zero heap.

Commit layout

wolfcrypt: add WOLFSSL_WIDE_BYTE support for CHAR_BIT != 8 targets (TI C2000 C28x) - core types, misc octet helpers, base64, DRBG
sha: octet-correct SHA-2 byte I/O and 32-bit split Keccak permutation for CHAR_BIT != 8
aes/chacha: octet-correct block, key and keystream I/O for CHAR_BIT != 8
mldsa/mlkem: correct ML-DSA and ML-KEM on CHAR_BIT != 8; add WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM
ecc/25519/sp: octet-correct X25519/Ed25519 and SP byte<->mp conversion for CHAR_BIT != 8
test/benchmark/ci: CHAR_BIT != 8 test vectors, NO_MALLOC benchmark, TI C2000 compile CI and docs

Testing

Host: ./configure --enable-dilithium --enable-experimental --enable-shake256 --enable-shake128 && make && ./wolfcrypt/test/testwolfcrypt - passes (RSA, ECC, ML-DSA, ML-KEM, SHA-2/3, all crypto). No behavior change on 8-bit-byte targets.
Hardware: on the LAUNCHXL-F28P55X, KATs for every algorithm listed above pass, and wolfcrypt_test crypto passes.
CI: IDE/C2000/compile.sh runs cl2000 --compile_only over the CHAR_BIT != 8 wolfCrypt subset; .github/workflows/ti-c2000-compile.yml runs it on PRs (fetches/caches the TI C2000 code generation tools).

Benchmarks (F28P55X @ 150 MHz)

| Primitive | Throughput |
| --- | --- |
| SHA-256 | ~284 KiB/s |
| SHA-384 / SHA-512 | ~166 KiB/s |
| SHA3-224 / 256 / 384 / 512 | ~279 / 264 / 206 / 146 KiB/s |
| SHAKE128 / SHAKE256 | ~319 / 264 KiB/s |
| RNG (Hash-DRBG) | ~122 KiB/s |

ML-DSA-87: verify ~225 ms/op (~10.7 KB RAM, zero heap); keygen and signing also run (SIGN=1).

Notes

wolfcrypt/src/sp_c32.c is generated. The & 0xFF octet masks added to its sp_*_to_bin_* serializers should be folded into the SP generator templates for a permanent fix; the in-tree edit is included here so the C28x build is correct today.
Documentation: IDE/C2000/README.md describes the support, the build options, and the benchmark results; the full bare-metal example (with KATs, benchmark, linker scripts, and per-algorithm make toggles) is in wolfssl-examples at embedded/ti-c2000-f28p55x/.

Companion PR

wolfssl-examples: "Add TI LAUNCHXL-F28P55X (C2000 C28x, CHAR_BIT==16) bare-metal wolfCrypt example".
wolfSSL/wolfssl-examples#576

Copilot

Pull request overview

This PR adds and CI-guards a bare-metal wolfCrypt port for TI C2000 C28x targets where CHAR_BIT == 16, introducing gated fixes so hashing, DRBG, ML-DSA verify, and SP-math ECC work correctly when a C “byte” is wider than 8 bits.

Changes:

Introduces WOLFSSL_NO_OCTET_BYTE detection and uses octet-wise load/store paths to avoid invalid byte/word aliasing on CHAR_BIT != 8 targets (SHA-256/512 family, SHA-3/SHAKE, Base64 CT decode, DRBG helpers, rotate helpers).
Adds “smallest memory” ML-DSA verify mode that streams z per polynomial to reduce pinned RAM in wc_MlDsaKey.
Adds TI C2000 compile-only guard scripts plus a GitHub Actions workflow that downloads the TI CGT and compiles a scoped subset.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| wolfssl/wolfcrypt/wc_port.h | Makes atomic arg type selection robust for 16-bit `int` by also checking `UINT_MAX`. |
| wolfssl/wolfcrypt/wc_mldsa.h | Adds `WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM` struct layout variant for reduced verify RAM. |
| wolfssl/wolfcrypt/types.h | Adds `WOLFSSL_NO_OCTET_BYTE` auto-detection; adjusts `WC_16BIT_CPU` 64-bit availability behavior. |
| wolfssl/wolfcrypt/sp_int.h | Adds support for `unsigned char` being 16-bit (no native 8-bit type). |
| wolfssl/wolfcrypt/settings.h | Requires explicit opt-in for SP math on 16-bit-`int` CPUs via `WOLFSSL_SP_ALLOW_16BIT_CPU`. |
| wolfssl/wolfcrypt/dilithium.h | Adds smallest-mem verify gating and defaults slow Montgomery reduction macros on `WC_16BIT_CPU`. |
| wolfcrypt/test/test.c | Switches large-digest constants from C strings to `byte[]` to avoid `CHAR_BIT!=8` pitfalls. |
| wolfcrypt/src/wc_port.c | Fixes init-state static assert to use `CHAR_BIT` instead of hardcoded 8. |
| wolfcrypt/src/wc_mldsa.c | Adds octet-masking for packed bytes and fixes integer-promotion/sign issues on 16-bit `int`; adds streaming `z` verify path. |
| wolfcrypt/src/sha512.c | Adds octet-wise word load/store and corrects length carry/length placement for `CHAR_BIT!=8`. |
| wolfcrypt/src/sha3.c | Forces bytewise Keccak absorb/squeeze for `WOLFSSL_NO_OCTET_BYTE` and adds squeeze helper. |
| wolfcrypt/src/sha256.c | Adds octet-wise word load/store and corrects length carry/length placement for `CHAR_BIT!=8`. |
| wolfcrypt/src/random.c | Fixes DRBG serialization/addition helpers for non-8-bit “byte” targets. |
| wolfcrypt/src/misc.c | Fixes rotate helpers to use `CHAR_BIT`-based bit width when needed. |
| wolfcrypt/src/coding.c | Ensures Base64 CT decode returns `0xFF` for invalid chars even when `byte` is wider than 8 bits. |
| wolfcrypt/benchmark/benchmark.c | Adds static buffers for `WOLFSSL_NO_MALLOC` benchmarking and adjusts frees/allocations accordingly. |
| scripts/ti-c2000/user_settings.h | Adds minimal CI-only config for cl2000 compile-guard. |
| scripts/ti-c2000/compile.sh | Adds compile-only script to build a scoped source set with TI cl2000. |
| .github/workflows/ti-c2000-compile.yml | Adds CI workflow to download/cache TI CGT and run the compile-only guard. |
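
Several rows above come down to deriving bit widths from `CHAR_BIT` rather than a hardcoded 8, and to avoiding shifts on a 16-bit `int`. The following is a minimal sketch of that idea, not the actual misc.c rotate helpers; `BITS_OF`, `rotl32`, `bit32`, and `width_selfcheck` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Bit width of a type for any CHAR_BIT. Writing sizeof(t) * 8 instead
 * would give 16 for a 32-bit type on a target where CHAR_BIT == 16. */
#define BITS_OF(t) ((unsigned)(CHAR_BIT * sizeof(t)))

/* Rotate left with the width taken from BITS_OF, not a literal 32. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    const unsigned bits = BITS_OF(uint32_t);
    n %= bits;
    return (n == 0) ? x : (uint32_t)((x << n) | (x >> (bits - n)));
}

/* Single-bit mask. unsigned long is at least 32 bits wide, whereas 1U
 * is only 16 bits wide on a 16-bit-int target. */
static uint32_t bit32(unsigned n)
{
    return (uint32_t)(1UL << (n & 31u));
}

/* Small self-check; returns 0 when every assertion holds. */
int width_selfcheck(void)
{
    assert(BITS_OF(uint32_t) == 32);
    assert(rotl32(0x80000001UL, 1) == 0x00000003UL);
    assert(rotl32(0x12345678UL, 0) == 0x12345678UL);
    assert(rotl32(0x12345678UL, 32) == 0x12345678UL);
    assert(bit32(20) == 0x00100000UL);
    return 0;
}
```

On an 8-bit-byte host both spellings of the width agree, which is why this class of bug only shows up on word-addressed targets.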


Copilot

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

+    #if !defined(MICROCHIP_PIC24) && \
+        !(defined(SIZEOF_LONG_LONG) && (SIZEOF_LONG_LONG == 8))
        #undef WORD64_AVAILABLE
    #endif


+void BlockSha3(word64* s)
+{
+    word32*       sp = (word32*)s;
+    const word32* rc = (const word32*)hash_keccak_r;
+    word32 sl[25], sh[25], nl[25], nh[25], bl[5], bh[5];
+    word32 i, k;
+
+    for (k = 0; k < 25; k++) {
+        sl[k] = sp[2 * k];
+        sh[k] = sp[2 * k + 1];
+    }
+    for (i = 0; i < 24; i += 2) {
+        WC_SHA3_THETA(sl, sh);
+        WC_SHA3_ROWMIX(nl, nh, sl, sh);
+        nl[0] ^= rc[2 * i];           nh[0] ^= rc[2 * i + 1];
+        WC_SHA3_THETA(nl, nh);
+        WC_SHA3_ROWMIX(sl, sh, nl, nh);
+        sl[0] ^= rc[2 * (i + 1)];     sh[0] ^= rc[2 * (i + 1) + 1];
+    }
+    for (k = 0; k < 25; k++) {
+        sp[2 * k]     = sl[k];
+        sp[2 * k + 1] = sh[k];
+    }
+}
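
The 32-bit split in BlockSha3 above keeps each 64-bit Keccak lane as a low/high pair of word32 values. The WC_SHA3_THETA and WC_SHA3_ROWMIX macros are not shown in this excerpt, so the following is only a sketch of the general technique such a split relies on: a 64-bit left rotate expressed on two 32-bit halves. `rotl64_split`, `rotl64_ref`, and `rotl64_selfcheck` are hypothetical names, not wolfSSL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit lane, held as (lo, hi) 32-bit halves, left by n bits. */
static void rotl64_split(uint32_t* lo, uint32_t* hi, unsigned n)
{
    uint32_t l = *lo, h = *hi, t;

    n &= 63u;
    if (n >= 32u) {          /* rotating by 32 just swaps the halves */
        t = l; l = h; h = t;
        n -= 32u;
    }
    if (n != 0u) {           /* 1..31: bits shifted out of one half enter the other */
        t = (uint32_t)((l << n) | (h >> (32u - n)));
        h = (uint32_t)((h << n) | (l >> (32u - n)));
        l = t;
    }
    *lo = l;
    *hi = h;
}

/* Reference rotate using a native 64-bit type, for comparison only. */
static uint64_t rotl64_ref(uint64_t x, unsigned n)
{
    n &= 63u;
    return (n == 0u) ? x : ((x << n) | (x >> (64u - n)));
}

/* Compare the split rotate with the reference for every rotation count. */
int rotl64_selfcheck(void)
{
    const uint64_t x = 0x0123456789ABCDEFULL;
    unsigned n;

    for (n = 0; n < 64u; n++) {
        uint32_t lo = (uint32_t)x;
        uint32_t hi = (uint32_t)(x >> 32);
        rotl64_split(&lo, &hi, n);
        assert((((uint64_t)hi << 32) | lo) == rotl64_ref(x, n));
    }
    return 0;
}
```

Expressed this way, each rotate uses only 32-bit shifts and ORs, which fits the PR's stated goal of emitting native instructions instead of compiler 64-bit helper calls.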


dgarske self-assigned this Jun 18, 2026

Copilot AI review requested due to automatic review settings June 18, 2026 00:15

Copilot started reviewing on behalf of dgarske June 18, 2026 00:15

Copilot AI reviewed Jun 18, 2026


Comment thread wolfssl/wolfcrypt/types.h Outdated

Comment thread wolfcrypt/benchmark/benchmark.c

dgarske force-pushed the ti_c25 branch 3 times, most recently from 20e4053 to 39c343a Compare June 23, 2026 14:56

dgarske added 6 commits June 24, 2026 14:25

ba1005b wolfcrypt: add WOLFSSL_WIDE_BYTE support for CHAR_BIT != 8 targets (TI C2000 C28x) - core types, misc octet helpers, base64, DRBG
a95350b sha: octet-correct SHA-2 byte I/O and 32-bit split Keccak permutation for CHAR_BIT != 8
12b11e1 aes/chacha: octet-correct block, key and keystream I/O for CHAR_BIT != 8
7929398 mldsa/mlkem: correct ML-DSA and ML-KEM on CHAR_BIT != 8; add WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM
4a688b0 ecc/25519/sp: octet-correct X25519/Ed25519 and SP byte<->mp conversion for CHAR_BIT != 8
afaf660 test/benchmark/ci: CHAR_BIT != 8 test vectors, NO_MALLOC benchmark, TI C2000 compile CI and docs

dgarske force-pushed the ti_c25 branch from 39c343a to afaf660 Compare June 24, 2026 22:28

dgarske requested a review from Copilot June 24, 2026 22:30

Copilot started reviewing on behalf of dgarske June 24, 2026 22:30

Copilot AI reviewed Jun 24, 2026

dgarske wants to merge 6 commits into wolfSSL:master from dgarske:ti_c25
