Commit c7b8fd7

Merge pull request #351 from bigbrett/async-crypto-aes
Async AES
2 parents 07b6bf2 + a709b0a commit c7b8fd7

5 files changed

Lines changed: 3080 additions & 672 deletions

File tree

docs/draft/async-crypto.md

Lines changed: 87 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -416,8 +416,6 @@ int wh_Client_Sha256Dma(whClientContext* ctx, wc_Sha256* sha, const uint8_t* in,
416416
| **`requestSent` flag** | Adds a parameter to the API, but avoids unnecessary round-trips when input is absorbed entirely into the local buffer |
417417
| **Snapshot/rollback on send failure** | Small CPU cost to copy the partial buffer, but guarantees SHA state consistency even on transport failures |
418418
419-
420-
421419
## RNG: Single-Shot with Caller-Driven Chunking
422420
423421
The RNG generate operation is the second algorithm to receive the async
@@ -540,6 +538,88 @@ int wh_Client_RngGenerate(whClientContext* ctx, uint8_t* out, uint32_t size);
540538
int wh_Client_RngGenerateDma(whClientContext* ctx, uint8_t* out, uint32_t size);
541539
```
542540
541+
## AES: One-Shot with DMA Support
542+
543+
AES modes (CBC, CTR, ECB, GCM) are all **one-shot** operations — every call
544+
consumes a fixed buffer of input and returns a fixed buffer of output in a
545+
single round-trip. There is no client-side partial-block accumulation across
546+
Request/Response pairs the way SHA's `Update` family has, which makes the
547+
async split significantly simpler than SHA. (CBC and CTR still carry
548+
inter-call IV / counter state on the `Aes` struct — see *Mutable state*
549+
below — but that state is updated atomically by the Response, not buffered
550+
across calls.)
551+
552+
- **No partial-block buffering** on the client. The entire plaintext or
553+
ciphertext is handed to one Request and the full result comes back in
554+
one Response.
555+
- **No `requestSent` flag.** Each call sends exactly one request and
556+
expects exactly one response. If the request's serialised size would
557+
exceed `WOLFHSM_CFG_COMM_DATA_LEN`, the inline Request returns
558+
`WH_ERROR_BADARGS` up front; DMA variants bypass the cap for payload
559+
data.
560+
- **No snapshot/rollback.** There is no local buffer to corrupt: the key
561+
lives on `aes->devKey` (or as a cached keyId), the IV on `aes->reg`, and
562+
these are read-only until the Response arrives.
563+
564+
### Mutable state: IV and counter
565+
566+
For **CBC** and **CTR**, the Response updates mutable state on the `Aes`
567+
struct so subsequent calls chain correctly:
568+
569+
- **CBC** — `aes->reg` is updated with the last ciphertext block. For
570+
decryption, the Request captures the last ciphertext block from the
571+
input buffer into `aes->reg` *before* sending, so in-place (input
572+
pointer == output pointer) operation still produces the right chaining
573+
state after the Response overwrites the ciphertext with plaintext.
574+
- **CTR** — `aes->reg`, `aes->tmp`, and `aes->left` are updated from the
575+
Response so the counter advances correctly for subsequent calls. CTR
576+
is symmetric: callers should use `AES_ENCRYPTION` for the key schedule
577+
and pass `enc = 1` in both directions.
578+
579+
**ECB** and **GCM** carry no inter-call state on the `Aes` struct. For
580+
GCM, the IV, AAD, and (on decrypt) the expected tag are passed as explicit
581+
arguments on each call.
582+
583+
### DMA variant contract
584+
585+
The DMA pairs follow the same pattern as SHA DMA:
586+
587+
1. **Fail-fast** on `wh_CommClient_IsRequestPending()` before acquiring
588+
any DMA mapping, so a Request cannot be issued while another call is
589+
still outstanding and cannot leak a translated address if
590+
`wh_Client_SendRequest` later rejects the call.
591+
2. **PRE-translate** input, output, and (for GCM) AAD buffers. Non-DMA
592+
payload fields (key material, IV, auth tag) stay inline in the
593+
request message.
594+
3. **Stash** the translated addresses in `ctx->dma.asyncCtx.aes` so the
595+
matching Response can issue POST cleanup.
596+
4. **POST cleanup** runs on every non-`WH_ERROR_NOTREADY` return from the
597+
Response, so the caller's buffers are safe to read regardless of
598+
success or error.
599+
5. The caller must keep the input, output, and AAD buffers valid until
600+
the Response returns something other than `WH_ERROR_NOTREADY`.
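
The five steps above can be sketched with a small mock. Everything here is an assumption made for illustration: `SkClient`, the `sk_*` functions, and the `SK_*` return codes are hypothetical and do not mirror the real wolfHSM signatures; address translation is an identity mapping, and the mock server "replies" after two polls. What the sketch does show is the ordering: the pending check happens before any mapping is taken, and POST cleanup runs only once the Response stops reporting not-ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for this mock only. */
enum { SK_OK = 0, SK_NOTREADY = -1, SK_REQUEST_PENDING = -2, SK_BADARGS = -3 };

typedef struct {
    bool      pending;      /* a request is in flight */
    int       pollsLeft;    /* mock: polls before the reply "arrives" */
    int       liveMappings; /* mock: outstanding DMA mappings */
    uintptr_t inAddr;       /* stashed translated addresses */
    uintptr_t outAddr;
} SkClient;

/* Mock PRE: identity translation, counts the mapping as live. */
static uintptr_t sk_dma_pre(SkClient* c, const void* p)
{
    c->liveMappings++;
    return (uintptr_t)p;
}

/* Mock POST: releases one mapping. */
static void sk_dma_post(SkClient* c, uintptr_t addr)
{
    (void)addr;
    assert(c->liveMappings > 0);
    c->liveMappings--;
}

static int sk_aes_dma_request(SkClient* c, const uint8_t* in, uint8_t* out)
{
    if (c->pending) {
        return SK_REQUEST_PENDING; /* step 1: fail fast, nothing mapped yet */
    }
    c->inAddr  = sk_dma_pre(c, in);  /* steps 2 and 3: translate and stash */
    c->outAddr = sk_dma_pre(c, out);
    c->pending   = true;
    c->pollsLeft = 2;
    return SK_OK;
}

static int sk_aes_dma_response(SkClient* c)
{
    if (!c->pending) {
        return SK_BADARGS;
    }
    if (c->pollsLeft > 0) {
        c->pollsLeft--;
        return SK_NOTREADY; /* step 5: caller buffers must stay valid */
    }
    sk_dma_post(c, c->inAddr); /* step 4: cleanup on any final return */
    sk_dma_post(c, c->outAddr);
    c->pending = false;
    return SK_OK;
}
```

In this mock a second Request issued while one is outstanding leaves the mapping count unchanged, which is the leak the fail-fast check is meant to prevent.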
601+
602+
### API Reference
603+
604+
Inline (non-DMA) pairs:
605+
606+
- `wh_Client_AesCbcRequest` / `wh_Client_AesCbcResponse`
607+
- `wh_Client_AesCtrRequest` / `wh_Client_AesCtrResponse`
608+
- `wh_Client_AesEcbRequest` / `wh_Client_AesEcbResponse`
609+
- `wh_Client_AesGcmRequest` / `wh_Client_AesGcmResponse`
610+
611+
DMA pairs (require `WOLFHSM_CFG_DMA`):
612+
613+
- `wh_Client_AesCbcDmaRequest` / `wh_Client_AesCbcDmaResponse`
614+
- `wh_Client_AesCtrDmaRequest` / `wh_Client_AesCtrDmaResponse`
615+
- `wh_Client_AesEcbDmaRequest` / `wh_Client_AesEcbDmaResponse`
616+
- `wh_Client_AesGcmDmaRequest` / `wh_Client_AesGcmDmaResponse`
617+
618+
The existing blocking wrappers (`wh_Client_AesCbc`, `wh_Client_AesCtr`,
619+
`wh_Client_AesEcb`, `wh_Client_AesGcm`, and their `*Dma` variants) are now
620+
thin shells that call the new async primitives in a poll loop, so blocking
621+
and async paths share identical wire behaviour.
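
As a rough illustration of the "thin shell" shape, the sketch below wraps a mock Request/Response pair in a poll loop. The names (`PlClient`, `pl_aes_request`, `pl_aes_response`, `pl_aes_blocking`) and the `PL_*` codes are hypothetical and the real wrappers take AES arguments that are omitted here; the mock simply reports not-ready three times before succeeding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for this mock only. */
enum { PL_OK = 0, PL_NOTREADY = -1 };

typedef struct {
    int pollsLeft; /* mock: not-ready returns remaining */
    int requests;  /* mock: number of requests sent */
} PlClient;

/* Mock async Request: records the send and arms the fake reply delay. */
static int pl_aes_request(PlClient* c)
{
    c->requests++;
    c->pollsLeft = 3;
    return PL_OK;
}

/* Mock async Response: not ready until the delay has elapsed. */
static int pl_aes_response(PlClient* c)
{
    if (c->pollsLeft > 0) {
        c->pollsLeft--;
        return PL_NOTREADY;
    }
    return PL_OK;
}

/* Blocking wrapper: one Request, then poll the Response until it is final. */
static int pl_aes_blocking(PlClient* c)
{
    int ret;

    assert(c != NULL);
    ret = pl_aes_request(c);
    if (ret != PL_OK) {
        return ret; /* nothing was sent, so there is nothing to poll for */
    }
    do {
        ret = pl_aes_response(c);
    } while (ret == PL_NOTREADY);
    return ret;
}
```

Because the wrapper only sequences the two async calls, any message it puts on the wire is the one the async primitives would have produced on their own.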
622+
543623
## Roadmap: Remaining Algorithms
544624
545625
The async split pattern will be applied algorithm by algorithm to all crypto
@@ -555,15 +635,15 @@ the full set of operations and their planned async status.
555635
| SHA-384 | Update/Final Request/Response | Shares SHA-512 wire format |
556636
| SHA-512 | Update/Final Request/Response | Non-DMA and DMA variants |
557637
| RNG Generate | `wh_Client_RngGenerate{Request,Response}` and DMA variants | Single-shot per call; non-DMA callers chunk against `WH_MESSAGE_CRYPTO_RNG_MAX_INLINE_SZ`, DMA has no per-call cap |
638+
| AES-CBC | `wh_Client_AesCbc{,Dma}{Request,Response}` | Non-DMA and DMA variants |
639+
| AES-CTR | `wh_Client_AesCtr{,Dma}{Request,Response}` | Non-DMA and DMA variants |
640+
| AES-ECB | `wh_Client_AesEcb{,Dma}{Request,Response}` | Non-DMA and DMA variants |
641+
| AES-GCM | `wh_Client_AesGcm{,Dma}{Request,Response}` | Non-DMA and DMA variants; AAD supports DMA |
558642
559643
**Planned:**
560644
561645
| Algorithm | Functions | Complexity | Notes |
562646
|-------------------|--------------------------------------------|------------|-------|
563-
| AES-CBC | `wh_Client_AesCbc{Request,Response}` | Low | Single-shot; straightforward split |
564-
| AES-CTR | `wh_Client_AesCtr{Request,Response}` | Low | Single-shot |
565-
| AES-ECB | `wh_Client_AesEcb{Request,Response}` | Low | Single-shot |
566-
| AES-GCM | `wh_Client_AesGcm{Request,Response}` | Low | Single-shot; AAD + ciphertext in one message |
567647
| RSA Sign/Verify | `wh_Client_RsaFunction{Request,Response}` | Low | Single-shot; may need auto-import removed from Request |
568648
| RSA Get Size | `wh_Client_RsaGetSize{Request,Response}` | Low | Trivial query |
569649
| ECDSA Sign | `wh_Client_EccSign{Request,Response}` | Low | Single-shot |
@@ -572,7 +652,7 @@ the full set of operations and their planned async status.
572652
| Curve25519 | `wh_Client_Curve25519SharedSecret{Request,Response}` | Low | Single-shot |
573653
| Ed25519 Sign | `wh_Client_Ed25519Sign{Request,Response}` | Low | Single-shot |
574654
| Ed25519 Verify | `wh_Client_Ed25519Verify{Request,Response}`| Low | Single-shot |
575-
| CMAC | `wh_Client_Cmac{Request,Response}` | Low | Already has partial split pattern |
655+
| CMAC | `wh_Client_Cmac{Request,Response}` | Medium | Streaming (Init/Update/Final), so follows SHA-style pattern rather than the one-shot AES pattern |
576656
| ML-DSA Sign | `wh_Client_MlDsaSign{Request,Response}` | Low | Post-quantum; single-shot |
577657
| ML-DSA Verify | `wh_Client_MlDsaVerify{Request,Response}` | Low | Post-quantum; single-shot |
578658
