7 changes: 5 additions & 2 deletions LICENSE
@@ -215,6 +215,9 @@ For Keccak and AES we are using public-domain
code from sources and by authors listed in
comments on top of the respective files.

The code in crypto/fipsmodule/ml_kem/mlkem is imported from mlkem-native
(https://github.com/pq-code-package/mlkem-native) and carries the
Apache 2.0 license. This license is reproduced at the bottom of this file.

Licenses for support code
-------------------------
@@ -286,10 +289,10 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Apache 2.0 license for AWS-LC content
-------------------------------------


Apache 2.0 license for AWS-LC content and mlkem-native
------------------------------------------------------
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
5 changes: 5 additions & 0 deletions crypto/fipsmodule/ml_kem/META.yml
@@ -0,0 +1,5 @@
name: mlkem-native
source: pq-code-package/mlkem-native.git
branch: main
commit: 83d85fe224bd6cf1b75f096a2b2fa01033b3dfda
imported-at: 2025-04-03T12:37:06+0100
163 changes: 150 additions & 13 deletions crypto/fipsmodule/ml_kem/README.md
@@ -1,17 +1,154 @@
# AWS-LC ML-KEM readme file
# ML-KEM

The source code in this folder implements ML-KEM as defined in FIPS 203 Module-Lattice-Based Key-Encapsulation Mechanism Standard ([link](https://csrc.nist.gov/pubs/fips/203/final).
The source code in this directory implements ML-KEM as defined in
the [FIPS 203 Module-Lattice-Based Key-Encapsulation Mechanism Standard](https://csrc.nist.gov/pubs/fips/203/final).
It is imported from [mlkem-native](https://github.com/pq-code-package/mlkem-native)
using [importer.sh](importer.sh); see [META.yml](META.yml) for import details.

**Source code origin and modifications.** The source code was imported from a branch of the official repository of the Crystals-Kyber team that follows the standard draft: https://github.com/pq-crystals/kyber/tree/standard. The code was taken at [commit](https://github.com/pq-crystals/kyber/commit/11d00ff1f20cfca1f72d819e5a45165c1e0a2816) as of 03/26/2024. At the moment, only the reference C implementation is imported.
## Running the importer

The code was refactored in [this PR](https://github.com/aws/aws-lc/pull/1763) by parametrizing all functions that depend on values that are specific to a parameter set, i.e., that directly or indirectly depend on the value of KYBER_K. To do this, in `params.h` we defined a structure that holds those ML-KEM parameters and functions
that initialize a given structure with values corresponding to a parameter set. This structure is then passed to every function that requires it as a function argument. In addition, the following changes were made to the source code in `ml_kem_ref` directory:
- `randombytes.{h|c}` are deleted because we are using the randomness generation functions provided by AWS-LC.
- `kem.c`: call to randombytes function is replaced with a call to RAND_bytes and the appropriate header file is included (openssl/rand.h).
- `fips202.{h|c}` are deleted as all SHA3/SHAKE functionality is provided instead by AWS-LC fipsmodule/sha rather than the reference implementation.
- `symmetric-shake.c`: unnecessary include of fips202.h is removed.
- `api.h`: `pqcrystals` prefix substituted with `ml_kem` (to be able to build alongside `crypto/kyber`).
- `poly.c`: the `poly_frommsg` function was modified to address the constant-time issue described [here](https://github.com/pq-crystals/kyber/commit/9b8d30698a3e7449aeb34e62339d4176f11e3c6c).
- All internal header files were updated with unique `ML_KEM_*` include guards.
To re-run the importer, run:

**Testing.** The KATs were obtained from an independent implementation of ML-KEM written in SPARK Ada subset: https://github.com/awslabs/LibMLKEM.
```bash
rm -rf mlkem # Remove old mlkem source
./importer.sh
```

By default, the importer will not run if [mlkem](mlkem) still exists. To force removal of any existing [mlkem](mlkem), use `./importer.sh --force`.

The repository and branch to be used for the import can be configured through the environment variables `GITHUB_REPOSITORY` and `GITHUB_SHA`, respectively. The default is equivalent to

```bash
GITHUB_REPOSITORY=pq-code-package/mlkem-native.git GITHUB_SHA=main ./importer.sh
```

That is, by default importer.sh will clone and install the latest [main](https://github.com/pq-code-package/mlkem-native/tree/main) of mlkem-native.

After a successful import, [META.yml](META.yml) will reflect the source, branch, commit and timestamp of the import.

### Import Scope

mlkem-native has a C-only version as well as native 'backends' in AVX2 and
Neon for high performance. At present, [importer.sh](importer.sh) imports only
the C-only version.

mlkem-native offers its own FIPS-202 implementation, including fast
versions of batched FIPS-202. [importer.sh](importer.sh) does _not_ import those.
Instead, glue-code around AWS-LC's own FIPS-202 implementation is provided in
[fips202_glue.h](fips202_glue.h) and [fips202x4_glue.h](fips202x4_glue.h).

## Configuration and compatibility layer

mlkem-native is used with a custom configuration file [mlkem_native_config.h](mlkem_native_config.h). This file includes
a compatibility layer between AWS-LC/OpenSSL and mlkem-native, covering:

* FIPS/PCT: If `AWSLC_FIPS` is set, `MLK_CONFIG_KEYGEN_PCT` is
enabled to include a Pairwise Consistency Test (PCT) during key generation.
* FIPS/PCT: If `BORINGSSL_FIPS_BREAK_TESTS` is set,
`MLK_CONFIG_KEYGEN_PCT_BREAKAGE_TEST` is set and `mlk_break_pct` is
defined via `boringssl_fips_break_test("MLKEM_PWCT")`, so that the PCT
can be deliberately broken at runtime for testing purposes.
* CT: If `BORINGSSL_CONSTANT_TIME_VALIDATION` is set, then
`MLK_CONFIG_CT_TESTING_ENABLED` is set to enable valgrind testing.
* Zeroization: `MLK_CONFIG_CUSTOM_ZEROIZE` is set and `mlk_zeroize`
mapped to `OPENSSL_cleanse` to use OpenSSL's zeroization function.
* Randombytes: `MLK_CONFIG_CUSTOM_RANDOMBYTES` is set and `mlk_randombytes`
mapped to `RAND_bytes` to use AWS-LC's randombytes function.
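
The mappings above can be sketched roughly as follows. This is a simplified, standalone illustration and not the contents of `mlkem_native_config.h`: `host_cleanse` is a hypothetical stand-in for `OPENSSL_cleanse` so that the snippet compiles on its own, and the `mlk_zeroize` signature is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for OPENSSL_cleanse, so the sketch builds alone. */
static void host_cleanse(void *ptr, size_t len) {
  volatile unsigned char *p = (volatile unsigned char *)ptr;
  assert(ptr != NULL || len == 0);
  while (len--) {
    *p++ = 0;
  }
}

/* Host-library switches are translated into mlkem-native options. */
#if defined(AWSLC_FIPS)
#define MLK_CONFIG_KEYGEN_PCT
#endif

#if defined(BORINGSSL_CONSTANT_TIME_VALIDATION)
#define MLK_CONFIG_CT_TESTING_ENABLED
#endif

/* Zeroization is delegated to the host library (signature assumed). */
#define MLK_CONFIG_CUSTOM_ZEROIZE
static inline void mlk_zeroize(void *ptr, size_t len) {
  host_cleanse(ptr, len);
}
```

The point of the pattern is that mlkem-native itself stays unmodified: all AWS-LC specifics are confined to the configuration header.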

## Build process

At the core, mlkem-native is a 'single-level' implementation of ML-KEM:
A build of the main source tree provides an implementation of
exactly one of ML-KEM-512/768/1024, depending on the `MLK_CONFIG_PARAMETER_SET`
parameter. All source files for a single build of mlkem-native are bundled in
[mlkem_native_bcm.c](mlkem/mlkem_native_bcm.c), which is also imported from
mlkem-native.

To build all security levels, [mlkem_native_bcm.c](mlkem/mlkem_native_bcm.c)
is included three times into [ml_kem.c](ml_kem.c), once per security level.
Level-independent code is included only once and shared across the levels;
this is controlled through the configuration options
`MLK_CONFIG_MULTILEVEL_WITH_SHARED` and `MLK_CONFIG_MULTILEVEL_NO_SHARED`
used prior to importing the instances of [mlkem_native_bcm.c](mlkem/mlkem_native_bcm.c) into [ml_kem.c](ml_kem.c).

Note that the multilevel build process is entirely internal to `ml_kem.c`,
and does not affect the AWS-LC build otherwise.
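
The idea of instantiating single-level code once per parameter set while sharing level-independent code can be illustrated with a small standalone sketch. It does not reproduce `ml_kem.c`: the macro `DEFINE_LEVEL` is a hypothetical stand-in for "include the single-level source with a different parameter set", and all function names are invented for illustration. The sizes follow the FIPS 203 encapsulation key length of 384·k + 32 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Level-independent code: defined once and shared by all instances. */
static size_t shared_polyvec_bytes(size_t k) {
  assert(k >= 2 && k <= 4);
  return 384 * k; /* each of the k polynomials encodes to 384 bytes */
}

/* Hypothetical stand-in for one inclusion of the single-level source:
 * every instantiation gets its own namespaced symbol. */
#define DEFINE_LEVEL(bits, k)                           \
  static size_t mlkem##bits##_public_key_bytes(void) {  \
    return shared_polyvec_bytes(k) + 32; /* + seed */   \
  }

DEFINE_LEVEL(512, 2)
DEFINE_LEVEL(768, 3)
DEFINE_LEVEL(1024, 4)
```

Here `mlkem512_public_key_bytes()`, `mlkem768_public_key_bytes()` and `mlkem1024_public_key_bytes()` coexist in one translation unit, while `shared_polyvec_bytes` exists only once.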

## Formal Verification

All C-code imported by [importer.sh](importer.sh) is formally verified using the
C Bounded Model Checker ([CBMC](https://github.com/diffblue/cbmc/)) to be free of
various classes of undefined behaviour, including out-of-bounds memory accesses and
arithmetic overflow; the latter is of particular interest for ML-KEM because of
the use of lazy modular reduction for improved performance.

At the heart of the CBMC proofs are function contracts and loop annotations
in the C code. Function contracts are written as `__contract__(...)` clauses
attached to the function declaration, while loop contracts are written as
`__loop__` clauses that follow the `for` statement.

The function contract and loop statements are kept in the source, but
removed by the preprocessor so long as the CBMC macro is undefined. Keeping
them simplifies the import, and care has been taken to make them readable
to the non-expert, and thereby serve as precise documentation of
assumptions and guarantees upheld by the code.
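
To give a feel for how such annotations read, here is a minimal sketch. The function `coeffs_to_unsigned` and the predicates `array_abs_bound` and `array_bound` are hypothetical names chosen for illustration; the real contracts in the imported sources may use different clauses. The snippet only compiles with `CBMC` undefined, in which case the annotations vanish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Outside of CBMC runs, the annotations expand to nothing. */
#ifndef CBMC
#define __contract__(...)
#define __loop__(...)
#endif

#define Q 3329 /* ML-KEM modulus */

/* Hypothetical example function, not part of mlkem-native. */
void coeffs_to_unsigned(int16_t *r, size_t n)
__contract__(
  requires(array_abs_bound(r, 0, n, Q)) /* |r[i]| < Q on entry */
  ensures(array_bound(r, 0, n, 0, Q))   /* 0 <= r[i] < Q on exit */
);

void coeffs_to_unsigned(int16_t *r, size_t n) {
  size_t i;
  assert(r != NULL);
  for (i = 0; i < n; i++)
  __loop__(invariant(i <= n))
  {
    /* Add Q to negative coefficients only (not hardened, see below). */
    r[i] = (int16_t)(r[i] + ((r[i] < 0) ? Q : 0));
  }
}
```

The conditional expression is used for brevity and is not a constant-time idiom; the sketch is about the shape of the annotations only.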

## Testing

The known-answer tests (KATs) were obtained from an independent implementation
of ML-KEM written in SPARK, a subset of Ada: https://github.com/awslabs/LibMLKEM.

## Side-channels

mlkem-native's CI uses a patched version of valgrind to check, across various
compilers and compile flags, that there are no secret-dependent memory
accesses, branches, or divisions. The relevant assertions are kept
and used if `MLK_CONFIG_CT_TESTING_ENABLED` is set, which is the case
if and only if `BORINGSSL_CONSTANT_TIME_VALIDATION` is set.

mlkem-native uses value barriers to block
potentially harmful compiler reasoning and optimization. Where standard
gcc/clang inline assembly is not available, mlkem-native falls back to a
slower 'opt blocker' based on a volatile global -- both are described in
[verify.h](https://github.com/aws/aws-lc/blob/df5b09029e27d54b2b117eeddb6abd983528ae15/crypto/fipsmodule/ml_kem/mlkem/verify.h).
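
The two barrier flavours can be sketched as follows. This is an illustration of the general technique under assumed names (`value_barrier_u32`, `opt_blocker_u32`, `ct_select_u16` are hypothetical); the actual definitions are in verify.h and may differ in detail.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Empty inline assembly that claims to modify b: the compiler can no
 * longer reason about the value of b after this point. */
static inline uint32_t value_barrier_u32(uint32_t b) {
  __asm__ volatile("" : "+r"(b));
  return b;
}
#else
/* Fallback: XOR with a volatile global that is always 0 at runtime but
 * must be reloaded on every use, so its value is opaque to the optimizer. */
static volatile uint32_t opt_blocker_u32 = 0;
static inline uint32_t value_barrier_u32(uint32_t b) {
  return b ^ opt_blocker_u32;
}
#endif

/* Returns a if cond != 0 and b otherwise, without branching on cond. */
static inline uint16_t ct_select_u16(uint16_t a, uint16_t b, uint32_t cond) {
  uint32_t nz = (cond | (0u - cond)) >> 31; /* 1 if cond != 0, else 0 */
  uint16_t mask = (uint16_t)(0u - value_barrier_u32(nz)); /* 0xFFFF or 0 */
  return (uint16_t)(b ^ (mask & (a ^ b)));
}
```

Without the barrier, a compiler that can prove `mask` is either all-ones or zero is free to turn the masked XOR back into a branch.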

## Comparison to reference implementation

mlkem-native is a fork of the ML-KEM [reference
implementation](https://github.com/pq-crystals/kyber).

The following gives an overview of the major changes:

- CBMC and debug annotations, and minor code restructurings or signature
changes to facilitate the CBMC proofs. For example, `poly_add(x,a)` only
comes in a destructive variant to avoid specifying aliasing constraints, and
`poly_rej_uniform` has an additional `offset` parameter indicating the
position in the sampling buffer, to avoid passing shifted pointers.
- Introduction of 4x-batched versions of some functions from the reference
implementation. This is to leverage 4x-batched Keccak-f1600 implementations
if present. The batching happens at the C level even if no native backend
for FIPS 202 is present.
- FIPS 203 compliance: Introduced PK (FIPS 203, Section 7.2, 'modulus
check') and SK (FIPS 203, Section 7.3, 'hash check') checks, as well as an
optional PCT (FIPS 203, Section 7.1, Pairwise Consistency). Also
introduced zeroization of stack buffers as required by FIPS 203, Section
3.3, Destruction of intermediate values.
- Introduction of native backend implementations. With the exception of the
native backend for `poly_rej_uniform()`, which may fail and fall back to
the C implementation, those are drop-in replacements for the corresponding
C functions and dispatched at compile-time.
- Restructuring of files to separate level-specific from level-generic
functionality. This is needed to enable a multi-level build of mlkem-native
where level-generic code is shared between levels.
- More pervasive use of value barriers to harden constant-time primitives,
even when Link-Time-Optimization (LTO) is enabled. With the reference
implementation, the use of LTO can lead to insecure compilation.
- Use of a multiplication cache ('mulcache') structure to simplify and
speed up the base multiplication.
- Different placement of modular reductions: We reduce to _unsigned_
canonical representatives in `poly_reduce()`, and _assume_ such in all
polynomial compression functions. The reference implementation works with a
_signed_ `poly_reduce()`, and embeds various signed->unsigned conversions
in the compression functions.
- More inlining: Modular multiplication and primitives are in a header
rather than a separate compilation unit.
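
As an illustration of the reduction placement described in the list above, the following sketch separates the signed-to-unsigned conversion from a 1-bit compression that assumes canonical input. Both function names are hypothetical and the code is not taken from mlkem-native. The multiplier 80635 equals floor(2^28 / 3329), which lets the division by the modulus be replaced by a multiplication and a shift.

```c
#include <stdint.h>

#define Q 3329 /* ML-KEM modulus */

/* Maps a signed representative in (-Q, Q) to the canonical range [0, Q).
 * Written with a conditional for readability; a hardened implementation
 * would use a mask instead. */
static int16_t to_unsigned_canonical(int16_t a) {
  return (int16_t)(a + ((a < 0) ? Q : 0));
}

/* 1-bit compression round(2x/Q) mod 2, ASSUMING 0 <= x < Q.
 * No signed->unsigned fix-up is embedded here. */
static uint32_t compress_d1(int16_t x) {
  uint32_t t = (uint32_t)x;
  t = (t << 1) + 1665; /* 2x plus rounding offset */
  t *= 80635;          /* approximates division by Q ... */
  t >>= 28;            /* ... via multiply and shift */
  return t & 1;
}
```

A caller holding a signed coefficient would write `compress_d1(to_unsigned_canonical(c))`, whereas code operating on already-reduced polynomials can call `compress_d1` directly.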
64 changes: 64 additions & 0 deletions crypto/fipsmodule/ml_kem/fips202_glue.h
@@ -0,0 +1,64 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0 OR ISC

#ifndef MLK_AWSLC_FIPS202_GLUE_H
#define MLK_AWSLC_FIPS202_GLUE_H
#include <stddef.h>
#include <stdint.h>

#include "../sha/internal.h"

#define SHAKE128_RATE 168
#define SHAKE256_RATE 136
#define SHA3_256_RATE 136
#define SHA3_384_RATE 104
#define SHA3_512_RATE 72

#define mlk_shake128ctx KECCAK1600_CTX

static MLK_INLINE void mlk_shake128_init(mlk_shake128ctx *state) {
// Return code check can be omitted since SHAKE_Init always returns 1
// when called with a valid block size.
(void) SHAKE_Init(state, SHAKE128_BLOCKSIZE);
}

static MLK_INLINE void mlk_shake128_release(mlk_shake128ctx *state) {
(void) state;
}

static MLK_INLINE void mlk_shake128_absorb_once(mlk_shake128ctx *state,
const uint8_t *input, size_t inlen) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE_Absorb(state, input, inlen);
}

static MLK_INLINE void mlk_shake128_squeezeblocks(uint8_t *output, size_t nblocks,
mlk_shake128ctx *state) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE_Squeeze(output, state, nblocks * SHAKE128_RATE);
}

static MLK_INLINE void mlk_shake256(uint8_t *output, size_t outlen,
const uint8_t *input, size_t inlen) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE256(input, inlen, output, outlen);
}

static MLK_INLINE void mlk_sha3_256(uint8_t *output, const uint8_t *input,
size_t inlen) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHA3_256(input, inlen, output);
}

static MLK_INLINE void mlk_sha3_512(uint8_t *output, const uint8_t *input,
size_t inlen) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHA3_512(input, inlen, output);
}

#endif // MLK_AWSLC_FIPS202_GLUE_H
58 changes: 58 additions & 0 deletions crypto/fipsmodule/ml_kem/fips202x4_glue.h
@@ -0,0 +1,58 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0 OR ISC

//
// This is a shim establishing the FIPS-202 API required by
// mlkem-native from the API exposed by AWS-LC.
//

#ifndef MLK_AWSLC_FIPS202X4_GLUE_H
#define MLK_AWSLC_FIPS202X4_GLUE_H

#include <stddef.h>
#include <stdint.h>

#include "fips202_glue.h"

#define mlk_shake128x4ctx KECCAK1600_CTX_x4

static MLK_INLINE void mlk_shake128x4_absorb_once(mlk_shake128x4ctx *state,
const uint8_t *in0,
const uint8_t *in1,
const uint8_t *in2,
const uint8_t *in3, size_t inlen) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE128_Absorb_once_x4(state, in0, in1, in2, in3, inlen);
}

static MLK_INLINE void mlk_shake128x4_squeezeblocks(uint8_t *out0, uint8_t *out1,
uint8_t *out2, uint8_t *out3,
size_t nblocks,
mlk_shake128x4ctx *state) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE128_Squeezeblocks_x4(out0, out1, out2, out3, state, nblocks);
}

static MLK_INLINE void mlk_shake128x4_init(mlk_shake128x4ctx *state) {
// Return code check can be omitted
// since mlkem-native adheres to call discipline
(void) SHAKE128_Init_x4(state);
}

static MLK_INLINE void mlk_shake128x4_release(mlk_shake128x4ctx *state) {
(void) state;
}

static MLK_INLINE void mlk_shake256x4(uint8_t *out0, uint8_t *out1, uint8_t *out2,
uint8_t *out3, size_t outlen, uint8_t *in0,
uint8_t *in1, uint8_t *in2, uint8_t *in3,
size_t inlen) {
// Return code check can be omitted
// since SHAKE256_x4 is documented not to fail for valid inputs.
(void) SHAKE256_x4(in0, in1, in2, in3, inlen,
out0, out1, out2, out3, outlen);
}

#endif // MLK_AWSLC_FIPS202X4_GLUE_H