Constant-Time Verification Guide

Document Information

| Property | Value |
| --- | --- |
| Document Version | 3.2.0 |
| Last Updated | 2026-05-20 |
| Classification | Public |
| Maintainer | Steel Security Advisors LLC |

This document describes the constant-time verification methodology and tooling for AMA Cryptography's cryptographic implementations.

Overview

Constant-time implementations are critical for preventing timing side-channel attacks. AMA Cryptography employs a defense-in-depth approach to constant-time security:

  1. C Layer: Custom constant-time utilities in src/c/ama_consttime.c (C11 atomics for thread safety)
  2. Python Layer: Use of hmac.compare_digest() for constant-time comparison
  3. Native PQC Layer: All PQC implementations (ML-DSA-65, ML-KEM-1024, SLH-DSA) use constant-time primitives internally
  4. Ed25519 Layer: Dedicated fe25519_sq() field squaring, C11 _Atomic initialization guards

Constant-Time Implementations

C Utilities (src/c/ama_consttime.c)

All 5 constant-time functions are implemented and verified:

| Function | Purpose | Implementation | dudect Verified |
| --- | --- | --- | --- |
| ama_consttime_memcmp() | Byte array comparison | XOR accumulation without early exit | Yes |
| ama_secure_memzero() | Secure memory clearing | Volatile pointer to prevent optimization | Yes |
| ama_consttime_swap() | Conditional buffer swap | Bitwise masking based on condition | Yes |
| ama_consttime_lookup() | Table lookup | Full table scan with conditional copy | Yes |
| ama_consttime_copy() | Conditional copy | Bitwise masking based on condition | Yes |
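The techniques named in the table can be modeled in a few lines. The sketch below is a pure-Python model of the arithmetic only: the function names are hypothetical, it is not the code in src/c/ama_consttime.c, and Python itself gives no timing guarantees. It is meant to show what "XOR accumulation", "bitwise masking", and "full table scan" compute.

```python
from typing import Sequence

# Hypothetical model functions; they mirror the arithmetic, not the timing,
# of the C utilities. Python integers carry no constant-time guarantee.

def ct_memcmp_model(a: bytes, b: bytes) -> int:
    """Return 0 if two equal-length buffers match, non-zero otherwise."""
    if len(a) != len(b):
        raise ValueError("buffers must have equal length")
    acc = 0
    for x, y in zip(a, b):
        acc |= x ^ y  # accumulate every difference; never exit early
    return acc

def ct_swap_model(a: bytearray, b: bytearray, condition: int) -> None:
    """Swap a and b in place when condition is 1; leave them as-is when 0."""
    if len(a) != len(b):
        raise ValueError("buffers must have equal length")
    mask = (-condition) & 0xFF  # 0xFF when condition == 1, 0x00 when 0
    for i in range(len(a)):
        t = mask & (a[i] ^ b[i])
        a[i] ^= t
        b[i] ^= t

def ct_lookup_model(table: Sequence[bytes], index: int) -> bytes:
    """Read every entry and keep only the one at index."""
    out = bytearray(len(table[0]))
    for i, entry in enumerate(table):
        diff = i ^ index
        nonzero = ((diff | -diff) >> 63) & 1  # 1 when i != index, else 0
        mask = (nonzero - 1) & 0xFF           # 0xFF when i == index, else 0x00
        for j in range(len(out)):
            out[j] |= entry[j] & mask
    return bytes(out)

a, b = bytearray(b"\x01\x02"), bytearray(b"\x03\x04")
ct_swap_model(a, b, 1)
print(ct_memcmp_model(b"abc", b"abc"), a.hex(), b.hex())
```

In each case the sequence of operations is the same for every input; only the data flowing through the masks changes. That property is what the C versions rely on and what the dudect harness checks.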

Python Utilities (ama_cryptography/crypto_api.py)

HMAC verification uses Python's hmac.compare_digest():

import hmac

def hmac_verify(message: bytes, tag: bytes, key: bytes) -> bool:
    # Recompute the tag, then compare without stopping at the first mismatch.
    expected_tag = hmac_authenticate(message, key)
    return hmac.compare_digest(expected_tag, tag)

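hmac.compare_digest() is designed to resist timing attacks: its running time does not depend on the position of the first differing byte. It can still reveal the lengths and types of its arguments, so both tags should have the same fixed length.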
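A minimal usage sketch of the verification flow follows. Here hmac_authenticate is a hypothetical stand-in built on HMAC-SHA-256; the helper in ama_cryptography/crypto_api.py may use a different hash or signature, so treat that part as an assumption.

```python
import hashlib
import hmac

def hmac_authenticate(message: bytes, key: bytes) -> bytes:
    # Hypothetical stand-in: the library's real helper and hash may differ.
    return hmac.new(key, message, hashlib.sha256).digest()

def hmac_verify(message: bytes, tag: bytes, key: bytes) -> bool:
    expected_tag = hmac_authenticate(message, key)
    return hmac.compare_digest(expected_tag, tag)

key = b"k" * 32
msg = b"attack at dawn"
tag = hmac_authenticate(msg, key)

# Flip one bit in the first byte to simulate a forged tag.
forged = bytes([tag[0] ^ 0x01]) + tag[1:]

print(hmac_verify(msg, tag, key))     # genuine tag
print(hmac_verify(msg, forged, key))  # forged tag
```

Comparing with == instead would typically return as soon as the first byte differs, which is the behavior an attacker can exploit to recover a valid tag one byte at a time.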

Verification Methodology

dudect-Style Timing Analysis

We provide a dudect-style timing analysis harness based on the methodology from:

Reparaz, O., Balasch, J., & Verbauwhede, I. (2017). "Dude, is my code constant time?" https://eprint.iacr.org/2016/1123.pdf

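The harness uses Welch's t-test to compare execution times between two input classes. A result of |t| >= 4.5 rejects the hypothesis that both classes have the same mean execution time with more than 99.999% confidence. A result of |t| < 4.5 after 10^6 measurements means that no leakage was detected at that sample size; it is not proof that none exists.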
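The statistic itself is short enough to sketch. The following is an illustrative Python version of Welch's t computed over two lists of timings, with hypothetical function names. It is not the harness code, which is written in C and may compute the statistic incrementally.

```python
import math
from statistics import fmean, variance

T_THRESHOLD = 4.5  # threshold used throughout this guide

def welch_t(class0, class1):
    """Welch's t: (mean0 - mean1) / sqrt(var0/n0 + var1/n1)."""
    n0, n1 = len(class0), len(class1)
    m0, m1 = fmean(class0), fmean(class1)
    v0, v1 = variance(class0), variance(class1)  # sample variances
    se = math.sqrt(v0 / n0 + v1 / n1)
    if se == 0.0:
        # Degenerate case: no spread in either class.
        return 0.0 if m0 == m1 else math.copysign(math.inf, m0 - m1)
    return (m0 - m1) / se

def leakage_detected(class0, class1):
    return abs(welch_t(class0, class1)) >= T_THRESHOLD

# Two classes whose means differ by one unit of standard error.
same_shape = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
print(same_shape)  # -1.0

# Two classes separated by far more than their spread.
print(leakage_detected([100.0, 101.0] * 50, [110.0, 111.0] * 50))  # True
```

Because the denominator shrinks as the sample count grows, a fixed timing difference produces a larger |t| with more measurements. This is why the full 1M-iteration run is more sensitive than the quick test.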

Running the Verification

Quick Test (100K iterations)

cd tools/constant_time
make
make test

Full Test (1M iterations, recommended)

cd tools/constant_time
make
make test-full

Manual Execution

cd tools/constant_time
make
./dudect_harness 1000000

Expected Output

=======================================================
dudect-style Constant-Time Verification Harness
AMA Cryptography Cryptographic Library
=======================================================

Methodology: Welch's t-test on execution times
Threshold: |t| < 4.5 (99.999% confidence)
Iterations: 1000000 per test

Testing ama_consttime_memcmp (1000000 iterations)...
Testing ama_consttime_swap (1000000 iterations)...
Testing ama_secure_memzero (1000000 iterations)...

=======================================================
Results Summary
=======================================================
  ama_consttime_memcmp: t = 0.1234 [PASS - no leakage detected]
  ama_consttime_swap  : t = -0.5678 [PASS - no leakage detected]
  ama_secure_memzero  : t = 0.0912 [PASS - no leakage detected]

Overall: PASS - No timing leakage detected
=======================================================

Interpreting Results

  • |t| < 4.5: No leakage detected (PASS)
  • |t| >= 4.5: Timing leakage suspected; re-run to rule out measurement noise, then investigate

Note: Environmental factors such as CPU frequency scaling, interrupts, and cache effects can cause false positives. Run the test multiple times and consider disabling CPU frequency scaling for more accurate results.

Harness Setup-Symmetry Discipline

Two lanes in tests/c/test_dudect.c (test_consttime_memcmp and test_frost_scalar_negate_midrange) were hardened in v3.2.0 against a false-positive class identified on noisy CI runners. The underlying primitives (ama_consttime_memcmp and FROST scalar_negate) are byte-by-byte branchless in source, but the harnesses fed them inputs through asymmetric setup paths — class 1 in test_consttime_memcmp made an extra rand() call and one extra branch-conditional write before the timer started, and test_frost_scalar_negate_midrange served class-0 inputs from a stack array while class-1 came from .rodata. The pre-timer asymmetries (branch-predictor state, cache line provenance, libc call frequency) bled into the timed window and surfaced as ~+12σ and ~−6σ false-positive readings respectively.

The post-fix pattern, codified at the top of each lane in tests/c/test_dudect.c:

  1. Perform identical setup work for both classes (same rand() draws, same memcpy count, same conditional writes — driven by an index that is independent of class_idx).
  2. Stage every reference input into the same memory class (typically the local stack frame) so the kernel reads them through equivalent cache paths.
  3. Pointer-select between the two staged inputs OUTSIDE the timing region. The timed window contains exactly one indirect call with no class-correlated control flow.

Future dudect lanes should follow the same discipline. Helper patterns: a b_equal / b_diff pair for compare-style primitives, a single stack-staged reference for scalar-input primitives.
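The three rules can be illustrated structurally. The sketch below is a Python model with hypothetical names, not the code in tests/c/test_dudect.c. Python cannot reproduce stack versus .rodata placement or branch-predictor effects, so it shows only the shape of a lane: identical setup, identical staging, and selection outside the timed window.

```python
import hmac
import os
import time

def collect_timings(func, n_iters=1000, size=32):
    """Return (class0_times, class1_times) in nanoseconds for func(ref, x)."""
    timings = ([], [])
    for _ in range(n_iters):
        # Rule 1: identical setup work, whatever class is drawn later.
        reference = os.urandom(size)
        class_idx = os.urandom(1)[0] & 1

        # Rule 2: stage both inputs through the same construction path.
        staged = bytearray(reference)
        b_equal = bytes(staged)   # class 0: equal to the reference
        staged[0] ^= 0x01
        b_diff = bytes(staged)    # class 1: differs in one byte

        # Rule 3: select the input before the timer starts.
        chosen = (b_equal, b_diff)[class_idx]

        start = time.perf_counter_ns()
        func(reference, chosen)   # the only work inside the timed window
        elapsed = time.perf_counter_ns() - start

        timings[class_idx].append(elapsed)
    return timings

class0, class1 = collect_timings(hmac.compare_digest, n_iters=1000)
print(len(class0), len(class1))
```

The two returned lists are what a Welch's t-test would then compare. Both inputs are built on every iteration even though only one is used, which keeps the pre-timer work independent of the class.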

ctgrind/Valgrind Verification

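For a complementary check that does not depend on timing statistics, you can run the harness under Valgrind in the style of ctgrind (constant-time grind). The technique marks secret buffers as uninitialized so that memcheck reports any branch or memory index that depends on them. Without that marking, memcheck reports only ordinary memory errors.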

Installation

# Install Valgrind
sudo apt-get install valgrind

# Clone ctgrind (optional, for ct_poison/ct_unpoison macros)
git clone https://github.com/agl/ctgrind.git

Running ctgrind Analysis

cd tools/constant_time
make

# Run under Valgrind with memcheck
valgrind --tool=memcheck --track-origins=yes ./dudect_harness 10000

# For more detailed analysis, use cachegrind
valgrind --tool=cachegrind ./dudect_harness 10000

Expected Valgrind Output

A clean run should show:

  • No memory errors
  • No uninitialized value usage
  • Consistent cache behavior across input classes

Upstream Library Guarantees

Native PQC (ML-DSA-65, ML-KEM-1024, SLH-DSA-256f)

The native C implementations provide constant-time operations:

  • All NTT and polynomial arithmetic use constant-time primitives
  • No secret-dependent branches or memory accesses
  • Functional correctness validated through NIST KAT (Known Answer Test) vectors (FIPS 203/204/205); KATs do not test timing behavior
  • Rejection sampling uses constant-time comparisons

Native Ed25519 (src/c/ama_ed25519.c)

The native C Ed25519 implementation provides constant-time operations:

  • Constant-time scalar multiplication (Montgomery ladder)
  • No secret-dependent branches or memory accesses
  • Dedicated fe25519_sq() exploiting multiplication symmetry (~55 muls vs ~100)
  • C11 _Atomic with memory_order_acquire/memory_order_release for thread-safe base point initialization
  • Fallback to volatile for pre-C11 compilers (MSVC compatibility)
  • Sign/verify roundtrip validated against RFC 8032 Test Vector 1 (12 tests)

Native AES-256-GCM (src/c/ama_aes_gcm.c)

Caveat: The AES-256-GCM implementation uses a 256-byte lookup table for S-box operations. This is not constant-time with respect to cache-timing side channels in shared-tenant environments. For deployments where cache-timing attacks are a concern, use hardware AES-NI instructions or a bitsliced implementation.
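The size of the exposure can be estimated with simple arithmetic. The sketch below assumes a 64-byte cache line and a table aligned to a line boundary; neither assumption comes from src/c/ama_aes_gcm.c, and the names are hypothetical. It shows that a 256-byte S-box spans four cache lines, so an observer who learns which line was touched learns the top two bits of the lookup index.

```python
CACHE_LINE_BYTES = 64  # assumption: common line size on x86-64 and many ARM cores
SBOX_BYTES = 256       # table size stated in the caveat above

def cache_line_of(index: int, base_offset: int = 0) -> int:
    """Cache line (relative to the table start) touched by S-box index."""
    return (base_offset + index) // CACHE_LINE_BYTES

# With an aligned table, all 256 indices fall into four distinct lines.
lines_touched = sorted({cache_line_of(i) for i in range(SBOX_BYTES)})
print(lines_touched)  # [0, 1, 2, 3]

# Bits of the index exposed per observed lookup, under these assumptions.
leaked_bits = (len(lines_touched) - 1).bit_length()
print(leaked_bits)  # 2
```

In the first AES round the S-box index is a plaintext byte XORed with a key byte, so leaked index bits translate directly into key bits when the plaintext is known. Hardware AES instructions and bitsliced code avoid the issue by performing no secret-indexed memory access at all.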

Functional Correctness Tests

In addition to timing analysis, we provide functional correctness tests for the constant-time utilities:

cd tests/c
# Build and run C tests (requires CMake)
mkdir build && cd build
cmake ..
make
./test_consttime

These tests verify:

  • ama_consttime_memcmp: Identical buffers return 0, different buffers return non-zero
  • ama_secure_memzero: Buffer is completely zeroed
  • ama_consttime_swap: Buffers are swapped when condition=1, unchanged when condition=0

Limitations and Caveats

  1. Statistical Nature: Timing analysis is statistical and cannot prove the absence of all timing leaks. It can only detect leaks above a certain threshold.

  2. Environment Sensitivity: Results depend on the execution environment. Factors like CPU microarchitecture, OS scheduler, and system load can affect measurements.

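  3. Compiler Optimizations: Aggressive compiler optimizations may introduce timing variations, for example by turning branchless source code into conditional branches. The harness is compiled with -O2, which balances optimization with predictability, so results apply only to builds that use the same compiler and flags.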

  4. Scope: This verification covers the C constant-time utilities. The Python layer relies on hmac.compare_digest() and upstream library guarantees.

Recommendations for Production

  1. Run verification on target hardware: Timing characteristics vary by CPU architecture.

  2. Disable CPU frequency scaling: For accurate measurements, set CPU governor to "performance":

    sudo cpupower frequency-set -g performance

  3. Isolate the test: Run on an otherwise idle system to minimize interference.

  4. Regular re-verification: Re-run timing analysis after any changes to cryptographic code paths.

  5. Independent audit: For high-security deployments, engage a third-party security firm to perform formal constant-time verification.
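Before a timing run, the governor setting from recommendation 2 can be checked programmatically. The following is a small sketch that reads the standard Linux cpufreq sysfs file for cpu0 only; the function names are hypothetical, and on systems without cpufreq support the file will not exist.

```python
from pathlib import Path
from typing import Optional

# Standard Linux cpufreq sysfs location; only cpu0 is inspected here.
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")

def read_governor(path: Path = GOVERNOR_PATH) -> Optional[str]:
    """Return the current governor name, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None

def ready_for_timing(path: Path = GOVERNOR_PATH) -> bool:
    """True only when the governor is confirmed to be 'performance'."""
    return read_governor(path) == "performance"

governor = read_governor()
if governor is None:
    print("governor unknown (no cpufreq sysfs entry)")
else:
    print(f"cpu0 governor: {governor}")
```

A stricter check would iterate over every cpu*/cpufreq directory, since governors can be set per core.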

References

  1. Reparaz, O., Balasch, J., & Verbauwhede, I. (2017). "Dude, is my code constant time?" https://eprint.iacr.org/2016/1123.pdf

  2. Langley, A. "ctgrind" - Valgrind-based constant-time verification. https://github.com/agl/ctgrind

  3. NIST FIPS 204 - ML-DSA (Dilithium) Standard. https://csrc.nist.gov/pubs/fips/204/final

  4. Open Quantum Safe Project. https://openquantumsafe.org/

  5. AMA Cryptography Ed25519 Implementation. src/c/ama_ed25519.c