PYTHON-5040 Regenerate test TLS certificates with Authority Key Identifier #2846

Draft
blink1073 wants to merge 27 commits into mongodb:master from blink1073:PYTHON-5040

Conversation

blink1073 commented Jun 4, 2026

Member

PYTHON-5040

Depends on mongodb-labs/drivers-evergreen-tools#791

Changes

Regenerates test TLS certificates with the extensions required by Python 3.13/3.14 (AuthorityKeyIdentifier, SubjectKeyIdentifier, and critical basicConstraints/keyUsage on the CA). Adds a reproducible generation script and README. Also fixes the KMS mock server setup to present AKI/SKI-enabled certs, and reverts the PYTHON-5038 workaround that had disabled SSL verification in TestKmsRetryProse.http_post.

See test/certificates/README.md for the certificate design rationale, including the constraints that prevent adding certain extensions to the CA cert and MongoDB server certs.

Test Plan

  • TestKmsRetryProse::test_kms_retry (sync + async) should pass on Python 3.14 on macOS and Windows across all tested MongoDB versions.
  • Existing SSL integration tests should continue to pass.

Passing build: https://spruce.corp.mongodb.com/version/6a29aafe6b2a520007d54a69

Checklist

Checklist for Author

  • Did you update the changelog (if necessary)?
  • Is there test coverage? (existing SSL and encryption tests cover these certs)
  • Is any followup work tracked in a JIRA ticket?

Checklist for Reviewer

  • Does the title of the PR reference a JIRA Ticket?
  • Do you fully understand the implementation? (Would you be comfortable explaining how this code works to someone else?)
  • Is all relevant documentation (README or docstring) updated?

…ifier

Test certificates in test/certificates/ were missing the Authority Key
Identifier (AKI) and Subject Key Identifier (SKI) extensions, causing
ssl.SSLCertVerificationError on Python 3.13 (macOS and Windows).

Adds gen-certs.sh to document and reproduce the generation process.
Reverts the PYTHON-5038 workaround that had disabled SSL verification
in TestKmsRetryProse.http_post().
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

blink1073 added 23 commits June 4, 2026 07:43
setup_tests.py was pointing CLIENT_PEM and CA_PEM at the x509gen certs
from drivers-evergreen-tools, which were derived from the old
test/certificates/ca.pem. After regenerating that CA with a new key
pair, the server (which uses test/certificates/) and the client (which
trusted x509gen/ca.pem) no longer agreed on the CA, causing
ssl.SSLCertVerificationError in SSL auth tasks.
Set TLS_PEM_KEY_FILE, TLS_CA_FILE, and TLS_CERT_KEY_FILE to
test/certificates/ so that run-mongodb.sh uses our regenerated certs
when the SSL server is started, and async_client_context connects with
a CA that matches the server cert.
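
A minimal sketch of the environment described above, assuming the three variables map to server.pem, ca.pem, and client.pem respectively. The pairing of variable to file and the helper name `tls_env_for` are illustrative assumptions, not taken from setup_tests.py:

```python
from pathlib import Path


def tls_env_for(cert_dir):
    """Build TLS_* settings pointing at one certificate directory.

    Hypothetical helper: the file assigned to each variable is an assumption.
    """
    cert_dir = Path(cert_dir)
    return {
        # Server certificate and key presented by mongod (assumed file).
        "TLS_PEM_KEY_FILE": str(cert_dir / "server.pem"),
        # CA that both server and client should trust.
        "TLS_CA_FILE": str(cert_dir / "ca.pem"),
        # Client certificate and key used by the tests (assumed file).
        "TLS_CERT_KEY_FILE": str(cert_dir / "client.pem"),
    }


env = tls_env_for("test/certificates")
```

Deriving all three paths from a single directory is what keeps the server and the client agreeing on the CA; a caller could apply the result with `os.environ.update(env)` before starting the server.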
Set TLS_PEM_KEY_FILE, TLS_CA_FILE, and TLS_CERT_KEY_FILE on the
setup-mongodb-ssl workflow step so run-mongodb.sh uses our regenerated
test/certificates/ certs. async_client_context already trusts
test/certificates/ca.pem by default (helpers_shared.py), so server and
client now agree on the CA.

Also reverts setup_tests.py and integration_tests/run.sh to their
state before the failed x509gen fix attempts.
setup_tests.py was pointing CLIENT_PEM and CA_PEM at x509gen/ certs from
drivers-evergreen-tools, which were derived from the old ca.pem. After
regenerating test/certificates/ with a new CA key, the server
(test/certificates/) and client (x509gen/) no longer agree on the CA.

Switch both to test/certificates/ to match the server cert.
The CSFLE mock KMS servers were started using x509gen certs that lack
the Authority Key Identifier extension, causing Python 3.13 to reject
them with ssl.SSLCertVerificationError.

- Set CSFLE_TLS_CA_FILE and CSFLE_TLS_CERT_FILE to test/certificates/
  in setup_tests.py so the KMIP server and HTTP mock servers use our
  AKI-enabled certs.
- Add wrong-host.pem (SAN: wronghost.example.com) and expired.pem to
  test/certificates/ and gen-certs.sh for use in KMS TLS error tests.
Two test failures from regenerated certs:

1. test_mongodb_x509_auth: MongoDB derives the x509 username from the
   cert subject using RFC 4514 reverse order. The old client cert stored
   the subject with CN first so the reversed form matched
   MONGODB_X509_USERNAME ("C=US,...,CN=client"). Our new cert stored
   C=US first, reversing to "CN=client,...,C=US". Fix: use CN-first
   subject order (/CN=client/OU=.../C=US) in gen-certs.sh.

2. test_tlsCRLFile_support: The test verifies CRL enforcement works by
   connecting with tlsCRLFile and expecting ConnectionFailure. This
   requires the server cert to be listed as revoked in crl.pem. Fix:
   sign the server cert via `openssl ca` (tracked in the CA database),
   revoke it, then generate the CRL with the revoked entry.
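
The subject-order issue in item 1 can be sketched as follows. This is a simplified illustration that assumes no escaped `/` or `,` characters in the attribute values; the `OU=Drivers` value and the function name are hypothetical:

```python
def rfc4514_from_openssl_subj(subj):
    """Convert an OpenSSL-style "/A=x/B=y" subject to an RFC 4514 string.

    RFC 4514 lists RDNs in the reverse of their encoded order, so a
    CN-first subject ends with CN in the string form.
    Naive sketch: does not handle escaped separators.
    """
    rdns = [part for part in subj.split("/") if part]
    return ",".join(reversed(rdns))


# CN-first subject order, as used for the client cert.
cn_first = rfc4514_from_openssl_subj("/CN=client/OU=Drivers/C=US")
# C-first subject order, which produced the mismatching username.
c_first = rfc4514_from_openssl_subj("/C=US/OU=Drivers/CN=client")
```

Here `cn_first` is `"C=US,OU=Drivers,CN=client"` and `c_first` is `"CN=client,OU=Drivers,C=US"`, which mirrors the mismatch described above.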
- configure-env.sh: clone blink1073/allow-cert-folder-override branch
  of drivers-evergreen-tools which adds CSFLE_TLS_WRONG_HOST_FILE and
  CSFLE_TLS_EXPIRED_FILE support for overriding hardcoded cert paths
- setup_tests.py: set all five CSFLE_TLS_* env vars before setup-secrets.sh
  runs so they flow through csfle/setup_secrets.py into secrets-export.sh;
  load_config_from_file persists them for the test runner
- Regenerate test/certificates/ with: root CA without AKI (avoids macOS
  CSSMERR_TP_CERT_SUSPENDED), CN-first client subject (fixes x509 auth
  username), server cert revoked in CRL (fixes tlsCRLFile test), and
  wrong-host.pem/expired.pem for KMS TLS error tests
Two macOS/Python 3.13 issues with the regenerated certs:

1. CSSMERR_TP_CERT_SUSPENDED on macOS SSL replica sets: the issuer
   component in leaf cert AKI (authorityKeyIdentifier=keyid,issuer)
   triggers macOS Secure Transport to do an online revocation lookup for
   the CA. With no OCSP/CRL URL present, this fails with CERT_SUSPENDED.
   Fix: use authorityKeyIdentifier=keyid (no issuer) on leaf certs.

2. "CA cert does not include key usage extension" on Python 3.13 macOS:
   the CA cert was missing a keyUsage extension. Fix: add
   keyUsage=critical,keyCertSign,cRLSign to the CA and trusted-CA certs.
macOS Secure Transport treats cRLSign in the CA keyUsage as a signal
that CRLs exist for this CA and performs CRL revocation checking. Since
our server cert IS revoked in crl.pem (required for
test_tlsCRLFile_support), macOS marks it as CSSMERR_TP_CERT_SUSPENDED
and the mongod SSL replica set fails to initialise.

Python 3.13 only requires that keyUsage is present on CA certs, not
specifically cRLSign. Using keyUsage=critical,keyCertSign satisfies
Python 3.13 without triggering macOS CRL enforcement.
…o CA SKI

Replace the OpenSSL shell script with a Python script (gen-certs.py) that uses
the cryptography library for precise extension control.

AKI is present on all leaf certs (required by Python 3.13 / OpenSSL 3.x chain
building), but SKI is intentionally omitted from the CA cert. Without an
explicit SKI on the CA, macOS SecTrust cannot perform keyid-based chain lookup
and therefore does not trigger its hard-fail OCSP check, which was the root
cause of CSSMERR_TP_CERT_SUSPENDED errors during replica-set inter-node TLS.

gen-certs.sh is replaced with a thin wrapper that calls gen-certs.py. OpenSSL
3.6+ automatically injects SKI into every cert it signs regardless of the
extension config, making precise control impossible via the CLI.
… critical flag

Two fixes:

1. Add the id-pkix-ocsp-nocheck extension to server and client certs.
   This tells macOS SecTrust to skip OCSP revocation checking for these
   certs, suppressing CSSMERR_TP_CERT_SUSPENDED during MongoDB replica-set
   inter-node TLS without removing the AKI that Python 3.13 requires.

2. Restore critical=True on the CA basicConstraints extension.
   Python 3.13 on Windows rejects CA certs where basicConstraints is not
   marked critical (ssl.SSLCertVerificationError: Basic Constraints of CA
   cert not marked critical).
Use the issuer form of AKI (DirName + serial, no keyid) on leaf certs.
The keyid form was enabling macOS SecTrust keyid-based chain verification,
which triggered hard-fail OCSP (CSSMERR_TP_CERT_SUSPENDED) because the
test certs have no OCSP URL. The issuer form satisfies Python 3.13 /
OpenSSL 3.x's AKI requirement without providing a keyid, so macOS uses
name-based chain matching and does not attempt OCSP. This matches the
approach used by MongoDB's own jstests/libs server certs.
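
For illustration, the two AKI forms contrasted above could be built with the cryptography library roughly as below. This is a sketch, not the code in gen-certs.py; the function name `aki_forms` is hypothetical, and the import is done lazily because cryptography is a third-party package:

```python
def aki_forms(ca_cert, ca_public_key):
    """Return (keyid_form, issuer_form) AuthorityKeyIdentifier values.

    Hypothetical helper: ca_cert is the issuing CA's x509.Certificate and
    ca_public_key is that CA's public key.
    """
    # Lazy import so the module loads even without cryptography installed.
    from cryptography import x509

    # keyid form: only the keyIdentifier field is populated.
    keyid_form = x509.AuthorityKeyIdentifier.from_issuer_public_key(ca_public_key)

    # issuer form: DirName + serial of the CA certificate, no keyIdentifier.
    issuer_form = x509.AuthorityKeyIdentifier(
        key_identifier=None,
        authority_cert_issuer=[x509.DirectoryName(ca_cert.issuer)],
        authority_cert_serial_number=ca_cert.serial_number,
    )
    return keyid_form, issuer_form
```

Either value would then be attached to a leaf certificate with `builder.add_extension(value, critical=False)`.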

Also add critical keyUsage (keyCertSign, cRLSign) to the CA cert, which
Python 3.13 on Windows (OpenSSL 3.x) now requires on CA certs.

Also remove OCSPNoCheck from leaf certs — macOS ignores it on non-OCSP-
responder certs, and it added unnecessary complexity.
…rver cert

Revert ca.pem to a freshly generated Drivers Testing CA with only
basicConstraints:CA:TRUE (no keyUsage, no SAN, no SKI/AKI).  The
macos-trusted-ca.pem approach failed because that CA is only pre-installed
in the macOS system keychain on MongoDB server CI machines, not on Evergreen
driver CI hosts.  Any CA cert with SAN or SKI/AKI that is not in the macOS
system keychain causes Apple SecTrust (used by MongoDB Enterprise) to attempt
OCSP on the CA cert itself, returning CSSMERR_TP_CERT_SUSPENDED.

The minimal CA profile (basicConstraints only, no extras) matches the original
2019 test CA that worked on macOS for years.

Add server-kms.pem: a server cert WITH AKI (issuer form) used exclusively by
kms_failpoint_server.py.  Python 3.13 / OpenSSL 3.x requires AKI on non-root
certs when verifying the KMS server.  Since kms_failpoint_server.py is a
Python HTTP server (not MongoDB Enterprise), its cert is verified via
OpenSSL — not Apple SecTrust — so AKI does not trigger OCSP issues.

server.pem and client.pem retain no AKI so MongoDB inter-node and x509-auth
TLS continues to work on macOS.
Set the TLS_DISABLE_CERTIFICATE_REVOCATION_CHECK env var on macOS for non-OCSP
SSL tests, which causes mongodb_runner.py to pass
--tlsDisableCertificateRevocationCheck to mongod. This fixes
CSSMERR_TP_CERT_SUSPENDED during replica set initiation on macOS, where
MongoDB Enterprise enforces OCSP with kSecRevocationRequirePositiveResponse.
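
A rough sketch of the env-var-to-flag translation described above. The helper name is hypothetical and the truthiness rule (any non-empty value counts as set) is an assumption; mongodb_runner.py may interpret the variable differently:

```python
import os


def extra_mongod_args(env=None):
    """Return extra mongod flags implied by the environment (hypothetical)."""
    env = os.environ if env is None else env
    args = []
    # Assumption: any non-empty value enables the flag.
    if env.get("TLS_DISABLE_CERTIFICATE_REVOCATION_CHECK"):
        args.append("--tlsDisableCertificateRevocationCheck")
    return args


enabled = extra_mongod_args({"TLS_DISABLE_CERTIFICATE_REVOCATION_CHECK": "1"})
disabled = extra_mongod_args({})
```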
Python 3.14 / OpenSSL 3.x strict mode requires the keyIdentifier field in
the Authority Key Identifier extension. The prior issuer-form AKI (DirName +
serial, no keyid) was insufficient. The macOS OCSP concern that motivated the
issuer form is now resolved via --tlsAllowInvalidCertificates, so
from_issuer_public_key (keyid form) is safe to use.
server-kms.pem has keyid-form AKI required by Python 3.14's strict
cert verification in ssl.create_default_context().  server.pem (the
MongoDB TLS cert) lacks AKI, causing TestKmsRetryProse::test_kms_retry
to fail on both macOS and Windows when connecting to the KMS failpoint
server on port 9003.
RFC 5280 §4.2.1.9 requires basicConstraints to be marked critical on CA
certificates.  Python 3.14 / OpenSSL 3.x strict mode (enabled by
ssl.create_default_context) enforces this, causing TestKmsRetryProse
to fail with "Basic Constraints of CA cert not marked critical".

Change critical=False to critical=True on the Drivers Testing CA and
regenerate all test certificates.  Also skip PEM files in codespell to
avoid false positives from base64-encoded binary data.
Python 3.14 / OpenSSL 3.x strict mode (ssl.create_default_context)
requires CA certificates to have a critical keyUsage extension with
keyCertSign set.  Without it, chain verification fails with "CA cert
does not include key usage extension".

Add critical keyUsage (keyCertSign + cRLSign) to the Drivers Testing
CA, matching the profile already used by the Trusted Kernel Test CA.
No SKI/AKI/SAN added -- those would trigger macOS SecTrust OCSP checks
for the CA, which would fail because the CA has no OCSP URL.
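
The CA profile described above (critical basicConstraints plus critical keyUsage, and nothing else) might look like this with the cryptography library. This is an illustrative sketch rather than the contents of gen-certs.py; `ca_extension_profile` is a hypothetical name, and the import is lazy because cryptography is third-party:

```python
def ca_extension_profile():
    """Return [(extension_value, critical)] for a minimal CA cert.

    Hypothetical helper: deliberately omits SKI, AKI, and SAN.
    """
    # Lazy import so the module loads even without cryptography installed.
    from cryptography import x509

    basic = x509.BasicConstraints(ca=True, path_length=None)
    usage = x509.KeyUsage(
        digital_signature=False,
        content_commitment=False,
        key_encipherment=False,
        data_encipherment=False,
        key_agreement=False,
        key_cert_sign=True,  # required to sign leaf certificates
        crl_sign=True,       # required to sign the CRL
        encipher_only=False,
        decipher_only=False,
    )
    # Both extensions are marked critical.
    return [(basic, True), (usage, True)]
```

A certificate builder could apply the profile with `for ext, critical in ca_extension_profile(): builder = builder.add_extension(ext, critical=critical)`.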

Regenerate all test certificates.
Python 3.14 strict mode (ssl.create_default_context) requires Subject Key
Identifier on non-root leaf certs.  The KMS certs (server-kms.pem,
wrong-host.pem, expired.pem) were missing it.

Adding SKI to the CA cert was a previous wrong fix — it triggers macOS
SecTrust OCSP sweeps on the MongoDB 4.2 server startup path, causing
~67-second connection timeouts in sharded-cluster SSL tests.  The root CA
is self-signed and Python 3.14 only requires SKI on non-root certs, so the
CA can safely omit it.

gen-certs.py updated accordingly: CA omits SKI; KMS leaf certs now include
both AKI and SKI.  Verification section updated to match.
Python 3.14 sets X509_V_FLAG_X509_STRICT in ssl.create_default_context(),
which requires Subject Key Identifier on all certs including the root CA.
We intentionally omit SKI from the CA cert because adding it causes macOS
SecTrust to trigger OCSP revocation checks during MongoDB 4.2 server
startup, resulting in ~67-second connection timeouts.

Using ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT) instead gives the same
security guarantees (certificate verification, hostname checking) without
enabling strict mode, matching pre-Python-3.14 behavior.
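
A small sketch of the non-strict client context described above, using only the standard library. The function name and the optional `ca_file` parameter are illustrative:

```python
import ssl


def make_client_context(ca_file=None):
    """Build a verifying TLS client context without create_default_context().

    PROTOCOL_TLS_CLIENT enables certificate verification and hostname
    checking by default; no strict-mode flag is added here.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        # Trust only the given CA bundle, e.g. a test CA.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = make_client_context()
```

The resulting context has `verify_mode == ssl.CERT_REQUIRED` and `check_hostname` set to `True`, so verification stays on while the stricter checks are not enabled.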
Use bare type: ignore in synchro.py so it suppresses whichever import
error mypy raises depending on whether unasync is installed.  Add
arg-type ignore in gen-certs.py for a cryptography stubs version skew.
blink1073 closed this Jun 11, 2026
blink1073 reopened this Jun 11, 2026

2 participants