Add certificate renewal infrastructure for Collectors #25537
Also add optional authority_key_identifier field.
CertificateBuilder#signCsr now ensures a signed certificate never outlives its issuer. When the requested lifetime exceeds the signing cert's remaining validity, it is automatically capped.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
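The capping rule described in this commit message can be sketched as follows. This is a minimal illustration, not the code from CertificateBuilder: the class name LifetimeCap, the method cappedNotAfter, and its parameters are hypothetical, and it assumes the requested lifetime is expressed as a Duration starting at the time of signing.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not the PR's code): picks the notAfter for a newly signed certificate.
final class LifetimeCap {
    private LifetimeCap() {}

    // Returns now + requested, unless that would fall after the issuer's own notAfter,
    // in which case the issuer's notAfter is returned instead.
    static Instant cappedNotAfter(Instant now, Duration requested, Instant issuerNotAfter) {
        Instant requestedNotAfter = now.plus(requested);
        return requestedNotAfter.isAfter(issuerNotAfter) ? issuerNotAfter : requestedNotAfter;
    }
}
```

With an issuer that expires in 10 days, a request for 30 days would be cut to the issuer's expiry, while a request for 5 days would pass through unchanged.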
CollectorCaKeyManager now extends X509ExtendedKeyManager instead of implementing X509KeyManager. Netty uses SSLEngine for TLS handshakes, and the JDK wraps plain X509KeyManager in an adapter that adds endpoint identification checks. X509ExtendedKeyManager is used directly, bypassing the wrapper.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
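For readers unfamiliar with the two JDK types, the following sketch shows the general shape of a key manager that extends X509ExtendedKeyManager and answers SSLEngine-based alias requests itself. It is an assumption-laden illustration, not CollectorCaKeyManager: the class name SupplierBackedKeyManager, the fixed alias "server", and the Supplier-based lookup (standing in for a cache) are all hypothetical.

```java
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import java.util.function.Supplier;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.X509ExtendedKeyManager;

// Hypothetical sketch: key material is fetched on every call, so a renewed
// certificate is picked up by the next handshake without a restart.
final class SupplierBackedKeyManager extends X509ExtendedKeyManager {
    static final String ALIAS = "server"; // hypothetical fixed alias

    private final Supplier<PrivateKey> key;
    private final Supplier<X509Certificate[]> chain;

    SupplierBackedKeyManager(Supplier<PrivateKey> key, Supplier<X509Certificate[]> chain) {
        this.key = key;
        this.chain = chain;
    }

    // Called for SSLEngine-based handshakes (the path Netty uses).
    @Override
    public String chooseEngineServerAlias(String keyType, Principal[] issuers, SSLEngine engine) {
        return ALIAS;
    }

    // Called for Socket-based handshakes.
    @Override
    public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return ALIAS;
    }

    @Override
    public String[] getServerAliases(String keyType, Principal[] issuers) {
        return new String[] {ALIAS};
    }

    @Override
    public X509Certificate[] getCertificateChain(String alias) {
        return ALIAS.equals(alias) ? chain.get() : null;
    }

    @Override
    public PrivateKey getPrivateKey(String alias) {
        return ALIAS.equals(alias) ? key.get() : null;
    }

    // This sketch is server-side only, so no client aliases are offered.
    @Override
    public String[] getClientAliases(String keyType, Principal[] issuers) {
        return null;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers, Socket socket) {
        return null;
    }
}
```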
Since we look up the issuer cert by SKI, we need to ensure we don't accept just any issuer cert.
Pull request overview
Introduces certificate renewal support for the Collector mTLS endpoint by adding dynamic TLS certificate loading (key/trust managers backed by a cache) plus X.509 SKI/AKI plumbing to support overlapping signing certs during renewal.
Changes:
- Add SKI/AKI persistence + lookup (including Mongo index + service lookup method) and cap signed cert lifetimes to issuer remaining validity.
- Add Collector CA renewal logic (leader-only periodical) and dynamic Netty TLS wiring via cache-backed X509ExtendedKeyManager/X509ExtendedTrustManager.
- Add extensive unit/integration tests covering renewal behavior, key/trust manager behavior, and SKI-based issuer resolution.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateBuilder.java | Adds serial RNG, issuer-lifetime capping, and SKI/AKI handling for CSR-signed certs. |
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateEntry.java | Adds authority_key_identifier field to persisted certificate records. |
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateService.java | Adds SKI index + SKI lookup method; preserves AKI in DN enrichment. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaService.java | Implements renewal logic + clock injection; refactors cert creation helpers. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaCache.java | Introduces Caffeine-backed cache for CA/signing/server certs + SKI lookups and event invalidation. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaKeyManager.java | Provides dynamic server key/cert selection for TLS without restarts. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaTrustManager.java | Adds SKI/AKI-based issuer resolution + EKU/basic-constraints enforcement for client certs. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorTLSUtils.java | Centralizes OTLP server SslContextBuilder creation using dynamic key/trust managers. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsConfigService.java | Posts a cluster event when cert IDs change to trigger cache invalidation. |
| graylog2-server/src/main/java/org/graylog/collectors/events/CollectorCaConfigUpdated.java | New cluster event type for CA config changes. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsModule.java | Wires new cache/key/trust/TLS utils singletons and initializer. |
| graylog2-server/src/main/java/org/graylog/collectors/input/transport/CollectorIngestHttpTransport.java | Switches OTLP server TLS setup to CollectorTLSUtils. |
| graylog2-server/src/main/java/org/graylog/collectors/periodical/CollectorCaRenewalPeriodical.java | Makes renewal leader-only, more frequent, and exception-safe. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsConfig.java | Adds toBuilder() to support updating cert IDs during renewal. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateBuilderTest.java | Adds tests for lifetime capping + CSR signing extensions/behavior. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateEntryTest.java | Updates record/JSON/Mongo roundtrip tests for new AKI field. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateServiceTest.java | Adds SKI lookup tests and updates fixtures for AKI. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaServiceTest.java | Adds renewal-threshold and renewal-cascade tests; updates TLS builder test. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaCacheTest.java | New tests for cache behavior, invalidation, and SKI lookups. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaKeyManagerTest.java | New tests validating key manager alias/chain/key behavior. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaTrustManagerTest.java | New tests for issuer resolution, EKU/basic-constraints, and validation failures. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorTLSUtilsIT.java | New Netty-based integration test for end-to-end mTLS handshake with dynamic managers. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorsConfigServiceTest.java | New tests asserting config changes emit CA cache invalidation events. |
| graylog2-server/src/test/java/org/graylog/collectors/opamp/auth/AgentTokenServiceTest.java | Updates construction for new CollectorsConfigService/CollectorCaService signatures. |
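The EKU and basic-constraints enforcement listed for CollectorCaTrustManager in the table above can be illustrated with standard JDK calls. This is only a sketch of the idea, not the PR's implementation: the class ClientCertChecks and its method names are hypothetical. It relies on X509Certificate#getExtendedKeyUsage returning the EKU OIDs (or null when the extension is absent) and X509Certificate#getBasicConstraints returning -1 for non-CA certificates.

```java
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.List;

// Hypothetical helper (not the PR's code) for vetting a client identity certificate.
final class ClientCertChecks {
    // OID of the id-kp-clientAuth extended key usage.
    static final String CLIENT_AUTH_OID = "1.3.6.1.5.5.7.3.2";

    private ClientCertChecks() {}

    // Accepts only certificates that carry the clientAuth EKU and are not CA certificates.
    static boolean isAcceptableClientIdentity(List<String> extendedKeyUsage, int basicConstraints) {
        boolean hasClientAuth = extendedKeyUsage != null && extendedKeyUsage.contains(CLIENT_AUTH_OID);
        boolean isCa = basicConstraints != -1; // -1 means "not a CA" per the JDK contract
        return hasClientAuth && !isCa;
    }

    // Convenience overload that reads both values from a parsed certificate.
    static boolean isAcceptableClientIdentity(X509Certificate cert) throws CertificateParsingException {
        return isAcceptableClientIdentity(cert.getExtendedKeyUsage(), cert.getBasicConstraints());
    }
}
```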
The JDK can report Ed25519 keys as either "EdDSA" or "Ed25519" depending on the provider. chooseServerAlias now accepts both names, consistent with CertificateBuilder and PemUtils.
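A check that tolerates both algorithm names might look like the sketch below. The class Ed25519KeyTypes and its method are hypothetical and are not taken from chooseServerAlias; the sketch also assumes an exact, case-sensitive match on the two names mentioned above.

```java
import java.util.Set;

// Hypothetical helper (not the PR's code): treats both provider-specific names as Ed25519.
final class Ed25519KeyTypes {
    // The two names mentioned in the commit message; matching is case-sensitive (assumption).
    private static final Set<String> NAMES = Set.of("EdDSA", "Ed25519");

    private Ed25519KeyTypes() {}

    static boolean isEd25519KeyType(String keyType) {
        // Set.of(...) rejects null lookups, so guard explicitly.
        return keyType != null && NAMES.contains(keyType);
    }
}
```

Note that "EdDSA" is a family name that can also cover other curves, so a real implementation may additionally want to inspect the key parameters.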
kroepke left a comment
Overall 👍, one minor nitpick and one question about how we want to deal with failures during renewal, but I won't block merge for that.
Warning
We added SKI/AKI values to the certificates and the database. Testing this PR requires a full reset of the Collector CA! We can't write a migration for this.
- X509KeyManager and X509TrustManager implementations fetch certificates from a Caffeine cache that is invalidated via cluster events when the CA config changes.
- CollectorCaService#renewCertificates checks signing and OTLP server cert lifetimes and renews them when less than 20% of their lifetime remains, with cascading re-issue of the server cert when the signing cert is renewed.
- CertificateBuilder#signCsr caps the requested lifetime to the issuer's remaining validity so a signed certificate never outlives its signing cert.
- CollectorCaTrustManager checks for client certificate EKU (clientAuth) and basic constraints (rejects CA certs used as client identities).
- Mongo index on subject_key_identifier for efficient SKI-based lookups.

Fixes #25231
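The 20% renewal rule from the description can be sketched with plain java.time arithmetic. This is an illustration under assumptions, not the logic in CollectorCaService: the class RenewalThreshold and the method needsRenewal are hypothetical, and the sketch treats a certificate with exactly 20% of its lifetime left as not yet due.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not the PR's code): decides whether a certificate is due for renewal.
final class RenewalThreshold {
    private RenewalThreshold() {}

    // True when less than 20% of the certificate's total lifetime remains at `now`.
    static boolean needsRenewal(Instant notBefore, Instant notAfter, Instant now) {
        long totalMillis = Duration.between(notBefore, notAfter).toMillis();
        long remainingMillis = Duration.between(now, notAfter).toMillis();
        if (totalMillis <= 0) {
            return true; // degenerate validity window: treat as due
        }
        // remaining / total < 1/5, written with integers to avoid floating-point rounding.
        return remainingMillis * 5 < totalMillis;
    }
}
```

For a certificate valid for 10 days, this reports false with 3 days left (30%) and true with 1 day left (10%) or after expiry.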
/nocl extending unreleased feature