
Add certificate renewal infrastructure for Collectors#25537

Merged
bernd merged 34 commits into master from collectors/cert-renewal
Apr 10, 2026

Member

@bernd bernd commented Apr 2, 2026

Warning

We added SKI/AKI values to the certificates and the database. Testing this PR requires a full reset of the Collector CA! We can't write a migration for this.

  • Add dynamic TLS certificate management so the Collector mTLS endpoint supports certificate renewal without server restarts. Custom X509KeyManager and X509TrustManager implementations fetch certificates from a Caffeine cache that is invalidated via cluster events when the CA config changes.
  • Use the X.509 Subject Key Identifier (SKI) and Authority Key Identifier (AKI) extensions for trust chain resolution, so multiple signing certs can coexist during renewal: old collector certs remain trusted because their AKI is matched against the SKI of a signing cert stored in the database.
  • Implement CollectorCaService#renewCertificates, which checks the lifetimes of the signing and OTLP server certs and renews each once less than 20% of its lifetime remains. Renewing the signing cert cascades into re-issuing the server cert.
  • Cap collector cert lifetime in CertificateBuilder#signCsr to the issuer's remaining validity so a signed certificate never outlives its signing cert.
  • Add CollectorCaTrustManager checks for client certificate EKU (clientAuth) and basic constraints (rejects CA certs used as client identities).
  • Add MongoDB index on subject_key_identifier for efficient SKI-based lookups.
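
For readers who want the renewal threshold and the lifetime cap spelled out, here is a minimal sketch using only `java.time`. The class and method names (`CertLifetimeSketch`, `needsRenewal`, `capNotAfter`) are hypothetical and are not taken from `CollectorCaService` or `CertificateBuilder`. It assumes the 20% rule is evaluated against the full notBefore..notAfter window.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not the PR's actual code.
final class CertLifetimeSketch {
    private CertLifetimeSketch() {}

    /** True when less than 20% (one fifth) of the total lifetime remains at {@code now}. */
    static boolean needsRenewal(Instant notBefore, Instant notAfter, Instant now) {
        long total = Duration.between(notBefore, notAfter).toMillis();
        long remaining = Duration.between(now, notAfter).toMillis();
        if (total <= 0 || remaining <= 0) {
            return true; // expired or malformed validity window: renew
        }
        // Integer arithmetic avoids floating-point rounding at the boundary.
        return remaining * 5 < total;
    }

    /** Caps the requested expiry so an issued cert never outlives its issuer. */
    static Instant capNotAfter(Instant now, Duration requested, Instant issuerNotAfter) {
        Instant wanted = now.plus(requested);
        return wanted.isAfter(issuerNotAfter) ? issuerNotAfter : wanted;
    }
}
```

With a 10-day certificate, `needsRenewal` starts returning true once fewer than 2 days remain. `capNotAfter` returns the issuer's own expiry whenever the requested lifetime would extend past it.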

Fixes #25231
/nocl extending unreleased feature

@bernd bernd force-pushed the collectors/cert-renewal branch 2 times, most recently from 78a4679 to 31fbe77 Compare April 7, 2026 09:19
@bernd bernd force-pushed the collectors/cert-renewal branch from 31fbe77 to 5741f26 Compare April 7, 2026 09:50
@bernd bernd changed the title from "Start Collector key manager" to "Add certificate renewal infrastructure for Collectors" Apr 7, 2026
Contributor

Copilot AI left a comment


Pull request overview

Introduces certificate renewal support for the Collector mTLS endpoint by adding dynamic TLS certificate loading (key/trust managers backed by a cache) plus X.509 SKI/AKI plumbing to support overlapping signing certs during renewal.
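
The idea of a key manager that picks up renewed certificates without a restart can be sketched as follows. This is an illustration only: `ReloadingKeyManagerSketch`, `KeyMaterial`, and the fixed alias are hypothetical names, and a plain `Supplier` stands in for the Caffeine-backed cache described in the PR. The one behaviour it shows is that the key material is read on every call rather than captured once at startup.

```java
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import java.util.function.Supplier;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.X509ExtendedKeyManager;

// Hypothetical sketch: server-side key manager that re-reads its material per handshake.
final class ReloadingKeyManagerSketch extends X509ExtendedKeyManager {
    record KeyMaterial(PrivateKey key, X509Certificate[] chain) {}

    static final String ALIAS = "server"; // assumed single alias

    private final Supplier<KeyMaterial> source; // stand-in for a cache lookup

    ReloadingKeyManagerSketch(Supplier<KeyMaterial> source) {
        this.source = source;
    }

    @Override
    public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return ALIAS;
    }

    @Override
    public String chooseEngineServerAlias(String keyType, Principal[] issuers, SSLEngine engine) {
        return ALIAS;
    }

    @Override
    public String[] getServerAliases(String keyType, Principal[] issuers) {
        return new String[] {ALIAS};
    }

    @Override
    public X509Certificate[] getCertificateChain(String alias) {
        // Fetched on each call, so a renewed chain is served to the next handshake.
        return ALIAS.equals(alias) ? source.get().chain() : null;
    }

    @Override
    public PrivateKey getPrivateKey(String alias) {
        return ALIAS.equals(alias) ? source.get().key() : null;
    }

    // Client-side selection is not needed for a server endpoint.
    @Override
    public String[] getClientAliases(String keyType, Principal[] issuers) {
        return null;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers, Socket socket) {
        return null;
    }
}
```

Because the supplier is consulted per call, invalidating the backing cache is enough to make the next TLS handshake use the new certificate.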

Changes:

  • Add SKI/AKI persistence + lookup (including Mongo index + service lookup method) and cap signed cert lifetimes to issuer remaining validity.
  • Add Collector CA renewal logic (leader-only periodical) and dynamic Netty TLS wiring via cache-backed X509ExtendedKeyManager/X509ExtendedTrustManager.
  • Add extensive unit/integration tests covering renewal behavior, key/trust manager behavior, and SKI-based issuer resolution.
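
As a rough illustration of SKI-based issuer resolution, the sketch below keeps signing certs in an in-memory map keyed by their hex-encoded SKI and resolves a client cert's issuer by its AKI. All names (`IssuerLookupSketch`, `SigningCert`, `register`, `findIssuer`) are hypothetical; the PR itself persists these values in MongoDB with an index on `subject_key_identifier`, which the map only imitates.

```java
import java.util.HexFormat;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a database lookup by subject key identifier.
final class IssuerLookupSketch {
    record SigningCert(String name, byte[] subjectKeyId) {}

    private final Map<String, SigningCert> bySki = new ConcurrentHashMap<>();

    /** Registers a signing cert; old and new signing certs can coexist. */
    void register(SigningCert cert) {
        bySki.put(HexFormat.of().formatHex(cert.subjectKeyId()), cert);
    }

    /** Finds the issuer whose SKI equals the given AKI, if any. */
    Optional<SigningCert> findIssuer(byte[] authorityKeyId) {
        if (authorityKeyId == null) {
            return Optional.empty(); // cert carries no AKI extension
        }
        return Optional.ofNullable(bySki.get(HexFormat.of().formatHex(authorityKeyId)));
    }
}
```

A client cert issued before a renewal still resolves to the old signing cert, since the lookup depends only on the key identifier and not on which signing cert is currently active.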

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateBuilder.java | Adds serial RNG, issuer-lifetime capping, and SKI/AKI handling for CSR-signed certs. |
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateEntry.java | Adds authority_key_identifier field to persisted certificate records. |
| graylog2-server/src/main/java/org/graylog/security/pki/CertificateService.java | Adds SKI index + SKI lookup method; preserves AKI in DN enrichment. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaService.java | Implements renewal logic + clock injection; refactors cert creation helpers. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaCache.java | Introduces Caffeine-backed cache for CA/signing/server certs + SKI lookups and event invalidation. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaKeyManager.java | Provides dynamic server key/cert selection for TLS without restarts. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorCaTrustManager.java | Adds SKI/AKI-based issuer resolution + EKU/basic-constraints enforcement for client certs. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorTLSUtils.java | Centralizes OTLP server SslContextBuilder creation using dynamic key/trust managers. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsConfigService.java | Posts a cluster event when cert IDs change to trigger cache invalidation. |
| graylog2-server/src/main/java/org/graylog/collectors/events/CollectorCaConfigUpdated.java | New cluster event type for CA config changes. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsModule.java | Wires new cache/key/trust/TLS utils singletons and initializer. |
| graylog2-server/src/main/java/org/graylog/collectors/input/transport/CollectorIngestHttpTransport.java | Switches OTLP server TLS setup to CollectorTLSUtils. |
| graylog2-server/src/main/java/org/graylog/collectors/periodical/CollectorCaRenewalPeriodical.java | Makes renewal leader-only, more frequent, and exception-safe. |
| graylog2-server/src/main/java/org/graylog/collectors/CollectorsConfig.java | Adds toBuilder() to support updating cert IDs during renewal. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateBuilderTest.java | Adds tests for lifetime capping + CSR signing extensions/behavior. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateEntryTest.java | Updates record/JSON/Mongo roundtrip tests for new AKI field. |
| graylog2-server/src/test/java/org/graylog/security/pki/CertificateServiceTest.java | Adds SKI lookup tests and updates fixtures for AKI. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaServiceTest.java | Adds renewal-threshold and renewal-cascade tests; updates TLS builder test. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaCacheTest.java | New tests for cache behavior, invalidation, and SKI lookups. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaKeyManagerTest.java | New tests validating key manager alias/chain/key behavior. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorCaTrustManagerTest.java | New tests for issuer resolution, EKU/basic-constraints, and validation failures. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorTLSUtilsIT.java | New Netty-based integration test for end-to-end mTLS handshake with dynamic managers. |
| graylog2-server/src/test/java/org/graylog/collectors/CollectorsConfigServiceTest.java | New tests asserting config changes emit CA cache invalidation events. |
| graylog2-server/src/test/java/org/graylog/collectors/opamp/auth/AgentTokenServiceTest.java | Updates construction for new CollectorsConfigService/CollectorCaService signatures. |
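
The EKU and basic-constraints enforcement listed for CollectorCaTrustManager could look roughly like the sketch below. It relies on the standard `X509Certificate#getExtendedKeyUsage` and `X509Certificate#getBasicConstraints` calls. The class name `ClientCertChecksSketch` and the error messages are hypothetical, and it assumes that a certificate without any EKU extension is rejected, which the PR text does not state explicitly.

```java
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.List;

// Hypothetical sketch of client certificate checks, not the PR's actual code.
final class ClientCertChecksSketch {
    /** OID of the id-kp-clientAuth extended key usage. */
    static final String CLIENT_AUTH_OID = "1.3.6.1.5.5.7.3.2";

    private ClientCertChecksSketch() {}

    static void check(X509Certificate cert) throws CertificateException {
        check(cert.getExtendedKeyUsage(), cert.getBasicConstraints());
    }

    /**
     * @param ekuOids          EKU OIDs, or null when the extension is absent
     * @param basicConstraints value of getBasicConstraints(); -1 means "not a CA"
     */
    static void check(List<String> ekuOids, int basicConstraints) throws CertificateException {
        if (basicConstraints != -1) {
            throw new CertificateException("CA certificate presented as client identity");
        }
        // Assumption: a missing EKU extension is treated as a failure.
        if (ekuOids == null || !ekuOids.contains(CLIENT_AUTH_OID)) {
            throw new CertificateException("Client certificate lacks the clientAuth EKU");
        }
    }
}
```

Splitting the check into a certificate overload and a plain-values overload keeps the rule testable without generating real certificates.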

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.


Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.



The JDK can report Ed25519 keys as either "EdDSA" or "Ed25519"
depending on the provider. chooseServerAlias now accepts both names,
consistent with CertificateBuilder and PemUtils.
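
A minimal sketch of accepting both algorithm names is shown here. `Ed25519NamesSketch` and `isEd25519KeyType` are hypothetical names, and the exact, case-sensitive match is an assumption. The real `chooseServerAlias` may compare the names differently.

```java
import java.util.Set;

// Hypothetical helper: treats both provider-reported names as the same key type.
final class Ed25519NamesSketch {
    private static final Set<String> ED25519_NAMES = Set.of("EdDSA", "Ed25519");

    private Ed25519NamesSketch() {}

    /** True when the requested TLS key type names an Ed25519 key under either spelling. */
    static boolean isEd25519KeyType(String keyType) {
        return keyType != null && ED25519_NAMES.contains(keyType);
    }
}
```

A key manager would call this with the `keyType` argument it receives and return its alias only when the check passes.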
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.



@bernd bernd marked this pull request as ready for review April 8, 2026 16:31
@bernd bernd requested a review from a team April 8, 2026 16:32
Member

@kroepke kroepke left a comment


Overall 👍, one minor nitpick and one question about how we want to deal with failures during renewal, but I won't block merge for that.

@bernd bernd merged commit 924d742 into master Apr 10, 2026
23 checks passed
@bernd bernd deleted the collectors/cert-renewal branch April 10, 2026 12:55

Development

Successfully merging this pull request may close these issues.

Implement cert renewal strategy for collectors

3 participants