[CRE] Wire confidential relay handler #21375
nadahalli wants to merge 21 commits into tejaswi/cw-phase4
Pull request overview
Wires the confidential-compute relay handler into the CRE service graph as an optional subservice, gated by CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS, and updates dependencies to pull in required relay/confidentialrelay types.
Changes:
- Add a new core/capabilities/confidentialrelay lifecycle wrapper that defers handler construction to Start().
- Wire the confidential relay service into core/services/cre when CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS is set.
- Bump chainlink-common and add confidential-compute relay/attestation dependencies.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| go.mod | Adds confidential-compute relay dependency and bumps chainlink-common. |
| go.sum | Updates module sums for new/bumped dependencies. |
| core/services/cre/cre.go | Conditionally registers the confidential relay service behind an env var gate. |
| core/capabilities/confidentialrelay/service.go | Introduces a thin service wrapper around the confidential relay handler lifecycle. |

    if trustedPCRs := os.Getenv("CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"); trustedPCRs != "" {
        relayService := confidentialrelay.NewService(
            gatewayConnectorWrapper,
            opts.CapabilitiesRegistry,
            []byte(trustedPCRs),
            lggr,
        )
        srvs = append(srvs, relayService)
    }

The relay service is conditionally enabled via CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS, but there’s no log signal when it is enabled/disabled. This makes it hard to diagnose why the relay handler isn’t running (especially since other conditional subservices here log when they are skipped/created). Consider adding a Debug/Info log when the env var is set (service enabled) and when it’s unset (service disabled).

            lggr,
        )
        srvs = append(srvs, relayService)
    }

If CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS is set but capCfg.GatewayConnector().DonID() is empty, the relay will silently not start because the gateway connector wrapper isn’t created. Consider logging a warning in that case (env var set but gateway connector not configured), since it’s a misconfiguration that’s otherwise non-obvious.

            }
        }
    } else {
        if trustedPCRs := os.Getenv("CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"); trustedPCRs != "" {
            lggr.Warn("CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS is set but GatewayConnector DonID is empty; confidential relay will not start")
        }


        h, err := relay.NewHandler(s.capRegistry, conn, s.trustedPCRs, s.lggr)
        if err != nil {
            return err
        }
        s.handler = h
        return h.Start(ctx)

start returns raw errors from relay.NewHandler/h.Start without adding context. To make startup failures actionable in logs, wrap these errors with service-specific context (e.g., include that this is the confidential relay handler startup and whether it was due to handler construction vs start).
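A minimal sketch of what the suggested wrapping could look like, using fmt.Errorf with %w so the original error stays reachable through errors.Is. The handler type, newHandler, and errBadPCRs are hypothetical stand-ins for relay.NewHandler and its return value, not the PR's actual code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// handler is a hypothetical stand-in for the relay handler.
type handler struct{ startErr error }

func (h *handler) Start(ctx context.Context) error { return h.startErr }

// errBadPCRs simulates a construction failure (assumed for illustration).
var errBadPCRs = errors.New("invalid trusted PCRs")

// newHandler is a hypothetical constructor that rejects an empty PCR config.
func newHandler(trustedPCRs []byte) (*handler, error) {
	if len(trustedPCRs) == 0 {
		return nil, errBadPCRs
	}
	return &handler{}, nil
}

// start wraps each failure so the log line says which phase failed,
// while %w keeps the underlying error available to errors.Is/As.
func start(ctx context.Context, trustedPCRs []byte) error {
	h, err := newHandler(trustedPCRs)
	if err != nil {
		return fmt.Errorf("confidential relay: failed to construct handler: %w", err)
	}
	if err := h.Start(ctx); err != nil {
		return fmt.Errorf("confidential relay: failed to start handler: %w", err)
	}
	return nil
}

func main() {
	err := start(context.Background(), nil)
	fmt.Println(err)
	fmt.Println(errors.Is(err, errBadPCRs))
}
```

Distinct prefixes for the construction and start phases are one way to make the two failure modes distinguishable in node logs.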

    func (s *Service) start(ctx context.Context) error {
        conn := s.wrapper.GetGatewayConnector()
        if conn == nil {
            return errors.New("gateway connector not available")
        }
        h, err := relay.NewHandler(s.capRegistry, conn, s.trustedPCRs, s.lggr)
        if err != nil {
            return err
        }
        s.handler = h
        return h.Start(ctx)
    }

    func (s *Service) close() error {
        if s.handler != nil {
            return s.handler.Close()
        }
        return nil
    }

There are no unit tests covering the new lifecycle wrapper behavior (e.g., that it errors when the gateway connector isn’t available, and that it closes the underlying handler when started). Since the repo has extensive Go service tests elsewhere, consider adding a small test for Service.start/close using a fake ServiceWrapper/connector to prevent regressions.
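The two behaviors named in this comment can be exercised with small fakes. The sketch below is self-contained and uses simplified, hypothetical types (connector, fakeWrapper, fakeHandler, service) that only mirror the shape of the wrapper under review; a real test would use the repo's own types and the testing package.

```go
package main

import (
	"errors"
	"fmt"
)

// connector and connectorSource are hypothetical, simplified stand-ins for
// the gateway connector and the ServiceWrapper that exposes it.
type connector struct{}

type connectorSource interface {
	GetGatewayConnector() *connector
}

// fakeWrapper returns whatever connector the test configures (possibly nil).
type fakeWrapper struct{ conn *connector }

func (f *fakeWrapper) GetGatewayConnector() *connector { return f.conn }

// fakeHandler records whether Close was called.
type fakeHandler struct{ closed bool }

func (h *fakeHandler) Close() error { h.closed = true; return nil }

// service mirrors the lifecycle under review: the handler is created in
// start, not in the constructor.
type service struct {
	wrapper connectorSource
	handler *fakeHandler
}

func (s *service) start() error {
	if s.wrapper.GetGatewayConnector() == nil {
		return errors.New("gateway connector not available")
	}
	s.handler = &fakeHandler{}
	return nil
}

func (s *service) close() error {
	if s.handler != nil {
		return s.handler.Close()
	}
	return nil
}

func main() {
	// Case 1: no connector, so start must fail and close must be a no-op.
	s := &service{wrapper: &fakeWrapper{}}
	fmt.Println("start without connector:", s.start())
	fmt.Println("close before start:", s.close())

	// Case 2: connector present, so close must reach the handler.
	s = &service{wrapper: &fakeWrapper{conn: &connector{}}}
	fmt.Println("start with connector:", s.start())
	_ = s.close()
	fmt.Println("handler closed:", s.handler.closed)
}
```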
Instantiate the confidential-compute relay handler as a CRE subservice. The handler validates Nitro attestation and dispatches secrets.get / capability.execute messages to VaultDON and capability DONs via the gateway connector. Gated by CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS env var (JSON with PCR0-2 hex values). When unset, no relay service is started.
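For illustration, the PCR JSON could be decoded and sanity-checked along these lines. The key names pcr0, pcr1, and pcr2 follow the PR description; the parseTrustedPCRs helper and its validation rules are assumptions made for this sketch, not the relay handler's actual parsing.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// trustedPCRs mirrors the JSON shape given in the PR description.
type trustedPCRs struct {
	PCR0 string `json:"pcr0"`
	PCR1 string `json:"pcr1"`
	PCR2 string `json:"pcr2"`
}

// parseTrustedPCRs is a hypothetical helper: it decodes the JSON and checks
// that every PCR value is present and valid hex.
func parseTrustedPCRs(raw []byte) (map[string][]byte, error) {
	var p trustedPCRs
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("decode trusted PCRs: %w", err)
	}
	out := make(map[string][]byte, 3)
	for name, v := range map[string]string{"pcr0": p.PCR0, "pcr1": p.PCR1, "pcr2": p.PCR2} {
		b, err := hex.DecodeString(v)
		if err != nil {
			return nil, fmt.Errorf("%s is not valid hex: %w", name, err)
		}
		if len(b) == 0 {
			return nil, fmt.Errorf("%s is missing or empty", name)
		}
		out[name] = b
	}
	return out, nil
}

func main() {
	// Short placeholder values, not real measurements.
	pcrs, err := parseTrustedPCRs([]byte(`{"pcr0":"00ff","pcr1":"abcd","pcr2":"1234"}`))
	fmt.Println(len(pcrs), err)

	_, err = parseTrustedPCRs([]byte(`{"pcr0":"zz","pcr1":"abcd","pcr2":"1234"}`))
	fmt.Println(err != nil)
}
```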
Gateway-side handler that receives JSON-RPC requests from the enclave, fans them out to relay DON nodes, and aggregates 2F+1 quorum responses. Follows the vault handler pattern but simplified: no authorization, no caching, no OCR3 signatures, no owner-prefixed request IDs.
Add gateway handler type, capability flag, and Feature implementation so the CRE test framework can spin up a relay DON for remote-mode E2E tests.
Pass CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM env var to relay handler for custom attestation certificate validation. Update go.mod to use remote confidential-compute module commits.
… gateway service name
Dockerfile: mount GIT_AUTH_TOKEN during go mod download for private module access in Docker builds.
gateway_job.go: set ServiceName "confidential" on the confidential-relay handler so the gateway can route requests from the enclave's RemoteDispatcher (which sends methods like "confidential.capability.execute").
Adds mock binary to the GetCapabilityIDFromCommand and inverse mapping so the standard capabilities delegate properly associates the mock binary with its capability ID.
Both confidential plugins now point at ed10df3 which has replace directives removed from go.mod, allowing loopinstall to build from the module cache on a clean checkout.
Update relay and attestation pseudo-versions to the latest tejaswi/cw-e2e-test commit which fixes stale attestation pseudo-version and E2E subtest isolation bugs.
The full commit hash was wrong (first 12 chars matched, rest was garbage). Corrected to actual 1efd81acd9949fc79d14a46b1b55faa74fcb436e.
Secrets are prefetched by the executor before enclave dispatch, so the relay handler no longer needs to route secrets.get requests.

    if trustedPCRs := os.Getenv("CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"); trustedPCRs != "" {
        caRootsPEM := os.Getenv("CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM")

We can't fetch these from env vars. We should use the capability registry configs here. As for CA_ROOTS_PEM, I assume it is the Nitro one, so we could also keep that in the capabilities registry config.
I would say if the on-chain configs aren't ready here, we use a node-level toml config that is "EnableConfidentialRelay" or something, and then read those values from the on-chain config within the handler/service.
Done. Replaced the env vars with TOML config under [CRE.ConfidentialRelay], with Enabled, TrustedPCRs, and CARootsPEM fields. The CTF feature now sets the TOML config on nodes instead of env vars.

    type aggregator struct{}

    func (a *aggregator) Aggregate(resps map[string]jsonrpc.Response[json.RawMessage], donF int, donMembersCount int, l logger.Logger) (*jsonrpc.Response[json.RawMessage], error) {
        requiredQuorum := 2*donF + 1

We need 2F+1 responses? Why not F+1?
Good call, changed to F+1. Each honest node already validates the Nitro attestation independently, so F+1 matching guarantees at least one honest node vouched for the result. Consistent with the workflow metadata handler precedent.
TDH2 threshold is T=F+1. With up to F Byzantine nodes, you need 2F+1 responses to guarantee at least F+1 valid shares for reconstruction.
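The arithmetic behind this last comment can be checked directly. The sketch below only models the counting argument (worst case: F of the responders are Byzantine and their shares are unusable); the function names are illustrative and nothing here is taken from the aggregator's code.

```go
package main

import "fmt"

// honestLowerBound returns the minimum number of honest responses among
// `responses` replies when up to f responders may be Byzantine.
func honestLowerBound(responses, f int) int {
	if responses <= f {
		return 0
	}
	return responses - f
}

// canReconstruct reports whether the worst case still leaves at least
// threshold = f+1 valid shares.
func canReconstruct(responses, f int) bool {
	return honestLowerBound(responses, f) >= f+1
}

func main() {
	f := 1
	fmt.Println(canReconstruct(f+1, f))   // false: 2 responses, possibly only 1 valid share
	fmt.Println(canReconstruct(2*f+1, f)) // true: 3 responses, at least 2 valid shares
}
```

Under this model, F+1 matching responses are enough to know one honest node vouched for a result, but not enough to guarantee F+1 usable shares, which is the distinction the thread turns on.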

    package confidentialrelay

This entire file seems like boilerplate. Do they not support some generic version of this?
No generic handler in the framework. Every handler (functions, capabilities, vault) implements the Handler interface from scratch. Ours is comparable in size to the capabilities handler (431 vs 362 lines).

    RUN --mount=type=secret,id=GIT_AUTH_TOKEN \
        --mount=type=cache,target=/go/pkg/mod \
        set -e && \
        export GIT_CONFIG_GLOBAL=/tmp/gitconfig-gomod-download && \
        if [ -f /run/secrets/GIT_AUTH_TOKEN ] && [ -s /run/secrets/GIT_AUTH_TOKEN ]; then \
            TOKEN=$(cat /run/secrets/GIT_AUTH_TOKEN) && \
            git config --file "$GIT_CONFIG_GLOBAL" \
                url."https://oauth2:${TOKEN}@github.com/".insteadOf "https://github.com/" ; \
        fi && \
        go mod download && \
        rm -f "$GIT_CONFIG_GLOBAL"

Obviously this cannot be included in the changes.
Agreed, will remove.
Removed. Copied the attestation and relay handler code into chainlink directly, dropped the CC module dependency entirely. The Dockerfile no longer needs private repo access.

        ns.EnvVars["CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"] = fmt.Sprintf("%v", v)
    }
    if v, exists := capConfig.Values["caRootsPEM"]; exists {
        ns.EnvVars["CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM"] = fmt.Sprintf("%v", v)

Again, we cannot use env vars here.
Done. Moved to TOML config.
Lot of boilerplate wiring. The only major comment is the env vars.
HandlerServiceName was missing a case for GatewayHandlerTypeConfidentialRelay, so it fell through to the default which returned "confidential-compute-relay". The JSON-RPC method "confidential.capability.execute" produces ServiceName() "confidential", causing "Service name not found" at the gateway.
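A rough sketch of the mismatch described in this commit message. It assumes the gateway derives the service name from the JSON-RPC method's prefix before the first dot, which matches the example given but is not confirmed by the source; handlerServiceName, serviceNameFromMethod, and the constant are hypothetical names, not the gateway's real API.

```go
package main

import (
	"fmt"
	"strings"
)

// handlerTypeConfidentialRelay is a hypothetical constant; the value is the
// string the default branch returned according to the commit message.
const handlerTypeConfidentialRelay = "confidential-compute-relay"

// handlerServiceName maps a handler type to the service name the gateway
// registers. Without the explicit case, the default would return the handler
// type itself and never match the "confidential" prefix.
func handlerServiceName(handlerType string) string {
	switch handlerType {
	case handlerTypeConfidentialRelay:
		return "confidential"
	default:
		return handlerType
	}
}

// serviceNameFromMethod assumes the service name is the method's prefix
// before the first dot.
func serviceNameFromMethod(method string) string {
	name, _, _ := strings.Cut(method, ".")
	return name
}

func main() {
	registered := handlerServiceName(handlerTypeConfidentialRelay)
	requested := serviceNameFromMethod("confidential.capability.execute")
	fmt.Println(registered, requested, registered == requested)
}
```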
Env vars replaced with TOML config. Dockerfile dependency removed. Also, product direction has changed: remote secret fetching from the enclave is now the only path. No upfront vault_don_secrets declaration. This relay handler will carry the GetSecrets traffic at runtime.
Node-side handler: adds handleSecretsGet which validates Nitro attestation, looks up vault capability, builds vault GetSecrets request with the enclave's ephemeral public key, translates the vault response (hex to base64 encoding), and returns encrypted TDH2 shares. Gateway handler: registers MethodSecretsGet alongside MethodCapabilityExec so the gateway routes secrets requests to relay DON nodes. No per-secret authorization. Relay DON validates attestation + PCR measurements only. Trust boundary is the TEE itself.
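The hex-to-base64 translation step mentioned above amounts to decoding one encoding and re-encoding the raw bytes in the other. A minimal sketch, assuming standard padded base64 (the source does not say which base64 variant the handler uses) and a hypothetical helper name:

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// hexToBase64 re-encodes a hex string as standard, padded base64.
// The choice of base64.StdEncoding is an assumption for this sketch.
func hexToBase64(h string) (string, error) {
	raw, err := hex.DecodeString(h)
	if err != nil {
		return "", fmt.Errorf("decode hex share: %w", err)
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

func main() {
	out, err := hexToBase64("deadbeef")
	fmt.Println(out, err) // prints: 3q2+7w== <nil>
}
```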
Superseded by #21603




Summary
Wire the confidential-compute relay handler (confidential-compute#265) into chainlink as a CRE subservice. The relay handler validates Nitro attestation and dispatches confidential.secrets.get / confidential.capability.execute messages from the enclave to VaultDON and capability DONs via the gateway connector.

The handler code lives in confidential-compute/capabilities/relay as a standalone Go module. This PR imports it and wires it into the node's lifecycle.

Changes
- core/capabilities/confidentialrelay/service.go (new): Thin lifecycle wrapper around the relay handler. The relay handler needs the gateway connector at construction time, but the connector isn't available until ServiceWrapper.Start(). This wrapper defers handler creation to its own Start(), bridging the gap.
- core/services/cre/cre.go: Instantiate the relay service inside the gateway connector block, gated by the CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS env var.
- go.mod: Bump chainlink-common to 3e3f9545d607 (adds confidentialrelay types), add confidential-compute/capabilities/relay and transitive dep confidential-compute/attestation.

Config: env var for now
Trusted PCR measurements are passed via CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS as a JSON string:

    {"pcr0":"<hex>","pcr1":"<hex>","pcr2":"<hex>"}

When unset, no relay service is started (zero impact on nodes that don't use confidential workflows). Proper TOML config (Capabilities.ConfidentialRelay section) can be added in a follow-up; PCR values change rarely (only when the enclave binary is rebuilt), so an env var is sufficient for now and avoids config interface boilerplate.

Dependency chain
Related PRs