engine: add Rest-SSZ spec#793
|
One thing that I think would make sense would be to reuse the base structure of the beacon api (https://github.com/ethereum/beacon-APIs/) - this includes several things:
|
For top-up sync we also need "current block number", similar to |
|
I opened #773 a few weeks ago with a narrower scope: adding a single new endpoint ( Since #793 is now doing a full Engine API refactor with the same REST+SSZ foundation, I think the witness endpoint fits naturally into this design. A few thoughts: The witness endpoint should be added to this new spec. The existing two-call flow ( Suggested endpoint: Benchmark data (ethrex, 203 txs, 36 Mgas, ~502 MB SSZ witness):
Happy to close #773 in support of this PR; I think it's a better, more seamless flow. Would love to discuss where the witness endpoint fits best in the new endpoint table.

| Old method | New endpoint | Notes |
| - | - | - |
| `engine_newPayloadV{1..5}` | `POST /{fork}/payloads` | `parentBeaconBlockRoot` and `executionRequests` folded into the SSZ envelope; `expectedBlobVersionedHashes` removed; `INVALID_BLOCK_HASH` removed from the status enum |
There was a problem hiding this comment.
dropping /engine/v2 prefix misleads a bit
There was a problem hiding this comment.
+1 - the beacon api lives under /eth/vX/beacon - would be good to see this api under /eth/vX/engine to unify the approaches

| Resource | Endpoint | Purpose |
| - | - | - |
| Payload | `POST /engine/v2/{fork}/payloads` | Submit a payload received from the CL gossip network for the EL to validate / import. Replaces `engine_newPayload`. |
There was a problem hiding this comment.
What does v2 mean? Was there a v1? Why is the fork necessary in the URL? The fork name seems to be a way to describe a minor API version, but on the other hand the blobs endpoints put the version after the resource name, not before, and it's a number.
There was a problem hiding this comment.
Also, if the payload did not change across forks, does that mean we will need the same endpoint under different URLs? A single /vN/ looked simpler.

#### Transport

- **HTTP/2 required**, h2c (cleartext) for both TCP and IPC. No
There was a problem hiding this comment.
the consensus REST spec does not have this requirement - we use 1.1 throughout and going to 2.0 would not be viable short-term for Nimbus.
There was a problem hiding this comment.
I've spoken with several people about developing streaming extensions for the engine API, some examples: registering to receive all txs from a particular account; receiving all cells for a set of kzg commitments; streaming EL events. For most language/library stacks http/2 is the simplest and most effective tool for this job. A lot of flexibility is introduced by giving the client/server native multiplexing, customizable framing. This could be powerful for some of the syncing options we've discussed like EL being fed blocks by CL.
As a compromise we could define these streaming methods as optional while clients work on introducing http/2 support. In practice http/2 capable libraries also support http/1.1. This is usually negotiated as ALPN during the TLS handshake, but the fallback option defined for plaintext protocols (like the engine api) is an upgrade header: Upgrade: h2c. So when the CL first connects to the EL to determine capabilities, it can make an http/1.1 request with an upgrade header (to confirm http/2 support if the EL server gives the expected upgrade response preamble). These streaming extensions would presumably be themselves represented as capabilities, potentially with a prefix that indicates the L7 protocol (http2/streamBlobsForCommitments).
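A minimal sketch of the cleartext upgrade probe described above, assuming the HTTP/1.1 `Upgrade: h2c` mechanism from RFC 7540 section 3.2 (note that RFC 9113 later deprecated this mechanism, so library support varies). The function names, the `/engine/v2/capabilities` path, and the chosen SETTINGS value are illustrative assumptions, not part of the spec under review. The code only builds and parses bytes; it opens no connection.

```python
import base64
import struct


def build_h2c_upgrade_request(host: str, path: str = "/engine/v2/capabilities") -> bytes:
    """Build an HTTP/1.1 request that offers an upgrade to cleartext HTTP/2."""
    # HTTP2-Settings carries the payload of a SETTINGS frame, base64url-encoded
    # without padding. One illustrative setting is sent here:
    # SETTINGS_MAX_CONCURRENT_STREAMS (id 0x3) = 100.
    settings_payload = struct.pack(">HI", 0x3, 100)
    http2_settings = base64.urlsafe_b64encode(settings_payload).rstrip(b"=").decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {http2_settings}",
        "",  # blank line terminating the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def server_accepted_h2c(response_head: bytes) -> bool:
    """Return True if the response head is a 101 switching to h2c."""
    status_line, _, rest = response_head.partition(b"\r\n")
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or parts[1] != b"101":
        return False
    headers = {}
    for line in rest.split(b"\r\n"):
        if not line:
            break  # end of header block
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()
    return headers.get(b"upgrade") == b"h2c"
```

A server that does not speak HTTP/2 simply answers the request over HTTP/1.1 (for example with `200 OK`), which is what lets the CL treat the streaming methods as optional.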
There was a problem hiding this comment.
> during the TLS handshake,

which tls handshake? we're talking about a private el/cl connection that in many cases does not have tls enabled (because enabling it would be an absolute mess of managing certificates etc or disabling verification which is unusual).

> multiplexing

just make 2 connections in these cases?

> upgrade

sure - nothing prevents an el/cl to talk http/2 - that said, this is also not a public interface serving thousands of clients but rather a private connection for driving one part of the client with another. For the public rpc, it makes sense - for the engine api that sees a few messages per second, it seems overengineered.
There was a problem hiding this comment.
> which tls handshake? we're talking about a private el/cl connection that in many cases does not have tls enabled (because enabling it would be an absolute mess of managing certificates etc or disabling verification which is unusual).

I might have confused things with too much superfluous info. I'm not saying TLS is involved, I'm saying it is not involved, which is ok because there is a cleartext upgrade path. More succinctly: there are clear standards for a single server to support both, so we can define the ssz flavored existing methods as http/1.1 while allowing http/2 capable servers to upgrade their connections to http/2 and provide additional optional methods.

> For the public rpc, it makes sense - for the engine api that sees a few messages per second, it seems overengineered

The point isn't about scale/concurrency - I'm not suggesting http/3 here. The appeal here is actually the simplicity that comes from the shape of the protocol fitting the use case, vs requiring devs to layer on additional complexity in order to compensate for the fact that http/1.1 framing and encoding was built for a different kind of request/response cycle and has acquired technical debt.
Adds the EnableEngineSSZHTTP feature flag (off by default), the gate for the REST + SSZ Engine API v2 transport (ethereum/execution-apis#793). No behavior yet; JSON-RPC engine_* stays the default transport. Wires the flag into ConfigureBeaconChain and covers it with a test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lays out all eight REST + SSZ Engine API v2 endpoint operations (ethereum/execution-apis#793) as methods on enginehttp.Client: NewPayload, ForkchoiceUpdated, GetPayload, GetPayloadBodiesBy{Hash,Range}, GetBlobs, Capabilities, Identity. Each is a stub returning errNotImplemented with a per-endpoint TODO(ssz-over-http) comment, plus a skipped test pinning the intended call shape, so the Phase 4 empty spots are easy to find. No behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Implementer feedback: there are now three EL implementations of this spec —
|
Implement enginehttp, an HTTP/2 (h2c) transport client for the REST + SSZ Engine API v2 (ethereum/execution-apis#793). It round-trips arbitrary SSZ bodies under /engine/v2/{fork}/... with per-request JWT bearer auth, generic over the fastssz Marshaler/Unmarshaler interfaces, and decodes RFC 7807 problem+json errors (branching on HTTP status, not the type URI). Transport-only: not yet wired to the execution service or a feature flag. Reuses network JWT signing via a new network.NewJWTRoundTripper, and promotes golang.org/x/net to a direct dependency for x/net/http2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Implementer feedback from working through this on the Nethermind side (NethermindEth/nethermind#11887). All four are spec-internal issues that surfaced while reconciling the SSZ container sketches with EIP-7594, the consensus-specs PeerDAS document, and the running EL implementation against c-kzg-4844. 1.
|
|
Hey, sorry for coming back to this only now after being sidetracked by other things. When I implemented it, I came across the same issues @LukaszRozmej specified, so I think we should update the spec as proposed. Another point I came across is that the Optional type is pretty wasteful for tightly packed structures. We could change it for some structures to save some bytes when encoding/decoding, but I'm not sure about it; I care more about encoding/decoding speed than size/waste.
|
I pushed some changes on top of this PR to address some of the feedback, please let me know how you feel about it @yperbasis @LukaszRozmej @arnetheduck |
|
Update from Prysm: Prysm is also implementing this one (OffchainLabs/prysm#16901), already tested with erigontech/erigon#21729 (commit hash: erigontech/erigon@1f9c45f). I guess the current geth version won't work with this Prysm version because of the request/response type. It should not be hard to match the new req/resp type (bare list vs. container), so I will keep an eye on this.
|
- **HTTP status:** `200 OK` for all three payload-status outcomes.
`409 Conflict` is returned for an inconsistent forkchoice state
(today's `-38002`); `422 Unprocessable Entity` for invalid
`payload_attributes` (today's `-38003`); `409 Conflict` for a too-deep
There was a problem hiding this comment.
for "too deep reorg", 418 seems appropriate ;)

#### Removed concepts

- `engine_exchangeCapabilities`: replaced by `GET /capabilities`.
nflaig left a comment
There was a problem hiding this comment.
mostly wanna echo what Jacek mentioned here #793 (comment), it would be great if we can align this more with the beacon-api as we have figured out ssz/json and versioning across forks there already
| Resource | Endpoint | Purpose |
| - | - | - |
| Payload | `POST /engine/v2/{fork}/payloads` | Submit a payload received from the CL gossip network for the EL to validate / import. Replaces `engine_newPayload`. |
| Payload | `GET /engine/v2/{fork}/payloads/{payloadId}` | Retrieve a built payload by id. Replaces `engine_getPayload`. CL polls when it wants a fresher snapshot. |
| Forkchoice | `POST /engine/v2/{fork}/forkchoice` | Atomic forkchoice update: update head/safe/finalized, optionally start a payload build, optionally update custody set. Replaces `engine_forkchoiceUpdated`. |
There was a problem hiding this comment.
why do we need the fork in the path? We could adopt the same approach as in the beacon-api and send it via a header, so you don't need to touch the implementation at all if just the container changes across forks
`expected_blob_versioned_hashes` is **removed**: it was a
defense-in-depth cross-check, but the block-hash check already covers
the transactions, so the EL recomputes the array from
`payload.transactions` during validation and a mismatch between CL
and EL views surfaces as `INVALID` exactly as before.
There was a problem hiding this comment.
The CL derives the expected hashes from consensus commitments, while the EL extracts the actual hashes from transactions. Without the CL-provided array, the EL has nothing to compare against.
I lean toward keeping the hashes so that we have a structural rejection.
successive `GET`s against the same `{payloadId}` may return different
bytes. The EL **MUST** include `Cache-Control: no-store` on the
response, and intermediaries **MUST NOT** cache or revalidate this
resource. CLs **MUST NOT** treat the response as cacheable.

**Path validation.** `{payloadId}` is a path segment carrying a
hex-encoded `Bytes8`. The EL **MUST** validate that the path segment is
well-formed (8 bytes, hex) before dispatching to lookup logic; a
malformed segment returns `400 invalid-request`.

**Token TTL.** A `payloadId` is valid until either the payload was
retrieved by `GET /{fork}/payloads/{payloadId}` or another payload
was built via a forkchoice with payload attributes.
There was a problem hiding this comment.
Successive GETs and token expiry seem contradictory.

> either the payload was retrieved

Retrieval is a GET call, so a successful response counts as retrieval, and per the spec above the token then expires. The user can't make the same call again.
There was a problem hiding this comment.
Ah yes I should clarify that a token ttl should be longer than a single get.
We should be able to get the payload twice, I think?
When `available == false`, `contents` carries zero-valued bytes (a
`BYTES_PER_BLOB`-byte zero blob and a 48-byte zero proof) and CLs
MUST ignore them.
There was a problem hiding this comment.
An available=false entry still carries a zero-filled, fixed-size blob and proof. 128 misses produce roughly 16 MiB of zero bytes for /blobs/v1, and /blobs/v3 has the same problem for cell proofs.
We can use the following SSZ pattern to get optional values:
contents: Optional[BlobAndProofV1]
where Optional[T] = List[T, 1]
There was a problem hiding this comment.
That makes sense to me
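The size argument in this thread can be checked with a short calculation. This is a sketch under assumptions: it models a per-entry layout of `{available: boolean, contents: BlobAndProofV1}` for the zero-filled variant and `{contents: List[BlobAndProofV1, 1]}` for the optional variant, which may not match the final containers. `BYTES_PER_BLOB = 131072` comes from EIP-4844 (4096 field elements of 32 bytes); the 4-byte offset is the standard SSZ offset width for a variable-size field. Offsets that an enclosing list of variable-size entries would add are ignored.

```python
BYTES_PER_BLOB = 131072  # EIP-4844: 4096 field elements * 32 bytes
PROOF_SIZE = 48          # KZG proof
OFFSET_SIZE = 4          # SSZ offset for one variable-size field


def entry_size_zero_filled() -> int:
    """Entry with a 1-byte `available` flag and always-present fixed-size contents."""
    return 1 + BYTES_PER_BLOB + PROOF_SIZE


def entry_size_optional(available: bool) -> int:
    """Entry whose contents is List[BlobAndProofV1, 1]: an offset plus 0 or 1 element."""
    return OFFSET_SIZE + (BYTES_PER_BLOB + PROOF_SIZE if available else 0)


def response_size(misses: int, optional: bool) -> int:
    """Total entry bytes for a response consisting only of misses."""
    per_entry = entry_size_optional(False) if optional else entry_size_zero_filled()
    return misses * per_entry


if __name__ == "__main__":
    zero_filled = response_size(128, optional=False)
    compact = response_size(128, optional=True)
    print(f"zero-filled: {zero_filled} bytes ({zero_filled / 2**20:.2f} MiB)")
    print(f"optional:    {compact} bytes")
```

Under these assumptions, 128 misses cost 16,783,488 bytes in the zero-filled layout and 512 bytes in the optional layout.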

### Payload submission

#### `POST /engine/v2/{fork}/payloads`
There was a problem hiding this comment.
The specification should require {fork} to match the fork derived from payload.timestamp, similar to the rule for payload_attributes in /forkchoice.
Some adjacent forks have wire-compatible payload/envelope shapes, so decoding alone cannot detect a request sent to the wrong fork URL. Without an explicit check, clients may inconsistently accept an Osaka payload through /prague/payloads.
We should add the following to harden the spec:
The URL {fork} MUST match the fork determined from payload.timestamp. Otherwise, the EL MUST return 400 /engine-api/errors/unsupported-fork.
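A sketch of how an EL handler might enforce the proposed rule. The fork schedule below uses made-up activation timestamps purely for illustration; a real client would take them from its chain configuration. The function names and the shape of the returned problem object are assumptions, apart from the `/engine-api/errors/unsupported-fork` type quoted above.

```python
from bisect import bisect_right
from typing import Optional, Tuple

# Hypothetical schedule: (activation timestamp, fork name), sorted by timestamp.
FORK_SCHEDULE = [
    (0, "paris"),
    (1_000, "shanghai"),
    (2_000, "cancun"),
    (3_000, "prague"),
    (4_000, "osaka"),
]
_ACTIVATIONS = [ts for ts, _ in FORK_SCHEDULE]


def fork_at(timestamp: int) -> str:
    """Return the fork active at the given payload timestamp."""
    if timestamp < 0:
        raise ValueError("timestamp must be non-negative")
    # Index of the last fork whose activation timestamp is <= timestamp.
    return FORK_SCHEDULE[bisect_right(_ACTIVATIONS, timestamp) - 1][1]


def check_url_fork(url_fork: str, payload_timestamp: int) -> Optional[Tuple[int, dict]]:
    """Return None if the URL fork matches, else an (HTTP status, problem) pair."""
    expected = fork_at(payload_timestamp)
    if url_fork == expected:
        return None
    return 400, {
        "type": "/engine-api/errors/unsupported-fork",
        "detail": f"URL fork {url_fork!r} does not match payload fork {expected!r}",
    }
```

With this check in place, an Osaka-era payload sent to `/prague/payloads` is rejected before decoding differences (or the lack of them) matter.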
```json
{
  "supported_forks": ["paris", "shanghai", "cancun", "prague", "osaka", "amsterdam"],
  "fork_scoped_endpoints": ["payloads", "forkchoice", "bodies"],
  "independently_versioned": { "blobs": ["v1", "v2", "v3", "v4"] },
  "unscoped_endpoints": ["capabilities", "identity"],
  "limits": {
    "bodies.max_count": 32,
    "blobs.max_versioned_hashes": 128,
    "payload.max_bytes": 67108864
  }
}
```
There was a problem hiding this comment.
The separate supported_forks and fork_scoped_endpoints arrays imply their Cartesian product: every listed endpoint is supported for every listed fork. Is that an intentional requirement?
If partial implementations are allowed, this format cannot accurately advertise them. For example, an EL might support SSZ payload submission for Osaka but only expose historical bodies for Cancun.
Consider either:
- Normatively requiring every advertised fork-scoped endpoint to support every entry in supported_forks; or
- Advertising forks per endpoint:
{
  "fork_scoped_endpoints": {
    "payloads": ["cancun", "prague", "osaka"],
    "forkchoice": ["osaka"],
    "bodies": ["paris", "shanghai", "cancun"]
  }
}
This also gives CL implementations an unambiguous capability test without probing endpoints.
There was a problem hiding this comment.
I think that's a bit overengineered, but I'm willing to go with it
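For illustration, a CL-side capability test against the per-endpoint format suggested in this thread could look like the sketch below. The `supports` helper is hypothetical, and the format itself is only a proposal at this point.

```python
def supports(capabilities: dict, endpoint: str, fork: str) -> bool:
    """Return True if the EL advertises `endpoint` for `fork`.

    Assumes `fork_scoped_endpoints` maps endpoint name -> list of fork names,
    as in the per-endpoint proposal (not the flat-array format).
    """
    scoped = capabilities.get("fork_scoped_endpoints", {})
    return fork in scoped.get(endpoint, [])


# Example response body, copied from the proposal in this thread.
example = {
    "fork_scoped_endpoints": {
        "payloads": ["cancun", "prague", "osaka"],
        "forkchoice": ["osaka"],
        "bodies": ["paris", "shanghai", "cancun"],
    }
}
```

Here `supports(example, "payloads", "osaka")` is true while `supports(example, "bodies", "osaka")` is false, which is the partial-implementation case the flat arrays cannot express.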
All three fields are processed in one transaction: the EL MUST apply
the forkchoice state, then (if `payload_attributes` is present and
the new head is `VALID`) start the build, then (if `custody_columns`
is present) update the custody set, all before returning. If the
forkchoice update fails, no build is started and no custody change
is applied.
There was a problem hiding this comment.
This atomicity rule appears to conflict with the custody semantics below, which say the custody update runs independently and that custody errors must not affect payload_status.
If custody application fails after forkchoice/build succeeds, does the EL:
- roll back forkchoice and payload building, as “one transaction” implies; or
- commit forkchoice/build and retain the previous custody set?
The latter seems implied by the independent-update language, but the response has no field for reporting custody failure. Please define the commit behavior and how custody errors are surfaced.
There is **no per-method fallback ladder**. A CL either uses v2 or
JSON-RPC for the lifetime of an EL connection; mixing transports
within a connection is permitted but not required.
There was a problem hiding this comment.
"Either v2 or JSON-RPC for the lifetime of the connection" is workable as a coarse rule but leaves some practical questions unanswered for Amsterdam rollout:
- Activation gap. Amsterdam activates at a specific timestamp. If a CL ships v2 by activation but its paired EL is on the legacy API only, the CL hits 404 on every `/engine/v2/...` URL until the EL catches up. The spec says "fall back to JSON-RPC for the duration of that EL's lifetime". Does that mean one 404 forces the entire connection to JSON-RPC for as long as the process runs? What re-triggers the check after an EL upgrade? A reconnection? A periodic probe?
- Mixed-fork support. An EL might support `/engine/v2/...` for Cancun + Prague but not Amsterdam yet. Under the rule above, a single Amsterdam 404/unsupported-fork for `/amsterdam/payloads` would force the CL back to JSON-RPC for all methods, including ones the EL handles fine via v2. That seems wasteful; could the policy be "fall back per-fork", not per-connection?
- Health-check semantics. Without per-method fallback, how does a CL know an EL has upgraded mid-run without restarting the connection? `GET /capabilities` is the natural probe but the spec doesn't suggest a polling cadence.
`engine_getPayloadBodiesByRange` "no trailing nulls" rule. The CL
detects the unfilled suffix from the shortfall and re-issues against
the next fork URL if the range straddled a fork boundary.
There was a problem hiding this comment.
Fork-boundary detection is being pushed into every CL implementation. For sync workloads that range over multiple forks (e.g. payload-body backfill), the CL needs to know each block's fork before constructing the URL, then handle the shortfall + re-issue dance for spanning ranges.
This is workable but worth either:
- A reference helper in the spec for "given (from, count), partition into per-fork segments by timestamp." Otherwise every CL re-implements the same fork-boundary math.
- Or an explicit error type `/engine-api/errors/range-spans-fork` instead of silent truncation, so a CL that didn't notice the shortfall doesn't quietly miss blocks.

Without either, a CL that omits the partitioning step looks at a short response and might conclude it hit head, when in fact the range crossed a boundary. The failure mode is silent.
There was a problem hiding this comment.
I think we should add an explicit error here
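As a rough idea of what the reference helper mentioned above could look like, the sketch below splits a `(from, count)` range into per-fork segments. It assumes the CL can supply a `fork_of_block` lookup (for example via the block's timestamp and the fork schedule); that callable, the function names, and the `/engine/v2` URL prefix are assumptions for illustration.

```python
from itertools import groupby
from typing import Callable, Iterator, List, Tuple


def partition_by_fork(
    start: int, count: int, fork_of_block: Callable[[int], str]
) -> Iterator[Tuple[str, int, int]]:
    """Yield (fork, first_block, segment_count) for each run of same-fork blocks."""
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    # groupby groups consecutive block numbers that map to the same fork.
    for fork, group in groupby(range(start, start + count), key=fork_of_block):
        blocks = list(group)
        yield fork, blocks[0], len(blocks)


def body_range_urls(
    start: int, count: int, fork_of_block: Callable[[int], str]
) -> List[str]:
    """Build one bodies-by-range URL per fork segment."""
    return [
        f"/engine/v2/{fork}/bodies?from={first}&count={n}"
        for fork, first, n in partition_by_fork(start, count, fork_of_block)
    ]
```

A CL that issues one request per returned URL never sends a range that straddles a fork boundary, so a short response can only mean the EL ran out of blocks.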
|
The following was discussed during the call:
There will be no specific devnet for this; it can be added to the glamsterdam devnets, where the flags can be enabled.
|
Updated the spec according to the results from the discussion |
There have already been multiple attempts at moving away from JSON-RPC to REST-SSZ.
However, most kept the Engine API as is.
I think we have a good shot at refactoring the Engine API with this change.
This draft does both: the move and the refactoring.
Happy for any feedback I can get!
The core of the change is:

| Old method | New endpoint | Notes |
| - | - | - |
| `engine_newPayloadV{1..5}` | `POST /{fork}/payloads` | `parentBeaconBlockRoot` and `executionRequests` folded into the SSZ envelope; `expectedBlobVersionedHashes` removed; `INVALID_BLOCK_HASH` removed from the status enum |
| `engine_forkchoiceUpdatedV{1..4}` | `POST /{fork}/forkchoice` | `payload_attributes`, and (Amsterdam+) optional `custody_columns` |
| `engine_getPayloadV{1..6}` | `GET /{fork}/payloads/{id}` | |
| `engine_getPayloadBodiesByHashV{1,2}` | `POST /{fork}/bodies/hash` | `{fork}` selects the response schema (not the era of requested blocks); `POST` because hash lists are too large for URLs |
| `engine_getPayloadBodiesByRangeV{1,2}` | `GET /{fork}/bodies?from=...&count=...` | `{fork}` selects the response schema |
| `engine_getBlobsV1` | `POST /blobs/v1` | |
| `engine_getBlobsV2` | `POST /blobs/v2` | |
| `engine_getBlobsV3` | `POST /blobs/v3` | |
| `engine_getBlobsV4` | `POST /blobs/v4` | |
| `engine_getClientVersionV1` | `GET /identity` + `X-Engine-Client-Version` request header | |
| `engine_exchangeCapabilities` | `GET /capabilities` | |
| `engine_exchangeTransitionConfigurationV1` | | |