engine: add Rest-SSZ spec#793

Draft
MariusVanDerWijden wants to merge 9 commits into
ethereum:mainfrom
MariusVanDerWijden:rest-ssz

Conversation

MariusVanDerWijden commented May 8, 2026

Member

There have already been multiple attempts at moving away from JSON-RPC to REST-SSZ.
However, most of them kept the Engine API as is.
I think we have a good shot at refactoring the Engine API with this change.

This draft does both: the move and the refactoring.

Happy for any feedback I can get!

The core of the change is:

| Old method | New endpoint | Notes |
| - | - | - |
| `engine_newPayloadV{1..5}` | `POST /{fork}/payloads` | `parentBeaconBlockRoot` and `executionRequests` folded into the SSZ envelope; `expectedBlobVersionedHashes` removed; `INVALID_BLOCK_HASH` removed from the status enum |
| `engine_forkchoiceUpdatedV{1..4}` | `POST /{fork}/forkchoice` | one atomic call; carries forkchoice state, optional `payload_attributes`, and (Amsterdam+) optional `custody_columns` |
| `engine_getPayloadV{1..6}` | `GET /{fork}/payloads/{id}` | poll-style, same semantics as today |
| `engine_getPayloadBodiesByHashV{1,2}` | `POST /{fork}/bodies/hash` | `{fork}` selects the response schema (not the era of requested blocks); `POST` because hash lists are too large for URLs |
| `engine_getPayloadBodiesByRangeV{1,2}` | `GET /{fork}/bodies?from=...&count=...` | `{fork}` selects the response schema |
| `engine_getBlobsV1` | `POST /blobs/v1` | independently versioned; legacy version numbers carry forward |
| `engine_getBlobsV2` | `POST /blobs/v2` | all-or-nothing cell proofs |
| `engine_getBlobsV3` | `POST /blobs/v3` | partial-response cell proofs |
| `engine_getBlobsV4` | `POST /blobs/v4` | cell-range selection |
| `engine_getClientVersionV1` | `GET /identity` + `X-Engine-Client-Version` request header | unscoped |
| `engine_exchangeCapabilities` | `GET /capabilities` | unscoped |
| `engine_exchangeTransitionConfigurationV1` | removed | already deprecated since Cancun |

arnetheduck commented May 14, 2026

Contributor

One thing that I think would make sense would be to reuse the base structure of the beacon api (https://github.com/ethereum/beacon-APIs/) - this includes several things:

  • primitives and their json/ssz encoding - ie in general, where possible, reuse the types from https://github.com/ethereum/beacon-APIs/blob/master/types/primitive.yaml so that we don't end up with pointless minor differences in how for example a number is string-encoded (0x0 vs 0x and the like)
  • execution payloads and other (consensus) spec types - the "shape" of objects in the beacon api generally follows the shape of things as they travel on the gossip network and their SSZ encoding - by reusing these types, we would reduce the maintenance overhead of having to pointlessly reorder and rename the exact same fields from the beacon api/consensus spec just to send the same info to the execution api in a slightly different shape
  • use of the canonical ssz/json encodings specified here: https://github.com/ethereum/consensus-specs/blob/master/ssz/simple-serialize.md#json-mapping - this aids debugging and removes the need to double-specify things
  • explicit encoding of fork in the http headers -> we can then upgrade to a new hard fork "automatically" without having to come up with V2, V3 etc

@arnetheduck

Contributor

> The core of the change is:

For top-up sync we also need "current block number", similar to eth_blockNumber but limited to latest and with a well-defined behavior for when the EL does not have a state.

@developeruche


I opened #773 a few weeks ago with a narrower scope: adding a single new endpoint (POST /new-payload-with-witness) that combines engine_newPayload and debug_executionWitness into one call and returns the witness SSZ-encoded over HTTP. The motivation was to unblock zkVM provers and zkAttestors from having to follow the chain one block behind.

Since #793 is now doing a full Engine API refactor with the same REST+SSZ foundation, I think the witness endpoint fits naturally into this design. A few thoughts:

The witness endpoint should be added to this new spec. The existing two-call flow (engine_newPayload → debug_executionWitness) has real-world latency problems: at a ~500 MB witness, the JSON-RPC + JSON approach takes ~8s just to return the witness. With HTTP + SSZ and the EL pipeline optimizations I profiled (moving trie writes off the critical path, parallelizing storage trie updates), this drops to ~932ms total EL time. That's comfortably within the 8s newPayload timeout even for worst-case blocks.

Suggested endpoint: POST /{fork}/payloads/with-witness (or folded directly into POST /{fork}/payloads as an optional response field when requested via a query param or Accept header). The response would carry the PayloadStatus + ExecutionWitness SSZ-encoded, consistent with the rest of the new spec.

Benchmark data (ethrex, 203 txs, 36 Mgas, ~502 MB SSZ witness):

| Approach | EL Total | Wire Size |
| - | - | - |
| JSON-RPC + JSON | 8,131 ms | ~502 MB |
| HTTP + SSZ | 1,232 ms | 502 MB |

Happy to close #773 in support of this PR; I think it's a better, more seamless flow. Would love to discuss where the witness endpoint fits best in the new endpoint table.

cc: @MariusVanDerWijden

Comment thread src/engine/refactor.md Outdated

| Old method | New endpoint | Notes |
| - | - | - |
| `engine_newPayloadV{1..5}` | `POST /{fork}/payloads` | `parentBeaconBlockRoot` and `executionRequests` folded into the SSZ envelope; `expectedBlobVersionedHashes` removed; `INVALID_BLOCK_HASH` removed from the status enum |

Contributor

dropping /engine/v2 prefix misleads a bit

Contributor

+1 - the beacon api lives under /eth/vX/beacon - would be good to see this api under /eth/vX/engine to unify the approaches

Comment thread src/engine/refactor.md Outdated

| Resource | Endpoint | Purpose |
| - | - | - |
| Payload | `POST /engine/v2/{fork}/payloads` | Submit a payload received from the CL gossip network for the EL to validate / import. Replaces `engine_newPayload`. |

Contributor

What does v2 mean? Was there a v1? Why is the fork necessary in the URL? The fork name seems to be a way to describe a minor API version, but on the other hand the blobs endpoints put their version after the resource name, not before, and it's a number.

Contributor

Also if payload did not change across forks, does it mean we will need same endpoint under different urls? Just single /vN/ looked simpler

Comment thread src/engine/refactor.md Outdated

#### Transport

- **HTTP/2 required**, h2c (cleartext) for both TCP and IPC. No

Contributor

the consensus REST spec does not have this requirement - we use 1.1 throughout and going to 2.0 would not be viable short-term for Nimbus.


I've spoken with several people about developing streaming extensions for the engine API, some examples: registering to receive all txs from a particular account; receiving all cells for a set of kzg commitments; streaming EL events. For most language/library stacks http/2 is the simplest and most effective tool for this job. A lot of flexibility is introduced by giving the client/server native multiplexing, customizable framing. This could be powerful for some of the syncing options we've discussed like EL being fed blocks by CL.

As a compromise we could define these streaming methods as optional while clients work on introducing http/2 support. In practice http/2 capable libraries also support http/1.1. This is usually negotiated as ALPN during the TLS handshake, but the fallback option defined for plaintext protocols (like the engine api) is an upgrade header: Upgrade: h2c. So when the CL first connects to the EL to determine capabilities, it can make an http/1.1 request with an upgrade header (to confirm http/2 support if the EL server gives the expected upgrade response preamble). These streaming extensions would presumably be themselves represented as capabilities, potentially with a prefix that indicates the L7 protocol (http2/streamBlobsForCommitments).
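The cleartext upgrade probe described above can be sketched as follows. This is a minimal illustration of the `Upgrade: h2c` handshake as defined in RFC 7540 §3.2 (RFC 9113 later deprecated this upgrade path, so treat it as one option rather than a settled design). The host `localhost:8551` in the usage note and the helper names are assumptions made for the example; `/capabilities` is taken from the endpoint table in the PR description.

```python
import base64


def build_h2c_upgrade_request(host: str, path: str = "/capabilities") -> bytes:
    """Build an HTTP/1.1 request that offers an upgrade to cleartext HTTP/2."""
    # An empty SETTINGS payload is valid; it base64url-encodes to "".
    settings_payload = b""
    http2_settings = (
        base64.urlsafe_b64encode(settings_payload).rstrip(b"=").decode("ascii")
    )
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {http2_settings}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def server_accepted_h2c(response_head: bytes) -> bool:
    """Return True if the response head is a 101 that switches to h2c."""
    status_line, _, rest = response_head.partition(b"\r\n")
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or parts[1] != b"101":
        return False
    headers = {}
    for raw in rest.split(b"\r\n"):
        if not raw:
            break  # blank line ends the header section
        name, _, value = raw.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()
    return headers.get(b"upgrade") == b"h2c"
```

A CL could send `build_h2c_upgrade_request("localhost:8551")` on first connect and stay on HTTP/1.1 whenever `server_accepted_h2c` returns `False`, which matches the "optional streaming methods" compromise suggested in the comment.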

Contributor

> during the TLS handshake,

which tls handshake? we're talking about a private el/cl connection that in many cases does not have tls enabled (because enabling it would be an absolute mess of managing certificates etc or disabling verification which is unusual).

> multiplexing

just make 2 connections in these cases?

> upgrade

sure - nothing prevents an el/cl from talking http/2 - that said, this is also not a public interface serving thousands of clients but rather a private connection for driving one part of the client with another. For the public rpc, it makes sense - for the engine api that sees a few messages per second, it seems overengineered.


> which tls handshake? we're talking about a private el/cl connection that in many cases does not have tls enabled (because enabling it would be an absolute mess of managing certificates etc or disabling verification which is unusual).

I might have confused things with too much superfluous info. I'm not saying TLS is involved, I'm saying it is not involved, which is ok because there is a cleartext upgrade path. More succinctly -- there are clear standards for a single server to support both, so we can define the ssz flavored existing methods as http/1.1 while allowing http/2 capable servers to upgrade their connections to http/2 and provide additional optional methods.

> For the public rpc, it makes sense - for the engine api that sees a few messages per second, it seems overengineered

The point isn't about scale/concurrency - I'm not suggesting http/3 here. The appeal here is actually the simplicity that comes from the shape of the protocol fitting the use case, vs requiring devs to layer on additional complexity in order to compensate for the fact that http/1.1 framing and encoding was built for a different kind of request/response cycle and has acquired technical debt.

syjn99 added a commit to syjn99/prysm that referenced this pull request Jun 9, 2026
Adds the EnableEngineSSZHTTP feature flag (off by default), the gate for
the REST + SSZ Engine API v2 transport (ethereum/execution-apis#793). No
behavior yet; JSON-RPC engine_* stays the default transport. Wires the
flag into ConfigureBeaconChain and covers it with a test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
syjn99 added a commit to syjn99/prysm that referenced this pull request Jun 9, 2026
Lays out all eight REST + SSZ Engine API v2 endpoint operations
(ethereum/execution-apis#793) as methods on enginehttp.Client: NewPayload,
ForkchoiceUpdated, GetPayload, GetPayloadBodiesBy{Hash,Range}, GetBlobs,
Capabilities, Identity. Each is a stub returning errNotImplemented with a
per-endpoint TODO(ssz-over-http) comment, plus a skipped test pinning the
intended call shape, so the Phase 4 empty spots are easy to find. No behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
syjn99 added a commit to OffchainLabs/prysm that referenced this pull request Jun 10, 2026
Adds the EnableEngineSSZHTTP feature flag (off by default), the gate for
the REST + SSZ Engine API v2 transport (ethereum/execution-apis#793). No
behavior yet; JSON-RPC engine_* stays the default transport. Wires the
flag into ConfigureBeaconChain and covers it with a test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
syjn99 added a commit to OffchainLabs/prysm that referenced this pull request Jun 10, 2026
Lays out all eight REST + SSZ Engine API v2 endpoint operations
(ethereum/execution-apis#793) as methods on enginehttp.Client: NewPayload,
ForkchoiceUpdated, GetPayload, GetPayloadBodiesBy{Hash,Range}, GetBlobs,
Capabilities, Identity. Each is a stub returning errNotImplemented with a
per-endpoint TODO(ssz-over-http) comment, plus a skipped test pinning the
intended call shape, so the Phase 4 empty spots are easy to find. No behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yperbasis

Member

Implementer feedback: there are now three EL implementations of this spec —
Erigon (erigontech/erigon#21729),
Nethermind (NethermindEth/nethermind#11887) and
ethrex (lambdaclass/ethrex#6770).
Comparing them at the byte level surfaced five places where the draft gets read
differently. Pinning these would stop the implementations from diverging further.

  1. Bare list vs single-field container for top-level bodies. refactor.md says
    requests/responses are "SSZ-encoded List[...]", but the refactor-ssz.md sketch
    wraps them in containers (BodiesByHashRequest, BodiesResponse, BlobsV1Request, …),
    which adds a 4-byte offset prefix on the wire. Erigon and ethrex went bare;
    Nethermind wraps. Affects every bodies/blobs request and response.

  2. validation_error encoding. The sketch pins Optional[String] =
    List[List[byte, MAX_ERROR_BYTES], 1] (a 4-byte inner offset precedes the text when
    present), and Erigon/ethrex implement that — but it's subtle enough that Nethermind
    shipped a plain List[byte, 1024]. A worked byte example with an error present
    (like the existing 41-byte PayloadStatus example) would remove all doubt.

  3. BuiltPayload for pre-Amsterdam forks is undefined. Two concrete splits:
    field order — Erigon/ethrex follow the spec's …, execution_requests, should_override_builder; Nethermind kept the legacy JSON-RPC order with
    execution_requests last — and the Paris shape (bare ExecutionPayload vs
    {payload, block_value}). A per-fork catalogue for BuiltPayload (and the
    envelope / ForkchoiceUpdate), like the one that exists for ExecutionPayload /
    PayloadAttributes / ExecutionPayloadBody, would settle this.

  4. Bodies range queries past the latest block. The text says past-head block
    numbers come back available=false, which implies the response is padded to
    count entries (Erigon does this); Nethermind and ethrex instead truncate at
    head, carrying over the legacy "no trailing nulls" rule. Please state the
    expected response length explicitly.

  5. Fork-era scoping of /bodies. "{fork} selects both the response schema and
    the era" is normative in refactor.md, and Erigon/ethrex filter accordingly, but
    one implementation only uses the segment for schema selection. Worth a MUST.
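To make items 1 and 2 above concrete, here is a minimal sketch of the byte-level difference, written from the general SSZ serialization rules rather than from any of the three implementations. The function names are hypothetical, and the maximum lengths (which do not affect the encoding) are left out.

```python
import struct
from typing import List, Optional

OFFSET_SIZE = 4  # SSZ offsets are 4-byte little-endian unsigned integers


def encode_bare_hash_list(hashes: List[bytes]) -> bytes:
    """Bare List[Bytes32, N]: fixed-size elements are simply concatenated."""
    assert all(len(h) == 32 for h in hashes)
    return b"".join(hashes)


def encode_wrapped_hash_list(hashes: List[bytes]) -> bytes:
    """Container with a single variable-size field: one offset, then the list."""
    # The fixed part holds only the offset, so the data starts at byte 4.
    return struct.pack("<I", OFFSET_SIZE) + encode_bare_hash_list(hashes)


def encode_optional_string(text: Optional[str]) -> bytes:
    """Optional[String] modelled as List[List[byte, MAX], 1]."""
    if text is None:
        return b""  # empty outer list
    # One variable-size element: a 4-byte offset precedes the text bytes.
    return struct.pack("<I", OFFSET_SIZE) + text.encode("utf-8")
```

Under these rules a wrapped request is exactly four bytes longer than the bare one, and an error string `"bad"` encodes as `04 00 00 00 62 61 64`, whereas a plain `List[byte, 1024]` would encode it as just `62 61 64`. A decoder written for one shape will misread the other.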

syjn99 added a commit to syjn99/prysm that referenced this pull request Jun 11, 2026
Implement enginehttp, an HTTP/2 (h2c) transport client for the REST + SSZ
Engine API v2 (ethereum/execution-apis#793). It round-trips arbitrary SSZ
bodies under /engine/v2/{fork}/... with per-request JWT bearer auth,
generic over the fastssz Marshaler/Unmarshaler interfaces, and decodes
RFC 7807 problem+json errors (branching on HTTP status, not the type URI).
Transport-only: not yet wired to the execution service or a feature flag.

Reuses network JWT signing via a new network.NewJWTRoundTripper, and
promotes golang.org/x/net to a direct dependency for x/net/http2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
syjn99 added a commit to syjn99/prysm that referenced this pull request Jun 11, 2026
Adds the EnableEngineSSZHTTP feature flag (off by default), the gate for
the REST + SSZ Engine API v2 transport (ethereum/execution-apis#793). No
behavior yet; JSON-RPC engine_* stays the default transport. Wires the
flag into ConfigureBeaconChain and covers it with a test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
syjn99 added a commit to syjn99/prysm that referenced this pull request Jun 11, 2026
Lays out all eight REST + SSZ Engine API v2 endpoint operations
(ethereum/execution-apis#793) as methods on enginehttp.Client: NewPayload,
ForkchoiceUpdated, GetPayload, GetPayloadBodiesBy{Hash,Range}, GetBlobs,
Capabilities, Identity. Each is a stub returning errNotImplemented with a
per-endpoint TODO(ssz-over-http) comment, plus a skipped test pinning the
intended call shape, so the Phase 4 empty spots are easy to find. No behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@LukaszRozmej


Implementer feedback from working through this on the Nethermind side (NethermindEth/nethermind#11887). All four are spec-internal issues that surfaced while reconciling the SSZ container sketches with EIP-7594, the consensus-specs PeerDAS document, and the running EL implementation against c-kzg-4844.

1. BYTES_PER_CELL derivation is wrong (refactor-ssz.md MAX_* table)

| `BYTES_PER_CELL` | `BYTES_PER_BLOB / CELLS_PER_EXT_BLOB` (1,024) | derived |

The derivation cites EIP-7594 but the formula contradicts it. EIP-7594 defines (see polynomial-commitments-sampling.md):

  • FIELD_ELEMENTS_PER_CELL = 64
  • BYTES_PER_FIELD_ELEMENT = 32 (EIP-4844)
  • BYTES_PER_CELL = FIELD_ELEMENTS_PER_CELL * BYTES_PER_FIELD_ELEMENT = 2048

Cells span the extended blob (FIELD_ELEMENTS_PER_EXT_BLOB = 2 * FIELD_ELEMENTS_PER_BLOB = 8192), so 8192 / 64 = 128 cells × 2048 bytes = 262144 bytes total — twice BYTES_PER_BLOB. The "BYTES_PER_BLOB / CELLS_PER_EXT_BLOB" formula collapses the original-blob byte count over the extended-blob cell count, which is geometrically wrong.

Concrete consequence for implementers: c-kzg-4844's compute_cells writes CELLS_PER_EXT_BLOB * 2048 bytes. Encoding into a ByteVector[1024] slot either throws on length-mismatch or silently truncates half the cell payload, which then fails any KZG cell-proof verification on the CL side. The only viable implementation is BYTES_PER_CELL = 2048.
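The arithmetic behind this point can be checked in a few lines. The constants are the ones quoted in the comment; `FIELD_ELEMENTS_PER_BLOB = 4096` follows from the stated `FIELD_ELEMENTS_PER_EXT_BLOB = 2 * FIELD_ELEMENTS_PER_BLOB = 8192`, and `draft_value` is just a label chosen here for the formula currently in the table.

```python
FIELD_ELEMENTS_PER_BLOB = 4096   # EIP-4844
BYTES_PER_FIELD_ELEMENT = 32     # EIP-4844
FIELD_ELEMENTS_PER_CELL = 64     # EIP-7594

BYTES_PER_BLOB = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT
FIELD_ELEMENTS_PER_EXT_BLOB = 2 * FIELD_ELEMENTS_PER_BLOB
CELLS_PER_EXT_BLOB = FIELD_ELEMENTS_PER_EXT_BLOB // FIELD_ELEMENTS_PER_CELL

# Derivation used by EIP-7594.
BYTES_PER_CELL = FIELD_ELEMENTS_PER_CELL * BYTES_PER_FIELD_ELEMENT

# Formula in the draft's MAX_* table: original-blob bytes over extended-blob cells.
draft_value = BYTES_PER_BLOB // CELLS_PER_EXT_BLOB

print(BYTES_PER_CELL, draft_value)
print(CELLS_PER_EXT_BLOB * BYTES_PER_CELL, 2 * BYTES_PER_BLOB)
```

The two derivations differ by exactly the factor of two introduced by the blob extension, which is why a `ByteVector[1024]` slot can hold only half of each cell.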

Proposed change:

-| `BYTES_PER_CELL` | `BYTES_PER_BLOB / CELLS_PER_EXT_BLOB` (1,024) | derived |
+| `BYTES_PER_CELL` | `FIELD_ELEMENTS_PER_CELL * BYTES_PER_FIELD_ELEMENT` (2,048) | [EIP-7594](https://eips.ethereum.org/EIPS/eip-7594) |

And the matching footnote near the /blobs/v4 section:

-`BYTES_PER_CELL` = `BYTES_PER_BLOB / CELLS_PER_EXT_BLOB` = `1024`
-(EIP-7594).
+`BYTES_PER_CELL` = `FIELD_ELEMENTS_PER_CELL * BYTES_PER_FIELD_ELEMENT` = `2048`
+([EIP-7594](https://eips.ethereum.org/EIPS/eip-7594)).

2. bodies.max_count mismatch between table and capabilities example

refactor-ssz.md MAX_* table:

| `MAX_BODIES_REQUEST` | `2**5` (32) | Shanghai |

refactor.md GET /engine/v2/capabilities example body:

"limits": {
  "bodies.max_count": 128,
  ...
}

128 only appears for blob requests (MAX_BLOBS_REQUEST = MAX_VERSIONED_HASHES_PER_REQUEST = 128). The bodies-side normative value is MAX_BODIES_REQUEST = 32 — inherited from Shanghai's engine_getPayloadBodiesByHashV1 which pins the count at 32. The example value of 128 in refactor.md looks like a copy-paste from the blob row.

Suggested fix:

-    "bodies.max_count":           128,
+    "bodies.max_count":           32,

3. payload.max_bytes has no normative constant backing the example

The MAX_* table in refactor-ssz.md doesn't define a MAX_REQUEST_BODY_SIZE (or similar), but the refactor.md capabilities example pins it at 67108864 (64 MiB). Implementers reading downstream of "limits.payload.max_bytes" have no upper-bound name to point at. It would help to either:

  • Add the constant to the SSZ MAX_* table (e.g. MAX_REQUEST_BODY_SIZE = 2**26 = 67108864), then reference it from refactor.md's capabilities example, or
  • Leave it unfixed and note explicitly that the value is operator-chosen and only advertised, not normative.

This came up while reading the Nethermind PR's MaxBodySize constant — its inline comment claimed it matched a spec constant that didn't actually exist.

4. BlobV3EntryWire vs BlobV2EntryWire (minor — sketch cleanup)

refactor-ssz.md §/blobs/v3:

BlobsV3Response {
    entries: List[BlobV2Entry, MAX_BLOBS_REQUEST]
}

So /blobs/v3 is meant to reuse BlobV2Entry verbatim. The text earlier in the same file ("BlobV3Entry = BlobV2Entry { available: Boolean; contents: BlobAndProofV2 } per spec" in the implementer-side comment) reads like there's a separate BlobV3Entry type. There isn't — the two are wire-identical and the sketch should just say "uses BlobV2Entry" without a separate name to avoid confusion in implementations (Nethermind currently has both as distinct types).


Happy to help with the spec wording if useful; the SSZ EL implementation is now aligned with the #1 (2048-byte cells), #2 (32 bodies/req), and #3 (64 MiB request limit) interpretations above.

@MariusVanDerWijden

Member Author

Hey, sorry coming back to this after being sidetracked by other things.
We have an implementation in geth here: ethereum/go-ethereum#35171

When I implemented it, I came across the same issues @LukaszRozmej specified, so I think we should update the spec as proposed. Another point I came across is that the Optional type is pretty wasteful for tightly packed structures. We could change it for some structures to save some bytes when encoding/decoding, but I'm not super sure about it; I care more about encoding/decoding speed than size/waste.

@MariusVanDerWijden

Member Author

I pushed some changes on top of this PR to address some of the feedback, please let me know how you feel about it @yperbasis @LukaszRozmej @arnetheduck

syjn99 commented Jun 14, 2026

Update from Prysm: Prysm is also implementing this one (OffchainLabs/prysm#16901), already tested with erigontech/erigon#21729 (commit hash: erigontech/erigon@1f9c45f). I guess the current geth version won't work with this Prysm version because of the request/response type. It should not be hard to match the new req/resp type (bare list vs. container), so I will keep an eye on this.

@yperbasis

Member

BuiltPayloadShanghai includes should_override_builder — off by one fork?

The new per-fork BuiltPayload catalogue in refactor-ssz.md defines BuiltPayloadShanghai with should_override_builder:

# Shanghai — Paris + should_override_builder
# (shouldOverrideBuilder was introduced in engine_getPayloadV3/Shanghai)
BuiltPayloadShanghai {
    payload:                 ExecutionPayloadShanghai
    block_value:             Uint256
    should_override_builder: Boolean
}

But shouldOverrideBuilder was introduced in engine_getPayloadV3 (Cancun/Deneb), not in the Shanghai method:

  • engine_getPayloadV2 (Shanghai/Capella) returns {executionPayload, blockValue} only.
  • engine_getPayloadV3 (Cancun/Deneb) is the first to return {executionPayload, blockValue, blobsBundle, shouldOverrideBuilder}.

The inline comment (shouldOverrideBuilder was introduced in engine_getPayloadV3/Shanghai) is also self-contradictory, since V3 is Cancun rather than Shanghai.

So one of these is intended:

  1. Deliberate: BuiltPayloadShanghai really should carry should_override_builder (in which case the /Shanghai in the parenthetical is just misleading and could be dropped); or
  2. Typo: should_override_builder should first appear in BuiltPayloadCancun, with Shanghai being {payload, block_value} only (matching engine_getPayloadV2).

Could you confirm which you intend? We've implemented (1) literally to match the normative container, but it diverges from the legacy engine_getPayload method progression, so wanted to flag it before it's locked in.

Comment thread src/engine/refactor.md
- **HTTP status:** `200 OK` for all three payload-status outcomes.
`409 Conflict` is returned for an inconsistent forkchoice state
(today's `-38002`); `422 Unprocessable Entity` for invalid
`payload_attributes` (today's `-38003`); `409 Conflict` for a too-deep

Contributor

for "too deep reorg", 418 seems appropriate ;)

Comment thread src/engine/refactor.md

#### Removed concepts

- `engine_exchangeCapabilities` — replaced by `GET /capabilities`.


👍🏼

nflaig left a comment

Member

mostly wanna echo what Jacek mentioned here #793 (comment), it would be great if we can align this more with the beacon-api as we have figured out ssz/json and versioning across forks there already

Comment thread src/engine/refactor.md Outdated
| - | - | - |
| Payload | `POST /engine/v2/{fork}/payloads` | Submit a payload received from the CL gossip network for the EL to validate / import. Replaces `engine_newPayload`. |
| Payload | `GET /engine/v2/{fork}/payloads/{payloadId}` | Retrieve a built payload by id. Replaces `engine_getPayload`. CL polls when it wants a fresher snapshot. |
| Forkchoice | `POST /engine/v2/{fork}/forkchoice` | Atomic forkchoice update: update head/safe/finalized, optionally start a payload build, optionally update custody set. Replaces `engine_forkchoiceUpdated`. |

Member

why do we need the fork in the path? we could adopt the same approach as in the beacon-api and send it via a header, so you don't need to touch the implementation at all if just the container changes across forks

Comment thread src/engine/refactor.md
Comment on lines +110 to +114
`expected_blob_versioned_hashes` is **removed**: it was a
defense-in-depth cross-check, but the block-hash check already covers
the transactions, so the EL recomputes the array from
`payload.transactions` during validation and a mismatch between CL
and EL views surfaces as `INVALID` exactly as before.


The CL derives the expected hashes from consensus commitments, while the EL extracts the actual hashes from transactions. Without the CL-provided array, the EL has nothing to compare against.

I lean toward keeping the hashes so that we have a structural rejection.

Comment thread src/engine/refactor.md
Comment on lines +248 to +260
successive `GET`s against the same `{payloadId}` may return different
bytes. The EL **MUST** include `Cache-Control: no-store` on the
response, and intermediaries **MUST NOT** cache or revalidate this
resource. CLs **MUST NOT** treat the response as cacheable.

**Path validation.** `{payloadId}` is a path segment carrying a hex-
encoded `Bytes8`. The EL **MUST** validate that the path segment is
well-formed (8 bytes, hex) before dispatching to lookup logic; a
malformed segment returns `400 invalid-request`.

**Token TTL.** A `payloadId` is valid until either the payload was
retrieved by `GET /{fork}/payloads/{payloadId}` or another payload
was built via a forkchoice with payload attributes.


Successive GETs and token expiry seem to contradict each other.

> either the payload was retrieved

This is a GET call, so a successful response counts as retrieval, and per the spec text above the token then expires, so the user can't make the same call again.

Member Author

Ah yes I should clarify that a token ttl should be longer than a single get.
We should be able to get the payload twice, I think?
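The path-validation rule quoted in this thread (a hex-encoded `Bytes8` checked before any lookup) could look roughly like the sketch below. The function name is hypothetical, and accepting an optional `0x` prefix is an assumption: the quoted text does not say whether the prefix is required, allowed, or forbidden.

```python
import re

# Exactly 16 hex digits (8 bytes); the optional "0x" prefix is an assumption.
_PAYLOAD_ID_RE = re.compile(r"(?:0x)?[0-9a-fA-F]{16}")


def parse_payload_id(segment: str) -> bytes:
    """Decode a {payloadId} path segment or raise ValueError if malformed."""
    if _PAYLOAD_ID_RE.fullmatch(segment) is None:
        raise ValueError("malformed payloadId")
    hex_digits = segment[2:] if segment.startswith("0x") else segment
    return bytes.fromhex(hex_digits)
```

An HTTP handler would map the `ValueError` to the `400 invalid-request` response before touching the payload store, so that malformed ids never reach lookup logic.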

Comment on lines +851 to +853
When `available == false`, `contents` carries zero-valued bytes (a
`BYTES_PER_BLOB`-byte zero blob and a 48-byte zero proof) and CLs
MUST ignore them.


An entry with available=false still carries a zero-filled fixed-size blob and proof. For /blobs/v1, 128 misses produce roughly 16 MiB of zero bytes, and /blobs/v3 has the same problem for cell proofs.

We can use the following SSZ pattern to make the value optional.

  contents: Optional[BlobAndProofV1]

Where Optional[T] = List[T, 1]
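A quick sketch of the size argument, assuming the `/blobs/v1` case where each miss carries one zero blob plus one 48-byte zero proof (the `available` flag and any per-entry offsets are ignored here). The helper names are hypothetical.

```python
from typing import Optional

BYTES_PER_BLOB = 131072    # 4096 field elements * 32 bytes (EIP-4844)
BYTES_PER_PROOF = 48       # KZG proof size
MAX_BLOBS_REQUEST = 128


def zero_filled_miss_bytes(misses: int) -> int:
    """Bytes of zero padding sent when every miss carries a full zero entry."""
    return misses * (BYTES_PER_BLOB + BYTES_PER_PROOF)


def encode_optional_fixed(value: Optional[bytes]) -> bytes:
    """Optional[T] = List[T, 1] for a fixed-size T: empty list or the value itself."""
    return b"" if value is None else value


worst_case = zero_filled_miss_bytes(MAX_BLOBS_REQUEST)
print(worst_case, round(worst_case / 2**20, 2))
```

With the `Optional` pattern a miss contributes no content bytes at all. The trade-off is that the field becomes variable-size, so each entry pays a 4-byte offset inside its container, which is small next to the padding it removes.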

Member Author

That makes sense to me

Comment thread src/engine/refactor.md Outdated

### Payload submission

#### `POST /engine/v2/{fork}/payloads`


The specification should require {fork} to match the fork derived from payload.timestamp, similar to the rule for payload_attributes in /forkchoice.

Some adjacent forks have wire-compatible payload/envelope shapes, so decoding alone cannot detect a request sent to the wrong fork URL. Without an explicit check, clients may inconsistently accept an Osaka payload through /prague/payloads.

We should add the following to harden the spec:

The URL {fork} MUST match the fork determined from payload.timestamp. Otherwise, the EL MUST return 400 /engine-api/errors/unsupported-fork.
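The proposed rule could be enforced with a check along these lines. This is only a sketch: the activation timestamps in `FORK_SCHEDULE` are made-up illustrative values, not real network parameters, and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Tuple

# Hypothetical activation timestamps, chosen only for illustration.
FORK_SCHEDULE = [
    (0, "paris"),
    (1_000, "shanghai"),
    (2_000, "cancun"),
    (3_000, "prague"),
    (4_000, "osaka"),
]


def fork_at(timestamp: int) -> str:
    """Return the name of the fork active at `timestamp`."""
    starts = [start for start, _ in FORK_SCHEDULE]
    index = bisect_right(starts, timestamp) - 1
    if index < 0:
        raise ValueError("timestamp precedes the first scheduled fork")
    return FORK_SCHEDULE[index][1]


def check_url_fork(url_fork: str, payload_timestamp: int) -> Optional[Tuple[int, str]]:
    """Return None when the URL fork matches, else (HTTP status, problem type)."""
    if url_fork != fork_at(payload_timestamp):
        return (400, "/engine-api/errors/unsupported-fork")
    return None
```

Because the check uses only the timestamp and the fork schedule, it catches the `/prague/payloads` case even when the Osaka payload decodes cleanly under the Prague schema.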

Comment thread src/engine/refactor.md
Comment on lines +418 to +430
```json
{
"supported_forks": ["paris", "shanghai", "cancun", "prague", "osaka", "amsterdam"],
"fork_scoped_endpoints": ["payloads", "forkchoice", "bodies"],
"independently_versioned": { "blobs": ["v1", "v2", "v3", "v4"] },
"unscoped_endpoints": ["capabilities", "identity"],
"limits": {
"bodies.max_count": 32,
"blobs.max_versioned_hashes": 128,
"payload.max_bytes": 67108864
}
}
```


The separate supported_forks and fork_scoped_endpoints arrays imply their Cartesian product, every listed endpoint is supported for every listed fork. Is that an intentional requirement?

If partial implementations are allowed, this format cannot accurately advertise them. For example, an EL might support SSZ payload submission for Osaka but only expose historical bodies for Cancun.

Consider either:

  1. Normatively requiring every advertised fork-scoped endpoint to support every entry in supported_forks; or

  2. Advertising forks per endpoint:

{
   "fork_scoped_endpoints": {
     "payloads": ["cancun", "prague", "osaka"],
     "forkchoice": ["osaka"],
     "bodies": ["paris", "shanghai", "cancun"]
   }
 }

This also gives CL implementations an unambiguous capability test without probing endpoints.
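For comparison, a CL-side capability test that handles both shapes might look like the sketch below. The function name `supports` is hypothetical; the field names come from the JSON examples in this thread, and the list-shaped branch encodes the Cartesian-product reading of option 1.

```python
def supports(capabilities: dict, endpoint: str, fork: str) -> bool:
    """Return True if the EL advertises `endpoint` for `fork`."""
    scoped = capabilities.get("fork_scoped_endpoints", [])
    if isinstance(scoped, dict):
        # Option 2: forks are advertised per endpoint.
        return fork in scoped.get(endpoint, [])
    # Option 1: every listed endpoint supports every listed fork.
    return endpoint in scoped and fork in capabilities.get("supported_forks", [])


product_style = {
    "supported_forks": ["cancun", "prague", "osaka"],
    "fork_scoped_endpoints": ["payloads", "forkchoice", "bodies"],
}
per_endpoint_style = {
    "fork_scoped_endpoints": {
        "payloads": ["cancun", "prague", "osaka"],
        "forkchoice": ["osaka"],
        "bodies": ["paris", "shanghai", "cancun"],
    }
}
```

With `per_endpoint_style`, `supports(..., "forkchoice", "cancun")` is `False` while `supports(..., "payloads", "cancun")` is `True`, a distinction the list-shaped format cannot express.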

Member Author

I think that's a bit overengineered, but I'm willing to go with it

Comment thread src/engine/refactor.md
Comment on lines +147 to +152
All three fields are processed in one transaction: the EL MUST apply
the forkchoice state, then (if `payload_attributes` is present and
the new head is `VALID`) start the build, then (if `custody_columns`
is present) update the custody set, all before returning. If the
forkchoice update fails, no build is started and no custody change
is applied.

This atomicity rule appears to conflict with the custody semantics below, which say the custody update runs independently and that custody errors must not affect payload_status.

If custody application fails after forkchoice/build succeeds, does the EL:

  • roll back forkchoice and payload building, as “one transaction” implies; or
  • commit forkchoice/build and retain the previous custody set?

The latter seems implied by the independent-update language, but the response has no field for reporting custody failure. Please define the commit behavior and how custody errors are surfaced.
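To make the second option concrete, here is a rough Python sketch of "commit forkchoice/build and retain the previous custody set". Everything in it is an assumption for discussion: the `EngineState` model, the `NUMBER_OF_COLUMNS` bound used to make custody validation able to fail, the placeholder payload id, and especially the `custody_error` response field, which the draft does not define.

```python
from dataclasses import dataclass
from typing import Optional

NUMBER_OF_COLUMNS = 128  # assumed bound, only so that custody validation can fail


@dataclass
class EngineState:
    valid_blocks: set
    head: Optional[str] = None
    building: bool = False
    custody_columns: frozenset = frozenset()


def forkchoice(state: EngineState, head: str,
               payload_attributes: Optional[dict] = None,
               custody_columns: Optional[list] = None) -> dict:
    # Step 1: forkchoice. On failure, no build is started and custody is untouched.
    if head not in state.valid_blocks:
        return {"payload_status": "INVALID", "payload_id": None, "custody_error": None}
    state.head = head

    # Step 2: start the build if attributes were supplied.
    payload_id = None
    if payload_attributes is not None:
        state.building = True
        payload_id = "0x01"  # placeholder id

    # Step 3: custody update. A failure here does not roll back steps 1 and 2
    # and does not change payload_status; the previous custody set is kept.
    custody_error = None
    if custody_columns is not None:
        if all(0 <= c < NUMBER_OF_COLUMNS for c in custody_columns):
            state.custody_columns = frozenset(custody_columns)
        else:
            custody_error = "invalid custody column index"  # hypothetical field

    return {"payload_status": "VALID", "payload_id": payload_id,
            "custody_error": custody_error}
```

If the spec instead intends a full rollback, step 3 would have to validate the custody set before step 1 mutates anything.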

Comment thread src/engine/refactor.md Outdated
Comment on lines +785 to +787
There is **no per-method fallback ladder**. A CL either uses v2 or
JSON-RPC for the lifetime of an EL connection; mixing transports
within a connection is permitted but not required.

"Either v2 or JSON-RPC for the lifetime of the connection" is workable as a coarse rule but leaves some practical questions unanswered for Amsterdam rollout:

  1. Activation gap. Amsterdam activates at a specific timestamp. If a CL ships v2 by activation but its paired EL is on the legacy API only, the CL hits 404 on every /engine/v2/... URL until the EL catches up. The spec says "fall back to JSON-RPC for the duration of that EL's lifetime" — does that mean one 404 forces the entire connection to JSON-RPC for as long as the process runs? What re-triggers the check after an EL upgrade? A reconnection? A periodic probe?

  2. Mixed-fork support. An EL might support /engine/v2/... for Cancun + Prague but not Amsterdam yet. Under the rule above, a single Amsterdam 404/unsupported-fork for /amsterdam/payloads would force the CL back to JSON-RPC for all methods including ones the EL handles fine via v2. That seems wasteful — could the policy be "fall back per-fork", not per-connection?

  3. Health-check semantics. Without per-method fallback, how does a CL know an EL has upgraded mid-run without restarting the connection? GET /capabilities is the natural probe but the spec doesn't suggest a polling cadence.
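One possible shape for a per-fork fallback with periodic re-probing, sketched in Python. The class name `TransportSelector`, the 60-second default cadence, and the transport labels are all hypothetical; the spec currently prescribes none of this. The caller is assumed to issue `GET /capabilities` whenever `needs_probe()` is true and pass the advertised forks (or `None` on a 404) to `record_capabilities`.

```python
import time
from typing import Callable, Iterable, Optional


class TransportSelector:
    """Chooses REST-SSZ or JSON-RPC per fork instead of per connection."""

    def __init__(self, probe_interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.probe_interval = probe_interval  # assumed cadence, not from the spec
        self._clock = clock
        self._last_probe: Optional[float] = None
        self._rest_forks: set = set()

    def needs_probe(self) -> bool:
        """True if capabilities were never fetched or the last probe is stale."""
        if self._last_probe is None:
            return True
        return self._clock() - self._last_probe >= self.probe_interval

    def record_capabilities(self, supported_forks: Optional[Iterable[str]]) -> None:
        """Store the probe result; None means the EL has no REST-SSZ API (404)."""
        self._rest_forks = set(supported_forks or [])
        self._last_probe = self._clock()

    def transport_for(self, fork: str) -> str:
        """Use REST-SSZ only for forks the EL advertised; otherwise JSON-RPC."""
        return "rest-ssz" if fork in self._rest_forks else "json-rpc"
```

Under this policy an EL that advertises Cancun and Prague but not Amsterdam keeps serving the first two over REST-SSZ, and an upgraded EL is picked up at the next probe without a reconnect.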

Comment thread src/engine/refactor.md
Comment on lines +333 to +335
`engine_getPayloadBodiesByRange` "no trailing nulls" rule. The CL
detects the unfilled suffix from the shortfall and re-issues against
the next fork URL if the range straddled a fork boundary.

Fork-boundary detection is being pushed into every CL implementation. For sync workloads that range over multiple forks (e.g. payload-body backfill), the CL needs to know each block's fork before constructing the URL, then handle the shortfall + re-issue dance for spanning ranges.

This is workable but worth either:

  1. A reference helper in the spec for "given (from, count), partition into per-fork segments by timestamp." Otherwise every CL re-implements the same fork-boundary math.

  2. Or an explicit error type /engine-api/errors/range-spans-fork instead of silent truncation, so a CL that didn't notice the shortfall doesn't quietly miss blocks.

Without either, a CL that omits the partitioning step looks at a short response and might conclude it hit head, when in fact the range crossed a boundary. The failure mode is silent.
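As a starting point for option 1, the partitioning helper could look something like this Python sketch. It assumes the CL has already resolved each fork's first block number from the activation timestamps and passes them in as `fork_starts`; that input, the block numbers in the example, and the name `partition_by_fork` are illustrative only.

```python
from typing import List, Tuple


def partition_by_fork(start: int, count: int,
                      fork_starts: List[Tuple[int, str]]) -> List[Tuple[str, int, int]]:
    """Split the range [start, start + count) into per-fork (fork, from, count) segments.

    `fork_starts` holds (first_block_number, fork_name) pairs sorted ascending.
    """
    end = start + count  # exclusive upper bound of the requested range
    segments = []
    for i, (first, name) in enumerate(fork_starts):
        is_last = i + 1 == len(fork_starts)
        fork_end = end if is_last else min(end, fork_starts[i + 1][0])
        seg_start = max(start, first)
        if seg_start < fork_end:
            segments.append((name, seg_start, fork_end - seg_start))
    return segments


# Hypothetical boundaries, for illustration only.
FORK_STARTS = [(0, "paris"), (100, "shanghai"), (200, "cancun")]
```

For example, `partition_by_fork(95, 10, FORK_STARTS)` yields one segment of 5 blocks for `paris` starting at 95 and one of 5 blocks for `shanghai` starting at 100, each of which maps to a separate `GET /{fork}/bodies?from=...&count=...` request.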

Member Author

I think we should add an explicit error here

@RazorClient

Member

The following points were discussed during the call:

  • Moving the fork into the header and out of the URL
  • Moving from engine/v2/ down to engine/v1/
  • Having both JSON and SSZ coexist, and gradually making SSZ compulsory past a certain fork
  • Clients can start shipping this behind a flag as soon as they want;
    after the fork at which it becomes compulsory, the JSON-RPC engine API is retired
  • /engine/vx/{fork}/payloads/witness would be the way to go for "Introduction of REST based + SSZ serialized new-payload-with-witness" (#773)
  • Progressives will not be used for now. When 7688 is SFI-ed, we can rework the design to use progressives

There is no specific devnet for this; it can be added to the glamsterdam devnets, with the flags enabled there.

@MariusVanDerWijden

Member Author

Updated the spec according to the results from the discussion
