feat(grpc): Phase 1 — in-house gRPC channel foundation #523

Closed
userFRM wants to merge 9 commits into main from feat/grpc-channel-phase1

userFRM (Owner) commented May 13, 2026

Closes #522.

Summary

Lays the foundation for replacing tonic on the MDDS path with an
in-house gRPC client that runs directly on the h2 crate. The new
stack lives under crates/thetadatadx/src/grpc/ behind the
inhouse-grpc Cargo feature; the existing tonic-backed MddsClient
remains the default code path.

The work in this PR:

  • adds a phantom-typed [Codec] performing prost encode/decode plus
    5-byte length-prefix gRPC framing ([1 byte compressed flag][4 bytes BE length][payload]),
    verified byte-for-byte against the gRPC HTTP/2 wire spec;
  • adds a [Status] parser that lifts grpc-status / grpc-message
    out of HTTP/2 trailers and refuses to panic on a missing or malformed
    trailer;
  • adds a [Channel] that owns one HTTP/2 connection (TLS via
    tokio-rustls, or plaintext h2c for sidecars and mocks), with the
    connection driver running on a background tokio task that is
    cancelled when the channel drops;
  • adds a [ServerStreaming] adapter implementing futures_core::Stream
    over an h2::RecvStream plus a BytesMut accumulator, with explicit
    state advancement from Receiving → AwaitingTrailers → Closed;
  • wires stock_list_symbols end-to-end through the new transport as
    proof-of-life (and exposes stock_list_symbols_via_tonic so the
    bench compares like-for-like);
  • adds a mock h2 server (tests/grpc_mock_server.rs) used by both the
    integration tests and the bench;
  • adds a criterion bench comparing in-house vs tonic on the same mock,
    including a counting allocator that reports bytes alloc'd per call.

Test plan

  • Unit tests for codec: roundtrip, partial header, partial payload,
    compressed-flag rejection, invalid-flag rejection, oversized
    frame rejection, two-frame demultiplex, wire-format parity vs
    the gRPC spec, plus a proptest over arbitrary Vec<Vec<u8>>
    payloads.
  • Unit tests for status: OK, error-with-message, missing trailer,
    non-numeric status, non-UTF8 status, non-UTF8 message, Display
    collapses on empty message.
  • Integration tests against the mock h2 server: single chunk,
    multi-chunk in order, non-OK status surfaces as ChannelError::Rpc,
    connect to closed port fails cleanly.
  • End-to-end test for stock_list_symbols through the in-house
    Channel (single chunk and two-chunk merge).
  • cargo fmt --all -- --check
  • cargo clippy --workspace --locked -- -D warnings
  • cargo clippy -p thetadatadx --features inhouse-grpc --benches --tests -- -D warnings
  • cargo test --workspace --features thetadatadx/inhouse-grpc
  • cargo run -p thetadatadx --features config-file --bin generate_sdk_surfaces -- --check
  • python3 scripts/check_docs_consistency.py

Bench numbers

Run: cargo bench --bench grpc_channel -p thetadatadx --features inhouse-grpc
against the in-tree mock h2 server on loopback, 100 samples, 256-symbol
response payload.

| Path     | p50       | p95       | p99       | alloc/call |
|----------|-----------|-----------|-----------|------------|
| in_house | 107.84 µs | 114.39 µs | 116.90 µs | 83,423 B   |
| tonic    | 136.41 µs | 141.50 µs | 143.30 µs | 104,139 B  |

(Mock server on loopback isolates the client-side transport difference;
production numbers will be dominated by network RTT and TLS, so these
are a floor-to-floor comparison of the framing and dispatch overhead, not
a wall-clock claim against MDDS.)

Public surface impact

Default features unchanged. With --features inhouse-grpc:

  • thetadatadx::grpc::{Channel, ChannelError, Codec, CodecError, ServerStreaming, Status, StatusParseError}
  • thetadatadx::grpc::{stock_list_symbols, stock_list_symbols_via_tonic}

Tonic stays in Cargo.toml and the existing MddsClient paths are
untouched. Python / TypeScript / C++ bindings continue to work without
recompile because the default feature set produces an identical Rust
surface.

Constraints honored

  • Conventional Commits on every commit; each commit is one logical step.
  • No #[allow(dead_code)]. No .unwrap() on user data or wire bytes.
  • No manager / helper / util in names — only gRPC spec vocabulary.
  • Workspace lints stay strict.
  • No reference to internal review history, PR numbers, or external
    tooling in source comments.

userFRM added 9 commits May 13, 2026 15:48

Declares the `inhouse-grpc` Cargo feature on `thetadatadx` and the
empty `src/grpc/` module gated behind it. Optional dependencies
(`h2`, `http`, `http-body`, `bytes`, `pin-project-lite`, `futures-core`)
are pulled in only when the feature is enabled; default builds remain
on the existing tonic stack.

The feature flag is the A/B switch consumers will use to opt into the
in-house gRPC path while it matures. Public Rust surface is unchanged
by default, so cross-binding ABI and semver-checks stay clean.

`grpc::Codec<Req, Resp>` is a phantom-typed wrapper over prost
encode/decode plus the 5-byte gRPC frame header
(`[1 byte compressed flag][4 bytes big-endian length][payload]`).
Phase 1 rejects `compressed_flag == 1` and any reserved-bits value;
Phase 5 will negotiate `grpc-encoding`.

Tests (TDD-first):
- encode emits a 5-byte header then the prost payload
- encode matches the gRPC wire spec byte-for-byte (anchors the codec
  to grpc.io semantics, not to tonic internals)
- roundtrip preserves protobuf wire bytes
- decoder returns `Ok(None)` on partial header / partial payload
- decoder rejects compressed flag, invalid flag bytes, oversized frames
- two concatenated frames decode back to two distinct messages
- proptest: arbitrary `Vec<Vec<u8>>` of payloads roundtrips through the
  framing layer with payload bytes and frame boundaries preserved
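
The framing behaviour these tests describe can be sketched without prost or `bytes`. This is only an illustration of the 5-byte header layout and the `Ok(None)`-on-partial-input contract; the names `encode_frame`, `decode_frame`, and `FrameError` are hypothetical and are not taken from the PR's `Codec`.

```rust
/// Frame header: 1 byte compressed flag + 4 bytes big-endian length.
const HEADER_LEN: usize = 5;

/// Hypothetical error type for this sketch (not the PR's `CodecError`).
#[derive(Debug, PartialEq)]
enum FrameError {
    CompressionUnsupported,
    InvalidFlag(u8),
    TooLarge { len: usize, max: usize },
}

/// Prefix `payload` with an uncompressed gRPC frame header.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, FrameError> {
    // The length field is 32 bits wide, so larger payloads cannot be framed.
    let len = u32::try_from(payload.len()).map_err(|_| FrameError::TooLarge {
        len: payload.len(),
        max: u32::MAX as usize,
    })?;
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(0); // compressed flag: 0 = uncompressed
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Split one complete frame off the front of `buf`.
/// Returns `Ok(None)` when the header or payload is still incomplete.
fn decode_frame(buf: &mut Vec<u8>, max: usize) -> Result<Option<Vec<u8>>, FrameError> {
    if buf.len() < HEADER_LEN {
        return Ok(None);
    }
    match buf[0] {
        0 => {}
        1 => return Err(FrameError::CompressionUnsupported),
        other => return Err(FrameError::InvalidFlag(other)),
    }
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    if len > max {
        return Err(FrameError::TooLarge { len, max });
    }
    if buf.len() < HEADER_LEN + len {
        return Ok(None);
    }
    let payload = buf[HEADER_LEN..HEADER_LEN + len].to_vec();
    buf.drain(..HEADER_LEN + len);
    Ok(Some(payload))
}
```

Calling `decode_frame` in a loop on a buffer holding two concatenated frames yields the two payloads in order and then `Ok(None)`, which mirrors the two-frame demultiplex case listed above.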

`grpc::Status` is included as a typed stub so the module compiles; the
trailer parser lands in the next commit.

No `.unwrap()` on wire bytes; all decode paths return `Result`.

`grpc::Status::from_trailers` walks an `http::HeaderMap` and pulls
`grpc-status` (required, numeric) and `grpc-message` (optional, UTF-8)
into a typed `Status` value. Display, code accessor, and `is_ok`
shortcut round out the surface.

Tests (TDD-first):
- OK status (`grpc-status: 0`) parses to `is_ok()`
- error status (`grpc-status: 13, grpc-message: "internal"`) preserves
  both code and message
- missing trailer is `StatusParseError::Missing` — not a panic
- non-numeric / non-UTF8 `grpc-status` and non-UTF8 `grpc-message` each
  produce a typed parse error rather than panicking on a hostile peer

The struct is `Clone + PartialEq` so error-path tests can compare it,
and `Display` collapses to `grpc-status=N` when the message is empty.
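
A rough sketch of the trailer parsing described in this commit, kept dependency-free: a slice of `(name, value)` pairs stands in for `http::HeaderMap`, and the error variants other than `Missing` are hypothetical names. The `Display` format for a non-empty message is also an assumption; only the empty-message form `grpc-status=N` comes from the text above.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct Status {
    code: u32,
    message: String,
}

/// Variant names other than `Missing` are assumptions for this sketch.
#[derive(Debug, PartialEq)]
enum StatusParseError {
    Missing,
    NonNumericStatus,
    NonUtf8Status,
    NonUtf8Message,
}

/// Look up a trailer by name (stand-in for `HeaderMap::get`).
fn find<'a>(trailers: &[(&str, &'a [u8])], name: &str) -> Option<&'a [u8]> {
    trailers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| *v)
}

impl Status {
    /// `grpc-status` is required and numeric; `grpc-message` is optional UTF-8.
    fn from_trailers(trailers: &[(&str, &[u8])]) -> Result<Self, StatusParseError> {
        let raw = find(trailers, "grpc-status").ok_or(StatusParseError::Missing)?;
        let text = std::str::from_utf8(raw).map_err(|_| StatusParseError::NonUtf8Status)?;
        let code = text
            .parse::<u32>()
            .map_err(|_| StatusParseError::NonNumericStatus)?;
        let message = match find(trailers, "grpc-message") {
            Some(bytes) => std::str::from_utf8(bytes)
                .map_err(|_| StatusParseError::NonUtf8Message)?
                .to_owned(),
            None => String::new(),
        };
        Ok(Status { code, message })
    }

    fn is_ok(&self) -> bool {
        self.code == 0
    }
}

impl fmt::Display for Status {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.message.is_empty() {
            write!(f, "grpc-status={}", self.code)
        } else {
            // Assumed format for the non-empty case.
            write!(f, "grpc-status={}: {}", self.code, self.message)
        }
    }
}
```

Every failure mode maps to a typed error value, so a hostile or buggy peer cannot trigger a panic through the trailers alone.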

References:
- https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
- https://grpc.github.io/grpc/core/md_doc_statuscodes.html

Rewrite every comment, rustdoc, and error message in the `grpc` module
in factual present tense. Removes "Phase 1" / "Phase 5" / "proof-of-life"
/ "for now" / "until later" / "production code path until..." /
"docs commit that closes this phase" / "re-export decision Phase 1 does
not need to make" and similar project-management vocabulary.

Source describes what the code does today. The codec rejects compressed
frames — full stop — rather than "Phase 5 will negotiate `grpc-encoding`".
The user-visible `CodecError::CompressionUnsupported` message is updated
to match.

`grpc::Channel` owns one HTTP/2 connection. `connect_tls` opens a TLS
session via `tokio-rustls` (caller supplies a `rustls::ClientConfig`
with `h2` ALPN); `connect_h2c` opens a plaintext connection for local
sidecars and mock harnesses. The h2 connection runs on a dedicated
tokio task spawned at connect; it terminates when the channel drops.

`server_streaming` POSTs a length-prefixed prost request over a new
HTTP/2 stream and returns a `ServerStreaming<Resp>` adapter. The
adapter drains DATA frames into a `BytesMut` accumulator, runs each
complete frame through `Codec::decode`, and on body close awaits
trailers via h2's `poll_trailers`, then either ends the stream with `None`
(OK status) or yields `Err(ChannelError::Rpc)` (non-OK status).

`tests/grpc_mock_server.rs` spins a per-test `tokio`-driven h2 server
that drains the request, emits N hardcoded `ResponseData` frames, and
closes with configurable `grpc-status` / `grpc-message`. Four
integration tests cover:

- single-chunk decode roundtrip
- multi-chunk in-order delivery (3 chunks, distinct payloads)
- non-OK status surfaces as `ChannelError::Rpc` with code + message
- connect to a closed port fails cleanly without hanging

All paths return `Result`; no `.unwrap()` on wire bytes. The
`ChannelError` enum carries owned `String`s for diagnostics so error
values survive past the buffers that produced them.
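
The adapter's state advancement (receiving DATA, awaiting trailers, closed) can be modelled synchronously. The sketch below is an assumption-laden illustration rather than the PR's `ServerStreaming`: a `Vec<u8>` replaces `BytesMut`, explicit `on_*` calls replace polling an `h2::RecvStream`, the compressed-flag check is omitted for brevity, and `Adapter`, `StreamError`, and the `TruncatedFrame` case are hypothetical.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    Receiving,
    AwaitingTrailers,
    Closed,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum StreamError {
    Rpc { code: u32, message: String },
    UnexpectedEvent,
    TruncatedFrame,
}

struct Adapter {
    state: State,
    buf: Vec<u8>, // accumulator for partially received frames
}

impl Adapter {
    fn new() -> Self {
        Adapter { state: State::Receiving, buf: Vec::new() }
    }

    /// A DATA chunk arrived: append it, then drain every complete frame.
    fn on_data(&mut self, chunk: &[u8]) -> Result<Vec<Vec<u8>>, StreamError> {
        if self.state != State::Receiving {
            return Err(StreamError::UnexpectedEvent);
        }
        self.buf.extend_from_slice(chunk);
        let mut messages = Vec::new();
        while self.buf.len() >= 5 {
            let len =
                u32::from_be_bytes([self.buf[1], self.buf[2], self.buf[3], self.buf[4]]) as usize;
            if self.buf.len() < 5 + len {
                break; // payload incomplete, wait for more DATA
            }
            messages.push(self.buf[5..5 + len].to_vec());
            self.buf.drain(..5 + len);
        }
        Ok(messages)
    }

    /// The body closed: Receiving -> AwaitingTrailers.
    fn on_body_end(&mut self) -> Result<(), StreamError> {
        if self.state != State::Receiving {
            return Err(StreamError::UnexpectedEvent);
        }
        if !self.buf.is_empty() {
            self.state = State::Closed;
            return Err(StreamError::TruncatedFrame);
        }
        self.state = State::AwaitingTrailers;
        Ok(())
    }

    /// Trailers arrived: AwaitingTrailers -> Closed, surfacing non-OK status.
    fn on_trailers(&mut self, code: u32, message: &str) -> Result<(), StreamError> {
        if self.state != State::AwaitingTrailers {
            return Err(StreamError::UnexpectedEvent);
        }
        self.state = State::Closed;
        if code == 0 {
            Ok(())
        } else {
            Err(StreamError::Rpc { code, message: message.to_owned() })
        }
    }
}
```

Keeping the state explicit makes out-of-order events (for example DATA after the trailers) an ordinary error value instead of undefined behaviour in the adapter.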

References:
- https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
- https://www.rfc-editor.org/rfc/rfc7540 (HTTP/2)

`grpc::stock_list_symbols(channel, session_uuid, client_type)` issues
`BetaThetaTerminal::GetStockListSymbols` over the in-house transport.
The function builds the same `QueryInfo` the tonic path uses
(auth_token.session_uuid, client_type, terminal_version, the lone
`client=terminal` query parameter), calls
`Channel::server_streaming`, then runs the streamed `ResponseData`
frames through the existing `decode::decode_data_table` +
`decode::extract_text_column` pipeline so both transports share the
parser.

A new `tests/grpc_stock_list_symbols.rs` integration test composes a
zstd-compressed `DataTable` of symbols into a `ResponseData`, drives
the mock h2 server from `grpc_mock_server.rs`, and asserts that the
in-house path returns the decoded symbol list. Two cases:
single-chunk and two-chunk merge.

The tonic-backed `stock_list_symbols` on `MddsClient` is untouched
and remains the default path; the in-house function is reachable
only when the `inhouse-grpc` feature is enabled.

`benches/grpc_channel.rs` measures `stock_list_symbols` against the
same mock h2 server the integration tests use, exercising both the
in-house `grpc::Channel` and `tonic::transport::Channel` end-to-end
(connect, encode, dispatch, decode, return `Vec<String>`).

The bench tracks allocations through a `#[global_allocator]` wrapper
that atomically tallies bytes allocated and deallocated. The summary line
reports `alloc_per_call` (gross bytes allocated per iteration) so the
two paths can be compared on bandwidth as well as latency.
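
For readers unfamiliar with the technique, a counting allocator of this kind can be sketched with the standard library alone. This is a minimal illustration, not the bench's actual wrapper; `CountingAllocator`, `bytes_allocated_during`, and the counter names are hypothetical.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Process-wide tallies; Relaxed ordering suffices for plain counters.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);
static DEALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Forwards to the system allocator and records requested sizes.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        DEALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}

fn deallocated_bytes() -> usize {
    DEALLOCATED.load(Ordering::Relaxed)
}

/// Run `f` and report the gross bytes allocated while it ran.
/// Allocations made concurrently by other threads are included.
fn bytes_allocated_during<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = allocated_bytes();
    let out = f();
    (out, allocated_bytes() - before)
}
```

Dividing the delta over a batch of iterations by the iteration count gives a per-call figure comparable in spirit to `alloc_per_call`; the default `realloc` and `alloc_zeroed` implementations route through `alloc` and `dealloc`, so they are counted as well.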

Bench is informational (`required-features = ["inhouse-grpc"]`); CI
runs it but does not gate on percentile movement. Initial local
numbers (Linux, 100 samples, mock server on loopback, 256 symbols):

  in_house  p50=107.84us  p95=114.39us  p99=116.90us  alloc/call=83423B
  tonic     p50=136.41us  p95=141.50us  p99=143.30us  alloc/call=104139B

A small `stock_list_symbols_via_tonic` helper in `grpc::endpoints`
runs the same request shape through the generated tonic stub so the
bench compares like-for-like without exposing the crate-private
`proto` module on the public surface.

The new `grpc_channel` bench enables criterion's `async_tokio` feature
to drive `.to_async(&rt).iter(...)`. Cargo.lock learns that criterion
now pulls `tokio` into its transitive graph for the bench profile.

Top-of-module rustdoc on `grpc/mod.rs` lays out the gRPC over HTTP/2
contract — request pseudo-headers, length-prefix frame layout, response
trailer schema — with citations to the canonical specs at grpc.github.io
and the gRPC HTTP/2 wire document on github.com/grpc/grpc.
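
As a reminder of what that contract looks like on the request side, the sketch below lists the headers the gRPC over HTTP/2 document requires for a call. The function name is hypothetical, plain string pairs stand in for `http::Request`, and the fully qualified service name is left to the caller because the proto package is not stated here.

```rust
/// Build the request headers for a gRPC call over HTTP/2.
/// `service` is the fully qualified proto service name (package.Service).
fn grpc_request_headers(
    scheme: &str,
    authority: &str,
    service: &str,
    method: &str,
) -> Vec<(&'static str, String)> {
    vec![
        // Pseudo-headers come first, as HTTP/2 requires.
        (":method", "POST".to_string()),
        (":scheme", scheme.to_string()),
        (":path", format!("/{service}/{method}")),
        (":authority", authority.to_string()),
        // `te: trailers` tells intermediaries the client accepts trailers,
        // which is where grpc-status and grpc-message arrive.
        ("te", "trailers".to_string()),
        ("content-type", "application/grpc".to_string()),
    ]
}
```

The request body is then one length-prefixed message, and the response ends with the `grpc-status` / `grpc-message` trailers.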

Module-layout section maps each submodule (`Codec`, `Status`, `Channel`,
`ServerStreaming`, `endpoints`) to the role it plays in the request /
response pipeline so a first-time reader can navigate the code from
the top of `mod.rs`.

Also tightens a few intra-doc links that landed in earlier commits
(`super::Status` in `channel.rs`, two `DEFAULT_MAX_MESSAGE_SIZE`
references in `codec.rs` that pointed at a `pub(crate)` item, and
the `MddsClient` reference in `endpoints.rs`).

@userFRM userFRM closed this May 13, 2026
@userFRM userFRM deleted the feat/grpc-channel-phase1 branch May 13, 2026 14:31
Linked issue: feat(grpc): in-house gRPC transport replacing tonic
