Abstract away gRPC details from the worker protocol #483

Description

@gabotechs

Today, this project is deeply tied to gRPC and protobuf, as the worker protocol is autogenerated from a protobuf definition (worker.proto).

Typically, this is not a problem, as most projects are fine with sticking to the classical gRPC+protobuf stack for inter-worker communication, but there are some cases where people would like to use a different transport mechanism.

Providing custom implementations for the prost-generated stubs is not acceptable, as that still forces people into protobuf serialization when they might be fine with just in-memory communication or their own proprietary protocols.

The source of truth for all worker interactions is defined in the protobuf specification here:

service WorkerService {
  // Establishes a bidirectional message stream between a coordinator and a worker, over which messages
  // will be exchanged at any time during a query's lifetime. It's expected to be one coordinator channel
  // per task.
  rpc CoordinatorChannel(stream CoordinatorToWorkerMsg) returns (stream WorkerToCoordinatorMsg);
  // Executes the requested partition range of a subplan previously sent by the coordinator channel.
  rpc ExecuteTask(ExecuteTaskRequest) returns (stream FlightData);
  // Returns metadata about a worker. Currently only used for worker versioning.
  rpc GetWorkerInfo(GetWorkerInfoRequest) returns (GetWorkerInfoResponse);
}

One possible approach is to build a new trait that abstracts away the WorkerServiceClient currently autogenerated by prost and exposes an interface similar in shape to the one in today's worker.proto, but not coupled to gRPC or protobuf.
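To make the idea concrete, here is a minimal sketch of what such a trait could look like, together with an in-memory implementation that never touches protobuf. Everything in it is an assumption for illustration: the trait name `WorkerProtocol`, the placeholder message structs and their fields, `ProtocolError`, and `InMemoryWorker` do not exist in the project. It is also synchronous and uses only `std` channels and iterators so it stays self-contained; a real version would most likely be async and stream-based.

```rust
use std::sync::mpsc::{channel, Receiver};

// Hypothetical transport-agnostic message types. The real ones would mirror
// the messages in worker.proto; the fields here are placeholders.
#[derive(Debug, Clone, PartialEq)]
pub struct CoordinatorToWorkerMsg(pub String);
#[derive(Debug, Clone, PartialEq)]
pub struct WorkerToCoordinatorMsg(pub String);
#[derive(Debug, Clone, PartialEq)]
pub struct ExecuteTaskRequest {
    pub partition_start: usize,
    pub partition_end: usize,
}
// Stand-in for the FlightData items streamed back by ExecuteTask.
#[derive(Debug, Clone, PartialEq)]
pub struct DataChunk(pub Vec<u8>);
#[derive(Debug, Clone, PartialEq)]
pub struct WorkerInfo {
    pub version: String,
}
#[derive(Debug)]
pub struct ProtocolError(pub String);

pub type ChunkStream = Box<dyn Iterator<Item = Result<DataChunk, ProtocolError>>>;

/// Same three operations as WorkerService, with no gRPC or protobuf types.
pub trait WorkerProtocol {
    /// Bidirectional channel: consumes coordinator messages, yields worker messages.
    fn coordinator_channel(
        &self,
        inbound: Receiver<CoordinatorToWorkerMsg>,
    ) -> Result<Receiver<WorkerToCoordinatorMsg>, ProtocolError>;

    /// Executes a partition range and returns the resulting chunks lazily.
    fn execute_task(&self, req: ExecuteTaskRequest) -> Result<ChunkStream, ProtocolError>;

    /// Returns worker metadata.
    fn get_worker_info(&self) -> Result<WorkerInfo, ProtocolError>;
}

/// Toy in-memory implementation: no serialization, no network.
pub struct InMemoryWorker {
    pub version: String,
}

impl WorkerProtocol for InMemoryWorker {
    fn coordinator_channel(
        &self,
        inbound: Receiver<CoordinatorToWorkerMsg>,
    ) -> Result<Receiver<WorkerToCoordinatorMsg>, ProtocolError> {
        let (tx, rx) = channel();
        // Acknowledge every coordinator message until either side hangs up.
        std::thread::spawn(move || {
            for msg in inbound {
                let reply = WorkerToCoordinatorMsg(format!("ack: {}", msg.0));
                if tx.send(reply).is_err() {
                    break;
                }
            }
        });
        Ok(rx)
    }

    fn execute_task(&self, req: ExecuteTaskRequest) -> Result<ChunkStream, ProtocolError> {
        if req.partition_start > req.partition_end {
            return Err(ProtocolError("invalid partition range".to_string()));
        }
        // One dummy chunk per requested partition.
        let chunks = (req.partition_start..req.partition_end)
            .map(|p| Ok(DataChunk(vec![p as u8])));
        Ok(Box::new(chunks))
    }

    fn get_worker_info(&self) -> Result<WorkerInfo, ProtocolError> {
        Ok(WorkerInfo { version: self.version.clone() })
    }
}
```

With a shape like this, code that talks to workers could depend only on `WorkerProtocol`, and the gRPC client would become one implementation among others.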

One structure that comes to mind is to declare the protocol definition as a set of well-documented traits in:

src/
  worker/
    protocol.rs

And place specific implementations somewhere else, like:

src/
  protocol_impl/
    grpc/
    ...

Ideally, that should allow hiding all gRPC-specific dependencies behind a feature flag, so that they are not even compiled if people do not care about gRPC.
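As a rough sketch of how the feature gating could work, the snippet below compiles a gRPC-backed implementation only when a Cargo feature is enabled. The feature name `grpc`, the `Transport` trait, and both transport structs are hypothetical and only illustrate the mechanism. The module layout from above is flattened into one file to keep the example self-contained.

```rust
// Hypothetical Cargo.toml fragment (assumes tonic and prost are declared
// as optional dependencies):
//
//   [features]
//   default = ["grpc"]
//   grpc = ["dep:tonic", "dep:prost"]

/// Transport-agnostic definition; always compiled.
pub trait Transport {
    fn name(&self) -> &'static str;
}

/// Always available, needs no gRPC dependencies.
pub struct InMemoryTransport;

impl Transport for InMemoryTransport {
    fn name(&self) -> &'static str {
        "in-memory"
    }
}

/// Only compiled when the hypothetical `grpc` feature is enabled.
#[cfg(feature = "grpc")]
pub struct GrpcTransport;

#[cfg(feature = "grpc")]
impl Transport for GrpcTransport {
    fn name(&self) -> &'static str {
        "grpc"
    }
}

/// Lists the transports that were compiled into this build.
pub fn available_transports() -> Vec<Box<dyn Transport>> {
    #[allow(unused_mut)]
    let mut transports: Vec<Box<dyn Transport>> = vec![Box::new(InMemoryTransport)];
    #[cfg(feature = "grpc")]
    transports.push(Box::new(GrpcTransport));
    transports
}
```

Under these assumptions, users who build with `--no-default-features` would get only the in-memory transport and would not need to compile the gRPC dependencies at all.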