Replies: 1 comment
@kta-intel, this looks great, thanks for sharing the vision and write-up! As discussed offline, could you rename the local gRPC client and server to interop client and server, respectively, in the text and in the diagrams?
Summary
Enable interoperability between Flower and OpenFL
Motivation
Proposal
Enable Flower workloads to run through OpenFL's Task Runner API. Consider a standard federation with a centralized aggregation server and distributed collaborator client sites. At the server, Flower server components will communicate locally with OpenFL server components via the `FlowerInteropClient` (local gRPC client). At the client sites, Flower client components will communicate locally with OpenFL client components via the `FlowerInteropServer` (local gRPC server). Messages sent across the network will be managed by OpenFL.
This will require:
- A local gRPC client and server (`FlowerInteropClient` and `FlowerInteropServer`) that can relay messages between the Flower and OpenFL components
- A `Connector` component that will manage the `FlowerInteropClient` and Flower server components
- [...] `FlowerInteropServer` and Flower client components
- [...] `Connector`
How it will work
1. A `plan.yaml` that includes the Connector and the Flower Task Runner and Data Loader.
2. In the `./src` directory, a user will add the source code required to run a Flower experiment (e.g., app-pytorch).
3. `fx aggregator start` will start the aggregator server, the `SuperLink`, and the `ServerApp`.
4. `fx collaborator start -n collaboratorX` will start the collaborator client. The collaborator calls `GetTask()`, which will pull in a single task.
5. The task is `start_client_adapter`, which will instruct the Task Runner to start the `SuperNode` and `ClientApp`.
6. The `ServerApp` will shut off, which will trigger the `FlowerInteropClient` to add a flag to end the experiment as metadata in the `InteropMessage`. This will signal the collaborator to shut down the `ClientApp` and `SuperNode`, then call `TaskResults`, which is intended to signal the end of the OpenFL round. OpenFL will only run for 1 round (Flower will run for as many rounds as indicated in the Flower app's `pyproject.toml`). This will start the shutdown of all components and thus complete the experiment.
[Sequence Diagram to come]
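The shutdown handshake described in the steps above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the `InteropMessage` class here is a plain dataclass stand-in, and the `end_experiment` metadata key, `wrap_for_openfl`, and `run_collaborator` are hypothetical names invented for illustration. None of this is the actual OpenFL or Flower API, and no gRPC is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

# Hypothetical metadata key; the proposal does not name the actual flag.
END_FLAG = "end_experiment"


@dataclass
class InteropMessage:
    """Illustrative stand-in for a message relayed between Flower and OpenFL."""
    payload: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


def wrap_for_openfl(payload: bytes, server_app_alive: bool) -> InteropMessage:
    """Server side: wrap a Flower payload, flagging it once the ServerApp has shut off."""
    msg = InteropMessage(payload=payload)
    if not server_app_alive:
        msg.metadata[END_FLAG] = "true"
    return msg


def run_collaborator(messages: Iterable[InteropMessage]) -> List[str]:
    """Collaborator side: relay messages until the end flag arrives, then shut down."""
    events = ["start SuperNode", "start ClientApp"]
    for msg in messages:
        if msg.metadata.get(END_FLAG) == "true":
            # The flag ends the single OpenFL round: stop Flower, then report results.
            events += ["stop ClientApp", "stop SuperNode", "send TaskResults"]
            break
        events.append(f"relay {len(msg.payload)} bytes to SuperNode")
    return events


# Two ordinary messages, then one sent after the ServerApp has exited.
messages = [
    wrap_for_openfl(b"round-1", server_app_alive=True),
    wrap_for_openfl(b"round-2", server_app_alive=True),
    wrap_for_openfl(b"", server_app_alive=False),
]
events = run_collaborator(messages)
```

In this sketch the collaborator never decides on its own when to stop; it reacts only to the flag carried in the message metadata, which mirrors the idea that the Flower `ServerApp` controls the experiment's lifetime while OpenFL handles transport.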
We will split this into two phases for now.
Phase 1 [Completed]
Contain the majority of workspace-related changes inside a standalone workspace
flower-app-pytorch #1433
Phase 2 [In Progress]
Begin to migrate extended methods and added scripts to OpenFL core-level components