shardgate

Playing around with Rust and distributed systems.

Core components

node = storage node responsible for storing and retrieving data chunks in its own SQLite database (one database per node)
control-plane = core orchestrator service that handles chunk -> node mappings, performs health checks, and handles replication and node management. It exposes basic admin endpoints that give an overview of the system state.
cli = client to upload and download files
assembler = service that stores file manifests for retrieval. The cli client requests a filename, which can then be mapped to the client id and content hash for retrieval.
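To make the assembler's role concrete, here is a minimal sketch of a manifest and an in-memory lookup keyed by client id and filename. The `Manifest` fields, the `ManifestStore` type and its methods are hypothetical names for illustration; the actual crate may store manifests differently.

```rust
use std::collections::HashMap;

/// Hypothetical manifest shape; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Manifest {
    pub filename: String,
    pub client_id: String,
    pub content_hash: String,
    /// Chunk ids in file order.
    pub chunk_ids: Vec<String>,
}

/// In-memory stand-in for the assembler's manifest storage.
#[derive(Default)]
pub struct ManifestStore {
    by_key: HashMap<(String, String), Manifest>,
}

impl ManifestStore {
    /// Store a manifest under (client id, filename), replacing any previous one.
    pub fn put(&mut self, manifest: Manifest) {
        let key = (manifest.client_id.clone(), manifest.filename.clone());
        self.by_key.insert(key, manifest);
    }

    /// Look up the manifest for a client's file, if one was uploaded.
    pub fn get(&self, client_id: &str, filename: &str) -> Option<&Manifest> {
        self.by_key
            .get(&(client_id.to_string(), filename.to_string()))
    }
}
```

Keying on the (client id, filename) pair means two clients can upload files with the same name without overwriting each other.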

Interactions

UPLOAD

User uploads a file via the cli
The cli hashes the file contents to get a unique id, breaks the file into fixed-size chunks and sends them to the control-plane to be stored
The cli also generates a manifest of the file containing metadata such as the original filename, client id and a list of chunks generated as -. This is sent to the assembler.
The control-plane uses a ring hash to determine which node to send each chunk to. Each chunk is replicated n times based on policy config, and each replica is sent to a node to be stored.
Nodes receive chunks and PUT them into their db under the chunk id
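The chunking and placement steps above can be sketched as follows. This is an illustration under assumptions, not the project's code: `split_chunks`, `Ring`, `nodes_for` and the virtual-node count are hypothetical, and `DefaultHasher` from the standard library stands in for whatever hash the real ring uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Stand-in hash (an assumption). `DefaultHasher::new()` uses fixed keys, but its
/// algorithm is not guaranteed stable across Rust releases, so a real ring would
/// want an explicitly chosen hash.
fn hash64<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Split file contents into fixed-size chunks; the last one may be shorter.
/// `chunk_size` must be non-zero (`slice::chunks` panics on 0).
pub fn split_chunks(data: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    data.chunks(chunk_size).collect()
}

/// Hypothetical hash ring: position on the ring -> node name.
pub struct Ring {
    points: BTreeMap<u64, String>,
}

impl Ring {
    /// Place each node at `vnodes` points on the ring to even out the spread.
    pub fn new(nodes: &[&str], vnodes: u32) -> Self {
        let mut points = BTreeMap::new();
        for node in nodes {
            for v in 0..vnodes {
                points.insert(hash64(&(*node, v)), (*node).to_string());
            }
        }
        Ring { points }
    }

    /// Walk clockwise from the chunk's position, wrapping around once, and
    /// collect up to `replicas` distinct nodes.
    pub fn nodes_for(&self, chunk_id: &str, replicas: usize) -> Vec<String> {
        let start = hash64(chunk_id);
        let mut chosen: Vec<String> = Vec::new();
        let walk = self.points.range(start..).chain(self.points.range(..start));
        for (_, node) in walk {
            if chosen.len() == replicas {
                break;
            }
            if !chosen.contains(node) {
                chosen.push(node.clone());
            }
        }
        chosen
    }
}
```

Because the walk depends only on the chunk id and the set of nodes, the same lookup can be repeated at download time to find the candidate holders. If fewer than `replicas` nodes exist, the sketch returns as many distinct nodes as it can.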

DOWNLOAD

User requests download of a filename via the cli
The cli sends the filename and client id to the assembler to handle file retrieval
The assembler finds the manifest matching the client and filename
It uses futures to send requests for all of the manifest's chunks to the control-plane and awaits the results.
The control-plane ring hash is deterministic, so it knows which nodes may hold each chunk. It sends GET requests to these nodes and returns whichever response arrives first to the assembler.
Once all chunks are received, the assembler recreates the file in index order and sends it back to the cli client as a file download. If any chunk failed, it aborts.
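The last two download steps can be sketched with the standard library only. Note the swap: the project uses futures, while this sketch uses threads and a channel so that it stays dependency-free. `Fetch`, `first_response` and `reassemble` are hypothetical names, and each closure stands in for a GET request to one node.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// One pending GET to a replica: returns the chunk bytes, or None on failure.
pub type Fetch = Box<dyn FnOnce() -> Option<Vec<u8>> + Send>;

/// Ask every replica in parallel and return the first successful response.
pub fn first_response(replicas: Vec<Fetch>) -> Option<Vec<u8>> {
    let (tx, rx) = mpsc::channel();
    for fetch in replicas {
        let tx = tx.clone();
        thread::spawn(move || {
            if let Some(bytes) = fetch() {
                // The receiver may already be gone if another replica won.
                let _ = tx.send(bytes);
            }
        });
    }
    // Drop the original sender so `recv` errors once every replica has finished.
    drop(tx);
    rx.recv().ok()
}

/// Rebuild the file by concatenating chunks in manifest order.
/// Aborts with the id of the first missing chunk.
pub fn reassemble(
    chunk_ids: &[String],
    fetched: &HashMap<String, Vec<u8>>,
) -> Result<Vec<u8>, String> {
    let mut file = Vec::new();
    for id in chunk_ids {
        match fetched.get(id) {
            Some(bytes) => file.extend_from_slice(bytes),
            None => return Err(id.clone()),
        }
    }
    Ok(file)
}
```

Returning the first response keeps read latency close to that of the fastest replica. Returning an error on the first missing chunk mirrors the abort behaviour described above, so a partial file is never sent back to the client.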

BACKGROUND

The control-plane performs health checks on nodes, looking for failed or absent responses
If a node fails its health check, the control-plane marks it as dead, finds the chunks that node held and replicates them to another node to maintain n replicas.
It then spawns another node so that the node fleet stays at a certain threshold.
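A rough sketch of the re-replication bookkeeping after a failed health check is shown below. The `Placement` type, its fields and `handle_failure` are hypothetical, and picking the lowest-named node as source and target is an assumption made only to keep the result deterministic. Spawning a replacement node is not modelled.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical control-plane bookkeeping: which nodes hold which chunk.
pub struct Placement {
    pub live_nodes: HashSet<String>,
    /// chunk id -> nodes currently holding a replica
    pub holders: HashMap<String, HashSet<String>>,
}

impl Placement {
    /// Mark `dead` as failed and plan the copies needed to bring every chunk it
    /// held back up to `replicas` holders.
    /// Returns (chunk id, source node, target node) triples.
    pub fn handle_failure(
        &mut self,
        dead: &str,
        replicas: usize,
    ) -> Vec<(String, String, String)> {
        self.live_nodes.remove(dead);
        let mut plan = Vec::new();
        for (chunk, nodes) in self.holders.iter_mut() {
            if !nodes.remove(dead) {
                continue; // the dead node never held this chunk
            }
            // A surviving replica is needed to copy from; otherwise the chunk is lost.
            let source = match nodes.iter().min() {
                Some(node) => node.clone(),
                None => continue,
            };
            let mut candidates: Vec<&String> = self
                .live_nodes
                .iter()
                .filter(|n| !nodes.contains(*n))
                .collect();
            candidates.sort();
            let missing = replicas.saturating_sub(nodes.len());
            let targets: Vec<String> =
                candidates.into_iter().take(missing).cloned().collect();
            for target in targets {
                nodes.insert(target.clone());
                plan.push((chunk.clone(), source.clone(), target));
            }
        }
        plan.sort();
        plan
    }
}
```

The returned plan only describes which copies to make. Actually transferring the chunk from source to target, and retrying if that copy fails, would be handled separately.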

ENVOY?

I originally tried this using Envoy as the orchestrator. It works fine for single-node storage, but it is harder to keep explicit control over replicas and to track where chunks are stored. Maybe Envoy comes back later in some form, since I do want to explore proxies more.

TODO

Would be cool to have a UI that links in to the admin endpoints and visualises the node network and how the data is being transferred.
Maybe some basic controls to stress test the network and see how it performs.

