From 12dcfccca95f62ce612be268e35eab6af1f067de Mon Sep 17 00:00:00 2001 From: Jvst Me Date: Thu, 22 May 2025 12:22:28 +0200 Subject: [PATCH 1/2] Update dstack-proxy contributing guide --- CONTRIBUTING.md | 4 +- contributing/GATEWAY.md | 74 ------------- contributing/PROXY.md | 194 ++++++++++++++++++++++++++++++++++ contributing/RUNS-AND-JOBS.md | 2 +- gateway/README.md | 52 +-------- 5 files changed, 201 insertions(+), 125 deletions(-) delete mode 100644 contributing/GATEWAY.md create mode 100644 contributing/PROXY.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 3b65f0e94c..55a4ce65d2 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -54,6 +54,8 @@ uv run pytest src/tests --runpostgres If you'd like to integrate a new cloud provider to `dstack`, follow [contributing/BACKENDS.md](contributing/BACKENDS.md). -## Get help +## What's next + +You can find more subject-focused guides in the [contributing](contributing/) directory. If you have any questions, you can always get help in our [Discord](https://discord.gg/u8SmfwPpMd) community. diff --git a/contributing/GATEWAY.md b/contributing/GATEWAY.md deleted file mode 100644 index 4fabe316fd..0000000000 --- a/contributing/GATEWAY.md +++ /dev/null @@ -1,74 +0,0 @@ -# Gateway - -A `dstack` gateway is a dedicated instance responsible for publishing user applications to the outer internet via the HTTP protocol. One dstack gateway can serve many services, domains, or projects. - -## Gateway creation - -Gateways are managed by the `dstack` server. A gateway is associated with a project and some backend in the project. Users must attach a wildcard domain to the gateway, i.e., all direct subdomains should resolve to the gateway IP address. Since the IP address is unknown during provisioning, `dstack` doesn't check DNS records. - -Provisioning happens as follows: -1. Launch a non-GPU instance (usually the smallest) with all ports exposed. -2. Install Nginx, Certbot, and patch configs. -3. 
Create blue-green virtual environments. -4. Install the latest `dstack-gateway` from the S3 bucket. -5. Run the systemd service `dstack.gateway.service`. - -## Gateway update - -The `dstack-gateway` has a "blue-green deployment"-like configuration: there are two virtual environments to be swapped on update. The systemd service uses the newly installed package after a restart. - -The update process looks like this: -1. Install the new package to the not-used venv. -2. Update scripts and systemd service config. -3. Swap the active venv name in the file `version`. -4. Restart the systemd service. - -The `dstack-gateway` server dumps its internal state to the file `~/dstack/state.json` on termination. It tries to load the state from the same file on start. That allows updating the gateway with published services with minimal downtime. - -## Connection between server and gateway - -The `dstack` server keeps a bidirectional tunnel with each GatewayCompute for the whole uptime of the server. - -- The tunnel from the server to the gateway is used to manage the gateway: register and unregister services and replicas. -- The tunnel from the gateway to the server is used to authenticate requests to the gateway based on dstack's tokens. - -Authorization responses are cached for 60 seconds. If the server is not responding, the request is denied. - -## Nginx - -`dstack-gateway` configures an Nginx reverse proxy. Each service or entrypoint configuration is stored as `/etc/nginx/sites-enabled/{port}-{server_name}.conf`. If the Nginx reload fails, `dstack-gateway` rolls back the changes. - -`dstack-gateway` enforces HTTPS (except for local traffic). On each service registration a TLS certificate is issued by Let's Encrypt or other configured CA via Certbot. - -If there are no replicas, the service configuration always returns 503; otherwise, the upstream with replicas is used. The upstream handles load balancing for us. 
`dstack-gateway` uses Unix sockets for SSH tunnels to avoid port conflicts between services. - -Service authorization is handled with the `localhost:8000/auth` endpoint if needed. `dstack-gateway` may request services without authorization and HTTPS, for example, from the OpenAI interface. - -Entrypoint configurations forward requests back to `dstack-gateway`, to a specific module (e.g., OpenAI). Authorization is handled by those modules. - -## Gateway registry - -The core component of `dstack-gateway` is the services store. It is responsible for: - -- Registering a service — assigning a domain, creating an Nginx config. -- Registering a replica — starting an SSH tunnel, updating Nginx upstream. -- Unregistering a replica — stopping an SSH tunnel, updating Nginx upstream. -- Unregistering a service — releasing a domain, removing an Nginx config. -- Registering an entrypoint — assigning a domain, creating an Nginx config. - -To decouple the store from other modules, there is a subscription mechanism. Subscribers will be notified on register service and unregister service. - -## OpenAI interface - -The OpenAI interface subscribes to `Store` events and emulates the real OpenAI API for chat completion models. It can list running models in the project and redirect requests to the right service. - -## Stats Collector - -The Stats collector parses nginx `/var/log/nginx/dstack.access.log` to collect basic metrics: - -1. Requests per second -2. Average request processing time - -By default, it stores 5 minutes with 1-second resolution frames for each domain. It aggregates these frames in windows of the size 30 seconds, 1 minute, and 5 minutes, before sending to the server. - -To increase performance, `StatsCollector` keeps position in file and read only new records. It can detect log rotation and reopen the log file. 
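The incremental log reading described for `StatsCollector` (keep a position in the file, read only new records, reopen after rotation) can be sketched roughly as below. This is a minimal illustration and not the `dstack` implementation: the `LogTail` class, its `read_new_lines` method, and the rotation check based on inode and file size are all assumptions made for the example.

```python
import os
from typing import BinaryIO, List, Optional


class LogTail:
    """Hypothetical helper: returns only the lines appended since the last call."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._file: Optional[BinaryIO] = None
        self._inode: Optional[int] = None

    def _open(self) -> None:
        # Binary mode keeps tell()/seek() offsets comparable with st_size.
        self._file = open(self.path, "rb")
        self._inode = os.fstat(self._file.fileno()).st_ino

    def _drain(self) -> List[str]:
        assert self._file is not None
        lines: List[str] = []
        while True:
            pos = self._file.tell()
            raw = self._file.readline()
            if not raw:
                break
            if not raw.endswith(b"\n"):
                # Partially written record: rewind and retry on the next call.
                self._file.seek(pos)
                break
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
        return lines

    def read_new_lines(self) -> List[str]:
        if self._file is None:
            if not os.path.exists(self.path):
                return []
            self._open()
        # Finish reading whatever is left in the currently open file.
        lines = self._drain()
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return lines
        # Assumed rotation signals: the path now points at a different inode,
        # or the file is shorter than our position (truncated in place).
        if st.st_ino != self._inode or st.st_size < self._file.tell():
            self._file.close()
            self._open()
            lines.extend(self._drain())
        return lines
```

Calling `read_new_lines()` periodically returns an empty list when nothing was appended, so the caller pays only for new records rather than re-parsing the whole access log.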
diff --git a/contributing/PROXY.md b/contributing/PROXY.md new file mode 100644 index 0000000000..a88a063d7d --- /dev/null +++ b/contributing/PROXY.md @@ -0,0 +1,194 @@ +# `dstack-proxy` + +`dstack-proxy` is a set of `dstack` components responsible for exposing [services](https://dstack.ai/docs/concepts/services/). + +- By default, services are published at `dstack` server URL subpaths. The component that handles traffic to such services is called **in-server proxy**. It runs as part of the `dstack` server. It is implemented in `dstack._internal.server.services.proxy`. +- Users can optionally deploy a **gateway** to handle traffic to their services. The gateway app runs on a dedicated instance. Although it requires additional configuration, it provides higher performance and supports more features than the in-server proxy. It is implemented in `dstack._internal.proxy.gateway`. +- The in-server proxy and the gateway share some business logic, such as the OpenAI-compatible API and the connection pool implementation. The common details are implemented in `dstack._internal.proxy.lib`. + +## Proxy functions and modules + +### Reverse proxy + +`dstack-proxy` acts as a reverse proxy and load balancer for services. + +The in-server proxy uses a custom reverse proxy implementation based on FastAPI and httpx. It routes requests based on the `/proxy/services//` path. It performs load balancing by selecting a random service replica for each forwarded request. + +The gateway uses Nginx. It automatically maintains Nginx configs for each service in `/etc/nginx/sites-enabled/*`. Each service is published at a subdomain. Nginx performs load balancing using the round-robin method. Nginx forwards requests directly to service replicas, so traffic does not go through the gateway app. + +### HTTPS + +The in-server proxy is part of the `dstack` server, so services are only available over HTTPS if the `dstack` server is deployed with HTTPS. 
+ +The gateway can enforce HTTPS in the Nginx config. It uses Certbot to obtain TLS certificates from Let's Encrypt or another configured CA when the service is registered, unless the service configuration specifies `https: false`. + +### Auth + +Unless the service configuration specifies `auth: false`, `dstack-proxy` checks the authorization of incoming requests based on `dstack` user tokens from the `Authorization` header. + +The in-server proxy validates the tokens by querying the correct token from the database. + +On gateways, Nginx makes a headers-only subrequest to the gateway app to check authorization for each incoming request. The gateway app then makes a request to the `dstack` server to validate the token. Responses from the `dstack` server are cached for 60 seconds. If the `dstack` server does not respond, the incoming request is denied. + +### OpenAI-compatible API + +The OpenAI interface emulates the real OpenAI API for chat completion models. It can list running models in the project, convert between OpenAI and TGI request formats, and forward requests to the correct service. + +The in-server proxy forwards requests directly to service replicas. + +```mermaid +sequenceDiagram + User->>dstack server: http(s):///proxy/models// + dstack server->>Service replica: http+unix:///path/to/ssh/socket +``` + +The gateway uses Nginx to forward requests so they are included in access logs and service stats. + +```mermaid +sequenceDiagram + User->>Nginx: http(s)://gateway./ + Nginx->>Gateway app: http://localhost:8000/api/models// + Gateway app->>Nginx: http://./ + Nginx->>Service replica: http+unix:///path/to/ssh/socket +``` + +### Stats collector + +`dstack-proxy` collects service usage stats that are then used by the `dstack` server for autoscaling. 
Stats collection is only supported on gateways and is implemented by reading `/var/log/nginx/dstack.access.log`. + +### Service connection pool + +`dstack-proxy` connects to service replicas via SSH and forwards their service port to a local Unix socket, which is then used by the reverse proxy. All SSH connections are added to a pool and reused between requests. + +The in-server proxy opens an SSH connection to a replica the first time it needs to forward a request to that replica, so the first request may be delayed. + +The gateway opens SSH connections when each replica is registered. + +### Communication with the `dstack` server + +The in-server proxy is part of the `dstack` server, so no network communication takes place between them. + +Gateway-to-server communication happens over an SSH connection that is established by the `dstack` server when it creates a new gateway or when the server starts. The SSH connection includes bidirectional port forwarding, enabling the gateway and the `dstack` server to call each other's APIs: + +- The server calls the gateway to register services, fetch stats, etc. +- The gateway calls the server to validate user tokens. + +### Storage + +`dstack-proxy` has a set of stored models and a common storage repo interface. + +The in-server proxy repo implementation fetches the models from the database. It doesn't write to the database, as all the details about services and replicas are written by the `dstack` server. + +The gateway maintains its own in-memory storage repo. A copy of the repo is also stored in `~/dstack/state-v2.json`. The copy is updated on every write operation and is used for data recovery after restarts. The repo is populated when new services and replicas are registered. + +## Dependency injection + +When a module has to be implemented differently for the in-server proxy and the gateway, `dstack-proxy` uses interfaces with multiple implementations. For example, there are common interfaces for the storage repo and for auth checks. 
The in-server proxy and the gateway provide their own implementations for both interfaces. + +```mermaid +classDiagram + class BaseProxyRepo { + <> + } + BaseProxyRepo <|-- ServerProxyRepo + BaseProxyRepo <|-- GatewayProxyRepo + + class BaseProxyAuthProvider { + <> + } + BaseProxyAuthProvider <|-- ServerProxyAuthProvider + BaseProxyAuthProvider <|-- GatewayProxyAuthProvider +``` + +`dstack-proxy` then uses dependency injection to select the relevant implementation. Both the in-server proxy and the gateway provide an "injector" class that is used to obtain concrete interface implementations. + +```mermaid +classDiagram + class ProxyDependencyInjector { + <> + +get_repo() BaseProxyRepo + +get_auth_provider() BaseProxyAuthProvider + } + class ServerProxyDependencyInjector { + +get_repo() ServerProxyRepo + +get_auth_provider() ServerProxyAuthProvider + } + class GatewayDependencyInjector { + +get_repo() GatewayProxyRepo + +get_auth_provider() GatewayProxyAuthProvider + } + ProxyDependencyInjector <|-- ServerProxyDependencyInjector + ProxyDependencyInjector <|-- GatewayDependencyInjector +``` + +An instance of the relevant injector class is stored in the FastAPI global app state and can be accessed from FastAPI path operations. There are helper functions that can be used as FastAPI path operation dependencies to obtain the injector or a module implementation: `get_injector`, `get_proxy_repo`, `get_proxy_auth_provider`, etc. + +There are also similar gateway-specific helper functions to obtain gateway-specific module implementations: `get_gateway_proxy_repo` (guaranteed to return the gateway repo, which has more methods than the base repo interface), `get_nginx`, `get_stats_collector`, etc. + +## Gateway operations + +Gateway instances are managed by the `dstack` server. A gateway is associated with a project and some backend in the project. In `dstack` Sky, there is also a global gateway associated with all projects. 
+ +### Creation + +Users can create a gateway using the `dstack apply` command. The gateway YAML configuration must specify the gateway's wildcard domain: all direct subdomains should resolve to the gateway IP address. Since the IP address is unknown during provisioning, `dstack` doesn't check DNS records. + +Provisioning happens as follows: +1. Launch a non-GPU instance (usually the smallest) with all ports exposed. +2. Install Nginx, Certbot, and patch configs. +3. Create blue-green virtual environments. +4. Install the latest `dstack-gateway` package from the S3 bucket. `dstack-gateway` is a thin package that depends on the `dstack` package, which contains the actual gateway implementation. +5. Run the systemd service `dstack.gateway.service`. + +### Update + +The gateway has a "blue-green deployment"-like configuration: there are two virtual environments to be swapped on update. The systemd service uses the newly installed package after a restart. + +The update process looks like this: +1. Install the new package into the unused venv. +2. Update scripts and systemd service config. +3. Swap the active venv name in `~/dstack/version`. +4. Restart the systemd service. + +## Gateway development + +The gateway app needs to interact with Nginx and certbot, so running it locally can be challenging. One way to test your code is to upload your development branch to an existing gateway and run the gateway app from source. + +1. Run `dstack server` with the `DSTACK_SKIP_GATEWAY_UPDATE=1` environment variable set. This will prevent `dstack` from updating and starting the standard gateway version on each server restart. + +1. Provision a gateway through `dstack`: + + ```shell + dstack apply -f my-gateway.dstack.yml + ``` + +1. Save the gateway key to a file. You can find the key in sqlite, e.g.: + + ```shell + sqlite3 ~/.dstack/server/data/sqlite.db "SELECT ip_address, ssh_private_key FROM gateway_computes" + ``` + +1. 
Connect to the gateway: + + ```shell + chmod 600 /path/to/the/gateway/key + ssh -i /path/to/the/gateway/key ubuntu@gateway.example + ``` + +1. Prepare an environment with your development branch on the gateway: + + ```shell + git clone https://github.com/dstackai/dstack.git dstack-repo + cd dstack-repo + git checkout my-development-branch + curl -LsSf https://astral.sh/uv/install.sh | sh + source ~/.local/bin/env + uv sync --extra gateway + ``` + +1. Stop the gateway service and start your development version from source: + + ```shell + sudo systemctl stop dstack.gateway.service + uv run uvicorn dstack._internal.proxy.gateway.main:app + ``` diff --git a/contributing/RUNS-AND-JOBS.md b/contributing/RUNS-AND-JOBS.md index 385f52fbe9..b2c0430af4 100644 --- a/contributing/RUNS-AND-JOBS.md +++ b/contributing/RUNS-AND-JOBS.md @@ -11,7 +11,7 @@ Runs are created from run configurations. There are three types of run configura 1. `dev-environment` — runs a VS Code server. 2. `task` — runs the user's bash script until completion. -3. `service` — runs the user's bash script and exposes a port through the [gateway](GATEWAY.md) or the server built-in proxy. +3. `service` — runs the user's bash script and exposes a port through [dstack-proxy](PROXY.md). A run can spawn one or multiple jobs, depending on the configuration. A task that specifies multiple `nodes` spawns a job for every node (a multi-node task). A service that specifies multiple `replicas` spawns a job for every replica. A job submission is always assigned to one particular instance. If a job fails and the configuration allows retrying, the server creates a new job submission for the job. diff --git a/gateway/README.md b/gateway/README.md index f70a7661f1..05d7678292 100644 --- a/gateway/README.md +++ b/gateway/README.md @@ -1,51 +1,5 @@ -# dstack gateway +# `dstack-gateway` -## Purpose +A thin package to deliver and install the gateway app. 
Expected to be merged with the `dstack` package in [#2251](https://github.com/dstackai/dstack/issues/2251). -* Make dstack services available to the outside world -* Manage SSL certificates -* Manage nginx configs -* Establish SSH tunnels from gateway to dstack runner -* Proxy OpenAI API requests to different formats (e.g. TGI) - -## Development - -1. Run `dstack server` with `DSTACK_SKIP_GATEWAY_UPDATE=1` environment variable. This will prevent dstack from updating the gateway to standard version on each server restart. - -1. Provision a gateway through dstack: - - ```shell - dstack gateway create --backend aws --region us-east-1 --domain my.wildcard.domain.com - ``` - -1. Save the gateway key to a file. You can find the key in sqlite, e.g.: - - ```shell - sqlite3 ~/.dstack/server/data/sqlite.db "SELECT ip_address, ssh_private_key FROM gateway_computes" - ``` - -1. Build gateway locally and deploy it: - - ```shell - HOST=ubuntu@x.my.wildcard.domain.com - ID_RSA=/path/to/the/gateway/key - WHEEL=dstack_gateway-0.0.0-py3-none-any.whl - - python -m build . - scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i "${ID_RSA}" "./dist/${WHEEL}" "${HOST}":/tmp/ - ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i "${ID_RSA}" "${HOST}" "/bin/sh /home/ubuntu/dstack/update.sh /tmp/${WHEEL} dev" - ``` - -1. Open SSH tunnel to the gateway: - - ```shell - ssh -L 9001:localhost:8000 -i "${ID_RSA}" "${HOST}" - ``` - -1. Visit the gateway docs page at http://localhost:9001/docs - -1. To follow logs, use this command in SSH: - - ```shell - journalctl -u dstack.gateway.service -f - ``` +For details about gateways, see [contributing/PROXY.md](../contributing/PROXY.md). 
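As a reading aid for this patch, the dependency-injection layout that `contributing/PROXY.md` describes can be sketched in plain Python. The class names and the `get_repo` / `get_auth_provider` methods are taken from the diagrams in the guide; everything else is an assumption for illustration only: the `list_services`, `add_service`, and `is_authorized` methods, the constructors, and the `list_services_for` function are hypothetical, and the real code wires the injector through FastAPI app state rather than passing it by hand.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class BaseProxyRepo(ABC):
    @abstractmethod
    def list_services(self) -> List[str]:  # hypothetical method
        ...


class BaseProxyAuthProvider(ABC):
    @abstractmethod
    def is_authorized(self, token: str) -> bool:  # hypothetical method
        ...


class ProxyDependencyInjector(ABC):
    @abstractmethod
    def get_repo(self) -> BaseProxyRepo: ...

    @abstractmethod
    def get_auth_provider(self) -> BaseProxyAuthProvider: ...


class GatewayProxyRepo(BaseProxyRepo):
    """In-memory stand-in for the gateway repo."""

    def __init__(self) -> None:
        self._services: List[str] = []

    def add_service(self, name: str) -> None:  # extra method beyond the base interface
        self._services.append(name)

    def list_services(self) -> List[str]:
        return list(self._services)


class GatewayProxyAuthProvider(BaseProxyAuthProvider):
    def __init__(self, valid_tokens: Set[str]) -> None:
        self._valid_tokens = valid_tokens

    def is_authorized(self, token: str) -> bool:
        return token in self._valid_tokens


class GatewayDependencyInjector(ProxyDependencyInjector):
    def __init__(self, repo: GatewayProxyRepo, auth: GatewayProxyAuthProvider) -> None:
        self._repo = repo
        self._auth = auth

    def get_repo(self) -> GatewayProxyRepo:
        return self._repo

    def get_auth_provider(self) -> GatewayProxyAuthProvider:
        return self._auth


def list_services_for(injector: ProxyDependencyInjector, token: str) -> List[str]:
    # Shared logic depends only on the interfaces, so the same function
    # would work with a server-side injector as well.
    if not injector.get_auth_provider().is_authorized(token):
        raise PermissionError("invalid token")
    return injector.get_repo().list_services()
```

The point of the pattern is that code like `list_services_for` never names a concrete repo or auth class, so the in-server proxy and the gateway can each supply their own implementations without touching the shared logic.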
From eca3e3def5add18a6ba08f1cd8b3fefbff61e821 Mon Sep 17 00:00:00 2001 From: Jvst Me Date: Wed, 28 May 2025 15:13:25 +0200 Subject: [PATCH 2/2] Example on pushing without GitHub Also drop the dependency on `build`, since the guide no longer suggests building wheels --- contributing/PROXY.md | 27 ++++++++++++++++++++------- pyproject.toml | 1 - 2 files changed, 20 insertions(+), 8 deletions(-) diff --git a/contributing/PROXY.md b/contributing/PROXY.md index a88a063d7d..bc79c4fbbe 100644 --- a/contributing/PROXY.md +++ b/contributing/PROXY.md @@ -162,25 +162,38 @@ The gateway app needs to interact with Nginx and certbot, so running it locally dstack apply -f my-gateway.dstack.yml ``` -1. Save the gateway key to a file. You can find the key in sqlite, e.g.: +1. Save the gateway key to a file: ```shell - sqlite3 ~/.dstack/server/data/sqlite.db "SELECT ip_address, ssh_private_key FROM gateway_computes" + sqlite3 ~/.dstack/server/data/sqlite.db "SELECT ssh_private_key FROM gateway_computes WHERE deleted = 0 AND ip_address = ''" > /tmp/gateway.key + chmod 600 /tmp/gateway.key + ``` + +1. Deliver your code to the gateway. For example, clone it from a remote repo: + + ```shell + ssh -i /tmp/gateway.key ubuntu@gateway.example "git clone https://github.com/dstackai/dstack.git ~/dstack-repo" + ``` + + Or push it from your machine: + + ```shell + ssh -i /tmp/gateway.key ubuntu@gateway.example "git init ~/dstack-repo" + git remote add gateway ubuntu@gateway.example:~/dstack-repo + GIT_SSH_COMMAND='ssh -i /tmp/gateway.key' git push gateway branch_name ``` 1. Connect to the gateway: ```shell - chmod 600 /path/to/the/gateway/key - ssh -i /path/to/the/gateway/key ubuntu@gateway.example + ssh -i /tmp/gateway.key ubuntu@gateway.example ``` 1. 
Prepare an environment with your development branch on the gateway: ```shell - git clone https://github.com/dstackai/dstack.git dstack-repo - cd dstack-repo - git checkout my-development-branch + cd ~/dstack-repo + git checkout branch_name curl -LsSf https://astral.sh/uv/install.sh | sh source ~/.local/bin/env uv sync --extra gateway diff --git a/pyproject.toml b/pyproject.toml index 436c37bfce..14507151be 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -78,7 +78,6 @@ ignore-case = true [dependency-groups] dev = [ - "build>=1.2.2.post1", "httpx>=0.28.1", "pre-commit>=4.2.0", "pytest-asyncio>=0.23.8",