| 1 | +# `dstack-proxy` |
| 2 | + |
| 3 | +`dstack-proxy` is a set of `dstack` components responsible for exposing [services](https://dstack.ai/docs/concepts/services/). |
| 4 | + |
| 5 | +- By default, services are published at `dstack` server URL subpaths. The component that handles traffic to such services is called **in-server proxy**. It runs as part of the `dstack` server. It is implemented in `dstack._internal.server.services.proxy`. |
| 6 | +- Users can optionally deploy a **gateway** to handle traffic to their services. The gateway app runs on a dedicated instance. Although it requires additional configuration, it provides higher performance and supports more features than the in-server proxy. It is implemented in `dstack._internal.proxy.gateway`. |
| 7 | +- The in-server proxy and the gateway share some business logic, such as the OpenAI-compatible API and the connection pool implementation. The common details are implemented in `dstack._internal.proxy.lib`. |
| 8 | + |
| 9 | +## Proxy functions and modules |
| 10 | + |
| 11 | +### Reverse proxy |
| 12 | + |
| 13 | +`dstack-proxy` acts as a reverse proxy and load balancer for services. |
| 14 | + |
| 15 | +The in-server proxy uses a custom reverse proxy implementation based on FastAPI and httpx. It routes requests based on the `/proxy/services/<project>/<service>` path. It performs load balancing by selecting a random service replica for each forwarded request. |
| 16 | + |
| 17 | +The gateway uses Nginx. It automatically maintains Nginx configs for each service in `/etc/nginx/sites-enabled/*`. Each service is published at a subdomain. Nginx performs load balancing using the round-robin method. Nginx forwards requests directly to service replicas, so traffic does not go through the gateway app. |
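The two load-balancing strategies mentioned above can be contrasted with a minimal sketch. The replica names and both helper functions are hypothetical and only illustrate the selection logic; they are not taken from the `dstack` codebase or from Nginx.

```python
import itertools
import random
from typing import Iterator, List


def pick_random(replicas: List[str]) -> str:
    """Random selection: every request independently picks any replica."""
    return random.choice(replicas)


def round_robin(replicas: List[str]) -> Iterator[str]:
    """Round-robin: requests cycle through the replicas in a fixed order."""
    return itertools.cycle(replicas)


replicas = ["replica-0", "replica-1", "replica-2"]  # hypothetical replica names

# Round-robin wraps around to the first replica after the last one.
rr = round_robin(replicas)
first_four = [next(rr) for _ in range(4)]

# Random selection gives no ordering guarantee, only membership.
chosen = pick_random(replicas)
```

Random selection needs no shared state between requests, while round-robin spreads consecutive requests evenly across replicas.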
| 18 | + |
| 19 | +### HTTPS |
| 20 | + |
| 21 | +The in-server proxy is part of the `dstack` server, so services are only available over HTTPS if the `dstack` server is deployed with HTTPS. |
| 22 | + |
| 23 | +The gateway can enforce HTTPS in the Nginx config. It uses Certbot to obtain TLS certificates from Let's Encrypt or another configured CA when the service is registered, unless the service configuration specifies `https: false`. |
| 24 | + |
| 25 | +### Auth |
| 26 | + |
| 27 | +Unless the service configuration specifies `auth: false`, `dstack-proxy` checks the authorization of incoming requests based on `dstack` user tokens from the `Authorization` header. |
| 28 | + |
| 29 | +The in-server proxy validates tokens by comparing them against the token stored in the database. |
| 30 | + |
| 31 | +On gateways, Nginx makes a headers-only subrequest to the gateway app to check authorization for each incoming request. The gateway app then makes a request to the `dstack` server to validate the token. Responses from the `dstack` server are cached for 60 seconds. If the `dstack` server does not respond, the incoming request is denied. |
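The caching and fail-closed behavior described above could look roughly like the following sketch. The class name, the `validate_remote` callback, and the choice to key the cache by token alone are assumptions made for illustration; the actual gateway code may be organized differently.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 60.0  # the 60-second cache lifetime mentioned in the text


class CachedTokenValidator:
    """Hypothetical validator: caches server answers and denies on server failure."""

    def __init__(
        self,
        validate_remote: Callable[[str], bool],  # stand-in for the call to the dstack server
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._validate_remote = validate_remote
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}  # token -> (allowed, stored_at)

    def is_allowed(self, token: str) -> bool:
        now = self._clock()
        cached = self._cache.get(token)
        if cached is not None and now - cached[1] < CACHE_TTL_SECONDS:
            return cached[0]  # fresh cached answer, no server round trip
        try:
            allowed = self._validate_remote(token)
        except Exception:
            # The server did not respond: deny the request and cache nothing.
            return False
        self._cache[token] = (allowed, now)
        return allowed
```

Injecting the clock keeps the expiry logic easy to exercise without waiting for real time to pass.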
| 32 | + |
| 33 | +### OpenAI-compatible API |
| 34 | + |
| 35 | +The OpenAI interface emulates the real OpenAI API for chat completion models. It can list running models in the project, convert between OpenAI and TGI request formats, and forward requests to the correct service. |
| 36 | + |
| 37 | +The in-server proxy forwards requests directly to service replicas. |
| 38 | + |
| 39 | +```mermaid |
| 40 | +sequenceDiagram |
| 41 | + User->>dstack server: http(s)://<server>/proxy/models/<project>/ |
| 42 | + dstack server->>Service replica: http+unix:///path/to/ssh/socket |
| 43 | +``` |
| 44 | + |
| 45 | +The gateway uses Nginx to forward requests so they are included in access logs and service stats. |
| 46 | + |
| 47 | +```mermaid |
| 48 | +sequenceDiagram |
| 49 | + User->>Nginx: http(s)://gateway.<gateway>/ |
| 50 | + Nginx->>Gateway app: http://localhost:8000/api/models/<project>/ |
| 51 | + Gateway app->>Nginx: http://<service>.<gateway>/ |
| 52 | + Nginx->>Service replica: http+unix:///path/to/ssh/socket |
| 53 | +``` |
| 54 | + |
| 55 | +### Stats collector |
| 56 | + |
| 57 | +`dstack-proxy` collects service usage stats that are then used by the `dstack` server for autoscaling. Stats collection is only supported on gateways and is implemented by reading `/var/log/nginx/dstack.access.log`. |
| 58 | + |
| 59 | +### Service connection pool |
| 60 | + |
| 61 | +`dstack-proxy` connects to service replicas via SSH and forwards their service port to a local Unix socket, which is then used by the reverse proxy. All SSH connections are added to a pool and reused between requests. |
| 62 | + |
| 63 | +The in-server proxy opens an SSH connection to a replica only when it first needs to forward a request to that replica, so the first request may be delayed. |
| 64 | + |
| 65 | +The gateway opens SSH connections when each replica is registered. |
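The pooling idea can be sketched as a small cache of connections keyed by replica. Everything here is hypothetical: `ConnectionPool`, `get_or_open`, and the `open_connection` callback (which stands in for opening an SSH tunnel and returning the local socket path) are illustrative names, not the `dstack` API.

```python
from typing import Callable, Dict


class ConnectionPool:
    """Hypothetical pool that opens one connection per replica and reuses it."""

    def __init__(self, open_connection: Callable[[str], str]) -> None:
        # open_connection stands in for establishing an SSH tunnel and
        # returning the path of the local Unix socket it forwards to.
        self._open_connection = open_connection
        self._sockets: Dict[str, str] = {}

    def get_or_open(self, replica_id: str) -> str:
        """Return the pooled socket path, opening the connection if needed."""
        if replica_id not in self._sockets:
            self._sockets[replica_id] = self._open_connection(replica_id)
        return self._sockets[replica_id]


opened = []


def fake_open(replica_id: str) -> str:
    opened.append(replica_id)
    return f"/tmp/{replica_id}.sock"  # made-up socket path


pool = ConnectionPool(fake_open)
first = pool.get_or_open("replica-0")   # opens the connection
second = pool.get_or_open("replica-0")  # reuses it
```

Under this sketch, the difference between the two proxies is only when `get_or_open` is first called: at request time for the in-server proxy, at replica registration for the gateway.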
| 66 | + |
| 67 | +### Communication with the `dstack` server |
| 68 | + |
| 69 | +The in-server proxy is part of the `dstack` server, so no network communication takes place between them. |
| 70 | + |
| 71 | +Gateway-to-server communication happens over an SSH connection that the `dstack` server establishes when it starts or when it creates a new gateway. The SSH connection includes bidirectional port forwarding, enabling the gateway and the `dstack` server to call each other's APIs: |
| 72 | + |
| 73 | +- The server calls the gateway to register services, fetch stats, etc. |
| 74 | +- The gateway calls the server to validate user tokens. |
| 75 | + |
| 76 | +### Storage |
| 77 | + |
| 78 | +`dstack-proxy` has a set of stored models and a common storage repo interface. |
| 79 | + |
| 80 | +The in-server proxy repo implementation fetches the models from the database. It doesn't write to the database, as all the details about services and replicas are written by the `dstack` server. |
| 81 | + |
| 82 | +The gateway maintains its own in-memory storage repo. A copy of the repo is also stored in `~/dstack/state-v2.json`. The copy is updated on every write operation and is used for data recovery after restarts. The repo is populated when new services and replicas are registered. |
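The "in-memory repo with an on-disk copy" approach can be illustrated with the sketch below. The class, its methods, and the JSON layout are assumptions; the real schema of `state-v2.json` is not described here and is likely richer.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, List


class FileBackedRepo:
    """Hypothetical repo: serves reads from memory, mirrors every write to a file."""

    def __init__(self, state_path: Path) -> None:
        self._path = state_path
        self._services: Dict[str, List[str]] = {}
        if state_path.exists():
            # Recover the state saved before the last restart.
            self._services = json.loads(state_path.read_text())

    def register_replica(self, service: str, replica: str) -> None:
        self._services.setdefault(service, []).append(replica)
        self._save()  # the copy is refreshed on every write operation

    def list_replicas(self, service: str) -> List[str]:
        return list(self._services.get(service, []))

    def _save(self) -> None:
        # Write to a temporary file first so a crash cannot leave a half-written copy.
        tmp_path = self._path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(self._services))
        os.replace(tmp_path, self._path)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "state.json"
    repo = FileBackedRepo(path)
    repo.register_replica("my-service", "replica-0")
    # A fresh instance simulates the gateway app restarting.
    restored = FileBackedRepo(path).list_replicas("my-service")
```

The write-then-rename step is a design choice of this sketch, not something the text above states about the gateway.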
| 83 | + |
| 84 | +## Dependency injection |
| 85 | + |
| 86 | +When a module has to be implemented differently for the in-server proxy and the gateway, `dstack-proxy` uses interfaces with multiple implementations. For example, there are common interfaces for the storage repo and for auth checks. The in-server proxy and the gateway provide their own implementations for both interfaces. |
| 87 | + |
| 88 | +```mermaid |
| 89 | +classDiagram |
| 90 | + class BaseProxyRepo { |
| 91 | + <<abstract>> |
| 92 | + } |
| 93 | + BaseProxyRepo <|-- ServerProxyRepo |
| 94 | + BaseProxyRepo <|-- GatewayProxyRepo |
| 95 | + |
| 96 | + class BaseProxyAuthProvider { |
| 97 | + <<abstract>> |
| 98 | + } |
| 99 | + BaseProxyAuthProvider <|-- ServerProxyAuthProvider |
| 100 | + BaseProxyAuthProvider <|-- GatewayProxyAuthProvider |
| 101 | +``` |
| 102 | + |
| 103 | +`dstack-proxy` then uses dependency injection to select the relevant implementation. Both the in-server proxy and the gateway provide an "injector" class that is used to obtain concrete interface implementations. |
| 104 | + |
| 105 | +```mermaid |
| 106 | +classDiagram |
| 107 | + class ProxyDependencyInjector { |
| 108 | + <<abstract>> |
| 109 | + +get_repo() BaseProxyRepo |
| 110 | + +get_auth_provider() BaseProxyAuthProvider |
| 111 | + } |
| 112 | + class ServerProxyDependencyInjector { |
| 113 | + +get_repo() ServerProxyRepo |
| 114 | + +get_auth_provider() ServerProxyAuthProvider |
| 115 | + } |
| 116 | + class GatewayDependencyInjector { |
| 117 | + +get_repo() GatewayProxyRepo |
| 118 | + +get_auth_provider() GatewayProxyAuthProvider |
| 119 | + } |
| 120 | + ProxyDependencyInjector <|-- ServerProxyDependencyInjector |
| 121 | + ProxyDependencyInjector <|-- GatewayDependencyInjector |
| 122 | +``` |
| 123 | + |
| 124 | +An instance of the relevant injector class is stored in the FastAPI global app state and can be accessed from FastAPI path operations. There are helper functions that can be used as FastAPI path operation dependencies to obtain the injector or a module implementation: `get_injector`, `get_proxy_repo`, `get_proxy_auth_provider`, etc. |
| 125 | + |
| 126 | +There are also similar gateway-specific helper functions to obtain gateway-specific module implementations: `get_gateway_proxy_repo` (guaranteed to return the gateway repo, which has more methods than the base repo interface), `get_nginx`, `get_stats_collector`, etc. |
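The injector pattern from this section can be sketched in plain Python. The class names mirror the diagrams above, but the method bodies, the `describe` method, the `resolve_repo` helper, and the `SimpleNamespace` stand-in for the FastAPI app state are all hypothetical simplifications.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class BaseProxyRepo(ABC):
    @abstractmethod
    def describe(self) -> str:  # hypothetical method, for illustration only
        ...


class ServerProxyRepo(BaseProxyRepo):
    def describe(self) -> str:
        return "reads from the server database"


class GatewayProxyRepo(BaseProxyRepo):
    def describe(self) -> str:
        return "in-memory, mirrored to a state file"


class ProxyDependencyInjector(ABC):
    @abstractmethod
    def get_repo(self) -> BaseProxyRepo:
        ...


class ServerProxyDependencyInjector(ProxyDependencyInjector):
    def get_repo(self) -> ServerProxyRepo:
        return ServerProxyRepo()


class GatewayDependencyInjector(ProxyDependencyInjector):
    def get_repo(self) -> GatewayProxyRepo:
        return GatewayProxyRepo()


def resolve_repo(app_state: SimpleNamespace) -> BaseProxyRepo:
    """Shared code asks the injector for a repo without knowing which app it runs in."""
    return app_state.injector.get_repo()


# Each app stores its own injector in its state object at startup.
server_state = SimpleNamespace(injector=ServerProxyDependencyInjector())
gateway_state = SimpleNamespace(injector=GatewayDependencyInjector())
```

Code shared between the two apps depends only on `BaseProxyRepo` and `ProxyDependencyInjector`, so the same request handler works in either app.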
| 127 | + |
| 128 | +## Gateway operations |
| 129 | + |
| 130 | +Gateway instances are managed by the `dstack` server. A gateway is associated with a project and some backend in the project. In `dstack` Sky, there is also a global gateway associated with all projects. |
| 131 | + |
| 132 | +### Creation |
| 133 | + |
| 134 | +Users can create a gateway using the `dstack apply` command. The gateway YAML configuration must specify the gateway's wildcard domain: all direct subdomains should resolve to the gateway IP address. Since the IP address is unknown during provisioning, `dstack` doesn't check DNS records. |
| 135 | + |
| 136 | +Provisioning happens as follows: |
| 137 | +1. Launch a non-GPU instance (usually the smallest) with all ports exposed. |
| 138 | +2. Install Nginx, Certbot, and patch configs. |
| 139 | +3. Create blue-green virtual environments. |
| 140 | +4. Install the latest `dstack-gateway` package from the S3 bucket. `dstack-gateway` is a thin package that depends on the `dstack` package, which contains the actual gateway implementation. |
| 141 | +5. Run the systemd service `dstack.gateway.service`. |
| 142 | + |
| 143 | +### Update |
| 144 | + |
| 145 | +The gateway has a "blue-green deployment"-like configuration: there are two virtual environments to be swapped on update. The systemd service uses the newly installed package after a restart. |
| 146 | + |
| 147 | +The update process looks like this: |
| 148 | +1. Install the new package to the unused venv. |
| 149 | +2. Update scripts and systemd service config. |
| 150 | +3. Swap the active venv name in `~/dstack/version`. |
| 151 | +4. Restart the systemd service. |
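Step 3 of the update process can be sketched as follows, assuming the version file holds a single venv name and that the two venvs are called `blue` and `green`. Both assumptions, and the function names, are illustrative; package installation and the systemd restart are left out.

```python
import tempfile
from pathlib import Path

VENVS = ("blue", "green")  # assumed venv names


def inactive_venv(version_file: Path) -> str:
    """Return the venv that is not currently active (the install target)."""
    active = version_file.read_text().strip()
    return VENVS[1] if active == VENVS[0] else VENVS[0]


def swap_active_venv(version_file: Path) -> str:
    """Point the version file at the other venv and return its name."""
    target = inactive_venv(version_file)
    version_file.write_text(target + "\n")
    return target


with tempfile.TemporaryDirectory() as tmp_dir:
    version_file = Path(tmp_dir) / "version"
    version_file.write_text("blue\n")  # pretend "blue" is currently active
    new_active = swap_active_venv(version_file)
    stored = version_file.read_text().strip()
```

Because the new package is installed into the inactive venv before the swap, the running service is unaffected until it is restarted.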
| 152 | + |
| 153 | +## Gateway development |
| 154 | + |
| 155 | +The gateway app needs to interact with Nginx and Certbot, so running it locally can be challenging. One way to test your code is to upload your development branch to an existing gateway and run the gateway app from source. |
| 156 | + |
| 157 | +1. Run `dstack server` with the `DSTACK_SKIP_GATEWAY_UPDATE=1` environment variable set. This prevents `dstack` from updating and starting the standard gateway version on each server restart. |
| 158 | + |
| 159 | +1. Provision a gateway through `dstack`: |
| 160 | + |
| 161 | + ```shell |
| 162 | + dstack apply -f my-gateway.dstack.yml |
| 163 | + ``` |
| 164 | + |
| 165 | +1. Save the gateway key to a file: |
| 166 | + |
| 167 | + ```shell |
| 168 | + sqlite3 ~/.dstack/server/data/sqlite.db "SELECT ssh_private_key FROM gateway_computes WHERE deleted = 0 AND ip_address = '<gateway-ip-addr>'" > /tmp/gateway.key |
| 169 | + chmod 600 /tmp/gateway.key |
| 170 | + ``` |
| 171 | + |
| 172 | +1. Deliver your code to the gateway. For example, clone it from a remote repo: |
| 173 | + |
| 174 | + ```shell |
| 175 | + ssh -i /tmp/gateway.key ubuntu@gateway.example "git clone https://github.com/dstackai/dstack.git ~/dstack-repo" |
| 176 | + ``` |
| 177 | + |
| 178 | + Or push it from your machine: |
| 179 | + |
| 180 | + ```shell |
| 181 | + ssh -i /tmp/gateway.key ubuntu@gateway.example "git init ~/dstack-repo" |
| 182 | + git remote add gateway ubuntu@gateway.example:~/dstack-repo |
| 183 | + GIT_SSH_COMMAND='ssh -i /tmp/gateway.key' git push gateway branch_name |
| 184 | + ``` |
| 185 | + |
| 186 | +1. Connect to the gateway: |
| 187 | + |
| 188 | + ```shell |
| 189 | + ssh -i /tmp/gateway.key ubuntu@gateway.example |
| 190 | + ``` |
| 191 | + |
| 192 | +1. Prepare an environment with your development branch on the gateway: |
| 193 | + |
| 194 | + ```shell |
| 195 | + cd ~/dstack-repo |
| 196 | + git checkout branch_name |
| 197 | + curl -LsSf https://astral.sh/uv/install.sh | sh |
| 198 | + source ~/.local/bin/env |
| 199 | + uv sync --extra gateway |
| 200 | + ``` |
| 201 | + |
| 202 | +1. Stop the gateway service and start your development version from source: |
| 203 | + |
| 204 | + ```shell |
| 205 | + sudo systemctl stop dstack.gateway.service |
| 206 | + uv run uvicorn dstack._internal.proxy.gateway.main:app |
| 207 | + ``` |