Commit 7cb3124

add hotaisle backend (#2935)

* add hotaisle backend
* Daemonize launch_command to solve dstack restart issue
* Update backends.md and config.yml.md
* Resolve Review Comments
* Bump gpuhunt to 0.1.7
* Resolve Remaining Review Comments
* Add hotaisle to TestListBackendTypes

---------

Co-authored-by: Bihan Rana
1 parent f0a3cc9 commit 7cb3124

13 files changed, +523 -2 lines changed
docs/docs/concepts/backends.md

Lines changed: 28 additions & 0 deletions
@@ -579,6 +579,34 @@ gcloud projects list --format="json(projectId)"
 Using private subnets assumes that both the `dstack` server and users can access the configured VPC's private subnets.
 Additionally, [Cloud NAT](https://cloud.google.com/nat/docs/overview) must be configured to provide access to external resources for provisioned instances.

+## Hot Aisle
+
+Log in to the SSH TUI as described in the [Hot Aisle Quick Start :material-arrow-top-right-thin:{ .external }](https://hotaisle.xyz/quick-start/).
+Create a new team and generate an API key for the member in the team.
+
+Then, go ahead and configure the backend:
+
+<div editor-title="~/.dstack/server/config.yml">
+
+```yaml
+projects:
+- name: main
+  backends:
+    - type: hotaisle
+      team_handle: hotaisle-team-handle
+      creds:
+        type: api_key
+        api_key: 9c27a4bb7a8e472fae12ab34.3f2e3c1db75b9a0187fd2196c6b3e56d2b912e1c439ba08d89e7b6fcd4ef1d3f
+```
+
+</div>
+
+??? info "Required permissions"
+    The API key must have the following roles assigned:
+
+    * **Owner role for the user** - Required for creating and managing SSH keys
+    * **Operator role for the team** - Required for managing virtual machines within the team
+
 ## Lambda

 Log into your [Lambda Cloud :material-arrow-top-right-thin:{ .external }](https://lambdalabs.com/service/gpu-cloud) account, click API keys in the sidebar, and then click the `Generate API key`

docs/docs/reference/server/config.yml.md

Lines changed: 18 additions & 1 deletion
@@ -15,7 +15,7 @@ to configure [backends](../../concepts/backends.md) and other [sever-level setti
     overrides:
         show_root_heading: false
         backends:
-            type: 'Union[AWSBackendConfigWithCreds, AzureBackendConfigWithCreds, GCPBackendConfigWithCreds, LambdaBackendConfigWithCreds, NebiusBackendConfigWithCreds, RunpodBackendConfigWithCreds, VastAIBackendConfigWithCreds, KubernetesConfig]'
+            type: 'Union[AWSBackendConfigWithCreds, AzureBackendConfigWithCreds, GCPBackendConfigWithCreds, HotAisleBackendConfigWithCreds, LambdaBackendConfigWithCreds, NebiusBackendConfigWithCreds, RunpodBackendConfigWithCreds, VastAIBackendConfigWithCreds, KubernetesConfig]'

 #### `projects[n].backends` { #backends data-toc-label="backends" }

@@ -126,6 +126,23 @@ to configure [backends](../../concepts/backends.md) and other [sever-level setti
         type:
             required: true

+##### `projects[n].backends[type=hotaisle]` { #hotaisle data-toc-label="hotaisle" }
+
+#SCHEMA# dstack._internal.core.backends.hotaisle.models.HotAisleBackendConfigWithCreds
+    overrides:
+        show_root_heading: false
+        type:
+            required: true
+    item_id_prefix: hotaisle-
+
+###### `projects[n].backends[type=hotaisle].creds` { #hotaisle-creds data-toc-label="creds" }
+
+#SCHEMA# dstack._internal.core.backends.hotaisle.models.HotAisleAPIKeyCreds
+    overrides:
+        show_root_heading: false
+        type:
+            required: true
+
 ##### `projects[n].backends[type=lambda]` { #lambda data-toc-label="lambda" }

 #SCHEMA# dstack._internal.core.backends.lambdalabs.models.LambdaBackendConfigWithCreds

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies = [
     "python-multipart>=0.0.16",
     "filelock",
     "psutil",
-    "gpuhunt==0.1.6",
+    "gpuhunt==0.1.7",
     "argcomplete>=3.5.0",
     "ignore-python>=0.2.0",
     "orjson",

src/dstack/_internal/core/backends/configurators.py

Lines changed: 9 additions & 0 deletions
@@ -54,6 +54,15 @@
 except ImportError:
     pass

+try:
+    from dstack._internal.core.backends.hotaisle.configurator import (
+        HotAisleConfigurator,
+    )
+
+    _CONFIGURATOR_CLASSES.append(HotAisleConfigurator)
+except ImportError:
+    pass
+
 try:
     from dstack._internal.core.backends.kubernetes.configurator import (
         KubernetesConfigurator,
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Hotaisle backend for dstack
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+from typing import Any, Dict, Optional
+
+import requests
+
+from dstack._internal.core.backends.base.configurator import raise_invalid_credentials_error
+from dstack._internal.utils.logging import get_logger
+
+API_URL = "https://admin.hotaisle.app/api"
+
+logger = get_logger(__name__)
+
+
+class HotAisleAPIClient:
+    def __init__(self, api_key: str, team_handle: str):
+        self.api_key = api_key
+        self.team_handle = team_handle
+
+    def validate_api_key(self) -> bool:
+        try:
+            self._validate_user_and_team()
+            return True
+        except requests.HTTPError as e:
+            if e.response.status_code == 401:
+                raise_invalid_credentials_error(
+                    fields=[["creds", "api_key"]], details="Invalid API key"
+                )
+            elif e.response.status_code == 403:
+                raise_invalid_credentials_error(
+                    fields=[["creds", "api_key"]],
+                    details="Authenticated user does not have required permissions",
+                )
+            raise e
+        except ValueError as e:
+            error_message = str(e)
+            if "No Hot Aisle teams found" in error_message:
+                raise_invalid_credentials_error(
+                    fields=[["creds", "api_key"]],
+                    details="Valid API key but no teams found for this user",
+                )
+            elif "not found" in error_message:
+                raise_invalid_credentials_error(
+                    fields=[["team_handle"]], details=f"Team handle '{self.team_handle}' not found"
+                )
+            raise e
+
+    def _validate_user_and_team(self) -> None:
+        url = f"{API_URL}/user/"
+        response = self._make_request("GET", url)
+        response.raise_for_status()
+        user_data = response.json()
+
+        teams = user_data.get("teams", [])
+        if not teams:
+            raise ValueError("No Hot Aisle teams found for this user")
+
+        available_teams = [team["handle"] for team in teams]
+        if self.team_handle not in available_teams:
+            raise ValueError(f"Hot Aisle team '{self.team_handle}' not found.")
+
+    def upload_ssh_key(self, public_key: str) -> bool:
+        url = f"{API_URL}/user/ssh_keys/"
+        payload = {"authorized_key": public_key}
+
+        response = self._make_request("POST", url, json=payload)
+
+        if response.status_code == 409:
+            return True  # Key already exists - success
+        response.raise_for_status()
+        return True
+
+    def create_virtual_machine(self, vm_payload: Dict[str, Any]) -> Dict[str, Any]:
+        url = f"{API_URL}/teams/{self.team_handle}/virtual_machines/"
+        response = self._make_request("POST", url, json=vm_payload)
+        response.raise_for_status()
+        vm_data = response.json()
+        return vm_data
+
+    def get_vm_state(self, vm_name: str) -> str:
+        url = f"{API_URL}/teams/{self.team_handle}/virtual_machines/{vm_name}/state/"
+        response = self._make_request("GET", url)
+        response.raise_for_status()
+        state_data = response.json()
+        return state_data["state"]
+
+    def terminate_virtual_machine(self, vm_name: str) -> None:
+        url = f"{API_URL}/teams/{self.team_handle}/virtual_machines/{vm_name}/"
+        response = self._make_request("DELETE", url)
+        if response.status_code == 404:
+            logger.debug("Hot Aisle virtual machine %s not found", vm_name)
+            return
+        response.raise_for_status()
+
+    def _make_request(
+        self, method: str, url: str, json: Optional[Dict[str, Any]] = None, timeout: int = 30
+    ) -> requests.Response:
+        headers = {
+            "accept": "application/json",
+            "Authorization": f"Token {self.api_key}",
+        }
+        if json is not None:
+            headers["Content-Type"] = "application/json"
+
+        return requests.request(
+            method=method,
+            url=url,
+            headers=headers,
+            json=json,
+            timeout=timeout,
+        )
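
In the client above, `validate_api_key` decides which config field to blame by matching substrings of the `ValueError` messages raised in `_validate_user_and_team`. The sketch below reproduces only that decision logic without any HTTP call, assuming the `{"teams": [{"handle": ...}]}` payload shape the diff reads. The helper names `check_team` and `classify_failure`, the returned labels, and the sample payloads are hypothetical and exist only for illustration.

```python
from typing import Any, Dict


def check_team(user_data: Dict[str, Any], team_handle: str) -> None:
    """Mirror of the team check in _validate_user_and_team, minus the HTTP call."""
    teams = user_data.get("teams", [])
    if not teams:
        raise ValueError("No Hot Aisle teams found for this user")
    if team_handle not in [team["handle"] for team in teams]:
        raise ValueError(f"Hot Aisle team '{team_handle}' not found.")


def classify_failure(user_data: Dict[str, Any], team_handle: str) -> str:
    """Return which config field would be reported as invalid, or "ok"."""
    try:
        check_team(user_data, team_handle)
    except ValueError as e:
        message = str(e)
        if "No Hot Aisle teams found" in message:
            return "creds.api_key"
        if "not found" in message:
            return "team_handle"
        raise
    return "ok"


# Sample payloads are invented for illustration.
no_teams = classify_failure({"teams": []}, "my-team")
wrong_team = classify_failure({"teams": [{"handle": "other-team"}]}, "my-team")
match = classify_failure({"teams": [{"handle": "my-team"}]}, "my-team")
```

Because the mapping depends on message text, rewording either `ValueError` message would change which field is reported.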
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+from dstack._internal.core.backends.base.backend import Backend
+from dstack._internal.core.backends.hotaisle.compute import HotAisleCompute
+from dstack._internal.core.backends.hotaisle.models import HotAisleConfig
+from dstack._internal.core.models.backends.base import BackendType
+
+
+class HotAisleBackend(Backend):
+    TYPE = BackendType.HOTAISLE
+    COMPUTE_CLASS = HotAisleCompute
+
+    def __init__(self, config: HotAisleConfig):
+        self.config = config
+        self._compute = HotAisleCompute(self.config)
+
+    def compute(self) -> HotAisleCompute:
+        return self._compute
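
`HotAisleBackend` builds its compute object once in `__init__` and returns that same object from every `compute()` call. The following sketch shows the same shape with stand-in classes; `FakeCompute`, `FakeBackend`, and the dict-based config are hypothetical and do not reflect the real `HotAisleCompute` or `HotAisleConfig` interfaces.

```python
class FakeCompute:
    """Hypothetical stand-in for HotAisleCompute."""

    def __init__(self, config: dict):
        self.config = config


class FakeBackend:
    """Hypothetical stand-in for HotAisleBackend."""

    COMPUTE_CLASS = FakeCompute

    def __init__(self, config: dict):
        self.config = config
        # Built once here, then reused by every compute() call.
        self._compute = FakeCompute(self.config)

    def compute(self) -> FakeCompute:
        return self._compute


backend = FakeBackend({"team_handle": "my-team"})
same_instance = backend.compute() is backend.compute()
```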
