diff --git a/docs/docs/concepts/fleets.md b/docs/docs/concepts/fleets.md
index b59848ab7b..5c24f73c5c 100644
--- a/docs/docs/concepts/fleets.md
+++ b/docs/docs/concepts/fleets.md
@@ -260,6 +260,10 @@ Define a fleet configuration as a YAML file in your project directory. The file
2. Hosts with Intel Gaudi accelerators should be pre-installed with [Gaudi software and drivers](https://docs.habana.ai/en/latest/Installation_Guide/Driver_Installation.html#driver-installation).
This should include the drivers, `hl-smi`, and Habana Container Runtime.
+ === "Tenstorrent"
+ 2. Hosts with Tenstorrent accelerators should be pre-installed with [Tenstorrent software](https://docs.tenstorrent.com/getting-started/README.html#software-installation).
+ This should include the drivers, `tt-smi`, and HugePages.
+
3. The user specified should have passwordless `sudo` access.
To create or update the fleet, pass the fleet configuration to [`dstack apply`](../reference/cli/dstack/apply.md):
diff --git a/docs/docs/index.md b/docs/docs/index.md
index fdcb1d383c..cf94916936 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -7,7 +7,7 @@ for AI workloads both in the cloud and on-prem, speeding up the development, tra
#### Accelerators
-`dstack` supports `NVIDIA`, `AMD`, `Google TPU`, and `Intel Gaudi` accelerators out of the box.
+`dstack` supports `NVIDIA`, `AMD`, `TPU`, `Intel Gaudi`, and `Tenstorrent` accelerators out of the box.
## How does it work?
diff --git a/docs/examples.md b/docs/examples.md
index 88c1440978..ff6122a998 100644
--- a/docs/examples.md
+++ b/docs/examples.md
@@ -101,6 +101,17 @@ hide:
+
+
+ TPU
+
+
+
+ Deploy and fine-tune LLMs on TPU
+
+
+
@@ -112,15 +123,14 @@ hide:
-
-
- TPU
+ Tenstorrent
- Deploy and fine-tune LLMs on TPU
+ Deploy and fine-tune LLMs on Tenstorrent
diff --git a/docs/examples/accelerators/tenstorrent/index.md b/docs/examples/accelerators/tenstorrent/index.md
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/examples/accelerators/tenstorrent/.dstack.yml b/examples/accelerators/tenstorrent/.dstack.yml
new file mode 100644
index 0000000000..6e3319a001
--- /dev/null
+++ b/examples/accelerators/tenstorrent/.dstack.yml
@@ -0,0 +1,9 @@
+type: dev-environment
+name: cursor
+
+image: dstackai/tt-smi:latest
+
+ide: cursor
+
+resources:
+ gpu: n150:1
diff --git a/examples/accelerators/tenstorrent/README.md b/examples/accelerators/tenstorrent/README.md
new file mode 100644
index 0000000000..557bac7df0
--- /dev/null
+++ b/examples/accelerators/tenstorrent/README.md
@@ -0,0 +1,193 @@
+# Tenstorrent
+
+`dstack` supports running dev environments, tasks, and services on Tenstorrent
+[Wormhole :material-arrow-top-right-thin:{ .external }](https://tenstorrent.com/en/hardware/wormhole){:target="_blank"} accelerators via SSH fleets.
+
+??? info "SSH fleets"
+
+    If you have hosts with Tenstorrent accelerators, define an SSH fleet configuration that lists them:
+
+ ```yaml
+ type: fleet
+    name: wormhole-fleet
+
+ ssh_config:
+ user: root
+ identity_file: ~/.ssh/id_rsa
+      # Configure any number of hosts with n150 or n300 PCIe boards
+ hosts:
+ - 192.168.2.108
+ ```
+
+ > Hosts should be pre-installed with [Tenstorrent software](https://docs.tenstorrent.com/getting-started/README.html#software-installation).
+ This should include the drivers, `tt-smi`, and HugePages.
+
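+    To sanity-check a host before adding it to the fleet, you can verify that `tt-smi`
+    runs and that HugePages are allocated (a quick check, not an exhaustive one):
+
+    ```shell
+    $ tt-smi -s                           # should succeed if the driver stack is installed
+    $ grep HugePages_Total /proc/meminfo  # should report a non-zero count
+    ```
+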
+ To apply the fleet configuration, run:
+
+ ```bash
+    $ dstack apply -f examples/accelerators/tenstorrent/fleet.dstack.yml
+
+     FLEET            RESOURCES                                PRICE  STATUS  CREATED
+     wormhole-fleet   cpu=12 mem=32GB disk=243GB n150:12GB     $0     idle    18 sec ago
+ ```
+
+ For more details on fleet configuration, refer to [SSH fleets](https://dstack.ai/docs/concepts/fleets#ssh).
+
+## Services
+
+Here's an example of a service that deploys
+[`Llama-3.2-1B-Instruct` :material-arrow-top-right-thin:{ .external }](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct){:target="_blank"}
+using [Tenstorrent Inference Service :material-arrow-top-right-thin:{ .external }](https://github.com/tenstorrent/tt-inference-server){:target="_blank"}.
+
+
+```yaml
+type: service
+name: tt-inference-server
+
+env:
+ - HF_TOKEN
+ - HF_MODEL_REPO_ID=meta-llama/Llama-3.2-1B-Instruct
+image: ghcr.io/tenstorrent/tt-inference-server/vllm-tt-metal-src-release-ubuntu-20.04-amd64:0.0.4-v0.56.0-rc47-e2e0002ac7dc
+commands:
+ - |
+ . ${PYTHON_ENV_DIR}/bin/activate
+ pip install "huggingface_hub[cli]"
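+    # Build an HF-cache-style path under the mounted /data volume and download the model there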
+ export LLAMA_DIR="/data/models--$(echo "$HF_MODEL_REPO_ID" | sed 's/\//--/g')/"
+ huggingface-cli download $HF_MODEL_REPO_ID --local-dir $LLAMA_DIR
+ python /home/container_app_user/app/src/run_vllm_api_server.py
+port: 7000
+
+model: meta-llama/Llama-3.2-1B-Instruct
+
+# Cache downloaded model
+volumes:
+ - /mnt/data/tt-inference-server/data:/data
+
+resources:
+ gpu: n150:1
+```
+
+Go ahead and run the configuration using `dstack apply`:
+
+```bash
+$ dstack apply -f examples/accelerators/tenstorrent/tt-inference-server.dstack.yml
+```
+
+Once the service is up, it will be available via the service endpoint
+at `<dstack server URL>/proxy/services/<project name>/<run name>/`.
+
+```shell
+$ curl http://127.0.0.1:3000/proxy/models/main/chat/completions \
+ -X POST \
+ -H 'Authorization: Bearer <dstack token>' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "model": "meta-llama/Llama-3.2-1B-Instruct",
+ "messages": [
+ {
+ "role": "system",
+ "content": "You are a helpful assistant."
+ },
+ {
+ "role": "user",
+ "content": "What is Deep Learning?"
+ }
+ ],
+ "stream": true,
+ "max_tokens": 512
+ }'
+```
+
+
+Additionally, the model is available via `dstack`'s control plane UI:
+
+{ width=800 }
+
+When a [gateway](https://dstack.ai/docs/concepts/gateways) is configured, the service endpoint
+is available at `https://<run name>.<gateway domain>/`.
+
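+A minimal gateway configuration looks like this; the backend, region, and domain below are placeholders for your own values:
+
+```yaml
+type: gateway
+name: example-gateway
+
+backend: aws
+region: eu-west-1
+domain: example.com
+```
+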
+> Services support many options, including authentication, auto-scaling policies, etc. To learn more, refer to [Services](https://dstack.ai/docs/concepts/services).
+
+## Tasks
+
+Below is a task that simply runs `tt-smi -s`. Tasks can be used for training, fine-tuning, batch inference, or anything else.
+
+
+```yaml
+type: task
+# The name is optional; if not specified, it's generated randomly
+name: tt-smi
+
+env:
+ - HF_TOKEN
+
+# (Required) Use any image with TT drivers
+image: dstackai/tt-smi:latest
+
+# Use any commands
+commands:
+ - tt-smi -s
+
+# Specify the number of accelerators, model, etc.
+resources:
+ gpu: n150:1
+
+# Uncomment if you want to run on a cluster of nodes
+#nodes: 2
+```
+
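+To run the task, pass the configuration to `dstack apply`:
+
+```bash
+$ dstack apply -f examples/accelerators/tenstorrent/tt-smi.dstack.yml
+```
+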
+> Tasks support many options, including multi-node configuration, max duration, etc. To learn more, refer to [Tasks](https://dstack.ai/docs/concepts/tasks).
+
+## Dev environments
+
+Below is an example of a dev environment configuration. It can be used to provision a dev environment that can be accessed via your desktop IDE.
+
+
+```yaml
+type: dev-environment
+# The name is optional; if not specified, it's generated randomly
+name: cursor
+
+# (Optional) List required env variables
+env:
+ - HF_TOKEN
+
+image: dstackai/tt-smi:latest
+
+# Can be `vscode` or `cursor`
+ide: cursor
+
+resources:
+ gpu: n150:1
+```
+
+If you run it via `dstack apply`, it will output the URL to access the dev environment via your desktop IDE:
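+
+```bash
+$ dstack apply -f examples/accelerators/tenstorrent/.dstack.yml
+```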
+
+{ width=800 }
+
+> Dev environments support many options, including inactivity and max duration, IDE configuration, etc. To learn more, refer to [Dev environments](https://dstack.ai/docs/concepts/dev-environments).
+
+??? info "Feedback"
+ Found a bug, or want to request a feature? File it in the [issue tracker :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues){:target="_blank"},
+ or share via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}.
diff --git a/examples/accelerators/tenstorrent/tt-inference-server.dstack.yml b/examples/accelerators/tenstorrent/tt-inference-server.dstack.yml
new file mode 100644
index 0000000000..6f1815ead1
--- /dev/null
+++ b/examples/accelerators/tenstorrent/tt-inference-server.dstack.yml
@@ -0,0 +1,24 @@
+type: service
+name: tt-inference-server
+
+env:
+ - HF_TOKEN
+ - HF_MODEL_REPO_ID=meta-llama/Llama-3.2-1B-Instruct
+image: ghcr.io/tenstorrent/tt-inference-server/vllm-tt-metal-src-release-ubuntu-20.04-amd64:0.0.4-v0.56.0-rc47-e2e0002ac7dc
+commands:
+ - |
+ . ${PYTHON_ENV_DIR}/bin/activate
+ pip install "huggingface_hub[cli]"
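+    # Build an HF-cache-style path under the mounted /data volume and download the model there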
+ export LLAMA_DIR="/data/models--$(echo "$HF_MODEL_REPO_ID" | sed 's/\//--/g')/"
+ huggingface-cli download $HF_MODEL_REPO_ID --local-dir $LLAMA_DIR
+ python /home/container_app_user/app/src/run_vllm_api_server.py
+port: 7000
+
+model: meta-llama/Llama-3.2-1B-Instruct
+
+# Cache downloaded model
+volumes:
+ - /mnt/data/tt-inference-server/data:/data
+
+resources:
+ gpu: n150:1
diff --git a/examples/accelerators/tenstorrent/tt-smi.dstack.yml b/examples/accelerators/tenstorrent/tt-smi.dstack.yml
new file mode 100644
index 0000000000..b9478cb166
--- /dev/null
+++ b/examples/accelerators/tenstorrent/tt-smi.dstack.yml
@@ -0,0 +1,10 @@
+type: task
+name: tt-smi
+
+image: dstackai/tt-smi:latest
+
+commands:
+ - tt-smi -s
+
+resources:
+ gpu: n150:1
diff --git a/mkdocs.yml b/mkdocs.yml
index f7f8464613..b86176de0d 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -256,8 +256,9 @@ nav:
- Llama: examples/llms/llama/index.md
- Accelerators:
- AMD: examples/accelerators/amd/index.md
- - Intel Gaudi: examples/accelerators/intel/index.md
- TPU: examples/accelerators/tpu/index.md
+ - Intel Gaudi: examples/accelerators/intel/index.md
+ - Tenstorrent: examples/accelerators/tenstorrent/index.md
- Misc:
- Docker Compose: examples/misc/docker-compose/index.md
- NCCL Tests: examples/misc/nccl-tests/index.md