[Roadmap] Q2 2026 #3826

@peterschmidt85

Kubernetes

Inference

  • Support prefill/decode (PD) disaggregation with NVIDIA Dynamo/vLLM
  • Support multi-replica gateways (high availability)
  • Gateways on SSH fleets - requires research

Technical debt

  • Implement instance health (incl. GPU health) via Events - create a new event whenever an instance's health status or message changes
  • Migrate to Pydantic V2
  • Multi-tenancy / SSH proxy docs & UI/CLI integration - 1) better document the current isolation; 2) better document how to use the SSH proxy; 3) polish the SSH proxy UI/CLI integration
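
For the instance health item above, the "new event only on change" rule could look roughly like the sketch below. This is an illustration, not dstack's implementation: `HealthCheck`, `HealthEvent`, `HealthEventEmitter`, and the status strings are hypothetical names invented here, and the in-memory dict stands in for whatever storage the server would actually use.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class HealthCheck:
    # Hypothetical result of one health probe for one instance.
    instance_id: str
    status: str  # assumed values, e.g. "healthy", "warning", "failure"
    message: Optional[str] = None


@dataclass(frozen=True)
class HealthEvent:
    # Hypothetical event record; not dstack's actual event model.
    instance_id: str
    status: str
    message: Optional[str]


class HealthEventEmitter:
    """Returns an event only when an instance's (status, message) pair changes."""

    def __init__(self) -> None:
        # Last seen (status, message) per instance; in-memory for illustration.
        self._last: Dict[str, Tuple[str, Optional[str]]] = {}

    def observe(self, check: HealthCheck) -> Optional[HealthEvent]:
        current = (check.status, check.message)
        if self._last.get(check.instance_id) == current:
            return None  # nothing changed, so no new event
        self._last[check.instance_id] = current
        return HealthEvent(check.instance_id, check.status, check.message)
```

In this sketch the first observation of an instance also produces an event, since there is no earlier state to compare against. Whether that is desirable is a design choice the roadmap item leaves open.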

Documentation

  • A dedicated API guide with examples - cover all the CLI functionality (in addition to the reference documentation)
  • Skills guide (improve skills, plus add a dedicated guide on how to use dstack via agents)
  • Distributed training examples, incl. RL - refresh existing examples and/or add better ones, incl. TRL; look at Miles

Benchmarks

  • PD-disaggregation

Other / Minor

  • Orphaned resources - allow dstack server to detect orphaned instances (and other related resources)
  • Sandboxes - consider supporting a run configuration type - requires research
  • Monarch integration
  • CLI performance
  • Benchmark the overhead the gateway adds
