Releases: dstackai/dstack
0.18.37
0.18.36
Vultr
Cluster placement
The vultr backend can now provision fleets with cluster placement.
type: fleet
nodes: 4
placement: cluster
resources:
  gpu: MI300X:8
backends: [vultr]
Nodes in such a cluster will be interconnected and can be used to run distributed tasks.
Performance
The update optimizes the performance of the dstack server, allowing a single server replica to handle up to 150 active runs, jobs, and instances. Capacity can be further increased by using PostgreSQL and running multiple server replicas.
Finally, getting instance offers from backends when you run dstack apply has also been optimized and now takes less time.
What's changed
- Increase max active resources supported by server by @r4victor in #2189
- Implement bridge network mode for jobs by @un-def in #2191
- [Internal] Fix python-json-logger deprecation warning by @jvstme in #2201
- Fix local backend by @r4victor in #2203
- Implement offers cache by @r4victor in #2197
- Add /api/instances/list by @jvstme in #2199
- Allow getting by ID in /api/project/_/fleets/get by @jvstme in #2200
- Add termination reason and message to the runner API by @r4victor in #2204
- Add vpc cluster support in Vultr by @Bihan in #2196
- Fix instance_types not respected for pool instances by @r4victor in #2205
- Delete manually created empty fleets by @r4victor in #2206
- Return repo errors from runner by @r4victor in #2207
- Fix caching offers with GPU requirements by @jvstme in #2210
- Fix filtering idle instances by instance type by @jvstme in #2214
- Add more project URLs on PyPI by @jvstme in #2215
Full changelog: 0.18.35...0.18.36
0.18.35
Vultr
This update introduces initial integration with Vultr. This cloud provider offers a diverse range of NVIDIA and AMD accelerators, from cost-effective fractional GPUs to multi-GPU bare-metal hosts.
$ dstack apply -f examples/.dstack.yml
# BACKEND REGION RESOURCES PRICE
1 vultr ewr 2xCPU, 8GB, 1xA16 (2GB), 50.0GB (disk) $0.059
2 vultr ewr 1xCPU, 5GB, 1xA40 (2GB), 90.0GB (disk) $0.075
3 vultr ewr 1xCPU, 6GB, 1xA100 (4GB), 70.0GB (disk) $0.123
...
18 vultr ewr 32xCPU, 375GB, 2xL40S (48GB), 2200.0GB (disk) $3.342
19 vultr ewr 24xCPU, 240GB, 2xA100 (80GB), 1400.0GB (disk) $4.795
20 vultr ewr 96xCPU, 960GB, 16xA16 (16GB), 1700.0GB (disk) $7.534
21 vultr ewr 96xCPU, 1024GB, 4xA100 (80GB), 450.0GB (disk) $9.589
See the docs for detailed instructions on configuring the vultr backend.
Note
This release includes all dstack features except support for volumes and clusters. These features will be added in an upcoming update.
Vast.ai
Previously, the vastai backend only allowed using Docker images where root is the default user. This limitation has been removed, so you can now run NVIDIA NIM or any other image regardless of the user.
Backward compatibility
If you are going to configure the vultr backend, make sure you update all your dstack CLI and API clients to the latest version. Clients prior to 0.18.35 will not work when Vultr is configured.
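As a rough illustration of the version threshold above, the following Python sketch checks whether a client version string meets the 0.18.35 minimum. The names MIN_CLIENT_VERSION, parse_version, and client_is_compatible are hypothetical and not part of dstack; the sketch assumes plain X.Y.Z version strings without pre-release suffixes.

```python
from typing import Tuple

# Minimum client version stated in the release notes for Vultr-enabled servers.
MIN_CLIENT_VERSION = (0, 18, 35)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain "X.Y.Z" string into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


def client_is_compatible(version: str) -> bool:
    """Return True if the client version is 0.18.35 or newer."""
    return parse_version(version) >= MIN_CLIENT_VERSION
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.18.9" would sort after "0.18.35", although it is the older release.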
What's changed
- [dstack-shim] Revamp logging and CLI by @un-def in #2176
- Download dstack-runner to a well-known location by @un-def in #2179
- Add Vultr Support by @Bihan in #2132
- Support non-root Docker images in Vast.ai by @jvstme in #2185
- Refactor idle instance termination by @jvstme in #2188
- Retry instance termination in case of errors by @jvstme in #2190
- Update PyPI Development Status classifier by @jvstme in #2192
- Add Vultr to Concepts and Reference pages by @Bihan in #2186
Full changelog: 0.18.34...0.18.35
0.18.34
Idle duration
If provisioned fleet instances aren’t used, they are marked as idle for reuse within the configured idle duration. After this period, instances are automatically deleted. This behavior was previously configured using the termination_policy and termination_idle_time properties in run or fleet configurations.
With this update, we replace these two properties with idle_duration, a simpler way to configure this behavior. This property can be set to a specific duration or to off for unlimited time.
type: dev-environment
name: vscode
python: "3.11"
ide: vscode
# Terminate instances idle for more than 1 hour
idle_duration: 1h
resources:
  gpu: 24GB
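To make the accepted values concrete, here is a minimal Python sketch of how an idle_duration value could be normalized to seconds. It is not dstack's actual implementation: the function name parse_idle_duration, the unit suffix table, and the choice to treat a bare integer as seconds are all assumptions made for illustration. The sketch relies only on what the notes state, namely that the property takes a duration such as 1h, or off for unlimited time.

```python
import re
from typing import Optional, Union

# Hypothetical unit table; the suffixes dstack really accepts may differ.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_idle_duration(value: Union[int, str, bool]) -> Optional[int]:
    """Return the idle duration in seconds, or None for "off" (no limit)."""
    # YAML 1.1 loaders read a bare `off` as the boolean False.
    if value is False or value == "off":
        return None
    if isinstance(value, bool):
        raise ValueError(f"unsupported idle_duration: {value!r}")
    if isinstance(value, int):
        # Assumption: a bare integer means seconds.
        return value
    match = re.fullmatch(r"(\d+)([smhdw])", value.strip())
    if match is None:
        raise ValueError(f"unsupported idle_duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

With these assumptions, the 1h value from the configuration above maps to 3600 seconds, and both the string "off" and the YAML boolean False map to None.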
Docker
Previously, dstack had limitations on Docker images for dev environments, tasks, and services. These have now been lifted, allowing images based on various Linux distributions like Alpine, Rocky Linux, and Fedora.
dstack now also supports Docker images with built-in OpenSSH servers, which previously caused issues.
Documentation
The documentation has been significantly improved:
- Backend configuration has been moved from the Reference page to Concepts→Backends.
- Major examples related to dev environments, tasks, and services have been relocated from the Reference page to their respective Concepts pages.
Deprecations
- The termination_idle_time and termination_policy parameters in run configurations have been deprecated in favor of idle_duration.
What's changed
- [dstack-shim] Implement Future API by @un-def in #2141
- [API] Add API support to get runs by id by @r4victor in #2157
- [TPU] Update TPU v5e runtime and update vllm-tpu example by @Bihan in #2155
- [Internal] Skip docs-build on PRs from forks by @r4victor in #2159
- [dstack-shim] Add API v2 compat support to ShimClient by @un-def in #2156
- [Run configurations] Support Alpine and more RPM-based images by @un-def in #2151
- [Internal] Omit id field in (API)Client.runs.get() method by @un-def in #2174
- [dstack-shim] Remove API v1 by @un-def in #2160
- [Volumes] Fix volume attachment with dstack backend by @un-def in #2175
- Replace termination_policy and termination_idle_time with idle_duration: int|str|off by @peterschmidt85 in #2167
- Allow running sshd in dstack runs by @jvstme in #2178
- [Docs] Many docs improvements by @peterschmidt85 in #2171
Full changelog: 0.18.33...0.18.34
0.18.33
This update fixes TPU v6e support and a potential gateway upgrade issue.
What's Changed
- Fix runtime version for TPU v6e by @r4victor in #2149
- Update state.json migration on gateways by @jvstme in #2152
- Optimize gateway startup and service update time by @jvstme in #2153
Full Changelog: 0.18.32...0.18.33
0.18.32
TPU
Trillium (v6e)
dstack adds support for the latest Trillium TPU (v6e), which became generally available in GCP on December 12th. The new TPU generation doubles the TPU memory and boosts performance, supporting larger workloads.
Resources
dstack now includes CPU, RAM, and TPU memory in Google Cloud TPU offers:
$ dstack apply --gpu tpu
# BACKEND REGION INSTANCE RESOURCES SPOT PRICE
1 gcp europe-west4 v5litepod-1 24xCPU, 48GB, 1xv5litepod-1 (16GB), 100.0GB (disk) no $1.56
2 gcp europe-west4 v6e-1 44xCPU, 176GB, 1xv6e-1 (32GB), 100.0GB (disk) no $2.97
3 gcp europe-west4 v2-8 96xCPU, 334GB, 1xv2-8 (64GB), 100.0GB (disk) no $4.95
Volumes
By default, TPU VMs contain a 100GB boot disk, and its size cannot be changed. Now, you can add more storage using Volumes.
Gateways
In this update, we've greatly refactored Gateways, improving their reliability and fixing several bugs.
Note
If you are running multiple replicas of the dstack server, ensure all replicas are upgraded promptly. Leaving some replicas on an older version may prevent them from creating or deleting services and could result in minor errors in their logs.
Warning
Ensure you update to 0.18.33, which includes critical hot-fixes for important issues.
What's changed
- [dstack-shim] Rework resource management by @un-def in #2093
- [Gateways] Restore dstack-proxy state on gateway restarts by @jvstme in #2119
- [TPU] Support TPU v6e by @r4victor in #2124
- [UI] Updated Backend config Info section by @peterschmidt85 in #2125
- [UI] It's not possible to manage fleets by @olgenn in #2126
- [UI] Improvements by @olgenn in #2127
- [Gateways] Add migration from state.json on gateways by @jvstme in #2128
- [Volumes] Forbid deleting backends with active instances or volumes by @r4victor in #2131
- [TPU] Fix backward compatibility with new TPUs by @r4victor in #2138
- Update gpuhunt to 0.0.17 by @r4victor in #2139
- [Docs] Improve docs by @r4victor in #2135
- [Gateways] Fix certbot process getting stuck in dstack-proxy by @jvstme in #2143
- [Gateways] Run dstack-proxy on gateways by @jvstme in #2136
- [Volumes] Support volumes for TPUs by @r4victor in #2144
- [Gateways] Optimize dstack-gateway installation time by @jvstme in #2146
- [Gateways] Fix OpenAI endpoint on Kubernetes gateways by @jvstme in #2147
Full changelog: 0.18.31...0.18.32
0.18.31
GCP
Running VMs on behalf of a service account
Like all major clouds, GCP supports running a VM on behalf of a managed identity using a service account. Now you can assign a service account to a GCP VM with dstack by specifying the vm_service_account property in the GCP config:
type: gcp
project_id: myproject
vm_service_account: [email protected]
creds:
  type: default
Assigning a service account to a VM can be used to access GCP resources from within runs. Another use case is using firewall rules that rely on the service account as the target. Such rules are typical for Shared VPC setups when admins of a host project can create firewall rules for service projects based on their service accounts.
Volumes
Creating user home directory automatically
Following support for non-root users in Docker images, dstack improves handling of users' home directories. Most importantly, the HOME environment variable is set according to /etc/passwd, and the home directory is created automatically if it does not exist.
The update opens up new possibilities including the use of an empty volume for /home:
type: dev-environment
ide: vscode
image: ubuntu
user: ubuntu
volumes:
  - volume-aws:/home
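The HOME lookup described in this section can be sketched in Python as follows. This is only an illustration of reading the home directory field from /etc/passwd-style text, not the code dstack runs; the function name home_from_passwd and the sample entries in the usage note are hypothetical.

```python
from typing import Optional


def home_from_passwd(passwd_text: str, user: str) -> Optional[str]:
    """Find the home directory for a user name or numeric UID in passwd text."""
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        # Standard passwd layout: name:password:UID:GID:GECOS:home:shell
        fields = line.split(":")
        if len(fields) >= 7 and user in (fields[0], fields[2]):
            return fields[5]
    return None
```

For a line such as ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash, both "ubuntu" and "1000" resolve to /home/ubuntu. With an empty volume mounted at /home, that directory would not exist yet, which is why creating it automatically matters.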
AWS volumes with non-Nitro instances
dstack users previously reported AWS Volumes not working with some instance types. This is now fixed and tested for all instance types supported by dstack, including older Xen-based instances like the P3 family.
Deprecations
- The home_dir and setup parameters in run configurations have been deprecated. If you're using setup, move setup commands to the top of init.
What's changed
- [dstack-shim] Implement multi-task state by @un-def in #2078
- [AWS] Support AWS volumes for Xen-based instances by @r4victor in #2088
- Handle empty user when processing image manifest by @un-def in #2090
- [Docs] Move Reference to a separate page for more space and better st… by @peterschmidt85 in #2092
- Init VirtualRepo when --no-repo specified by @r4victor in #2098
- [Docs] Add missing backends docs reference by @r4victor in #2099
- [gateways] Support gateway features in dstack-proxy by @jvstme in #2087
- [Docs] Add Repos page inside Concepts to explain how repos work #2096 by @peterschmidt85 in #2097
- [GCP] Allow specifying vm_service_account in GCP config by @r4victor in #2110
- [dstack-shim] Create user home directory if it doesn't exist by @un-def in #2109
- [Tests] Disallow remote network connections in tests by @un-def in #2111
- [Docs] Add Developers page featuring community links, ambassador program, contributing links, etc #2103 by @peterschmidt85 in #2104
- [Docs] Refactor the reference guide #2112 by @peterschmidt85 in #2113
- [Tests] Support tests that access db from a new thread by @r4victor in #2116
- [Deprecation] Deprecate home_dir and setup by @un-def in #2115
Full changelog: 0.18.30...0.18.31
0.18.30
AWS Capacity Reservations and Capacity Blocks
dstack now allows provisioning AWS instances using Capacity Reservations and Capacity Blocks. Given a CapacityReservationId, you can specify it in a fleet or a run configuration:
type: fleet
nodes: 1
name: my-cr-fleet
reservation: cr-0f45ab39cd64a1cee
The instance will use the reserved capacity, so as long as you have enough, the provisioning is guaranteed to succeed.
Non-root users in Docker images
Previously, dstack always executed the workload as root, ignoring the user property set in the image. Now, dstack executes the workload with the default image user, and you can override it with a new user property:
type: task
image: nvcr.io/nim/meta/llama-3.1-8b-instruct
user: nim
The format of the user property is the same as Docker uses: username[:groupname], uid[:gid], and so on.
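As an illustration of that format, the Python sketch below splits a user value into its user and optional group parts, turning numeric tokens into integer IDs. The names UserSpec and parse_user are hypothetical, and the sketch does not claim to reproduce how dstack or Docker validate the value.

```python
from typing import NamedTuple, Optional, Union

IdOrName = Union[int, str]


class UserSpec(NamedTuple):
    user: IdOrName
    group: Optional[IdOrName]


def _id_or_name(token: str) -> IdOrName:
    # Purely numeric tokens are treated as uid/gid, anything else as a name.
    return int(token) if token.isascii() and token.isdigit() else token


def parse_user(spec: str) -> UserSpec:
    """Parse "username[:groupname]" or "uid[:gid]" (mixed forms also work)."""
    user, sep, group = spec.partition(":")
    if not user or (sep and not group):
        raise ValueError(f"invalid user spec: {spec!r}")
    return UserSpec(_id_or_name(user), _id_or_name(group) if sep else None)
```

Under these assumptions, parse_user("nim") yields a user name with no group, while parse_user("1000:1000") yields a numeric uid and gid.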
Improved dstack apply and repos UX
Previously, dstack apply used the current directory as the repo that's made available within the run at /workflow. The directory had to be initialized with dstack init before running dstack apply.
Now you can pass --repo to dstack apply. It can be a path to a local directory or a remote Git repo URL. The specified repo will be available within the run at /workflow. You can also specify --no-repo if the run doesn't need any repo. With --repo or --no-repo specified, you don't need to run dstack init:
$ dstack apply -f task.dstack.yaml --repo .
$ dstack apply -f task.dstack.yaml --repo ../parent_dir
$ dstack apply -f task.dstack.yaml --repo https://github.com/dstackai/dstack.git
$ dstack apply -f task.dstack.yaml --no-repo
Specifying --repo explicitly can be useful when running dstack apply from scripts, pipelines, or CI. dstack init stays relevant for use cases when you work with dstack apply interactively and want to set up the repo to work with once.
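For scripted use, a small helper along the following lines could assemble the command. This is a sketch only: build_apply_command is a hypothetical name, and it uses just the -f, --repo, and --no-repo flags shown in the commands above. It does not deal with the interactive confirmation prompt that dstack apply may show.

```python
from typing import List, Optional


def build_apply_command(config_path: str, repo: Optional[str] = None) -> List[str]:
    """Build a `dstack apply` argv list with an explicit repo choice."""
    cmd = ["dstack", "apply", "-f", config_path]
    # Pass --repo when a path or Git URL is given, otherwise --no-repo.
    cmd += ["--repo", repo] if repo is not None else ["--no-repo"]
    return cmd
```

The resulting list can be handed to subprocess.run in a pipeline step, so the run never depends on whether dstack init was executed in the working directory.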
Lightweight pip install dstack
pip install dstack used to install all the dstack server dependencies. Now pip install dstack installs only the CLI and Python API, which is optimal for use cases when a remote dstack server is used. You can do pip install "dstack[server]" to install the server or do pip install "dstack[all]" to install the server with all backends supported.
Breaking changes
pip install dstack no longer installs the server dependencies. If you relied on it to install the server, ensure you use pip install "dstack[server]" or pip install "dstack[all]".
What's Changed
- [chore]: Move run_async to _internal/utils by @jvstme in #2057
- Move server deps to dstack[server] extra by @r4victor in #2058
- Add user property to run configurations by @un-def in #2055
- [Blog] Exploring inference memory saturation effect: H100 vs MI300x by @peterschmidt85 in #2061
- [Internal]: Fix building docs in CI by @jvstme in #2063
- [chore]: Drop unused gateway-related runner code by @jvstme in #2062
- [shim] Clean up and document API by @un-def in #2060
- Improve RESP API docs by @r4victor in #2064
- Allow underscores in custom GCP tags by @r4victor in #2065
- Make repo optional when submitting runs via HTTP API by @r4victor in #2066
- Fix changing configuration type with dstack apply by @r4victor in #2070
- Fix instances stuck in busy status by @r4victor in #2071
- [Minor] If errors should be passed silently, then in pythonic way by @dimitriillarionov in #2075
- AWS Capacity Reservation support by @solovyevt in #1977
- [Blog] Beyond Kubernetes: 2024 recap and what's next for AI infra by @peterschmidt85 in #2074
- Fix reservation property backward compatibility by @un-def in #2077
- Fix ~/.ssh write permissions check by @r4victor in #2079
- Fix errors exit codes in dstack apply by @r4victor in #2081
- Fix RESERVATIONS display in fleets table by @r4victor in #2082
- Support --repo, --no-repo, and autoinit in dstack apply by @r4victor in #2080
- Support AWS partitioned volumes by @r4victor in #2084
- [shim] Update OpenAPI doc by @un-def in #2085
New Contributors
- @dimitriillarionov made their first contribution in #2075
- @solovyevt made their first contribution in #1977
Full Changelog: 0.18.29...0.18.30
0.18.29
Support internal_ip for SSH fleet clusters
It's now possible to specify instance IP addresses used for communication inside SSH fleet clusters using the internal_ip property:
type: fleet
name: my-ssh-fleet
placement: cluster
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/dstack/key.pem
  hosts:
    - hostname: "3.79.203.200"
      internal_ip: "172.17.0.1"
    - hostname: "18.184.67.100"
      internal_ip: "172.18.0.2"
If internal_ip is not specified, dstack automatically detects internal IPs by inspecting network interfaces. This works when all instances have IPs belonging to the same subnet and are accessible on those IPs. Specifying internal_ip explicitly supports network configurations in which the instances are accessible on IPs that do not belong to the same subnet.
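The same-subnet condition can be illustrated with Python's standard ipaddress module. The sketch below is not dstack's detection logic, which inspects network interfaces on the hosts; share_subnet is a hypothetical helper, and the prefix length is an assumed input, since the example above does not state the netmask.

```python
import ipaddress
from typing import Iterable


def share_subnet(ips: Iterable[str], prefix_len: int) -> bool:
    """Return True if every address falls into one network of the given prefix."""
    networks = {
        ipaddress.ip_interface(f"{ip}/{prefix_len}").network for ip in ips
    }
    return len(networks) == 1
```

Assuming a /16 prefix, the two addresses from the fleet configuration above, 172.17.0.1 and 172.18.0.2, land in different networks, which is the kind of setup where an explicit internal_ip is needed.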
UX enhancements for dstack apply
The dstack apply command gets many improvements including more concise and consistent output and better error reporting. When applying run configurations, dstack apply now prints a table similar to the dstack ps output:
✗ dstack apply
Project main
User admin
...
Submit a new run? [y/n]: y
NAME BACKEND RESOURCES PRICE STATUS SUBMITTED
spicy-tiger-1 gcp 2xCPU, 8GB, $0.06701 running 14:52
(us-central1) 100.0GB (disk)
spicy-tiger-1 provisioning completed (running)
What's Changed
- [UX]: live table when provisioning dstack configuration runs #1978 by @Tob-iee in #2036
- Fix returning metrics from deleted runs by @jvstme in #2038
- [UI] Migrate the chat components to the new CloudScape chat componets by @olgenn in #2044
- Recover unreachable instances by @un-def in #2043
- UX enhancements for dstack apply by @jvstme in #2045
- Implement /api/fleets/list endpoint by @r4victor in #2050
- Remove padding in dstack apply live tables by @jvstme in #2048
- Fix typo in dstack attach --help by @jvstme in #2054
- Support specifying internal_ip for SSH fleet hosts by @r4victor in #2056
New Contributors
Full Changelog: 0.18.28...0.18.29
0.18.28
CLI improvements
- Added alias -R for --reuse with dstack apply
- Shorten model URL output
- dstack apply and dstack attach no longer rely on external tools such as ps and grep on Unix-like systems and powershell on Windows. With this change, it's now possible to use the dstack CLI client in minimal environments such as Docker containers, including the official dstackai/dstack image.
What's Changed
- Add DSTACK_{RUNNER,SHIM}_DOWNLOAD_URL env vars by @un-def in #2023
- [Feature] Add alias -R for --reuse with dstack apply by @peterschmidt85 in #2032
- Replace ps | grep with psutil in SSHAttach by @un-def in #2029
- Shorten model URL output in CLI by @jvstme in #2035
Full Changelog: 0.18.27...0.18.28