Commit 4fdf652

c2w-sea and Rexwang8 authored
feat(vllm): Update to vllm v0.17.1, lmcache to v0.4.1, and switch vllm-tensorizer build to dedicated buildkit endpoints (#132)
* Update vllm version to v0.17.1.
* Update lmcache to v0.4.1, and move the LMCache version parameter into the build config.
* Keep the same flashinfer version (v0.6.4). Checked openai's vllm image and vllm's v0.17.1 runtime requirements to confirm the version.
* The vllm upgrade caused the docker buildkit job to OOM consistently. Per the CBS team's suggestion, switch vllm-tensorizer to dedicated buildkit endpoints, which launch pods with a much higher memory limit (~500G vs ~60G).

---------

Co-authored-by: rexwang8 <rexkingsbackyard@gmail.com>
1 parent c684e31 commit 4fdf652

4 files changed

Lines changed: 13 additions & 4 deletions

.github/configurations/vllm-tensorizer.yml

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
 vllm-commit:
-  - 'v0.16.0'
+  - 'v0.17.1'
 flashinfer-commit:
   - 'v0.6.4'
+lmcache-commit:
+  - 'v0.4.1'
 builder-base-image:
   - 'ghcr.io/coreweave/ml-containers/torch:17ad6db-nccl-cuda12.9.1-ubuntu22.04-nccl2.29.2-1-torch2.10.0-vision0.25.0-audio2.10.0-abi1'
 final-base-image:

.github/workflows/build.yml

Lines changed: 7 additions & 2 deletions
@@ -24,6 +24,11 @@ on:
         description: "Platforms for which to build (default: linux/amd64,linux/arm64)"
         type: string
         default: linux/amd64,linux/arm64
+      dedicated-buildkit:
+        required: false
+        description: "Instead of shared consumer endpoints, use dedicated BuildKit endpoints (BUILDKIT_DEDICATED_0) backed by high-memory-limit worker pods to prevent OOMs during large builds."
+        type: boolean
+        default: false
     outputs:
       outcome:
         description: "The outcome of the build"
@@ -62,10 +67,10 @@ jobs:
         uses: docker/setup-buildx-action@v3.7.1
         with:
           driver: remote
-          endpoint: ${{ secrets.BUILDKIT_CONSUMER_AMD64_ENDPOINT }}
+          endpoint: ${{ inputs.dedicated-buildkit && secrets.BUILDKIT_DEDICATED_0_AMD64_ENDPOINT || secrets.BUILDKIT_CONSUMER_AMD64_ENDPOINT }}
           platforms: linux/amd64
           append: |
-            - endpoint: ${{ secrets.BUILDKIT_CONSUMER_ARM64_ENDPOINT }}
+            - endpoint: ${{ inputs.dedicated-buildkit && secrets.BUILDKIT_DEDICATED_0_ARM64_ENDPOINT || secrets.BUILDKIT_CONSUMER_ARM64_ENDPOINT }}
               platforms: linux/arm64
         env:
           BUILDER_NODE_0_AUTH_TLS_CACERT: ${{ steps.client-certs.outputs.TLS_CACERT }}
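The two endpoint expressions in this hunk use the `cond && a || b` idiom to choose between the dedicated and the consumer secret. As a rough illustration, the selection can be modeled in Python, whose `and`/`or` also short-circuit and return operand values. The function name and the endpoint strings here are hypothetical placeholders, not values from the repository. One consequence worth noting: if the dedicated secret is empty, the expression falls back to the consumer endpoint even when `dedicated-buildkit` is true.

```python
def select_endpoint(dedicated: bool, dedicated_endpoint: str, consumer_endpoint: str) -> str:
    """Model of `inputs.dedicated-buildkit && DEDICATED || CONSUMER`."""
    # `and` yields dedicated_endpoint only when the flag is true;
    # `or` then falls back to consumer_endpoint if that result is falsy
    # (flag is false, or the dedicated endpoint is an empty string).
    return (dedicated and dedicated_endpoint) or consumer_endpoint


if __name__ == "__main__":
    # Placeholder endpoint values, not the real secrets.
    dedicated_ep = "tcp://dedicated.example:1234"
    consumer_ep = "tcp://consumer.example:1234"
    print(select_endpoint(True, dedicated_ep, consumer_ep))
    print(select_endpoint(False, dedicated_ep, consumer_ep))
    print(select_endpoint(True, "", consumer_ep))
```

Because `default: false` is set on the input, callers that do not pass `dedicated-buildkit` keep the previous behavior and build on the shared consumer endpoints.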

.github/workflows/vllm-tensorizer.yml

Lines changed: 2 additions & 0 deletions
@@ -22,9 +22,11 @@ jobs:
     with:
       image-name: vllm-tensorizer
       folder: vllm-tensorizer
+      dedicated-buildkit: true
       tag-suffix: ${{ matrix.vllm-commit }}
       build-args: |
         VLLM_COMMIT=${{ matrix.vllm-commit }}
         FLASHINFER_COMMIT=${{ matrix.flashinfer-commit }}
+        LMCACHE_COMMIT=${{ matrix.lmcache-commit }}
         BUILDER_BASE_IMAGE=${{ matrix.builder-base-image }}
         FINAL_BASE_IMAGE=${{ matrix.final-base-image }}
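The `matrix.*` references tie each build arg to a key in `.github/configurations/vllm-tensorizer.yml`. The sketch below shows how such a configuration could expand into `build-args` text, assuming each key's list is crossed the way a standard GitHub Actions matrix is. How the repository actually loads the file into the matrix is not shown in this commit, so `expand_matrix` and `render_build_args` are hypothetical helpers. The base-image keys are left out because the `final-base-image` value does not appear in the diff.

```python
from itertools import product

# Commit pins copied from the configuration file as changed in this commit.
CONFIG = {
    "vllm-commit": ["v0.17.1"],
    "flashinfer-commit": ["v0.6.4"],
    "lmcache-commit": ["v0.4.1"],
}


def expand_matrix(config):
    """Return one dict per combination of the listed values (cross product)."""
    keys = list(config)
    return [dict(zip(keys, combo)) for combo in product(*(config[k] for k in keys))]


def render_build_args(entry):
    """Render a matrix entry as KEY=VALUE lines, e.g. lmcache-commit -> LMCACHE_COMMIT."""
    return "\n".join(f"{key.upper().replace('-', '_')}={value}" for key, value in entry.items())


if __name__ == "__main__":
    for entry in expand_matrix(CONFIG):
        print(render_build_args(entry))
```

With a single value per key, the expansion yields exactly one build; adding a second entry under any key would double the number of builds.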

vllm-tensorizer/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ RUN git clone --filter=tree:0 --no-single-branch --no-checkout \
 
 FROM alpine/git:2.36.3 AS lmcache-downloader
 WORKDIR /git
-ARG LMCACHE_COMMIT='v0.3.13'
+ARG LMCACHE_COMMIT
 RUN git clone --filter=tree:0 --no-single-branch --no-checkout \
     https://github.com/LMCache/LMCache && \
     git -C LMCache checkout "${LMCACHE_COMMIT}"
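With the default removed, `LMCACHE_COMMIT` must now be supplied by the caller. An `ARG` that is declared without a default and not passed at build time expands to an empty string, so the checkout step would presumably fail rather than silently build an older LMCache. A pre-flight check along the following lines could catch a missing value before the build starts. This is only a sketch: `missing_build_args` is a hypothetical helper and not part of the repository.

```python
def missing_build_args(build_args: str, required: list[str]) -> list[str]:
    """Return the required build-arg names that are absent or empty.

    `build_args` is newline-separated KEY=VALUE text, the same shape as the
    `build-args` input passed to the build workflow.
    """
    provided = {}
    for line in build_args.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, _, value = line.partition("=")
        provided[name.strip()] = value.strip()
    return [name for name in required if not provided.get(name)]


if __name__ == "__main__":
    args = "VLLM_COMMIT=v0.17.1\nFLASHINFER_COMMIT=v0.6.4"
    # LMCACHE_COMMIT is not set in `args`, so it is reported as missing.
    print(missing_build_args(args, ["VLLM_COMMIT", "FLASHINFER_COMMIT", "LMCACHE_COMMIT"]))
```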
