Commit 45cac7c

ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (ggml-org#21052)
* Update workflows to remove dependence on llvmpipe
* Try setting Dawn_DIR
* remove c++20 initializers
* Move to proper guid
* Try avoiding segfaults on vulkan backend process exit
* Remove compiler warnings on parameter casting
* Fix soft_max and update reg_tile accumulation to f32 for better precision
* Refactor flash_attn a bit
* remove c++20 initializers and format
* Increase div precision for NVIDIA
* revert div precision and comment out ggml-ci node for now
* Formatting
* Try debugging on a failing CI node
* Revert "Try debugging on a failing CI node"

  This reverts commit 1971e33.
1 parent b94050e commit 45cac7c

4 files changed

Lines changed: 948 additions & 1177 deletions

.github/workflows/build-self-hosted.yml

Lines changed: 30 additions & 0 deletions
@@ -97,6 +97,36 @@ jobs:
 vulkaninfo --summary
 GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
 
+# TODO: investigate slight precision issues in some operations for test-backend-ops on the WebGPU backend.
+#ggml-ci-nvidia-webgpu:
+# runs-on: [self-hosted, Linux, NVIDIA]
+
+# steps:
+# - name: Clone
+# id: checkout
+# uses: actions/checkout@v6
+
+# - name: Dawn Dependency
+# id: dawn-depends
+# run: |
+# DAWN_VERSION="v20260317.182325"
+# DAWN_OWNER="google"
+# DAWN_REPO="dawn"
+# DAWN_ASSET_NAME="Dawn-18eb229ef5f707c1464cc581252e7603c73a3ef0-ubuntu-latest-Release"
+# echo "Fetching release asset from https://github.com/google/dawn/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.tar.gz"
+# curl -L -o artifact.tar.gz \
+# "https://github.com/google/dawn/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.tar.gz"
+# mkdir dawn
+# tar -xvf artifact.tar.gz -C dawn --strip-components=1
+
+# - name: Test
+# id: ggml-ci
+# run: |
+# GG_BUILD_WEBGPU=1 \
+# GG_BUILD_WEBGPU_DAWN_PREFIX="$GITHUB_WORKSPACE/dawn" \
+# GG_BUILD_WEBGPU_DAWN_DIR="$GITHUB_WORKSPACE/dawn/lib64/cmake/Dawn" \
+# bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
+
 # TODO: provision AMX-compatible machine
 #ggml-ci-cpu-amx:
 # runs-on: [self-hosted, Linux, CPU, AMX]
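
Note on the commented-out "Dawn Dependency" step: it defines DAWN_OWNER and DAWN_REPO but hardcodes google/dawn in both URLs. A minimal sketch of how the URL could be built once from those variables is shown below. The helper name dawn_asset_url is hypothetical and not part of this commit; the version and asset name are copied from the diff. The download and unpack commands are left commented out so the sketch runs without network access.

```shell
#!/usr/bin/env bash
# Sketch only: dawn_asset_url is a hypothetical helper, not part of the commit.
set -euo pipefail

# Build the download URL for a Dawn release asset from its tag and asset name.
# Owner and repo default to the values used in the workflow (google/dawn).
dawn_asset_url() {
    local version="$1" asset="$2"
    local owner="${3:-google}" repo="${4:-dawn}"
    printf 'https://github.com/%s/%s/releases/download/%s/%s.tar.gz\n' \
        "$owner" "$repo" "$version" "$asset"
}

# Values copied from the commented-out workflow step.
DAWN_VERSION="v20260317.182325"
DAWN_ASSET_NAME="Dawn-18eb229ef5f707c1464cc581252e7603c73a3ef0-ubuntu-latest-Release"

url="$(dawn_asset_url "$DAWN_VERSION" "$DAWN_ASSET_NAME")"
echo "Fetching release asset from ${url}"

# Download and unpack, as in the workflow step (commented out here):
# curl -L -o artifact.tar.gz "$url"
# mkdir dawn
# tar -xvf artifact.tar.gz -C dawn --strip-components=1
```

Building the URL in one place keeps the echo and curl lines from drifting apart when the Dawn version is bumped.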
