Commit bb2c8d5

LostBeard and claude committed
Stage 4.13.0 stable: version flip 4.13.0-local.10 -> 4.13.0 + CHANGELOG release date
Promotes the wrapper to the stable 4.13.0 (forks stay at the clean 2.0.26 = the PackageReference, so the four-package version-sync is satisfied). CHANGELOG header dated 2026-06-16.

4.13.0 = low-precision floats on all 6 backends (Half + BFloat16 + FP8 Float8E4M3/E5M2) + generic INumber<T> mixed-precision kernels + PrecisionConvert + bf16/FP8 portability to pre-Ampere CUDA.

Release gate: full PMT sweep 3569 pass / 1 transient (OpenCL RadixSort2M GPU-contention flake, passes 9/9 isolated, not a regression) / 224 skip; FP8 PrecisionConvert round-trip + relu 257/257; BFloat16 107/0 incl CUDA; AcceleratorRequirements 19/0/1.

Consumed by ML (Tuvok bumped to local.10, GGUFDecode KVCache 8/0, bf16 KV path validated on the pre-Ampere fix).

Source staged to master FIRST per the nuget.org hard-gate; the nuget.org push (3 packages: forks 2.0.26 + SpawnDev.ILGPU 4.13.0) awaits Captain per-push sign-off.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 2190a6b commit bb2c8d5

2 files changed

Lines changed: 4 additions & 2 deletions

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -2,7 +2,9 @@
 
 This file tracks notable changes per release. The README's "Recent Highlights" section links here for the full version history.
 
-## 4.13.0 (unreleased) - BFloat16 (bfloat16) Phases 0-3b: core type (CPU) + WebGPU + WebGL + Wasm + OpenCL + CUDA codegen (all 6 backends)
+## 4.13.0 (2026-06-16) - Low-precision floats on all 6 backends: BFloat16 + FP8 (Float8E4M3 / Float8E5M2), generic INumber<T> mixed-precision kernels, PrecisionConvert, and bf16/FP8 portability to pre-Ampere CUDA cards
+
+> 4.13.0 was developed across the local.5 -> local.10 series; the dated headline above is the stable cut. Per-milestone detail follows.
 
 ### local.10 - FP8 complete on ALL 6 backends + bf16 pre-Ampere CUDA fix
 
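
For readers unfamiliar with the two FP8 layouts named in the new CHANGELOG headline, the following is a minimal decoding sketch in Python. It is not taken from SpawnDev.ILGPU. It assumes the common OCP-style conventions (E4M3 with bias 7, no infinities, and a single NaN pattern per sign; E5M2 with bias 15 and IEEE-like infinities and NaNs), and the function names `decode_e4m3` and `decode_e5m2` are hypothetical. Whether the library's `Float8E4M3` and `Float8E5M2` follow exactly these conventions is an assumption.

```python
import math

def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte (assumed layout: 1 sign, 4 exponent, 3 mantissa bits)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    # Assumed "FN" convention: no infinities; exponent and mantissa all ones is NaN.
    if exp == 0xF and man == 0x7:
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 1 - bias.
        return sign * (man / 8.0) * 2.0 ** (1 - 7)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

def decode_e5m2(byte: int) -> float:
    """Decode one FP8 E5M2 byte (assumed layout: 1 sign, 5 exponent, 2 mantissa bits)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 2) & 0x1F
    man = byte & 0x3
    if exp == 0x1F:
        # IEEE-like specials: zero mantissa is infinity, anything else is NaN.
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:
        return sign * (man / 4.0) * 2.0 ** (1 - 15)
    return sign * (1.0 + man / 4.0) * 2.0 ** (exp - 15)
```

Under these assumptions E4M3 trades range for precision (largest finite value 448), while E5M2 keeps a wider exponent range (largest finite value 57344) with one fewer mantissa bit.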

SpawnDev.ILGPU/SpawnDev.ILGPU.csproj

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 <TargetFramework>net10.0</TargetFramework>
 <ImplicitUsings>enable</ImplicitUsings>
 <Nullable>enable</Nullable>
-<Version>4.13.0-local.10</Version>
+<Version>4.13.0</Version>
 <!-- Brief current-version highlights only. Full per-version history with code samples lives in CHANGELOG.md (linked from the README). -->
 <PackageReleaseNotes>4.13.0 brings full low-precision floating-point support across ALL 6 backends (CPU, OpenCL, WebGPU, WebGL, Wasm, CUDA): Half, BFloat16, and now FP8 (Float8E4M3 + Float8E5M2), plus generic INumber&lt;T&gt; mixed-precision kernels and PrecisionConvert for transpilable generic float&lt;-&gt;T conversion inside a kernel. This release also fixes bf16 on PRE-AMPERE CUDA cards (GTX 1080 / RTX 2060 etc.): the PTX bf16 path used sm_80+ cvt instructions and failed to compile on older cards; it now uses portable bit-manipulation that works on every CUDA architecture (FP8 likewise). Full per-version history with code samples: CHANGELOG.md at https://github.com/LostBeard/SpawnDev.ILGPU/blob/master/CHANGELOG.md</PackageReleaseNotes>
 <GeneratePackageOnBuild>True</GeneratePackageOnBuild>
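
The release notes say the bf16 path now relies on portable bit manipulation instead of sm_80+ `cvt` instructions. The diff does not show the generated PTX, so the following is only a Python sketch of the general technique, not the project's codegen: a bfloat16 value is the upper 16 bits of an IEEE float32, so conversion needs only integer shifts, adds, and masks. The helper names are hypothetical, and round-to-nearest-even is assumed as the rounding mode.

```python
import struct

def float32_to_bits(x: float) -> int:
    """Reinterpret a value as its 32-bit IEEE float32 pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def float32_to_bf16_bits(x: float) -> int:
    """Narrow float32 to bf16 using integer operations only (assumed round-to-nearest-even)."""
    bits = float32_to_bits(x)
    # NaN: truncate and force a mantissa bit so the result stays NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0:
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to the even neighbour.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bf16_bits_to_float32(h: int) -> float:
    """Widen bf16 to float32: the 16 bits become the high half, the low half is zero."""
    return bits_to_float32((h & 0xFFFF) << 16)
```

Because every step is a shift, add, or mask on a 32-bit integer, the same sequence can be expressed on any target with 32-bit integer arithmetic, which is what makes this style of conversion independent of hardware bf16 support.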
