Summary
Intel Graphics Compiler (libigc.so.2) corrupts glibc heap metadata during SYCL kernel JIT compilation targeting DG2 (Arc A770). The corruption is silent under glibc's default allocator but becomes detectable when the program runs with LD_PRELOAD=libtcmalloc.so.4: the process then aborts with munmap_chunk(): invalid pointer or double free or corruption (out) inside IGC's internal free() calls during IgcOclTranslationCtx::Translate.
Reproducer
// igc_heap_repro.cpp
#include <sycl/sycl.hpp>
#include <complex>
#include <cstdio>
#include <cstdlib>
#include <vector>

using cf = std::complex<float>;

int main() {
    sycl::queue q;
    fprintf(stderr, "device: %s\n",
            q.get_device().get_info<sycl::info::device::name>().c_str());

    constexpr int N = 177147; // 3^11
    constexpr int G = 9;

    cf *d_sv = sycl::malloc_device<cf>(N * G, q);
    float *d_gate = sycl::malloc_device<float>(G * G, q);
    if (!d_sv || !d_gate) { fprintf(stderr, "alloc failed\n"); return 1; }

    // Upload a G x G real gate matrix and an N x G complex state vector.
    std::vector<float> h_gate(G * G, 0.1f);
    h_gate[0] = 1.0f;
    q.memcpy(d_gate, h_gate.data(), G * G * sizeof(float)).wait();

    std::vector<cf> h_sv(N * G, cf(0.0f));
    h_sv[0] = cf(1.0f, 0.0f);
    q.memcpy(d_sv, h_sv.data(), N * G * sizeof(cf)).wait();

    fprintf(stderr, "dispatching kernel (triggers IGC JIT)...\n");
    q.parallel_for(sycl::range<1>(N), [=](sycl::id<1> id) {
        int bid = id[0];

        // Load this work-item's G amplitudes.
        cf local[9];
        for (int i = 0; i < G; i++)
            local[i] = d_sv[bid * G + i];

        // result = gate * local (real matrix times complex vector).
        cf result[9];
        for (int i = 0; i < G; i++) {
            cf sum(0.0f, 0.0f);
            for (int j = 0; j < G; j++)
                sum += cf(d_gate[i * G + j], 0.0f) * local[j];
            result[i] = sum;
        }

        for (int i = 0; i < G; i++)
            d_sv[bid * G + i] = result[i];
    }).wait();
    fprintf(stderr, "kernel complete\n");

    sycl::free(d_gate, q);
    sycl::free(d_sv, q);

    // Exercise the host allocator after the JIT has run.
    for (int i = 0; i < 100; i++) {
        void *p = malloc(1024 + i * 64);
        free(p);
    }

    fprintf(stderr, "PASS\n");
    return 0;
}
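The malloc/free loop at the end of the reproducer exercises the host allocator after the JIT has run, so that damaged metadata has a chance to trip a check. A somewhat more aggressive probe is sketched below. It is not part of the original reproducer: the helper name probe_heap and its pattern-check strategy are assumptions made for illustration. It keeps all blocks live at once, fills each with a known byte pattern, then verifies and frees them in reverse order.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the original report).
// Allocates `count` blocks using the same size ladder as the reproducer,
// fills each block with a per-block byte pattern, then verifies and frees
// the blocks in reverse order. Returns the number of blocks whose contents
// no longer match the pattern they were filled with.
std::size_t probe_heap(std::size_t count) {
    std::vector<unsigned char*> blocks;
    std::vector<std::size_t> sizes;

    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t size = 1024 + i * 64;
        auto* p = static_cast<unsigned char*>(std::malloc(size));
        if (!p) break;  // stop allocating on failure; check what we have
        std::memset(p, static_cast<int>(i & 0xFF), size);
        blocks.push_back(p);
        sizes.push_back(size);
    }

    std::size_t damaged = 0;
    for (std::size_t i = blocks.size(); i-- > 0;) {
        const auto expected = static_cast<unsigned char>(i & 0xFF);
        for (std::size_t k = 0; k < sizes[i]; ++k) {
            if (blocks[i][k] != expected) { ++damaged; break; }
        }
        std::free(blocks[i]);  // a corrupted chunk header may abort here
    }
    return damaged;
}
```

Calling probe_heap(100) in place of the final loop and printing the result would report overwritten payloads as well as relying on the allocator's own consistency checks. Whether it surfaces this particular corruption any earlier is untested.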
Build & Run
# Build
icpx -std=c++20 -O2 -g -fsycl -fsycl-targets=spir64 \
-o igc_heap_repro igc_heap_repro.cpp -lze_loader -lsycl -lur_loader
# Normal run — passes (corruption is silent under glibc):
./igc_heap_repro
# Output: "PASS"
# tcmalloc run — crashes (corruption detected):
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 ./igc_heap_repro
# Output: "munmap_chunk(): invalid pointer" then SIGABRT
GDB Backtrace (under tcmalloc)
Thread 1 "igc_heap_repro" received signal SIGABRT, Aborted.
#0 __pthread_kill_implementation at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill at ./nptl/pthread_kill.c:89
#3 __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#4 __GI_abort () at ./stdlib/abort.c:79
#5 __libc_message_impl at ../sysdeps/posix/libc_fatal.c:134
#6 malloc_printerr (str="munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5775
#7 munmap_chunk (p=<optimized out>) at ./malloc/malloc.c:3040
#8 __GI___libc_free (mem=0x...) at ./malloc/malloc.c:3388
#9 ?? () from /lib/x86_64-linux-gnu/libigc.so.2
#10 ?? () from /lib/x86_64-linux-gnu/libigc.so.2
...
#20 IGC::IgcOclTranslationCtx<3ul>::TranslateImpl(...) from /lib/x86_64-linux-gnu/libigc.so.2
#21 ?? () from /lib/x86_64-linux-gnu/libze_intel_gpu.so.1
...
#28 sycl::_V1::detail::ProgramManager::build(...) from libsycl.so.9
All frames #9–#20 are inside libigc.so.2. The corruption happens during the Translate (kernel compilation) phase.
Additional observation: tcmalloc_debug with TCMALLOC_PAGE_FENCE=1
Running with the debug tcmalloc library reveals a related issue at startup:
memory allocation/deallocation mismatch at 0x...: allocated with malloc being deallocated with delete
This fires during SYCL/L0 runtime initialization (before any user kernel dispatch), suggesting a malloc/delete type mismatch inside the runtime libraries.
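For readers unfamiliar with this diagnostic, the snippet below illustrates the general class of mismatch it reports and one way to rule it out. It is a generic sketch, not code taken from the SYCL or Level Zero runtime; FreeDeleter, MallocBuffer, and make_buffer are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// The kind of mismatch the debug allocator flags would look like this
// (left as a comment because executing it is undefined behavior):
//
//   char* p = static_cast<char*>(std::malloc(16));
//   delete p;  // memory from malloc released with operator delete
//
// Tying the release function to the owning type rules that out.

// Deleter that releases memory with std::free, matching malloc/calloc.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Owning handle for a malloc-family buffer; it never calls delete.
using MallocBuffer = std::unique_ptr<char[], FreeDeleter>;

// Allocates n zero-initialized bytes with calloc and wraps them so that
// destruction always goes through std::free.
MallocBuffer make_buffer(std::size_t n) {
    return MallocBuffer(static_cast<char*>(std::calloc(n, 1)));
}
```

A buffer obtained from make_buffer(32) is released with free() when it goes out of scope, so an allocator that tracks allocation kinds has nothing to report.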
Environment
- OS: Ubuntu 24.04 (kernel 6.17.0-19-generic)
- GPU: Intel Arc A770 (DG2)
- IGC: libigc2 2.10.11-1133~24.04
- Compute Runtime: intel-opencl-icd 25.13.33276.22-1133~24.04
- oneAPI: 2026.0 (icpx, libsycl.so.9)
- glibc: 2.39-0ubuntu8.7
- tcmalloc: libgoogle-perftools4 2.15-3build1
Impact
Under glibc's default allocator the corruption stays dormant: it only manifests as double free or corruption (!prev) during process exit (destructor chain). This makes it look like an application bug when it is actually an IGC-internal buffer overflow. The SYCL_CACHE_PERSISTENT=1 workaround (kernel caching) avoids repeated JIT compilation and therefore repeated corruption, but the first run after a cache invalidation still triggers it.
Ahead-of-time compilation (-fsycl-targets=intel_gpu_dg2) eliminates the JIT path and sidesteps the bug entirely.