Optimize disk cache source loading#2176

Open
sepcnt wants to merge 5 commits into
tile-ai:mainfrom
sepcnt:cache-lazy-source-load

@sepcnt sepcnt commented May 11, 2026

Summary

#2177

  • Avoid eagerly reading cached host/device source files when loading kernels from disk cache.
  • Pass cached source paths through to JIT adapters and lazily materialize source text only when source inspection is requested.
  • Add a frontend cache index for @tilelang.jit lazy kernels so fresh processes can rebuild cached JITKernels from serialized PrimFuncs without re-elaborating Python DSL/TIR templates first.
  • Move disk cache loading outside the global cache lock so independent disk hits can proceed in parallel.
  • Add cache regression coverage, including source-read perf coverage and a frontend-cache regression test that fails if a cache hit falls back to TIR elaboration.

Motivation

For complex kernels, cached .cu and host wrapper sources can be large. Loading them eagerly on every disk cache hit adds noticeable startup overhead even when only the compiled library and params are needed to run the kernel.

Fresh-process cache hits also paid another large first-invocation cost before reaching the disk cache: each lazy JIT wrapper rebuilt its Python DSL/TIR template to compute the canonical kernel key. The frontend cache index maps a lightweight frontend key to the canonical kernel cache key and stores the PrimFunc beside the existing compiled artifacts, so the fast path can load the cached kernel directly.
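
As a rough illustration of the two-level lookup described above, the sketch below hashes cheap call-site inputs into a frontend key and stores a small record that points at the canonical kernel key. Every name in it (`make_frontend_key`, `FrontendIndex`, the JSON record layout) is hypothetical and chosen for illustration; it is not the PR's implementation.

```python
import hashlib
import json
import os
import tempfile
from typing import Optional


def make_frontend_key(func_name: str, args_repr: str, target: str) -> str:
    """Hash cheap, call-site-level inputs; no DSL/TIR elaboration is needed."""
    payload = json.dumps(
        {"func": func_name, "args": args_repr, "target": target},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class FrontendIndex:
    """Hypothetical on-disk map: frontend key -> canonical kernel cache key."""

    def __init__(self, root: str) -> None:
        self.root = os.path.join(root, "frontend")
        os.makedirs(self.root, exist_ok=True)

    def _path(self, frontend_key: str) -> str:
        return os.path.join(self.root, frontend_key + ".json")

    def store(self, frontend_key: str, kernel_key: str) -> None:
        with open(self._path(frontend_key), "w", encoding="utf-8") as f:
            json.dump({"kernel_key": kernel_key}, f)

    def load(self, frontend_key: str) -> Optional[str]:
        try:
            with open(self._path(frontend_key), encoding="utf-8") as f:
                return json.load(f)["kernel_key"]
        except (OSError, ValueError, KeyError, TypeError):
            return None  # any unreadable or malformed entry counts as a miss


index = FrontendIndex(tempfile.mkdtemp())
fkey = make_frontend_key("matmul", "(1024, 1024, 1024)", "cuda")
miss = index.load(fkey)            # nothing stored yet
index.store(fkey, "canonical-key-abc")
resolved = index.load(fkey)        # now resolves without elaboration
```

The point of the design is that the frontend key only depends on inputs that are available before any template is built, so a hit can skip straight to the existing disk-cache entry.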

Measurements

  • 6-kernel disk cache load experiment: about 288 ms total before lazy source loading, about 6.9 ms total after deferring source reads.
  • 7-kernel fresh-process cold-start benchmark with the disk cache already warmed: with the frontend fast path disabled, first_forward_ms ~= 2963; with it enabled, first_forward_ms ~= 216 (~13.7x faster, a ~92.7% reduction). The same run showed no _build_tir_template calls on the fast path, and _load_kernel_from_disk stayed around 8 ms total.
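
The ratios quoted above follow directly from the reported timings. The snippet below recomputes them from the numbers in this description; the variable names are only for illustration.

```python
# Timings copied from the measurements above (milliseconds).
eager_source_ms = 288.0   # 6-kernel disk cache load, eager source reads
lazy_source_ms = 6.9      # same load with source reads deferred
disabled_ms = 2963.0      # first_forward_ms, frontend fast path disabled
enabled_ms = 216.0        # first_forward_ms, frontend fast path enabled

source_speedup = eager_source_ms / lazy_source_ms
speedup = disabled_ms / enabled_ms
reduction_pct = (disabled_ms - enabled_ms) / disabled_ms * 100.0

print(f"source load speedup ~= {source_speedup:.1f}x")
print(f"cold-start speedup ~= {speedup:.1f}x, reduction ~= {reduction_pct:.1f}%")
```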

Tests

  • git diff --check
  • python -m compileall tilelang
  • python -m pytest testing/python/cache/test_tilelang_kernel_cache_atomic_save.py -q
  • uvx pre-commit run --files tilelang/cache/__init__.py tilelang/cache/kernel_cache.py tilelang/jit/__init__.py testing/python/cache/test_tilelang_kernel_cache_atomic_save.py

Summary by CodeRabbit

  • Performance Improvements

    • Reduces lock contention and avoids reading large cached source files during cache hits.
  • Reliability / Bug Fixes

    • Treats incomplete on-disk cache entries as misses, repairs missing metadata when possible, and provides safer fallbacks when runtime artifacts are absent.
  • New Features

    • Frontend-level cache for lazy mode; adapters lazily load host/device sources from disk and support serialized function round-trips.
  • Tests

    • Added tests for disk-fronted cache hits, lazy source loading, large-source read guards, missing-source handling, and adapter fallbacks.

@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai Bot commented May 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Defers reading kernel source files during disk-cache hits: KernelCache releases its lock for disk loads, passes file-backed CachedTextSource(path=...) into builders/adapters, adds frontend-key mapping and optional prim_func persistence, provides adapter lazy-read helpers, updates adapters/JIT/autotuner signatures, and adds tests for correctness and perf.

Changes

Deferred Kernel Source Loading for Cache Hits

| Layer / File(s) | Summary |
|---|---|
| Frontend dispatch & keys (tilelang/cache/__init__.py) | Add JSON/sha256 imports, _resolve_cache_dispatch(), _normalize_for_json(), _make_frontend_cache_key(), and top-level load_frontend_cached() / store_frontend_cache() helpers. |
| JITImpl frontend integration (tilelang/jit/__init__.py) | Add _frontend_cache_key_data() and attempt load_frontend_cached() in lazy mode before compiling; store frontend mapping after compile when kernel exposes _tilelang_cache_key. |
| KernelCache: prim_func, frontend root, tagging (tilelang/cache/kernel_cache.py) | Add prim_func.pkl support, frontend/ subdir and path helpers, tag kernels with _tilelang_cache_key/_tilelang_cache_path, and expose KernelCache-level frontend load/store. |
| KernelCache: release lock & defer disk reads (tilelang/cache/kernel_cache.py) | cached() checks memory under lock, releases lock for _load_kernel_from_disk(), which verifies completeness but does not read source files and calls _build_kernel() with CachedTextSource(path=...) for host/device sources. |
| _build_kernel signature (tilelang/cache/kernel_cache.py) | _build_kernel() accepts CachedTextSource objects for host_kernel_source/device_kernel_source; removed missing-source-as-cache-miss checks and rely on adapters/JIT to lazily load. |
| Base adapter helper (tilelang/jit/adapter/base.py) | Add CachedTextSource dataclass and helpers _set_cached_text_source() / _load_cached_text_source() to persist/read cached text from paths, caching reads and returning None on missing path/OSError. |
| Adapter signatures & lazy getters (tilelang/jit/adapter/*) | CuTeDSL, Cython, NVRTC, and TVM-FFI adapters accept CachedTextSource inputs, persist source-path metadata, and implement get_host_source()/get_kernel_source() to lazily load cached text or fallback to runtime inspection. |
| JITKernel entry & properties (tilelang/jit/kernel.py) | JITKernel.from_database() and _create_adapter_from_database() accept CachedTextSource for host/device sources; kernel_source/host_source properties use adapter fallbacks when artifact is missing. |
| Autotuner integration (tilelang/autotuner/param.py) | When reconstructing cached kernels, wrap loaded host/device source strings into CachedTextSource(text=...) before passing to JITKernel.from_database(). |
| Tests & helpers (testing/python/cache/test_tilelang_kernel_cache_atomic_save.py) | Add helper to write complete cache entries and tests covering disk-hit deferral of source loading, rejection of incomplete entries, perf guard against large source reads, frontend prim_func load/round-trip, JIT frontend-skip behavior, and adapter lazy-load/TVM fallback tests. |
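
The lazy-read behavior summarized in the table (keep the path, read on first request, cache the text, return None on a missing path or OSError) can be sketched roughly as follows. `LazyText` is a hypothetical stand-in written for this example; it is not the PR's CachedTextSource, and the `reads` counter exists only to make the deferral observable.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class LazyText:
    """Hypothetical file-backed text holder; reads the file only on demand."""

    text: Optional[str] = None
    path: Optional[str] = None
    reads: int = 0  # instrumentation for the example only

    def get(self) -> Optional[str]:
        if self.text is not None:
            return self.text          # already materialized
        if self.path is None:
            return None               # nothing to load from
        try:
            with open(self.path, encoding="utf-8") as f:
                self.text = f.read()
        except OSError:
            return None               # missing or unreadable file
        self.reads += 1
        return self.text


tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "kernel.cu")
with open(src_path, "w", encoding="utf-8") as f:
    f.write("__global__ void k() {}")

source = LazyText(path=src_path)   # constructing it performs no file I/O
first = source.get()               # first access reads the file
second = source.get()              # later accesses reuse the cached text
missing = LazyText(path=os.path.join(tmp_dir, "absent.cu")).get()
```

A kernel that is only ever launched never calls `get()`, so the cost of reading a large source file is paid only by callers that inspect the source.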

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant KernelCache
  participant Loader as _load_kernel_from_disk
  participant Builder as _build_kernel
  participant JIT as JITKernel
  participant Adapter
  participant FS as Filesystem

  Caller->>KernelCache: cached(key)
  KernelCache->>KernelCache: check memory cache (locked)
  KernelCache->>KernelCache: release lock
  KernelCache->>Loader: _load_kernel_from_disk(key)
  Loader->>Builder: _build_kernel(..., host_source=CachedTextSource(path=...), device_source=CachedTextSource(path=...))
  Builder->>JIT: JITKernel.from_database(..., host_kernel_source_path=..., device_kernel_source_path=..., params=...)
  JIT->>Adapter: adapter.from_database(..., host_kernel_source=CachedTextSource, device_kernel_source=CachedTextSource)
  Caller->>Adapter: get_kernel_source()/get_host_source()
  Adapter->>FS: open file at stored path and read text
  Adapter-->>Caller: return loaded source text

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • LeiWang1999

"🐰 I keep the paths, not the full code pile,
Silent cache waits in the file.
Only when asked do I unbind,
I open, read, then gift the find.
Hop, fetch, return — a tidy file."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 31.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and concisely describes the main optimization: deferring eager disk cache source file reads via lazy loading, which directly aligns with the primary objective and the measured performance improvements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tilelang/cache/kernel_cache.py`:
- Around line 537-540: The cache loader currently treats a directory with only
binary and params as a hit because _build_kernel() no longer rejects missing
source text; fix by ensuring we only accept cache hits when source files are
present or by only supplying source-path arguments when the corresponding file
exists: in the cache-loading path around _build_kernel() (references:
_build_kernel(), host_kernel_source, device_kernel_source, host_kernel_path,
device_kernel_path) check the filesystem for the expected .cu/.cpp source files
and either (a) refuse the cache hit if any required source text is missing, or
(b) pass host_kernel_path/device_kernel_path only when that file exists and
leave host_kernel_source/device_kernel_source None otherwise so
show_source()/export_sources() degrade predictably; apply the same change to the
similar block covering lines mentioned (the other cache-loading block referenced
in the comment).

In `@tilelang/jit/adapter/cutedsl/adapter.py`:
- Around line 121-122: The adapter now stores adapter._host_kernel_source_path
but never lazy-loads it, so host-source inspection for cached CuTeDSL kernels is
unavailable; add a get_host_source() method that mirrors
get_device_kernel_source() to lazily read and cache
adapter._host_kernel_source_path into adapter._host_kernel_source (and return
it), replace any places that assume host/device parity to call
get_host_source(), and also add an assertion in the adapter initialization (or
in the cache-path logic) that either enforces host and device sources are equal
or fails loudly; ensure you update the analogous spots mentioned (the other
block around the adapter code at the later occurrence) and add a regression test
verifying host/device equality or that get_host_source() returns the expected
content.

In `@tilelang/jit/adapter/nvrtc/adapter.py`:
- Around line 117-121: The adapter currently sets host_kernel_source_path but
does not expose it via a real accessor, breaking JITKernel.get_host_source()
which expects NVRTCKernelAdapter.get_host_source(); add a get_host_source(self)
method on NVRTCKernelAdapter that returns
self._load_cached_text_source("host_func", "_host_kernel_source_path") so the
host text is returned from cache/disk, and apply the same pattern where a host
accessor is missing (the equivalent change referenced around the other block at
the later duplicate lines).

In `@tilelang/jit/adapter/tvm_ffi.py`:
- Around line 274-288: The from_database() constructor currently never sets
adapter.rt_mod so get_host_source()/get_device_source() can crash when cached
file loading returns None; set adapter.rt_mod = rt_mod inside the constructor
(where other adapter fields are assigned) so the inspect_source*() fallback can
run, and make the same change in the analogous block around the 306-319 region;
alternatively, if you prefer guarding, update
get_host_source()/get_device_source() to check self.rt_mod is not None before
calling inspect_source*(), but the simpler fix is to assign adapter.rt_mod =
rt_mod in both constructor paths.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 257bc1fa-7a8a-48b7-98ca-94d453073b00

📥 Commits

Reviewing files that changed from the base of the PR and between beef5cf and d9f9657.

📒 Files selected for processing (8)
  • testing/python/cache/test_tilelang_kernel_cache_atomic_save.py
  • tilelang/cache/kernel_cache.py
  • tilelang/jit/adapter/base.py
  • tilelang/jit/adapter/cutedsl/adapter.py
  • tilelang/jit/adapter/cython/adapter.py
  • tilelang/jit/adapter/nvrtc/adapter.py
  • tilelang/jit/adapter/tvm_ffi.py
  • tilelang/jit/kernel.py

Comment thread tilelang/cache/kernel_cache.py Outdated
Comment thread tilelang/jit/adapter/cutedsl/adapter.py Outdated
Comment thread tilelang/jit/adapter/nvrtc/adapter.py Outdated
Comment thread tilelang/jit/adapter/tvm_ffi.py Outdated
@sepcnt sepcnt force-pushed the cache-lazy-source-load branch from d9f9657 to 675a21e Compare May 11, 2026 05:27
Contributor Author

sepcnt commented May 11, 2026

Addressed the CodeRabbit review comments in 675a21e:

  • require complete cache entries before treating disk cache as a hit, preserving cache-miss behavior for entries missing source files
  • add lazy host-source accessors for CuTeDSL and NVRTC adapters
  • guard TVM FFI source fallbacks when no runtime module is available from a database-loaded executable
  • add regression coverage for the above cases

Validation:

  • git diff --check
  • python -m compileall testing/python/cache/test_tilelang_kernel_cache_atomic_save.py tilelang/cache/kernel_cache.py tilelang/jit/adapter/base.py tilelang/jit/adapter/cutedsl/adapter.py tilelang/jit/adapter/nvrtc/adapter.py tilelang/jit/adapter/tvm_ffi.py
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python -m pytest testing/python/cache/test_tilelang_kernel_cache_atomic_save.py -q -> 9 passed, 1 skipped
  • uvx pre-commit run --files testing/python/cache/test_tilelang_kernel_cache_atomic_save.py tilelang/cache/kernel_cache.py tilelang/jit/adapter/base.py tilelang/jit/adapter/cutedsl/adapter.py tilelang/jit/adapter/nvrtc/adapter.py tilelang/jit/adapter/tvm_ffi.py

@sepcnt sepcnt force-pushed the cache-lazy-source-load branch from 675a21e to 30c28ed Compare May 11, 2026 06:29

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (9)
tilelang/cache/kernel_cache.py (4)

371-381: ⚡ Quick win

Mutate _memory_cache under the lock for consistency.

After the compile-on-miss branch, self._memory_cache[key] = kernel on Line 380 runs outside the with self._lock: block, while every other write/read to _memory_cache in this file holds self._lock. Even though CPython's GIL makes a single dict[key] = value effectively atomic, leaving this write unprotected:

  • breaks the invariant established elsewhere (e.g. lines 331-338, 351-355, 626-660) and is a footgun if _lock is ever replaced by a non-CPython-safe primitive or the dict is replaced with a custom mapping;
  • can lose the dedup race with another thread that just loaded the same kernel from disk under the lock (you overwrite a winner instead of returning it).

Consider folding the assignment back under the lock with a setdefault-style check, mirroring lines 351-355:

🔒 Proposed fix
         with self._lock:
             if env.is_cache_enabled():
                 cache_path = self._get_cache_path(key)
                 self._save_kernel_to_disk(key, kernel, func, verbose)
                 # Set cache path on adapter so it can save cubin after first execution
                 self._set_adapter_cache_path(kernel, cache_path)

-        # Store in memory cache after compilation
-        self._tag_kernel_cache_entry(kernel, key, self._get_cache_path(key))
-        self._memory_cache[key] = kernel
-        return kernel
+        # Store in memory cache after compilation
+        self._tag_kernel_cache_entry(kernel, key, self._get_cache_path(key))
+        with self._lock:
+            existing = self._memory_cache.get(key)
+            if existing is not None:
+                return existing
+            self._memory_cache[key] = kernel
+        return kernel
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 371 - 381, The write to
self._memory_cache happens outside the with self._lock block and can race with
other threads; move the assignment of self._memory_cache[key] = kernel back
under the lock in the function that calls
_get_cache_path/_tag_kernel_cache_entry (the same block that calls
_save_kernel_to_disk and _set_adapter_cache_path), and perform a
setdefault-style check (e.g., if key not in self._memory_cache:
self._memory_cache[key] = kernel) so you don’t overwrite a kernel another thread
loaded under the lock—if an existing kernel is found, discard the newly compiled
one and return the existing entry to preserve the deduplication invariant.
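
The pattern this comment asks for (check memory under the lock, do slow work without it, then publish under the lock so the first writer wins) can be sketched as below. `TinyCache`, `fake_compile`, and the callables passed in are hypothetical and only illustrate the locking shape; they are not the PR's KernelCache.

```python
import threading
from typing import Callable, Dict, List, Optional


class TinyCache:
    """Hypothetical cache illustrating lock scope; not the real KernelCache."""

    def __init__(
        self,
        load_from_disk: Callable[[str], Optional[object]],
        compile_fn: Callable[[str], object],
    ) -> None:
        self._lock = threading.Lock()
        self._memory: Dict[str, object] = {}
        self._load_from_disk = load_from_disk
        self._compile = compile_fn

    def get(self, key: str) -> object:
        with self._lock:                 # fast path: in-memory hit
            hit = self._memory.get(key)
        if hit is not None:
            return hit
        # Slow work runs without the lock so unrelated keys are not serialized.
        value = self._load_from_disk(key)
        if value is None:
            value = self._compile(key)
        with self._lock:                 # publish; keep whichever entry got there first
            return self._memory.setdefault(key, value)


compiled: List[str] = []


def fake_compile(key: str) -> object:
    compiled.append(key)
    return object()


cache = TinyCache(load_from_disk=lambda key: None, compile_fn=fake_compile)
a = cache.get("k")
b = cache.get("k")   # served from memory; no second compile
```

With this shape two threads that miss at the same time may both do the slow work, but `setdefault` ensures they end up returning the same object.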

409-415: 💤 Low value

Silent try/except/pass swallows tagging failures.

Setting two plain attributes on a JITKernel should not raise under normal circumstances; if it does (e.g. a future refactor adds __slots__ or replaces JITKernel with a Mock in tests), this silently breaks the frontend cache because JITImpl.__call__ reads _tilelang_cache_key to decide whether to store_frontend_cache(...). A failure here means the kernel will never be tied back to its disk entry and the frontend cache will keep missing on subsequent runs without any signal.

Prefer logging at debug level rather than pass, and narrow the exception to AttributeError/TypeError:

🛠️ Proposed fix
     @staticmethod
     def _tag_kernel_cache_entry(kernel: JITKernel, key: str, cache_path: str) -> None:
-        try:
-            kernel._tilelang_cache_key = key
-            kernel._tilelang_cache_path = cache_path
-        except Exception:
-            pass
+        try:
+            kernel._tilelang_cache_key = key
+            kernel._tilelang_cache_path = cache_path
+        except (AttributeError, TypeError):
+            logging.getLogger(__name__).debug(
+                "Could not tag kernel cache entry for key %s", key, exc_info=True
+            )

Also matches the Ruff S110/BLE001 hints on this block.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 409 - 415, The silent try/except
in _tag_kernel_cache_entry hides failures when setting
kernel._tilelang_cache_key/_tilelang_cache_path; change it to only catch
AttributeError and TypeError, and log the failure at debug level (include the
key, cache_path and exception info) using a module logger (e.g.
logging.getLogger(__name__)). Do not swallow other exceptions—let them
propagate—so that real errors are visible; keep the assignments in the same
function (_tag_kernel_cache_entry) and ensure
JITImpl.__call__/store_frontend_cache can rely on the attributes when present.

663-674: ⚖️ Poor tradeoff

Cross-filesystem risk for _safe_write_file applies only when TILELANG_TMP_DIR is explicitly set to a different filesystem.

_safe_write_file writes to a temp file in TILELANG_TMP_DIR and then calls os.replace onto the target path. By default, TILELANG_TMP_DIR is ~/.tilelang/cache/tmp (nested inside TILELANG_CACHE_DIR), so both are on the same filesystem. However, if a user explicitly sets TILELANG_TMP_DIR elsewhere (e.g., TILELANG_TMP_DIR=/tmp on a tmpfs while the cache is on a separate volume), os.replace will raise OSError(EXDEV). The exception is silently swallowed when verbose=False (the default) at lines 672–674.

This is a pre-existing concern affecting params.pkl, prim_func.pkl, kernel source, and shared object writes throughout the codebase. Staging temp files within the target directory root would guarantee atomicity stays on one filesystem, but this requires changes across multiple code paths and only affects non-default configurations. Reasonable as an optional improvement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 663 - 674, store_frontend_cache
currently swallows failures from KernelCache._safe_write_file when verbose is
False, which hides cross-filesystem EXDEV errors from os.replace if
TILELANG_TMP_DIR is on a different mount; modify the write logic in
store_frontend_cache (and similarly in other callers of
KernelCache._safe_write_file) to catch OSError with errno.EXDEV and retry by
creating a temp file in the same directory as the target (use
os.path.dirname(frontend_path) or equivalent), write the payload there and then
os.replace to achieve atomic move on the same filesystem; for other exceptions,
ensure at least a single self.logger.error/exception is emitted (not only when
verbose is True) so failures are visible.
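
One way to keep the rename on a single filesystem, as the comment above suggests, is to stage the temp file in the target's own directory. The helper below is a minimal sketch under that assumption; `atomic_write_bytes` is a hypothetical name and is not the project's _safe_write_file.

```python
import os
import tempfile


def atomic_write_bytes(target_path: str, payload: bytes) -> None:
    """Write payload to target_path via a temp file in the same directory."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    os.makedirs(target_dir, exist_ok=True)
    # Same directory as the target, so os.replace never crosses filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the staged file, then surface the error to the caller.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


demo_dir = tempfile.mkdtemp()
demo_target = os.path.join(demo_dir, "params.pkl")
atomic_write_bytes(demo_target, b"v1")
atomic_write_bytes(demo_target, b"v2")   # replaces the earlier content
with open(demo_target, "rb") as f:
    final_bytes = f.read()
leftovers = [name for name in os.listdir(demo_dir) if name.startswith(".tmp-")]
```

Unlike a swallowed exception, the re-raise here leaves it to the caller to decide whether a failed cache write should be logged or ignored.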

488-499: 💤 Low value

Add end-to-end test for PrimFunc pickle round-trip with actual TIR objects.

Error handling for missing/corrupted prim_func.pkl is correct—the except OSError and except Exception handlers at lines 637–644 in load_frontend_cached properly treat pickle failures as cache misses (returning None). However, round-trip testing is incomplete: test_kernel_cache_frontend_hit_loads_serialized_prim_func exercises pickling only with a mock dict, not an actual tvm.tir.PrimFunc. The integration test test_jit_frontend_cache_hit_skips_tir_elaboration uses a real @T.prim_func decorator but mocks load_frontend_cached, so it doesn't verify that real PrimFunc objects pickle/unpickle correctly. Add an end-to-end test that constructs a genuine PrimFunc, saves it via _save_kernel_to_disk, and loads it back via load_frontend_cached to confirm it remains usable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 488 - 499, Add an end-to-end
test that verifies real tvm.tir.PrimFunc objects survive the full pickle
round-trip: construct a genuine PrimFunc (e.g., using the T.prim_func decorator
or creating tvm.tir.PrimFunc directly), call KernelCache._save_kernel_to_disk
(so the code path that writes prim_func.pkl runs), then call
KernelCache.load_frontend_cached to load the cache and assert the returned entry
contains a usable PrimFunc (check type is tvm.tir.PrimFunc and basic properties
or that it can be used in the same TIR contexts). Place this alongside the
existing tests (e.g., next to
test_kernel_cache_frontend_hit_loads_serialized_prim_func) so it exercises the
real serialization/deserialization paths without mocking load_frontend_cached.
tilelang/cache/__init__.py (2)

128-183: 💤 Low value

Frontend cache APIs forward consistently — minor: avoid recomputing the key.

load_frontend_cached and store_frontend_cache both call _resolve_cache_dispatch(target, execution_backend, verbose) and _make_frontend_cache_key(...) with the same arguments, which is correct. The only minor concern is that callers in tilelang/jit/__init__.py:__call__ invoke load_frontend_cached and then store_frontend_cache on a miss with the same arguments — both calls redo target normalization and the sha256 key derivation. Not a correctness issue, just wasted work on the cold path. If profiling shows it matters, expose the resolved key from load_frontend_cached so the caller can pass it to store_frontend_cache. Otherwise no action needed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/__init__.py` around lines 128 - 183, load_frontend_cached and
store_frontend_cache recompute the normalized target and frontend cache key
causing duplicate work on a cache miss; change the API so load_frontend_cached
also returns the computed frontend_key (from _make_frontend_cache_key) along
with the JITKernel (e.g., return (kernel, frontend_key) or similar) and update
callers (notably tilelang/jit/__init__.py:__call__) to pass that returned
frontend_key into store_frontend_cache (which should accept an optional
precomputed frontend_key) so _resolve_cache_dispatch and
_make_frontend_cache_key are only executed once on the cold path.

32-39: 💤 Low value

_normalize_for_json fallback to repr() can produce non-deterministic cache keys.

For any value that isn't a dict, list, tuple, primitive, or None, the function falls back to repr(value). For most objects this is deterministic, but Python's default object.__repr__ includes the memory address (<X object at 0x7f...>), which would make the frontend cache key effectively unique per process and prevent any cache hits across runs.

This is unlikely to bite for the common pass_configs / compile_flags shapes (mostly primitives and small dicts), but it's a sharp edge. Consider either:

  • raising / warning on unsupported types so a non-deterministic key is caught loudly during development, or
  • restricting the fallback to types with a known-stable __repr__ (e.g. str(value) for Target / tvm.runtime.Object).
🛡️ Defensive variant
-    if isinstance(value, (str, int, float, bool)) or value is None:
-        return value
-    return repr(value)
+    if isinstance(value, (str, int, float, bool)) or value is None:
+        return value
+    # Best-effort fallback. Reject objects whose default repr embeds an address,
+    # which would break cache-key stability across processes.
+    rendered = repr(value)
+    if " object at 0x" in rendered:
+        raise TypeError(
+            f"Cannot derive a stable cache key from object of type {type(value).__name__}"
+        )
+    return rendered
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/__init__.py` around lines 32 - 39, The current
_normalize_for_json uses repr(value) as a fallback which can produce
non-deterministic strings (object memory addresses) and break cache keys; change
_normalize_for_json to avoid repr() for arbitrary objects: for known stable
types (e.g., tvm.runtime.Object, Target, or any types you know have stable
str()/repr()) explicitly convert with str(value) or a type-specific serializer,
and for all other unsupported types raise a TypeError (or warnings.warn in
non-strict mode) with a clear message including the offending type and value so
the issue is caught during development; update the fallback branch in
_normalize_for_json to perform this type check and raise/serialize accordingly.
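
The failure mode is easy to reproduce in isolation. The sketch below uses hypothetical stand-ins (`Opaque`, `normalize`, `stable_key`) rather than the PR's _normalize_for_json, and shows both a key that is stable under dict reordering and the rejection of a value whose default repr embeds a memory address.

```python
import hashlib
import json


class Opaque:
    """No custom __repr__, so repr() includes the object's memory address."""


def normalize(value):
    """Reduce a value to JSON-friendly data, rejecting address-bearing reprs."""
    if isinstance(value, dict):
        return {str(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    rendered = repr(value)
    if " object at 0x" in rendered:
        raise TypeError(f"unstable repr for {type(value).__name__}")
    return rendered


def stable_key(value) -> str:
    blob = json.dumps(normalize(value), sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


key_a = stable_key({"flags": ["-O3"], "opt": 2})
key_b = stable_key({"opt": 2, "flags": ["-O3"]})   # same content, different order

try:
    stable_key({"cfg": Opaque()})
    rejected = False
except TypeError:
    rejected = True
```

Failing loudly trades a silent, permanent cache miss for an error at the first call, which is usually easier to notice during development.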
tilelang/jit/__init__.py (3)

480-480: 💤 Low value

Document the not kernel_args gate.

The frontend cache is only consulted in lazy mode with empty kernel_args. The reasoning (lazy-mode call boundaries with no specialized tensor args are the only case where the frontend key uniquely identifies a compiled kernel) isn't obvious to future readers. A one-line comment would save someone the same investigation:

📝 Suggested comment
-            if self.mode == "lazy" and not kernel_args:
+            # Frontend cache assumes the key fully determines the compiled kernel.
+            # In lazy mode with no kernel_args, parse_args has folded all
+            # compile-time inputs into `key`, so it's safe to consult disk.
+            if self.mode == "lazy" and not kernel_args:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/jit/__init__.py` at line 480, Add a one-line explanatory comment
above the condition that checks "if self.mode == 'lazy' and not kernel_args"
explaining that the frontend cache is only safe to consult in lazy mode when
there are no kernel_args because only in that case the frontend cache key (based
on call site and other non-specialized info) uniquely identifies the compiled
kernel; reference the symbols self.mode and kernel_args in the comment so
readers know why the gate exists.

478-512: 💤 Low value

Frontend cache wiring looks correct; one robustness nit.

The lazy-mode-only gate, the in-process _kernel_cache short-circuit, and the load → compile → store fallback chain all behave correctly. Two things worth confirming:

  1. Cold-path store depends on _tilelang_cache_key tag. store_frontend_cache is only called when kernel._tilelang_cache_key is truthy (Line 497-498). Since KernelCache._tag_kernel_cache_entry swallows all exceptions (tilelang/cache/kernel_cache.py:411-415), a tagging failure produces a kernel that never gets registered in the frontend cache — every subsequent process will recompile. If you adopt the suggestion to log tagging failures, that path becomes diagnosable.

  2. Disabled cache. When env.is_cache_enabled() is False, load_frontend_cached returns None (good) and store_frontend_cache is a no-op (good), but self.compile(...) still goes through tilelang.cache.cached(), which on a disabled cache builds a JITKernel directly with no _tilelang_cache_key attribute. The getattr(..., None) guard handles this cleanly, so no action required — just flagging that the frontend cache correctly degrades to "no-op" when the underlying disk cache is off.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/jit/__init__.py` around lines 478 - 512, The
KernelCache._tag_kernel_cache_entry currently swallows exceptions which causes
kernels to lack the _tilelang_cache_key and prevents store_frontend_cache from
ever saving them; update KernelCache._tag_kernel_cache_entry to catch exceptions
but log full error details (exception message and traceback) including
identifying info about the kernel (e.g., kernel type/name/args) using the
project logger so tagging failures are diagnosable, and keep the existing
behavior (don’t rethrow unless needed); this will make missing frontend cache
entries visible when load_frontend_cached/store_frontend_cache and the
getattr(kernel, "_tilelang_cache_key", None) guard in the JIT path interact with
compile().
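
A minimal sketch of the logged-tagging idea, not the PR's actual code. `tag_kernel_cache_entry`, `PlainKernel`, and `SlottedKernel` are hypothetical stand-ins; only the attribute names `_tilelang_cache_key` and `_tilelang_cache_path` and the `getattr(..., None)` guard come from the review text.

```python
import logging

logger = logging.getLogger(__name__)


def tag_kernel_cache_entry(kernel, key, cache_path):
    """Attach cache metadata to a kernel; return True on success.

    Hypothetical stand-in for KernelCache._tag_kernel_cache_entry.
    """
    try:
        kernel._tilelang_cache_key = key
        kernel._tilelang_cache_path = cache_path
    except (AttributeError, TypeError):
        # Tagging stays best-effort, but the failure is now visible in logs.
        logger.debug(
            "Failed to tag %s with cache key %r (path %r)",
            type(kernel).__name__, key, cache_path, exc_info=True,
        )
        return False
    return True


class PlainKernel:
    """Accepts arbitrary attributes, so tagging succeeds."""


class SlottedKernel:
    """Rejects new attributes, so tagging fails with AttributeError."""
    __slots__ = ()


tagged = PlainKernel()
tagged_ok = tag_kernel_cache_entry(tagged, "abc123", "/tmp/cache/abc123")
untagged = SlottedKernel()
untagged_ok = tag_kernel_cache_entry(untagged, "abc123", "/tmp/cache/abc123")

# Caller-side guard from the review: an untagged kernel yields None here
# and is simply not registered in the frontend cache.
frontend_key_source = getattr(untagged, "_tilelang_cache_key", None)
```

With debug logging enabled, the second call leaves a traceback in the log while the caller still degrades to "no frontend cache entry" rather than failing.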

434-446: 💤 Low value

repr(key) in frontend key data can become non-deterministic with custom object arguments.

When a @jit kernel accepts custom objects (dataclasses, config objects, custom types) as compile-time parameters in lazy mode with no tensor arguments, repr(key) in _frontend_cache_key_data will include each object's memory address (from its default __repr__). This makes the frontend cache key process-unique, causing the cache to silently never hit.

The gate at line 480 (if self.mode == "lazy" and not kernel_args:) already constrains this to lazy-mode kernels with compile-time parameters only, limiting exposure. In practice, most @jit calls pass scalars or built-in types with stable repr(). However, the risk exists for any custom object argument.

Recommended: Document this constraint explicitly next to the frontend cache gate so future callers understand that compile-time parameters must have deterministic repr() representations to participate in cross-process caching.

Note: _normalize_for_json cannot prevent this since it also falls back to repr() for unsupported types (line 8 of tilelang/cache/__init__.py).
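
To illustrate the concern, the sketch below compares the default `repr()` with a dataclass-generated one. `PlainConfig` and `StableConfig` are hypothetical names chosen for this example; they do not appear in the PR.

```python
from dataclasses import dataclass


class PlainConfig:
    """No __repr__: the default repr embeds the object's memory address."""

    def __init__(self, block_m):
        self.block_m = block_m


@dataclass(frozen=True)
class StableConfig:
    """Dataclasses generate a value-based __repr__."""
    block_m: int


# Two equal-looking objects that are alive at the same time.
a, b = PlainConfig(128), PlainConfig(128)
plain_reprs_match = repr(a) == repr(b)

# Value-based reprs are identical for equal field values.
stable_reprs_match = repr(StableConfig(128)) == repr(StableConfig(128))
stable_repr = repr(StableConfig(128))
```

Because the default `repr()` depends on the address, a key derived from it would generally differ between processes, whereas the dataclass form depends only on field values.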

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/jit/__init__.py` around lines 434 - 446, The frontend cache key uses
repr(key) in _frontend_cache_key_data which can include object memory addresses
and thus be non-deterministic for custom objects; add an explicit inline comment
and/or docstring next to the lazy-mode gate (the if self.mode == "lazy" and not
kernel_args: check) stating that cross-process frontend caching only works when
compile-time parameters have deterministic repr()/string representations (e.g.,
builtins, dataclasses with stable __repr__, or provide a stable key), and advise
callers to ensure custom config objects implement a stable __repr__ or to avoid
relying on cross-process cache for such kernels; reference the
_frontend_cache_key_data function and the lazy-mode gate in the comment so
future maintainers see the constraint.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tilelang/cache/__init__.py`:
- Around line 77-97: The cache key uses str(target) and str(target_host) which
can vary across TVM versions; instead serialize the Target(s) with a stable
export/JSON representation (use TVM's TargetExport/JSON API or
Target.export()/to_json()-style method) and replace the "target" and
"target_host" entries in _make_frontend_cache_key with those stable serialized
values; ensure you still handle None and the case where target_host may be a
string vs a Target object by detecting the type and exporting Targets
consistently before passing the resulting JSON/string into the existing
_normalize_for_json flow.

---

Nitpick comments:
In `@tilelang/cache/__init__.py`:
- Around line 128-183: load_frontend_cached and store_frontend_cache recompute
the normalized target and frontend cache key causing duplicate work on a cache
miss; change the API so load_frontend_cached also returns the computed
frontend_key (from _make_frontend_cache_key) along with the JITKernel (e.g.,
return (kernel, frontend_key) or similar) and update callers (notably
tilelang/jit/__init__.py:__call__) to pass that returned frontend_key into
store_frontend_cache (which should accept an optional precomputed frontend_key)
so _resolve_cache_dispatch and _make_frontend_cache_key are only executed once
on the cold path.
- Around line 32-39: The current _normalize_for_json uses repr(value) as a
fallback which can produce non-deterministic strings (object memory addresses)
and break cache keys; change _normalize_for_json to avoid repr() for arbitrary
objects: for known stable types (e.g., tvm.runtime.Object, Target, or any types
you know have stable str()/repr()) explicitly convert with str(value) or a
type-specific serializer, and for all other unsupported types raise a TypeError
(or warnings.warn in non-strict mode) with a clear message including the
offending type and value so the issue is caught during development; update the
fallback branch in _normalize_for_json to perform this type check and
raise/serialize accordingly.
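
One possible shape for the stricter fallback described in the last item is sketched below. `normalize_for_json` here is a hypothetical free function written for illustration; the set of accepted types is an assumption, and the real `_normalize_for_json` may handle more cases (for example TVM objects).

```python
import json


def normalize_for_json(value):
    """Hypothetical strict variant: reject types without a stable encoding."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, (list, tuple)):
        return [normalize_for_json(item) for item in value]
    if isinstance(value, dict):
        # Stringify keys so the result is always JSON-serializable.
        return {str(k): normalize_for_json(v) for k, v in value.items()}
    raise TypeError(
        f"Cannot build a stable cache key from {type(value).__name__}: {value!r}"
    )


key_data = normalize_for_json({"block": (128, 64), "dtype": "float16"})
key_json = json.dumps(key_data, sort_keys=True)

try:
    normalize_for_json({"config": object()})
    rejected = False
except TypeError:
    rejected = True
```

Raising makes an unstable key a development-time error; a warning-based variant would trade that strictness for backward compatibility.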

In `@tilelang/cache/kernel_cache.py`:
- Around line 371-381: The write to self._memory_cache happens outside the with
self._lock block and can race with other threads; move the assignment of
self._memory_cache[key] = kernel back under the lock in the function that calls
_get_cache_path/_tag_kernel_cache_entry (the same block that calls
_save_kernel_to_disk and _set_adapter_cache_path), and perform a
setdefault-style check (e.g., if key not in self._memory_cache:
self._memory_cache[key] = kernel) so you don’t overwrite a kernel another thread
loaded under the lock—if an existing kernel is found, discard the newly compiled
one and return the existing entry to preserve the deduplication invariant.
- Around line 409-415: The silent try/except in _tag_kernel_cache_entry hides
failures when setting kernel._tilelang_cache_key/_tilelang_cache_path; change it
to only catch AttributeError and TypeError, and log the failure at debug level
(include the key, cache_path and exception info) using a module logger (e.g.
logging.getLogger(__name__)). Do not swallow other exceptions—let them
propagate—so that real errors are visible; keep the assignments in the same
function (_tag_kernel_cache_entry) and ensure
JITImpl.__call__/store_frontend_cache can rely on the attributes when present.
- Around line 663-674: store_frontend_cache currently swallows failures from
KernelCache._safe_write_file when verbose is False, which hides cross-filesystem
EXDEV errors from os.replace if TILELANG_TMP_DIR is on a different mount; modify
the write logic in store_frontend_cache (and similarly in other callers of
KernelCache._safe_write_file) to catch OSError with errno.EXDEV and retry by
creating a temp file in the same directory as the target (use
os.path.dirname(frontend_path) or equivalent), write the payload there and then
os.replace to achieve atomic move on the same filesystem; for other exceptions,
ensure at least a single self.logger.error/exception is emitted (not only when
verbose is True) so failures are visible.
- Around line 488-499: Add an end-to-end test that verifies real
tvm.tir.PrimFunc objects survive the full pickle round-trip: construct a genuine
PrimFunc (e.g., using the T.prim_func decorator or creating tvm.tir.PrimFunc
directly), call KernelCache._save_kernel_to_disk (so the code path that writes
prim_func.pkl runs), then call KernelCache.load_frontend_cached to load the
cache and assert the returned entry contains a usable PrimFunc (check type is
tvm.tir.PrimFunc and basic properties or that it can be used in the same TIR
contexts). Place this alongside the existing tests (e.g., next to
test_kernel_cache_frontend_hit_loads_serialized_prim_func) so it exercises the
real serialization/deserialization paths without mocking load_frontend_cached.
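
For the `_safe_write_file` item above (lines 663-674), a minimal sketch of writing through a temp file created beside the target follows. `atomic_write_json` is a hypothetical helper and not the PR's implementation; it only shows why keeping the temp file in the target directory sidesteps a cross-filesystem `os.replace`.

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write JSON via a temp file created next to the target.

    Hypothetical helper, not KernelCache._safe_write_file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as file:
            json.dump(payload, file, sort_keys=True)
        # Source and destination share a directory, so the rename
        # stays on one filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything fails.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "frontend", "entry.json")
    atomic_write_json(target, {"kernel_key": "abc123"})
    with open(target, encoding="utf-8") as file:
        written = json.load(file)
    leftovers = [
        name for name in os.listdir(os.path.dirname(target))
        if name.endswith(".tmp")
    ]
```

The helper re-raises on failure, leaving the caller to decide whether to log or ignore the error.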

In `@tilelang/jit/__init__.py`:
- Line 480: Add a one-line explanatory comment above the condition that checks
"if self.mode == 'lazy' and not kernel_args" explaining that the frontend cache
is only safe to consult in lazy mode when there are no kernel_args because only
in that case the frontend cache key (based on call site and other
non-specialized info) uniquely identifies the compiled kernel; reference the
symbols self.mode and kernel_args in the comment so readers know why the gate
exists.
- Around line 478-512: The KernelCache._tag_kernel_cache_entry currently
swallows exceptions which causes kernels to lack the _tilelang_cache_key and
prevents store_frontend_cache from ever saving them; update
KernelCache._tag_kernel_cache_entry to catch exceptions but log full error
details (exception message and traceback) including identifying info about the
kernel (e.g., kernel type/name/args) using the project logger so tagging
failures are diagnosable, and keep the existing behavior (don’t rethrow unless
needed); this will make missing frontend cache entries visible when
load_frontend_cached/store_frontend_cache and the getattr(kernel,
"_tilelang_cache_key", None) guard in the JIT path interact with compile().
- Around line 434-446: The frontend cache key uses repr(key) in
_frontend_cache_key_data which can include object memory addresses and thus be
non-deterministic for custom objects; add an explicit inline comment and/or
docstring next to the lazy-mode gate (the if self.mode == "lazy" and not
kernel_args: check) stating that cross-process frontend caching only works when
compile-time parameters have deterministic repr()/string representations (e.g.,
builtins, dataclasses with stable __repr__, or provide a stable key), and advise
callers to ensure custom config objects implement a stable __repr__ or to avoid
relying on cross-process cache for such kernels; reference the
_frontend_cache_key_data function and the lazy-mode gate in the comment so
future maintainers see the constraint.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a1e1cfa1-8dc0-4939-b6eb-d019398f1a3c

📥 Commits

Reviewing files that changed from the base of the PR and between 675a21e and 30c28ed.

📒 Files selected for processing (10)
  • testing/python/cache/test_tilelang_kernel_cache_atomic_save.py
  • tilelang/cache/__init__.py
  • tilelang/cache/kernel_cache.py
  • tilelang/jit/__init__.py
  • tilelang/jit/adapter/base.py
  • tilelang/jit/adapter/cutedsl/adapter.py
  • tilelang/jit/adapter/cython/adapter.py
  • tilelang/jit/adapter/nvrtc/adapter.py
  • tilelang/jit/adapter/tvm_ffi.py
  • tilelang/jit/kernel.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • tilelang/jit/adapter/cutedsl/adapter.py
  • tilelang/jit/adapter/base.py
  • tilelang/jit/adapter/cython/adapter.py
  • tilelang/jit/adapter/tvm_ffi.py
  • tilelang/jit/adapter/nvrtc/adapter.py
  • testing/python/cache/test_tilelang_kernel_cache_atomic_save.py
  • tilelang/jit/kernel.py

Comment thread tilelang/cache/__init__.py
sepcnt and others added 4 commits May 13, 2026 16:38
…ling

- Introduced a new test to validate the round-trip functionality of real primitive functions in the kernel cache.
- Enhanced the memory cache logic to prevent overwriting existing entries.
- Improved error handling when tagging kernel cache entries with logging for better debugging.
Guard memory-cache writes after compile, make cache tagging failures diagnosable, and document the lazy frontend-cache gate.

Wrap cached source text/path in CachedTextSource and add a real PrimFunc frontend-cache round-trip regression test.
…-lazy-source-load

# Conflicts:
#	tilelang/jit/adapter/base.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tilelang/jit/adapter/cutedsl/adapter.py (1)

152-152: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Store plain source text in kernel_global_source.

Line 152 stores a CachedTextSource object, but downstream callers treat kernel_global_source as str | None (e.g., source export paths). This can surface as type errors when writing source text.

Suggested fix
-        adapter.kernel_global_source = device_kernel_source
+        adapter.kernel_global_source = device_kernel_source.text
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/jit/adapter/cutedsl/adapter.py` at line 152, The code assigns a
CachedTextSource object to adapter.kernel_global_source but downstream code
expects a plain string (str | None); change the assignment at
adapter.kernel_global_source = device_kernel_source to store the underlying
source text instead (e.g., device_kernel_source.text or
str(device_kernel_source) / device_kernel_source.get_text()) so
kernel_global_source is a plain str or None and avoids type errors.
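
The sketch below shows the distinction at issue: a wrapper that defers the disk read versus the plain text it yields. `LazyTextSource` is a hypothetical stand-in for `CachedTextSource`; only the `.text` accessor is taken from the suggested fix, and the `reads` counter exists purely for illustration.

```python
import os
import tempfile


class LazyTextSource:
    """Hypothetical stand-in for CachedTextSource: holds a path, reads on demand."""

    def __init__(self, path=None, text=None):
        self._path = path
        self._text = text
        self.reads = 0  # counts disk reads, for illustration only

    @property
    def text(self):
        if self._text is None and self._path is not None:
            with open(self._path, encoding="utf-8") as file:
                self._text = file.read()
            self.reads += 1
        return self._text


with tempfile.TemporaryDirectory() as root:
    source_path = os.path.join(root, "kernel.cu")
    with open(source_path, "w", encoding="utf-8") as file:
        file.write("__global__ void main_kernel() {}")

    source = LazyTextSource(path=source_path)
    reads_before = source.reads          # nothing is read at construction
    kernel_global_source = source.text   # plain str for str | None consumers
    _ = source.text                      # second access reuses the cached text
    reads_after = source.reads
```

Note that unwrapping with `.text` at assignment time triggers the read immediately, so whether that cost is acceptable on this path is a design question for the adapter.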
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tilelang/cache/kernel_cache.py`:
- Around line 620-632: The code assumes frontend_entry is a dict after json.load
and calls frontend_entry.get("kernel_key"), which will raise AttributeError for
non-object JSON (string/list); update the logic in the load path that uses
_get_frontend_cache_path and json.load to guard payloads by checking
isinstance(frontend_entry, dict) (or mapping) before accessing .get and return
None on non-dict payloads so malformed cache files become cache misses instead
of hard failures; keep the existing exception handling and logger.exception call
intact.
- Around line 671-683: store_frontend_cache adds files under the frontend/ tree
but _clear_disk_cache doesn’t remove frontend entries, leaving stale mappings
after clear_cache; update the cache-clearing logic (the method named
_clear_disk_cache used by clear_cache) to also remove the frontend directory
alongside kernels/ and .staging (use the same directory resolution as
_get_frontend_cache_path or KernelCache._create_dirs and perform a safe
recursive delete, e.g. check existence then remove the frontend folder or its
contents to avoid leaving orphaned frontend metadata).

In `@tilelang/jit/adapter/tvm_ffi.py`:
- Around line 319-322: The code assumes rt_mod.imports[0] exists and will raise
IndexError when imports is empty; update the device-source fallback in the
method that reads rt_mod to check that rt_mod.imports is truthy and has at least
one element before calling inspect_source() on imports[0]; if imports is empty,
return None. Refer to rt_mod, its imports attribute, and the call to
inspect_source() when making the guard.

---

Outside diff comments:
In `@tilelang/jit/adapter/cutedsl/adapter.py`:
- Line 152: The code assigns a CachedTextSource object to
adapter.kernel_global_source but downstream code expects a plain string (str |
None); change the assignment at adapter.kernel_global_source =
device_kernel_source to store the underlying source text instead (e.g.,
device_kernel_source.text or str(device_kernel_source) /
device_kernel_source.get_text()) so kernel_global_source is a plain str or None
and avoids type errors.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6678800c-98f8-437c-b406-fd4ce6cf63be

📥 Commits

Reviewing files that changed from the base of the PR and between d279b94 and f9a54ea.

📒 Files selected for processing (10)
  • testing/python/cache/test_tilelang_kernel_cache_atomic_save.py
  • tilelang/autotuner/param.py
  • tilelang/cache/kernel_cache.py
  • tilelang/jit/adapter/__init__.py
  • tilelang/jit/adapter/base.py
  • tilelang/jit/adapter/cutedsl/adapter.py
  • tilelang/jit/adapter/cython/adapter.py
  • tilelang/jit/adapter/nvrtc/adapter.py
  • tilelang/jit/adapter/tvm_ffi.py
  • tilelang/jit/kernel.py
✅ Files skipped from review due to trivial changes (1)
  • tilelang/jit/adapter/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tilelang/jit/adapter/base.py
  • testing/python/cache/test_tilelang_kernel_cache_atomic_save.py

Comment on lines +620 to +632
        frontend_path = self._get_frontend_cache_path(frontend_key)
        try:
            with open(frontend_path, encoding="utf-8") as file:
                frontend_entry = json.load(file)
        except OSError:
            return None
        except Exception:
            self.logger.exception("Error loading frontend cache entry")
            return None

        key = frontend_entry.get("kernel_key")
        if not isinstance(key, str) or not key:
            return None

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard non-dict frontend index payloads.

json.load() can return any JSON type. If this file contains a valid JSON string/list instead of an object, frontend_entry.get("kernel_key") raises AttributeError and turns a corrupt cache index into a hard failure instead of a cache miss.

Suggested fix
         try:
             with open(frontend_path, encoding="utf-8") as file:
                 frontend_entry = json.load(file)
         except OSError:
             return None
         except Exception:
             self.logger.exception("Error loading frontend cache entry")
             return None

+        if not isinstance(frontend_entry, dict):
+            return None
+
         key = frontend_entry.get("kernel_key")
         if not isinstance(key, str) or not key:
             return None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 620 - 632, The code assumes
frontend_entry is a dict after json.load and calls
frontend_entry.get("kernel_key"), which will raise AttributeError for non-object
JSON (string/list); update the logic in the load path that uses
_get_frontend_cache_path and json.load to guard payloads by checking
isinstance(frontend_entry, dict) (or mapping) before accessing .get and return
None on non-dict payloads so malformed cache files become cache misses instead
of hard failures; keep the existing exception handling and logger.exception call
intact.
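
A self-contained sketch of the guarded load path, under the assumption that every unusable index file should count as a cache miss. `load_kernel_key` is a hypothetical free function; unlike the reviewed code it does not log, and it folds invalid JSON into the same `None` result.

```python
import json
import os
import tempfile


def load_kernel_key(frontend_path):
    """Return the stored kernel key, or None for any unusable index file.

    Hypothetical free-function version of the load path under review.
    """
    try:
        with open(frontend_path, encoding="utf-8") as file:
            entry = json.load(file)
    except (OSError, ValueError):
        return None  # missing file or invalid JSON: treat as a miss
    if not isinstance(entry, dict):
        return None  # valid JSON, wrong shape (list, string, number)
    key = entry.get("kernel_key")
    return key if isinstance(key, str) and key else None


def _write(path, text):
    with open(path, "w", encoding="utf-8") as file:
        file.write(text)


with tempfile.TemporaryDirectory() as root:
    good = os.path.join(root, "good.json")
    as_list = os.path.join(root, "list.json")
    broken = os.path.join(root, "broken.json")
    missing = os.path.join(root, "missing.json")
    _write(good, '{"kernel_key": "abc123"}')
    _write(as_list, '["abc123"]')
    _write(broken, "{not json")

    results = [load_kernel_key(p) for p in (good, as_list, broken, missing)]
```

Only the well-formed object yields a key; the list payload, the malformed file, and the missing file all fall through to `None`.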

Comment on lines +671 to 683
    def store_frontend_cache(self, frontend_key: str, kernel_key: str, *, verbose: bool = False) -> None:
        if not env.is_cache_enabled():
            return

        KernelCache._create_dirs()
        frontend_path = self._get_frontend_cache_path(frontend_key)
        payload = {"kernel_key": kernel_key}
        try:
            KernelCache._safe_write_file(frontend_path, "w", lambda file: json.dump(payload, file, sort_keys=True))
        except Exception:
            if verbose:
                self.logger.exception("Error saving frontend cache entry")

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clear the frontend index with the rest of the cache.

This now persists extra state under frontend/, but _clear_disk_cache() still only removes kernels/ and .staging. After clear_cache(), stale frontend mappings remain on disk, which breaks the method contract and leaves orphaned cache metadata behind.

Suggested fix
     def _clear_disk_cache(self):
         try:
             shutil.rmtree(self._get_cache_root(), ignore_errors=True)
+            shutil.rmtree(self._get_frontend_cache_root(), ignore_errors=True)
             shutil.rmtree(self._get_staging_root(), ignore_errors=True)

             KernelCache._create_dirs()
         except Exception:
             self.logger.exception("Error clearing disk cache")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/cache/kernel_cache.py` around lines 671 - 683, store_frontend_cache
adds files under the frontend/ tree but _clear_disk_cache doesn’t remove
frontend entries, leaving stale mappings after clear_cache; update the
cache-clearing logic (the method named _clear_disk_cache used by clear_cache) to
also remove the frontend directory alongside kernels/ and .staging (use the same
directory resolution as _get_frontend_cache_path or KernelCache._create_dirs and
perform a safe recursive delete, e.g. check existence then remove the frontend
folder or its contents to avoid leaving orphaned frontend metadata).
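
The clearing behavior can be sketched in isolation as follows. `clear_disk_cache` is a hypothetical free function, and the subdirectory names `kernels`, `frontend`, and `.staging` are assumptions taken from the review text rather than from the code.

```python
import os
import shutil
import tempfile


def clear_disk_cache(cache_dir, subdirs=("kernels", "frontend", ".staging")):
    """Remove and recreate every cache subtree, including the frontend index.

    The directory names are assumptions taken from the review text.
    """
    for name in subdirs:
        shutil.rmtree(os.path.join(cache_dir, name), ignore_errors=True)
    for name in subdirs:
        os.makedirs(os.path.join(cache_dir, name), exist_ok=True)


with tempfile.TemporaryDirectory() as cache_dir:
    for name in ("kernels", "frontend", ".staging"):
        os.makedirs(os.path.join(cache_dir, name))
    stale = os.path.join(cache_dir, "frontend", "stale.json")
    with open(stale, "w", encoding="utf-8") as file:
        file.write('{"kernel_key": "gone"}')

    clear_disk_cache(cache_dir)
    stale_survived = os.path.exists(stale)
    frontend_entries = os.listdir(os.path.join(cache_dir, "frontend"))
```

Driving the removal from one tuple of subdirectories keeps any future cache subtree from being forgotten in the clear path.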

Comment on lines +319 to +322
        rt_mod = getattr(self, "rt_mod", None)
        if rt_mod is None:
            return None
        return rt_mod.imports[0].inspect_source()

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard empty runtime imports in device-source fallback.

Line 322 assumes rt_mod.imports[0] exists. If imports are empty, source inspection raises IndexError instead of returning None.

Suggested fix
-        return rt_mod.imports[0].inspect_source()
+        if not getattr(rt_mod, "imports", None):
+            return None
+        return rt_mod.imports[0].inspect_source()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tilelang/jit/adapter/tvm_ffi.py` around lines 319 - 322, The code assumes
rt_mod.imports[0] exists and will raise IndexError when imports is empty; update
the device-source fallback in the method that reads rt_mod to check that
rt_mod.imports is truthy and has at least one element before calling
inspect_source() on imports[0]; if imports is empty, return None. Refer to
rt_mod, its imports attribute, and the call to inspect_source() when making the
guard.
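
The guard can be exercised without TVM by stubbing the runtime module. In the sketch below, `device_source` is a hypothetical free function and the `SimpleNamespace` objects are test doubles; only `rt_mod`, `imports`, and `inspect_source()` come from the reviewed code.

```python
from types import SimpleNamespace


def device_source(adapter):
    """Return device source text, or None when it cannot be inspected.

    Hypothetical free function; `adapter` only needs an optional `rt_mod`.
    """
    rt_mod = getattr(adapter, "rt_mod", None)
    if rt_mod is None:
        return None
    imports = getattr(rt_mod, "imports", None)
    if not imports:
        return None  # no imported device module to inspect
    return imports[0].inspect_source()


# Test doubles standing in for a TVM runtime module and its imports.
device_mod = SimpleNamespace(inspect_source=lambda: "// device code")
with_import = SimpleNamespace(rt_mod=SimpleNamespace(imports=[device_mod]))
no_imports = SimpleNamespace(rt_mod=SimpleNamespace(imports=[]))
no_module = SimpleNamespace()

sources = [device_source(a) for a in (with_import, no_imports, no_module)]
```

Both degenerate cases (no `rt_mod`, empty `imports`) return `None` instead of raising.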
