Commit d5f8c19

feat(core.utils): add persistent program caches (sqlite + filestream)
Convert cuda.core.utils to a package and add persistent, on-disk caches for compiled ObjectCode produced by Program.compile.

Public API (cuda.core.utils):

* ProgramCacheResource -- abstract bytes|str -> ObjectCode mapping with context manager and pickle-safety warning. Path-backed ObjectCode is rejected at write time (would store only the path).
* SQLiteProgramCache -- single-file sqlite3 backend (WAL mode, autocommit) with LRU eviction against an optional size cap. A threading.RLock serialises connection use so one cache object is safe across threads. wal_checkpoint(TRUNCATE) + VACUUM run after evictions so the size cap bounds real on-disk usage, not just logical payload. Schema-version mismatch on open wipes entries.
* FileStreamProgramCache -- directory of atomically-written entries (tmp + os.replace) safe across concurrent processes, with best-effort size enforcement by mtime. Windows-only PermissionError from os.replace is swallowed as a cache miss; other platforms re-raise. Schema-version mismatch on open wipes entries.
* make_program_cache_key -- stable 32-byte blake2b key over code, code_type, ProgramOptions (including options.name), target_type, name expressions (normalised str/bytes), cuda core/driver/NVRTC versions, linker backend+version for PTX inputs, NVVM-specific fields (extra_sources, use_libdevice), and an optional extra_digest that callers MUST supply when options pull in external file content (include_path, pre_include, pch, use_pch, pch_dir).

sqlite3 is imported lazily so the package is usable on interpreters built without libsqlite3.

Tests: single-process CRUD, LRU/size-cap (logical and on-disk), corruption, schema-mismatch, threaded access (SQLite), multiprocess stress (FileStream), Windows vs POSIX PermissionError behaviour, and an end-to-end test that compiles a real CUDA C++ kernel, stores the ObjectCode, reopens the cache, and calls get_kernel on the deserialised copy.

Public API is documented in cuda_core/docs/source/api.rst.
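The "tmp + os.replace" pattern named for FileStreamProgramCache can be sketched as follows. This is only an illustration of the general technique, not the code from this commit: the helper name `atomic_write_bytes` and the `.tmp-` prefix are hypothetical, and the commit's Windows-specific PermissionError handling is deliberately left out.

```python
import os
import tempfile


def atomic_write_bytes(path, payload):
    """Hypothetical helper: write payload so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the destination so os.replace stays on
    # one filesystem; a rename across filesystems would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before publishing
        # Publish: the destination is swapped to the new content in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the rename completed; drop the temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A concurrent reader that opens `path` sees either the previous entry or the complete new one, which is why this pattern suits a cache shared between processes.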
1 parent 56b53ca commit d5f8c19

6 files changed: +2402 -8 lines changed

cuda_core/cuda/core/utils.py

Lines changed: 0 additions & 8 deletions
This file was deleted.
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# SPDX-License-Identifier: Apache-2.0

from cuda.core._memoryview import (
    StridedMemoryView,
    args_viewable_as_strided_memory,
)

__all__ = [
    "FileStreamProgramCache",
    "ProgramCacheResource",
    "SQLiteProgramCache",
    "StridedMemoryView",
    "args_viewable_as_strided_memory",
    "make_program_cache_key",
]

# Lazily expose the program-cache APIs so ``from cuda.core.utils import
# StridedMemoryView`` stays lightweight -- the cache backends pull in driver,
# NVRTC, and module-load machinery that memoryview-only consumers do not need.
_LAZY_CACHE_ATTRS = frozenset(
    {
        "FileStreamProgramCache",
        "ProgramCacheResource",
        "SQLiteProgramCache",
        "make_program_cache_key",
    }
)


def __getattr__(name):
    if name in _LAZY_CACHE_ATTRS:
        from cuda.core.utils import _program_cache

        value = getattr(_program_cache, name)
        globals()[name] = value  # cache for subsequent accesses
        return value
    raise AttributeError(f"module 'cuda.core.utils' has no attribute {name!r}")


def __dir__():
    return sorted(__all__)
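
The module-level `__getattr__` above relies on the PEP 562 hook: Python calls it only when normal attribute lookup on the module fails. The mechanism can be tried in isolation with a throwaway module object. Everything in this sketch is hypothetical (the `lazy_demo` name, the `_LAZY` table, and the use of `json.dumps` as the lazily loaded attribute); it only mirrors the shape of the code in the diff.

```python
import importlib
import types

# Stand-in for a package's __init__ module; the name is made up for the demo.
lazy = types.ModuleType("lazy_demo")

# Attribute name -> module that really provides it (an assumption of this demo).
_LAZY = {"dumps": "json"}


def _module_getattr(name):
    # Called only when `name` is not already in the module's namespace.
    if name in _LAZY:
        value = getattr(importlib.import_module(_LAZY[name]), name)
        lazy.__dict__[name] = value  # cache so later lookups bypass this hook
        return value
    raise AttributeError(f"module 'lazy_demo' has no attribute {name!r}")


# PEP 562: a function stored as __getattr__ in a module's namespace is
# consulted when ordinary attribute lookup on that module fails.
lazy.__getattr__ = _module_getattr
```

Before the first access `"dumps"` is absent from `vars(lazy)`; after `lazy.dumps(...)` has been called once it is present, so the hook is not invoked again for that name. Unknown names still raise AttributeError, which keeps `hasattr` and `from ... import` behaving as usual.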
