Feature Request: Per‑Decorator, Per‑Core Maximum Entry Size
Context
cachier supports several cache cores (Pickle, MongoDB, Memory, SQLAlchemy, Redis) today and lists multi‑core caching on its public roadmap.
While an upcoming feature (size_limit, discussed in a separate issue) will cap the total cache size, there is currently no way to prevent a single, giant return value from being stored in a given core.
Why This Matters
- Memory pressure – Accidentally caching a 5 GB model in the in‑process Memory core can crash the interpreter.
- I/O & bandwidth – Storing huge blobs in a networked core (MongoDB / Redis) is slow and expensive.
- Tiered caching – Once multi‑core caching lands, users will want small objects to stay in fast tiers (RAM, local disk) while large objects skip directly to slower but capacious tiers.
Desired Behaviour
- A decorator can declare a maximum entry size (`entry_size_limit`) so that individual calls exceeding the limit are not cached by the core in use (internally this argument is passed to the core).
- Limits apply before the core writes; rejected entries return without caching.
Proposed API
```python
from cachier import cachier

@cachier(entry_size_limit="200MB")
def load_dataset(name: str):
    ...
```
- `entry_size_limit` accepts integers (bytes) or human‑readable strings (`"256KB"`, `"2GB"`).
- If the decorator refuses an entry, the function still returns the value, but nothing is cached; on individual decorated function calls where `cachier__verbose` was set to `True`, an informative message will be printed.
Implementation Sketch
- Size Estimation
- MemoryCore: use `sys.getsizeof` plus a heuristic deep‑size walker.
- PickleCore / SQLCore: compute the byte length of the serialized blob before writing.
- MongoDB / Redis: rely on the driver to abort early if the payload exceeds the limit.
- Core API Change
- Add an optional `entry_size_limit` to every `BaseCore`.
- On `set(key, value)` the core returns `True` (stored) or `False` (skipped).
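The proposed contract could look roughly like this; `BaseCore` and `_write` here are stand‑ins for cachier's internals, and pickling is just one possible size probe:

```python
import pickle

class BaseCore:
    """Stand-in for cachier's core base class, for illustration only."""

    def __init__(self, entry_size_limit=None):
        self.entry_size_limit = entry_size_limit  # None means unlimited

    def _write(self, key, blob):
        raise NotImplementedError  # backend-specific storage

    def set(self, key, value):
        """Return True if the entry was stored, False if skipped."""
        blob = pickle.dumps(value)
        if self.entry_size_limit is not None and len(blob) > self.entry_size_limit:
            return False  # too big: caller returns the value uncached
        self._write(key, blob)
        return True
```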
- Decorator Logic
- Iterate through the configured cores, calling `core.set()`.
- Stop at the first successful store; otherwise return the uncached result.
- Backwards Compatibility
- Default `entry_size_limit=None` means unlimited (today's behaviour).
- Existing decorators work unchanged.
Interaction with Multi‑Core Caching
| Scenario | Benefit of Entry‑Size Limits |
| --- | --- |
| Hot path (≤10 MB) | Cached in RAM → fastest retrieval. |
| Medium objects (10 MB–200 MB) | Skip RAM, land on the Pickle core (local SSD). |
| Very large objects (>200 MB) | Skip local tiers, optionally fall back to a cloud storage core (S3, future). |
Thus, size gates guarantee that upper tiers stay lean, avoiding the need for expensive eviction cycles.
Open Questions
- Should a violation raise a custom exception or silently skip with a log message?
- Is a single walk with `cloudpickle.dumps()` sufficient for size estimation across all cores?
- How should users discover the actual serialized size (debug utility)?
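On the second question, a quick experiment shows why a single serialized‑size probe may not match the in‑memory footprint (stdlib `pickle` stands in for `cloudpickle` here):

```python
import pickle
import sys

data = list(range(1000))
serialized = len(pickle.dumps(data))
# Rough in-memory footprint: the list object plus each int it holds.
in_memory = sys.getsizeof(data) + sum(sys.getsizeof(i) for i in data)
# The two measures differ substantially, so a serialization-based
# estimate would misstate what the MemoryCore actually holds.
```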
Alternatives Considered
- A global decorator‑level `max_return_size` – too coarse; different cores have very different constraints.
- Relying on the `size_limit` total cache cap – a single entry may still evict everything else from the cache in one shot.
Note: Future multi-core behavior & API
This is NOT the desired behavior and API for the current feature, BUT the implementation of the current feature MUST support extension into this planned multi-core behavior.
- A decorator can declare per‑core maximum entry sizes (`entry_size_limit`) so that individual calls exceeding the limit are not cached by that core.
- Limits apply before the core writes; rejected entries fall through to the next core (once multi‑core is implemented) or simply return without caching.
- If all cores refuse an entry, the function still returns the value, but nothing is cached; on individual decorated function calls where `cachier__verbose` was set to `True`, an informative message will be printed.
```python
from cachier import cachier, MemoryCore, PickleCore, RedisCore

@cachier(
    cores=[
        MemoryCore(entry_size_limit="10MB"),   # skip bigger objects
        PickleCore(entry_size_limit="200MB"),  # disk OK, but cap file size
        RedisCore(),                           # no limit → anything goes
    ],
)
def load_dataset(name: str):
    ...
```
Request
I propose adding entry_size_limit (per core) in the next minor release, paving the way for predictable multi‑tier caching behaviour.