Image Gen Caching Methods Documentation #2052
JohnLoveJoy
started this conversation in
Ideas
The docs are a bit outdated (spectrum now works for DiT too), but see: The distinction is mainly between:
1. What caching methods are
Caching methods in image generation frameworks (Stable Diffusion, SD XL, etc.) are techniques that store intermediate computations (activations, feature maps, etc.) during the diffusion process so they can be reused in later timesteps. This reduces total generation time, especially for high-resolution or many-step images.
In short: caching accelerates image generation by avoiding repeated calculations.
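To make the idea concrete, here is a minimal sketch of reusing a layer's output from an earlier timestep instead of recomputing it. All names (`block_forward`, `cached_forward`, the cache keys) are illustrative, not taken from any real framework:

```python
# Minimal sketch of activation caching across diffusion timesteps.
# Nothing here is a real framework API; it only illustrates the idea.

cache = {}

def block_forward(layer_id, x):
    # Stand-in for an expensive UNet/DiT block computation.
    return [v * 0.5 for v in x]

def cached_forward(layer_id, x, step, reuse_from_step=None):
    # Reuse the tensor stored at an earlier step when asked; otherwise
    # compute normally and remember the result for later steps.
    key = (layer_id, reuse_from_step)
    if reuse_from_step is not None and key in cache:
        return cache[key]            # cache hit: the compute is skipped
    out = block_forward(layer_id, x)
    cache[(layer_id, step)] = out    # store for possible reuse later
    return out
```

The speedup comes from the cache-hit branch: on a reuse step the expensive block is never called at all.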
2. Kobold recently implemented the following caching methods:
- easycache
Description: Basic caching of reusable tensors. Supports threshold, warmup, start/end percentages.
Acceleration: Reuses intermediate activations to avoid recomputation.
Compatible models: Older SD pipelines and standard UNet-based models. Not optimized for DiT.
- ucache
Description: Unified cache manager that combines TEA-like, KV/attention, and intermediate tensor caching.
Acceleration: Reduces VRAM usage and keeps multiple caches in sync.
Compatible models: Both old SD XL models and DiT pipelines. Supports reuse threshold tuning.
- dbcache
Description: Stores activations in a persistent database (potentially disk-backed).
Acceleration: Avoids recomputation when GPU memory is limited.
Compatible models: Older SD XL models. Not specifically for DiT.
- taylorseer
Description: Advanced caching using residual/error heuristics to decide which activations can be reused.
Acceleration: Reduces unnecessary recomputation based on predicted error.
Compatible models: Older SD XL pipelines. DiT compatibility is experimental.
- cache-dit
Description: Caching designed specifically for DiT pipelines. Optimizes transformer-block activations and attention tensors during denoising.
Acceleration: Reduces recomputation in DiT multi-step diffusion.
Compatible models: Only DiT-based pipelines. Not for older SD XL models.
- spectrum
Description: Adaptive frequency-based caching. Classifies activations as hot/warm/cold. Supports warmup, λ, window size, flex window, and stop parameters.
Acceleration: Stores hot activations in GPU RAM and warm activations in system RAM. Reduces recomputation efficiently.
Compatible models: Newer SD XL pipelines and DIT variants.
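Since spectrum's hot/warm/cold split maps directly to where a tensor lives, a toy version of that placement decision might look like the following. The scoring and thresholds are my own illustrative assumptions, not spectrum's actual algorithm:

```python
def tier_for(reuse_score, hot=0.6, warm=0.2):
    # Toy hot/warm/cold placement: hot tensors stay in GPU memory, warm
    # ones are offloaded to system RAM, cold ones are just recomputed.
    # Thresholds and the reuse_score metric are illustrative assumptions.
    if reuse_score >= hot:
        return "gpu"        # hot: reused often, keep in the fastest memory
    if reuse_score >= warm:
        return "ram"        # warm: worth keeping, but in cheaper storage
    return "recompute"      # cold: not worth the memory cost of caching
```

The point of the tiering is that VRAM is only spent on the tensors most likely to be reused, while everything else is either offloaded or recomputed.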
3. Key cache parameters
reuse_threshold: minimum reuse score for caching activations
start / end: percent of timesteps to enable caching
warmup: number of initial steps before caching fully engages
Fn / Bn: compute block sizes for partial caching
spectrum-specific: w, m, lam, window, flex, stop (control adaptive hot/warm/cold caching)
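A simple way to see how warmup/start/end interact is a gate that decides, per step, whether caching is active. This mirrors the parameter descriptions above but is not any framework's exact rule:

```python
def caching_active(step, total_steps, warmup=3, start=0.1, end=0.9):
    # Illustrative gate combining the warmup/start/end parameters:
    # run the first `warmup` steps without caching, then only cache while
    # the current step falls inside the [start, end] fraction of the
    # schedule. Not taken from any real implementation.
    if step < warmup:
        return False
    fraction = step / total_steps
    return start <= fraction <= end
```

With the defaults above and a 30-step schedule, caching is off for the first few steps and again near the very end, where small errors are most visible in the final image.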
4. Which methods fit which models
Old SD XL / UNet: easycache, ucache, dbcache, taylorseer
DiT pipelines: cache-dit, spectrum, ucache
5. How to use it in practice
Depending on the model you are using, enter the caching method and its parameters in the "default parameters" field in JSON format, as shown in the example below.
Caching methods work particularly well in SD XL, Anima, and Z-Image Base, as these models require many steps. So far, my preferred method in terms of speed and quality is Spectrum, which achieves nearly 2× faster performance in Anima.
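As a rough illustration of what such a "default parameters" entry could look like, here is a spectrum configuration built from the parameters listed in section 3. The key names are hypothetical placeholders; check your build's documentation for the exact schema:

```json
{
  "cache_method": "spectrum",
  "warmup": 3,
  "lam": 0.2,
  "window": 4,
  "flex": true,
  "stop": 0.9
}
```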