Add LazyFrame plan-based caching as alternative to argument-based caching #17

@lmmx

Description

Add support for caching LazyFrame results keyed on a hash of the serialised computation plan rather than on the function arguments. This would enable more intelligent caching for LazyFrames, since the same computation plan can be reached through different code paths or argument combinations and would then hit the same cache entry.

One of the things I implemented for interactive analysis on LazyFrames that had a large computation time was to use the hash of the serialized plan to name the file while saving. This way, I didn’t have to worry about specifically naming each variation, and during a session if I reverted some change I’d made to the LazyFrame, I could still get the cached result back quickly. This has the limitation of being stable only for a specific Polars version, but was good enough for my use case.

— via avimallu in Polars discord (https://discord.com/channels/908022250106667068/957930511999832064/1405947899795476480)

I hadn't thought of that, but it would effectively invalidate the cache whenever you change the plan, e.g. when you add a filter to the definition. Smart.
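A minimal sketch of the idea, for discussion only and not a proposed API: hash whatever `LazyFrame.serialize()` returns and use the digest as the cache file name. The names `plan_key`, `cached_collect` and the `plan_cache` directory are hypothetical, and the choice of SHA-256 and Parquet is an assumption. As noted in the quote, the serialised plan is only stable within one Polars version, so the cache should be treated as version-specific.

```python
import hashlib
from pathlib import Path


def plan_key(serialized_plan):
    """Return a hex digest identifying a serialized LazyFrame plan (hypothetical helper)."""
    # serialize() may return str (JSON) or bytes depending on Polars version/format,
    # so normalise to bytes before hashing.
    if isinstance(serialized_plan, str):
        serialized_plan = serialized_plan.encode("utf-8")
    return hashlib.sha256(serialized_plan).hexdigest()


def cached_collect(lf, cache_dir="plan_cache"):
    """Collect `lf`, reusing a Parquet file named after its plan hash (hypothetical helper)."""
    import polars as pl  # imported here so plan_key is usable without Polars installed

    path = Path(cache_dir) / f"{plan_key(lf.serialize())}.parquet"
    if path.exists():
        # Same plan was collected before: load the stored result instead of recomputing.
        return pl.read_parquet(path)
    df = lf.collect()
    path.parent.mkdir(parents=True, exist_ok=True)
    df.write_parquet(path)
    return df
```

With this shape, adding a filter to the LazyFrame changes the serialised plan, so the digest changes and the old file is simply not hit. Reverting the change brings back the original digest and the earlier cached result.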
