Invalidate persistent file cache when Finch source changes#358
Conversation
Agent-Logs-Url: https://github.com/finch-tensor/finch-tensor-lite/sessions/2345968a-d46d-4bdf-8342-58ff0eaf3afe Co-authored-by: willow-ahrens <5770255+willow-ahrens@users.noreply.github.com>
Agent-Logs-Url: https://github.com/finch-tensor/finch-tensor-lite/sessions/2345968a-d46d-4bdf-8342-58ff0eaf3afe Co-authored-by: willow-ahrens <5770255+willow-ahrens@users.noreply.github.com>
| _finch_source_root = Path(__file__).resolve().parent.parent | ||
|
|
||
|
|
||
| @functools.cache |
There was a problem hiding this comment.
set this on load, don't use functool cache
There was a problem hiding this comment.
Updated in 197d5a9: removed functools caching and now compute Finch source mtime once at module load (_session_finch_code_mtime_ns).
| return latest_mtime | ||
|
|
||
|
|
||
| def _clear_cache_root(cache_root: Path) -> None: |
There was a problem hiding this comment.
Make a user-accessible version of this and add a nice docstring, since users may want to use it. make it so the user can just say "clear cache" and it clears the current cache root for the current finch version
There was a problem hiding this comment.
Implemented in 197d5a9: added user-facing clear_cache() with a docstring, which clears the current Finch version cache root.
| path.unlink() | ||
|
|
||
|
|
||
| def _ensure_cache_fresh(cache_root: Path) -> None: |
There was a problem hiding this comment.
There will only ever be one cache root for a given session. Please simplify accordingly.
There was a problem hiding this comment.
Done in 197d5a9: simplified freshness tracking to a single session-level _cache_checked flag, assuming one cache root per session.
There was a problem hiding this comment.
I'd prefer to put the if _cache_checked in the cache decorator
There was a problem hiding this comment.
Implemented in 9879952: moved the _cache_checked guard into the file_cache decorator and removed that guard from _ensure_cache_fresh. No UI changes (backend-only), so there is no screenshot.
There was a problem hiding this comment.
Implemented in 9879952: _cache_checked is now checked in the file_cache decorator before calling _ensure_cache_fresh. No UI changes (backend-only), so there is no screenshot.
|
|
||
| def test_ensure_cache_fresh_clears_cache_when_code_changes(tmp_path, monkeypatch): |
There was a problem hiding this comment.
Instead, do an end-to-end test that invokes multiple python sessions and modifies a finch file.
There was a problem hiding this comment.
Addressed in 197d5a9: replaced helper-level tests with an end-to-end test that runs separate Python sessions and modifies a Finch source file mtime to verify invalidation.
Agent-Logs-Url: https://github.com/finch-tensor/finch-tensor-lite/sessions/23dea4a8-d9c1-4f24-aa53-456a48a281ad Co-authored-by: willow-ahrens <5770255+willow-ahrens@users.noreply.github.com>
| path.unlink() | ||
|
|
||
|
|
||
| def _ensure_cache_fresh(cache_root: Path) -> None: |
There was a problem hiding this comment.
I'd prefer to put the if _cache_checked in the cache decorator
| @@ -0,0 +1,57 @@ | |||
| # AI modified: 2025-01-01T00:00:00Z parent=4f5a2e5021678965ce8d830bb9edecac1dd3fea9 | |||
There was a problem hiding this comment.
I need an end-to-end test that runs the c_codegen, changes something about c codegen, and then runs it again in a new session and verifies the cache was invalidated and the new output is used.
There was a problem hiding this comment.
Implemented in 9879952: replaced the cache test with a c_codegen end-to-end multi-session test that runs codegen, modifies c_codegen.py behavior, runs again in a new session, and verifies the new output is used (7 then 9). No UI changes (backend-only), so there is no screenshot.
Agent-Logs-Url: https://github.com/finch-tensor/finch-tensor-lite/sessions/9922c86e-7a2f-4fa5-87d3-23689718ec19 Co-authored-by: willow-ahrens <5770255+willow-ahrens@users.noreply.github.com>
| # util/cache.py lives in src/finchlite/util/, so parent.parent is src/finchlite. | ||
| _finch_source_root = Path(__file__).resolve().parent.parent |
There was a problem hiding this comment.
Well, that might change in the future (placement of this file), which will surprise the person working on it. I believe that:
import finchlite
_finch_source_root = finchlite.__path__[0]Is more bulletproof and we don't hardcode this awful .parent.parent thing.
There was a problem hiding this comment.
Implemented in 37d451f: switched Finch source root detection to Path(finchlite.__path__[0]) so it no longer depends on Path(__file__).resolve().parent.parent. No UI changes (backend-only), so screenshot is not applicable.
| print("RESULT", int(lib.unique_value())) | ||
| """ | ||
|
|
||
| try: |
There was a problem hiding this comment.
We shouldn't have a whole test in a try block - let's use try for places where we actually expect it to happen and have a recovery in except there (or propagate if that's the case) - the finally contents can be a function that we can reuse.
There was a problem hiding this comment.
Implemented in 37d451f: narrowed try/finally to just the part that mutates c_codegen.py and extracted cleanup into _restore_file_state(...) for reuse. No UI changes (backend-only), so screenshot is not applicable.
|
|
||
|
|
||
| def test_c_codegen_cache_invalidation_end_to_end_across_sessions(tmp_path): | ||
| repo_root = Path(__file__).resolve().parents[1] |
There was a problem hiding this comment.
Why .parents[1] instead of .parent? We can access the repo root same way as I described before.
There was a problem hiding this comment.
Implemented in 37d451f: replaced Path(__file__).resolve().parents[1] with Path(finchlite.__path__[0]).parent.parent and added an inline note explaining the path derivation. No UI changes (backend-only), so screenshot is not applicable.
|
@mtsokol thanks for your help with this! I'm not sure if it's easier to use AI or to just write it myself. Do you think we should actually try to merge this code? |
I think searching for outdated cache files won't get much prettier than this - still, let's make it more thoughtful then it comes to edge cases, so we won't get surprised in a few months. I can iron it out after Monday next week - if you don't need it earlier. Have you considered setting up Claude Code for a reviewer assistant instead of Copilot? Drawing on my experience Claude was way more "intelligent" with finch lite code: https://code.claude.com/docs/en/github-actions but that requires Claude Pro subscription. |
willow-ahrens
left a comment
There was a problem hiding this comment.
Address mateusz's comments
Agent-Logs-Url: https://github.com/finch-tensor/finch-tensor-lite/sessions/50f6f452-e603-4120-a642-04c3a19fddfb Co-authored-by: willow-ahrens <5770255+willow-ahrens@users.noreply.github.com>
Addressed in 37d451f: applied Mateusz’s requested updates in cache root discovery and the cache e2e test structure/path handling. No UI changes (backend-only), so screenshot is not applicable. |
finchlite.__path__[0]try/finallyparents[1]