docs/code_walkthrough.md
+29 -1 (29 additions, 1 deletion)
@@ -14,6 +14,34 @@ The entire app is ~20 lines. On every rerun Streamlit calls this top to bottom:
---

## Caching strategy

**`app/utils.py`**

Streamlit reruns the entire script from top to bottom on every user interaction: every checkbox tick, every slider move, every row click. Without caching, each interaction would re-read `models.json` from disk and rebuild the Python dicts, and every opened profile card would fetch arXiv over the network again.

`utils.py` is the single place where all database and network calls are wrapped with `@st.cache_data`. No component module is allowed to call `load_models()` or `fetch_recent_papers()` directly.

**How `@st.cache_data` works.** Streamlit hashes the function's arguments into a cache key. On the first call with a given key it runs the function and stores the return value. On subsequent calls with the same key it returns a copy of the stored value immediately, skipping the function body. Unless a TTL expires or the cache is cleared, entries live in memory for the lifetime of the server process.
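
The lookup described above can be illustrated with a toy decorator. This is a minimal sketch written with the standard library only, not Streamlit's actual implementation; `cache_data_sketch` and `load_models_sketch` are hypothetical names, and the sketch assumes all call arguments are hashable.

```python
import functools
import pickle
import time


def cache_data_sketch(ttl=None):
    """Toy stand-in for a data cache: memoise by arguments, optional TTL in seconds."""

    def decorator(func):
        store = {}  # cache key -> (timestamp, pickled return value)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a hashable key from the call arguments.
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            hit = store.get(key)
            if hit is not None and (ttl is None or now - hit[0] < ttl):
                # Cache hit: skip the function body and return a fresh copy.
                return pickle.loads(hit[1])
            value = func(*args, **kwargs)  # cache miss: run the function
            store[key] = (now, pickle.dumps(value))
            return value

        return wrapper

    return decorator


calls = []  # records how often the function body actually runs


@cache_data_sketch()
def load_models_sketch():
    # Hypothetical loader; the real one reads models.json.
    calls.append("load")
    return [{"name": "example-model"}]


first = load_models_sketch()   # runs the body
second = load_models_sketch()  # served from the cache
```

After the two calls, `calls` holds a single entry: the second call never reached the function body.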

**Five cached wrappers and their TTLs:**

| Wrapper | TTL | Why |
|---|---|---|
| `cached_load_models()` | none (kept until the server process restarts) | `models.json` never changes while the app is running |
| `cached_get_families()` | none | derived from `models.json`; same rationale |
| `cached_get_organizations()` | none | same |
| `cached_get_languages()` | none | same |
| `cached_fetch_recent_papers(model_name, max_results)` | 3600 s (1 hour) | arXiv results change over time; TTL balances freshness against rate limits |

**Cache key details.** `cached_load_models()` takes no arguments, so there is exactly one cache entry (the full model list), shared across every component that calls it. `cached_fetch_recent_papers` is keyed on `(model_name, max_results)`, so each distinct pair gets its own cache entry. Opening OLMo 2 7B and then Llama 3.1 8B costs two network calls; opening OLMo 2 7B a second time within an hour, with the same `max_results`, costs zero.
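
The per-argument keying can be seen in a small experiment. The sketch below uses `functools.lru_cache` as a stand-in for `@st.cache_data` (it has no TTL, so it only models the "within an hour" case), and `fetch_recent_papers_sketch` is a hypothetical function that fakes the network call.

```python
from functools import lru_cache

network_calls = []  # records every simulated arXiv request


@lru_cache(maxsize=None)
def fetch_recent_papers_sketch(model_name, max_results):
    # Hypothetical stand-in for the real network fetch.
    network_calls.append((model_name, max_results))
    return tuple(f"{model_name} paper {i}" for i in range(max_results))


fetch_recent_papers_sketch("OLMo 2 7B", 5)     # miss: first request
fetch_recent_papers_sketch("Llama 3.1 8B", 5)  # miss: different model_name
fetch_recent_papers_sketch("OLMo 2 7B", 5)     # hit: same (model_name, max_results)
fetch_recent_papers_sketch("OLMo 2 7B", 10)    # miss: different max_results

info = fetch_recent_papers_sketch.cache_info()
```

Here `info.hits` is 1 and `info.misses` is 3: changing either argument produces a new key, which is why a different `max_results` for the same model triggers another fetch.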

**`get_filtered_models()` is not cached.** Filtering is a pure-Python list comprehension over the already-cached model list. It completes in under a millisecond, and its output depends on the current widget state, which can change on every rerun. Caching it would require a complex hashable key covering all active filters and would save no meaningful time.
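
For a sense of how little work such a filter does, here is a rough sketch. The signature and the `family` / `organization` field names are assumptions made for illustration; the real `get_filtered_models()` may take different parameters.

```python
def get_filtered_models_sketch(models, families=None, organizations=None):
    """Return the models matching every active filter; None means 'no filter'."""
    return [
        m
        for m in models
        if (families is None or m["family"] in families)
        and (organizations is None or m["organization"] in organizations)
    ]


# Hypothetical records standing in for the cached model list.
models = [
    {"name": "model-a", "family": "fam-1", "organization": "org-x"},
    {"name": "model-b", "family": "fam-2", "organization": "org-x"},
    {"name": "model-c", "family": "fam-1", "organization": "org-y"},
]

selected = get_filtered_models_sketch(
    models, families={"fam-1"}, organizations={"org-x"}
)
```

Only `model-a` passes both filters. The comprehension is a single pass over a few dozen records, so recomputing it on each rerun is cheap.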

**To extend:** if you add a new database call (e.g. fetching model cards from HuggingFace), add a new `@st.cache_data(ttl=…)` wrapper in `utils.py` and call only that wrapper from components. Never call the underlying function directly from a component.
@@ -24,7 +52,7 @@ Opens `data/models.json` (41 records), iterates the list, and calls `compute_ope
**To extend:** add a new scoring criterion by adding another `bool(model["some_field"])` term to the sum in `compute_openness_score()`, and add that field to every record in `models.json`.
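
As a rough illustration of that extension, the sketch below sums one `bool(...)` term per criterion. The field names other than `some_field` are invented for the example, and the real `compute_openness_score()` may be structured differently.

```python
def compute_openness_score_sketch(model):
    """Count how many openness criteria a record satisfies."""
    return sum(
        [
            bool(model["weights_url"]),        # hypothetical existing criterion
            bool(model["training_data_url"]),  # hypothetical existing criterion
            bool(model["license"]),            # hypothetical existing criterion
            bool(model["some_field"]),         # the newly added criterion
        ]
    )


# Hypothetical record; empty strings and None count as "not satisfied".
record = {
    "weights_url": "https://example.org/weights",
    "training_data_url": "",
    "license": "apache-2.0",
    "some_field": "present",
}

score = compute_openness_score_sketch(record)
```

Because the sketch indexes with `model["some_field"]`, a record that lacks the key raises `KeyError`, which is why the new field has to be added to every record.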

-The app never calls `load_models()` directly; it goes through `app/utils.py → cached_load_models()`, which wraps it in `@st.cache_data` so the JSON is only read once per session.
+In the app, `load_models()` is always called through `cached_load_models()`; see the caching strategy section above.