Optimise data loading to avoid reading the JSON file on every request #11

@komalharshita

Description:
utils/data_loader.py currently reads projects.json from disk on every
call to load_all_projects(). For a small dataset this is fine, but as the
project grows, repeated disk reads will slow down every recommendation
request, detail page load, and test run.

Expected Outcome:
Implement a simple in-memory cache: after the first read, store the result in
a module-level variable. Subsequent calls return the cached value instead of
re-reading the file.

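import json

# DATA_FILE is assumed to be defined elsewhere in utils/data_loader.py.
_projects_cache = None


def load_all_projects():
    global _projects_cache
    if _projects_cache is None:
        # First call: read the file once and keep the parsed result in memory.
        with open(DATA_FILE, "r", encoding="utf-8") as f:
            _projects_cache = json.load(f)
    return _projects_cache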

The cache must be invalidatable: add a clear_cache() function that tests
can call to reset state between runs. Update tests/test_basic.py to call
clear_cache() in a setup step. All 27 existing tests must still pass.
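
A minimal sketch of what clear_cache() and the test setup step could look like. It is an illustration under assumptions, not the required implementation: the temporary DATA_FILE path, the sample record shape, and the DataLoaderTests class are hypothetical, and it uses unittest only because the issue does not say which test runner tests/test_basic.py uses.

```python
import json
import os
import tempfile
import unittest

# Hypothetical stand-in for the real DATA_FILE in utils/data_loader.py.
DATA_FILE = os.path.join(tempfile.mkdtemp(), "projects.json")

_projects_cache = None


def load_all_projects():
    """Return all projects, reading DATA_FILE only on the first call."""
    global _projects_cache
    if _projects_cache is None:
        with open(DATA_FILE, "r", encoding="utf-8") as f:
            _projects_cache = json.load(f)
    return _projects_cache


def clear_cache():
    """Drop cached data so the next load_all_projects() re-reads the file."""
    global _projects_cache
    _projects_cache = None


class DataLoaderTests(unittest.TestCase):
    def setUp(self):
        # Write a known fixture (hypothetical record shape), then reset the
        # cache so no test sees data loaded by an earlier test.
        with open(DATA_FILE, "w", encoding="utf-8") as f:
            json.dump([{"title": "Sample project"}], f)
        clear_cache()

    def test_second_call_returns_cached_object(self):
        self.assertIs(load_all_projects(), load_all_projects())

    def test_clear_cache_forces_reload(self):
        first = load_all_projects()
        clear_cache()
        self.assertIsNot(first, load_all_projects())
```

One thing to keep in mind with this design: callers receive the cached list itself, so any code that mutates the returned value also changes what later callers see.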

Document the trade-off in a code comment: the cache means changes to
projects.json will not be reflected until the app restarts. This is
acceptable for development use but should be noted.
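
The trade-off can be seen in a short, self-contained sketch. The temporary DATA_FILE path, the write_projects() helper, and the record shape are hypothetical and exist only for the demonstration; the comment above the cache variable is one possible wording for the note the issue asks for.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the real DATA_FILE in utils/data_loader.py.
DATA_FILE = os.path.join(tempfile.mkdtemp(), "projects.json")

# Trade-off: once populated, this cache is never refreshed automatically.
# Edits to projects.json are not picked up until the process restarts or
# the cache is reset. Acceptable for development use.
_projects_cache = None


def load_all_projects():
    global _projects_cache
    if _projects_cache is None:
        with open(DATA_FILE, "r", encoding="utf-8") as f:
            _projects_cache = json.load(f)
    return _projects_cache


def write_projects(projects):
    """Hypothetical helper: overwrite the data file with the given list."""
    with open(DATA_FILE, "w", encoding="utf-8") as f:
        json.dump(projects, f)


write_projects([{"title": "first"}])
load_all_projects()  # populates the cache with one project

# The file changes on disk, but the cached value is returned unchanged.
write_projects([{"title": "first"}, {"title": "second"}])
stale = load_all_projects()
print(len(stale))  # 1

# Resetting the module-level variable forces a fresh read.
_projects_cache = None
fresh = load_all_projects()
print(len(fresh))  # 2
```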

Files to look at:

  • utils/data_loader.py
  • tests/test_basic.py
