Skip to content

Feat caching#189

Merged
ryansurf merged 5 commits into
mainfrom
feat-caching
Mar 26, 2026
Merged

Feat caching#189
ryansurf merged 5 commits into
mainfrom
feat-caching

Conversation

@ryansurf
Copy link
Copy Markdown
Owner

@ryansurf ryansurf commented Mar 26, 2026

General:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Code:

  1. Does your submission pass tests?
  2. Have you run the linter/formatter on your code locally before submission?
  3. Have you updated the documentation/README to reflect your changes, as applicable?
  4. Have you added an explanation of what your changes do?
  5. Have you written new tests for your changes, as applicable?

Summary by Sourcery

Introduce in-memory caching and type hints across API, CLI, and server layers while simplifying server–CLI integration.

New Features:

  • Add short-lived in-memory caches for UV, ocean, wind/temperature, rain, and forecast API calls to reduce repeated external requests.

Enhancements:

  • Decorate coordinate resolution and weather/ocean retrieval functions with caching and stricter type annotations for improved performance and clarity.
  • Refactor the FastAPI root handler to invoke the SurfReport class directly instead of spawning a subprocess, capturing output in-process.
  • Tighten MongoDB connection behavior with a short server selection timeout and copy report documents before insertion to avoid side effects.
  • Adjust argument parsing helper to split already-joined argument strings and update tests accordingly.

Build:

  • Add cachetools and debugpy as new dependencies in the project configuration.

Tests:

  • Update API, helper, and root endpoint tests to reflect caching, argument handling changes, and the new server–CLI integration while removing obsolete server tests.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Mar 26, 2026

Reviewer's Guide

Implements in-memory TTL caching for API calls, tightens type hints, refactors the server to call the CLI class directly instead of spawning a subprocess, updates helper argument parsing, and adjusts DB connection and tests accordingly.

Sequence diagram for FastAPI request using SurfReport.run instead of subprocess

sequenceDiagram
    actor User
    participant Browser
    participant FastAPIApp as FastAPI_app
    participant Surf as SurfReport
    participant Helper as helper
    participant Api as api
    participant DB as SurfReportDatabaseOps
    participant OpenMeteo as OpenMeteo_API
    participant GeoAPI as Geocoding_API

    User->>Browser: Request surf report URL
    Browser->>FastAPIApp: HTTP GET /
    FastAPIApp->>FastAPIApp: Parse query_params to parsed_parameters
    FastAPIApp->>FastAPIApp: Join to passed_args (comma separated)
    FastAPIApp->>Surf: run(lat=None, long=None, args=passed_args)

    Surf->>Helper: separate_args(args)
    Helper-->>Surf: arguments dict

    Surf->>Api: separate_args_and_get_location(arguments.location_args)
    Api->>Api: get_coordinates(tuple(args)) (lru_cache)
    Api->>GeoAPI: HTTP request (if cache miss)
    GeoAPI-->>Api: Coordinates
    Api-->>Surf: coordinates, location_data

    Surf->>Api: gather_data(lat, long, arguments)
    Api->>Api: get_uv / ocean_information / forecast ... (cached)
    Api->>OpenMeteo: HTTP requests on cache miss
    OpenMeteo-->>Api: Weather/ocean data
    Api-->>Surf: ocean_data_dict

    Surf->>DB: insert_report(ocean_data_dict.copy())
    DB-->>Surf: inserted_id

    Surf-->>FastAPIApp: rendered text output
    FastAPIApp-->>Browser: HTTP 200 text response
    Browser-->>User: Display surf report
Loading

Class diagram for updated SurfReport and database interactions

classDiagram
    class SurfReport {
        +str gpt_prompt
        +tuple gpt_info
        +SurfReportDatabaseOps db_handler
        +__init__() None
        +run(lat float, long float, args list~str~) None
        +_init_db() SurfReportDatabaseOps
        +_save_report(ocean_data_dict dict) None
    }

    class SurfReportDatabaseOps {
        +DatabaseConnection db
        +Collection collection
        +insert_report(report_document dict) Any
    }

    class DatabaseConnection {
        +str db_uri
        +MongoClient client
        +Database db
        +__init__() None
        +connect(db_name str) None
    }

    class GPTSettings {
        +str GPT_PROMPT
        +str API_KEY
        +str GPT_MODEL
    }

    class DatabaseSettings {
        +str DB_URI
    }

    SurfReport --> SurfReportDatabaseOps : uses
    SurfReport --> GPTSettings : reads
    SurfReport --> DatabaseSettings : init_db
    SurfReportDatabaseOps --> DatabaseConnection : composes
    DatabaseConnection --> MongoClient : uses
Loading

Flow diagram for new TTL caching of Open-Meteo API calls

flowchart TD
    A[Caller function in api.gather_data] --> B[get_uv / get_uv_history / ocean_information / ocean_information_history / current_wind_temp / get_rain / forecast / get_hourly_forecast]
    B --> C{Corresponding TTLCache
    hit?}
    C -- Yes --> D[Return cached value]
    C -- No --> E[Build request parameters]
    E --> F[Call Open-Meteo API]
    F --> G[Parse API response]
    G --> H[Store result in TTLCache
    with 600s TTL]
    H --> I[Return fresh value]

    subgraph Caches
        C1[_uv_cache]
        C2[uv_history_cache]
        C3[_ocean_cache]
        C4[_ocean_history_cache]
        C5[_wind_temp_cache]
        C6[_rain_cache]
        C7[_forecast_cache]
        C8[_hourly_forecast_cache]
    end
Loading

File-Level Changes

Change Details Files
Add TTL-based in-memory caching around Open-Meteo API functions and coordinate resolution.
  • Introduce multiple cachetools.TTLCache instances and a shared Lock for different API result types with a 10-minute TTL.
  • Wrap get_coordinates with functools.lru_cache and change its signature to accept a tuple so it can be cached safely.
  • Decorate API functions (UV, UV history, ocean current/history, wind/temp, rain, forecast, hourly forecast) with cachetools.cached using the appropriate cache and shared lock.
  • Tighten function type hints in the API module and adjust callers/tests to pass arguments in the new types (e.g., tuple for get_coordinates).
  • Ensure uv_history_cache is explicitly cleared in the UV history error-path test to avoid cache interference.
src/api.py
tests/test_api.py
Refactor FastAPI server to invoke the SurfReport class directly instead of using a subprocess, capturing printed output via an in-process call.
  • Replace subprocess.run-based CLI invocation with construction of a SurfReport instance and direct run() call.
  • Build the argument string from query parameters and pass it into SurfReport.run while capturing stdout via io.StringIO and contextlib.redirect_stdout.
  • Remove HTTPException-based subprocess error handling and associated imports now that errors are handled in-process.
  • Update the root endpoint test to assert against realistic surf output content rather than a mocked subprocess result.
src/server.py
tests/test_root.py
Improve CLI and helper interfaces and robustness, including argument and DB handling.
  • Add precise type hints to SurfReport.init, _init_db, run, and _save_report.
  • Change SurfReport._init_db to return a typed database ops instance or None based on DB configuration.
  • Adjust SurfReport.run signature to accept typed lat/long and args list, aligning with server/CLI usage.
  • Modify helper.separate_args to split on the incoming args string rather than indexing into a list, aligning with the new server argument passing style.
  • Remove an obsolete separate_args unit test that relied on the previous list-based behavior.
src/cli.py
src/helper.py
tests/test_helper.py
Harden database connection and write behavior to be more resilient and side-effect free.
  • Configure MongoClient with a short serverSelectionTimeoutMS to fail fast when the DB is unreachable.
  • Ensure insert_report writes a shallow copy of the report document to avoid side effects on the original data structure.
src/db/connection.py
src/db/operations.py
Update project dependencies to support new functionality and debugging.
  • Add debugpy and cachetools to the project dependencies in pyproject.toml.
  • Regenerate the lockfile to reflect the new dependencies.
pyproject.toml
poetry.lock

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@ryansurf ryansurf merged commit 4be615c into main Mar 26, 2026
11 checks passed
@ryansurf ryansurf deleted the feat-caching branch March 26, 2026 23:38
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 4 issues, and left some high level feedback:

  • The change to helper.separate_args now expects a string (args.split(',')), but SurfReport.run is still typed as taking args: list[str] | None and other callers may still pass lists; consider standardizing on a single type (e.g., str | None) in both the implementation and type hints to avoid runtime type mismatches.
  • You are reusing _ocean_lock for all TTLCache usages (_uv_cache, _wind_temp_cache, _rain_cache, etc.), which effectively serializes all cached calls; if you need locking at all, consider using separate locks per cache or dropping the lock where functions are read-only to reduce unnecessary contention.
  • In server.py, the endpoint captures CLI output by redirect_stdout around a long-running surf.run call on the event loop; consider refactoring surf.run (or a helper) to return structured data or a string instead of printing so the FastAPI handler can call it directly without stdout redirection and with better async compatibility.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The change to `helper.separate_args` now expects a string (`args.split(',')`), but `SurfReport.run` is still typed as taking `args: list[str] | None` and other callers may still pass lists; consider standardizing on a single type (e.g., `str | None`) in both the implementation and type hints to avoid runtime type mismatches.
- You are reusing `_ocean_lock` for all `TTLCache` usages (`_uv_cache`, `_wind_temp_cache`, `_rain_cache`, etc.), which effectively serializes all cached calls; if you need locking at all, consider using separate locks per cache or dropping the lock where functions are read-only to reduce unnecessary contention.
- In `server.py`, the endpoint captures CLI output by `redirect_stdout` around a long-running `surf.run` call on the event loop; consider refactoring `surf.run` (or a helper) to return structured data or a string instead of printing so the FastAPI handler can call it directly without stdout redirection and with better async compatibility.

## Individual Comments

### Comment 1
<location path="src/server.py" line_range="60-65" />
<code_context>
-        except subprocess.CalledProcessError as e:
-            logger.error("Subprocess error: %s", e.stderr)
-            raise HTTPException(status_code=500, detail="Internal CLI Error")
+        passed_args = ",".join(parsed_parameters)
+
+        # get stdout from print function (probably not ideal but whatever)
+        f = io.StringIO()
+        with redirect_stdout(f):
+            surf.run(args=passed_args)
+        return f.getvalue()

</code_context>
<issue_to_address>
**issue (bug_risk):** Using redirect_stdout to capture CLI output is not concurrency-safe in a web server context.

This approach temporarily overrides process-wide stdout, so concurrent requests can interfere with each other and mix outputs, and unrelated logs may be captured. Instead, consider refactoring `SurfReport.run` (or a lower-level function) to return its output as a string so the route can return that directly without redirecting stdout.
</issue_to_address>

### Comment 2
<location path="src/cli.py" line_range="42-46" />
<code_context>
             return None

-    def run(self, lat=None, long=None, args=None):
+    def run(
+        self,
+        lat: float | None = None,
+        long: float | None = None,
+        args: list[str] | None = None,
+    ):
         """
</code_context>
<issue_to_address>
**issue:** The `run` method type hint for `args` conflicts with how it is used from the server and in `separate_args`.

`server.default_route` passes `args` as a comma-separated `str` to `surf.run`, and `helper.separate_args` now also expects a `str` and calls `args.split(',')`. However, `run` is typed as `args: list[str] | None`, which doesn’t match this usage and suggests a list-based API that would break `separate_args`. Please update the `run` signature to reflect the actual type (e.g. `str | None`) or introduce separate parameters for the raw query string vs. parsed list so the types stay consistent.
</issue_to_address>

### Comment 3
<location path="src/helper.py" line_range="130" />
<code_context>
     Args are separated by commas in input. Separate them and return list.
     """
     if len(args) > 1:
-        return args[1].split(",")
+        return args.split(",")
     return []

</code_context>
<issue_to_address>
**suggestion (bug_risk):** Changing `separate_args` from list-based to string-based input is a behavioral change that may surprise existing callers.

This now assumes `args` is a comma-separated string (`args.split(',')`) instead of a list like `sys.argv` (`args[1].split(',')`). While that matches the new `server.default_route` usage, any existing callers that still pass a list will break. Consider either renaming this helper and keeping a list-based version, or supporting both list and string inputs (e.g., by checking the type) to avoid runtime regressions.
</issue_to_address>

### Comment 4
<location path="tests/test_root.py" line_range="9-17" />
<code_context>


-def test_root_returns_200_with_mock(monkeypatch):
+def test_root_returns_200():
     env = ServerSettings()
     app = create_app(env)

-    class DummyCompleted:
-        stdout = "ok"
-        stderr = ""
-
-    def fake_run(*args, **kwargs):
-        return DummyCompleted()
-
-    monkeypatch.setattr(subprocess, "run", fake_run)
-
     client = TestClient(app)
     resp = client.get("/")

     assert resp.status_code == HTTPStatus.OK
-    assert "ok" in resp.text
+    assert "wave" in resp.text.lower()
</code_context>
<issue_to_address>
**issue (testing):** `test_root_returns_200` no longer isolates the FastAPI route from external dependencies, which may make it flaky and slow

By removing the subprocess mock, this test now exercises real `SurfReport.run` behavior (network/DB/GPT/etc.), making it slow, environment‑dependent, and potentially flaky.

Consider mocking the `SurfReport` used in `create_app` (or patching `cli.SurfReport.run`) so the test stays fast and deterministic, and so you can:
- Assert `run` is called with the expected args from the query parameters.
- Add a separate test for failure cases (e.g., when `run` raises) to verify the route returns the expected 5xx.

This keeps the route test isolated while still covering the new direct method call path.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/server.py
Comment on lines +60 to +65
passed_args = ",".join(parsed_parameters)

# get stdout from print function (probably not ideal but whatever)
f = io.StringIO()
with redirect_stdout(f):
surf.run(args=passed_args)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Using redirect_stdout to capture CLI output is not concurrency-safe in a web server context.

This approach temporarily overrides process-wide stdout, so concurrent requests can interfere with each other and mix outputs, and unrelated logs may be captured. Instead, consider refactoring SurfReport.run (or a lower-level function) to return its output as a string so the route can return that directly without redirecting stdout.

Comment thread src/cli.py
Comment on lines +42 to +46
def run(
self,
lat: float | None = None,
long: float | None = None,
args: list[str] | None = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: The run method type hint for args conflicts with how it is used from the server and in separate_args.

server.default_route passes args as a comma-separated str to surf.run, and helper.separate_args now also expects a str and calls args.split(','). However, run is typed as args: list[str] | None, which doesn’t match this usage and suggests a list-based API that would break separate_args. Please update the run signature to reflect the actual type (e.g. str | None) or introduce separate parameters for the raw query string vs. parsed list so the types stay consistent.

Comment thread src/helper.py
Args are separated by commas in input. Separate them and return list.
"""
if len(args) > 1:
return args[1].split(",")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): Changing separate_args from list-based to string-based input is a behavioral change that may surprise existing callers.

This now assumes args is a comma-separated string (args.split(',')) instead of a list like sys.argv (args[1].split(',')). While that matches the new server.default_route usage, any existing callers that still pass a list will break. Consider either renaming this helper and keeping a list-based version, or supporting both list and string inputs (e.g., by checking the type) to avoid runtime regressions.

Comment thread tests/test_root.py
Comment on lines +9 to +17
def test_root_returns_200():
env = ServerSettings()
app = create_app(env)

class DummyCompleted:
stdout = "ok"
stderr = ""

def fake_run(*args, **kwargs):
return DummyCompleted()

monkeypatch.setattr(subprocess, "run", fake_run)

client = TestClient(app)
resp = client.get("/")

assert resp.status_code == HTTPStatus.OK
assert "ok" in resp.text
assert "wave" in resp.text.lower()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (testing): test_root_returns_200 no longer isolates the FastAPI route from external dependencies, which may make it flaky and slow

By removing the subprocess mock, this test now exercises real SurfReport.run behavior (network/DB/GPT/etc.), making it slow, environment‑dependent, and potentially flaky.

Consider mocking the SurfReport used in create_app (or patching cli.SurfReport.run) so the test stays fast and deterministic, and so you can:

  • Assert run is called with the expected args from the query parameters.
  • Add a separate test for failure cases (e.g., when run raises) to verify the route returns the expected 5xx.

This keeps the route test isolated while still covering the new direct method call path.

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 98.11321% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/helper.py 0.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
src/api.py 100.00% <100.00%> (ø)
src/cli.py 95.55% <100.00%> (ø)
src/db/connection.py 100.00% <100.00%> (ø)
src/db/operations.py 100.00% <100.00%> (ø)
src/server.py 100.00% <100.00%> (ø)
src/helper.py 99.07% <0.00%> (-0.93%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant