Feat caching#189
Conversation
Reviewer's GuideImplements in-memory TTL caching for API calls, tightens type hints, refactors the server to call the CLI class directly instead of spawning a subprocess, updates helper argument parsing, and adjusts DB connection and tests accordingly. Sequence diagram for FastAPI request using SurfReport.run instead of subprocesssequenceDiagram
actor User
participant Browser
participant FastAPIApp as FastAPI_app
participant Surf as SurfReport
participant Helper as helper
participant Api as api
participant DB as SurfReportDatabaseOps
participant OpenMeteo as OpenMeteo_API
participant GeoAPI as Geocoding_API
User->>Browser: Request surf report URL
Browser->>FastAPIApp: HTTP GET /
FastAPIApp->>FastAPIApp: Parse query_params to parsed_parameters
FastAPIApp->>FastAPIApp: Join to passed_args (comma separated)
FastAPIApp->>Surf: run(lat=None, long=None, args=passed_args)
Surf->>Helper: separate_args(args)
Helper-->>Surf: arguments dict
Surf->>Api: separate_args_and_get_location(arguments.location_args)
Api->>Api: get_coordinates(tuple(args)) (lru_cache)
Api->>GeoAPI: HTTP request (if cache miss)
GeoAPI-->>Api: Coordinates
Api-->>Surf: coordinates, location_data
Surf->>Api: gather_data(lat, long, arguments)
Api->>Api: get_uv / ocean_information / forecast ... (cached)
Api->>OpenMeteo: HTTP requests on cache miss
OpenMeteo-->>Api: Weather/ocean data
Api-->>Surf: ocean_data_dict
Surf->>DB: insert_report(ocean_data_dict.copy())
DB-->>Surf: inserted_id
Surf-->>FastAPIApp: rendered text output
FastAPIApp-->>Browser: HTTP 200 text response
Browser-->>User: Display surf report
Class diagram for updated SurfReport and database interactionsclassDiagram
class SurfReport {
+str gpt_prompt
+tuple gpt_info
+SurfReportDatabaseOps db_handler
+__init__() None
+run(lat float, long float, args list~str~) None
+_init_db() SurfReportDatabaseOps
+_save_report(ocean_data_dict dict) None
}
class SurfReportDatabaseOps {
+DatabaseConnection db
+Collection collection
+insert_report(report_document dict) Any
}
class DatabaseConnection {
+str db_uri
+MongoClient client
+Database db
+__init__() None
+connect(db_name str) None
}
class GPTSettings {
+str GPT_PROMPT
+str API_KEY
+str GPT_MODEL
}
class DatabaseSettings {
+str DB_URI
}
SurfReport --> SurfReportDatabaseOps : uses
SurfReport --> GPTSettings : reads
SurfReport --> DatabaseSettings : init_db
SurfReportDatabaseOps --> DatabaseConnection : composes
DatabaseConnection --> MongoClient : uses
Flow diagram for new TTL caching of Open-Meteo API callsflowchart TD
A[Caller function in api.gather_data] --> B[get_uv / get_uv_history / ocean_information / ocean_information_history / current_wind_temp / get_rain / forecast / get_hourly_forecast]
B --> C{Corresponding TTLCache
hit?}
C -- Yes --> D[Return cached value]
C -- No --> E[Build request parameters]
E --> F[Call Open-Meteo API]
F --> G[Parse API response]
G --> H[Store result in TTLCache
with 600s TTL]
H --> I[Return fresh value]
subgraph Caches
C1[_uv_cache]
C2[uv_history_cache]
C3[_ocean_cache]
C4[_ocean_history_cache]
C5[_wind_temp_cache]
C6[_rain_cache]
C7[_forecast_cache]
C8[_hourly_forecast_cache]
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- The change to
helper.separate_argsnow expects a string (args.split(',')), butSurfReport.runis still typed as takingargs: list[str] | Noneand other callers may still pass lists; consider standardizing on a single type (e.g.,str | None) in both the implementation and type hints to avoid runtime type mismatches. - You are reusing
_ocean_lockfor allTTLCacheusages (_uv_cache,_wind_temp_cache,_rain_cache, etc.), which effectively serializes all cached calls; if you need locking at all, consider using separate locks per cache or dropping the lock where functions are read-only to reduce unnecessary contention. - In
server.py, the endpoint captures CLI output byredirect_stdoutaround a long-runningsurf.runcall on the event loop; consider refactoringsurf.run(or a helper) to return structured data or a string instead of printing so the FastAPI handler can call it directly without stdout redirection and with better async compatibility.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The change to `helper.separate_args` now expects a string (`args.split(',')`), but `SurfReport.run` is still typed as taking `args: list[str] | None` and other callers may still pass lists; consider standardizing on a single type (e.g., `str | None`) in both the implementation and type hints to avoid runtime type mismatches.
- You are reusing `_ocean_lock` for all `TTLCache` usages (`_uv_cache`, `_wind_temp_cache`, `_rain_cache`, etc.), which effectively serializes all cached calls; if you need locking at all, consider using separate locks per cache or dropping the lock where functions are read-only to reduce unnecessary contention.
- In `server.py`, the endpoint captures CLI output by `redirect_stdout` around a long-running `surf.run` call on the event loop; consider refactoring `surf.run` (or a helper) to return structured data or a string instead of printing so the FastAPI handler can call it directly without stdout redirection and with better async compatibility.
## Individual Comments
### Comment 1
<location path="src/server.py" line_range="60-65" />
<code_context>
- except subprocess.CalledProcessError as e:
- logger.error("Subprocess error: %s", e.stderr)
- raise HTTPException(status_code=500, detail="Internal CLI Error")
+ passed_args = ",".join(parsed_parameters)
+
+ # get stdout from print function (probably not ideal but whatever)
+ f = io.StringIO()
+ with redirect_stdout(f):
+ surf.run(args=passed_args)
+ return f.getvalue()
</code_context>
<issue_to_address>
**issue (bug_risk):** Using redirect_stdout to capture CLI output is not concurrency-safe in a web server context.
This approach temporarily overrides process-wide stdout, so concurrent requests can interfere with each other and mix outputs, and unrelated logs may be captured. Instead, consider refactoring `SurfReport.run` (or a lower-level function) to return its output as a string so the route can return that directly without redirecting stdout.
</issue_to_address>
### Comment 2
<location path="src/cli.py" line_range="42-46" />
<code_context>
return None
- def run(self, lat=None, long=None, args=None):
+ def run(
+ self,
+ lat: float | None = None,
+ long: float | None = None,
+ args: list[str] | None = None,
+ ):
"""
</code_context>
<issue_to_address>
**issue:** The `run` method type hint for `args` conflicts with how it is used from the server and in `separate_args`.
`server.default_route` passes `args` as a comma-separated `str` to `surf.run`, and `helper.separate_args` now also expects a `str` and calls `args.split(',')`. However, `run` is typed as `args: list[str] | None`, which doesn’t match this usage and suggests a list-based API that would break `separate_args`. Please update the `run` signature to reflect the actual type (e.g. `str | None`) or introduce separate parameters for the raw query string vs. parsed list so the types stay consistent.
</issue_to_address>
### Comment 3
<location path="src/helper.py" line_range="130" />
<code_context>
Args are separated by commas in input. Separate them and return list.
"""
if len(args) > 1:
- return args[1].split(",")
+ return args.split(",")
return []
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Changing `separate_args` from list-based to string-based input is a behavioral change that may surprise existing callers.
This now assumes `args` is a comma-separated string (`args.split(',')`) instead of a list like `sys.argv` (`args[1].split(',')`). While that matches the new `server.default_route` usage, any existing callers that still pass a list will break. Consider either renaming this helper and keeping a list-based version, or supporting both list and string inputs (e.g., by checking the type) to avoid runtime regressions.
</issue_to_address>
### Comment 4
<location path="tests/test_root.py" line_range="9-17" />
<code_context>
-def test_root_returns_200_with_mock(monkeypatch):
+def test_root_returns_200():
env = ServerSettings()
app = create_app(env)
- class DummyCompleted:
- stdout = "ok"
- stderr = ""
-
- def fake_run(*args, **kwargs):
- return DummyCompleted()
-
- monkeypatch.setattr(subprocess, "run", fake_run)
-
client = TestClient(app)
resp = client.get("/")
assert resp.status_code == HTTPStatus.OK
- assert "ok" in resp.text
+ assert "wave" in resp.text.lower()
</code_context>
<issue_to_address>
**issue (testing):** `test_root_returns_200` no longer isolates the FastAPI route from external dependencies, which may make it flaky and slow
By removing the subprocess mock, this test now exercises real `SurfReport.run` behavior (network/DB/GPT/etc.), making it slow, environment‑dependent, and potentially flaky.
Consider mocking the `SurfReport` used in `create_app` (or patching `cli.SurfReport.run`) so the test stays fast and deterministic, and so you can:
- Assert `run` is called with the expected args from the query parameters.
- Add a separate test for failure cases (e.g., when `run` raises) to verify the route returns the expected 5xx.
This keeps the route test isolated while still covering the new direct method call path.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| passed_args = ",".join(parsed_parameters) | ||
|
|
||
| # get stdout from print function (probably not ideal but whatever) | ||
| f = io.StringIO() | ||
| with redirect_stdout(f): | ||
| surf.run(args=passed_args) |
There was a problem hiding this comment.
issue (bug_risk): Using redirect_stdout to capture CLI output is not concurrency-safe in a web server context.
This approach temporarily overrides process-wide stdout, so concurrent requests can interfere with each other and mix outputs, and unrelated logs may be captured. Instead, consider refactoring SurfReport.run (or a lower-level function) to return its output as a string so the route can return that directly without redirecting stdout.
| def run( | ||
| self, | ||
| lat: float | None = None, | ||
| long: float | None = None, | ||
| args: list[str] | None = None, |
There was a problem hiding this comment.
issue: The run method type hint for args conflicts with how it is used from the server and in separate_args.
server.default_route passes args as a comma-separated str to surf.run, and helper.separate_args now also expects a str and calls args.split(','). However, run is typed as args: list[str] | None, which doesn’t match this usage and suggests a list-based API that would break separate_args. Please update the run signature to reflect the actual type (e.g. str | None) or introduce separate parameters for the raw query string vs. parsed list so the types stay consistent.
| Args are separated by commas in input. Separate them and return list. | ||
| """ | ||
| if len(args) > 1: | ||
| return args[1].split(",") |
There was a problem hiding this comment.
suggestion (bug_risk): Changing separate_args from list-based to string-based input is a behavioral change that may surprise existing callers.
This now assumes args is a comma-separated string (args.split(',')) instead of a list like sys.argv (args[1].split(',')). While that matches the new server.default_route usage, any existing callers that still pass a list will break. Consider either renaming this helper and keeping a list-based version, or supporting both list and string inputs (e.g., by checking the type) to avoid runtime regressions.
| def test_root_returns_200(): | ||
| env = ServerSettings() | ||
| app = create_app(env) | ||
|
|
||
| class DummyCompleted: | ||
| stdout = "ok" | ||
| stderr = "" | ||
|
|
||
| def fake_run(*args, **kwargs): | ||
| return DummyCompleted() | ||
|
|
||
| monkeypatch.setattr(subprocess, "run", fake_run) | ||
|
|
||
| client = TestClient(app) | ||
| resp = client.get("/") | ||
|
|
||
| assert resp.status_code == HTTPStatus.OK | ||
| assert "ok" in resp.text | ||
| assert "wave" in resp.text.lower() |
There was a problem hiding this comment.
issue (testing): test_root_returns_200 no longer isolates the FastAPI route from external dependencies, which may make it flaky and slow
By removing the subprocess mock, this test now exercises real SurfReport.run behavior (network/DB/GPT/etc.), making it slow, environment‑dependent, and potentially flaky.
Consider mocking the SurfReport used in create_app (or patching cli.SurfReport.run) so the test stays fast and deterministic, and so you can:
- Assert
runis called with the expected args from the query parameters. - Add a separate test for failure cases (e.g., when
runraises) to verify the route returns the expected 5xx.
This keeps the route test isolated while still covering the new direct method call path.
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
General:
Code:
Summary by Sourcery
Introduce in-memory caching and type hints across API, CLI, and server layers while simplifying server–CLI integration.
New Features:
Enhancements:
Build:
Tests: