Feat cache update#197
Conversation
Reviewer's GuideRefactors Open-Meteo client usage into a singleton module and introduces concurrent data fetching in the API layer, while updating caches and tests accordingly. Sequence diagram for concurrent gather_data executionsequenceDiagram
actor User
participant APILayer as api_gather_data
participant Executor as ThreadPoolExecutor
participant Ocean as ocean_information
participant UV as get_uv
participant Hourly as get_hourly_forecast
participant WindTemp as current_wind_temp
participant Rain as get_rain
participant Forecast as forecast
participant OceanHist as ocean_information_history
participant UVHist as get_uv_history
participant Client as openmeteo_client
participant OpenMeteoAPI
User->>APILayer: gather_data(lat, long, arguments)
APILayer->>APILayer: parse lat, long, decimal, unit
APILayer->>Executor: create with max_workers 8
par submit_ocean
Executor->>Ocean: submit(lat, long, decimal, unit)
Ocean->>Client: weather_api(marine_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: marine response
Client-->>Ocean: responses
and submit_uv
Executor->>UV: submit(lat, long, decimal, unit)
UV->>Client: weather_api(air_quality_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: uv response
Client-->>UV: responses
and submit_hourly
Executor->>Hourly: submit(lat, long)
Hourly->>Client: weather_api(forecast_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: hourly response
Client-->>Hourly: responses
and submit_wind_temp
Executor->>WindTemp: submit(lat, long, decimal)
WindTemp->>Client: weather_api(forecast_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: wind temp response
Client-->>WindTemp: responses
and submit_rain
Executor->>Rain: submit(lat, long)
Rain->>Client: weather_api(forecast_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: rain response
Client-->>Rain: responses
and submit_forecast
Executor->>Forecast: submit(lat, long, decimal, 7)
Forecast->>Client: weather_api(marine_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: marine forecast
Client-->>Forecast: responses
Forecast->>Client: weather_api(general_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: general forecast
Client-->>Forecast: responses
and submit_ocean_hist
Executor->>OceanHist: submit(lat, long, decimal, unit)
OceanHist->>Client: weather_api(marine_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: marine history response
Client-->>OceanHist: responses
and submit_uv_hist
Executor->>UVHist: submit(lat, long, decimal, unit)
UVHist->>Client: weather_api(air_quality_url, params)
Client->>OpenMeteoAPI: HTTP request
OpenMeteoAPI-->>Client: uv history response
Client-->>UVHist: responses
end
Executor-->>APILayer: futures with results
APILayer->>APILayer: build results dict
APILayer-->>User: aggregated surf data dict
Class diagram for open_meteo singleton client and api module functionsclassDiagram
class OpenMeteoModule {
+openmeteo_client openmeteo_requests_Client
+_create_client() openmeteo_requests_Client
}
class ApiModule {
+get_coordinates(args tuple) list_or_str
+get_uv(lat float, long float, decimal int, unit str) float_or_str
+get_uv_history(lat float, long float, decimal int, unit str) dict_or_str
+ocean_information(lat float, long float, decimal int, unit str) tuple
+ocean_information_history(lat float, long float, decimal int, unit str) tuple
+current_wind_temp(lat float, long float, temp_unit str) tuple
+get_rain(lat float, long float) tuple
+forecast(lat float, long float, decimal int, days int) dict
+get_hourly_forecast(lat float, long float, days int) dict
+gather_data(lat float_or_str, long float_or_str, arguments dict) dict
}
class TTLCacheOcean {
+maxsize int
+ttl int
}
class TTLCacheUV {
+maxsize int
+ttl int
}
class TTLCacheUVHistory {
+maxsize int
+ttl int
}
class TTLCacheOceanHistory {
+maxsize int
+ttl int
}
class TTLCacheWindTemp {
+maxsize int
+ttl int
}
class TTLCacheRain {
+maxsize int
+ttl int
}
class TTLCacheForecast {
+maxsize int
+ttl int
}
class TTLCacheHourlyForecast {
+maxsize int
+ttl int
}
ApiModule --> OpenMeteoModule : uses openmeteo_client
ApiModule --> TTLCacheOcean : uses
ApiModule --> TTLCacheUV : uses
ApiModule --> TTLCacheUVHistory : uses
ApiModule --> TTLCacheOceanHistory : uses
ApiModule --> TTLCacheWindTemp : uses
ApiModule --> TTLCacheRain : uses
ApiModule --> TTLCacheForecast : uses
ApiModule --> TTLCacheHourlyForecast : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Codecov Report✅ All modified and coverable lines are covered by tests.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
gather_data, consider derivingmax_workersfrom the number of submitted tasks or a module-level constant (and possibly reusing a shared executor) rather than hardcoding8, to make concurrency behavior clearer and easier to tune. - The eager, module-level construction of
openmeteo_clientinopen_meteo.pycouples import time to network client setup; you might want to use a lazily initialized singleton or a small accessor function (e.g.get_openmeteo_client()) to defer creation and make future configuration/injection easier.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `gather_data`, consider deriving `max_workers` from the number of submitted tasks or a module-level constant (and possibly reusing a shared executor) rather than hardcoding `8`, to make concurrency behavior clearer and easier to tune.
- The eager, module-level construction of `openmeteo_client` in `open_meteo.py` couples import time to network client setup; you might want to use a lazily initialized singleton or a small accessor function (e.g. `get_openmeteo_client()`) to defer creation and make future configuration/injection easier.
## Individual Comments
### Comment 1
<location path="src/api.py" line_range="510-519" />
<code_context>
+ dec, unit = arguments["decimal"], arguments["unit"]
+
+ with ThreadPoolExecutor(max_workers=8) as executor:
+ futures = {
+ "ocean": executor.submit(ocean_information, lat, long, dec, unit),
+ "uv": executor.submit(get_uv, lat, long, dec, unit),
+ "hourly": executor.submit(get_hourly_forecast, lat, long),
+ "wind_temp": executor.submit(current_wind_temp, lat, long, dec),
+ "rain": executor.submit(get_rain, lat, long),
+ "forecast": executor.submit(forecast, lat, long, dec, 7),
+ "ocean_hist": executor.submit(
+ ocean_information_history, lat, long, dec, unit
+ ),
+ "uv_hist": executor.submit(get_uv_history, lat, long, dec, unit),
+ }
+ results = {k: f.result() for k, f in futures.items()}
</code_context>
<issue_to_address>
**issue (bug_risk):** Using shared global client and caches from multiple threads may introduce thread-safety issues.
These functions share a single `openmeteo_client` (`requests.Session`) and several `TTLCache` instances, neither of which are inherently thread-safe. Calling them in parallel against shared instances can cause race conditions. If you keep this parallelism, either instantiate separate clients/caches per task/thread or protect shared mutable state with appropriate locking.
</issue_to_address>
### Comment 2
<location path="src/open_meteo.py" line_range="6-9" />
<code_context>
+from retry_requests import retry
+
+
+def _create_client() -> openmeteo_requests.Client:
+ """Creates a retry-enabled Open-Meteo API client."""
+ retry_session = retry(requests.Session(), retries=5, backoff_factor=0.2)
+ return openmeteo_requests.Client(session=retry_session)
+
+
</code_context>
<issue_to_address>
**question (performance):** The new client drops the previous HTTP-level caching, which may significantly increase API traffic and latency.
Previously this client used `requests_cache.CachedSession` with a 1-hour TTL, which avoided repeated calls for identical parameters. The new version uses a plain `requests.Session` with retries only. If this behavior change isn’t intentional, consider adding caching back at this layer or documenting that we now expect increased network usage, higher latency, and more exposure to rate limits.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| futures = { | ||
| "ocean": executor.submit(ocean_information, lat, long, dec, unit), | ||
| "uv": executor.submit(get_uv, lat, long, dec, unit), | ||
| "hourly": executor.submit(get_hourly_forecast, lat, long), | ||
| "wind_temp": executor.submit(current_wind_temp, lat, long, dec), | ||
| "rain": executor.submit(get_rain, lat, long), | ||
| "forecast": executor.submit(forecast, lat, long, dec, 7), | ||
| "ocean_hist": executor.submit( | ||
| ocean_information_history, lat, long, dec, unit | ||
| ), |
There was a problem hiding this comment.
issue (bug_risk): Using shared global client and caches from multiple threads may introduce thread-safety issues.
These functions share a single openmeteo_client (requests.Session) and several TTLCache instances, neither of which are inherently thread-safe. Calling them in parallel against shared instances can cause race conditions. If you keep this parallelism, either instantiate separate clients/caches per task/thread or protect shared mutable state with appropriate locking.
| def _create_client() -> openmeteo_requests.Client: | ||
| """Creates a retry-enabled Open-Meteo API client.""" | ||
| retry_session = retry(requests.Session(), retries=5, backoff_factor=0.2) | ||
| return openmeteo_requests.Client(session=retry_session) |
There was a problem hiding this comment.
question (performance): The new client drops the previous HTTP-level caching, which may significantly increase API traffic and latency.
Previously this client used requests_cache.CachedSession with a 1-hour TTL, which avoided repeated calls for identical parameters. The new version uses a plain requests.Session with retries only. If this behavior change isn’t intentional, consider adding caching back at this layer or documenting that we now expect increased network usage, higher latency, and more exposure to rate limits.
General:
Code:
Changes
There are two main changes here:
src/open_meteo.py:openmeteo_clientis imported from other files. The module is only executed once, soopenmeteo_client = _create_client()runs exactly oncesrc/api.py:openmeteo_clientimportgather_data()function makes several API calls, which are synchronous - this was speed up my introducing concurrency viaThreadPoolExecutor, which heavily speeds up the APIs response timeImpact
Response time before the changes (7.9 seconds):
Response time after the changes (2.5 seconds):
Summary by Sourcery
Introduce a shared Open-Meteo client and parallelize data aggregation to improve API response performance.
Enhancements:
Tests: