
Commit b8c7815

Authored by shaypal5, claude, and pre-commit-ci[bot]
Add S3 backend core (#338)
* Add S3 backend core (closes #41)

  Implements a new cachier backend backed by AWS S3 (or any S3-compatible service such as MinIO or localstack). Key changes:

  - src/cachier/cores/s3.py: new _S3Core implementing _BaseCore; stores one pickled CacheEntry per key under <prefix>/<func_str>/<key>.pkl; supports a direct boto3 client, a callable factory, or an auto-created client via region / endpoint_url / Config options; serves async callers via thread delegation from the _BaseCore defaults (boto3 is sync-only).
  - src/cachier/_types.py: add an S3Client type alias; extend the Backend literal with "s3".
  - src/cachier/core.py: wire the s3_bucket, s3_prefix, s3_client, s3_client_factory, s3_region, s3_endpoint_url, and s3_config decorator parameters; import and instantiate _S3Core.
  - pyproject.toml: add an "s3" pytest marker; add [project.optional-dependencies] with per-backend extras (mongo, redis, sql, s3, all).
  - tests/s3_tests/: 18 tests covering basic caching, skip/overwrite, stale_after, next_time, allow_none, entry_size_limit, clear_cache, clear_being_calculated, delete_stale_entries, the client factory, thread safety, and error handling. Uses moto[s3] for offline testing, so no real AWS account is needed.
  - tests/requirements_s3.txt: boto3 + moto[s3] test deps.
  - examples/s3_example.py: runnable demo of basic caching, stale_after, the client factory, and cache management.

  Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  For more information, see https://pre-commit.ci

* codex changes

* add s3 tests to the CI

* fix(async): make wrapper clear methods await-safe for coroutine-decorated functions

  clear_cache/clear_being_calculated on async-decorated wrappers previously returned None, so `await func.clear_cache()` raised TypeError. Return an immediate awaitable for coroutine wrappers while preserving existing sync usage, and add a regression test covering both sync and awaited calls.

* fix(s3): prevent warning escalation from crashing threaded cache paths

  Use a _safe_warn helper for S3 core warning paths so recoverable S3 errors still surface as warnings but do not raise uncaught thread exceptions under pytest -W error (fixes the test_s3_core_threadsafe CI failure).

* feat(test-local): add S3 core support and document local S3 testing in the README

* test(s3): cover the async aget_entry_by_key path in the S3 core

* fix: satisfy ruff SIM105 in the s3 warning helper

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
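The commit message describes the storage layout as one pickled entry per cache key under `<prefix>/<func_str>/<key>.pkl`. A minimal stdlib sketch of that layout (the entry fields and helper name here are illustrative, not cachier's actual `CacheEntry` attributes):

```python
import pickle


def object_key(prefix: str, func_str: str, key: str) -> str:
    """Build the S3 object key for one cache entry, mirroring the
    <prefix>/<func_str>/<key>.pkl layout described above."""
    return f"{prefix}/{func_str}/{key}.pkl"


# One cache entry is stored as a single pickled blob; these field names
# are illustrative only.
entry = {"value": 25, "completed": True, "processing": False}
blob = pickle.dumps(entry)

# The blob would be written with put_object(Bucket=..., Key=..., Body=blob)
# and read back with get_object followed by pickle.loads.
restored = pickle.loads(blob)

print(object_key("cachier", "mymod.expensive", "abc123"))
# → cachier/mymod.expensive/abc123.pkl
print(restored["value"])  # → 25
```

Keeping one object per key means clear_cache can be implemented as a prefix-scoped listing plus deletes, with no index object to keep consistent.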
1 parent d3b000a commit b8c7815

File tree

14 files changed (+1627, −21 lines)


.github/workflows/ci-test.yml

Lines changed: 6 additions & 2 deletions
@@ -25,7 +25,7 @@ jobs:
       matrix:
         python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
         os: ["ubuntu-latest", "macOS-latest", "windows-latest"]
-        backend: ["local", "mongodb", "postgres", "redis"]
+        backend: ["local", "mongodb", "postgres", "redis", "s3"]
       exclude:
         # ToDo: take if back when the connection become stable
         # or resolve using `InMemoryMongoClient`
@@ -65,7 +65,7 @@ jobs:

       - name: Unit tests (local)
         if: matrix.backend == 'local'
-        run: pytest -m "not mongo and not sql and not redis" --cov=cachier --cov-report=term --cov-report=xml:cov.xml
+        run: pytest -m "not mongo and not sql and not redis and not s3" --cov=cachier --cov-report=term --cov-report=xml:cov.xml

       - name: Setup docker (missing on MacOS)
         if: runner.os == 'macOS' && matrix.backend == 'mongodb'
@@ -135,6 +135,10 @@ jobs:
         if: matrix.backend == 'redis'
         run: pytest -m redis --cov=cachier --cov-report=term --cov-report=xml:cov.xml

+      - name: Unit tests (S3)
+        if: matrix.backend == 's3'
+        run: pytest -m s3 --cov=cachier --cov-report=term --cov-report=xml:cov.xml
+
       - name: Upload coverage to Codecov (non PRs)
         continue-on-error: true
         uses: codecov/codecov-action@v5

README.rst

Lines changed: 22 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,7 @@ Current features
5959
* Cross-machine caching using MongoDB.
6060
* SQL-based caching using SQLAlchemy-supported databases.
6161
* Redis-based caching for high-performance scenarios.
62+
* S3-based caching for cross-machine object storage backends.
6263

6364
* Thread-safety.
6465
* **Per-call max age:** Specify a maximum age for cached values per call.
@@ -71,7 +72,6 @@ Cachier is **NOT**:
7172
Future features
7273
---------------
7374

74-
* S3 core.
7575
* Multi-core caching.
7676
* `Cache replacement policies <https://en.wikipedia.org/wiki/Cache_replacement_policies>`_
7777

@@ -580,6 +580,12 @@ Cachier supports Redis-based caching for high-performance scenarios. Redis provi
580580
- ``processing``: Boolean, is value being calculated
581581
- ``completed``: Boolean, is value calculation completed
582582

583+
**S3 Sync/Async Support:**
584+
585+
- Sync functions use direct boto3 calls.
586+
- Async functions are supported via thread-offloaded sync boto3 calls
587+
(delegated mode), not a native async client.
588+
583589
**Limitations & Notes:**
584590

585591
- Requires SQLAlchemy (install with ``pip install SQLAlchemy``)
@@ -631,6 +637,11 @@ async drivers and require the client or engine type to match the decorated funct
631637
- ``redis_client`` must be a sync client or sync callable for sync functions and
632638
an async callable returning a ``redis.asyncio.Redis`` client for async
633639
functions. Passing a sync callable to an async function raises ``TypeError``.
640+
* - **S3**
641+
- Yes
642+
- Yes (delegated)
643+
- Async support is delegated via thread-offloaded sync boto3 calls
644+
(``asyncio.to_thread``). No async S3 client is required.
634645

635646

636647
Contributing
@@ -655,13 +666,14 @@ Install in development mode with test dependencies for local cores (memory and p
655666
cd cachier
656667
pip install -e . -r tests/requirements.txt
657668
658-
Each additional core (MongoDB, Redis, SQL) requires additional dependencies. To install all dependencies for all cores, run:
669+
Each additional core (MongoDB, Redis, SQL, S3) requires additional dependencies. To install all dependencies for all cores, run:
659670

660671
.. code-block:: bash
661672
662673
pip install -r tests/requirements_mongodb.txt
663674
pip install -r tests/requirements_redis.txt
664675
pip install -r tests/requirements_postgres.txt
676+
pip install -r tests/requirements_s3.txt
665677
666678
Running the tests
667679
-----------------
@@ -724,7 +736,7 @@ This script automatically handles Docker container lifecycle, environment variab
724736
.. code-block:: bash
725737
726738
make test-mongo-local # Run MongoDB tests with Docker
727-
make test-all-local # Run all backends with Docker
739+
make test-all-local # Run all backends locally (Docker used for mongo/redis/sql)
728740
make test-mongo-inmemory # Run with in-memory MongoDB (default)
729741
730742
**Option 3: Manual setup**
@@ -750,18 +762,21 @@ Contributors are encouraged to test against a real MongoDB instance before submi
750762
Testing all backends locally
751763
-----------------------------
752764

753-
To test all cachier backends (MongoDB, Redis, SQL, Memory, Pickle) locally with Docker:
765+
To test all cachier backends (MongoDB, Redis, SQL, S3, Memory, Pickle) locally:
754766

755767
.. code-block:: bash
756768
757769
# Test all backends at once
758770
./scripts/test-local.sh all
759771
760-
# Test only external backends (MongoDB, Redis, SQL)
772+
# Test only external backends that require Docker (MongoDB, Redis, SQL)
761773
./scripts/test-local.sh external
762774
775+
# Test S3 backend only (uses moto, no Docker needed)
776+
./scripts/test-local.sh s3
777+
763778
# Test specific combinations
764-
./scripts/test-local.sh mongo redis
779+
./scripts/test-local.sh mongo redis s3
765780
766781
# Keep containers running for debugging
767782
./scripts/test-local.sh all -k
@@ -772,7 +787,7 @@ To test all cachier backends (MongoDB, Redis, SQL, Memory, Pickle) locally with
772787
# Test multiple files across all backends
773788
./scripts/test-local.sh all -f tests/test_main.py -f tests/test_redis_core_coverage.py
774789
775-
The unified test script automatically manages Docker containers, installs required dependencies, and runs the appropriate test suites. The ``-f`` / ``--files`` option allows you to run specific test files instead of the entire test suite. See ``scripts/README-local-testing.md`` for detailed documentation.
790+
The unified test script automatically manages Docker containers for MongoDB/Redis/SQL, installs required dependencies (including ``tests/requirements_s3.txt`` for S3), and runs the appropriate test suites. The ``-f`` / ``--files`` option allows you to run specific test files instead of the entire test suite. See ``scripts/README-local-testing.md`` for detailed documentation.
776791

777792

778793
Running pre-commit hooks locally
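The README changes above describe the S3 backend's "delegated" async mode: sync boto3 calls offloaded to a worker thread with ``asyncio.to_thread``, with no native async client. A minimal stdlib sketch of the pattern, where ``sync_get`` stands in for a blocking boto3 call and is not cachier's actual code:

```python
import asyncio
import time


def sync_get(key: str) -> str:
    """Stand-in for a blocking boto3 get_object call (illustrative)."""
    time.sleep(0.01)  # simulate network I/O
    return f"value-for-{key}"


async def aget_entry_by_key(key: str) -> str:
    # Delegated mode: run the sync call on a worker thread so the event
    # loop is never blocked; no async S3 client is required.
    return await asyncio.to_thread(sync_get, key)


result = asyncio.run(aget_entry_by_key("abc"))
print(result)  # → value-for-abc
```

This keeps a single sync code path for both sync and async decorated functions, at the cost of one thread hop per S3 call.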

examples/s3_example.py

Lines changed: 208 additions & 0 deletions
"""Cachier S3 backend example.

Demonstrates persistent function caching backed by AWS S3 (or any S3-compatible
service). Requires boto3 to be installed::

    pip install cachier[s3]

A real S3 bucket (or a local S3-compatible service such as MinIO / localstack)
is needed to run this example. Adjust the configuration variables below to
match your environment.

"""

import time
from datetime import timedelta

try:
    import boto3

    from cachier import cachier
except ImportError as exc:
    print(f"Missing required package: {exc}")
    print("Install with: pip install cachier[s3]")
    raise SystemExit(1) from exc

# ---------------------------------------------------------------------------
# Configuration - adjust these to your environment
# ---------------------------------------------------------------------------
BUCKET_NAME = "my-cachier-bucket"
REGION = "us-east-1"

# Optional: point to a local S3-compatible service
# ENDPOINT_URL = "http://localhost:9000"  # MinIO default
ENDPOINT_URL = None


# ---------------------------------------------------------------------------
# Helper: verify S3 connectivity
# ---------------------------------------------------------------------------


def _check_bucket(client, bucket: str) -> bool:
    """Return True if the bucket is accessible."""
    try:
        client.head_bucket(Bucket=bucket)
        return True
    except Exception as exc:
        print(f"Cannot access bucket '{bucket}': {exc}")
        return False


# ---------------------------------------------------------------------------
# Demos
# ---------------------------------------------------------------------------


def demo_basic_caching():
    """Show basic S3 caching: the first call computes, the second reads cache."""
    print("\n=== Basic S3 caching ===")

    @cachier(
        backend="s3",
        s3_bucket=BUCKET_NAME,
        s3_region=REGION,
        s3_endpoint_url=ENDPOINT_URL,
    )
    def expensive(n: int) -> int:
        """Simulate an expensive computation."""
        print(f"  computing expensive({n})...")
        time.sleep(1)
        return n * n

    expensive.clear_cache()

    start = time.time()
    r1 = expensive(5)
    t1 = time.time() - start
    print(f"First call: {r1} ({t1:.2f}s)")

    start = time.time()
    r2 = expensive(5)
    t2 = time.time() - start
    print(f"Second call: {r2} ({t2:.2f}s) - from cache")

    assert r1 == r2
    assert t2 < t1
    print("Basic caching works correctly.")


def demo_stale_after():
    """Show stale_after: results expire and are recomputed after the timeout."""
    print("\n=== Stale-after demo ===")

    @cachier(
        backend="s3",
        s3_bucket=BUCKET_NAME,
        s3_region=REGION,
        s3_endpoint_url=ENDPOINT_URL,
        stale_after=timedelta(seconds=3),
    )
    def timed(n: int) -> float:
        print(f"  computing timed({n})...")
        return time.time()

    timed.clear_cache()
    r1 = timed(1)
    r2 = timed(1)
    assert r1 == r2, "Second call should hit cache"

    print("Sleeping 4 seconds so the entry becomes stale...")
    time.sleep(4)

    r3 = timed(1)
    assert r3 > r1, "Should have recomputed after stale period"
    print("Stale-after works correctly.")


def demo_client_factory():
    """Show using a callable factory instead of a pre-built client."""
    print("\n=== Client factory demo ===")

    def make_client():
        """Lazily create a boto3 S3 client."""
        kwargs = {"region_name": REGION}
        if ENDPOINT_URL:
            kwargs["endpoint_url"] = ENDPOINT_URL
        return boto3.client("s3", **kwargs)

    @cachier(
        backend="s3",
        s3_bucket=BUCKET_NAME,
        s3_client_factory=make_client,
    )
    def compute(n: int) -> int:
        return n + 100

    compute.clear_cache()
    assert compute(7) == compute(7)
    print("Client factory works correctly.")


def demo_cache_management():
    """Show clear_cache and overwrite_cache."""
    print("\n=== Cache management demo ===")
    call_count = [0]

    @cachier(
        backend="s3",
        s3_bucket=BUCKET_NAME,
        s3_region=REGION,
        s3_endpoint_url=ENDPOINT_URL,
    )
    def managed(n: int) -> int:
        call_count[0] += 1
        return n * 3

    managed.clear_cache()
    managed(10)
    managed(10)
    assert call_count[0] == 1, "Should have been called once (cached on second call)"

    managed.clear_cache()
    managed(10)
    assert call_count[0] == 2, "Should have recomputed after cache clear"

    managed(10, cachier__overwrite_cache=True)
    assert call_count[0] == 3, "Should have recomputed due to overwrite_cache"
    print("Cache management works correctly.")


# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------


def main():
    """Run all S3 backend demos."""
    print("Cachier S3 Backend Demo")
    print("=" * 50)

    client = boto3.client(
        "s3",
        region_name=REGION,
        **({"endpoint_url": ENDPOINT_URL} if ENDPOINT_URL else {}),
    )

    if not _check_bucket(client, BUCKET_NAME):
        print(f"\nCreate the bucket first: aws s3 mb s3://{BUCKET_NAME} --region {REGION}")
        raise SystemExit(1)

    try:
        demo_basic_caching()
        demo_stale_after()
        demo_client_factory()
        demo_cache_management()

        print("\n" + "=" * 50)
        print("All S3 demos completed successfully.")
        print("\nKey benefits of the S3 backend:")
        print("- Persistent cache survives process restarts")
        print("- Shared across machines without a running service")
        print("- Works with any S3-compatible object storage")
    finally:
        client.close()


if __name__ == "__main__":
    main()

pyproject.toml

Lines changed: 20 additions & 0 deletions
@@ -49,6 +49,25 @@ dependencies = [
   "pympler>=1",
   "watchdog>=2.3.1",
 ]
+
+optional-dependencies.all = [
+  "boto3>=1.26",
+  "pymongo>=4",
+  "redis>=4",
+  "sqlalchemy>=2",
+]
+optional-dependencies.mongo = [
+  "pymongo>=4",
+]
+optional-dependencies.redis = [
+  "redis>=4",
+]
+optional-dependencies.s3 = [
+  "boto3>=1.26",
+]
+optional-dependencies.sql = [
+  "sqlalchemy>=2",
+]
 urls.Source = "https://github.com/python-cachier/cachier"
 # --- setuptools ---

@@ -176,6 +195,7 @@ markers = [
   "pickle: test the pickle core",
   "redis: test the Redis core",
   "sql: test the SQL core",
+  "s3: test the S3 core",
   "maxage: test the max_age functionality",
   "asyncio: marks tests as async",
 ]
