Tighten boto3 S3 timeouts (connect/read)

## Follow-up to #6

After landing the `max_pool_connections=50` fix in [#7](https://github.com/bluedynamics/zodb-pgjsonb-thumborblobloader/pull/7) / released as [v0.4.2](https://github.com/bluedynamics/zodb-pgjsonb-thumborblobloader/releases/tag/v0.4.2), the pool-saturation class of failure should be gone. If a residual error rate remains under load, the next likely culprit is the boto3 default **request timeouts**: `connect_timeout=60s` and `read_timeout=60s`.

## Why this matters

A hanging S3 connection (bad network, throttling, endpoint hiccup) currently keeps a Thumbor worker blocked for up to 60 s. Under a brief S3-side blip this cascades into:

- `asyncio.to_thread` worker threads all stuck on S3 I/O.
- Incoming Thumbor requests queue behind them.
- Varnish/CDN keepalive begins retrying — amplifying the stuck load.
- Overall frontend slowdown proportional to `connect_timeout + read_timeout` per hung request.

A tighter budget bounds the blast radius: fail fast, free the worker, let the client retry (or fall through to the PG path if both are available).

## Proposal

Extend the `botocore.Config` already introduced in #7 with timeout settings, both env-var overridable:

```python
from botocore.config import Config

max_pool = int(os.environ.get("PGTHUMBOR_S3_MAX_POOL_CONNECTIONS", "50"))
connect_timeout = float(os.environ.get("PGTHUMBOR_S3_CONNECT_TIMEOUT", "5"))
read_timeout = float(os.environ.get("PGTHUMBOR_S3_READ_TIMEOUT", "30"))

kwargs["config"] = Config(
    max_pool_connections=max_pool,
    connect_timeout=connect_timeout,
    read_timeout=read_timeout,
    retries={"max_attempts": 2, "mode": "adaptive"},
)
```

Suggested defaults:

| Setting | Default | Rationale |
|---|---|---|
| `connect_timeout` | `5 s` | Fast fail on unreachable endpoints; legitimate TLS handshakes are well under this |
| `read_timeout` | `30 s` | Covers typical blob sizes (KB–low-MB) on sane connections |
| `retries.max_attempts` | `2` | One retry for truly transient errors; keep latency bounded |
| `retries.mode` | `adaptive` | Honours throttling signals from S3 |

All three env-var overridable for ops tuning.

## Interaction with existing 4xx/5xx microcache (#5)

If a timeout causes the loader to return `None` → Thumbor emits a 400/5xx → the [`PGTHUMBOR_CACHE_CONTROL_ERROR`](https://github.com/bluedynamics/zodb-pgjsonb-thumborblobloader/blob/main/src/zodb_pgjsonb_thumborblobloader/auth_handler.py) microcache (10 s) absorbs the repeat-requests, so even during an S3 incident the backend doesn't get hammered. The two fixes compound nicely.

## Test plan

- Unit: patch `boto3.client`, assert `kwargs["config"].connect_timeout == 5` / `read_timeout == 30` with env unset.
- Unit: env override for each, asserted individually.
- Integration (optional): use `moto` with a deliberate slow response and assert the loader returns `None` within the configured budget rather than hanging.

## Not yet prioritised because

The pool fix alone already addressed the observed symptoms on aaf-6 prod in the scope of #6. Open this as a standing task for the next time an S3-side hiccup causes visible latency spikes — at which point these defaults become the mitigation.

Setting	Default	Rationale
`connect_timeout`	`5 s`	Fast fail on unreachable endpoints; legitimate TLS handshakes are well under this
`read_timeout`	`30 s`	Covers typical blob sizes (KB–low-MB) on sane connections
`retries.max_attempts`	`2`	One retry for truly transient errors; keep latency bounded
`retries.mode`	`adaptive`	Honours throttling signals from S3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tighten boto3 S3 timeouts (connect/read) #9

Follow-up to #6

Why this matters

Proposal

Interaction with existing 4xx/5xx microcache (#5)

Test plan

Not yet prioritised because

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Tighten boto3 S3 timeouts (connect/read) #9

Description

Follow-up to #6

Why this matters

Proposal

Interaction with existing 4xx/5xx microcache (#5)

Test plan

Not yet prioritised because

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions