
fix: set explicit max_pool_connections on boto3 S3 client (closes #6) #7

Merged
jensens merged 1 commit into main from fix/s3-max-pool-connections
Apr 20, 2026

@jensens jensens commented Apr 20, 2026

Summary

  • Pass a botocore.config.Config with max_pool_connections=50 (overridable via the PGTHUMBOR_S3_MAX_POOL_CONNECTIONS environment variable) when constructing the boto3 S3 client.
  • Closes #6 — pool-full warnings and connection churn on aaf-6 prod under normal Thumbor load.
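A minimal sketch of the change described above, not the merged diff. The helper names `resolve_max_pool_connections` and `build_s3_client` are hypothetical, and letting an invalid value raise `ValueError` is an assumption, since the PR does not say how bad input is handled. boto3 is imported lazily so the env-var logic can be exercised on its own.

```python
import os

DEFAULT_MAX_POOL_CONNECTIONS = 50
ENV_VAR = "PGTHUMBOR_S3_MAX_POOL_CONNECTIONS"


def resolve_max_pool_connections(environ=os.environ) -> int:
    """Return the pool size: the env var if set, else the default of 50."""
    raw = environ.get(ENV_VAR)
    if raw is None or raw.strip() == "":
        return DEFAULT_MAX_POOL_CONNECTIONS
    # Assumption: a non-integer value simply raises ValueError here.
    return int(raw)


def build_s3_client(region=None, endpoint_url=None):
    """Hypothetical constructor; the real loader's function may differ."""
    # Imported lazily so the helper above works without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(max_pool_connections=resolve_max_pool_connections())
    return boto3.client(
        "s3", region_name=region, endpoint_url=endpoint_url, config=config
    )
```

With the variable unset, `resolve_max_pool_connections({})` returns 50; with `{"PGTHUMBOR_S3_MAX_POOL_CONNECTIONS": "128"}` it returns 128.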

Why this change

boto3's default max_pool_connections is 10. Under typical Thumbor load (30 thumbnails per listing page × active visitors), the urllib3 pool saturates immediately. Every request beyond the pool size then opens a fresh connection (a new TCP and TLS handshake) that urllib3 discards on return instead of reusing. On aaf-6 prod this produced the observed "urllib3.connectionpool: Connection pool is full, discarding connection" warnings, which correlate with intermittent Thumbor 400s from downstream handshake failures.

50 is a reasonable default: it covers asyncio.to_thread's default executor (min(32, os.cpu_count() + 4) threads, so at most 32 on any host) plus headroom for reconnects. The env-var override (PGTHUMBOR_S3_MAX_POOL_CONNECTIONS) keeps it tunable per deployment.
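The sizing argument can be checked with a little arithmetic. This sketch reproduces the formula quoted above for the default thread pool size. The function name `default_executor_workers` is made up for illustration, and newer Python releases may count CPUs slightly differently.

```python
import os


def default_executor_workers(cpu_count=None) -> int:
    """Default worker count per the formula min(32, cpu_count + 4)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return min(32, cpu_count + 4)


POOL_SIZE = 50  # the default chosen in this PR

for cores in (1, 4, 16, 64):
    workers = default_executor_workers(cores)
    print(f"{cores:>2} cores -> {workers:>2} threads, headroom {POOL_SIZE - workers}")
```

Even on a 64-core host the executor is capped at 32 threads, which leaves 18 pooled connections spare.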

Behaviour matrix

| Env var | Effect |
| --- | --- |
| unset (default) | max_pool_connections=50 |
| PGTHUMBOR_S3_MAX_POOL_CONNECTIONS=128 | max_pool_connections=128 |

Scope limitations (out of scope, candidates for a follow-up)

  • Timeouts: Config does not override connect_timeout/read_timeout; boto3 defaults of 60s/60s remain. A hanging S3 connection still blocks a worker for up to 60s. If the pool fix alone doesn't eliminate the residual 400s, tightening to ~5s/30s via the same Config is the next lever.
  • Cache key: _get_s3_client caches by (bucket, region, endpoint). The env var is read once per client construction (first call wins); not worth including in the cache key since it won't vary per process.
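If the follow-up on timeouts is taken up, it might look roughly like this. This is a sketch only: the 5s/30s values are the ones floated above, the helper names are hypothetical, and nothing here is part of this PR. `connect_timeout`, `read_timeout` and `max_pool_connections` are existing `botocore.config.Config` options.

```python
DEFAULT_MAX_POOL_CONNECTIONS = 50


def s3_config_kwargs(
    max_pool_connections=DEFAULT_MAX_POOL_CONNECTIONS,
    connect_timeout=5,   # seconds; botocore's default is 60
    read_timeout=30,     # seconds; botocore's default is 60
) -> dict:
    """Collect the Config options in one place (hypothetical helper)."""
    return {
        "max_pool_connections": max_pool_connections,
        "connect_timeout": connect_timeout,
        "read_timeout": read_timeout,
    }


def build_config(**overrides):
    """Build the botocore Config; imported lazily to keep the helper testable."""
    from botocore.config import Config

    return Config(**s3_config_kwargs(**overrides))
```

Keeping the three settings in one `Config` means a single object is passed to `boto3.client`, so the pool fix and any later timeout change stay in the same place.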

Test plan

  • New test test_default_max_pool_connections: mocks boto3.client, asserts kwargs["config"].max_pool_connections == 50 with the env var unset.
  • New test test_env_override_max_pool_connections: env var =128 → asserts max_pool_connections == 128.
  • Existing S3/moto/integration tests unchanged and passing (80/80).
  • Production verification on aaf-6: urllib3.connectionpool: Connection pool is full warnings should disappear; Thumbor 400 rate should drop.
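The two new tests could be sketched as follows. To stay runnable without boto3 installed, this version injects the client and config constructors instead of mocking `boto3.client` as the actual tests do. `build_s3_client` and `captured_max_pool` are hypothetical stand-ins, not the project's code.

```python
import os
from types import SimpleNamespace
from unittest import mock

ENV_VAR = "PGTHUMBOR_S3_MAX_POOL_CONNECTIONS"


def build_s3_client(client_factory, config_factory):
    """Hypothetical stand-in for the loader's client construction."""
    max_pool = int(os.environ.get(ENV_VAR, "50"))
    return client_factory("s3", config=config_factory(max_pool_connections=max_pool))


def captured_max_pool(env):
    """Build a client under the given environment and return the pool size used."""
    client_factory = mock.Mock()
    # clear=True hides any real value of the variable for the duration of the call.
    with mock.patch.dict(os.environ, env, clear=True):
        build_s3_client(client_factory, SimpleNamespace)
    return client_factory.call_args.kwargs["config"].max_pool_connections


def test_default_max_pool_connections():
    assert captured_max_pool({}) == 50


def test_env_override_max_pool_connections():
    assert captured_max_pool({ENV_VAR: "128"}) == 128
```

In the real suite, `SimpleNamespace` would be replaced by `botocore.config.Config` and the injected factory by a patch on `boto3.client`.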

One small factual note

Issue #6 mentions asyncio.to_thread's default executor has "40 threads". Since Python 3.8 the default is actually min(32, os.cpu_count() + 4), so never more than 32 (8 on a 4-core host, for example). This doesn't affect the recommendation, since 50 still covers it with room to spare, but the comment in the source now uses the accurate figure.

🤖 Generated with Claude Code

boto3's default max_pool_connections=10 is too low for concurrent
Thumbor image loads (30 thumbnails per listing page times active
visitors saturates immediately). The resulting pool overflow causes
connection churn (every overflow request does a fresh TCP+TLS
handshake and urllib3 logs a pool-full warning), which on aaf-6
prod correlates with intermittent Thumbor 400s.

Pass a botocore.config.Config with max_pool_connections=50 by default,
overridable via PGTHUMBOR_S3_MAX_POOL_CONNECTIONS. 50 covers
asyncio.to_thread's default executor (min(32, cpu+4)) plus headroom.

Closes #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jensens jensens merged commit a4dcf9e into main Apr 20, 2026
4 checks passed
@jensens jensens deleted the fix/s3-max-pool-connections branch April 20, 2026 15:07

Development

Successfully merging this pull request may close these issues.

boto3 S3 client created without max_pool_connections — pool exhausts under load
