| 1 | +# EasyCLA: Author and Co-author Caching + Large-PR Support |
| 2 | + |
| 3 | +- **Two-level caching** for authors and co-authors: identity alone, and identity plus per-project signature decisions. |
| 4 | +- **GraphQL-based commit ingestion** that handles PRs with **250+ commits**. |
| 5 | + |
| 6 | +--- |
| 7 | + |
| 8 | +## Why it matters |
| 9 | +- Faster PR checks and `/easycla` re-runs. |
| 10 | +- Lower DB/API load via memoized decisions. |
| 11 | +- Stable, deterministic output and accurate status posting on the PR **head SHA**. |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## Caching |
| 16 | +- **General cache** (key → value): `(author_id, lower(login), lower(email)) → (user | None)` |
| 17 | +- **Per-project cache** (key → value): `(project_id, author_id, lower(login), lower(email)) → (user | None, authorized, affiliated)` |
| 18 | +- **TTL policy**: positives **~24h**; negative/uncertain states use **Quick TTL = 5m**. |
| 19 | +- **Flow**: per-project cache → general cache → cold DB path. Results are stored back with the appropriate TTL. |
| 20 | +- Thread-safe; expired entries are purged periodically (once per hour). |
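
The lookup flow above (per-project cache, then general cache, then cold DB path) can be sketched roughly as follows. This is a minimal illustration, not the EasyCLA implementation: `TTLCache`, `lookup`, `find_user`, and `check_signature` are hypothetical names, `POSITIVE_CACHE_TTL` assumes "~24h" means exactly 24 hours, and a per-project result is treated as "positive" only when the user is found and authorized.

```python
import threading
import time

QUICK_CACHE_TTL = 300             # seconds; negative/uncertain states
POSITIVE_CACHE_TTL = 24 * 3600    # assumed exact value for "~24h"


class TTLCache:
    """Thread-safe in-memory cache where every entry carries its own expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._lock = threading.Lock()
        self._clock = clock

    def get(self, key):
        """Return (hit, value). A cached None counts as a (negative) hit."""
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return False, None
            value, expires_at = entry
            if self._clock() >= expires_at:
                del self._data[key]
                return False, None
            return True, value

    def set(self, key, value, ttl):
        with self._lock:
            self._data[key] = (value, self._clock() + ttl)

    def cleanup(self):
        """Purge expired entries; meant to be called periodically (e.g. hourly)."""
        now = self._clock()
        with self._lock:
            expired = [k for k, (_, exp) in self._data.items() if now >= exp]
            for k in expired:
                del self._data[k]
            return len(expired)


def general_key(author_id, login, email):
    return (author_id, (login or "").lower(), (email or "").lower())


def project_key(project_id, author_id, login, email):
    return (project_id,) + general_key(author_id, login, email)


def lookup(project_cache, general_cache, project_id, author, find_user, check_signature):
    """author is (author_id, login, email); returns (user, authorized, affiliated)."""
    pkey = project_key(project_id, *author)
    hit, decision = project_cache.get(pkey)
    if hit:
        return decision

    gkey = general_key(*author)
    hit, user = general_cache.get(gkey)
    if not hit:
        user = find_user(*author)  # cold DB path (hypothetical callable)
        ttl = POSITIVE_CACHE_TTL if user is not None else QUICK_CACHE_TTL
        general_cache.set(gkey, user, ttl)

    if user is None:
        decision = (None, False, False)
    else:
        authorized, affiliated = check_signature(project_id, user)
        decision = (user, authorized, affiliated)

    # Assumption: only "found and authorized" gets the long TTL.
    ttl = POSITIVE_CACHE_TTL if decision[1] else QUICK_CACHE_TTL
    project_cache.set(pkey, decision, ttl)
    return decision
```

Keeping negative results on the short TTL means a contributor who signs shortly after a failed check is picked up within about five minutes, while confirmed signers avoid repeated DB reads for a day.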
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## Large PR (250+) support |
| 25 | +- Switch to **GitHub GraphQL** for commits (`pageSize=100`) with cursor paging. |
| 26 | +- Parallel processing via thread pool; co-authors parsed from **commit messages** (`Co-authored-by:`). |
| 27 | +- Final actor lists are **de-duplicated** and **sorted** by (login, name, email, sha) for stable comments. |
| 28 | +- PR comments are **edited only when the normalized body changes** (prevents churn and size bloat). |
| 29 | +- Commit statuses are always posted to the **true PR head SHA**. |
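
As an illustration of the co-author parsing and the de-duplicated, sorted actor list described above, here is a minimal sketch. The commit dictionary shape, the function names, and the choice of `(lower(login), lower(email))` as the de-duplication identity are assumptions made for this example, not details confirmed by the source.

```python
import re

# Git trailer convention recognized by GitHub: "Co-authored-by: Name <email>"
CO_AUTHOR_RE = re.compile(
    r"^co-authored-by:[ \t]*(?P<name>[^<\n]*?)[ \t]*<(?P<email>[^>\n]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_co_authors(message):
    """Return [(name, lowercased_email), ...] for each co-author trailer."""
    return [
        (m.group("name").strip(), m.group("email").strip().lower())
        for m in CO_AUTHOR_RE.finditer(message)
    ]


def collect_actors(commits):
    """Build a de-duplicated, sorted list of (login, name, email, sha) tuples.

    `commits` is an iterable of dicts with keys sha, login, name, email,
    message (a hypothetical shape). The first commit in which an identity
    appears supplies its sha.
    """
    seen = set()
    actors = []
    for c in commits:
        candidates = [
            (c.get("login") or "", c.get("name") or "", (c.get("email") or "").lower())
        ]
        # Co-authors come from the commit message and carry no login here.
        candidates += [
            ("", name, email) for name, email in parse_co_authors(c.get("message") or "")
        ]
        for login, name, email in candidates:
            identity = (login.lower(), email)  # assumed de-duplication key
            if identity in seen:
                continue
            seen.add(identity)
            actors.append((login, name, email, c["sha"]))
    # Sort by (login, name, email, sha) so the rendered comment is deterministic.
    return sorted(actors, key=lambda a: (a[0].lower(), a[1].lower(), a[2], a[3]))
```

For a fixed commit order, sorting the final list makes the rendered comment body identical across re-runs, which is what allows an "edit only when the body changes" check to work.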
| 30 | + |
| 31 | +--- |
| 32 | + |
| 33 | +## Operational notes |
| 34 | +- Expect noticeable **latency reduction** on large PRs and repeated checks. |
| 35 | +- Fallbacks remain safe; unknown users land in an “Unknown” bucket with guidance. |
| 36 | +- No behavior change to the core signing rules—only faster execution. |
| 37 | + |
| 38 | +--- |
| 39 | + |
| 40 | +## Quick constants |
| 41 | +- `QUICK_CACHE_TTL = 300` seconds (negative/uncertain states). |
| 42 | +- Default positive cache TTL ≈ **24 hours**. |
| 43 | +- GraphQL: `pageSize=100`, parallel workers tuned for throughput. |
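
To show how `pageSize=100` cursor paging might look, here is a minimal sketch. The query selects fields from GitHub's GraphQL schema for pull request commits, but the exact selection EasyCLA uses is not given in this document, so treat it as an assumption. `iter_pr_commits` and the injected `run_query(query, variables)` callable are hypothetical; the callable is expected to send the request and return the response's `data` object.

```python
COMMITS_QUERY = """
query($owner: String!, $name: String!, $number: Int!, $pageSize: Int!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      commits(first: $pageSize, after: $cursor) {
        pageInfo { hasNextPage endCursor }
        nodes {
          commit {
            oid
            message
            author { name email user { login databaseId } }
          }
        }
      }
    }
  }
}
"""

PAGE_SIZE = 100


def iter_pr_commits(run_query, owner, name, number, page_size=PAGE_SIZE):
    """Yield every commit of a PR, following the cursor until the last page."""
    cursor = None  # None requests the first page
    while True:
        data = run_query(
            COMMITS_QUERY,
            {
                "owner": owner,
                "name": name,
                "number": number,
                "pageSize": page_size,
                "cursor": cursor,
            },
        )
        connection = data["repository"]["pullRequest"]["commits"]
        for node in connection["nodes"]:
            yield node["commit"]
        if not connection["pageInfo"]["hasNextPage"]:
            return
        cursor = connection["pageInfo"]["endCursor"]
```

With this shape, a 250-commit PR costs three requests (100 + 100 + 50). Injecting `run_query` keeps the paging logic testable without network access.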