Commit 8b2260c

ekremney and claude authored
feat(seo-client): fan-out queries across markets using site region (#1522)
## Summary

The SEO provider requires a `database` (country) parameter on every API call; unlike the previous provider (Ahrefs), there is no "global" mode. Until now, all queries were hardcoded to `database=us`, which meant non-US domains (e.g. `www.deceuninck.es`) returned little to no data.

This PR makes the SEO client locale-aware by accepting the site's `region` (from `Site.getRegion()`, ISO 3166-1 alpha-2) and fanning out queries across multiple markets to build a global traffic picture.

## Why fan-out instead of single-region queries?

The site's region tells us _where the site primarily operates_, but organic traffic is not confined to a single country. A `.es` domain may rank in `es`, `fr`, `it`, and `br` simultaneously. Querying only the site's region would undercount traffic just as badly as querying only `us`.

**Smoke test data for `adobe.com` (region=US):**

| Method | Single DB (us) | Fan-out (12 DBs) |
|--------|---------------|-------------------|
| `getTopPages` top traffic | 72,331,413 | 72,331,413 |
| `getMetrics` org_traffic | ~48M | **153,060,534** |
| `getMetrics` org_keywords | ~11.5M | **26,101,114** |
| `getPaidPages` top_keyword_country | always `US` | `UK`, `US`, `IN` (actual source) |

For `www.deceuninck.es` (region=ES), the previous `database=us` returned **zero results**. With fan-out, the ES database returns 10 pages with traffic data, and the other markets gracefully return nothing.

## What changed

### New: `fanOut(items, fn, operation)` resilience primitive

A single batched fan-out method (batch size 10) that all multi-market methods now use, including `getBrokenBacklinks`, which previously had its own inline batching loop. Each call to `fn(item)` already has per-request retry with exponential backoff via `sendRawRequest`; `fanOut` adds:

- Batched `Promise.allSettled` to respect rate limits
- Consistent `log.warn` for items that fail after all retries
- Fulfilled results collected with their key for downstream merge

### New: `getDatabases(region)` helper

Builds the query set: `BIG_MARKETS` + site region if not already present.

`BIG_MARKETS = ['us', 'uk', 'de', 'fr', 'es', 'it', 'br', 'ca', 'au', 'in', 'jp', 'nl']` (12 major SEO provider databases by search volume).

If the site's region is already in `BIG_MARKETS` (e.g. `ES`), no duplication. If not (e.g. `CZ`), it's appended as a 13th database.

### Refactored: positional params → options bags

All updated methods now use a clean options bag instead of positional parameters. Callers no longer need to pass `undefined` placeholders to reach later parameters:

| Method | Signature |
|--------|-----------|
| `getTopPages` | `(url, { limit, region })` |
| `getPaidPages` | `(url, { date, limit, region })` |
| `getMetrics` | `(url, { date, region })` |
| `getOrganicTraffic` | `(url, { startDate, endDate, region })` |
| `getBrokenBacklinks` | No signature change (refactored to use `fanOut`) |
| `getOrganicKeywords` | No change needed (already uses options bag) |

### Merge strategies

| Method | Merge strategy |
|--------|----------------|
| `getTopPages` | Sum `sum_traffic` per URL across DBs, first keyword wins |
| `getPaidPages` | Sum traffic per URL, `top_keyword_country` reflects actual DB |
| `getMetrics` | Sum all numeric fields across DBs |
| `getOrganicTraffic` | Group by date, sum all fields across DBs |

### Fixed: `lastMonthISO()` default date

`getMetrics` and `getPaidPages` previously defaulted to `todayISO()`, but the SEO provider publishes monthly snapshots with a delay, so the current month has no data yet. Changed default to `lastMonthISO()` (1st of previous month) so callers without an explicit date get the most recent available data.

## How callers use the region

```js
const site = await dataAccess.getSiteById(siteId);
const region = site.getRegion(); // ISO 3166-1 alpha-2, e.g. 'ES', 'CZ', or null

const topPages = await seoClient.getTopPages(url, { limit: 200, region });
const metrics = await seoClient.getMetrics(url, { region });
const traffic = await seoClient.getOrganicTraffic(url, { startDate, endDate, region });
const paid = await seoClient.getPaidPages(url, { limit: 200, region });
```

All methods are backward-compatible: the options bag is optional and defaults to querying only big markets.

## Test plan

- [x] Unit tests: 140 passing, `client.js` at 100% lines/statements/functions, 97.5% branches
- [x] Smoke tested against live API with `adobe.com` (US) and `www.deceuninck.es` (ES)
- [ ] Verify `getMetrics`/`getPaidPages` return data without explicit date (lastMonthISO fix)
- [ ] Verify non-US domain returns data (was zero before)
- [ ] Verify `top_keyword_country` in `getPaidPages` reflects actual market, not hardcoded US

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
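For readers without the diff open, here is a rough sketch of how the pieces described in this commit message might fit together. It is not the code from `client.js`. The names `getDatabases`, `fanOut`, `lastMonthISO`, and `BIG_MARKETS` come from the message, but the function bodies, the `log` parameter, the `sumNumericFields` helper, the plain lowercasing of the region, and the `YYYY-MM-DD` date format are all assumptions made for illustration.

```javascript
// Illustrative sketch only: bodies are assumptions, not the actual client.js.
const BIG_MARKETS = ['us', 'uk', 'de', 'fr', 'es', 'it', 'br', 'ca', 'au', 'in', 'jp', 'nl'];
const BATCH_SIZE = 10; // batch size stated in the commit message

// Big markets plus the site's region (assumed here to map by simple lowercasing),
// appended only when it is not already in the list.
function getDatabases(region) {
  const db = typeof region === 'string' ? region.trim().toLowerCase() : '';
  return db && !BIG_MARKETS.includes(db) ? [...BIG_MARKETS, db] : [...BIG_MARKETS];
}

// Runs fn(item) in batches of BATCH_SIZE. Fulfilled results are kept with their
// key; rejected ones are logged and skipped. `log` is a hypothetical parameter.
async function fanOut(items, fn, operation, log = console) {
  const results = [];
  for (let i = 0; i < items.length; i += BATCH_SIZE) {
    const batch = items.slice(i, i + BATCH_SIZE);
    // The async wrapper turns synchronous throws into rejections as well.
    const settled = await Promise.allSettled(batch.map(async (item) => fn(item)));
    settled.forEach((outcome, idx) => {
      if (outcome.status === 'fulfilled') {
        results.push({ key: batch[idx], value: outcome.value });
      } else {
        log.warn(`${operation} failed for ${batch[idx]}: ${outcome.reason}`);
      }
    });
  }
  return results;
}

// Hypothetical helper for the "sum all numeric fields across DBs" strategy.
function sumNumericFields(results) {
  const totals = {};
  for (const { value } of results) {
    for (const [field, n] of Object.entries(value ?? {})) {
      if (typeof n === 'number' && Number.isFinite(n)) {
        totals[field] = (totals[field] ?? 0) + n;
      }
    }
  }
  return totals;
}

// First day of the previous month in UTC, formatted as YYYY-MM-DD (assumed format).
function lastMonthISO(now = new Date()) {
  const first = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - 1, 1));
  return first.toISOString().slice(0, 10);
}
```

In this sketch a failing market costs only its own contribution: `fanOut` resolves with whatever succeeded, and the merge step sums what is there. That matches the behaviour described above, where markets with no data "gracefully return nothing".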
1 parent f5ef97f commit 8b2260c

6 files changed

Lines changed: 734 additions & 250 deletions


0 commit comments
