feat(llmo): route URL Inspector citation data through Semrush BP API (LLMO-5180) by claudiaboldis · Pull Request #2588 · adobe/spacecat-api-service

claudiaboldis · 2026-06-10T13:40:31Z

Summary

Routes URL Inspector citation data (citations, promptsCited) through Semrush BP reporting API via the Serenity proxy, while agentic and referral traffic stay in Mysticat
Falls back to existing Mysticat RPC behaviour for orgs without a Semrush workspace (non-Semrush customers unaffected)
Confirms the URL-contains filter (col: "source", op: "contains") that was the open blocker for this ticket

Changes

src/support/serenity/rest-transport.js — new queryBrandPresenceResults() transport method
src/support/serenity/handlers/brand-presence.js (new) — fans out BP citation queries across all project slices, merges results
src/controllers/llmo/llmo-url-inspector.js — stats, owned-urls, and url-prompts handlers now source citation data from Semrush when semrushWorkspaceId is present
test/controllers/llmo/llmo-url-inspector.test.js — 11 new tests; all 97 pass

Env var required

SEMRUSH_BP_ELEMENT_ID must be set in Vault (dx_mysticat/<env>/api-service) before deploying — identifies the BP reporting element in the Semrush workspace. Obtain from Semrush team.

Test plan

All 97 existing + new unit tests pass (npm test)
Null-workspace path verified — non-Semrush customers see identical behaviour
Manual test against dev with a Semrush-onboarded org: stats, owned-urls, and URL drilldown show Semrush-sourced citation counts
Agentic + referral columns on owned-urls table unchanged

Jira: LLMO-5180

🤖 Generated with Claude Code

codecov · 2026-06-10T13:56:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

github-actions · 2026-06-11T03:27:49Z

This PR will trigger a minor release when merged.

MysticatBot

Hey @claudiaboldis,

Verdict: Request changes - solid integration with clean fallback semantics, but the owned-urls fan-out has an error isolation gap and the domain filter parameter looks semantically wrong.
Changes: Routes URL Inspector citation data (stats, owned-urls, url-prompts) through the Semrush BP reporting API via Serenity proxy, with graceful fallback to Mysticat for non-onboarded orgs (6 files).

Must fix before merge

[Important] Unbounded fan-out with no per-item error isolation in owned-urls handler - src/controllers/llmo/llmo-url-inspector.js:370 (details inline)
[Important] Full URL passed as domain to a CBF_domain eq filter - potential semantic mismatch - src/controllers/llmo/llmo-url-inspector.js:226 (details inline)
[Important] Broad catch swallows configuration errors (503 from missing env var) alongside transient failures - src/controllers/llmo/llmo-url-inspector.js:235 (details inline)
[Important] Missing integration tests for modified endpoints - per CLAUDE.md rule: "New or modified endpoints must include integration tests in test/it/". The three URL Inspector endpoints now have a conditional Semrush data-sourcing path that is not exercised by any integration test.

Non-blocking (2): minor issues and suggestions

suggestion: Extract a shared helper for the repeated Semrush integration block (workspace check, token extraction, transport creation, query, catch/fallback) to reduce the three-location maintenance surface - src/controllers/llmo/llmo-url-inspector.js
nit: The new queryBrandPresenceResults method skips API_PREFIX and uses a hardcoded /apis/v4-raw/external-api/v1/ path unlike all other transport methods. Add a named constant so the divergence is visible when proxy routing changes - src/support/serenity/rest-transport.js:304

_{Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 3m 5s | Cost: $4.20 | Commit: 4e7c47ada72fa4013f842cfefd36ee40e5d28f6a
If this code review was useful, please react with 👍. Otherwise, react with 👎.}

MysticatBot · 2026-06-11T08:32:56Z

+        try {
+          const imsToken = ctx?.pathInfo?.headers?.authorization?.replace(/^Bearer\s+/i, '');
+          const transport = createSerenityTransport({ env: ctx.env, imsToken });
+          const urlsWithSemrush = await Promise.all(


issue (blocking): Unbounded fan-out with no per-item error isolation. The Promise.all fires one queryBpCitationsByUrl call per URL row, and each of those internally fans out across all BrandSemrushProject slices. For N URLs with M market slices, this is N*M concurrent HTTP calls per request. More critically, if ANY single URL's call fails, the entire Promise.all rejects and the outer catch discards ALL Semrush data for every URL - including the ones that succeeded.

How to fix: Add per-URL error handling inside the .map() callback so one failure preserves results for other URLs:

urlsFromMysticat.map(async (urlRow) => { try { const bpResult = await queryBpCitationsByUrl(...); return bpResult ? { ...urlRow, citations: bpResult.citations, promptsCited: bpResult.promptsCited } : urlRow; } catch { return urlRow; } })

Also consider adding a concurrency limiter (e.g. p-limit) for large URL sets.

Fixed: added per-URL try/catch inside the .map() callback so one failed URL preserves Mysticat data for that row while others proceed. Re-throws status===503 errors upward. Also moved imsToken/transport construction outside the Promise.all (shared across rows).

Fixed: added per-URL try/catch inside the .map() callback so one failed URL preserves Mysticat data for that row while others proceed. Re-throws status===503 errors upward. Also moved imsToken/transport construction outside Promise.all (shared across rows).

MysticatBot · 2026-06-11T08:32:56Z

+            filterByBrandId,
+            semrushWorkspaceId,
+            ctx.env,
+            {


issue (blocking): Full URL passed as domain to a CBF_domain eq filter. In all three handlers, the domain field receives a full URL (e.g. https://example.com/page) or a site ID - not a bare domain. This value flows into buildRenderData as { op: "eq", val: domain, col: "CBF_domain" }. If the upstream Semrush API's CBF_domain column stores actual domains, this equality filter will never match and citations will silently return zero (no error thrown, just empty results).

The PR description confirms the source contains filter was validated, but does not mention verifying the CBF_domain eq filter.

How to fix: Verify with the Semrush API what CBF_domain expects. If it is a bare domain, extract it: new URL(urlRow.url).hostname. If it expects a full URL, rename the parameter to avoid confusion.

Fixed: now extracting hostname via new URL(url).hostname for the CBF_domain eq filter in all three handlers. Falls back to the raw value if the URL parse fails (e.g. for non-URL siteIds). urlFragment still passes the full URL for the source contains filter.

MysticatBot · 2026-06-11T08:32:56Z

+          );
+          if (bpResult) {
+            stats.totalCitations = bpResult.citations;
+            stats.totalPromptsCited = bpResult.promptsCited;


issue (blocking): Catch block swallows configuration errors alongside transient failures. The ErrorWithStatusCode(503) thrown by resolveElementId when SEMRUSH_BP_ELEMENT_ID is missing signals a permanent deployment misconfiguration - not a transient upstream failure. Catching it here means every request from a Semrush-onboarded org silently returns stale Mysticat data with only a log line distinguishing it from a network glitch.

How to fix: Either check for config errors and re-throw:

} catch (e) { if (e instanceof ErrorWithStatusCode && e.status === 503) throw e; ctx.log.error(...); }

Or call resolveElementId(ctx.env) outside the try/catch so it fails fast before attempting the query. The same pattern applies to the other two handlers.

Fixed: added 'if (e?.status === 503) throw e;' before the log+fallback in all three handlers, so missing SEMRUSH_BP_ELEMENT_ID surfaces as a 503 rather than silently returning Mysticat data.

…(LLMO-5180) - Add queryBrandPresenceResults to SerenityTransport for the BP reporting endpoint - Add src/support/serenity/handlers/brand-presence.js with queryBpCitationsByUrl that fans out across all BrandSemrushProject slices and merges citation/prompt results - Modify stats, owned-urls, and url-prompts handlers: when the org has a Semrush workspace and a brand is scoped, fetch citation data from Semrush; fall back to Mysticat RPCs when workspace is null or Semrush throws - Agentic + referral traffic columns remain sourced from Mysticat throughout - 11 new esmock-based integration tests covering Semrush path, null-workspace fallback, and error fallback for all three modified handlers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…sh integration - Extract hostname from full URL for CBF_domain eq filter (domain != full URL) - Add per-URL error isolation in owned-urls fan-out so one failure preserves other rows - Re-throw ErrorWithStatusCode(503) config errors instead of falling back silently - Add tests: 503 re-throw, per-URL isolation, hostname extraction verification Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ved enc/request helpers The file was refactored to use typed openapi-fetch clients, removing the hand-rolled enc() and request() helpers. The v4-raw endpoint is outside the typed client's path space so use createTimeoutFetch + native fetch directly instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claudiaboldis temporarily deployed to dev-branches June 10, 2026 13:52 — with GitHub Actions Inactive

claudiaboldis temporarily deployed to dev-branches June 11, 2026 03:41 — with GitHub Actions Inactive

claudiaboldis temporarily deployed to dev-branches June 11, 2026 05:10 — with GitHub Actions Inactive

claudiaboldis requested a review from rainer-friederich June 11, 2026 06:54

claudiaboldis temporarily deployed to dev-branches June 11, 2026 07:08 — with GitHub Actions Inactive

rainer-friederich requested a review from MysticatBot June 11, 2026 08:21

MysticatBot requested changes Jun 11, 2026

View reviewed changes

MysticatBot added the ai-reviewed Reviewed by AI label Jun 11, 2026

claudiaboldis temporarily deployed to dev-branches June 11, 2026 09:34 — with GitHub Actions Inactive

claudiaboldis force-pushed the feat/llmo-5180-url-inspector-semrush-bp branch from 40eb5f2 to c688399 Compare June 18, 2026 08:44

claudiaboldis temporarily deployed to dev-branches June 18, 2026 09:13 — with GitHub Actions Inactive

cboldis and others added 6 commits July 2, 2026 19:37

test(llmo): add brand-presence handler unit tests for LLMO-5180 coverage

d4299ef

test(llmo): cover null bpResult branches in URL Inspector for LLMO-5180

f3a9c00

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(llmo): cover URL parse fallback branch in owned-urls handler

1b3495a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(llmo): repair rest-transport test conflict resolution (LLMO-5180)

6bb2e28

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claudiaboldis force-pushed the feat/llmo-5180-url-inspector-semrush-bp branch from c770e8c to 6bb2e28 Compare July 2, 2026 16:38

claudiaboldis temporarily deployed to dev-branches July 2, 2026 16:57 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(llmo): route URL Inspector citation data through Semrush BP API (LLMO-5180)#2588

feat(llmo): route URL Inspector citation data through Semrush BP API (LLMO-5180)#2588
claudiaboldis wants to merge 7 commits into
mainfrom
feat/llmo-5180-url-inspector-semrush-bp

claudiaboldis commented Jun 10, 2026

Uh oh!

codecov Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

MysticatBot left a comment •

edited

Loading

Uh oh!

MysticatBot Jun 11, 2026

Uh oh!

claudiaboldis Jun 11, 2026

Uh oh!

claudiaboldis Jun 11, 2026

Uh oh!

MysticatBot Jun 11, 2026

Uh oh!

claudiaboldis Jun 11, 2026

Uh oh!

MysticatBot Jun 11, 2026

Uh oh!

claudiaboldis Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

claudiaboldis commented Jun 10, 2026

Summary

Changes

Env var required

Test plan

Uh oh!

codecov Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

MysticatBot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Must fix before merge

Uh oh!

MysticatBot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

claudiaboldis Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

claudiaboldis Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

MysticatBot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

claudiaboldis Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

MysticatBot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

claudiaboldis Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Jun 10, 2026 •

edited

Loading

MysticatBot left a comment •

edited

Loading