docs: ADR-002 preflight API REST redesign#2377
Conversation
Captures the decision to replace /preflight/jobs and /preflight/beta/jobs with site-scoped endpoints under /sites/:siteId/preflights. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
This PR will trigger no release when merged. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AsyncJob has no siteId index; GET /sites/:siteId/preflights requires either extending AsyncJob or a new Preflight entity in shared-data-access. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eId index AsyncJob TTL (~7 days) makes a purpose-built Preflight entity unnecessary. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Existing job creation paths must not be affected; new endpoints populate siteId at creation time and query by it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Capture IMS caller identity server-side at job creation for audit purposes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sponse Per RFC 7231 §6.3.3 the created resource URL belongs in the Location response header, not a custom body field. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add error response table to POST endpoint (400/403/404/500) - Disabled-site case returns 403 immediately; no job record created - Previous CANCELLED-job behavior on disabled preflight is explicitly removed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both /preflight/beta/jobs and /preflight/jobs are deprecated and run in parallel with the new endpoints until consumers migrate and a deletion milestone is agreed upon. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| Body: | ||
| ```json | ||
| { | ||
| "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6", |
There was a problem hiding this comment.
should this not just be id ? Is this not the id of the aysnc job?
There was a problem hiding this comment.
The rename is intentional — jobId leaks the AsyncJob backing model to consumers, which is an implementation detail they shouldn't need to know about. preflightId is the domain name for this resource. Internally it maps 1:1 to the AsyncJob ID.
There was a problem hiding this comment.
Sure, preflightId is fine, I figured that id would be enough. The response is a preflight response and here's the id.
| { | ||
| "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6", | ||
| "status": "IN_PROGRESS", | ||
| "createdAt": "2026-05-11T10:00:00.000Z", |
There was a problem hiding this comment.
Should we include last updatedAt as well.
There was a problem hiding this comment.
Good call — updatedAt is already on the GET by ID response. Worth adding to the list items too so clients can detect state changes without fetching each one individually.
| **Request body** (`application/json`): | ||
| ```json | ||
| { | ||
| "url": "https://main--site--org.hlx.page/some-path", |
There was a problem hiding this comment.
Today you can pass a list of urls.
There was a problem hiding this comment.
Yes, and that's an intentional change — one URL per request, one preflightId per URL. Each preflight is an independently pollable resource. If the MFE needs bulk in future, clients fire parallel requests. This avoids partial-failure complexity and keeps each job clean. See the key changes section for the full rationale.
There was a problem hiding this comment.
Client's could select hundreds of pages. This is too much load to manage. Use case, customer is ready to run a promotion and they have many different areas of the site they want to run preflight on. Allowing them to send 1 job with many urls makes sense. It could take a while, but running in bulk this way would be expected to be slower.
There was a problem hiding this comment.
Same response as above — see the thread on line 44 for the full proposal.
| ```json | ||
| { | ||
| "url": "https://main--site--org.hlx.page/some-path", | ||
| "step": "identify", |
There was a problem hiding this comment.
The concept of step should be going away as we always will perform identify/suggest as part of the mysticat agent step.
There was a problem hiding this comment.
Good point and worth nailing down now. If the Mysticat agent always runs both identify and suggest as a single flow, step is redundant and we should drop it from the new endpoint entirely rather than carry it forward and deprecate it later. Can you confirm that's the direction? If so we'll remove it from the design.
There was a problem hiding this comment.
This is the direction that we want to go.
| { | ||
| "url": "https://main--site--org.hlx.page/some-path", | ||
| "step": "identify", | ||
| "mystiqueUrl": "optional-ephemeral-host.stage.cloud.adobe.io" |
There was a problem hiding this comment.
I don't see the value of sending this back to the client.
There was a problem hiding this comment.
To clarify — mystiqueUrl is a request-only field, never returned in any response. It lets developers route a preflight to an ephemeral Mysticat instance for testing without hitting prod, and is blocked in non-dev environments. If you're questioning whether it belongs in the API at all that's a fair debate — it could move to an env-level config instead.
There was a problem hiding this comment.
Let's remove it from the API response.
| |-------|------|----------|-------------| | ||
| | `url` | string (URI) | Yes | The single page URL to analyze | | ||
| | `step` | enum: `identify` \| `suggest` | Yes | Audit step to perform | | ||
| | `mystiqueUrl` | string | No | Dev-only override for the Mysticat service URL | |
There was a problem hiding this comment.
url i don't think needs to be exposed here.
There was a problem hiding this comment.
The url field is required in the request body because Mysticat needs it to scrape the specific page via DRS headless browser — siteId alone isn't enough since a site has many pages and preflight is per-page. As for the response, it is useful in GET by ID so the consumer knows what was analyzed. Could reasonably be omitted from the 202 POST response body given the Location header already points to the resource.
There was a problem hiding this comment.
I do not think this should be exposed as part of the public api.
There was a problem hiding this comment.
Apologies for the misread — we interpreted this as being about the url field rather than mystiqueUrl. Agreed, mystiqueUrl is removed from the public documentation. It stays in the implementation as an undocumented dev override for testing against ephemeral Mysticat instances, just not part of the public API contract.
| | `step` | enum: `identify` \| `suggest` | Yes | Audit step to perform | | ||
| | `mystiqueUrl` | string | No | Dev-only override for the Mysticat service URL | | ||
|
|
||
| `promiseToken` is passed via cookie for authenticated CMS pages (CS/CS_CW/AMS sites); it is not part of the request body. |
There was a problem hiding this comment.
There is a fix that should be set via the header not cookie.
There was a problem hiding this comment.
Agreed a header would be cleaner. The cookie approach is inherited from the /auth/promise browser flow where it's set automatically. Switching to a header would require the MFE to read and forward it explicitly — worth a separate cleanup ticket but out of scope for this redesign.
| `promiseToken` is passed via cookie for authenticated CMS pages (CS/CS_CW/AMS sites); it is not part of the request body. | ||
|
|
||
| `createdBy` is derived server-side from the caller's IMS profile (`authInfo.getProfile().email`) and is never supplied by the client. | ||
|
|
There was a problem hiding this comment.
createdAt - would be good to report too
There was a problem hiding this comment.
createdAt is already in the 202 response body — you may have been looking at an earlier version of the doc before we expanded the response shapes.
| | `500 Internal Server Error` | Mysticat call failed or unexpected error | | ||
|
|
||
| No job record is created for `400`, `403`, or `404` responses. The current `/preflight/beta/jobs` | ||
| behavior of creating a job and immediately setting it to `CANCELLED` when preflight is disabled |
There was a problem hiding this comment.
Cancelled is not the correct status here it feels like cancelled is a user action. Maybe something like ABORTED.
There was a problem hiding this comment.
Good point on the semantics, but we've resolved this differently — the new design returns 403 Forbidden immediately when preflight is disabled, without creating a job record at all. No phantom CANCELLED or ABORTED record is created. See the error responses table in the POST section.
There was a problem hiding this comment.
Nice! I like it. However as part of the 403, what does the response body look like to allow the UI to provide a meaningful message to the user as if they do not have access to the site, or that the audit was disabled.
There was a problem hiding this comment.
Good call. The error body includes an errorCode for machine-readable handling and a message as a human-readable hint:
{
"errorCode": "PREFLIGHT_NOT_ENABLED",
"message": "Preflight is not enabled for this site"
}All codes are PREFLIGHT_-prefixed for consistency: PREFLIGHT_ACCESS_DENIED, PREFLIGHT_NOT_ENABLED, PREFLIGHT_INVALID_REQUEST, PREFLIGHT_SITE_NOT_FOUND, PREFLIGHT_INTERNAL_ERROR. Consumers translate errorCode to their own messaging — message is informational only. ADR updated.
Open to other suggestions on the error code style if there's a preferred convention.
There was a problem hiding this comment.
I'm happy that at least there's an error code! Using a numeric value just requires another lookup, I don't mind the string.
| { | ||
| "url": "https://main--site--org.hlx.page/some-path", | ||
| "preflights": [ | ||
| { | ||
| "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6", | ||
| "status": "COMPLETED", | ||
| "step": "identify", | ||
| "createdAt": "2026-05-11T10:00:00.000Z", | ||
| "createdBy": "user@example.com" | ||
| }, | ||
| { | ||
| "preflightId": "7c9b1e32-1234-4abc-b3fc-9f8a7c6d5e4f", | ||
| "status": "COMPLETED", | ||
| "step": "suggest", | ||
| "createdAt": "2026-05-11T10:05:00.000Z", | ||
| "createdBy": "user@example.com" | ||
| } | ||
| ] | ||
| }, |
There was a problem hiding this comment.
This structure is not what I would expect, but rather a list of jobs with some minor details:
[
{
"preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"status": "COMPLETED",
"urlCount": 12, <----- maybe show the number of urls they ran with
"opportunities": 5, <------ the number of opportunities that were created
"createdAt": "2026-05-11T10:00:00.000Z",
"createdBy": "user@example.com"
}
]
Listing the preflight jobs should be lightweight and fast. Details of the jobs can be fetched with a follow up with the id of the preflight job.
There was a problem hiding this comment.
The flat list is simpler and we can drop the URL grouping — agreed. The opportunities count is an interesting addition; does Mysticat write that back to the job result today? If so it's worth surfacing in the list.
One question on urlCount: that field implies a job can span multiple URLs, but the new design is one URL per preflight. With that model urlCount would always be 1. Was the intent to keep multi-URL jobs, or is urlCount a carry-over from the old design?
There was a problem hiding this comment.
I think we must consider the ability for multiple urls. If we get to the point of providing a different experience where they can select many pages and initiate a preflight request, we need a way to make this experience simply for the client, being a headless agent or client ui.
There was a problem hiding this comment.
Addressed in the thread on line 44 — the proposal is to keep this endpoint single-URL for now since it was never actually used with multiple URLs in practice, and add a dedicated POST /sites/:siteId/preflights/bulk endpoint when that use case is confirmed. Adding it later is cheap and keeps the design clean in the meantime.
bhellema
left a comment
There was a problem hiding this comment.
We really need to consider what the bulk use case looks like.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Confirmed by Ben: identify/suggest are no longer separate steps. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dev-only override, not part of the public contract. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
docs/decisions/002-preflight-api-rest-redesign.mdcapturing the decision to replace/preflight/beta/jobswith site-scoped REST endpoints under/sites/:siteId/preflights, and to deprecate both legacy endpoint pairsWhat's being proposed (SITES-44290)
POST /sites/:siteId/preflights
Request body:
{ "url": "https://main--site--org.hlx.page/some-path" }urlstays in the body — Mysticat requires it to identify the specific page to analyze.promiseTokenpassed via cookie for authenticated CMS pages.createdByis derived server-side from the caller's IMS profile and never supplied by the client.stepis removed — Mysticat's agent always runs the full identify/suggest flow.Response
202 Accepted:Headers:
Body:
{ "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "IN_PROGRESS", "createdAt": "2026-05-11T10:00:00.000Z", "createdBy": "user@example.com" }Error responses:
errorCode400 Bad RequestPREFLIGHT_INVALID_REQUESTurlis missing or invalid403 ForbiddenPREFLIGHT_ACCESS_DENIED403 ForbiddenPREFLIGHT_NOT_ENABLED404 Not FoundPREFLIGHT_SITE_NOT_FOUNDsiteIddoes not exist500 Internal Server ErrorPREFLIGHT_INTERNAL_ERRORError response body:
{ "errorCode": "PREFLIGHT_NOT_ENABLED", "message": "Preflight is not enabled for this site" }No job record is created for
400,403, or404responses.GET /sites/:siteId/preflights
Response
200 OK— grouped by URL (a site can have preflights across multiple pages):[ { "url": "https://main--site--org.hlx.page/some-path", "preflights": [ { "preflightId": "3fa85f64-...", "status": "COMPLETED", "createdAt": "2026-05-11T10:00:00.000Z", "updatedAt": "2026-05-11T10:00:05.000Z", "createdBy": "user@example.com" }, { "preflightId": "7c9b1e32-...", "status": "COMPLETED", "createdAt": "2026-05-11T10:05:00.000Z", "updatedAt": "2026-05-11T10:05:04.000Z", "createdBy": "user@example.com" } ] } ]GET /sites/:siteId/preflights/:preflightId
Response
200 OK— full detail:{ "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6", "status": "COMPLETED", "url": "https://main--site--org.hlx.page/some-path", "createdAt": "2026-05-11T10:00:00.000Z", "createdBy": "user@example.com", "updatedAt": "2026-05-11T10:00:05.000Z", "startedAt": "2026-05-11T10:00:01.000Z", "endedAt": "2026-05-11T10:00:05.000Z", "result": {}, "error": null }Data model
AsyncJobwill be extended inspacecat-shared-data-accesswithsiteIdas a top-level indexed attribute and anallBySiteIdAndJobType(siteId, jobType)collection method. A purpose-builtPreflightentity is not needed —AsyncJobrecords are TTL-based (~7 days) and inherently transient, so a separate model would just add cleanup complexity for no gain.createdBy(IMS email) is stored in job metadata at creation time and surfaces in all three endpoint responses for audit purposes.Current endpoint fate
Both legacy endpoint pairs are deprecated and remain functional in parallel with the new endpoints until consumers have migrated and a deletion milestone is agreed upon.
Test plan
🤖 Generated with Claude Code