fix(source-amazon-seller-partner): make GET_SALES_AND_TRAFFIC_REPORT_BY_DATE step configurable instead of per-day #77747
…BY_DATE step configurable instead of per-day
The BY_DATE variant of the Sales and Traffic report extracts the `salesAndTrafficByDate` array, which Amazon already returns broken down per date even for multi-day report windows. Hard-coding `step: P1D` on this stream caused one SP-API report request per calendar day, which is unnecessary and wasteful for large backfills.
Switch to the same `period_in_days`-driven step pattern used by other streams in the same manifest (capped at 30 days).
Co-Authored-By: bot_apk <apk@cognition.ai>
✅ Fix Validation: Sufficient Evidence (Pre-release Ready)
Detailed Evidence Log
Regression-test workflow: airbyte-ops-mcp/actions/runs/25374556298 (control auto-detected as
Per-step outcomes from the workflow's "Determine Final Status" output:
Local control-vs-target unit test against the new test added in this PR
Target (PR manifest,
Control (master's manifest,
The control failure logs show the connector kicked off 29 per-day async report jobs (
Pre-flight diff inspection (against
The single behavioral line is in
- step: "P1D"
+ step: "P{{ min( config.get('period_in_days', 365), 30 ) }}D"
✅ Fix Validation: Sufficient Evidence (Pre-release Ready)
Proving criteria: all met.
Disproving criteria: none observed.
What
Resolves https://github.com/airbytehq/oncall/issues/12180:
The `GET_SALES_AND_TRAFFIC_REPORT_BY_DATE` stream in `source-amazon-seller-partner` currently issues one Amazon SP-API report request per calendar day because its slice `step` is hard-coded to `P1D`. For large backfills this is wasteful (e.g. ~365 separate report requests for a one-year window), and customers have asked us to issue a single request, or at least far fewer requests, covering a wider window.
The BY_DATE variant of the Sales & Traffic report extracts the `salesAndTrafficByDate` array, which Amazon's official SP-API report schema already returns broken down per date even for multi-day report windows ("Aggregated data is available at different date range aggregation levels: DAY, WEEK, MONTH. … Requests can span multiple date range periods."). The per-day slicing is therefore unnecessary for this stream.
The sibling `GET_SALES_AND_TRAFFIC_REPORT` stream (extracting `salesAndTrafficByAsin`) keeps `P1D` because Amazon aggregates `salesAndTrafficByAsin` over the whole range when `dateGranularity: DAY` is set and the range is multi-day (see selling-partner-api-models#426). The `GET_SALES_AND_TRAFFIC_REPORT_BY_MONTH` stream uses `P1M` correctly.
How
Single-line manifest change to the BY_DATE stream's `incremental_sync`, swapping the hard-coded `step: "P1D"` for the existing `period_in_days`-driven pattern (capped conservatively at 30 days) used by many other streams in the same manifest: `step: "P{{ min( config.get('period_in_days', 365), 30 ) }}D"`.
The `period_in_days` config field already exists in the spec at `manifest.yaml:127` (default 90, minimum 1); no new config field is added. The 30-day cap is conservative: Amazon does not publish an explicit maximum date range for this report, and a customer hitting the cap simply gets more slices (still far fewer than 365), not failures.
Connector version: 5.7.4 → 5.8.0 (minor bump: adds user-visible behavior tied to the existing `period_in_days` config).
Declarative-First Evaluation
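To make the new `step` expression in this PR concrete, here is a minimal Python sketch of what `P{{ min( config.get('period_in_days', 365), 30 ) }}D` evaluates to for a few config values. The helper name `by_date_step` is hypothetical and only mirrors the arithmetic; the actual interpolation is performed by the declarative framework, not by connector code.

```python
def by_date_step(config: dict) -> str:
    """Mirror of the manifest expression
    P{{ min( config.get('period_in_days', 365), 30 ) }}D.

    Hypothetical helper for illustration; not part of the connector.
    """
    # Fall back to 365 when the field is absent, then cap at 30 days.
    days = min(config.get("period_in_days", 365), 30)
    return f"P{days}D"


print(by_date_step({}))                      # P30D
print(by_date_step({"period_in_days": 90}))  # P30D
print(by_date_step({"period_in_days": 7}))   # P7D
```

Any `period_in_days` of 30 or more, including the spec default of 90, resolves to `P30D`; smaller values pass through unchanged.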
This is a declarative / manifest-only connector (`language:manifest-only`, `cdk:low-code`). The fix is purely a YAML change to the existing `DatetimeBasedCursor`'s `step` expression; it uses the same Jinja `min(config.get('period_in_days', …), …)` pattern already used by ~10 other streams in the same manifest. No custom Python component is introduced or modified.
Test Coverage
Extended `TestSalesAndTrafficReportRequestBody` in `unit_tests/integration/test_report_based_streams.py` with `test_by_date_stream_sends_multi_day_window_in_single_request`, which asserts that with the default 29-day config range (2023-01-01 → 2023-01-30) the BY_DATE stream issues a single create-report request with `dataStartTime`/`dataEndTime` covering the full window, not 29 per-day requests.
I confirmed locally that the new test:
`dataStartTime`/`dataEndTime` does not match the multi-day mocked body).
All 4 existing tests in `TestSalesAndTrafficReportRequestBody` continue to pass.
Breaking Change Evaluation
Not a breaking change:
`date` field continues to come straight from Amazon's `salesAndTrafficByDate[].date` payload, so widening the slice does not distort per-day record values. Only `queryEndDate` (a slice-level marker that doubles as the cursor) becomes coarser, the same behavior already seen on other `period_in_days`-driven streams.
`DatetimeBasedCursor` cursor field (`queryEndDate`) and format.
Review guide
`airbyte-integrations/connectors/source-amazon-seller-partner/manifest.yaml`: the actual one-line behavior change (plus a clarifying comment).
`airbyte-integrations/connectors/source-amazon-seller-partner/unit_tests/integration/test_report_based_streams.py`: new test verifying single multi-day request.
`airbyte-integrations/connectors/source-amazon-seller-partner/metadata.yaml` and `docs/integrations/sources/amazon-seller-partner.md`: version bump + changelog entry.
User Impact
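As a rough illustration of the request reduction, the sketch below counts how many windows a date range splits into for a given step. It is a simplified half-open model written for this example, not the CDK's `DatetimeBasedCursor` logic, and `count_slices` is a hypothetical name.

```python
from datetime import date, timedelta


def count_slices(start: date, end: date, step_days: int) -> int:
    """Count half-open windows [s, s + step) needed to cover [start, end).

    Simplified model for illustration only; the real slicing is done
    by the declarative framework.
    """
    count = 0
    cursor = start
    while cursor < end:
        # Advance by one step, clamping the last window to the range end.
        cursor = min(cursor + timedelta(days=step_days), end)
        count += 1
    return count


# 29-day range used by the new unit test (2023-01-01 to 2023-01-30).
print(count_slices(date(2023, 1, 1), date(2023, 1, 30), 1))   # 29
print(count_slices(date(2023, 1, 1), date(2023, 1, 30), 30))  # 1

# One-year backfill.
print(count_slices(date(2023, 1, 1), date(2024, 1, 1), 1))    # 365
print(count_slices(date(2023, 1, 1), date(2024, 1, 1), 30))   # 13
```

In this model a one-year backfill drops from 365 windows to 13, which is in line with the roughly 12 requests estimated in this PR.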
Customers running the BY_DATE stream over large date ranges will see a substantial reduction in SP-API report requests (e.g. ~365 → ~12 requests for a one-year backfill, with the default 30-day cap). Sync time should drop accordingly. No change to record-level data: `date` continues to come straight from Amazon's response, and existing customers can still opt for smaller windows by lowering `period_in_days`.
Coordination
cc Teo (@tgonzalezc5) (Aldo Gonzalez), original author of the BY_DATE stream in #72258 — would appreciate a sanity check, since this changes the slicing behavior the original PR introduced.
Can this PR be safely reverted and rolled back?
Link to Devin session: https://app.devin.ai/sessions/535847d8b01c4b1ebcd3ca8524eb064d