feat(source-stripe)!: fix missing records in invoice/subscription line items (6.0.0)#76095
Conversation
… with RecordExpander
|
Note 📝 PR Converted to Draft More info...Thank you for creating this PR. As a policy to protect our engineers' time, Airbyte requires all PRs to be created first in draft status. Your PR has been automatically converted to draft status in respect for this policy. As soon as your PR is ready for formal review, you can proceed to convert the PR to "ready for review" status by clicking the "Ready for review" button at the bottom of the PR page. To skip draft status in future PRs, please include |
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. 💡 Show Tips and TricksPR Slash CommandsAirbyte Maintainers (that's you!) can execute the following slash commands on your PR:
📚 Show Repo GuidanceHelpful Resources
|
Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
Code Review SummaryPushed the changelog entry for 5.15.24 and updated the PR description with full context. Here are a few items to consider: 1.
|
|
Deploy preview for airbyte-docs ready! ✅ Preview Built with commit f8567b5. |
…uard invoice_id extraction Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
Point 4 — Cursor fallback chain and falsy values (detailed explanation)The {{ (record['original_record'].get('updated')
or record['original_record'].get('created')
or record['original_record'].get('data', {}).get('object', {}).get('updated')
or record['original_record'].get('data', {}).get('object', {}).get('created')
or now_utc().timestamp()) | int }}In Python/Jinja2, Concrete scenario: If In practice for Stripe: This is a very low-risk edge case because:
So while the pattern isn't technically correct for arbitrary inputs, it's safe for Stripe's data model. A "correct" fix would use something like: {{ (record['original_record'].get('updated') if record['original_record'].get('updated') is not none else record['original_record'].get('created') if record['original_record'].get('created') is not none else now_utc().timestamp()) | int }}But that's significantly less readable for minimal practical benefit. I'd recommend keeping the current |
…ption Items Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
|
|
/ai-canary-prerelease
|
🐤 Canary Prerelease Testing — StartingStarting canary prerelease testing for Session: https://app.devin.ai/sessions/c23b96b68e6f47ce9fa4082676cec4b8 |
|
🔄 Canary Prerelease — Phase 3 CompletePrerelease published: Breaking change evaluation: ✅ NOT BREAKING, REVERSIBLE
Connection selection (8 connections):
Status: ⏳ Awaiting approval to pin connections (Slack HITL request sent) |
|
|
🔴 Canary Prerelease — BLOCKED on Publish FailureStatus: The prerelease publish workflow for Error: Failed workflow runs: What is ready:
Will retry the publish and proceed with canary testing once the image is available. |
|
Canary Prerelease — Update: Persistent Publish FailureThe prerelease publish for This is a CI infrastructure issue affecting the multi-architecture Docker build, not related to the PR code. Failed workflow runs: Per debugging guidelines, I'm stopping retries and escalating for human assistance to resolve the CI infrastructure issue. Canary status: Everything else is ready — approval received, 8 connections selected. Will proceed with pinning as soon as the prerelease image is published successfully. |
|
|
Canary Prerelease — BLOCKEDVerdict: BLOCKED — Unable to publish prerelease image due to persistent CI infrastructure failure. Full DetailsPrerelease version: Issue: The pre-release publish workflow has failed 6 consecutive times over approximately 2 hours. All failures occur in the same step: "Build and publish Python and Manifest-Only connectors images" with error Failed workflow runs:
Docker Hub verification: Neither Root cause: ARM64 QEMU emulation failure in the CI runner environment. This is a CI infrastructure issue, not related to this PR's code. What is ready (will proceed immediately once prerelease is published):
Next steps: A reminder is set to retry the publish. Once the CI infrastructure issue is resolved and the prerelease image is available, canary testing will proceed immediately with the pre-approved pinning plan. |
Root Cause: Pre-release publish failureThe pre-release publish is failing because the Show/Hide Detailed ReportEvidenceThe pinned digest For comparison, the previous base image digests (7.14.0 and 7.15.0) correctly pin to manifest list digests ( What happens at build timeWhen Docker buildx builds the arm64 variant of source-stripe, it pulls the base image by the pinned The amd64 build succeeds because it happens to get the correct architecture by coincidence (the pinned digest is the amd64 image). Digest comparison
Correct digest for 7.17.0The manifest list (index) digest for FixUpdate - baseImage: docker.io/airbyte/source-declarative-manifest:7.17.0@sha256:6a1903ff4fc317b66314f302aafb1e5a343e4072b6dfa0230caad7db9ffc2489
+ baseImage: docker.io/airbyte/source-declarative-manifest:7.17.0@sha256:85621b892f41ad06b1c00ca1e5bbfbc506e6961a8d6293e5e60c31050ace3b33How this likely happenedThe wrong digest was introduced in the first commit of this PR ( |
|
|
/publish-connectors-prerelease
|
|
🔬 Canary Prerelease — Phase 4 Complete: Pins DeployedApproval: Received from Alfredo Garcia (@agarctfi) via Slack HITL 7 connections pinned to
Note: Skipped 1 candidate (Customer-A-US-1) — it has a pre-existing workspace-level pin from a different source that I cannot safely override. Next: Entering monitoring phase. Will track sync executions over the next 2-4 hours and post periodic updates. |
Canary Prerelease Report —
|
| Connection | Tier | Region | Post-Pin Syncs | Status | Records Range |
|---|---|---|---|---|---|
| Customer-A-EU-1 | TIER_0 | EU | 3 | All SUCCEEDED | 113–201 |
| Customer-A-US-1 | TIER_0 | US | 3 | All SUCCEEDED | 7 |
| Customer-B-US-1 | TIER_2 | US | 3 | All SUCCEEDED | 78–100 |
| Customer-B-EU-1 | TIER_2 | EU | 1 | SUCCEEDED | 1 |
| Customer-C-US-1 | TIER_2 | US | 1 | SUCCEEDED | 298 |
| Customer-D-US-1 | TIER_2 | US | 3 | All SUCCEEDED | 481–490 |
| Customer-D-EU-1 | TIER_2 | EU | 3 | All SUCCEEDED | 50–102 |
Totals:
- 17 syncs across 7 connections
- 0 failures — 100% success rate
- No new error patterns detected
- Record counts and byte volumes consistent with pre-pin baselines
Observations
- All connections continued syncing normally after the version pin was applied
- Record counts are consistent with historical patterns (no unexpected spikes or drops)
- Diverse destination types exercised: S3, Snowflake, Redshift, BigQuery, ClickHouse, Postgres
Recommendation
Canary results are clean. This PR is safe to proceed to merge and GA release. Pins will remain in place until the GA version is published, at which point they will be removed.
Rollback Plan
If issues emerge before GA release, all 7 connections can be reverted to their original versions (all were on the default/unpinned version before this canary).
…line_items and subscription_items - Bump dockerImageTag from 5.15.24 to 6.0.0 (major version) - Add breakingChanges entry for 6.0.0 with upgradeDeadline 2026-04-22 - Scope impact to invoice_line_items and subscription_items streams - Update changelog with release date 2026-04-08 - Add migration guide for 6.0.0 with full refresh options Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…date Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
Kat Wilson (katmarkham)
left a comment
There was a problem hiding this comment.
reviewed IRL with Alfredo
Patrick Nilan (pnilan)
left a comment
There was a problem hiding this comment.
Just one nit.
Thanks!
Canary Prerelease Cleanup — CompletePR has been merged. All 7 canary connections have been unpinned from
Canary prerelease lifecycle complete. No further action needed. |
What
OC: https://github.com/airbytehq/oncall/issues/8683
OC: https://github.com/airbytehq/oncall/issues/10756
The
invoice_line_itemsandsubscription_itemsincremental streams were missing records because the oldDpathFlattenFieldsapproach flattened the entire event object into a single record, discarding nested arrays (data.object.lines.datafor invoices,data.object.items.datafor subscriptions). An event with N line/subscription items would emit only 1 record instead of N.This is released as a breaking change (6.0.0) to notify affected users that previously synced data may be incomplete, and to provide guidance on full refresh options.
How
The PR upgrades the CDK from 7.14.0 → 7.17.0, following the changes in airbytehq/airbyte-python-cdk#859, which introduces the
RecordExpandercomponent.Key changes in
manifest.yaml(applied to bothinvoice_line_itemsandsubscription_itemsincremental streams):events_objects_retrieverref with an inlineSimpleRetrieverthat includes aRecordExpanderon the extractor. The expander walks the nested array and emits one record per item, attaching the parent event asoriginal_record(viaremain_original_record: true).invoice_line_itemsexpands from:data.object.lines.datasubscription_itemsexpands from:data.object.items.datais_deletedAddFields → cursor AddFields →DpathFlattenFields) with:AddFieldsstep that setsis_deleted: trueon.deletedevents (now reads event type fromoriginal_record)AddFieldsstep that extracts the cursor field and (for invoices) theinvoice_idfromrecord['original_record'], using safe.get()accessRemoveFieldsstep that stripsoriginal_recordfrom the final outputBreaking change (6.0.0):
metadata.yaml: bumpsdockerImageTagto6.0.0, addsbreakingChangesentry withupgradeDeadline: 2026-04-22, scoped toinvoice_line_itemsandsubscription_itemsdocs/integrations/sources/stripe-migrations.md: adds migration guide for 6.0.0 explaining full refresh options (Retain vs Clear) and the 30-day Events API retention caveatdocs/integrations/sources/stripe.md: changelog entry updated to 6.0.0 with release date 2026-04-08Review guide
airbyte-integrations/connectors/source-stripe/manifest.yaml— RecordExpander + transformation changes for bothinvoice_line_items(≈line 1466) andsubscription_items(≈line 1634) incremental streamsairbyte-integrations/connectors/source-stripe/metadata.yaml— CDK bump, version bump to 6.0.0, andbreakingChangesentrydocs/integrations/sources/stripe-migrations.md— Migration guide for 6.0.0docs/integrations/sources/stripe.md— Changelog entry for 6.0.0Human review checklist
data.object.items.data. Invoice events usedata.object.lines.data. If the path is wrong, RecordExpander will silently produce no expanded records.is_deletedon expanded records — Each expanded line item getsis_deleted: truewhen the parent event is a.deletedtype. Confirm this is the desired behavior (marking every child item as deleted when the parent entity is deleted).orchain — Theinvoice_updated/subscription_updatedvalue uses a longorchain that would skip a hypothetical timestamp of0. This is safe for Stripe (timestamps are always positive) but worth noting as a known tradeoff for readability.scopedImpactstream names — Verify thatinvoice_line_itemsandsubscription_itemsare the correct internal stream identifiers the platform uses for scoped breaking change notifications.breakingChanges.6.0.0.messageinmetadata.yamlfor user-facing clarity (this message is emailed to affected users).#upgrading-to-600— confirm the Docusaurus-generated anchor matches.User Impact
Users syncing the
invoice_line_itemsorsubscription_itemsstreams in incremental mode will now receive all individual items from each event, rather than a single flattened record per event. This fixes missing records reported in the linked oncall issues.Users of these streams will receive a breaking change notification (email + in-app) with an upgrade deadline of 2026-04-22. They can choose to:
Can this PR be safely reverted and rolled back?
Link to Devin session: https://app.devin.ai/sessions/a28807b26e9744bd97ac3aebbc532f7f
Requested by: Alfredo Garcia (@agarctfi)