
feat: add scopes object array to declarative OAuth spec #934

Merged
Aldo Gonzalez (aldogonzalez8) merged 10 commits into main from
aldo/scopes-array-schema
Mar 12, 2026

Conversation


@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) commented Mar 5, 2026

Summary

Add structured OAuth scope fields to oauth_connector_input_specification in the declarative component schema, replacing the freeform scope string with an extensible object array.

New fields

| Field | Type | Description |
| --- | --- | --- |
| scopes | array[{scope: string}] | List of scope objects. When present, takes precedence over the scope string. |
| optional_scopes | array[{scope: string}] | Scopes that may or may not be granted by the provider. |
| scopes_join_strategy | enum(space, comma, plus) | How to join scope values into a single string. Default: space (per RFC 6749). |

Schema design

Each scope is an object ({ scope: "read" }) rather than a plain string. The object shape uses additionalProperties: true so future fields (e.g., mode, entities) can be added without a schema change.
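As a rough illustration of the object shape described above, the sketch below parses scope objects with the standard library only. The `OAuthScope` dataclass, the `extra` attribute, and `parse_scopes` are stand-ins invented for this example (the real model is the generated Pydantic class), and the `mode: "rw"` value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class OAuthScope:
    """Illustrative stand-in: `scope` is required, any other keys are preserved."""

    scope: str
    extra: Dict[str, Any] = field(default_factory=dict)


def parse_scopes(raw: List[Dict[str, Any]]) -> List[OAuthScope]:
    """Parse scope objects, keeping unknown keys the way additionalProperties: true would."""
    parsed = []
    for item in raw:
        if not isinstance(item.get("scope"), str):
            raise ValueError(f"scope object is missing a string 'scope': {item!r}")
        # Everything except the required key is carried along untouched.
        extra = {key: value for key, value in item.items() if key != "scope"}
        parsed.append(OAuthScope(scope=item["scope"], extra=extra))
    return parsed


# "mode" is only an example of a possible future field; its value is made up.
scopes = parse_scopes([{"scope": "read"}, {"scope": "chat", "mode": "rw"}])
```

The point of the sketch is that a consumer written against `{scope: string}` keeps working when extra keys appear later.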

Example

# Before (string — still supported)
scope: "read chat"

# After (object array)
scopes:
  - scope: read
  - scope: chat
scopes_join_strategy: space  # optional, default

What this PR changes

  • declarative_component_schema.yaml — Adds scopes, optional_scopes, scopes_join_strategy under OAuthConfigSpecification.oauth_connector_input_specification
  • declarative_component_schema.py — Adds OAuthScope model, ScopesJoinStrategy enum, and new fields on OauthConnectorInputSpecification

How it works

These fields are not consumed by the CDK runtime. They define the manifest contract that the platform reads during the OAuth consent flow. The platform handler (DeclarativeOAuthSpecHandler.kt) extracts scope values from the objects, joins them using the strategy, and populates URL template variables. See airbyte-platform-internal#18705.
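The extract-and-join step described above might look roughly like the following. This is a Python sketch of behavior the description attributes to the Kotlin platform handler, not the handler itself. The function name `resolve_scope_string` and the exact separator characters are assumptions, and `optional_scopes` is left out because its merge semantics are not stated here.

```python
from typing import Any, Dict

# Assumed separators for the three scopes_join_strategy values.
JOIN_SEPARATORS = {"space": " ", "comma": ",", "plus": "+"}


def resolve_scope_string(spec: Dict[str, Any]) -> str:
    """Return the scope string a consent URL template would receive."""
    strategy = spec.get("scopes_join_strategy") or "space"  # default per the PR
    separator = JOIN_SEPARATORS[strategy]
    scopes = spec.get("scopes") or []
    if scopes:
        # The structured array takes precedence over the legacy string.
        names = [item.get("scope", "") for item in scopes]
        return separator.join(name for name in names if name)
    # Fall back to the legacy freeform string unchanged.
    return spec.get("scope", "")
```

For instance, `resolve_scope_string({"scopes": [{"scope": "read"}, {"scope": "chat"}], "scopes_join_strategy": "comma"})` yields `read,chat`, while a manifest that only sets `scope: "read chat"` passes through as-is.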

Related PRs

Test plan

  • All 3,869 CDK tests pass (no regressions)
  • Connectors using scope string continue to work unchanged
  • Platform handler correctly reads scope objects from manifests

🤖 Generated with Claude Code

…declarative OAuth spec

Adds structured scopes support to oauth_connector_input_specification:
- scopes: array of strings, takes precedence over scope string
- optional_scopes: array of optional scopes
- scopes_join_strategy: enum (space/comma/plus), defaults to space per RFC 6749

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 5, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.


Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@aldo/scopes-array-schema#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch aldo/scopes-array-schema

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment


coderabbitai bot commented Mar 5, 2026

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds OAuth scope handling to the declarative connector schema and models: introduces a ScopesJoinStrategy enum and three new fields on OauthConnectorInputSpecification (scopes, optional_scopes, and scopes_join_strategy), with scopes taking precedence over the existing scope string when present. The new fields are intended for the platform OAuth handler.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| OAuth scope schema (YAML): airbyte_cdk/sources/declarative/declarative_component_schema.yaml | Adds scopes, optional_scopes, and scopes_join_strategy under OAuthConfigSpecification.oauth_connector_input_specification.scope with descriptions, examples, types, required fields, and default join strategy (space). Notes that these fields are read by the platform OAuth handler. |
| Models / Python: airbyte_cdk/sources/declarative/models/declarative_component_schema.py | Introduces ScopesJoinStrategy enum (space, comma, plus), new OAuthScope model (allows extra fields, required scope: str), and extends OauthConnectorInputSpecification with scopes: Optional[List[OAuthScope]], optional_scopes: Optional[List[OAuthScope]], and scopes_join_strategy: Optional[ScopesJoinStrategy] (default space). Documentation indicates scopes overrides the single scope string and that platform handles these fields. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a scopes object array to the declarative OAuth specification. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

3079-3088: Could we make optional_scopes merge semantics explicit, wdyt?

Would you clarify ordering/interaction with scopes (and legacy scope) to avoid connector-side ambiguity?

Proposed doc clarification
           optional_scopes:
             title: Optional Scopes
             type: array
             items:
               type: string
             description: |-
               The DeclarativeOAuth Specific list of optional scopes to request from the OAuth provider.
               These scopes may or may not be granted depending on the provider and user consent.
+              If both `scopes` and `optional_scopes` are set, define whether `optional_scopes` are
+              appended before or after `scopes` prior to `scopes_join_strategy` application.
+              Also define behavior when only legacy `scope` is provided.
             examples:
               - ["admin:read"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/declarative/declarative_component_schema.yaml` around
lines 3079 - 3088, Update the declarative_component_schema.yaml description for
optional_scopes to explicitly state the merge semantics and interaction with
scopes and legacy scope: clarify whether optional_scopes are appended to scopes
(and to legacy scope if present), whether duplicates are removed or preserved,
the order priority (e.g., scopes take precedence for required scopes and
optional_scopes only requested if not present), and any normalization rules
(case, whitespace, delimiter handling); reference the properties
optional_scopes, scopes and scope in the text so connector authors know exactly
how the final requested scope list is composed and ordered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte_cdk/sources/declarative/declarative_component_schema.yaml`:
- Around line 3089-3099: The schema field scopes_join_strategy is unused at
runtime; update the factory that reads oauth_connector_input_specification to
parse scopes_join_strategy and pass it into the DeclarativeOauth2Authenticator
(or extend DeclarativeOauth2Authenticator to accept a scopes_join_strategy
parameter), then modify DeclarativeOauth2Authenticator's token-request
preparation logic to join the scopes array using the provided strategy (support
"space", "comma", "plus") before sending the OAuth request; alternatively, if
you prefer removal, delete scopes_join_strategy from the schema and any
references in oauth_connector_input_specification to avoid dead config.



📥 Commits

Reviewing files that changed from the base of the PR and between 7f41401 and f15dbd7.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py


github-actions bot commented Mar 5, 2026

PyTest Results (Fast)

3 918 tests  +4   3 906 ✅ +4   6m 17s ⏱️ -22s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit a737192. ± Comparison against base commit 6876663.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 5, 2026

PyTest Results (Full)

3 921 tests  +4   3 909 ✅ +4   11m 18s ⏱️ -3s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit a737192. ± Comparison against base commit 6876663.

♻️ This comment has been updated with latest results.

Change scopes and optional_scopes from List[str] to List[OAuthScope]
where each scope is an object with a required `scope` property.
Uses additionalProperties: true for future extensibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) changed the title feat: add scopes array to declarative OAuth spec feat: add scopes object array to declarative OAuth spec Mar 5, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1



📥 Commits

Reviewing files that changed from the base of the PR and between f15dbd7 and a577eed.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml

…es fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ChristoGrab
Collaborator

CodeRabbit (@coderabbitai) pause


coderabbitai bot commented Mar 5, 2026

✅ Actions performed

Reviews paused.

@ChristoGrab
Collaborator

Aldo Gonzalez (@aldogonzalez8) the bunny will leave you alone now

@aldogonzalez8
Contributor Author

> Aldo Gonzalez (@aldogonzalez8) the bunny will leave you alone now

Thank you Christo Grabowski (@ChristoGrab)

@devin-ai-integration
Contributor

/prerelease

…preserve scopes fields

Override OauthConnectorInputSpecification and cascade through
OAuthConfigSpecification -> AdvancedAuth -> ConnectorSpecification -> AirbyteMessage
to ensure serpyco_rs preserves scopes, optional_scopes, and scopes_join_strategy
fields during deserialization.

Also convert ScopesJoinStrategy enum to string value in spec.py before
passing to ConnectorSpecificationSerializer.load().

Fixes: airbytehq/airbyte-internal-issues#15963
Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>

@devin-ai-integration
Contributor

Validation of raised concerns against PR #934 + platform PR #18705

I reviewed the full diff of this CDK PR and the companion platform PR (airbyte-platform-internal#18705) to validate the four concerns. All four reference symbols that do not exist in the actual codebase. Details below.


1. "Unsafe as ObjectNode cast in applyRequestedScopes — could throw ClassCastException"

Finding: No such method exists.

There is no applyRequestedScopes method anywhere in either PR or in the existing codebase (searched both origin/aldo/scopes-array-platform and origin/master). The scope extraction in DeclarativeOAuthSpecHandler.createDefaultTemplateMap() uses safe Jackson JsonNode.path() calls:

val scopesNode = userConfig.path(SCOPES_KEY)
if (scopesNode.isArray && scopesNode.size() > 0) {
    val scopeNames = scopesNode.map { it.path("scope").asText() }.filter { it.isNotEmpty() }
    // ...
}

No ObjectNode cast is performed on scope data. The existing as ObjectNode casts in MoreOAuthParameters.kt are pre-existing code unrelated to this PR's changes.

Verdict: Not a valid concern — the referenced method does not exist.


2. "requestedOptionalScopes silently ignored when requestedScopes is null — recommend a warning log"

Finding: Neither requestedOptionalScopes nor requestedScopes exist as symbols.

In the actual code, the closest logic is in OAuthHandler.getSourceOAuthScopes():

val scopesList = extractScopesList(specNode, "scopes")
if (scopesList.isEmpty()) {
    throw ConfigNotFoundException(...)  // throws before reaching optional_scopes
}
val optionalScopesList = extractScopesList(specNode, "optional_scopes")

This does short-circuit if scopes is empty (throwing before processing optional_scopes), but this is intentional per the endpoint contract: the endpoint returns 404 for connectors not yet migrated to the structured scopes format. Optional scopes without required scopes is not a supported state.

In the consent URL flow (DeclarativeOAuthSpecHandler.createDefaultTemplateMap()), optional_scopes is processed independently — there is no gating on scopes being present. So the two flows handle this differently by design.

Verdict: Not a valid concern as stated — the referenced symbols don't exist. The actual short-circuit behavior in the scopes endpoint is intentional.
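To make the ordering in the scopes endpoint concrete, here is a Python sketch of the same short-circuit. It is modeled on the Kotlin excerpt above and is not the platform code: `ConfigNotFoundError`, the snake_case function names, and the returned dict shape are all assumptions for illustration.

```python
from typing import Any, Dict, List


class ConfigNotFoundError(LookupError):
    """Hypothetical stand-in for the platform's ConfigNotFoundException."""


def extract_scopes_list(spec: Dict[str, Any], key: str) -> List[str]:
    """Collect non-empty scope names from a list of scope objects."""
    return [item["scope"] for item in spec.get(key) or [] if item.get("scope")]


def get_source_oauth_scopes(spec: Dict[str, Any]) -> Dict[str, List[str]]:
    scopes = extract_scopes_list(spec, "scopes")
    if not scopes:
        # Raised before optional_scopes is ever read: a spec with only
        # optional scopes is treated as "not migrated yet".
        raise ConfigNotFoundError("no structured scopes in spec")
    return {
        "scopes": scopes,
        "optional_scopes": extract_scopes_list(spec, "optional_scopes"),
    }
```

Under this sketch, a spec that sets only `optional_scopes` raises, which mirrors the 404 behavior described above, while the consent URL flow would need no such guard.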


3. "Format mismatch in StoredOAuthState: List<String> vs [{scope: "x"}] objects, needs a clarifying comment"

Finding: StoredOAuthState has no scopes field at all.

StoredOAuthState stores oAuthInputConfiguration as a raw JsonNode:

@JsonProperty("oauthInputConfiguration")
var oAuthInputConfiguration: JsonNode? = null

There is no List<String> for scopes. The scopes (whether {scope: "x"} objects or legacy scope string) are embedded inside the opaque oAuthInputConfiguration JsonNode. Jackson preserves whatever format they were in through serialization/deserialization. No type mismatch is possible.
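The passthrough argument can be illustrated with a small round trip. This Python sketch uses the standard `json` module in place of Jackson and a hypothetical `store_and_load` helper; only the `oauthInputConfiguration` key name comes from the excerpt above.

```python
import json
from typing import Any, Dict

structured = {"scopes": [{"scope": "read"}, {"scope": "chat"}]}
legacy = {"scope": "read chat"}


def store_and_load(oauth_input_configuration: Dict[str, Any]) -> Dict[str, Any]:
    """Serialize the state with the configuration as an opaque blob, then read it back."""
    stored = json.dumps({"oauthInputConfiguration": oauth_input_configuration})
    return json.loads(stored)["oauthInputConfiguration"]


# Neither shape is interpreted by the storage layer, so both survive unchanged.
round_tripped_structured = store_and_load(structured)
round_tripped_legacy = store_and_load(legacy)
```

Because the storage layer never assigns a type to the scopes, there is nothing for the two formats to mismatch against.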

Verdict: Not a valid concern — StoredOAuthState doesn't have a typed scopes field; it uses opaque JsonNode storage.


4. "Missing KDoc on requestedScopes parameter in CloudOAuthServiceImpl — flat strings vs scope objects confusion"

Finding: No requestedScopes parameter exists in CloudOAuthServiceImpl.

CloudOAuthServiceImpl has three methods:

  • getSourceConsentUrl(workspaceId, definitionId, redirectUrl, oauthInputConfiguration)
  • completeSourceOAuthReturnSecret(workspaceId, definitionId, redirectUrl, queryParameters, oauthInputConfiguration)
  • setWorkspaceOverrideOAuthParams(workspaceId, actorType, definitionId, oauthCredentialsConfiguration)

None have a requestedScopes parameter. The oauthInputConfiguration: JsonNode parameter carries scopes as part of opaque JSON — the scope format is handled downstream by DeclarativeOAuthSpecHandler, not by this service.

Verdict: Not a valid concern — the referenced parameter does not exist.


Summary

| # | Concern | Verdict |
| --- | --- | --- |
| 1 | Unsafe ObjectNode cast in applyRequestedScopes | Invalid: method doesn't exist; scope extraction uses safe JsonNode.path() |
| 2 | requestedOptionalScopes silently ignored | Invalid: symbols don't exist; actual short-circuit is intentional |
| 3 | Format mismatch in StoredOAuthState | Invalid: no typed scopes field; uses opaque JsonNode |
| 4 | Missing KDoc on requestedScopes in CloudOAuthServiceImpl | Invalid: parameter doesn't exist |

All four concerns appear to reference an earlier draft or hallucinated code rather than the actual implementation. The PR's scope extraction logic uses safe Jackson APIs throughout, and the data flow preserves scope format correctly via JsonNode passthrough.



@devin-ai-integration
Contributor

Updated validation — corrected concerns

Apologies for the earlier mismatch. Here's the validation against the actual 4 concerns, based on review of both the CDK diff and the companion platform PR (airbyte-platform-internal#18705).


1. Fragile protocol override in airbyte_protocol.py — needs a sync-check test

Finding: Valid concern.

The PR redeclares 4 protocol dataclasses (OauthConnectorInputSpecification, OAuthConfigSpecification, AdvancedAuth, ConnectorSpecification) to shadow the upstream airbyte_protocol_dataclasses models via from airbyte_protocol_dataclasses.models import *. This is necessary because serpyco_rs silently drops unknown fields (scopes, optional_scopes, scopes_join_strategy) during deserialization.

The fragility: if the upstream airbyte-protocol package adds or reorders fields in any of these 4 classes, the local override will silently diverge. The comment in the file acknowledges this follows the AirbyteStateBlob pattern, but that pattern has the same fragility risk.

Suggestion: Add a sync-check test that introspects the upstream protocol dataclass fields and asserts they're a subset of the local override. Something like:

import dataclasses


def test_protocol_override_fields_in_sync():
    """Ensure our OauthConnectorInputSpecification override stays compatible with the upstream protocol."""
    from airbyte_protocol_dataclasses.models import OauthConnectorInputSpecification as UpstreamSpec
    from airbyte_cdk.models.airbyte_protocol import OauthConnectorInputSpecification as OverrideSpec

    # Compare field names only; the override may add fields but must not drop any.
    upstream_fields = {f.name for f in dataclasses.fields(UpstreamSpec)}
    override_fields = {f.name for f in dataclasses.fields(OverrideSpec)}

    missing = upstream_fields - override_fields
    assert not missing, (
        f"Upstream protocol added fields {missing} to OauthConnectorInputSpecification "
        f"that are missing from the airbyte_protocol.py override. Update the override to match."
    )

This would catch drift on CI rather than silently losing fields at runtime.
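To show what such a check would flag, the sketch below runs the same subset comparison on stand-in dataclasses, so it needs neither package installed. `UpstreamSpec`, `OverrideSpec`, `DriftedUpstreamSpec`, the field `new_field`, and the helper `missing_upstream_fields` are all hypothetical names for this illustration.

```python
import dataclasses
from typing import Optional, Set, Type


@dataclasses.dataclass
class UpstreamSpec:  # stand-in for the upstream protocol dataclass
    consent_url: str
    scope: Optional[str] = None


@dataclasses.dataclass
class OverrideSpec:  # stand-in for the local override, which adds fields
    consent_url: str
    scope: Optional[str] = None
    scopes: Optional[list] = None


@dataclasses.dataclass
class DriftedUpstreamSpec(UpstreamSpec):  # simulates an upstream release adding a field
    new_field: Optional[str] = None


def missing_upstream_fields(upstream: Type, override: Type) -> Set[str]:
    """Return upstream field names that the override does not declare."""
    upstream_fields = {f.name for f in dataclasses.fields(upstream)}
    override_fields = {f.name for f in dataclasses.fields(override)}
    return upstream_fields - override_fields
```

With these stand-ins, comparing `UpstreamSpec` to `OverrideSpec` reports nothing missing, and comparing `DriftedUpstreamSpec` reports `new_field`. The same helper could be parametrized over each of the four overridden classes.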


2. Type mismatch — Optional[str] in protocol vs Optional[ScopesJoinStrategy] in schema

Finding: Valid concern, but handled at serialization boundary.

The Pydantic model in declarative_component_schema.py declares:

scopes_join_strategy: Optional[ScopesJoinStrategy] = Field(ScopesJoinStrategy.space, ...)

The protocol override in airbyte_protocol.py declares:

scopes_join_strategy: Optional[str] = None

These types don't match. The Pydantic model uses the ScopesJoinStrategy enum, while the protocol dataclass expects a plain string. The bridge is in spec.py, which converts the enum to its string value before serialization:

oauth_input.scopes_join_strategy = oauth_input.scopes_join_strategy.value

So the mismatch is intentional and handled — but it could be clearer.

Suggestion: Add an inline comment on the protocol override field explaining the type contract:

# Stored as str (not ScopesJoinStrategy enum) because spec.py converts the enum
# to its .value before serialization. The protocol layer only sees plain strings.
scopes_join_strategy: Optional[str] = None

3. scopes_join_strategy.value may fail if value is already a string — needs defensive handling

Finding: Valid concern.

In spec.py:

if (
    oauth_input
    and hasattr(oauth_input, "scopes_join_strategy")
    and oauth_input.scopes_join_strategy is not None
):
    oauth_input.scopes_join_strategy = oauth_input.scopes_join_strategy.value

If scopes_join_strategy is already a string (e.g., from raw dict input or a re-serialization path), calling .value on a str will raise AttributeError. The test already passes scopes_join_strategy="space" (a raw string), which only works because the test constructs the spec object directly rather than going through the code path that calls .value.

Suggestion: Add defensive handling, same pattern as you'd use for any enum-or-string field:

if (
    oauth_input
    and hasattr(oauth_input, "scopes_join_strategy")
    and oauth_input.scopes_join_strategy is not None
):
    strategy = oauth_input.scopes_join_strategy
    oauth_input.scopes_join_strategy = (
        strategy.value if hasattr(strategy, "value") else strategy
    )

Also add a test case that exercises the .value conversion path through generate_spec() with a ScopesJoinStrategy enum value, not just a raw string.
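A self-contained sketch of the enum-or-string normalization, along with the kind of cases such a test might cover. The `ScopesJoinStrategy` class here is a local stand-in whose string values are assumed to equal the member names, and `strategy_to_str` is a hypothetical helper, not a function in spec.py. It checks `isinstance(strategy, Enum)` instead of `hasattr(strategy, "value")`, which is a slightly stricter variant of the guard suggested above.

```python
from enum import Enum
from typing import Optional, Union


class ScopesJoinStrategy(Enum):
    """Local stand-in for the generated enum (values are assumed)."""

    space = "space"
    comma = "comma"
    plus = "plus"


def strategy_to_str(strategy: Union[ScopesJoinStrategy, str, None]) -> Optional[str]:
    """Return the plain string for an enum member, a string, or None."""
    if strategy is None:
        return None
    if isinstance(strategy, Enum):
        return strategy.value
    # Already a plain string, e.g. from raw dict input or a re-serialization path.
    return strategy
```

Calling it with `ScopesJoinStrategy.comma` and with `"comma"` gives the same result, so the conversion is safe to apply more than once.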


4. Duplicate title "Scopes" on both scope (string) and scopes (array) fields

Finding: Valid concern.

In declarative_component_schema.py, both fields share title="Scopes":

scope: Optional[str] = Field(
    ...,
    title="Scopes",  # line 863
)
scopes: Optional[List[OAuthScope]] = Field(
    ...,
    title="Scopes",  # line 873
)

And in the YAML schema:

scope:
  title: Scopes
  ...
scopes:
  title: Scopes
  ...

Since this schema drives the Connector Builder UI form generation, duplicate titles on sibling fields in the same object can cause confusion in the UI — users would see two fields both labeled "Scopes" with no visual distinction.

Suggestion: Differentiate the titles:

  • scope → title: "Scope" (singular, for the legacy flat string)
  • scopes → title: "Scopes (Array)", or keep "Scopes" and rename the legacy title to "Scope (Legacy)"

This change should be made in both the YAML schema and the generated Pydantic model.
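A check along these lines could flag duplicate titles among sibling properties before they reach the UI. This is a sketch over a hand-written dict, not the real schema loader: `duplicate_titles` is a hypothetical helper, and the `scopes_join_strategy` title shown is assumed.

```python
from collections import defaultdict
from typing import Any, Dict, List


def duplicate_titles(properties: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each title used by more than one sibling property to those property names."""
    by_title: Dict[str, List[str]] = defaultdict(list)
    for name, definition in properties.items():
        title = definition.get("title")
        if title:
            by_title[title].append(name)
    return {title: names for title, names in by_title.items() if len(names) > 1}


# Trimmed-down properties mirroring the duplicate described above.
properties = {
    "scope": {"title": "Scopes", "type": "string"},
    "scopes": {"title": "Scopes", "type": "array"},
    "scopes_join_strategy": {"title": "Scopes Join Strategy", "type": "string"},  # assumed title
}
```

Here `duplicate_titles(properties)` reports that "Scopes" is shared by `scope` and `scopes`, and renaming either title makes the result empty.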


Summary

| # | Concern | Valid? | Severity | Suggestion |
| --- | --- | --- | --- | --- |
| 1 | Fragile protocol override | Yes | Medium | Add sync-check test asserting upstream field subset |
| 2 | Type mismatch str vs ScopesJoinStrategy | Yes (handled) | Low | Add clarifying comment on protocol field |
| 3 | .value on already-string | Yes | High | Defensive hasattr(strategy, "value") guard |
| 4 | Duplicate "Scopes" title | Yes | Medium | Differentiate titles for UI clarity |


… scopes_join_strategy

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>

Aldo Gonzalez (aldogonzalez8) commented Mar 11, 2026

/prerelease

Prerelease Job Info

This job triggers the publish workflow with default arguments to create a prerelease.

Prerelease job started... Check job output.

✅ Prerelease workflow triggered successfully.

View the publish workflow run: https://github.com/airbytehq/airbyte-python-cdk/actions/runs/22978062353

@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) merged commit fd21b86 into main Mar 12, 2026
27 of 30 checks passed
@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) deleted the aldo/scopes-array-schema branch March 12, 2026 13:35