Skip to content

feat(microsoft): add delegated oauth_token authentication#706

Merged
ds-filipknefel merged 3 commits intomainfrom
feat/microsoft-oauth-token
May 7, 2026
Merged

feat(microsoft): add delegated oauth_token authentication#706
ds-filipknefel merged 3 commits intomainfrom
feat/microsoft-oauth-token

Conversation

@ds-filipknefel
Copy link
Copy Markdown
Contributor

@ds-filipknefel ds-filipknefel commented May 7, 2026

Summary

Mirror the Google Drive oauth_token pattern across the three Microsoft Graph connectors (SharePoint, OneDrive, Outlook).

  • Add Optional oauth_token to OnedriveAccessConfig and OutlookAccessConfig (SharepointAccessConfig inherits from OnedriveAccessConfig).
  • client_cred / client_credential and client_id become Optional so a delegated user access token can be supplied without app credentials.
  • model_post_init validators enforce at-least-one auth method and mutual exclusion of oauth_token with client_cred / password.
  • get_token / _acquire_token short-circuit on oauth_token: hand the bearer token through directly, no MSAL round-trip. Existing app-only and password-grant paths are unchanged.
  • Schema-validation unit tests for each AccessConfig variant.
  • Bump to 1.5.2; CHANGELOG entry added.

Tokens typically expire after ~1 hour; refresh handling is out of scope for this connector — same model as the Google Drive oauth_token field.

Test plan

  • CI passes (unit tests for the new schema validators)
  • Existing integration tests still pass (no behaviour change for app-only or password-grant flows)

Note

Medium Risk
Touches Microsoft Graph authentication/config validation for OneDrive/SharePoint/Outlook; misconfiguration could break existing credential-based flows, though changes are gated by new validation and the prior MSAL paths remain.

Overview
Adds delegated OAuth support to the Microsoft Graph connectors by introducing an optional oauth_token on OneDrive/SharePoint/Outlook access configs and allowing client_id/client_cred to be omitted when a user access token is supplied.

Implements stricter Pydantic validation (mutual exclusion and “at least one auth method”) plus a cross-field rule requiring client_id when oauth_token is not set, and logs a one-time warning during precheck when token-based auth is used.

Includes new unit tests covering these auth/validation combinations, and bumps the package to 1.5.2 with a CHANGELOG entry.

Reviewed by Cursor Bugbot for commit 7fb41d4. Bugbot is set up for automated code reviews on this repo. Configure here.

Mirror the Google Drive oauth_token pattern across the three Microsoft
Graph connectors (SharePoint, OneDrive, Outlook).

- Add Optional `oauth_token` to OnedriveAccessConfig and OutlookAccessConfig
  (SharepointAccessConfig inherits from OnedriveAccessConfig).
- Make `client_cred` / `client_credential` and `client_id` Optional so a
  delegated user access token can be supplied without app credentials.
- model_post_init validators enforce at-least-one auth method and
  mutual exclusion of oauth_token with client_cred / password.
- get_token / _acquire_token short-circuit on oauth_token: hand the
  bearer token through directly, no MSAL round-trip. Existing app-only
  and password-grant paths are unchanged.
- Schema-validation unit tests for each AccessConfig variant.
- Bump to 1.5.2; CHANGELOG entry added.

Tokens typically expire after ~1 hour; refresh handling is out of scope
for this connector.
@potter-potter
Copy link
Copy Markdown
Contributor

@claude please review

@claude
Copy link
Copy Markdown

claude Bot commented May 7, 2026

Claude finished @potter-potter's task —— View job


Code Review: Microsoft OAuth Token Authentication

I've completed my review of this PR that adds delegated OAuth token authentication to Microsoft Graph connectors. Overall, this is a well-implemented feature that follows good patterns.

Todo List:

  • Read repository guidelines (CLAUDE.md)
  • Analyze changed files and implementation
  • Review authentication validation logic
  • Check security considerations
  • Evaluate test coverage
  • Provide comprehensive review feedback

Strengths

Authentication Architecture

  • Excellent validation logic: The model_post_init validators properly enforce mutual exclusion between auth methods
  • Consistent pattern: Successfully mirrors Google Drive's oauth_token approach across all three connectors
  • Clear security boundaries: OAuth tokens bypass MSAL entirely, reducing complexity

Implementation Quality

  • Short-circuit optimization: get_token() and _acquire_token() methods properly return tokens directly when oauth_token is present
  • Comprehensive test coverage: All three connectors have unit tests covering valid/invalid auth combinations
  • Clean inheritance: SharePoint inherits from OneDrive's AccessConfig naturally

Code Safety

  • Proper field validation: Makes client_id/client_cred optional only when oauth_token is provided
  • Clear error messages: Validation errors are descriptive and actionable
  • Backwards compatibility: Existing auth flows remain unchanged

🔍 Minor Issues & Suggestions

Field Naming Inconsistency (outlook.py:41-42)

client_credential: Optional[str] = Field(
    default=None, description="Azure AD App client secret", alias="client_cred"
)

Issue: Outlook uses client_credential with client_cred alias, while OneDrive uses client_cred directly.
Suggestion: Consider standardizing on one approach across all connectors for consistency.

Missing Password Validation (outlook.py:53-62)

The validation logic in OutlookAccessConfig.model_post_init() doesn't include password checks like OneDrive does:

# Current - missing password validation
if not has_client_cred and not has_oauth_token:
    raise ValueError("either client_cred or oauth_token must be set")

# Should be (like OneDrive):
if not has_client_cred and not has_oauth_token:
    raise ValueError("either client_cred or oauth_token must be set")

if has_oauth_token and (has_client_cred or has_password):  # Add password check
    raise ValueError("cannot use both oauth_token and client_cred/password authentication")

Token Expiration Warning (onedrive.py:117, outlook.py:91)

The warning messages are helpful but could be more actionable:

# Current
logger.warning("Using OAuth token authentication. Tokens expire after ~1 hour.")

# Suggestion
logger.warning("Using OAuth token authentication. Tokens typically expire after ~1 hour. "
               "Refresh handling is not supported - obtain a fresh token if auth fails.")

📋 Technical Review

Security Considerations

  • OAuth tokens are properly treated as secrets (not logged)
  • Validation prevents mixed auth modes that could cause confusion
  • No credential leakage in error messages

Error Handling

  • Maintains existing error handling for MSAL flows
  • Clear separation between delegated and app-only authentication paths
  • Proper exception types used throughout

Test Coverage

  • Comprehensive unit tests for all AccessConfig variants
  • Tests cover both valid configurations and error cases
  • Edge cases like missing auth and mixed auth modes are tested

🎯 Recommendations

  1. Fix Outlook password validation - Add password checks to match OneDrive/SharePoint
  2. Consider field naming consistency - Standardize client_cred vs client_credential across connectors
  3. Enhance warning messages - Make token expiration warnings more actionable

📝 Summary

This is a solid implementation that successfully adds delegated OAuth authentication across Microsoft Graph connectors. The code follows established patterns, has good test coverage, and maintains backwards compatibility. The minor issues noted above are easily addressable and don't impact the core functionality.

Verdict: ✅ Approved with minor suggestions for improvement.


Comment thread unstructured_ingest/processes/connectors/onedrive.py
Comment thread unstructured_ingest/processes/connectors/onedrive.py Outdated
Comment thread unstructured_ingest/processes/connectors/onedrive.py
Copy link
Copy Markdown
Contributor

@potter-potter potter-potter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

Three corrections to the delegated oauth_token feature, all flagged on the
initial review.

- Cross-field validator on OneDrive/Outlook ConnectionConfig: require
  client_id when oauth_token is absent. Was previously caught only by a
  cryptic AADSTS / MSAL error at runtime.
- model_post_init checks switched from "is not None" to truthiness so
  empty-string env vars no longer desync the validator from the runtime
  auth-mode branching in get_token / _acquire_token.
- OAuth-mode advisory moved out of the per-Graph-request callback into a
  helper called from each Indexer/Uploader/Downloader precheck. Cuts log
  volume from one warning per request to one per pipeline step. SharePoint
  inherits from OneDrive so the validator and helper propagate; precheck
  override calls the helper explicitly.

Six new unit tests pin the new behaviors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 7fb41d4. Configure here.

if access_config.oauth_token:
# Delegated user authentication: hand the access token through directly.
# Tokens typically expire after ~1 hour; refresh is not handled here.
return {"access_token": access_config.oauth_token, "token_type": "Bearer"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oauth_token path unnecessarily requires msal dependency

Low Severity

The oauth_token short-circuit in get_token and _acquire_token is placed after the from msal import ... (and from requests import ...) statements. Both the @requires_dependencies decorator and these local imports execute before the if access_config.oauth_token: check, meaning the oauth_token path — which never uses MSAL — still requires msal (and requests) to be installed and importable. This contradicts the stated intent of "no MSAL round-trip" for the delegated token path.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 7fb41d4. Configure here.

@ds-filipknefel ds-filipknefel merged commit 3ab29e6 into main May 7, 2026
39 of 40 checks passed
@ds-filipknefel ds-filipknefel deleted the feat/microsoft-oauth-token branch May 7, 2026 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants