feat: upgrade langchain to 0.3.0 and update related dependencies#811

Closed
David Gold (dbgold17) wants to merge 3 commits into main from
devin/1761184352-upgrade-langchain-0.3.0

Conversation

@dbgold17
Contributor

@dbgold17 David Gold (dbgold17) commented Oct 23, 2025

Upgrade langchain to 0.3.0 and update related dependencies

Summary

Upgraded the Python CDK to use langchain 0.3.0 as requested, with minimal necessary dependency updates. The main changes include:

  1. Dependency updates:

    • langchain: 0.1.16 → 0.3.0
    • langchain_core: 0.1.42 → >=0.3.0,<0.4.0
    • Added new packages: langchain-cohere, langchain-community, langchain-openai (required by langchain 0.3.0's new architecture)
    • openai: 0.27.9 → >=1.40.0,<2.0.0 (required by langchain-openai)
  2. Code changes in embedder.py:

    • Updated imports to use new package structure (embeddings moved from langchain.embeddings.* to separate packages)
    • Fixed AzureOpenAIEmbedder to use AzureOpenAIEmbeddings class instead of OpenAIEmbeddings with Azure parameters (breaking change in langchain-openai 0.2.x)
    • Updated Azure OpenAI parameter names: openai_api_key → api_key, openai_api_base → azure_endpoint, deployment → azure_deployment, etc.
    • Added SecretStr wrapper for API keys to satisfy mypy type checking requirements
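
The keyword renames listed above can be sketched as a small mapping. This is only an illustration, not code from the PR: `AZURE_KWARG_RENAMES` and `migrate_azure_kwargs` are hypothetical names, and the mapping covers only the renames named in this conversation (the diff also drops `openai_api_type`). The PR additionally wraps the key in `SecretStr`, which is left out here to keep the sketch free of third-party imports.

```python
# Legacy keyword -> new keyword, as described in this PR.
# A value of None means the legacy keyword is dropped rather than renamed.
AZURE_KWARG_RENAMES = {
    "openai_api_key": "api_key",
    "openai_api_base": "azure_endpoint",
    "openai_api_version": "api_version",
    "deployment": "azure_deployment",
    "openai_api_type": None,
}


def migrate_azure_kwargs(legacy: dict) -> dict:
    """Return a copy of `legacy` with the Azure keyword names renamed."""
    migrated = {}
    for name, value in legacy.items():
        if name not in AZURE_KWARG_RENAMES:
            migrated[name] = value  # unrelated keywords pass through unchanged
            continue
        new_name = AZURE_KWARG_RENAMES[name]
        if new_name is not None:
            migrated[new_name] = value
    return migrated


# Placeholder values, not real credentials or endpoints.
legacy_kwargs = {
    "openai_api_key": "placeholder-key",
    "openai_api_type": "azure",
    "openai_api_version": "2023-05-15",
    "openai_api_base": "https://example.invalid",
    "deployment": "my-deployment",
    "chunk_size": 16,
}
new_kwargs = migrate_azure_kwargs(legacy_kwargs)
```

Keywords that are not Azure-specific, such as `chunk_size`, are passed through untouched, which matches how the diff leaves `chunk_size` and `max_retries` alone.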

Review & Testing Checklist for Human

  • Critical: Test with actual Azure OpenAI credentials - The Azure OpenAI implementation changed significantly (different class + parameter names). Verify embeddings work correctly with real credentials.
  • Test with actual Cohere credentials - Verify the SecretStr wrapper doesn't break Cohere embeddings at runtime.
  • Test a connector using vector-db-based extra - Ensure downstream connectors still work with these dependency changes.
  • Verify OpenAI embeddings - The OpenAI version jumped from 0.27.9 to 1.40.0+. Test with real credentials to catch any breaking changes.
  • Check transitive dependency impacts - Connectors may rely on undeclared transitive dependencies that changed in the lock file.

Test Plan

  1. Run a connector that uses Azure OpenAI embeddings with actual credentials
  2. Run a connector that uses Cohere embeddings with actual credentials
  3. Run a connector that uses standard OpenAI embeddings with actual credentials
  4. Verify all embedding modes produce expected output
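
One way to make step 4 concrete is a shape check on whatever each embedder returns. The sketch below is an assumption about what "expected output" could mean (one vector per input, a fixed dimension, finite values); `validate_embeddings` and `fake_embed` are hypothetical helpers, and a real run would substitute the connector's embedder and its actual dimension.

```python
import math


def validate_embeddings(texts, vectors, expected_dimensions):
    """Return a list of human-readable problems; an empty list means the output looks sane."""
    problems = []
    if len(vectors) != len(texts):
        problems.append(f"expected {len(texts)} vectors, got {len(vectors)}")
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dimensions:
            problems.append(
                f"vector {index} has {len(vector)} dimensions, expected {expected_dimensions}"
            )
        elif not all(math.isfinite(value) for value in vector):
            problems.append(f"vector {index} contains a non-finite value")
    return problems


def fake_embed(texts):
    """Stand-in for a real embedder call; always returns 3-dimensional vectors."""
    return [[0.0, 0.5, 1.0] for _ in texts]


sample_texts = ["first document", "second document"]
problems = validate_embeddings(sample_texts, fake_embed(sample_texts), expected_dimensions=3)
```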

Notes

  • All unit tests pass locally (3799 passed, 3 skipped)
  • MyPy type checking passes
  • Linting passes
  • The SecretStr wrapper is required by the new langchain libraries' type signatures
  • The lock file shows extensive changes due to dependency resolution, but these should not affect functionality

Link to Devin run: https://app.devin.ai/sessions/b82c85600676421da2957652757b3b66
Requested by: David Gold (@dbgold17)

Summary by CodeRabbit

  • Chores

    • Updated LangChain to v0.3.0, langchain_core to >=0.3.0,<0.4.0, and OpenAI to >=1.40.0. Added optional integrations: langchain-cohere, langchain-community, langchain-openai; expanded vector-db-based extras.
  • Refactor

    • Updated embedding provider integrations and public import surfaces.
    • Azure OpenAI embedder constructor keyword names changed (e.g., api_key, api_version, azure_endpoint, azure_deployment); adjust configs accordingly.

Co-Authored-By: David Gold <32782137+dbgold17@users.noreply.github.com>
Copilot AI review requested due to automatic review settings October 23, 2025 01:53
@devin-ai-integration
Contributor

Original prompt from David Gold
upgrade the python CDK to use langchain 0.3.0 and minimally update other dependencies.

@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions bot added the enhancement New feature or request label Oct 23, 2025
@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1761184352-upgrade-langchain-0.3.0#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1761184352-upgrade-langchain-0.3.0

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

Contributor

Copilot AI left a comment

Pull Request Overview

This PR upgrades the langchain dependency from version 0.1.16 to 0.3.0, along with necessary related package updates to accommodate langchain's new modular structure where embeddings functionality has been split into separate packages.

Key changes:

  • Updated langchain and related dependencies to their 0.3.x versions
  • Migrated embedding imports to new langchain package structure (langchain-cohere, langchain-community, langchain-openai)
  • Updated Azure OpenAI embedder implementation to use the new AzureOpenAIEmbeddings class with updated parameter names

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Updated langchain from 0.1.16 to 0.3.0, added new split packages (langchain-cohere, langchain-community, langchain-openai), upgraded openai from 0.27.9 to >=1.40.0, and updated vector-db-based extras |
| airbyte_cdk/destinations/vector_db_based/embedder.py | Updated imports to use new langchain package structure and refactored AzureOpenAIEmbedder to use AzureOpenAIEmbeddings class with updated parameter names |

Comment thread pyproject.toml
 fastavro = { version = ">=1.11.0,<2.0.0", optional = true }
-langchain = { version = "0.1.16", optional = true }
-langchain_core = { version = "0.1.42", optional = true }
+langchain = { version = "0.3.0", optional = true }

Copilot AI Oct 23, 2025

[nitpick] Consider using a version range (e.g., '>=0.3.0,<0.4.0') instead of pinning to an exact version. This is consistent with how langchain_core and the new langchain packages are specified, and allows for patch updates without requiring dependency updates.

Suggested change
-langchain = { version = "0.3.0", optional = true }
+langchain = { version = ">=0.3.0,<0.4.0", optional = true }

Comment thread pyproject.toml
+langchain_core = { version = ">=0.3.0,<0.4.0", optional = true }
+langchain-cohere = { version = ">=0.3.0,<0.5.0", optional = true }
+langchain-community = { version = ">=0.3.0,<0.4.0", optional = true }
+langchain-openai = { version = ">=0.2.0,<0.3.0", optional = true }

Copilot AI Oct 23, 2025

The version constraint for langchain-openai (>=0.2.0,<0.3.0) appears inconsistent with the other langchain 0.3.x packages. Typically, langchain-openai 0.3.x would be released alongside langchain 0.3.0. Consider verifying if langchain-openai 0.3.x exists and updating to '>=0.3.0,<0.4.0' for consistency.

Suggested change
-langchain-openai = { version = ">=0.2.0,<0.3.0", optional = true }
+langchain-openai = { version = ">=0.3.0,<0.4.0", optional = true }

@coderabbitai
Contributor

coderabbitai bot commented Oct 23, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Updated LangChain-related dependencies and refactored embedding imports/initialization: embedder classes now use new langchain_* packages and AzureOpenAIEmbedder constructs AzureOpenAIEmbeddings with renamed Azure-specific constructor parameters; dependency versions for langchain/openai were bumped and new optional langchain packages were added.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Embedder implementation: airbyte_cdk/destinations/vector_db_based/embedder.py | Replaced langchain.embeddings.* imports with langchain_cohere, langchain_community, langchain_openai. AzureOpenAIEmbedder now constructs AzureOpenAIEmbeddings and uses renamed kwargs (openai_key → api_key wrapped in SecretStr, openai_api_version → api_version, openai_api_base → azure_endpoint, deployment → azure_deployment). Cohere key wrapped with SecretStr. Public class signatures retained but internal client types/param names changed. |
| Project dependencies: pyproject.toml | Bumped langchain → 0.3.0, langchain_core → >=0.3.0,<0.4.0, openai → >=1.40.0,<2.0.0; added optional deps langchain-cohere, langchain-community, langchain-openai; expanded vector-db-based extras to include new langchain packages; added types-pytz. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • dbgold17 — wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "feat: upgrade langchain to 0.3.0 and update related dependencies" directly summarizes the main objective of the changeset. The primary change is the langchain upgrade from 0.1.16 to 0.3.0, which is explicitly stated in the title, along with an acknowledgment of the necessary related dependency updates. The title is concise, specific, and avoids vague terminology; a teammate reviewing the commit history would immediately understand that this PR addresses a major dependency upgrade. The code changes in embedder.py are logical consequences of this main upgrade rather than independent changes, making them appropriately captured by the scope of the title. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte_cdk/destinations/vector_db_based/embedder.py (1)

109-113: Update OpenAIEmbedder to use api_key for consistency.

Both OpenAIEmbeddings and AzureOpenAIEmbeddings in langchain-openai 0.2.x support the api_key parameter. Currently, OpenAIEmbedder (line 110) uses the legacy openai_api_key while AzureOpenAIEmbedder (line 122) uses the modern api_key. Would you consider standardizing both to use api_key for consistency?

🧹 Nitpick comments (1)
pyproject.toml (1)

67-67: Consider using a version range for langchain instead of exact pin?

Line 67 pins langchain to exactly "0.3.0", while line 68 uses a range for langchain_core (">=0.3.0,<0.4.0"). This inconsistency might prevent receiving patch updates for langchain (e.g., 0.3.1, 0.3.2) that could include bug fixes or security patches.

Would you consider changing this to ">=0.3.0,<0.4.0" for consistency and to allow patch updates? Wdyt?
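
The difference between the exact pin and the suggested range can be checked with a few lines of Python. This is a deliberately simplified sketch: `parse_version` and `satisfies` are hypothetical helpers that handle only plain numeric `X.Y.Z` versions and comma-separated comparison clauses, not the full constraint grammar a real resolver supports. A bare version such as `0.3.0` is treated as an exact pin, as in the pyproject.toml line under discussion.

```python
def parse_version(text: str) -> tuple:
    """Turn '0.3.1' into (0, 3, 1). Pre-release and local tags are not handled."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against clauses such as '>=0.3.0,<0.4.0' or a bare pin '0.3.0'."""
    candidate = parse_version(version)
    for clause in constraint.split(","):
        clause = clause.strip()
        for operator in (">=", "<=", "==", ">", "<"):
            if clause.startswith(operator):
                bound = parse_version(clause[len(operator):])
                break
        else:
            # No operator: treat the clause as an exact pin.
            operator, bound = "==", parse_version(clause)
        checks = {
            ">=": candidate >= bound,
            "<=": candidate <= bound,
            "==": candidate == bound,
            ">": candidate > bound,
            "<": candidate < bound,
        }
        if not checks[operator]:
            return False
    return True


pin_accepts_patch = satisfies("0.3.1", "0.3.0")
range_accepts_patch = satisfies("0.3.1", ">=0.3.0,<0.4.0")
range_accepts_next_minor = satisfies("0.4.0", ">=0.3.0,<0.4.0")
```

Under these assumptions the pin rejects 0.3.1, while the range accepts it and still excludes 0.4.0.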

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20ae208 and 3b4c403.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • airbyte_cdk/destinations/vector_db_based/embedder.py (2 hunks)
  • pyproject.toml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: MyPy Check
🔇 Additional comments (3)
airbyte_cdk/destinations/vector_db_based/embedder.py (1)

10-12: LGTM! Import paths updated for langchain 0.3.0.

The import restructuring correctly reflects the new split-package architecture in langchain 0.3.0.

pyproject.toml (2)

73-73: LGTM! OpenAI major version upgrade aligned with langchain-openai requirements.

The upgrade from openai 0.27.9 to >=1.40.0,<2.0.0 is a significant change (major version bump), but this aligns with the requirements of langchain-openai 0.2.x. The helpful comment clarifies the dependency relationship.


127-127: LGTM! Extras correctly updated with new split packages.

The vector-db-based extras now include all the necessary langchain split packages (langchain_core, langchain-cohere, langchain-community, langchain-openai) alongside the existing dependencies.

Comment on lines +120 to 127
+AzureOpenAIEmbeddings( # type: ignore [call-arg]
+api_key=config.openai_key,
 chunk_size=16,
 max_retries=15,
-openai_api_type="azure",
-openai_api_version="2023-05-15",
-openai_api_base=config.api_base,
-deployment=config.deployment,
+api_version="2023-05-15",
+azure_endpoint=config.api_base,
+azure_deployment=config.deployment,
 disallowed_special=(),

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify Azure OpenAI parameter mappings with the new SDK.

The parameter renames look aligned with Azure-specific naming conventions (e.g., azure_endpoint, azure_deployment), but since this is a major dependency upgrade, it's worth double-checking these match the langchain-openai 0.2.x API exactly.

As you noted in the PR description, testing with real Azure OpenAI credentials would help confirm the mappings are correct.


🏁 Script executed:

#!/bin/bash
# Description: Check for actual usage patterns of AzureOpenAIEmbeddings in the codebase to ensure parameter names are used correctly

# Search for any other references to AzureOpenAIEmbeddings or AzureOpenAIEmbedder
rg -nP -C5 'AzureOpenAI(Embeddings|Embedder)' --type=py

Length of output: 6311


Let me verify the exact parameter names for langchain-openai 0.2.x:

Let me check the actual implementation more closely and whether the model parameter is being provided:

Add missing model parameter to AzureOpenAIEmbeddings initialization.

The model parameter is required in langchain-openai 0.2.x AzureOpenAIEmbeddings—every official example shows it as the first argument. Your code passes api_key, chunk_size, max_retries, api_version, azure_endpoint, azure_deployment, and disallowed_special, but omits model.

The Azure OpenAI Service embeddings operation supports text-embedding-ada-002 (Version 2), text-embedding-3-large, and text-embedding-3-small models. Wdyt about adding something like model="text-embedding-3-large" (or whichever model your Azure deployment uses) to the initialization?

🤖 Prompt for AI Agents
In airbyte_cdk/destinations/vector_db_based/embedder.py around lines 120 to 127,
the AzureOpenAIEmbeddings initialization is missing the required model parameter
for langchain-openai 0.2.x; add a model argument (e.g.,
model="text-embedding-3-large" or the specific embedding model your Azure
deployment uses) to the constructor call so the embeddings client is created
with the correct model; ensure the chosen model matches the Azure deployment
configuration (deployment/model) and update any config or docs if you introduce
a new default.

-openai_api_version="2023-05-15",
-openai_api_base=config.api_base,
-deployment=config.deployment,
+api_version="2023-05-15",

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Should the Azure OpenAI API version be configurable or more recent?

The api_version is hardcoded to "2023-05-15", which is over 2 years old. Azure OpenAI has released newer API versions since then. This could limit access to newer features or cause compatibility issues.

Consider either:

  1. Making the API version configurable via AzureOpenAIEmbeddingConfigModel
  2. Updating to a more recent stable API version (2024-x)

Wdyt about making this configurable for flexibility?


🌐 Web query:

What are the latest Azure OpenAI API versions and which version is recommended for langchain-openai 0.2.x?

💡 Result:

Short answer

  • Latest Azure OpenAI API surfaces (as of Oct 23, 2025): the new unified v1 OpenAI-style API (opt‑in since mid‑2025) plus dated preview versions (notably 2025-03-01-preview for Responses/features) and the most recent date-based GA/preview releases (examples: 2024-06-01 / 2024-10-21 / 2025-03-01-preview). [1][2][3]

Recommended version for langchain-openai 0.2.x

  • Use the API version LangChain v0.2 docs show: set OPENAI_API_VERSION = 2023-12-01-preview (LangChain v0.2 guidance). If you can move to the newer unified v1/OpenAI-style endpoints and your LangChain version supports them, prefer the v1 spec for access to the latest features; otherwise pin to 2023-12-01-preview for stability with langchain-openai 0.2.x. [4][1]

Sources

  • Microsoft: Responses API / Azure OpenAI docs (v1 + preview notes). [1][2]
  • Microsoft: Azure OpenAI API lifecycle (unified v1 rollout / api-version guidance). [1]
  • LangChain v0.2 Azure OpenAI docs (recommended OPENAI_API_VERSION = 2023-12-01-preview). [4]

References (links shown as numbered sources above).


Update the Azure OpenAI API version to match LangChain compatibility requirements.

The api_version is hardcoded to "2023-05-15", but LangChain v0.2 docs recommend setting OPENAI_API_VERSION = 2023-12-01-preview for langchain-openai 0.2.x. This 7-month gap may cause compatibility issues or limit access to features.

Would updating to 2023-12-01-preview work for your use case? Additionally, consider making this configurable via AzureOpenAIEmbeddingConfigModel for flexibility across different deployment scenarios.

🤖 Prompt for AI Agents
In airbyte_cdk/destinations/vector_db_based/embedder.py around line 124, the
api_version is hardcoded to "2023-05-15"; update it to "2023-12-01-preview" to
match LangChain v0.2 expectations and avoid compatibility issues, and make this
value configurable by adding a new field (e.g., openai_api_version) to
AzureOpenAIEmbeddingConfigModel with default "2023-12-01-preview"; then replace
the hardcoded string with the config value when constructing the Azure OpenAI
client so existing behavior remains unchanged if the config is not set.
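
A configurable API version along the lines suggested above might look like the following sketch. It uses a plain dataclass as a stand-in for `AzureOpenAIEmbeddingConfigModel`; the class name `AzureEmbeddingConfigSketch`, the field name `api_version`, and the helper `azure_embedding_kwargs` are assumptions for illustration. The default is kept at the currently hardcoded "2023-05-15" so that behavior does not change when the field is unset; choosing a different default is a separate decision. The PR wraps the key in `SecretStr`, which is omitted here to avoid third-party imports.

```python
from dataclasses import dataclass

# Value hardcoded in the PR today; kept as the default so existing configs behave the same.
DEFAULT_AZURE_API_VERSION = "2023-05-15"


@dataclass
class AzureEmbeddingConfigSketch:
    """Hypothetical stand-in for the real config model."""

    openai_key: str
    api_base: str
    deployment: str
    api_version: str = DEFAULT_AZURE_API_VERSION


def azure_embedding_kwargs(config: AzureEmbeddingConfigSketch) -> dict:
    """Build the keyword arguments shown in the diff, reading api_version from config."""
    return {
        "api_key": config.openai_key,  # the PR wraps this in SecretStr
        "chunk_size": 16,
        "max_retries": 15,
        "api_version": config.api_version,
        "azure_endpoint": config.api_base,
        "azure_deployment": config.deployment,
        "disallowed_special": (),
    }


# Placeholder values, not real credentials or endpoints.
default_config = AzureEmbeddingConfigSketch("placeholder-key", "https://example.invalid", "my-deployment")
override_config = AzureEmbeddingConfigSketch(
    "placeholder-key", "https://example.invalid", "my-deployment", api_version="2023-12-01-preview"
)
default_kwargs = azure_embedding_kwargs(default_config)
override_kwargs = azure_embedding_kwargs(override_config)
```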

Comment thread pyproject.toml
Comment on lines +69 to +71
+langchain-cohere = { version = ">=0.3.0,<0.5.0", optional = true }
+langchain-community = { version = ">=0.3.0,<0.4.0", optional = true }
+langchain-openai = { version = ">=0.2.0,<0.3.0", optional = true }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify version range compatibility across langchain packages.

The new langchain packages have different version constraints:

  • langchain-cohere: 0.3.x-0.4.x
  • langchain-community: 0.3.x only
  • langchain-openai: 0.2.x only

While this might be intentional due to different release schedules, it's worth verifying these ranges are compatible with each other and with langchain 0.3.0, especially since langchain-openai is on a different major version.


🌐 Web query:

Are langchain 0.3.0, langchain-openai 0.2.x, langchain-community 0.3.x, and langchain-cohere 0.3.x-0.4.x compatible with each other?

💡 Result:

Short answer: No — mix of 0.3.0 (langchain) with langchain-openai 0.2.x is not compatible. All LangChain packages that depend on langchain-core should stay in the same minor series (use >=0.3,<0.4 for the 0.3 line). In other words, use langchain-openai, langchain-community, langchain-cohere all in the 0.3.x family (or otherwise upgrade the whole stack together). [1][2][3]

Sources:

  • LangChain v0.3 docs — recommended constraint >=0.3,<0.4 for core and companion packages. [1]
  • LangChain forum guidance to keep packages within the same minor series (v0.3). [2]
  • langchain-cohere release listings / package notes showing 0.3/0.4 series as integrations tied to langchain-core. [3]

Fix langchain-openai version constraint — currently incompatible with langchain 0.3.0

All LangChain packages that depend on langchain-core should stay in the same minor series. The current constraint langchain-openai = ">=0.2.0,<0.3.0" pins to the 0.2.x series, which is incompatible with langchain 0.3.0 and the other 0.3.x packages. Use >=0.3,<0.4 for all packages.

Additionally, langchain-cohere = ">=0.3.0,<0.5.0" allows 0.4.x versions—should this be <0.4.0 for consistency, wdyt?

🤖 Prompt for AI Agents
In pyproject.toml around lines 69 to 71, the langchain-openai version constraint
is pinned to the 0.2.x series which is incompatible with langchain 0.3.x and the
other 0.3.x packages; update langchain-openai to use ">=0.3.0,<0.4.0" so it
stays in the same minor series as langchain-core, and also narrow
langchain-cohere to ">=0.3.0,<0.4.0" for consistency with the 0.3.x family
(langchain-community is already using the correct <0.4.0 range).
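
The "same minor series" rule this comment applies can be expressed as a short check. The sketch only implements that rule; it does not establish whether langchain-openai 0.2.x is actually incompatible, which depends on the langchain-core requirement each release declares. The helper names and the version numbers in `resolved_versions` are illustrative and are not read from this PR's lock file.

```python
def minor_series(version: str) -> str:
    """Return the 'major.minor' prefix of a version string, e.g. '0.3' for '0.3.7'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def packages_outside_series(resolved: dict, expected_series: str) -> dict:
    """Return the packages whose resolved version is not in the expected minor series."""
    return {
        name: version
        for name, version in resolved.items()
        if minor_series(version) != expected_series
    }


# Illustrative versions only.
resolved_versions = {
    "langchain": "0.3.0",
    "langchain-core": "0.3.0",
    "langchain-community": "0.3.0",
    "langchain-openai": "0.2.0",
}
outliers = packages_outside_series(resolved_versions, "0.3")
```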

@github-actions

github-actions bot commented Oct 23, 2025

PyTest Results (Fast)

3 802 tests  ±0   3 790 ✅ ±0   6m 26s ⏱️ -17s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 11d5bf0. ± Comparison against base commit 20ae208.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Oct 23, 2025

PyTest Results (Full)

3 805 tests   3 793 ✅  11m 12s ⏱️
    1 suites     12 💤
    1 files        0 ❌

Results for commit 11d5bf0.

♻️ This comment has been updated with latest results.

@aaronsteers
Member

Devin, check the linter failures. Some methods expect a SecretStr() wrapper around the str inputs.

Co-Authored-By: David Gold <32782137+dbgold17@users.noreply.github.com>
@aaronsteers
Member

Devin:

Run poetry run mypy --config-file mypy.ini airbyte_cdk
/home/runner/.cache/pypoetry/virtualenvs/airbyte-cdk-LAWzOSa7-py3.10/lib/python3.12/site-packages/langchain_community/document_loaders/recursive_url_loader.py:54: SyntaxWarning: invalid escape sequence '\s'
  """Recursively load all child links from a root URL.
airbyte_cdk/sources/declarative/interpolation/macros.py:14: error: Library stubs not installed for "pytz"  [import-untyped]

@devin-ai-integration
Contributor

The pytz error is pre-existing - it was already failing in the previous CI run before my changes. My changes fixed the SecretStr errors that you originally flagged:

  • ✅ Fixed: Argument "api_key" to "AzureOpenAIEmbeddings" has incompatible type "str"; expected "SecretStr | None"
  • ✅ Fixed: Argument "cohere_api_key" to "CohereEmbeddings" has incompatible type "str"; expected "SecretStr | None"

The pytz stub issue is unrelated to the langchain upgrade. Should I fix it in this PR or leave it for a separate fix?

@aaronsteers
Member

Devin, it's possible that the non-zero exit status (aka failing lint check) is from the prior message and not from the pytz issue.

/home/runner/.cache/pypoetry/virtualenvs/airbyte-cdk-LAWzOSa7-py3.10/lib/python3.12/site-packages/langchain_community/document_loaders/recursive_url_loader.py:54: SyntaxWarning: invalid escape sequence '\s'
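
For reference, this kind of warning can be reproduced in isolation. The sketch below compiles a docstring containing a bare `\s` and records what Python reports: a SyntaxWarning on Python 3.12 and later, a DeprecationWarning on earlier versions. Compilation still succeeds in both cases, and a raw string avoids the warning. It says nothing about what made this particular CI step exit non-zero; `compile_and_collect` and the sample sources are hypothetical.

```python
import warnings

# Source text for a module whose docstring contains a bare "\s" (not a raw string).
plain_source = '"""Split on \\s characters."""\n'
# The same docstring written as a raw string, which leaves the backslash alone.
raw_source = 'r"""Split on \\s characters."""\n'


def compile_and_collect(source: str) -> list:
    """Compile source text and return the categories of warnings raised while compiling."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")  # succeeds: a warning is not an error
    return [warning.category for warning in caught]


plain_categories = compile_and_collect(plain_source)
raw_categories = compile_and_collect(raw_source)
```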

@devin-ai-integration
Contributor

You're right! The pytz error is new - it's because the langchain upgrade brought in dependencies that use pytz, and we don't have the type stubs installed. The linter passes on main.

I'll add types-pytz to the dev dependencies to fix this.

Co-Authored-By: David Gold <32782137+dbgold17@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
pyproject.toml (1)

71-71: Fix langchain-openai version incompatibility with langchain 0.3.0

The langchain-openai package has 0.3.x versions available, but the constraint >=0.2.0,<0.3.0 pins to 0.2.x, which is incompatible with langchain 0.3.0 and the other 0.3.x packages in this file (langchain_core, langchain-community). All LangChain packages in the 0.3 ecosystem should stay in the same minor series—wdyt about updating to >=0.3.0,<0.4.0 for consistency?

-langchain-openai = { version = ">=0.2.0,<0.3.0", optional = true }
+langchain-openai = { version = ">=0.3.0,<0.4.0", optional = true }
🧹 Nitpick comments (1)
pyproject.toml (1)

67-67: Consider using version range for langchain (consistency and flexibility)

Line 67 pins langchain to an exact version (0.3.0), while langchain_core and other LangChain packages use ranges (e.g., >=0.3.0,<0.4.0). This approach is more flexible and allows patch-level updates without requiring manual version bumps. Wdyt about aligning with the pattern used for other packages?

-langchain = { version = "0.3.0", optional = true }
+langchain = { version = ">=0.3.0,<0.4.0", optional = true }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6757a1f and 11d5bf0.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • pyproject.toml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: MyPy Check
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: preview_docs
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
pyproject.toml (3)

69-69: Verify langchain-cohere version range upper bound

Line 69 allows langchain-cohere versions up to 0.5.0 (>=0.3.0,<0.5.0), whereas the other 0.3.x packages in this file (langchain_core, langchain-community) are constrained to <0.4.0. Should langchain-cohere also be narrowed to <0.4.0 for consistency with the 0.3.x family, or is the broader range <0.5.0 intentional?


122-122: Good addition to resolve mypy stubs issue

Adding types-pytz = "*" to dev dependencies addresses the mypy error mentioned in the PR comments about missing stubs for pytz. This is a solid fix for typing support.


128-128: Extras correctly updated to include new langchain packages

The vector-db-based extras have been properly updated to include all the new langchain packages (langchain_core, langchain-cohere, langchain-community, langchain-openai) alongside the existing dependencies. This ensures users can access the full embedding/vectorization functionality with the new package structure.
