feat: add bocha web search component #13322
Walkthrough
This PR adds a new Bocha web search component to the Langflow component system. It includes the component implementation with API integration, module registration for proper imports, a catalog entry with embedded schema, and user documentation with sidebar navigation integration.
Changes
Bocha Web Search Component Integration
🎯 3 (Moderate) | ⏱️ ~20 minutes
Important
Pre-merge checks failed
Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 3 warnings)
✅ Passed checks (5 passed)
The remaining failing check is
This branch exists in my fork ( |
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/docs/Components/bundles-bocha.mdx`:
- Line 29: Update the docs table row for the count parameter (the "count" table
cell) to mention the hard cap of 50 in addition to the default of 10—e.g.,
change the description to "Input parameter. The maximum number of search results
to return. Default: `10`. Maximum: `50`." so users know the results are capped
at 50.
In `@src/lfx/src/lfx/components/bocha/bocha_web_search.py`:
- Line 66: The payload currently uses "count": min(int(self.count), 50) which
only caps the upper bound; change it to clamp to a valid lower bound as well
(e.g., "count": max(1, min(int(self.count), 50))) so zero or negative values
cannot be sent; update the construction in bocha_web_search.py (the code that
sets the "count" field) to use this two-sided clamp and ensure int(self.count)
is still applied before clamping.
- Around line 74-75: The code calls response.json() and then accesses webPages
but doesn't handle JSON parsing errors; wrap the response.json() call in a
try/except that catches json.decoder.JSONDecodeError, ValueError (and optionally
TypeError) around the call in the function in bocha_web_search.py, log or build
a clear message and return the structured Data(text=msg, data={"error": msg})
instead of letting the exception escape; keep the rest of the httpx exception
handling intact and continue to use the same web_pages extraction
(result.get("data", {}).get("webPages", {}).get("value", [])) when parsing
succeeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e4f80bee-7da3-43e4-bcdb-1b18132cbeb8
📒 Files selected for processing (6)
- docs/docs/Components/bundles-bocha.mdx
- docs/sidebars.js
- src/lfx/src/lfx/_assets/component_index.json
- src/lfx/src/lfx/components/__init__.py
- src/lfx/src/lfx/components/bocha/__init__.py
- src/lfx/src/lfx/components/bocha/bocha_web_search.py
|------|------|-------------|
| api_key | SecretString | Input parameter. The API key for authenticating with Bocha. |
| query | String | Input parameter. The search query to send to Bocha. |
| count | Integer | Input parameter. The maximum number of search results to return. Default: `10`. |
Document the maximum count limit.
The parameter table shows the default count is 10, but according to the review context, the result count is capped at 50. Users should be informed of this maximum limit to set appropriate expectations.
📝 Proposed fix to document the maximum limit
-| count | Integer | Input parameter. The maximum number of search results to return. Default: `10`. |
+| count | Integer | Input parameter. The maximum number of search results to return. Default: `10`. Maximum: `50`. |
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-| count | Integer | Input parameter. The maximum number of search results to return. Default: `10`. |
+| count | Integer | Input parameter. The maximum number of search results to return. Default: `10`. Maximum: `50`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/docs/Components/bundles-bocha.mdx` at line 29, Update the docs table row
for the count parameter (the "count" table cell) to mention the hard cap of 50
in addition to the default of 10—e.g., change the description to "Input
parameter. The maximum number of search results to return. Default: `10`.
Maximum: `50`." so users know the results are capped at 50.
    "query": self.query,
    "summary": self.summary,
    "freshness": self.freshness or "noLimit",
    "count": min(int(self.count), 50),
Clamp count to a valid lower bound before issuing the request.
Line 66 only enforces an upper cap. Zero/negative values can still be sent and cause avoidable API failures.
Suggested fix
- payload = {
+ count = max(1, min(int(self.count), 50))
+ payload = {
      "query": self.query,
      "summary": self.summary,
      "freshness": self.freshness or "noLimit",
-     "count": min(int(self.count), 50),
+     "count": count,
  }
📝 Committable suggestion
-     "count": min(int(self.count), 50),
+ count = max(1, min(int(self.count), 50))
+ payload = {
+     "query": self.query,
+     "summary": self.summary,
+     "freshness": self.freshness or "noLimit",
+     "count": count,
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lfx/src/lfx/components/bocha/bocha_web_search.py` at line 66, The payload
currently uses "count": min(int(self.count), 50) which only caps the upper
bound; change it to clamp to a valid lower bound as well (e.g., "count": max(1,
min(int(self.count), 50))) so zero or negative values cannot be sent; update the
construction in bocha_web_search.py (the code that sets the "count" field) to
use this two-sided clamp and ensure int(self.count) is still applied before
clamping.
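The two-sided clamp described above can be sketched in isolation. This is a minimal illustration, not the component's actual code: `clamp_count` and `build_payload` are hypothetical helper names, and the bounds of 1 and 50 are taken from the review comment rather than from Bocha's API documentation.

```python
def clamp_count(raw_count, lower=1, upper=50):
    """Clamp a requested result count to the inclusive range [lower, upper]."""
    # int() is applied before clamping, mirroring int(self.count) in the review.
    return max(lower, min(int(raw_count), upper))


def build_payload(query, count, summary=True, freshness=None):
    """Hypothetical stand-in for the component's payload construction."""
    return {
        "query": query,
        "summary": summary,
        "freshness": freshness or "noLimit",
        "count": clamp_count(count),
    }
```

With this shape, a zero or negative `count` is raised to 1 and anything above 50 is lowered to 50 before the request is built. Note that `int()` can still raise on non-numeric input; the sketch does not handle that case.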
    result = response.json()
    web_pages = result.get("data", {}).get("webPages", {}).get("value", [])
🧩 Analysis chain
🌐 Web query:
Does `httpx.Response.json()` raise `ValueError`/`JSONDecodeError` when the response body is not valid JSON?
💡 Result:
Yes, the httpx.Response.json method raises a json.decoder.JSONDecodeError if the response body is not valid JSON [1][2]. Because httpx uses the standard library's json module to parse the response body, any failure to decode the content as JSON will propagate the exception raised by json.loads [1][2]. This includes scenarios where the response body is empty, malformed, or contains content that does not conform to JSON syntax [1][3][2]. Additionally, depending on the response content and encoding, it is also possible for the method to raise a UnicodeDecodeError if the response body contains bytes that cannot be decoded into a string using the detected or default character encoding [1][4].
Citations:
- 1: https://stackoverflow.com/questions/75773019/what-exceptions-can-be-raised-by-python-httpxs-json-method
- 2: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) encode/httpx#2695
- 3: https://www.reddit.com/r/learnpython/comments/524ufd/randomly_receiving_jsondecoderjsondecodeerror/
- 4: Cleanup `response.json()` method. encode/httpx#2911
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/lfx/src/lfx/components/bocha/bocha_web_search.py"
# Show the relevant sections around the reported line ranges
nl -ba "$FILE" | sed -n '50,130p'
Repository: langflow-ai/langflow
Length of output: 107
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/lfx/src/lfx/components/bocha/bocha_web_search.py"
# Print with line numbers (cat -n exists) and focus around the referenced areas
cat -n "$FILE" | sed -n '60,120p'
Repository: langflow-ai/langflow
Length of output: 2389
Handle invalid JSON responses so bocha_web_search fails gracefully
response.json() exceptions (e.g., json.decoder.JSONDecodeError / ValueError) aren’t caught by the current httpx exception handlers, so malformed/empty JSON will escape the component instead of returning the structured Data(text=msg, data={"error": msg}).
Suggested fix
  except httpx.TimeoutException:
      msg = "Bocha request timed out."
+ except ValueError as exc:
+     msg = f"Bocha response parse error: {exc}"
  except httpx.HTTPStatusError as exc:
      msg = f"Bocha HTTP error: {exc.response.status_code} - {exc.response.text}"
  except httpx.RequestError as exc:
      msg = f"Bocha request failed: {exc}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lfx/src/lfx/components/bocha/bocha_web_search.py` around lines 74 - 75,
The code calls response.json() and then accesses webPages but doesn't handle
JSON parsing errors; wrap the response.json() call in a try/except that catches
json.decoder.JSONDecodeError, ValueError (and optionally TypeError) around the
call in the function in bocha_web_search.py, log or build a clear message and
return the structured Data(text=msg, data={"error": msg}) instead of letting the
exception escape; keep the rest of the httpx exception handling intact and
continue to use the same web_pages extraction (result.get("data",
{}).get("webPages", {}).get("value", [])) when parsing succeeds.
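The parsing guard described above can be sketched without a live request. This is a rough illustration under stated assumptions: `extract_web_pages` is a hypothetical helper, and `json.loads` stands in for `response.json()` (which, per the analysis above, propagates the same `json.decoder.JSONDecodeError`). The `or {}` fallbacks are an addition not present in the PR; they guard against explicit `null` values, which `dict.get(key, {})` alone would pass through as `None`.

```python
import json


def extract_web_pages(body_text):
    """Return (web_pages, error_message); error_message is None on success."""
    try:
        result = json.loads(body_text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return [], f"Bocha response parse error: {exc}"
    if not isinstance(result, dict):
        return [], "Bocha response parse error: expected a JSON object"
    # Assumed response shape, taken from the review: data -> webPages -> value.
    data = result.get("data") or {}
    web_pages = (data.get("webPages") or {}).get("value", [])
    return web_pages, None
```

In the component itself, the error string would feed the existing `Data(text=msg, data={"error": msg})` return path rather than a tuple; the tuple is used here only to keep the sketch free of Langflow imports.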
As with the other search engine integrations, this PR introduces the Bocha Web Search API into the project.
Summary
Added Bocha Web Search integration as a new Langflow component.
Changes
Testing
Summary by CodeRabbit
New Features
Documentation