fix: handle APIs that return SSE-style string responses by parsing them as JSON and reconstructing a ChatCompletion object #7280
Work around a bug where some APIs force SSE-formatted responses
Hey - I've found 4 issues, and left some high level feedback:
- Consider preserving the original exception when auto-fix fails by using `raise ... from e` rather than creating a fresh `Exception`, so that debugging retains the full stack trace and error context.
- Using `ChatCompletion.construct(**completion_dict)` bypasses Pydantic validation; if possible, prefer the standard initializer (e.g., `ChatCompletion.model_validate(...)` or equivalent) to catch malformed responses early.
- The SSE compatibility logic assumes a single `data:` payload; if upstream APIs send multi-line or batched SSE messages, you may need to split and join the relevant `data:` lines before JSON parsing to avoid partial or invalid JSON.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider preserving the original exception when auto-fix fails by using `raise ... from e` rather than creating a fresh `Exception`, so that debugging retains the full stack trace and error context.
- Using `ChatCompletion.construct(**completion_dict)` bypasses Pydantic validation; if possible, prefer the standard initializer (e.g., `ChatCompletion.model_validate(...)` or equivalent) to catch malformed responses early.
- The SSE compatibility logic assumes a single `data:` payload; if upstream APIs send multi-line or batched SSE messages, you may need to split and join the relevant `data:` lines before JSON parsing to avoid partial or invalid JSON.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="467-476" />
<code_context>
-
+
+ # --- New: work around a bug where some APIs force SSE-formatted responses ---
+ if isinstance(completion, str):
+     logger.warning(f"Detected that the API returned a string instead of an object; attempting auto-fix: {completion[:100]}...")
+     try:
+         # If the response has the form data:{...}, strip the "data:" prefix and parse the JSON
+         json_str = completion.strip()
+         if json_str.startswith("data:"):
+             json_str = json_str[5:].strip()
+
+         # Try to parse the JSON
+         completion_dict = json.loads(json_str)
+
+         # Rebuild the ChatCompletion object
</code_context>
<issue_to_address>
**issue (bug_risk):** SSE-like `data:` responses can contain multiple lines and trailing markers which this logic currently ignores.
Some gateways send multi-line SSE chunks (e.g. `data:{...}\n\ndata:[DONE]`) or extra newlines. Since this code only strips a single leading `data:`, `json.loads` will fail when there are multiple `data:` lines or a `[DONE]` sentinel. Consider splitting on newlines, discarding `[DONE]`/empty lines, and parsing only the last valid `data:` JSON line.
</issue_to_address>
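The parsing strategy suggested in this comment can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: the helper name `extract_last_sse_json` is hypothetical, and the sketch assumes each `data:` line carries one complete JSON object.

```python
import json


def extract_last_sse_json(raw: str) -> dict:
    """Return the JSON object from the last valid `data:` line of an SSE-style string.

    Hypothetical helper for illustration; assumes one complete JSON object per line.
    """
    payload = None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if not line or line == "[DONE]":
            continue  # skip blank separator lines and the end-of-stream sentinel
        try:
            candidate = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON (e.g. `event:` fields)
        if isinstance(candidate, dict):
            payload = candidate  # later valid lines replace earlier ones
    if payload is None:
        raise ValueError("no JSON object found in response")
    return payload
```

A plain JSON string without any `data:` prefix passes through the same loop, so the helper would also cover the non-SSE case.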
### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="475-481" />
<code_context>
+     json_str = json_str[5:].strip()
+
+ # Try to parse the JSON
+ completion_dict = json.loads(json_str)
+
+ # Rebuild the ChatCompletion object
+ completion = ChatCompletion.construct(**completion_dict)
+ logger.info("Successfully converted the string response into a ChatCompletion object.")
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Using `ChatCompletion.construct` bypasses validation and may admit malformed data.
Because `construct` skips validation and type coercion, malformed or partially invalid JSON can become a `ChatCompletion` instance that violates its invariants and fails later in harder-to-debug ways. Prefer a validated constructor (e.g. `ChatCompletion(**completion_dict)` or a proper `from_*` helper) so bad responses fail fast with validation errors.
```suggestion
# Try to parse the JSON
completion_dict = json.loads(json_str)
# Rebuild the ChatCompletion object (use the validating constructor rather than construct)
completion = ChatCompletion(**completion_dict)
logger.info("Successfully converted the string response into a ChatCompletion object.")
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/provider/sources/openai_source.py" line_range="482-485" />
<code_context>
+         completion = ChatCompletion.construct(**completion_dict)
+         logger.info("Successfully converted the string response into a ChatCompletion object.")
+
+     except Exception as e:
+         logger.error(f"Auto-fix failed: {e}")
+         # If the fix fails, re-raise the original error
+         raise Exception(f"The API returned a malformed response that could not be fixed: {type(completion)}: {completion}.")
+ # ---------------------------------------------------
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Re-raising a new generic `Exception` here loses the original traceback and error type.
The comment promises to rethrow the original error, but the code creates a new `Exception`, discarding the original stack trace and specific error from `json.loads` / `ChatCompletion`. To keep debugging context, either use a bare `raise` to rethrow `e`, or raise a more specific/custom exception with `raise ... from e` so the root cause is preserved via exception chaining.
</issue_to_address>
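The `raise ... from e` option mentioned in this comment can be illustrated with a small standalone sketch. The names `ResponseFormatError` and `parse_payload` are hypothetical and do not appear in the PR; the point is only that explicit chaining stores the original error on `__cause__`.

```python
import json


class ResponseFormatError(Exception):
    """Hypothetical error type for responses that cannot be repaired."""


def parse_payload(raw: str) -> object:
    """Parse a JSON string, wrapping failures in a domain-specific error."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        # `from e` records the JSON error as __cause__, so the traceback
        # shows both the root cause and this higher-level error.
        raise ResponseFormatError(
            f"response of length {len(raw)} is not valid JSON"
        ) from e


try:
    parse_payload("data:[DONE]")
except ResponseFormatError as err:
    # The original JSONDecodeError is still reachable for debugging.
    assert isinstance(err.__cause__, json.JSONDecodeError)
```

The message here reports only the payload length, which also sidesteps echoing the full response into the exception text.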
### Comment 4
<location path="astrbot/core/provider/sources/openai_source.py" line_range="468" />
<code_context>
+
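+ # --- New: work around a bug where some APIs force SSE-formatted responses ---
+ if isinstance(completion, str):
+     logger.warning(f"Detected that the API returned a string instead of an object; attempting auto-fix: {completion[:100]}...")
+     try:
+         # If the response has the form data:{...}, strip the "data:" prefix and parse the JSON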
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Logging the raw response snippet may expose sensitive content and could be toned down or guarded.
This substring can still include user prompts or other private data, which may be sensitive depending on where logs are stored or shipped. Please consider masking/redacting the payload, logging only metadata (e.g., length or content type), or gating this detailed snippet behind a debug-only flag.
Suggested implementation:
```python
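if isinstance(completion, str):
    # Avoid exposing the raw response content in logs; record metadata only
    logger.warning("Detected that the API returned a string instead of an object; attempting auto-fix. Response content omitted to protect privacy.")
    # At debug level, more detailed information (e.g. length) can be logged, without the actual text
    try:
        response_length = len(completion)
    except Exception:
        response_length = None
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Raw string response metadata: length=%s", response_length)
    try: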
```
If not already present at the top of `astrbot/core/provider/sources/openai_source.py`, add `import logging` and ensure `logger` is configured appropriately for your project’s logging setup.
</issue_to_address>
Code Review
This pull request introduces a mechanism to handle cases where the OpenAI API returns SSE-formatted strings instead of objects. The reviewer identified that using ChatCompletion.construct is insufficient because it does not recursively parse nested dictionaries, which would lead to AttributeErrors later. Additionally, the reviewer recommended enhancing the parsing logic to correctly handle multi-line SSE responses and exclude the [DONE] marker to prevent JSON decoding failures.
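There are two main problems here:
- Non-recursive construction: `ChatCompletion.construct` (or `model_construct` in Pydantic v2) is not recursive. Nested dictionaries in `completion_dict` (such as the items in the `choices` list) are therefore not converted into Pydantic model objects and remain plain `dict`s. Later code (e.g. line 731) will then raise `AttributeError` when it accesses `choice.message.content`. Consider using `ChatCompletion.model_validate()` instead.
- Multi-line SSE handling: if the API returns an SSE response that spans multiple lines (for example, one ending with `data: [DONE]`), the current `json.loads` call fails because the string contains non-JSON characters. Consider iterating over `splitlines()` and extracting the first valid JSON data block.
In addition, consider tightening the exception handling to make the code more robust.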
if isinstance(completion, str):
    logger.warning(f"Detected that the API returned a string instead of an object; attempting auto-fix: {completion[:100]}...")
    try:
        # Handle multi-line SSE: extract the first data line that contains valid JSON
        json_str = None
        for line in completion.splitlines():
            line = line.strip()
            if line.startswith("data:"):
                content = line[5:].strip()
                if content and content != "[DONE]":
                    json_str = content
                    break
        if not json_str:
            json_str = completion.strip()
        completion_dict = json.loads(json_str)
        # Use model_validate so nested objects (e.g. choices, message) are parsed into Pydantic models
        # construct is not recursive and would lead to AttributeError on later attribute access
        completion = ChatCompletion.model_validate(completion_dict)
        logger.info("Successfully converted the string response into a ChatCompletion object.")
    except Exception as e:
        logger.error(f"Auto-fix failed: {e}")
        raise Exception(f"The API returned a malformed response that could not be fixed: {type(completion)}: {completion}.")
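The non-recursive construction problem described in this comment can be reproduced with a standard-library analogy. This sketch does not use Pydantic: `types.SimpleNamespace` stands in for an unvalidated, shallow constructor, and the names `shallow` and `deep` are hypothetical.

```python
from types import SimpleNamespace


def shallow(data: dict) -> SimpleNamespace:
    """Wrap only the top level; nested dicts stay plain dicts."""
    return SimpleNamespace(**data)


def deep(value):
    """Recursively wrap dicts (and lists of dicts) so attribute access works at every level."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: deep(v) for k, v in value.items()})
    if isinstance(value, list):
        return [deep(v) for v in value]
    return value


data = {"choices": [{"message": {"content": "hi"}}]}

try:
    # choices[0] is still a dict here, so attribute access fails.
    shallow(data).choices[0].message
except AttributeError as e:
    print("shallow wrapper failed:", e)

# With recursive conversion, nested attribute access succeeds.
print(deep(data).choices[0].message.content)
```

A validating, recursive constructor plays the role of `deep` here, which is why the comment recommends it over a shallow one.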
# --- New: work around a bug where some APIs force SSE-formatted responses ---
if isinstance(completion, str):
    logger.warning(f"Detected that the API returned a string instead of an object; attempting auto-fix: {completion[:100]}...")
    try:
        # If the response has the form data:{...}, strip the "data:" prefix and parse the JSON
        json_str = completion.strip()
        if json_str.startswith("data:"):
            json_str = json_str[5:].strip()

        # Try to parse the JSON
        completion_dict = json.loads(json_str)

        # Rebuild the ChatCompletion object
        completion = ChatCompletion.construct(**completion_dict)
        logger.info("Successfully converted the string response into a ChatCompletion object.")

    except Exception as e:
        logger.error(f"Auto-fix failed: {e}")
        # If the fix fails, re-raise the original error
        raise Exception(f"The API returned a malformed response that could not be fixed: {type(completion)}: {completion}.")
# ---------------------------------------------------
Suggested change:
if isinstance(completion, str):
    try:
        # see #7280
        json_str = completion.strip().removeprefix("data:").strip()
        completion_dict = json.loads(json_str)
        completion = ChatCompletion.construct(**completion_dict)
    except Exception as e:
        raise Exception(
            f"The API returned a string response that cannot be parsed as a ChatCompletion. Response: {completion[:200]}... Error: {e}"
        )
Work around APIs that force SSE-formatted output by converting the string response into a ChatCompletion object.
Summary by Sourcery
Bug Fixes: