Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ Only write entries that are worth mentioning to users.

## Unreleased

- Core: Fix crash on streaming mid-flight network disconnection — when the OpenAI SDK raises a base `APIError` (instead of `APIConnectionError`) during long-running streams, the error is now correctly classified as retryable, enabling automatic retry and connection recovery instead of an unrecoverable crash
- Shell: Exclude empty current session from `/sessions` picker — completely empty sessions (no conversation history and no custom title) are no longer shown in the session list; sessions with a custom title are still displayed
- Shell: Fix slash command completion Enter key behavior — accepting a completion now submits in a single Enter press; auto-submit is limited to slash command completions only; file mention completions (`@`) accept without submitting so the user can continue editing; re-completion after accepting is suppressed to prevent stale completion state
- Shell: Add directory scope toggle to `/sessions` picker — press `Ctrl+A` to switch between showing sessions for the current working directory only or across all known directories; uses a new full-screen session picker UI with header scope indicator and footer hint bar
1 change: 1 addition & 0 deletions docs/en/release-notes/changelog.md
@@ -4,6 +4,7 @@ This page documents the changes in each Kimi Code CLI release.

## Unreleased

- Core: Fix crash on streaming mid-flight network disconnection — when the OpenAI SDK raises a base `APIError` (instead of `APIConnectionError`) during long-running streams, the error is now correctly classified as retryable, enabling automatic retry and connection recovery instead of an unrecoverable crash
- Shell: Exclude empty current session from `/sessions` picker — completely empty sessions (no conversation history and no custom title) are no longer shown in the session list; sessions with a custom title are still displayed
- Shell: Fix slash command completion Enter key behavior — accepting a completion now submits in a single Enter press; auto-submit is limited to slash command completions only; file mention completions (`@`) accept without submitting so the user can continue editing; re-completion after accepting is suppressed to prevent stale completion state
- Shell: Add directory scope toggle to `/sessions` picker — press `Ctrl+A` to switch between showing sessions for the current working directory only or across all known directories; uses a new full-screen session picker UI with header scope indicator and footer hint bar
1 change: 1 addition & 0 deletions docs/zh/release-notes/changelog.md
@@ -4,6 +4,7 @@

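## Unreleased

- Core: Fix a crash caused by network disconnection during long-running streaming: when the OpenAI SDK raises the base class `APIError` (rather than `APIConnectionError`) mid-stream, the error is now correctly recognized as retryable, triggering automatic retry and connection recovery instead of crashing and exiting
- Shell: Exclude the empty current session from the `/sessions` picker: completely empty sessions (no conversation history and no custom title) are no longer shown in the session list; sessions with a custom title are still displayed
- Shell: Fix slash command completion Enter key behavior: after accepting a completion, a single Enter press now submits the command; auto-submit is limited to slash command completions, and file mention (`@`) completions accept without submitting so editing can continue; re-completion is suppressed when accepting a completion to prevent stale completion state
- Shell: Add a directory scope toggle to the `/sessions` session picker: press `Ctrl+A` to switch the session list between "current working directory only" and "all known directories"; uses a full-screen session picker UI that shows the current scope at the top and shortcut hints at the bottom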
2 changes: 2 additions & 0 deletions packages/kosong/CHANGELOG.md
@@ -2,6 +2,8 @@

## Unreleased

- OpenAI: Fix crash on streaming mid-flight disconnection — classify base `openai.APIError` (body=None) as retryable via heuristic message matching, so that `_run_with_connection_recovery` and tenacity retry logic correctly trigger instead of crashing
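Why the classification matters can be sketched with a plain retry loop. This is an illustrative sketch only: the three exception classes are stand-ins defined here (not the kosong classes), and `call_with_retry` and `make_flaky` are hypothetical helpers, not `_run_with_connection_recovery` or tenacity. It assumes that retry logic only catches the connection and timeout types, so an error mapped to the generic type propagates on the first attempt.

```python
import time


class ChatProviderError(Exception):
    """Stand-in for the generic, non-retryable error type."""


class APIConnectionError(ChatProviderError):
    """Stand-in for a retryable connection error."""


class APITimeoutError(ChatProviderError):
    """Stand-in for a retryable timeout error."""


# Assumption: only these two types are treated as retryable.
RETRYABLE = (APIConnectionError, APITimeoutError)


def call_with_retry(fn, max_attempts=3, delay=0.0):
    """Call fn(), retrying only when it raises a retryable error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RETRYABLE:
            # Give up once the attempt budget is exhausted.
            if attempt == max_attempts:
                raise
            time.sleep(delay)


def make_flaky(failures, exc_type):
    """Return a callable that raises exc_type `failures` times, then succeeds."""
    state = {"calls": 0}

    def fn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise exc_type("Network connection lost.")
        return state["calls"]

    return fn
```

With these stand-ins, a callable that fails twice with `APIConnectionError` succeeds on the third attempt, while one that fails with the generic `ChatProviderError` is never retried.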

## 0.48.0 (2026-04-02)

- Google GenAI: Add `default_headers` parameter to `GoogleGenAI` constructor — custom headers are merged into `HttpOptions` so they are included in all API requests
25 changes: 25 additions & 0 deletions packages/kosong/src/kosong/chat_provider/openai_common.py
@@ -1,5 +1,6 @@
import asyncio
import inspect
import re
from collections.abc import Awaitable, Mapping
from typing import Any, cast

@@ -86,10 +87,34 @@ def convert_error(error: OpenAIError | httpx.HTTPError) -> ChatProviderError:
            return APITimeoutError(error.message)
        case openai.APIConnectionError():
            return APIConnectionError(error.message)
        case openai.APIError() if type(error) is openai.APIError and error.body is None:
            # Base APIError with no body indicates a transport-layer failure
            # (e.g. "Network connection lost." during streaming). SSE error
            # events from the server carry a body dict and should fall through
            # to the default case instead.
            return _classify_base_api_error(error.message)
        case _:
            return ChatProviderError(f"Error: {error}")


_NETWORK_RE = re.compile(r"network|connection|connect|disconnect", re.IGNORECASE)
_TIMEOUT_RE = re.compile(r"timed?\s*out|timeout|deadline", re.IGNORECASE)


def _classify_base_api_error(message: str) -> ChatProviderError:
    """Heuristically map an ``openai.APIError`` message to a retryable error type.

    Timeout patterns are checked first because a message like
    "connection timed out" should be classified as a timeout, not a
    connection error.
    """
    if _TIMEOUT_RE.search(message):
        return APITimeoutError(message)
    if _NETWORK_RE.search(message):
        return APIConnectionError(message)
    return ChatProviderError(f"Error: {message}")
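The ordering of the two checks can be tried out in isolation. The sketch below copies the two patterns from the diff and returns plain labels instead of kosong error objects; the `classify` function and its label strings are hypothetical and exist only for this illustration.

```python
import re

# Patterns copied from the diff above.
_NETWORK_RE = re.compile(r"network|connection|connect|disconnect", re.IGNORECASE)
_TIMEOUT_RE = re.compile(r"timed?\s*out|timeout|deadline", re.IGNORECASE)


def classify(message: str) -> str:
    """Label a message, checking timeout patterns before network patterns."""
    if _TIMEOUT_RE.search(message):
        return "timeout"
    if _NETWORK_RE.search(message):
        return "connection"
    return "other"
```

Because `re.search` matches substrings, the `connect` alternative already covers "connection" and "disconnect"; the longer alternatives appear to be kept for readability. Messages containing only "reset" or "closed" match neither pattern and stay in the generic category.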


def thinking_effort_to_reasoning_effort(effort: ThinkingEffort) -> ReasoningEffort:
    match effort:
        case "off":
148 changes: 147 additions & 1 deletion packages/kosong/tests/test_openai_common.py
@@ -2,9 +2,16 @@
from typing import Any

import httpx
import openai
import pytest

from kosong.chat_provider import APIConnectionError, openai_common
from kosong.chat_provider import (
    APIConnectionError,
    APITimeoutError,
    ChatProviderError,
    openai_common,
)
from kosong.chat_provider.openai_common import convert_error
from kosong.contrib.chat_provider.openai_legacy import OpenAILegacy


@@ -45,3 +52,142 @@ async def test_retry_recovery_does_not_close_shared_http_client() -> None:
    assert provider.client._client is http_client  # type: ignore[reportPrivateUsage]
    assert http_client.is_closed is False
    await http_client.aclose()


# ---------------------------------------------------------------------------
# convert_error: openai.APIError (base class) handling
# ---------------------------------------------------------------------------

_DUMMY_REQUEST = httpx.Request("POST", "https://api.test")


class TestConvertErrorBaseAPIError:
    """openai.APIError (the base class, NOT APIConnectionError) must be
    correctly mapped when the error message indicates a network issue.

    This guards against the bug where streaming mid-flight disconnections
    raise ``openai.APIError("Network connection lost.")`` instead of
    ``openai.APIConnectionError``, and the converter falls through to
    the generic ``ChatProviderError``, bypassing all retry/recovery logic.
    """

    @pytest.mark.parametrize(
        ("message", "expected_type"),
        [
            ("Network connection lost.", APIConnectionError),
            ("Connection error.", APIConnectionError),
            ("network error", APIConnectionError),
            ("disconnected from server", APIConnectionError),
            ("connection reset by peer", APIConnectionError),
            ("connection closed unexpectedly", APIConnectionError),
            ("Request timed out.", APITimeoutError),
            ("timed out", APITimeoutError),
            # Timeout must take priority over network when both patterns match.
            ("connection timed out", APITimeoutError),
            ("Something completely unrelated", ChatProviderError),
            ("Internal server error", ChatProviderError),
            # Bare "reset"/"closed" must NOT match: they are too broad
            # and could appear in non-network server messages.
            ("Your session has been reset", ChatProviderError),
            ("Stream closed by server due to policy violation", ChatProviderError),
        ],
        ids=[
            "network_connection_lost",
            "connection_error",
            "network_error",
            "disconnected",
            "connection_reset_by_peer",
            "connection_closed_unexpectedly",
            "request_timed_out",
            "timed_out",
            "connection_timed_out_timeout_priority",
            "unrelated_error",
            "internal_server_error",
            "bare_reset_no_match",
            "bare_closed_no_match",
        ],
    )
    def test_base_api_error_mapping(
        self, message: str, expected_type: type[ChatProviderError]
    ) -> None:
        err = openai.APIError(message=message, request=_DUMMY_REQUEST, body=None)
        result = convert_error(err)
        assert type(result) is expected_type, (
            f"Expected {expected_type.__name__} for message={message!r}, "
            f"got {type(result).__name__}"
        )

    def test_subclass_errors_still_match_first(self) -> None:
        """Existing specific error types must still be matched before
        the new base APIError branch."""
        # APIConnectionError should still match its own case
        conn_err = openai.APIConnectionError(request=_DUMMY_REQUEST)
        result = convert_error(conn_err)
        assert type(result) is APIConnectionError

        # APITimeoutError should still match its own case
        timeout_err = openai.APITimeoutError(request=_DUMMY_REQUEST)
        result = convert_error(timeout_err)
        assert type(result) is APITimeoutError

    def test_api_error_with_body_skips_heuristic(self) -> None:
        """SSE error events carry a body dict, so they must NOT be
        heuristically reclassified, even if the message contains
        network keywords."""
        err = openai.APIError(
            message="Connection limit exceeded",
            request=_DUMMY_REQUEST,
            body={"error": {"message": "Connection limit exceeded", "type": "server_error"}},
        )
        result = convert_error(err)
        assert type(result) is ChatProviderError

    def test_api_response_validation_error_falls_through(self) -> None:
        """APIResponseValidationError is an APIError subclass, so the
        exact-type guard must keep it out of the heuristic even when
        body is None and the message contains network keywords."""
        resp = httpx.Response(200, request=_DUMMY_REQUEST)
        err = openai.APIResponseValidationError(
            response=resp,
            body=None,
            message="connection field missing in response",
        )
        # The heuristic branch requires ``type(error) is openai.APIError``.
        # APIResponseValidationError is a subclass, so it skips that branch
        # even with body=None and the keyword "connection" in the message.
        # The key point: it should become ChatProviderError, not APIConnectionError.
        result = convert_error(err)
        assert type(result) is ChatProviderError


# ---------------------------------------------------------------------------
# Streaming error propagation (integration)
# ---------------------------------------------------------------------------


class TestOpenAIStreamingErrorPropagation:
    """When openai.APIError is raised during OpenAI stream consumption,
    _convert_stream_response must convert it to the correct kosong error type.

    This is the exact scenario from the bug: streaming for ~33 minutes,
    then the SSE connection drops and the SDK raises
    openai.APIError("Network connection lost."), which must become
    APIConnectionError so that retry/recovery logic triggers.
    """

    async def test_base_api_error_becomes_connection_error(self) -> None:
        """openai.APIError("Network connection lost.") during streaming
        must surface as kosong APIConnectionError."""
        from kosong.contrib.chat_provider.openai_legacy import OpenAILegacyStreamedMessage

        async def _failing_stream() -> Any:
            raise openai.APIError(
                message="Network connection lost.",
                request=_DUMMY_REQUEST,
                body=None,
            )
            yield  # make this an async generator  # noqa: RUF027

        msg = OpenAILegacyStreamedMessage(_failing_stream(), reasoning_key=None)  # type: ignore[arg-type]
        with pytest.raises(APIConnectionError, match="Network connection lost"):
            async for _ in msg:
                pass
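The test above relies on an async generator that raises before its first `yield`, and on the consumer seeing a converted error type. That pattern can be sketched without kosong or the OpenAI SDK. Everything below is a hypothetical stand-in: `TransportError`, `ConnectionLost`, `converted`, and `consume` are illustrative names and do not reflect how `OpenAILegacyStreamedMessage` is implemented.

```python
import asyncio


class TransportError(Exception):
    """Hypothetical low-level error raised by the underlying stream."""


class ConnectionLost(Exception):
    """Hypothetical converted error that the consumer is meant to see."""


async def failing_stream():
    raise TransportError("Network connection lost.")
    yield  # unreachable; its presence makes this function an async generator


async def converted(stream):
    """Pass chunks through, re-raising stream errors as ConnectionLost."""
    try:
        async for chunk in stream:
            yield chunk
    except TransportError as exc:
        raise ConnectionLost(str(exc)) from exc


async def consume():
    """Iterate the wrapped stream and report the converted error message."""
    try:
        async for _ in converted(failing_stream()):
            pass
    except ConnectionLost as exc:
        return str(exc)
    return None
```

Running `asyncio.run(consume())` returns the original message, showing that the error raised inside the inner generator reaches the consumer as the converted type.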