Commit 4d10adb

feat(core): optional ThinkingChunk StreamChunk variant (default-off, upstream-compatible; supersedes #39) (#169)
* feat(core): optional ThinkingChunk StreamChunk variant (default-off, upstream-compatible; supersedes #39)

  Adds a fourth, opt-in StreamChunk variant for streaming agent reasoning/"thinking" to chat platforms. The whole design is default-off: with no opt-in, the normalized stream and the posted message are byte-for-byte identical to upstream chat@4.31.

  - Additive type: ThinkingChunk(type="thinking", content=str) added to the StreamChunk union. The existing three variants are unaffected.
  - Opt-in emit: from_full_stream(stream, emit_thinking=False) and a thread-level emit_thinking config flag (threaded into the internal _from_full_stream) surface AI-SDK reasoning/reasoning-delta (and pydantic-ai part_kind=="thinking") parts as ThinkingChunk only when enabled. Default False drops reasoning exactly as upstream does.
  - Graceful consume: Thread._handle_stream never accumulates a ThinkingChunk into the posted-message text, and every adapter's stream handler skips it. Slack/Teams expose an optional render_thinking hook (via shared.adapter_utils.maybe_render_thinking); the text-accumulate adapters ignore it structurally: no crash, posted message unchanged.
  - No state pollution: ThinkingChunk is streaming-only. Message has no thinking field, to_json() is unchanged, and a round-tripped Message is byte-identical, so cross-SDK Redis/Postgres state stays compatible.
  - Docs: UPSTREAM_SYNC.md Known Non-Parity row added.

  Rationale: upstream's chat-platform SDK drops reasoning (leaves it to the AI-SDK web UI); chinchill streams thinking to Slack/Teams out-of-band today because there is no path. This gives the SDK a first-class, opt-in one without changing any default behavior.

  Gauntlet: ruff check + format, audit_test_quality (0 hard failures), verify_test_fidelity --strict (732/732, 0 missing), pyrefly src (0 errors), pytest (5121 passed, 4 pre-existing skips).

* refactor(core): keep StreamChunk upstream-exact, ThinkingChunk as opt-in StreamInput

  Revert the public ``StreamChunk`` union to upstream's three variants (``MarkdownTextChunk | TaskUpdateChunk | PlanUpdateChunk``) so a consumer doing an exhaustive ``match`` over it sees zero change on upgrade. The Python-only ``ThinkingChunk`` is no longer a member of that union.

  Introduce a public ``StreamInput = str | StreamChunk | ThinkingChunk`` alias for what a stream may yield, and widen only the stream-input/output boundaries to accept it: the ``Adapter.stream()`` protocol signature, ``from_full_stream``/``_from_full_stream`` returns, ``Thread._wrapped_stream``, and each receiving adapter's ``stream()`` signature. A producer can yield ``ThinkingChunk`` (opt-in) and the adapters type-check; code referencing ``StreamChunk`` itself is unaffected.

  All other thinking behavior is preserved byte-for-byte: ``emit_thinking`` defaults to False (default stream identical to upstream), the ``_handle_stream`` graceful skip, the adapter skip / ``render_thinking`` branches, ``maybe_render_thinking``, and the no-``Message.thinking``-field / round-trip-identical guarantee.

  The UPSTREAM_SYNC divergence row is updated to state that ``StreamChunk`` is NOT widened and ``ThinkingChunk`` is a separate opt-in input type. Tests now assert ``get_args(StreamChunk)`` has exactly the three upstream variants and that ``ThinkingChunk`` is excluded from ``StreamChunk`` but accepted by ``StreamInput``.
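The type layout described in the refactor commit can be sketched roughly as follows. This is an illustration rather than the SDK source: the `title` fields and the `task_update` / `plan_update` tags on `TaskUpdateChunk` and `PlanUpdateChunk` are placeholders (assumptions). Only `ThinkingChunk(type="thinking", content=str)`, the `markdown_text` tag, and the two alias definitions are taken from the commit message.

```python
from dataclasses import dataclass
from typing import Union, get_args


@dataclass
class MarkdownTextChunk:
    text: str
    type: str = "markdown_text"


@dataclass
class TaskUpdateChunk:
    title: str  # placeholder field (assumption, not the real schema)
    type: str = "task_update"  # placeholder tag (assumption)


@dataclass
class PlanUpdateChunk:
    title: str  # placeholder field (assumption, not the real schema)
    type: str = "plan_update"  # placeholder tag (assumption)


@dataclass
class ThinkingChunk:
    content: str
    type: str = "thinking"


# The public union keeps exactly the three upstream variants.
StreamChunk = Union[MarkdownTextChunk, TaskUpdateChunk, PlanUpdateChunk]

# Only the stream-boundary alias admits ThinkingChunk (and plain str).
StreamInput = Union[str, StreamChunk, ThinkingChunk]
```

With this shape, `get_args(StreamChunk)` still yields the three upstream classes, which is the property the commit says its tests assert, while `ThinkingChunk` shows up only in `get_args(StreamInput)`.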
1 parent 8a819f4 commit 4d10adb

21 files changed

Lines changed: 968 additions & 37 deletions

docs/UPSTREAM_SYNC.md

Lines changed: 1 addition & 0 deletions
@@ -684,6 +684,7 @@ stay explicit instead of being rediscovered in code review.
| Teams `cards_input` empty-options default (vercel/chat#8c71411, chat@4.31) | `input_request_to_teams_adaptive_card` reads `options = request.get("options") or []` | Upstream `cards-primitives/input.ts` reads `const options = request.options ?? []` | Benign truthiness divergence. The only values that differ between `or []` and `?? []` are the falsy-but-present ones (`[]`, `None`); for an options list both produce the same empty list, so the rendered card is byte-identical. Documented (not "fixed" to `is not None`) because there is no observable behavior difference — a present empty `options` and an absent `options` both yield "no choices". |
| Teams `graph` path-segment encoding (vercel/chat#8c71411, chat@4.31) | Path segments (`team_id` / `channel_id` / `chat_id` / `message_id`) are interpolated through `quote(segment, safe='')` | Upstream `graph/{channels,messages}.ts` use `encodeURIComponent(...)` | Benign over-encoding divergence. `quote(safe='')` percent-encodes a strictly larger set than `encodeURIComponent` (notably `!`, `'`, `(`, `)`, `*` — which `encodeURIComponent` leaves literal). Graph IDs (team/channel/chat/message GUIDs and thread tokens) never contain those characters, so the encoded path is identical in practice; where they did differ, the stricter `quote` is the safer choice (no URL injection via an unescaped sub-delimiter). Documented rather than narrowed to match `encodeURIComponent` exactly. |
| Teams `graph` defensive shape coercion (vercel/chat#8c71411, chat@4.31) | `to_graph_message` / channel + message readers coerce unexpected Graph payload shapes with `isinstance` guards (`x if isinstance(x, Mapping) else {}`, `value if isinstance(value, list) else []`) before reading fields | Upstream `graph/messages.ts` reads `message.from?.user` / `message.body?.content ?? ""` etc. with optional chaining — a non-object where an object is expected throws at the property access | Benign defensive divergence. Upstream's optional chaining tolerates `null`/`undefined` but throws on a wrong-typed non-null (e.g. a string where an object is expected); our `isinstance` coercion fails closed to an empty mapping/list instead of raising. For well-formed Graph responses the behavior is identical; the divergence only manifests on malformed payloads, where returning an empty-shape result is more resilient than throwing. Mirrors the repo's general "more resilient than throw" stance (cf. the `renderPostable on unknown input` row). |
+| `ThinkingChunk` opt-in stream-input type (Python-only, default-off; supersedes PR #39) | A **separate, opt-in** dataclass, `ThinkingChunk(type="thinking", content=str)`, surfaces AI-SDK `reasoning`/`reasoning-delta` (and pydantic-ai `part_kind == "thinking"`) parts. **`StreamChunk` is NOT widened**: the canonical union stays `StreamChunk = MarkdownTextChunk \| TaskUpdateChunk \| PlanUpdateChunk`, byte-identical to upstream's three variants, so a consumer doing an exhaustive `match` over `StreamChunk` sees zero change on upgrade. `ThinkingChunk` is accepted only at the **stream-input/output boundaries** via the public alias `StreamInput = str \| StreamChunk \| ThinkingChunk` (the `Adapter.stream()` protocol signature, `from_full_stream`/`_from_full_stream` returns, `Thread._wrapped_stream`, and each receiving adapter's `stream()` signature). A producer can yield `ThinkingChunk` (opt-in) and the adapters that receive the stream type-check; code that only references `StreamChunk` never touches it. **OPT-IN, default-off**: emitted only when a caller passes `from_full_stream(stream, emit_thinking=True)` or sets the thread-level `emit_thinking=True` config; the internal `_from_full_stream` threads the same flag. With the default (`emit_thinking=False`) the normalized stream is **byte-for-byte identical** to upstream: reasoning parts are dropped and **no** `ThinkingChunk` is produced. Consumption is graceful: `Thread._handle_stream` never accumulates a `ThinkingChunk` into the posted-message text, and every adapter's stream handler skips it (Slack/Teams expose an optional `render_thinking` hook via `chat_sdk.shared.adapter_utils.maybe_render_thinking`; the text-accumulate adapters ignore it structurally). **Streaming-only, never persisted**: `Message` has no `thinking` field, `to_json()` is unchanged, and a round-tripped `Message` is byte-identical, so cross-SDK state (Redis/Postgres shared with the TS SDK) stays compatible. | Upstream `from-full-stream.ts` forwards only `text-delta` + `finish-step`; AI-SDK `reasoning`/`reasoning-delta` parts fall through and are discarded. `StreamChunk = MarkdownTextChunk \| TaskUpdateChunk \| PlanUpdateChunk` has no reasoning variant and no stream-input alias. Upstream leaves reasoning display to the AI-SDK web UI. | chinchill actively streams agent thinking to Slack/Teams but has to intercept the model stream out-of-band today because the chat-platform SDK has no path for it. This gives the SDK a first-class, opt-in one without changing any default behavior and, crucially, **without widening the public `StreamChunk` union**, so consumers referencing it are unaffected. The whole design constraint is that default-off == upstream and `StreamChunk` == upstream: separate opt-in input type, opt-in emit, graceful/skip consume, zero state pollution. Regression coverage: `tests/test_thinking_chunk.py`, `tests/test_from_full_stream.py::TestThinkingOptIn`, `tests/test_types.py::TestThinkingChunk`, plus per-adapter no-crash tests in `tests/test_slack_api.py`, `tests/test_teams_native_streaming.py`, `tests/test_twilio_adapter.py`, `tests/test_messenger_api.py`. |
### Platform-specific gaps

| Area | Python | TS | Rationale |
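The opt-in emit rule in the new row can be illustrated with a small stand-in normalizer. This is a sketch under stated assumptions, not the SDK's `from_full_stream`: the function name `normalize_parts`, the dict-shaped parts, and the `text` key are all hypothetical, and `finish-step` handling is left out entirely. It only shows that reasoning parts are dropped unless `emit_thinking` is set.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterable, AsyncIterator, Union


@dataclass
class ThinkingChunk:
    content: str
    type: str = "thinking"


async def normalize_parts(
    parts: AsyncIterable[dict[str, Any]],
    emit_thinking: bool = False,
) -> AsyncIterator[Union[str, ThinkingChunk]]:
    """Hypothetical normalizer: forward text, gate reasoning behind a flag."""
    async for part in parts:
        kind = part.get("type")
        if kind == "text-delta":
            yield part.get("text", "")
        elif kind in ("reasoning", "reasoning-delta") and emit_thinking:
            yield ThinkingChunk(content=part.get("text", ""))
        # Any other part (including reasoning when the flag is off) is dropped.


async def _demo_parts() -> AsyncIterator[dict[str, Any]]:
    # Made-up input for illustration only.
    for part in (
        {"type": "reasoning-delta", "text": "hmm"},
        {"type": "text-delta", "text": "Hello"},
        {"type": "text-delta", "text": " world"},
    ):
        yield part


def collect(emit_thinking: bool) -> list:
    async def _run() -> list:
        stream = normalize_parts(_demo_parts(), emit_thinking=emit_thinking)
        return [item async for item in stream]

    return asyncio.run(_run())
```

Calling `collect(False)` yields only the two text strings, while `collect(True)` additionally yields a leading `ThinkingChunk`; the default path never sees the new type.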

src/chat_sdk/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -210,8 +210,10 @@
     SlashCommandEvent,
     StateAdapter,
     StreamChunk,
+    StreamInput,
     StreamOptions,
     TaskUpdateChunk,
+    ThinkingChunk,
     Thread,
     ThreadInfo,
     ThreadSummary,
@@ -448,8 +450,10 @@
     "SlashCommandEvent",
     "StateAdapter",
     "StreamChunk",
+    "StreamInput",
     "StreamOptions",
     "TaskUpdateChunk",
+    "ThinkingChunk",
     "Thread",
     "ThreadInfo",
     "ThreadSummary",

src/chat_sdk/adapters/github/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@
5555
MessageMetadata,
5656
PostableMarkdown,
5757
RawMessage,
58-
StreamChunk,
58+
StreamInput,
5959
StreamOptions,
6060
Thread,
6161
ThreadInfo,
@@ -645,7 +645,7 @@ async def edit_message(self, thread_id: str, message_id: str, message: AdapterPo
645645
async def stream(
646646
self,
647647
thread_id: str,
648-
text_stream: AsyncIterable[str | StreamChunk],
648+
text_stream: AsyncIterable[StreamInput],
649649
options: StreamOptions | None = None,
650650
) -> RawMessage:
651651
"""Stream by accumulating text and posting once."""
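For the text-accumulate adapters, the commit message says a `ThinkingChunk` is ignored "structurally". One way to read that is sketched below; it is an assumption about the loop shape, not the adapter's actual body, and `accumulate_text` is a hypothetical helper name. Because the loop only collects `str` and `markdown_text` chunks, a thinking chunk never matches a branch and so never reaches the posted text.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterable, AsyncIterator, Iterable


@dataclass
class MarkdownTextChunk:
    text: str
    type: str = "markdown_text"


@dataclass
class ThinkingChunk:
    content: str
    type: str = "thinking"


async def accumulate_text(stream: AsyncIterable[Any]) -> str:
    """Hypothetical buffer-then-post loop: keep only text-bearing chunks."""
    parts: list[str] = []
    async for chunk in stream:
        if isinstance(chunk, str):
            parts.append(chunk)
        elif isinstance(chunk, dict):
            if chunk.get("type") == "markdown_text":
                parts.append(chunk.get("text", ""))
        elif getattr(chunk, "type", None) == "markdown_text":
            parts.append(chunk.text)
        # Anything else (e.g. a ThinkingChunk) matches no branch and is skipped.
    return "".join(parts)


async def _as_stream(items: Iterable[Any]) -> AsyncIterator[Any]:
    for item in items:
        yield item


def run(items: Iterable[Any]) -> str:
    return asyncio.run(accumulate_text(_as_stream(items)))
```

Under this reading, a stream with and without a `ThinkingChunk` in the middle produces the same accumulated string, which matches the "posted message unchanged" claim.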

src/chat_sdk/adapters/google_chat/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -77,7 +77,7 @@
7777
RawMessage,
7878
ReactionEvent,
7979
StateAdapter,
80-
StreamChunk,
80+
StreamInput,
8181
StreamOptions,
8282
ThreadInfo,
8383
ThreadSummary,
@@ -1729,7 +1729,7 @@ async def delete_message(self, thread_id: str, message_id: str) -> None:
17291729
async def stream(
17301730
self,
17311731
thread_id: str,
1732-
text_stream: AsyncIterable[str | StreamChunk],
1732+
text_stream: AsyncIterable[StreamInput],
17331733
options: StreamOptions | None = None,
17341734
) -> RawMessage:
17351735
"""Stream by accumulating all chunks and posting as a single message."""

src/chat_sdk/adapters/messenger/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -70,7 +70,7 @@
7070
PostableMarkdown,
7171
RawMessage,
7272
ReactionEvent,
73-
StreamChunk,
73+
StreamInput,
7474
StreamOptions,
7575
ThreadInfo,
7676
UserInfo,
@@ -585,7 +585,7 @@ async def edit_message(
585585
async def stream(
586586
self,
587587
thread_id: str,
588-
text_stream: AsyncIterable[str | StreamChunk],
588+
text_stream: AsyncIterable[StreamInput],
589589
options: StreamOptions | None = None,
590590
) -> RawMessage:
591591
"""Buffer all stream chunks and send as a single message.

src/chat_sdk/adapters/slack/adapter.py

Lines changed: 23 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,12 @@
6464
from chat_sdk.emoji import emoji_to_slack, resolve_emoji_from_slack
6565
from chat_sdk.logger import ConsoleLogger, Logger
6666
from chat_sdk.modals import ModalElement, OptionsLoadGroup, SelectOptionElement
67-
from chat_sdk.shared.adapter_utils import extract_card, extract_files
67+
from chat_sdk.shared.adapter_utils import (
68+
extract_card,
69+
extract_files,
70+
is_thinking_chunk,
71+
maybe_render_thinking,
72+
)
6873
from chat_sdk.shared.errors import AdapterRateLimitError, AuthenticationError, ValidationError
6974
from chat_sdk.types import (
7075
ActionEvent,
@@ -100,7 +105,9 @@
100105
ScheduledMessage,
101106
SlashCommandEvent,
102107
StreamChunk,
108+
StreamInput,
103109
StreamOptions,
110+
ThinkingChunk,
104111
ThreadInfo,
105112
ThreadSummary,
106113
UserInfo,
@@ -3873,7 +3880,7 @@ async def start_typing(self, thread_id: str, status: str | None = None) -> None:
38733880
async def stream(
38743881
self,
38753882
thread_id: str,
3876-
text_stream: AsyncIterable[str | StreamChunk],
3883+
text_stream: AsyncIterable[StreamInput],
38773884
options: StreamOptions | None = None,
38783885
) -> RawMessage:
38793886
"""Stream a message using Slack's native streaming API.
@@ -3945,7 +3952,11 @@ async def flush_markdown_delta(delta: str) -> None:
39453952
return
39463953
await streamer.append(markdown_text=delta, token=token)
39473954

3948-
async def send_structured_chunk(chunk: StreamChunk | dict[str, Any]) -> None:
3955+
# Accepts the residual stream-input union: ``is_thinking_chunk`` filters
3956+
# ``ThinkingChunk`` out before this runs (it is never reached with one),
3957+
# but it stays in the static type since that runtime guard does not
3958+
# narrow it. The generic ``_read``-based body handles any chunk shape.
3959+
async def send_structured_chunk(chunk: StreamChunk | ThinkingChunk | dict[str, Any]) -> None:
39493960
nonlocal last_appended, structured_chunks_supported
39503961
if not structured_chunks_supported:
39513962
return
@@ -4006,6 +4017,15 @@ async def push_text_and_flush(text: str) -> None:
40064017
await push_text_and_flush(text_value)
40074018
elif hasattr(chunk, "type") and chunk.type == "markdown_text": # type: ignore[union-attr]
40084019
await push_text_and_flush(chunk.text) # type: ignore[union-attr]
4020+
elif is_thinking_chunk(chunk):
4021+
# Python-only divergence: ``ThinkingChunk`` is streaming-only
4022+
# agent reasoning, NOT message content. By default it is
4023+
# skipped (no effect on the posted message). An adapter/consumer
4024+
# that wants to display thinking sets ``self.render_thinking``;
4025+
# only then is it invoked. Skipping (not routing to
4026+
# ``send_structured_chunk``) keeps the default posted message
4027+
# byte-identical and avoids disabling structured-chunk support.
4028+
await maybe_render_thinking(getattr(self, "render_thinking", None), chunk)
40094029
else:
40104030
await send_structured_chunk(chunk)
40114031
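The diff imports `is_thinking_chunk` and `maybe_render_thinking` from `chat_sdk.shared.adapter_utils` but does not show their bodies. The sketch below is a guess at a minimal shape consistent with the call site, with `_sketch` suffixes to mark the helpers as hypothetical. It assumes the hook may be either sync or async and that dict-form chunks carry a `type` key; neither is confirmed by the diff.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ThinkingChunk:
    content: str
    type: str = "thinking"


def is_thinking_chunk_sketch(chunk: Any) -> bool:
    """Runtime check on the ``type`` tag; dict form is an assumption."""
    if isinstance(chunk, dict):
        return chunk.get("type") == "thinking"
    return getattr(chunk, "type", None) == "thinking"


async def maybe_render_thinking_sketch(
    hook: Optional[Callable[[Any], Any]],
    chunk: Any,
) -> None:
    """Invoke the opt-in hook if one is set; otherwise do nothing."""
    if hook is None:
        return  # default path: the chunk is skipped silently
    result = hook(chunk)
    if inspect.isawaitable(result):
        await result  # support async hooks as well as sync ones


def demo() -> list[str]:
    rendered: list[str] = []

    def sync_hook(chunk: ThinkingChunk) -> None:
        rendered.append("sync:" + chunk.content)

    async def async_hook(chunk: ThinkingChunk) -> None:
        rendered.append("async:" + chunk.content)

    async def _run() -> None:
        chunk = ThinkingChunk(content="step 1")
        await maybe_render_thinking_sketch(None, chunk)  # no hook, no effect
        await maybe_render_thinking_sketch(sync_hook, chunk)
        await maybe_render_thinking_sketch(async_hook, chunk)

    asyncio.run(_run())
    return rendered
```

Reading the hook with `getattr(self, "render_thinking", None)`, as the adapter does, means an adapter instance that never sets the attribute takes the `None` branch and the stream behaves as before.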

src/chat_sdk/adapters/teams/adapter.py

Lines changed: 19 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,12 @@
3434
from chat_sdk.emoji import convert_emoji_placeholders
3535
from chat_sdk.errors import ChatNotImplementedError
3636
from chat_sdk.logger import ConsoleLogger, Logger
37-
from chat_sdk.shared.adapter_utils import extract_card, extract_files
37+
from chat_sdk.shared.adapter_utils import (
38+
extract_card,
39+
extract_files,
40+
is_thinking_chunk,
41+
maybe_render_thinking,
42+
)
3843
from chat_sdk.shared.buffer_utils import buffer_to_data_uri, to_buffer
3944
from chat_sdk.shared.errors import (
4045
AdapterPermissionError,
@@ -1724,6 +1729,12 @@ async def stream(
17241729
text = chunk
17251730
elif isinstance(chunk, dict) and chunk.get("type") == "markdown_text":
17261731
text = chunk.get("text", "")
1732+
elif is_thinking_chunk(chunk):
1733+
# Python-only divergence: streaming-only reasoning, not message
1734+
# content. Default-skip keeps the buffered post byte-identical;
1735+
# an opt-in ``render_thinking`` hook may display it.
1736+
await maybe_render_thinking(getattr(self, "render_thinking", None), chunk)
1737+
continue
17271738
if not text:
17281739
continue
17291740
accumulated += text
@@ -1796,6 +1807,13 @@ async def _on_chunk(activity: Any) -> None:
17961807
text = chunk
17971808
elif isinstance(chunk, dict) and chunk.get("type") == "markdown_text":
17981809
text = chunk.get("text", "")
1810+
elif is_thinking_chunk(chunk):
1811+
# Python-only divergence: streaming-only reasoning, not
1812+
# message content. Default-skip keeps emitted text
1813+
# byte-identical; an opt-in ``render_thinking`` hook may
1814+
# display it.
1815+
await maybe_render_thinking(getattr(self, "render_thinking", None), chunk)
1816+
continue
17991817
elif getattr(chunk, "type", None) == "markdown_text":
18001818
# Dataclass ``MarkdownTextChunk`` form — mirror
18011819
# ``Thread.stream``'s ``_wrapped_stream`` extraction so the

src/chat_sdk/adapters/telegram/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -89,7 +89,7 @@
8989
RawMessage,
9090
ReactionEvent,
9191
SlashCommandEvent,
92-
StreamChunk,
92+
StreamInput,
9393
StreamOptions,
9494
ThreadInfo,
9595
UserInfo,
@@ -1729,7 +1729,7 @@ async def start_typing(self, thread_id: str, status: str | None = None) -> None:
17291729
async def stream(
17301730
self,
17311731
thread_id: str,
1732-
text_stream: AsyncIterable[str | StreamChunk],
1732+
text_stream: AsyncIterable[StreamInput],
17331733
options: StreamOptions | None = None,
17341734
) -> RawMessage | None:
17351735
"""Stream a message to a Telegram private chat via draft updates.

src/chat_sdk/adapters/twilio/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -85,7 +85,7 @@
8585
MessageMetadata,
8686
PostableMarkdown,
8787
RawMessage,
88-
StreamChunk,
88+
StreamInput,
8989
StreamOptions,
9090
ThreadInfo,
9191
UserInfo,
@@ -327,7 +327,7 @@ async def start_typing(self, thread_id: str, status: str | None = None) -> None:
327327
async def stream(
328328
self,
329329
thread_id: str,
330-
text_stream: AsyncIterable[str | StreamChunk],
330+
text_stream: AsyncIterable[StreamInput],
331331
options: StreamOptions | None = None,
332332
) -> RawMessage:
333333
"""Buffer all stream chunks and send as a single message.

src/chat_sdk/adapters/whatsapp/adapter.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@
5757
PostableMarkdown,
5858
RawMessage,
5959
ReactionEvent,
60-
StreamChunk,
60+
StreamInput,
6161
StreamOptions,
6262
ThreadInfo,
6363
UserInfo,
@@ -887,7 +887,7 @@ async def edit_message(
887887
async def stream(
888888
self,
889889
thread_id: str,
890-
text_stream: AsyncIterable[str | StreamChunk],
890+
text_stream: AsyncIterable[StreamInput],
891891
options: StreamOptions | None = None,
892892
) -> RawMessage:
893893
"""Stream a message by buffering all chunks and sending as a single message."""
