Commit 5eceb22

yokofly and claude committed
docs: clarify Python string values are bytes
Verified the new examples against timeplus/timeplusd:3.2.2 and :3.2.7. Sink-side string columns arrive as Python `bytes`, not `str`, since v3.2.2 (proton-enterprise commit 43c46c5, #11788).

The original `py_metric_sink` and `post_event` examples needed `.decode()` to run; add the calls and a one-paragraph note linking to the Python UDF data type mapping.

Also bump the product-version note to 3.2.2+ so it matches the floor where every documented setting and global is present.

Holding merge pending the contract question in proton-enterprise#12067.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
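As a rough illustration of the behaviour the commit message describes, the sketch below can be run outside Timeplus. It calls a sink-style function the way the updated docs say Timeplus does: one list per column, with `string` values delivered as `bytes`. The `format_rows` helper and the sample values are hypothetical and exist only for this illustration.

```python
def format_rows(host, value):
    """Hypothetical sink-style function: one list per column, equal length."""
    lines = []
    for h, v in zip(host, value):
        # `string` columns arrive as bytes; .decode() defaults to UTF-8.
        lines.append(f"{h.decode()}={v}")
    return lines


# Simulate the single call made for a two-row chunk.
rows = format_rows([b"a", b"b"], [1.0, 2.0])
print(rows)

# Without .decode(), the f-string embeds the bytes repr instead of the text.
undecoded = f"{b'a'}={1.0}"
print(undecoded)
```

The second `print` shows why the undecoded example was misleading: the code still runs, but the output carries the `b'...'` wrapper rather than the plain host name.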
1 parent d475bb0 commit 5eceb22

3 files changed

Lines changed: 9 additions & 6 deletions

docs/shared/python-external-stream-write.md

Lines changed: 7 additions & 4 deletions
@@ -2,14 +2,16 @@

 The write function is invoked once per chunk, not once per row. Its arguments are **column-oriented**: one Python list per output column, in declared order, all of equal length. Iterate with `zip` to recover row tuples.

+Column values follow the same Python type mapping as [Python UDF](/py-udf#data-type-mapping). One detail worth highlighting for sinks: a `string` (or `fixed_string`) column arrives as Python `bytes`, not `str`. Decode with `.decode()` (UTF-8) before passing values into APIs that require text.
+
 ### Sink basics

 ```sql
 CREATE EXTERNAL STREAM py_metric_sink (host string, value float32)
 AS $$
 def py_metric_sink(host, value):
     for h, v in zip(host, value):
-        print(f"{h}={v}")
+        print(f"{h.decode()}={v}")
 $$
 SETTINGS type = 'python';
 ```
@@ -20,7 +22,7 @@ Insert a few rows:
 INSERT INTO py_metric_sink (host, value) VALUES ('a', 1.0), ('b', 2.0);
 ```

-Behind the scenes Timeplus calls `py_metric_sink(['a', 'b'], [1.0, 2.0])` — one call carrying both rows. A larger INSERT or a downstream query that delivers many chunks results in one call per chunk.
+Behind the scenes Timeplus calls `py_metric_sink([b'a', b'b'], [1.0, 2.0])` — one call carrying both rows, with the `string` column delivered as `bytes`. A larger INSERT or a downstream query that delivers many chunks results in one call per chunk.

 If `write_function_name` is omitted Timeplus uses `read_function_name` (which itself defaults to the stream name), so the Python function above only needs to be named once.

@@ -33,7 +35,7 @@ CREATE EXTERNAL STREAM py_alert_sink (host string, value float32)
 AS $$
 def py_alert_sink(host, value):
     for h, v in zip(host, value):
-        notify(h, v)  # your notifier
+        notify(h.decode(), v)  # your notifier
 $$
 SETTINGS type = 'python';

@@ -61,9 +63,10 @@ def close_client():

 def post_event(event_id, body):
     for eid, b in zip(event_id, body):
+        payload = {"id": eid.decode(), "body": b.decode()}
         req = urllib.request.Request(
             builtins._tp_webhook,
-            data=json.dumps({"id": eid, "body": b}).encode(),
+            data=json.dumps(payload).encode(),
             headers={"Content-Type": "application/json"},
             method="POST",
         )
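
The `post_event` change above matters because `json.dumps` cannot serialize `bytes` values. The following sketch reproduces that failure and the decoded fix in isolation; `build_payload` is a hypothetical helper that mirrors only the payload construction, and the sample IDs and bodies are made up.

```python
import json


def build_payload(eid, body):
    """Hypothetical helper mirroring the payload construction in post_event."""
    # Decode the bytes column values to str before JSON-encoding them.
    payload = {"id": eid.decode(), "body": body.decode()}
    return json.dumps(payload).encode()


# Passing the raw bytes values straight to json.dumps raises TypeError.
try:
    json.dumps({"id": b"e1", "body": b"hello"})
    raised = False
except TypeError:
    raised = True

# With the values decoded first, the request body encodes cleanly.
encoded = build_payload(b"e1", b"hello")
print(raised, encoded)
```

This sketch assumes the column data is valid UTF-8; data in another encoding would need an explicit encoding argument to `.decode()`.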

docs/shared/python-external-stream.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ## Overview

-Python External Stream lets you read from and write to arbitrary sources by embedding a Python body directly in the DDL. It is available in **Timeplus Enterprise 3.2+**.
+Python External Stream lets you read from and write to arbitrary sources by embedding a Python body directly in the DDL. It is available in **Timeplus Enterprise 3.2.2+**.

 Unlike the Kafka, Pulsar, and NATS JetStream external streams — which speak a specific wire protocol — a Python External Stream is a generic escape hatch: you bring the protocol, the client library, and the logic. Timeplus calls your functions inside the embedded CPython runtime. When reading, return values become row batches; when writing, the sink function receives column batches. The same DDL object can serve as both a source (via `read_function_name`) and a sink (via `write_function_name`).

docs/sql-create-external-stream.md

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ SETTINGS
     mode = 'auto' -- 'auto' (default), 'streaming', or 'batch'
 ```

-Available in **Timeplus Enterprise 3.2+**.
+Available in **Timeplus Enterprise 3.2.2+**.

 Please check the [Python External Stream Source](/python-external-stream-source) for read-side settings, generator/batch sources, and lifecycle hooks, and the [Python External Stream Sink](/python-external-stream-sink) for write-side semantics, materialized-view sinks, and custom-protocol examples.
