plugins/docent/skills/docent/readings-reference.md
Lines changed: 70 additions & 2 deletions (+70 -2)
Original file line number
Diff line number
Diff line change
@@ -43,11 +43,12 @@ Access attributes to get `ColumnRef` objects (e.g., `rows.transcript`).
43
43
44
44
When you use a ColumnRef in a prompt template, you should make its type explicit with `.as_type()`. The type can be:
45
45
* transcript
46
+
* transcript_slice
46
47
* agent_run
47
48
* reading_result
48
49
* text
49
50
50
-
For `text`, the literal text from that column will be embedded in the prompt. For other types, the column will be interpreted as the UUID of an object in the database, and that object will be formatted as a string and embedded in the prompt.
51
+
For `text`, the literal text from that column will be embedded in the prompt. For most other types, the column will be interpreted as the UUID of an object in the database, and that object will be formatted as a string and embedded in the prompt. The exception is `transcript_slice`, whose column value is a JSON object produced by the DQL `transcript_slice(transcript_id, start_idx, end_idx)` function (see the **Transcript slices** section below).
51
52
52
53
When you specify a type, you are also specifying whether the prompt slot is scalar or list-valued:
53
54
* `.as_type("transcript")` means scalar and defaults to `is_list=False`
@@ -89,7 +90,7 @@ reading = client.read(
89
90
)
90
91
```
91
92
92
-
Other ref types for scripted readings: `AgentRunRef(id, collection_id)`, `ReadingResultRef(id, collection_id)`.
93
+
Other ref types for scripted readings: `AgentRunRef(id, collection_id)`, `TranscriptSliceRef(transcript_id, start_idx, end_idx, agent_run_id, collection_id)`, `ReadingResultRef(id, collection_id)`.
93
94
94
95
Parameters:
95
96
- `prompt_template` or `prompts_list` (mutually exclusive)
@@ -104,6 +105,73 @@ Parameters:
104
105
- `"results"`: always create a new reading record, but reuse individual results to avoid redundant LLM calls
105
106
- `"none"`: no caching; force full re-evaluation
106
107
108
+
### Transcript slices
109
+
110
+
A `transcript_slice` parameter renders a contiguous message range from a specific transcript instead of the whole transcript. The range is inclusive on both ends (`start_idx`, `end_idx`), and rendered block labels preserve the original transcript message indices so the LLM can still cite by absolute position. Negative indices are valid and are interpreted as in Python; for example, to get the last 5 transcript blocks, set `start_idx=-5` and `end_idx=-1`.
111
+
112
+
Transcript slices are a specialized feature, and should only be used if the user's request strongly implies that they're the right tool (e.g. "look at the last 5 messages of each transcript").
113
+
114
+
**Template reading.** Produce slice references directly in DQL with the `transcript_slice(transcript_id, start_idx, end_idx)` function, then annotate the column with `.as_type("transcript_slice")`:
115
+
```python
slices = client.query(
    collection_id,
    """
    WITH windows AS (
        SELECT
            t.id AS transcript_id,
            GREATEST(0, CAST(t.metadata_json->>'first_error_idx' AS INTEGER) - 3) AS start_idx,
            CAST(t.metadata_json->>'first_error_idx' AS INTEGER) + 3 AS end_idx
        FROM transcripts t
        WHERE t.metadata_json ? 'first_error_idx'
    )
    SELECT transcript_slice(transcript_id, start_idx, end_idx) AS window
    FROM windows
    """,
    name="Error context windows",
)

reading = client.read(
    prompt_template=[
        "Explain what went wrong in this excerpt: ",
        slices.window.as_type("transcript_slice"),
    ],
    model="openai/gpt-5.4-mini",
    name="Explain error contexts",
)
```
143
+
144
+
Notes on the DQL function:
145
+
* `transcript_slice()` must be called with exactly three arguments and emits a JSON object value. It is allowed anywhere a scalar expression is valid (including inside `CASE`, `DISTINCT`, `ORDER BY`, or `ARRAY_AGG(...)` for list-valued slots).
146
+
* Access control and collection scoping come from the underlying transcript; indices outside the transcript simply render fewer messages rather than erroring.
147
+
* `start_idx` and `end_idx` may be equal to render a single message.
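
To make the index semantics concrete, here is a minimal sketch of how an inclusive slice with Python-style negative indices could resolve to absolute message positions. `resolve_slice` is a hypothetical helper written for illustration; it is not part of Docent, and the exact clamping of out-of-range bounds is an assumption based on the note that such indices "render fewer messages rather than erroring".

```python
def resolve_slice(n_messages: int, start_idx: int, end_idx: int) -> list[int]:
    """Return the absolute message indices covered by an inclusive slice.

    Hypothetical illustration only; the real renderer may differ in detail.
    """
    # Negative indices count from the end of the transcript, as in Python.
    start = start_idx + n_messages if start_idx < 0 else start_idx
    end = end_idx + n_messages if end_idx < 0 else end_idx
    # Assumption: bounds outside the transcript are clamped, so the slice
    # simply covers fewer messages instead of raising an error.
    start = max(start, 0)
    end = min(end, n_messages - 1)
    # Both ends are inclusive, hence end + 1.
    return list(range(start, end + 1))


# Last five messages of a 20-message transcript.
last_five = resolve_slice(20, -5, -1)
# A range that runs past the end is truncated.
truncated = resolve_slice(20, 10, 25)
# Equal bounds select a single message.
single = resolve_slice(20, 7, 7)
```

Because the resolved positions are absolute, block labels in the rendered excerpt can keep the original message indices, which is what lets the LLM cite by position.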
148
+
149
+
**Scripted reading.** Construct a `TranscriptSliceRef` per prompt. Use this when the slice indices come from Python logic rather than SQL (e.g., derived from earlier reading results):
150
+
```python
from docent import TranscriptSliceRef

reading = client.read(
    prompts_list=[
        [
            "Summarize this excerpt: ",
            TranscriptSliceRef(
                transcript_id="<transcript-uuid>",
                start_idx=10,
                end_idx=25,
                agent_run_id="<run-uuid>",
                collection_id=collection_id,
            ),
        ],
    ],
    model="openai/gpt-5.4-mini",
    name="Slice summaries",
)
```
171
+
172
+
**Context config for slices.** `TranscriptSliceContextConfig` exposes the same filters as `TranscriptContextConfig` (`transcript_metadata`, `message_metadata`); defaults are listed under **Context configs** below. Attach it the same way as other context configs: via `context_configs={param_name: TranscriptSliceContextConfig(...)}` for template readings, or `TranscriptSliceRef(..., context_config=TranscriptSliceContextConfig(...))` for scripted readings. As with other context configs, changing it changes the reading's content hash and therefore its cache identity.
173
+
174
+
107
175
### Context configs
108
176
Use context configs to control which metadata and transcript subtrees are rendered when a reading prompt includes an `agent_run`, `transcript`, or `transcript_slice` parameter. Context configs do not change which rows DQL selects; they only change how selected context items are formatted for the LLM. They are part of the reading config/cache identity, so changing them creates a different reading.
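
The cache-identity point can be illustrated with a small sketch. This is not Docent's actual hashing scheme; `config_hash`, the dictionary layout, and the filter value below are all hypothetical, chosen only to show why any change to a context config yields a different reading identity.

```python
import hashlib
import json


def config_hash(reading_config: dict) -> str:
    """Hash a reading config (hypothetical scheme, for illustration only)."""
    # Canonical JSON with sorted keys, so equal configs hash identically
    # regardless of key order.
    canonical = json.dumps(reading_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical config layout; the real reading config is not shown in the source.
base = {
    "prompt_template": ["Explain what went wrong in this excerpt: ", {"param": "window"}],
    "model": "openai/gpt-5.4-mini",
    "context_configs": {},
}
# Same reading, but with a (hypothetical) context config attached to one parameter.
with_filter = {**base, "context_configs": {"window": {"message_metadata": False}}}

same_identity = config_hash(base) == config_hash(dict(reversed(list(base.items()))))
different_identity = config_hash(base) != config_hash(with_filter)
```

Under this assumed scheme, the prompt and model are unchanged, yet the second config hashes differently, so cached results from the first reading would not be reused.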