Commit e264756

fix(webapp): truncate span events' attributes so AI SDK telemetry doesn't blow ClickHouse JSON parse
Vercel AI SDK telemetry emits one span event per conversation turn (`gen_ai.system.message`, `gen_ai.user.message`, etc.) carrying message content as event attributes. `spanEventsToEventEvents` ran them through `convertKeyValueItemsToMap` but never through `truncateAttributes`, so that content reached ClickHouse uncapped and pushed the row past the JSON parse tolerance, dropping the whole batch.

- Apply the same per-attribute truncation + AI content overrides + total cap to each span event's properties that we already apply to the main attributes map.
- Add new ingestion limit envs: SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT (8KB default per string), SERVER_OTEL_AI_CONTENT_ATTRIBUTE_VALUE_LENGTH_LIMIT (1KB for ai.*/gen_ai.* content keys), SERVER_OTEL_SPAN_TOTAL_ATTRIBUTES_LENGTH_LIMIT (32KB backstop that drops AI content keys in priority order). Cost/token metadata is preserved so LLM enrichment continues to work.
- Extract the truncation helpers into otlpAttributeLimits with unit tests.

Keeps one batch one ClickHouse insert: no batch splitting, no parts amplification on the failure path.
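
The fix described above can be sketched roughly as follows. This is a minimal illustration, not the webapp's implementation: `SpanEventSketch`, `limitFor`, `truncateProperties`, `truncateSpanEvents`, and the `content` attribute key are hypothetical names chosen for the example, and the truncation here is a plain slice without the surrogate handling the real helper performs.

```typescript
// Hypothetical shapes for illustration; the real span event type may differ.
type AttributeValue = string | number | boolean | undefined;
type AttributeMap = Record<string, AttributeValue>;
type KeyOverride = { prefix: string; limit: number };
type SpanEventSketch = { name: string; properties: AttributeMap };

// A key matches an override when it equals the prefix or is a dotted child of it.
function limitFor(key: string, defaultLimit: number, overrides: KeyOverride[]): number {
  const match = overrides.find((o) => key === o.prefix || key.startsWith(o.prefix + "."));
  return match ? match.limit : defaultLimit;
}

// Simplified per-attribute truncation: only string values are cut.
function truncateProperties(
  props: AttributeMap,
  defaultLimit: number,
  overrides: KeyOverride[]
): AttributeMap {
  const out: AttributeMap = {};
  for (const [key, value] of Object.entries(props)) {
    out[key] =
      typeof value === "string" ? value.slice(0, limitFor(key, defaultLimit, overrides)) : value;
  }
  return out;
}

// Apply the same caps to every span event's properties, not just the span's own attributes.
function truncateSpanEvents(
  events: SpanEventSketch[],
  defaultLimit: number,
  overrides: KeyOverride[]
): SpanEventSketch[] {
  return events.map((event) => ({
    ...event,
    properties: truncateProperties(event.properties, defaultLimit, overrides),
  }));
}

const events: SpanEventSketch[] = [
  {
    name: "gen_ai.user.message",
    // "content" is an assumed key name; "gen_ai.prompt" is one of the commit's content keys.
    properties: { content: "x".repeat(20000), "gen_ai.prompt": "p".repeat(5000), turn: 3 },
  },
];

const cappedEvents = truncateSpanEvents(events, 8192, [{ prefix: "gen_ai.prompt", limit: 1024 }]);
```

With these inputs the unlisted `content` key falls back to the 8192 default, the `gen_ai.prompt` key gets the tighter 1024 cap, and the numeric `turn` value passes through untouched.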
1 parent ac02c0f commit e264756

6 files changed

Lines changed: 491 additions & 93 deletions

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+area: webapp
+type: fix
+---
+
+Tighten OTel span attribute truncation for Vercel AI SDK content keys
+(`ai.prompt*`, `ai.response.text/object/toolCalls/reasoning*`,
+`gen_ai.prompt`, `gen_ai.completion`, `gen_ai.request.messages`,
+`gen_ai.response.text`) to a 1KB per-attribute cap, plus a 32KB per-span
+backstop that drops these content keys in priority order if the assembled
+attributes JSON still exceeds it. Cost/token metadata (`ai.usage.*`,
+`ai.model.*`, `gen_ai.usage.*`, `gen_ai.response.model`, etc.) keeps the
+default 8KB cap so LLM enrichment continues to work.
+
+Extends the same truncation to span events' attributes. AI SDK telemetry
+emits one span event per conversation turn (`gen_ai.system.message`,
+`gen_ai.user.message`, etc.) carrying message content as event attributes,
+which previously flowed into ClickHouse uncapped because
+`spanEventsToEventEvents` did not run them through `truncateAttributes`.
+That field was the actual source of the oversized rows breaking the
+ClickHouse JSON parser and dropping whole batches.
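
The priority-ordered backstop mentioned above might look something like the sketch below. It assumes a shortened, hypothetical drop list (`DROP_PRIORITY`) and a hypothetical function name (`dropUntilUnderBudget`), and uses a deliberately small budget so the effect is visible. Like the diff, it measures `JSON.stringify(...).length`, which counts UTF-16 code units rather than encoded bytes.

```typescript
type AttributeMap = Record<string, string | number | boolean | undefined>;

// Shortened, hypothetical drop list; earlier entries are dropped first.
const DROP_PRIORITY: string[] = ["ai.prompt.messages", "ai.prompt", "ai.response.text"];

function dropUntilUnderBudget(attrs: AttributeMap, maxLength: number): AttributeMap {
  // Already under budget: return the input untouched.
  if (JSON.stringify(attrs).length <= maxLength) return attrs;

  const result: AttributeMap = { ...attrs };
  for (const prefix of DROP_PRIORITY) {
    // Remove the key itself and any dotted children of the prefix.
    for (const key of Object.keys(result)) {
      if (key === prefix || key.startsWith(prefix + ".")) {
        delete result[key];
      }
    }
    // Stop as soon as the assembled JSON fits.
    if (JSON.stringify(result).length <= maxLength) break;
  }
  return result;
}

const spanAttributes: AttributeMap = {
  "ai.prompt.messages": "m".repeat(1024),
  "ai.response.text": "r".repeat(1024),
  "ai.usage.promptTokens": 1200,
  "gen_ai.response.model": "example-model",
};

// Two 1KB content values exceed a 1500-character budget; dropping the
// highest-priority key is enough, so the response text survives.
const cappedAttributes = dropUntilUnderBudget(spanAttributes, 1500);
```

Dropping whole keys, rather than truncating further, keeps the cost and token metadata intact, which matches the stated goal of leaving LLM enrichment working.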

apps/webapp/app/env.server.ts

Lines changed: 11 additions & 0 deletions
@@ -498,6 +498,17 @@ const EnvironmentSchema = z
     TRIGGER_OTEL_ATTRIBUTE_PER_LINK_COUNT_LIMIT: z.string().default("10"),
     TRIGGER_OTEL_ATTRIBUTE_PER_EVENT_COUNT_LIMIT: z.string().default("10"),
 
+    // Server-side OTel ingestion limits applied in otlpExporter.server.ts.
+    // Default per-attribute cap (8KB) is enough for nearly all keys, but
+    // Vercel AI SDK content keys (ai.prompt*, ai.response.text/object/etc.,
+    // gen_ai.prompt, gen_ai.completion) carry tens of KB and have a tighter
+    // dedicated cap. The total cap is a backstop applied to the assembled
+    // attributes JSON; if exceeded, AI content keys are dropped in priority
+    // order. Both prevent oversized JSON from breaking ClickHouse inserts.
+    SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT: z.coerce.number().int().default(8192),
+    SERVER_OTEL_AI_CONTENT_ATTRIBUTE_VALUE_LENGTH_LIMIT: z.coerce.number().int().default(1024),
+    SERVER_OTEL_SPAN_TOTAL_ATTRIBUTES_LENGTH_LIMIT: z.coerce.number().int().default(32768),
+
     CHECKPOINT_THRESHOLD_IN_MS: z.coerce.number().int().default(30000),
 
     // Internal OTEL environment variables
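
As a rough illustration of how the three new variables could feed a `SpanAttributeLimits` object, here is a dependency-free sketch. `readIntLimit` and `limitsFromEnv` are hypothetical helpers that only approximate integer coercion with a default; they are not the zod schema, and the mapping from each variable to each field is an assumption based on the names.

```typescript
type SpanAttributeLimits = {
  defaultValueLengthLimit: number;
  aiContentValueLengthLimit: number;
  totalAttributesLengthLimit: number;
};

// Hypothetical helper: use the fallback when unset, otherwise require an integer.
function readIntLimit(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed)) {
    throw new Error(`expected an integer, got "${raw}"`);
  }
  return parsed;
}

// Assumed mapping from env variable names to limit fields, with the diff's defaults.
function limitsFromEnv(env: Record<string, string | undefined>): SpanAttributeLimits {
  return {
    defaultValueLengthLimit: readIntLimit(
      env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT,
      8192
    ),
    aiContentValueLengthLimit: readIntLimit(
      env.SERVER_OTEL_AI_CONTENT_ATTRIBUTE_VALUE_LENGTH_LIMIT,
      1024
    ),
    totalAttributesLengthLimit: readIntLimit(
      env.SERVER_OTEL_SPAN_TOTAL_ATTRIBUTES_LENGTH_LIMIT,
      32768
    ),
  };
}

// Only the AI content cap is overridden; the other two fall back to defaults.
const limits = limitsFromEnv({ SERVER_OTEL_AI_CONTENT_ATTRIBUTE_VALUE_LENGTH_LIMIT: "2048" });
```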

apps/webapp/app/v3/dynamicFlushScheduler.server.ts

Lines changed: 2 additions & 2 deletions
@@ -196,11 +196,11 @@ export class DynamicFlushScheduler<T> {
     // Schedule all batches for concurrent processing
     const flushPromises = batchesToFlush.map((batch) =>
       this.limiter(async () => {
-        const itemCount = batch.length;
-
         const self = this;
 
         async function tryFlush(flushId: string, batchToFlush: T[], attempt: number = 1) {
+          const itemCount = batchToFlush.length;
+
           try {
             const startTime = Date.now();
             await self.callback(flushId, batchToFlush);
Lines changed: 207 additions & 0 deletions
@@ -0,0 +1,207 @@
+/**
+ * Pure helpers for OTel attribute truncation and per-span size capping.
+ * Lives in a separate module from `otlpExporter.server.ts` so tests can
+ * import the helpers without dragging in the env-parsing side effect of
+ * the server module's transitive dependencies.
+ */
+
+export type AttributeValue = string | number | boolean | undefined;
+export type AttributeMap = Record<string, AttributeValue>;
+
+/**
+ * Per-key cap overrides for `truncateAttributes`. A key matches an override
+ * when `key === prefix` or `key.startsWith(prefix + ".")` — i.e. the prefix
+ * covers the attribute itself and any dotted children. First matching entry
+ * wins; later entries are ignored.
+ */
+export type AttributeKeyOverride = { prefix: string; limit: number };
+
+export type SpanAttributeLimits = {
+  /** Per-attribute cap applied to every string-valued attribute. */
+  defaultValueLengthLimit: number;
+  /**
+   * Per-attribute cap applied only to known Vercel AI SDK content keys.
+   * These attributes (`ai.prompt*`, `ai.response.text/object/toolCalls/reasoning*`,
+   * `gen_ai.prompt`, `gen_ai.completion`, `gen_ai.request.messages`,
+   * `gen_ai.response.text`) routinely carry tens of KB of user prompt or
+   * model response, which is enough to push the assembled per-row JSON past
+   * ClickHouse's parse tolerance even after the default 8KB cap.
+   */
+  aiContentValueLengthLimit: number;
+  /**
+   * Backstop: if the serialized size of all truncated attributes still
+   * exceeds this many bytes, the AI content keys are dropped in priority
+   * order until the assembled JSON is under budget. Cost/token metadata is
+   * preserved.
+   */
+  totalAttributesLengthLimit: number;
+};
+
+/**
+ * Vercel AI SDK content keys to cap aggressively. Keep cost/token metadata
+ * out of this list — `ai.usage.*`, `ai.model.*`, `ai.operationId`,
+ * `ai.settings.*`, `ai.telemetry.*`, `gen_ai.usage.*`,
+ * `gen_ai.response.model`, `gen_ai.request.model`, `gen_ai.system`, and
+ * `gen_ai.operation.name` are needed by `enrichCreatableEvents` for cost
+ * and LLM enrichment.
+ */
+export const AI_CONTENT_KEY_OVERRIDES = (limit: number): AttributeKeyOverride[] => [
+  // `ai.prompt` covers `ai.prompt`, `ai.prompt.messages`, `ai.prompt.format`,
+  // `ai.prompt.tools`, `ai.prompt.toolChoice`, `ai.prompt.system`.
+  { prefix: "ai.prompt", limit },
+  { prefix: "ai.response.text", limit },
+  { prefix: "ai.response.object", limit },
+  { prefix: "ai.response.toolCalls", limit },
+  { prefix: "ai.response.reasoning", limit },
+  { prefix: "ai.response.reasoningDetails", limit },
+  { prefix: "gen_ai.prompt", limit },
+  { prefix: "gen_ai.completion", limit },
+  { prefix: "gen_ai.request.messages", limit },
+  { prefix: "gen_ai.response.text", limit },
+];
+
+/**
+ * Priority list of keys to drop when the assembled attributes JSON exceeds
+ * the total-size budget. Higher up = dropped first. Each entry is a prefix
+ * (same semantics as `AttributeKeyOverride`).
+ */
+export const AI_CONTENT_DROP_PRIORITY: string[] = [
+  "ai.prompt.messages",
+  "ai.prompt",
+  "ai.response.object",
+  "ai.response.text",
+  "ai.response.toolCalls",
+  "ai.response.reasoning",
+  "ai.response.reasoningDetails",
+  "gen_ai.prompt",
+  "gen_ai.completion",
+  "gen_ai.request.messages",
+  "gen_ai.response.text",
+];
+
+function matchKeyOverride(
+  key: string,
+  overrides: AttributeKeyOverride[] | undefined
+): AttributeKeyOverride | undefined {
+  if (!overrides) return undefined;
+  for (const override of overrides) {
+    if (key === override.prefix || key.startsWith(override.prefix + ".")) {
+      return override;
+    }
+  }
+  return undefined;
+}
+
+export function truncateAttributes(
+  attributes: AttributeMap | undefined,
+  maximumLength: number = 1024,
+  keyOverrides?: AttributeKeyOverride[]
+): AttributeMap | undefined {
+  if (!attributes) return undefined;
+
+  const truncatedAttributes: AttributeMap = {};
+
+  for (const [key, value] of Object.entries(attributes)) {
+    if (!key) continue;
+
+    if (typeof value === "string") {
+      const override = matchKeyOverride(key, keyOverrides);
+      const limit = override ? override.limit : maximumLength;
+      truncatedAttributes[key] = truncateAndDetectUnpairedSurrogate(value, limit);
+    } else {
+      truncatedAttributes[key] = value;
+    }
+  }
+
+  return truncatedAttributes;
+}
+
+/**
+ * Backstop applied after per-attribute truncation. If `JSON.stringify(attrs)`
+ * is still over `maxBytes`, walk `AI_CONTENT_DROP_PRIORITY` and remove any
+ * attributes that match (by `key === prefix` or `key.startsWith(prefix + ".")`)
+ * until the assembled size is under budget or the list is exhausted.
+ *
+ * Returns the original `attributes` reference unchanged when already under
+ * budget; otherwise returns a new object with the offending keys removed.
+ *
+ * If the size is still over budget after exhausting the drop list, calls
+ * `onResidualOverflow` (if provided) with the remaining size so the caller
+ * can log it. Downstream protection lives in
+ * `DynamicFlushScheduler.tryFlush`'s batch-split branch.
+ */
+export function capAssembledAttributesSize(
+  attributes: AttributeMap | undefined,
+  maxBytes: number,
+  onResidualOverflow?: (info: { remainingBytes: number; maxBytes: number }) => void
+): AttributeMap {
+  if (!attributes) return {};
+  if (maxBytes <= 0) return attributes;
+
+  let serialized = JSON.stringify(attributes);
+  if (serialized.length <= maxBytes) return attributes;
+
+  const result: AttributeMap = { ...attributes };
+
+  for (const prefix of AI_CONTENT_DROP_PRIORITY) {
+    for (const key of Object.keys(result)) {
+      if (key === prefix || key.startsWith(prefix + ".")) {
+        delete result[key];
+      }
+    }
+    serialized = JSON.stringify(result);
+    if (serialized.length <= maxBytes) return result;
+  }
+
+  onResidualOverflow?.({ remainingBytes: serialized.length, maxBytes });
+  return result;
+}
+
+function truncateAndDetectUnpairedSurrogate(str: string, maximumLength: number): string {
+  const truncatedString = smartTruncateString(str, maximumLength);
+
+  if (hasUnpairedSurrogateAtEnd(truncatedString)) {
+    return smartTruncateString(truncatedString, [...truncatedString].length - 1);
+  }
+
+  return truncatedString;
+}
+
+const ASCII_ONLY_REGEX = /^[\p{ASCII}]*$/u;
+
+function smartTruncateString(str: string, maximumLength: number): string {
+  if (!str) return "";
+  if (str.length <= maximumLength) return str;
+
+  const checkLength = Math.min(str.length, maximumLength * 2 + 2);
+
+  if (ASCII_ONLY_REGEX.test(str.slice(0, checkLength))) {
+    return str.slice(0, maximumLength);
+  }
+
+  return [...str.slice(0, checkLength)].slice(0, maximumLength).join("");
+}
+
+function hasUnpairedSurrogateAtEnd(str: string): boolean {
+  if (str.length === 0) return false;
+
+  const lastCode = str.charCodeAt(str.length - 1);
+
+  // Check if last character is an unpaired high surrogate
+  if (lastCode >= 0xd800 && lastCode <= 0xdbff) {
+    return true; // High surrogate at end = unpaired
+  }
+
+  // Check if last character is an unpaired low surrogate
+  if (lastCode >= 0xdc00 && lastCode <= 0xdfff) {
+    // Low surrogate is only valid if preceded by high surrogate
+    if (str.length === 1) return true; // Single low surrogate
+
+    const secondLastCode = str.charCodeAt(str.length - 2);
+    if (secondLastCode < 0xd800 || secondLastCode > 0xdbff) {
+      return true; // Low surrogate not preceded by high surrogate
+    }
+  }
+
+  return false;
+}
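
To see why the truncation helpers check for a trailing unpaired surrogate, consider the simplified sketch below. It is not the module's code: `endsWithLoneHighSurrogate` and `safeSlice` are hypothetical names, and the sketch cuts by UTF-16 code units only, whereas the helper above counts code points for non-ASCII input.

```typescript
// True when the last UTF-16 code unit is a high surrogate with nothing after it.
function endsWithLoneHighSurrogate(s: string): boolean {
  if (s.length === 0) return false;
  const last = s.charCodeAt(s.length - 1);
  return last >= 0xd800 && last <= 0xdbff;
}

// Cut by code units, then drop a dangling first half of a surrogate pair.
function safeSlice(s: string, max: number): string {
  const cut = s.slice(0, max);
  return endsWithLoneHighSurrogate(cut) ? cut.slice(0, -1) : cut;
}

// "\uD83D\uDE00" is a single emoji encoded as two UTF-16 code units.
const text = "ab\uD83D\uDE00cd";

// A naive slice at 3 code units splits the pair and leaves a lone high surrogate.
const naive = text.slice(0, 3);

// The guarded version backs off by one code unit instead.
const safe = safeSlice(text, 3);

// A cut that lands after the full pair keeps the emoji intact.
const whole = safeSlice(text, 4);
```

Here `naive` ends in half a character, `safe` is `"ab"`, and `whole` keeps the full emoji, so the stored value never ends in a broken pair.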
