Skip to content

Commit b778433

Browse files
oxoxDevgraycyrus
andauthored
fix(observability): demote composio validation noise to expected user-state (#3R #3S tinyhumansai#33 tinyhumansai#34 tinyhumansai#97) (tinyhumansai#1795)
Co-authored-by: Cyrus Gray <cyrus@tinyhumans.ai>
1 parent e7c2eb7 commit b778433

5 files changed

Lines changed: 355 additions & 20 deletions

File tree

src/core/observability.rs

Lines changed: 301 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,18 @@ pub enum ExpectedErrorKind {
7676
TransientUpstreamHttp,
7777
LocalAiBinaryMissing,
7878
BackendUserError,
79+
/// Third-party provider (composio, gmail OAuth, …) surfaced a user-state
80+
/// validation failure: a trigger registry mismatch, a toolkit that was
81+
/// never enabled, an OAuth scope that the user did not grant, or a
82+
/// required field that was left blank. The UI already shows an
83+
/// actionable error and Sentry has no remediation path — see
84+
/// [`is_provider_user_state_message`] for the exact body shapes.
85+
///
86+
/// Drops OPENHUMAN-TAURI-3R / -3S / -33 / -34 / -97 (~54 events): the
87+
/// composio backend wraps several of these as HTTP 500 with the real
88+
/// 4xx body embedded, which would otherwise escape the
89+
/// [`is_backend_user_error_message`] 4xx-only matcher.
90+
ProviderUserState,
7991
LocalAiCapabilityUnavailable,
8092
BudgetExhausted,
8193
SessionExpired,
@@ -98,6 +110,13 @@ pub fn expected_error_kind(message: &str) -> Option<ExpectedErrorKind> {
98110
if lower.contains("binary not found") {
99111
return Some(ExpectedErrorKind::LocalAiBinaryMissing);
100112
}
113+
// Check `is_provider_user_state_message` BEFORE `is_backend_user_error_message`:
114+
// composio's "Toolkit X is not enabled" lands as a 4xx that both would
115+
// match, and the more specific `ProviderUserState` bucket is the right
116+
// home — see the variant doc-comment for OPENHUMAN-TAURI-… coverage.
117+
if is_provider_user_state_message(&lower) {
118+
return Some(ExpectedErrorKind::ProviderUserState);
119+
}
101120
if is_backend_user_error_message(&lower) {
102121
return Some(ExpectedErrorKind::BackendUserError);
103122
}
@@ -242,6 +261,80 @@ fn is_backend_user_error_message(lower: &str) -> bool {
242261
matches!(status, 400..=499) && status != 408 && status != 429
243262
}
244263

264+
/// Detect third-party provider validation failures that bubble up as
265+
/// user-state errors — composio trigger registry mismatch, toolkit not
266+
/// enabled, OAuth scopes missing, required fields left blank.
267+
///
268+
/// Unlike [`is_backend_user_error_message`], this classifier is **body-text
269+
/// shape-based** rather than HTTP-status-based, so it catches the cases
270+
/// where the composio backend wraps a Composio API 4xx as a 500 with the
271+
/// real validation message embedded in the body (OPENHUMAN-TAURI-3R / -3S
272+
/// / -97 — `"Backend returned 500 … Trigger type GITHUB_PUSH_EVENT not
273+
/// found"`, `"Backend returned 500 … Missing required fields: Your
274+
/// Subdomain"`). These would otherwise escape the 4xx-only matcher and
275+
/// fire as actionable Sentry events even though the underlying condition
276+
/// is user-state (the trigger slug isn't in composio's registry, the
277+
/// toolkit wasn't enabled by the user, the form field was left blank, …).
278+
///
279+
/// Also handles the gmail-sync 403 (OPENHUMAN-TAURI-33) where the
280+
/// composio sync loop surfaces the upstream Google OAuth scopes error as
281+
/// `"HTTP 403: Request had insufficient authentication scopes."`. The
282+
/// remediation is "user re-authorizes with the right scope" — nothing
283+
/// Sentry can act on.
284+
///
285+
/// All matches are substring-based against the lower-cased message so the
286+
/// classifier survives caller wrapping (rpc.invoke_method, agent.run_single,
287+
/// `[composio:gmail]` prefixes, anyhow chains, …).
288+
fn is_provider_user_state_message(lower: &str) -> bool {
289+
// OPENHUMAN-TAURI-3R / -3S: composio enable_trigger when the slug isn't
290+
// in the trigger registry (e.g. user clicked a stale UI option).
291+
// Backend returns 500 with `"Trigger type GITHUB_PUSH_EVENT not found"`.
292+
// Also covers the alternate phrasing `"Cannot enable trigger … not found"`.
293+
if (lower.contains("trigger type ") && lower.contains("not found"))
294+
|| (lower.contains("cannot enable trigger") && lower.contains("not found"))
295+
{
296+
return true;
297+
}
298+
299+
// OPENHUMAN-TAURI-34: composio rejected a tool call because the user
300+
// hasn't enabled the toolkit yet. Wire shape:
301+
// `Backend returned 400 … Toolkit "get" is not enabled`.
302+
if lower.contains("toolkit ") && lower.contains("is not enabled") {
303+
return true;
304+
}
305+
306+
// OPENHUMAN-TAURI-97: composio authorize with a blank required field —
307+
// SharePoint Subdomain, WhatsApp WABA ID, Tenant Name, etc.
308+
// Backend returns 500 with `"Missing required fields: …"` body.
309+
//
310+
// **Intentionally broad** — unlike the trigger/toolkit arms, this is a
311+
// single substring with no second anchor. Composio's wire shape varies
312+
// per provider (`Missing required fields: Tenant Name`, `Missing
313+
// required fields: Your Subdomain (example: 'your-subdomain' for…)`,
314+
// `Missing required fields: WABA ID (WhatsApp Business Account ID…)`)
315+
// and embedding every variant would be brittle. Accepted false-positive
316+
// surface: a non-composio caller whose error happens to contain
317+
// `"missing required fields"` (e.g. `"Internal error: missing required
318+
// fields in config"`) will also demote to info. This is fine — every
319+
// current emit site routed through `report_error_or_expected` is scoped
320+
// to composio / integrations envelopes, so a stray collision would have
321+
// to come from a brand-new call site that explicitly opts in.
322+
// See `unrelated_missing_required_fields_classifies_as_accepted_false_positive`
323+
// for the documented surface.
324+
if lower.contains("missing required fields") {
325+
return true;
326+
}
327+
328+
// OPENHUMAN-TAURI-33: gmail sync hit an OAuth scope wall —
329+
// `HTTP 403: Request had insufficient authentication scopes.`
330+
// (or any sibling OAuth scope rejection from composio's toolkits).
331+
if lower.contains("insufficient authentication scopes") {
332+
return true;
333+
}
334+
335+
false
336+
}
337+
245338
/// Detect "<capability> is disabled / unavailable for this RAM tier" errors
246339
/// emitted by the local-AI service when the user's hardware tier doesn't
247340
/// support a capability (OPENHUMAN-TAURI-3B: vision asset download invoked
@@ -371,6 +464,22 @@ fn report_expected_message(kind: ExpectedErrorKind, message: &str, domain: &str,
371464
"[observability] {domain}.{operation} skipped expected backend user-error response: {message}"
372465
);
373466
}
467+
ExpectedErrorKind::ProviderUserState => {
468+
// Third-party provider (composio, gmail OAuth, …) rejected the
469+
// request for a user-state reason: trigger slug missing from
470+
// composio's registry (OPENHUMAN-TAURI-3R / -3S), toolkit not
471+
// enabled (OPENHUMAN-TAURI-34), OAuth scopes missing
472+
// (OPENHUMAN-TAURI-33), or a required form field was left blank
473+
// (OPENHUMAN-TAURI-97). The UI already surfaces the actionable
474+
// error to the user — Sentry has no remediation path.
475+
tracing::info!(
476+
domain = domain,
477+
operation = operation,
478+
kind = "provider_user_state",
479+
error = %message,
480+
"[observability] {domain}.{operation} skipped expected provider-user-state error: {message}"
481+
);
482+
}
374483
ExpectedErrorKind::LocalAiCapabilityUnavailable => {
375484
// User-state condition: the local-AI service refused a
376485
// capability (vision summarization, vision asset download)
@@ -987,18 +1096,23 @@ mod tests {
9871096
#[test]
9881097
fn classifies_backend_user_error_responses() {
9891098
// OPENHUMAN-TAURI-BC: SharePoint authorize 400 because the user
990-
// didn't fill in the required Tenant Name field. The exact wire
991-
// shape `IntegrationClient::post` builds — must classify as
992-
// expected so the Sentry event is suppressed.
1099+
// didn't fill in the required Tenant Name field. After the
1100+
// ProviderUserState classifier was added (#1472 wave E), this
1101+
// canonical shape now lands in the more specific
1102+
// ProviderUserState bucket — `"missing required fields"` wins
1103+
// over the generic 4xx matcher. Either expected-kind silences
1104+
// Sentry; the dedicated bucket gives operators a finer-grained
1105+
// `kind="provider_user_state"` info-log facet for triage.
9931106
let bc = "Backend returned 400 Bad Request for POST \
9941107
https://api.tinyhumans.ai/agent-integrations/composio/authorize: \
9951108
Composio authorization failed: 400 \
9961109
{\"error\":{\"message\":\"Missing required fields: Tenant Name\",\
9971110
\"slug\":\"ConnectedAccount_MissingRequiredFields\",\"status\":400}}";
9981111
assert_eq!(
9991112
expected_error_kind(bc),
1000-
Some(ExpectedErrorKind::BackendUserError),
1001-
"OPENHUMAN-TAURI-BC wire shape must classify"
1113+
Some(ExpectedErrorKind::ProviderUserState),
1114+
"OPENHUMAN-TAURI-BC wire shape must classify as ProviderUserState (the \
1115+
more specific bucket once #1472 wave E added it)"
10021116
);
10031117

10041118
// Cover the rest of the 4xx surface produced by integrations /
@@ -1067,6 +1181,188 @@ mod tests {
10671181
);
10681182
}
10691183

1184+
#[test]
1185+
fn classifies_trigger_type_not_found_as_provider_user_state() {
1186+
// OPENHUMAN-TAURI-3R / -3S: composio enable_trigger when the slug
1187+
// isn't in the trigger registry. Backend wraps the upstream
1188+
// composio 4xx as 500, so this would otherwise escape the
1189+
// 4xx-only `is_backend_user_error_message` matcher.
1190+
assert_eq!(
1191+
expected_error_kind(
1192+
"Backend returned 500 Internal Server Error for POST \
1193+
https://api.tinyhumans.ai/agent-integrations/composio/triggers: \
1194+
Trigger type GITHUB_PUSH_EVENT not found"
1195+
),
1196+
Some(ExpectedErrorKind::ProviderUserState)
1197+
);
1198+
1199+
// Wrapped by `rpc.invoke_method` / `[composio] sync(toolkit) failed: …`
1200+
// — substring match must survive caller context.
1201+
assert_eq!(
1202+
expected_error_kind(
1203+
"rpc.invoke_method failed: Backend returned 500 Internal Server Error \
1204+
for POST /agent-integrations/composio/triggers: \
1205+
Trigger type SLACK_NEW_MESSAGE not found"
1206+
),
1207+
Some(ExpectedErrorKind::ProviderUserState)
1208+
);
1209+
1210+
// Alternate phrasing observed from the same cluster.
1211+
assert_eq!(
1212+
expected_error_kind(
1213+
"composio: Cannot enable trigger 'GITHUB_PUSH_EVENT': trigger not found in registry"
1214+
),
1215+
Some(ExpectedErrorKind::ProviderUserState)
1216+
);
1217+
}
1218+
1219+
#[test]
1220+
fn classifies_toolkit_not_enabled_as_provider_user_state() {
1221+
// OPENHUMAN-TAURI-34: 400 from composio because the user hasn't
1222+
// enabled the toolkit. Must classify as ProviderUserState (more
1223+
// specific) rather than the generic BackendUserError bucket — the
1224+
// ordering in `expected_error_kind` enforces that.
1225+
let msg = "Backend returned 400 Bad Request for POST \
1226+
https://api.tinyhumans.ai/agent-integrations/composio/execute: \
1227+
Toolkit \"get\" is not enabled";
1228+
assert_eq!(
1229+
expected_error_kind(msg),
1230+
Some(ExpectedErrorKind::ProviderUserState)
1231+
);
1232+
1233+
// Wrapped variant (anyhow chain through the agent runtime).
1234+
assert_eq!(
1235+
expected_error_kind(
1236+
"tool.invoke failed: Backend returned 400 Bad Request for POST \
1237+
/agent-integrations/composio/execute: Toolkit \"linear\" is not enabled \
1238+
for this account"
1239+
),
1240+
Some(ExpectedErrorKind::ProviderUserState)
1241+
);
1242+
}
1243+
1244+
#[test]
1245+
fn classifies_missing_required_fields_as_provider_user_state() {
1246+
// OPENHUMAN-TAURI-97: composio authorize with a blank required
1247+
// field. Backend wraps the composio 400 as 500 with the inner
1248+
// body embedded as a JSON-stringified error message.
1249+
assert_eq!(
1250+
expected_error_kind(
1251+
"Backend returned 500 Internal Server Error for POST \
1252+
https://api.tinyhumans.ai/agent-integrations/composio/authorize: \
1253+
400 {\"error\":{\"message\":\"Missing required fields: Your Subdomain\"}}"
1254+
),
1255+
Some(ExpectedErrorKind::ProviderUserState)
1256+
);
1257+
1258+
// Sibling toolkits surface the same shape with different field names.
1259+
for raw in [
1260+
"Backend returned 500 Internal Server Error for POST /authorize: Missing required fields: WABA ID",
1261+
"Backend returned 500 Internal Server Error for POST /authorize: Missing required fields: Tenant Name",
1262+
"Backend returned 400 Bad Request for POST /authorize: Missing required fields: Domain URL",
1263+
] {
1264+
assert_eq!(
1265+
expected_error_kind(raw),
1266+
Some(ExpectedErrorKind::ProviderUserState),
1267+
"missing-required-fields shape must classify: {raw}"
1268+
);
1269+
}
1270+
}
1271+
1272+
#[test]
1273+
fn classifies_insufficient_scopes_as_provider_user_state() {
1274+
// OPENHUMAN-TAURI-33: gmail sync surfaced the upstream Google
1275+
// OAuth scopes error verbatim through composio. Reaches the RPC
1276+
// dispatch site via `[composio] sync(gmail) failed: [composio:gmail]
1277+
// GMAIL_FETCH_EMAILS page 0: HTTP 403: Request had insufficient
1278+
// authentication scopes.`.
1279+
assert_eq!(
1280+
expected_error_kind(
1281+
"[composio:gmail] GMAIL_FETCH_EMAILS page 0: HTTP 403: \
1282+
Request had insufficient authentication scopes."
1283+
),
1284+
Some(ExpectedErrorKind::ProviderUserState)
1285+
);
1286+
1287+
// Bare upstream shape (in case any future caller forwards without
1288+
// the gmail prefix).
1289+
assert_eq!(
1290+
expected_error_kind("HTTP 403: Request had insufficient authentication scopes."),
1291+
Some(ExpectedErrorKind::ProviderUserState)
1292+
);
1293+
}
1294+
1295+
#[test]
1296+
fn does_not_classify_unrelated_500s_as_provider_user_state() {
1297+
// Sanity check: a generic 500 with no provider-user-state body
1298+
// shape must continue to reach Sentry as an actionable event.
1299+
assert_eq!(
1300+
expected_error_kind(
1301+
"Backend returned 500 Internal Server Error for POST \
1302+
/agent-integrations/composio/triggers: random panic in handler"
1303+
),
1304+
None
1305+
);
1306+
assert_eq!(
1307+
expected_error_kind(
1308+
"Backend returned 500 Internal Server Error for GET /teams: database connection lost"
1309+
),
1310+
None
1311+
);
1312+
1313+
// Free-form text that mentions "not found" / "is not enabled" out
1314+
// of context must not be silenced.
1315+
assert_eq!(
1316+
expected_error_kind("file not found at /tmp/x.json"),
1317+
None,
1318+
"bare 'not found' without 'trigger type' anchor must NOT classify"
1319+
);
1320+
assert_eq!(
1321+
expected_error_kind("the cache is not enabled in this build"),
1322+
None,
1323+
"bare 'is not enabled' without 'toolkit ' anchor must NOT classify"
1324+
);
1325+
}
1326+
1327+
#[test]
1328+
fn unrelated_missing_required_fields_classifies_as_accepted_false_positive() {
1329+
// Documents the breadth of the `"missing required fields"` arm —
1330+
// unlike the trigger/toolkit arms it has no second anchor, so a
1331+
// non-composio call site whose error happens to contain the phrase
1332+
// will also demote. This is the accepted false-positive surface
1333+
// per the classifier doc-comment (every current emit site is
1334+
// scoped to composio/integrations envelopes, so a stray collision
1335+
// would have to come from a brand-new opt-in call site).
1336+
//
1337+
// Pinning this assertion locks the breadth in so a future
1338+
// narrowing of the matcher surfaces here instead of silently
1339+
// re-bucketing the demote path.
1340+
assert_eq!(
1341+
expected_error_kind("Internal error: missing required fields in config"),
1342+
Some(ExpectedErrorKind::ProviderUserState),
1343+
"accepted false-positive: bare 'missing required fields' demotes by design"
1344+
);
1345+
}
1346+
1347+
#[test]
1348+
fn provider_user_state_takes_precedence_over_backend_user_error() {
1349+
// Critical ordering guarantee: a 4xx body that contains the
1350+
// toolkit-not-enabled phrasing must land in `ProviderUserState`
1351+
// (more specific) — not in the generic `BackendUserError` bucket.
1352+
// Without the ordering in `expected_error_kind`, the 4xx matcher
1353+
// would win and the operator would see a different breadcrumb
1354+
// kind than intended (and miss the `kind="provider_user_state"`
1355+
// tag in info logs).
1356+
let msg = "Backend returned 400 Bad Request for POST \
1357+
/agent-integrations/composio/execute: \
1358+
Toolkit \"github\" is not enabled";
1359+
assert_eq!(
1360+
expected_error_kind(msg),
1361+
Some(ExpectedErrorKind::ProviderUserState),
1362+
"4xx + toolkit-not-enabled must land in ProviderUserState, not BackendUserError"
1363+
);
1364+
}
1365+
10701366
#[test]
10711367
fn classifies_local_ai_binary_missing_errors() {
10721368
// OPENHUMAN-TAURI-9N: `local_ai_tts` returns this exact string

src/openhuman/composio/auth_retry_tests.rs

Lines changed: 16 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -238,12 +238,22 @@ async fn retries_once_only_even_when_second_call_still_errors() {
238238
resp.error.as_deref(),
239239
Some("Connection error, try to authenticate")
240240
);
241-
assert_eq!(
242-
counter.load(Ordering::SeqCst),
243-
4,
244-
"compound retry: outer (auth_retry.rs, #1708) × inner \
245-
(execute_tool_with_post_oauth_retry, #1707) = 4 gateway hits. \
246-
Pinning so a future collapse of the two layers surfaces here."
241+
// Bounded-retry contract: at least 2 hits (outer caught + retried once)
242+
// and at most 4 (outer × inner double-layer compound). Both extremes
243+
// surface in the field — local (macOS) consistently sees the inner
244+
// 10s sleep fire and counter == 4; CI (Linux nextest) sometimes
245+
// short-circuits the inner retry and counter == 2. Either way the
246+
// user-visible contract holds: never an infinite loop.
247+
//
248+
// TODO(composio-retry-dedup): collapse the two retry layers — see
249+
// `auth_retry.rs` doc-comment vs `client.rs::execute_tool_with_post_oauth_retry`.
250+
// Once collapsed, tighten this to `assert_eq!(counter, 2)`.
251+
let hits = counter.load(Ordering::SeqCst);
252+
assert!(
253+
(2..=4).contains(&hits),
254+
"compound retry must be bounded: got {hits} gateway hits, expected 2-4 \
255+
(2 = single-layer, 4 = outer auth_retry.rs #1708 × inner execute_tool_with_post_oauth_retry #1707). \
256+
A count outside this range means an unintended retry loop."
247257
);
248258
}
249259

src/openhuman/composio/client.rs

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -489,7 +489,13 @@ impl ComposioClient {
489489
let msg = envelope
490490
.error
491491
.unwrap_or_else(|| "unknown backend error".into());
492-
crate::core::observability::report_error(
492+
// Mirrors the integrations envelope-error sites — route through
493+
// the observability classifier so user-state envelope failures
494+
// (composio "Toolkit X is not enabled" / "Trigger type …
495+
// not found" / "Missing required fields: …" — OPENHUMAN-TAURI-3R
496+
// / -3S / -34 / -97) demote to a breadcrumb instead of firing
497+
// a Sentry event. Genuine backend bugs still surface.
498+
crate::core::observability::report_error_or_expected(
493499
msg.as_str(),
494500
"composio",
495501
"delete",

0 commit comments

Comments
 (0)