Skip to content

feat(error): add typed Error::ServerError { code, name, message } variant#426

Open
gregvirgin-ls wants to merge 1 commit into
ClickHouse:mainfrom
gregvirgin-ls:typed-error-codes
Open

feat(error): add typed Error::ServerError { code, name, message } variant#426
gregvirgin-ls wants to merge 1 commit into
ClickHouse:mainfrom
gregvirgin-ls:typed-error-codes

Conversation

@gregvirgin-ls
Copy link
Copy Markdown
Contributor

Summary

Adds a typed Error::ServerError { code: u32, name: Option<String>, message: String } variant alongside the existing Error::BadResponse(String). ClickHouse server exceptions arrive in two wire shapes today (streaming body and X-ClickHouse-Exception-Code header fast-fail) and both flatten into BadResponse(String), forcing every downstream that wants code-aware control flow to substring-match the text. This change recognises the canonical shapes and produces the typed variant; anything that doesn't match still falls through to BadResponse, so every existing match arm still compiles.

Behaviour change

Recognisable ClickHouse server exceptions now surface as Error::ServerError { code, name, message } instead of Error::BadResponse(String). Both the streaming wire path (extract_exception_old, extract_exception_new) and the header fast-fail path (collect_bad_response) emit the typed variant when the body parses cleanly. Anything unrecognised — proxy errors, malformed bodies, non-ClickHouse responses, compressed bodies that didn't decompress — continues to surface as Error::BadResponse(String).

Error is #[non_exhaustive], so the new variant is additive: existing match arms with a _ => fallback (or that already matched on BadResponse) keep compiling. The new variant gives consumers an option to do better, not a required migration.

Migration example

Code that today substring-matches the body text:

match err {
    clickhouse::error::Error::BadResponse(s) if s.contains("(TOO_MANY_PARTS)") => {
        alert_on_too_many_parts();
    }
    other => return Err(other),
}

Becomes a clean code-based dispatch:

match err {
    clickhouse::error::Error::ServerError { code: 252, .. } => alert_on_too_many_parts(),
    other => return Err(other),
}

Or, for an exception class that matters by name rather than code:

match err {
    clickhouse::error::Error::ServerError { name: Some(n), .. } if n == "TIMEOUT_EXCEEDED" => {
        retry_with_longer_timeout();
    }
    other => return Err(other),
}

Existing Error::BadResponse(...) match arms continue to fire on responses that do not parse as a recognised CH exception (proxy errors, malformed bodies, compressed payloads). Consumers who only have a BadResponse arm today will silently miss the typed variant on canonical responses; adding a ServerError { .. } arm above it is the upgrade path.

Tests

Ten unit tests in src/response.rs::tests cover:

  • Streaming body with (NAME) tag and (version ...) trailer.
  • Header-only Code: NNN fast-fail.
  • (TOO_MANY_PARTS) (the motivating ClickFlow case).
  • Bodies that should fall back to BadResponse: non-ClickHouse responses, malformed code, missing-Code: strings.
  • No-name-tag form (some CH versions / proxies trim the tag).
  • Compressed (zstd / lz4) error bodies — fall through to BadResponse.
  • Code: N. DB::Exception:<no-space>message — parser intentionally tolerates the missing space; code, name, and the full message are all extracted (the variant captures the same fields it would with the space present).
  • User-data parens in the message (must NOT be misread as the (NAME) tag).
  • Header code with non-ClickHouse proxy body (must fall back to BadResponse).

Existing integration test in tests/it/fetch_bytes.rs updated to match.

Clippy / test matrix

Before pushing:

cargo fmt -- --check
cargo clippy --all-targets --no-default-features
cargo clippy --all-targets --all-features
cargo clippy --features native-tls
cargo clippy --features rustls-tls
cargo clippy --features rustls-tls-ring,rustls-tls-webpki-roots
cargo clippy --features rustls-tls-ring,rustls-tls-native-roots
cargo clippy --features rustls-tls-aws-lc,rustls-tls-webpki-roots
cargo clippy --features rustls-tls-aws-lc,rustls-tls-native-roots
cargo test --workspace
cargo test --workspace --no-default-features
cargo test --workspace --all-features
cargo rustdoc -p clickhouse --all-features -- -D warnings --cfg docsrs

What this does NOT do

  • Does not introduce a hardcoded code → name lookup table. When CH includes the (NAME) tag we capture it; when the response is bare Code: NNN we leave name as None.
  • Does not change BadResponse's representation or remove it. It remains the fallback for anything the parser doesn't recognise.
  • Does not make ServerError exhaustive. The struct fields can grow under #[non_exhaustive].

Downstream motivation

ClickFlow (a Rust SDK on top of clickhouse-rs) carries two substring-matching helpers that this change retires:

  • is_too_many_parts in crates/clickflow/src/batch.rs.
  • is_caller_caused_ch_error in crates/clickflow-mcp/src/error.rs (matches CH codes 43/47/53/386 to map to JSON-RPC -32602 invalid_params).

Every non-trivial clickhouse-rs consumer ends up with some version of these helpers. The typed variant retires the class.

…iant

ClickHouse server exceptions arrive as text in two wire paths:
- streaming body — "Code: NNN. DB::<Class>: <message> (NAME) (version ...)"
- header fast-fail — bare "Code: NNN" via X-ClickHouse-Exception-Code

Today both are flattened into Error::BadResponse(String), forcing every
downstream consumer that wants code-aware control flow (alerting on
TOO_MANY_PARTS, mapping caller-error codes 43/47/53 to a JSON-RPC
invalid_params, etc.) to substring-match the string. That parser is
duplicated everywhere and breaks subtly when CH's exception text
changes (e.g. the 25.11 tag-format bump in ClickHouse#365).

This commit:

- Adds a non-exhaustive Error::ServerError { code: u32, name:
  Option<String>, message: String } variant.
- Adds parse_server_exception() that recognises the documented forms
  and produces the typed variant.
- Wires both wire paths (extract_exception_old, extract_exception_new,
  collect_bad_response) to try the typed parse first and fall back to
  BadResponse on failure (proxy errors, malformed bodies,
  non-ClickHouse responses).
- Keeps BadResponse as a fallback so this is purely additive — every
  existing match arm still compiles.

The `name` field is Option<String>: when CH includes a (TOO_MANY_PARTS)
tag we capture it; when the response is a bare "Code: 252" we leave
None rather than ship a hardcoded code→name lookup table.

Tests cover both wire paths, header-only fast-fail, the no-tag form,
and proxy-error fallback.

Motivated by the substring-matching helpers carried by every non-trivial
clickhouse-rs consumer (see ClickFlow's `is_too_many_parts` in
crates/clickflow/src/batch.rs:387 and `is_caller_caused_ch_error` in
crates/clickflow-mcp/src/error.rs:16).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@abonander
Copy link
Copy Markdown
Contributor

Even if it's not technically a breaking API change, this is a significant enough behavior change that it may be better to wait for 0.16.0. Error handling is one of those things that you don't want to change out from under people if you can avoid it.

Comment thread src/error.rs
Comment on lines +62 to +76
/// Numeric ClickHouse exception code (e.g. `252` for `TOO_MANY_PARTS`).
///
/// The full table of codes lives in
/// [src/Common/ErrorCodes.cpp](https://github.com/ClickHouse/ClickHouse/blob/master/src/Common/ErrorCodes.cpp)
/// in the ClickHouse server tree.
code: u32,
/// Human-readable exception name (e.g. `"TOO_MANY_PARTS"`), extracted
/// from the trailing parenthesised tag in the server message.
///
/// `None` when the response is the bare `Code: NNN` form (e.g. a
/// header-only fast-fail) or when the trailing tag is absent.
name: Option<String>,
/// The exception message — the text between `DB::<Class>:` and the
/// trailing `(NAME) (version ...)` metadata, trimmed.
message: String,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think that structured errors like this should be their own structs. That allows for more flexible handling, as you can write functions which handle that error type specifically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants