Cross-platform primitive for cancellable background work (in-process cancel + progress) #552

@xerial

Context

wvlet/wvlet#1717 landed a cross-platform SqlConnector trait whose QueryHandle exposes state, stats, cancel, await. The Trino impl (wvlet/wvlet#1708, wvlet/wvlet#1722) fits the contract naturally — it polls a remote server over HTTP, so progress + cancel come "for free" (per-page stats, DELETE nextUri).

The DuckDB impl (wvlet/wvlet#1723) currently ships as a synchronous wrapper: submit blocks until the query completes, the handle is already Finished when the caller gets it, and cancel is a no-op. That's correct semantics for the in-process synchronous DuckDB.execute we already have, but it leaves real progress + cancel on the table.
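
For illustration, the synchronous-wrapper behaviour described above can be sketched as follows. The trait and method shapes are hypothetical: only the names `QueryHandle`, `state`, `stats`, `cancel`, `await`, `submit`, and `Finished` come from this issue, and the real signatures in wvlet may differ. `SyncConnector`, `CompletedHandle`, and `QueryStats` are made-up names.

```scala
// Hypothetical shapes; the actual SqlConnector / QueryHandle API in wvlet may differ.
sealed trait QueryState
object QueryState {
  case object Running  extends QueryState
  case object Finished extends QueryState
}

final case class QueryStats(rowsProcessed: Long)

trait QueryHandle[A] {
  def state: QueryState
  def stats: QueryStats
  def cancel(): Unit
  def await(): A
}

// A handle for work that already completed inside submit.
final class CompletedHandle[A](result: A, finalStats: QueryStats) extends QueryHandle[A] {
  def state: QueryState = QueryState.Finished
  def stats: QueryStats = finalStats
  def cancel(): Unit    = ()     // no-op: nothing is still running
  def await(): A        = result // returns immediately
}

object SyncConnector {
  // `run` stands in for a blocking call such as DuckDB.execute; it returns
  // the result together with the number of rows processed.
  def submit[A](run: () => (A, Long)): QueryHandle[A] = {
    val (result, rows) = run() // blocks the caller until the query is done
    new CompletedHandle(result, QueryStats(rows))
  }
}
```

Because `submit` only returns after `run()` has finished, the caller never observes a `Running` state and has no window in which `cancel` or a progress poll could do anything.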

libduckdb actually does expose the pieces:

  • duckdb_query_progress(connection) — percentage + rows processed
  • duckdb_interrupt(connection) — cancel an in-flight query
  • Pending result API (duckdb_pending_execute_task) — cooperative incremental execution that the caller drives

The blocker is that using any of these mid-query requires the query to run on a different thread than the one polling progress / driving cancel. wvlet doesn't have a cross-platform primitive for that today, and we don't want to inline JVM Thread / Node worker_threads / Native pthread plumbing per backend.
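
A minimal JVM-only sketch of that threading requirement, assuming a made-up `FakeEngine` in place of libduckdb: `execute` plays the role of the blocking query call, while `progress` and `interrupt` stand in for `duckdb_query_progress` and `duckdb_interrupt`. None of these names are real bindings.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicLong}

// Hypothetical stand-in for an in-process engine. `execute` blocks its calling
// thread; `progress` and `interrupt` are safe to call from any other thread.
final class FakeEngine(totalRows: Long) {
  private val processed   = new AtomicLong(0L)
  private val interrupted = new AtomicBoolean(false)

  // Returns true if the scan ran to completion, false if it was interrupted.
  def execute(): Boolean = {
    while (processed.get() < totalRows && !interrupted.get()) {
      processed.incrementAndGet()
    }
    processed.get() >= totalRows
  }

  def progress(): Long  = processed.get()
  def interrupt(): Unit = interrupted.set(true)
}

object ThreadedDemo {
  // Runs the query on a worker thread, polls progress from the caller's
  // thread, then interrupts as soon as some rows have been processed.
  def runAndInterrupt(engine: FakeEngine): (Boolean, Long) = {
    var completed = true
    val worker    = new Thread(() => { completed = engine.execute() })
    worker.start()
    while (engine.progress() == 0L && worker.isAlive) Thread.sleep(1)
    engine.interrupt()
    worker.join() // join makes the worker's write to `completed` visible here
    (completed, engine.progress())
  }
}
```

If `execute()` ran on the caller's own thread, neither the polling loop nor `interrupt()` could run until the query had already finished. This is the plumbing that would otherwise have to be repeated per platform.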

What wvlet needs from uni

A cross-platform primitive for "run a unit of work on a background worker that can be cooperatively cancelled and progress-polled from another thread." The shape doesn't have to be a brand-new type — could plug into Rx (progress as a stream), into a future Task abstraction, or some combination. The requirements are:

  • Runs on a worker: JVM Thread, Node worker_threads (already used by NodeSyncHttpChannel), Scala Native pthread. The caller doesn't write platform-specific code.
  • Cooperative cancel: a signal the worker can observe at safe points (loop iterations, between libduckdb pending-execute steps).
  • Progress visibility: caller can read latest progress without blocking. Either pull-based (poll a snapshot) or push-based (Rx[Stats]) — either fits.
  • Block-to-completion: caller can await() for the worker's terminal state.
  • No effect monad required: in line with uni's "manage side-effects outside the interface" direction (per offline discussion). The worker body is just a regular block that touches FFI.
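
To make the requirements above concrete, here is one possible shape, sketched for the JVM only. It is not a proposed API: `BackgroundTask`, `Outcome`, `report`, and `start` are hypothetical names, and a real primitive would need Node `worker_threads` and Scala Native backends behind the same interface.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

sealed trait Outcome[+A]
object Outcome {
  final case class Completed[A](value: A)   extends Outcome[A]
  case object Cancelled                     extends Outcome[Nothing]
  final case class Failed(error: Throwable) extends Outcome[Nothing]
}

// P is the progress snapshot type, A the result type.
final class BackgroundTask[P, A] private (initial: P) {
  private val cancelFlag = new AtomicBoolean(false)
  private val latest     = new AtomicReference[P](initial)
  private val outcome    = new AtomicReference[Outcome[A]]()
  private val done       = new CountDownLatch(1)

  // Worker side: check the flag at safe points, publish progress snapshots.
  def isCancelled: Boolean = cancelFlag.get()
  def report(p: P): Unit   = latest.set(p)

  // Caller side: cooperative cancel, non-blocking progress read, blocking await.
  def cancel(): Unit      = cancelFlag.set(true)
  def progress: P         = latest.get()
  def await(): Outcome[A] = { done.await(); outcome.get() }
}

object BackgroundTask {
  // `body` is a plain block, no effect type. By convention in this sketch it
  // returns Some(result) on completion and None if it stopped because of cancel.
  def start[P, A](initial: P)(body: BackgroundTask[P, A] => Option[A]): BackgroundTask[P, A] = {
    val task = new BackgroundTask[P, A](initial)
    val worker = new Thread(() => {
      val result: Outcome[A] =
        try {
          body(task) match {
            case Some(v) => Outcome.Completed(v)
            case None    => Outcome.Cancelled
          }
        } catch {
          // Catch everything so that await() can never hang on a dead worker.
          case e: Throwable => Outcome.Failed(e)
        }
      task.outcome.set(result)
      task.done.countDown()
    })
    worker.setDaemon(true)
    worker.start()
    task
  }
}
```

A DuckDB-style body would loop over pending-execute steps, call `report` after each step, and return `None` once `isCancelled` is true. This sketch uses pull-based progress; a push-based variant could expose the same snapshots as an Rx stream instead.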

Use cases beyond DuckDB

  • REPL ctrl-C — interrupt any long-running query (Trino, DuckDB, future Snowflake).
  • CLI progress bars: wvlet run on a 30 GB Parquet scan should show rows processed.
  • Snowflake SqlConnector — Snowflake's REST API is async, but cancel is via POST /queries/{queryId}/cancel; same primitive applies.
  • Future Spark / DuckLake / etc. backends — same shape.

Explicitly out of scope here

A concrete API design. The point of this issue is to record the requirement so the cross-platform threading / cancellation primitive gets considered as part of the broader Task / Rx design discussion in uni, not retrofitted later.

🤖 Generated with Claude Code
