HTTP API

This document describes the HTTP surface that is actually implemented in server/leaderless_log_broker.py.

The broker can run in three roles selected by --role or LLOG_BROKER_ROLE:

write: enables POST /produce
read: enables POST /consume
both: enables both endpoints and is the default

Endpoints

`GET /health`

Returns:

{
  "status": "ok",
  "broker_id": "broker-1",
  "host": "127.0.0.1",
  "port": 8080,
  "started_at_ms": 1760000000000
}

Status:

200 OK when the broker process is up
404 Not Found for any other GET path

`GET /metrics`

Returns:

JSON with content type application/json
top-level objects:
- broker
- http
- batching
- oxia
- s3

Minimum broker metadata in the snapshot:

broker.broker_id
broker.host
broker.port
broker.started_at_ms
broker.roles
broker.batch_settings.max_bytes
broker.batch_settings.max_delay_ms
broker.batch_settings.max_buffer_bytes

Batching semantics:

broker.batch_settings.max_bytes is the sealed batch target. Once a batch reaches it, the batch is handed to the flusher and later appends go into a new batch.
broker.batch_settings.max_buffer_bytes is the total uncommitted payload memory cap across the in-flight flushing batch and any queued future batches. Backpressure is returned only when that cap would be exceeded.

Current HTTP counters include:

http.inbound_requests
http.accepted_produce_requests
http.rejected_produce_requests
http.malformed_request_count
http.response_status_counts
http.backpressure_rejected_count
http.records_accepted
http.request_payload_bytes_accepted
http.distinct_topics_seen
http.distinct_topic_partitions_seen
http.consume_requests_total
http.consume_request_errors_total
http.consume_records_returned_total
http.consume_bytes_returned_total
http.consume_partition_results_total
http.consume_blob_cache_hits_total
http.consume_blob_cache_misses_total
http.consume_blob_cache_full_object_gets_total
http.consume_blob_cache_range_gets_total
http.consume_long_poll_waiters_current
http.consume_long_poll_hits_total
http.consume_long_poll_timeouts_total
http.consume_backpressure_rejections_total

Current batching, Oxia, and S3 counters include:

enqueue and flush totals
partition initialization attempts, cache hits and misses, successes, and failures
offsets assigned and committed
index-entry writes
meta/control reads and CAS stats
pending append recovery and materialization
Oxia client errors by exception type
S3 PUT count, bytes, latency totals, and failures
s3.billing request-cost and bucket-footprint estimates

Current s3.billing fields include:

pricing_model
storage_usd_per_gb_month
put_usd_per_1000
get_usd_per_1000
bucket_usage_refresh_interval_seconds
bucket_usage_last_refreshed_at_ms
bucket_usage_scan_success
bucket_usage_scan_error
bucket_object_count
bucket_stored_bytes
bucket_stored_gb
estimated_request_cost_usd
estimated_monthly_storage_cost_usd

Notes:

the request-cost estimate is derived from observed put, list, get, and range_get counts
the bucket-size fields come from cached list_objects_v2 scans under the configured LLOG_ROOT_PREFIX
pricing is a hardcoded S3 Standard us-east-1 estimate, not an authoritative AWS bill

Status:

200 OK for /metrics
404 Not Found for any other GET path

`GET /metrics/prometheus`

Returns:

Prometheus text exposition with content type text/plain; version=0.0.4; charset=utf-8

Current exported families include:

leaderless_log_inflight_requests
leaderless_log_batch_waiters_current
leaderless_log_batch_buffer_payload_bytes
leaderless_log_batch_append_payload_bytes_enqueued_total
leaderless_log_s3_put_requests_total
leaderless_log_s3_put_uploaded_bytes_total
leaderless_log_s3_put_failures_total
leaderless_log_batch_queued_payload_bytes_current
leaderless_log_batch_inflight_flush_bytes_current
leaderless_log_batch_pending_batches_current
leaderless_log_batch_sealed_batches_current
leaderless_log_batch_oldest_pending_batch_age_seconds
leaderless_log_produce_requests_total
leaderless_log_consume_long_poll_waiters_current
leaderless_log_consume_requests_total
leaderless_log_consume_records_returned_total
leaderless_log_consume_bytes_returned_total
leaderless_log_consume_partition_results_total
leaderless_log_consume_blob_cache_hits_total
leaderless_log_consume_blob_cache_misses_total
leaderless_log_consume_blob_cache_full_object_gets_total
leaderless_log_consume_blob_cache_range_gets_total
leaderless_log_consume_long_poll_hits_total
leaderless_log_consume_long_poll_timeouts_total
leaderless_log_consume_backpressure_rejections_total
leaderless_log_consume_request_duration_seconds
leaderless_log_consume_blob_cache_request_peak_bytes
leaderless_log_batch_backpressure_rejections_total
leaderless_log_batch_flushes_total
leaderless_log_batch_queue_wait_seconds
leaderless_log_batch_seal_to_flush_start_seconds
leaderless_log_batch_flush_duration_seconds
leaderless_log_oxia_operations_total
leaderless_log_oxia_operation_duration_seconds
leaderless_log_s3_operations_total
leaderless_log_s3_operation_duration_seconds
leaderless_log_s3_estimated_request_cost_usd_total
leaderless_log_s3_bucket_stored_bytes
leaderless_log_s3_bucket_stored_gb
leaderless_log_s3_bucket_objects
leaderless_log_s3_estimated_monthly_storage_cost_usd
leaderless_log_s3_bucket_usage_refresh_timestamp_seconds
leaderless_log_s3_bucket_usage_refresh_failures_total
leaderless_log_shared_wal_blob_bytes

Status:

200 OK for /metrics/prometheus
404 Not Found for any other GET path

`POST /produce`

Request body:

{
  "topic_partitions": [
    {
      "topic": "orders",
      "partition": 0,
      "records": [
        "alpha",
        {"base64": "AAE="}
      ]
    }
  ]
}

Rules:

request body must be valid UTF-8 JSON
top-level payload must be an object
topic_partitions must be a non-empty array
each item must contain:
- topic: non-empty string
- partition: non-negative integer
- records: non-empty array
each record must be either:
- a JSON string, encoded to bytes with UTF-8
- an object with exactly one base64 field

Success response:

{
  "results": [
    {
      "topic": "orders",
      "partition": 0,
      "ok": true,
      "start_offset": 1,
      "end_offset": 2,
      "count": 2,
      "index_key": "llog/orders/partitions/0/index/00000000000000000002",
      "wal_uri": "s3://leaderless-log-wal/llog/wal-shared/<uuid>"
    }
  ],
  "success_count": 1,
  "error_count": 0
}

Failure item shape:

{
  "topic": "orders",
  "partition": 0,
  "ok": false,
  "error_type": "BackPressureRejected",
  "error": "buffer full"
}

Status semantics:

200 OK when every topic-partition batch succeeds
503 Service Unavailable when every result failed and every failure is BackPressureRejected
409 Conflict for any other response that contains at least one failed batch
400 Bad Request for missing Content-Length, empty bodies, malformed JSON, or invalid request shape
404 Not Found for any other POST path

Operational notes:

partitions are initialized lazily on write and cached in broker memory after the first successful initialization
one request can contain multiple topic-partition batches
there is no cross-partition transaction
mixed success is allowed and returned per batch
the endpoint is available only when the broker role includes write

`POST /consume`

Request body:

{
  "topic_partitions": [
    {
      "topic": "orders",
      "partition": 0,
      "fetch_offset": 1,
      "partition_max_bytes": 1048576
    }
  ],
  "max_wait_ms": 250,
  "min_bytes": 1,
  "max_bytes": 52428800
}

Rules:

request body must be valid UTF-8 JSON
top-level payload must be an object
topic_partitions must be a non-empty array
each item must contain:
- topic: non-empty string
- partition: non-negative integer
- fetch_offset: integer >= 1
optional fields:
- partition_max_bytes: positive integer
- max_wait_ms: non-negative integer, clamped by the broker cap
- min_bytes: non-negative integer
- max_bytes: positive integer

Success response:

{
  "results": [
    {
      "topic": "orders",
      "partition": 0,
      "ok": true,
      "high_watermark": 3,
      "start_offset": 1,
      "end_offset": 2,
      "next_fetch_offset": 3,
      "record_count": 2,
      "records": [
        {"offset": 1, "payload": "alpha"},
        {"offset": 2, "base64": "AAE="}
      ]
    }
  ],
  "success_count": 1,
  "error_count": 0
}

Failure item shape:

{
  "topic": "orders",
  "partition": 0,
  "ok": false,
  "error_type": "BlobNotFound",
  "error": "missing"
}

Status semantics:

200 OK when the request is parsed and the consume service returns results
503 Service Unavailable when the consume service rejects the request for backpressure
400 Bad Request for missing Content-Length, empty bodies, malformed JSON, or invalid request shape
404 Not Found when the broker role does not include read, or for any other POST path

Operational notes:

one request can fetch multiple topic-partitions
per-partition failures do not fail the whole request
records are returned as UTF-8 strings when decodable, otherwise as {"base64":"..."}
long polling is controlled by max_wait_ms and min_bytes
POST /consume checks the local tail cache before it falls back to Oxia/S3
a tail-cache response can also be "known empty at cached tail" with a high_watermark but no records, so long polls can sleep locally instead of probing Oxia/S3 again immediately
only cache misses, gaps, or evictions fall back to LeaderlessLogReader
when multiple requested partitions resolve to the same shared WAL object, the broker fetches that blob once per request and demuxes it in memory
recent local writes are eligible for a tail-cache hit before the reader falls back to Oxia/S3; the cache is in-process for same-process write+read brokers, or file-backed and cross-process when LLOG_TAIL_CACHE_DIR points writer and reader processes at the same local directory
LLOG_TAIL_CACHE_MAX_BYTES=0 disables the tail cache; the default cap is 536870912
the endpoint is available only when the broker role includes read

Definition Status

The HTTP API is moderately defined:

request validation is explicit in code
status code behavior is explicit in code
unit tests cover request parsing and status selection
there is no OpenAPI spec, versioned schema, auth layer, or broader admin/read API yet

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HTTP API

Endpoints

`GET /health`

`GET /metrics`

`GET /metrics/prometheus`

`POST /produce`

`POST /consume`

Definition Status

FilesExpand file tree

HTTP_API.md

Latest commit

History

HTTP_API.md

File metadata and controls

HTTP API

Endpoints

GET /health

GET /metrics

GET /metrics/prometheus

POST /produce

POST /consume

Definition Status

`GET /health`

`GET /metrics`

`GET /metrics/prometheus`

`POST /produce`

`POST /consume`