This document describes the HTTP surface that is actually implemented in `server/leaderless_log_broker.py`.
The broker can run in three roles selected by `--role` or `LLOG_BROKER_ROLE`:
- `write`: enables `POST /produce`
- `read`: enables `POST /consume`
- `both`: enables both endpoints and is the default
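
As a rough illustration of the role gating described above, the following sketch maps each role to the `POST` paths it enables. The names `ROLE_ENDPOINTS` and `enabled_post_paths` are hypothetical and not taken from the broker source, and the precedence of the flag over the environment variable is an assumption; only the role names, the default, and the two paths come from this document.

```python
import os
from typing import Optional, Set

# Role -> POST paths enabled for that role, as listed above.
ROLE_ENDPOINTS = {
    "write": {"/produce"},
    "read": {"/consume"},
    "both": {"/produce", "/consume"},
}


def enabled_post_paths(cli_role: Optional[str] = None) -> Set[str]:
    """Resolve the broker role and return the POST paths it enables.

    Assumption: an explicit --role value wins over LLOG_BROKER_ROLE.
    The document does not state the actual precedence.
    """
    role = cli_role or os.environ.get("LLOG_BROKER_ROLE") or "both"
    if role not in ROLE_ENDPOINTS:
        raise ValueError(f"unknown broker role: {role!r}")
    return ROLE_ENDPOINTS[role]
```

A handler built this way would answer `404 Not Found` for a `POST` path that is not in the returned set.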
Returns:
{
"status": "ok",
"broker_id": "broker-1",
"host": "127.0.0.1",
"port": 8080,
"started_at_ms": 1760000000000
}

Status:
- `200 OK` when the broker process is up
- `404 Not Found` for any other `GET` path
Returns:
- JSON with content type `application/json`
- top-level objects:
  - `broker`
  - `http`
  - `batching`
  - `oxia`
  - `s3`
Minimum broker metadata in the snapshot:
- `broker.broker_id`
- `broker.host`
- `broker.port`
- `broker.started_at_ms`
- `broker.roles`
- `broker.batch_settings.max_bytes`
- `broker.batch_settings.max_delay_ms`
- `broker.batch_settings.max_buffer_bytes`
Batching semantics:
- `broker.batch_settings.max_bytes` is the sealed batch target. Once a batch reaches it, the batch is handed to the flusher and later appends go into a new batch.
- `broker.batch_settings.max_buffer_bytes` is the total uncommitted payload memory cap across the in-flight flushing batch and any queued future batches. Backpressure is returned only when that cap would be exceeded.
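
The interaction of the two limits can be sketched with a toy in-memory buffer. This is a simplified model under stated assumptions, not the broker's batching code: `BatchBuffer`, `append`, and `mark_committed` are hypothetical names, and the real flusher, `max_delay_ms` timer, and concurrency are left out.

```python
class BackPressureRejected(Exception):
    """Raised when accepting a payload would exceed max_buffer_bytes."""


class BatchBuffer:
    """Toy model: max_bytes seals a batch, max_buffer_bytes caps memory."""

    def __init__(self, max_bytes: int, max_buffer_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.max_buffer_bytes = max_buffer_bytes
        self.open_batch = []       # payloads in the batch still accepting appends
        self.open_bytes = 0
        self.sealed = []           # batches handed to the flusher, not yet committed
        self.uncommitted_bytes = 0  # open + sealed payload bytes

    def append(self, payload: bytes) -> None:
        # Backpressure only when the total uncommitted cap would be exceeded.
        if self.uncommitted_bytes + len(payload) > self.max_buffer_bytes:
            raise BackPressureRejected("buffer full")
        self.open_batch.append(payload)
        self.open_bytes += len(payload)
        self.uncommitted_bytes += len(payload)
        # Reaching the seal target hands the batch off; later appends start a new one.
        if self.open_bytes >= self.max_bytes:
            self.sealed.append(self.open_batch)
            self.open_batch, self.open_bytes = [], 0

    def mark_committed(self) -> None:
        """Release the memory of the oldest sealed batch once it is committed."""
        batch = self.sealed.pop(0)
        self.uncommitted_bytes -= sum(len(p) for p in batch)
```

In this model a full batch never rejects a write by itself; rejection happens only when sealed and open batches together would pass the memory cap.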
Current HTTP counters include:
- `http.inbound_requests`
- `http.accepted_produce_requests`
- `http.rejected_produce_requests`
- `http.malformed_request_count`
- `http.response_status_counts`
- `http.backpressure_rejected_count`
- `http.records_accepted`
- `http.request_payload_bytes_accepted`
- `http.distinct_topics_seen`
- `http.distinct_topic_partitions_seen`
- `http.consume_requests_total`
- `http.consume_request_errors_total`
- `http.consume_records_returned_total`
- `http.consume_bytes_returned_total`
- `http.consume_partition_results_total`
- `http.consume_blob_cache_hits_total`
- `http.consume_blob_cache_misses_total`
- `http.consume_blob_cache_full_object_gets_total`
- `http.consume_blob_cache_range_gets_total`
- `http.consume_long_poll_waiters_current`
- `http.consume_long_poll_hits_total`
- `http.consume_long_poll_timeouts_total`
- `http.consume_backpressure_rejections_total`
Current batching, Oxia, and S3 counters include:
- enqueue and flush totals
- partition initialization attempts, cache hits and misses, successes, and failures
- offsets assigned and committed
- index-entry writes
- `meta/control` reads and CAS stats
- pending append recovery and materialization
- Oxia client errors by exception type
- S3 PUT count, bytes, latency totals, and failures
- `s3.billing` request-cost and bucket-footprint estimates

Current `s3.billing` fields include:
- `pricing_model`
- `storage_usd_per_gb_month`
- `put_usd_per_1000`
- `get_usd_per_1000`
- `bucket_usage_refresh_interval_seconds`
- `bucket_usage_last_refreshed_at_ms`
- `bucket_usage_scan_success`
- `bucket_usage_scan_error`
- `bucket_object_count`
- `bucket_stored_bytes`
- `bucket_stored_gb`
- `estimated_request_cost_usd`
- `estimated_monthly_storage_cost_usd`
Notes:
- the request-cost estimate is derived from observed `put`, `list`, `get`, and `range_get` counts
- the bucket-size fields come from cached `list_objects_v2` scans under the configured `LLOG_ROOT_PREFIX`
- pricing is a hardcoded S3 Standard `us-east-1` estimate, not an authoritative AWS bill
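
One plausible way to derive the request-cost estimate from the four observed counts is sketched below. The function name is hypothetical, and the grouping is an assumption: `list` calls are charged at the `put_usd_per_1000` rate and `range_get` calls at the `get_usd_per_1000` rate, which mirrors how S3 prices its request classes but is not confirmed by the broker source. Prices are passed in rather than hardcoded so no specific rate is implied.

```python
def estimate_request_cost_usd(
    counts: dict,
    put_usd_per_1000: float,
    get_usd_per_1000: float,
) -> float:
    """Estimate S3 request cost in USD from observed operation counts.

    Assumption: list is billed like put, and range_get like get.
    """
    put_class = counts.get("put", 0) + counts.get("list", 0)
    get_class = counts.get("get", 0) + counts.get("range_get", 0)
    return (
        put_class / 1000 * put_usd_per_1000
        + get_class / 1000 * get_usd_per_1000
    )
```

Like the broker's own figure, a number computed this way is an estimate for capacity planning and should not be read as a bill.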
Status:
- `200 OK` for `/metrics`
- `404 Not Found` for any other `GET` path
Returns:
- Prometheus text exposition with content type `text/plain; version=0.0.4; charset=utf-8`
Current exported families include:
- `leaderless_log_inflight_requests`
- `leaderless_log_batch_waiters_current`
- `leaderless_log_batch_buffer_payload_bytes`
- `leaderless_log_batch_append_payload_bytes_enqueued_total`
- `leaderless_log_s3_put_requests_total`
- `leaderless_log_s3_put_uploaded_bytes_total`
- `leaderless_log_s3_put_failures_total`
- `leaderless_log_batch_queued_payload_bytes_current`
- `leaderless_log_batch_inflight_flush_bytes_current`
- `leaderless_log_batch_pending_batches_current`
- `leaderless_log_batch_sealed_batches_current`
- `leaderless_log_batch_oldest_pending_batch_age_seconds`
- `leaderless_log_produce_requests_total`
- `leaderless_log_consume_long_poll_waiters_current`
- `leaderless_log_consume_requests_total`
- `leaderless_log_consume_records_returned_total`
- `leaderless_log_consume_bytes_returned_total`
- `leaderless_log_consume_partition_results_total`
- `leaderless_log_consume_blob_cache_hits_total`
- `leaderless_log_consume_blob_cache_misses_total`
- `leaderless_log_consume_blob_cache_full_object_gets_total`
- `leaderless_log_consume_blob_cache_range_gets_total`
- `leaderless_log_consume_long_poll_hits_total`
- `leaderless_log_consume_long_poll_timeouts_total`
- `leaderless_log_consume_backpressure_rejections_total`
- `leaderless_log_consume_request_duration_seconds`
- `leaderless_log_consume_blob_cache_request_peak_bytes`
- `leaderless_log_batch_backpressure_rejections_total`
- `leaderless_log_batch_flushes_total`
- `leaderless_log_batch_queue_wait_seconds`
- `leaderless_log_batch_seal_to_flush_start_seconds`
- `leaderless_log_batch_flush_duration_seconds`
- `leaderless_log_oxia_operations_total`
- `leaderless_log_oxia_operation_duration_seconds`
- `leaderless_log_s3_operations_total`
- `leaderless_log_s3_operation_duration_seconds`
- `leaderless_log_s3_estimated_request_cost_usd_total`
- `leaderless_log_s3_bucket_stored_bytes`
- `leaderless_log_s3_bucket_stored_gb`
- `leaderless_log_s3_bucket_objects`
- `leaderless_log_s3_estimated_monthly_storage_cost_usd`
- `leaderless_log_s3_bucket_usage_refresh_timestamp_seconds`
- `leaderless_log_s3_bucket_usage_refresh_failures_total`
- `leaderless_log_shared_wal_blob_bytes`
Status:
- `200 OK` for `/metrics/prometheus`
- `404 Not Found` for any other `GET` path
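
For quick checks without a Prometheus server, the text exposition can be read with a few lines of Python. The parser below is a minimal sketch, not a full implementation of the exposition format: it assumes one sample per line with no trailing timestamp, and it keeps any label set as part of the key. The sample values and the `# TYPE` line are made up for illustration; only the family names come from the list above.

```python
def parse_samples(text: str) -> dict:
    """Parse Prometheus text exposition into {series: value}.

    Assumption: lines look like 'name value' or 'name{labels} value',
    without an optional timestamp after the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


# Illustrative payload; real output from /metrics/prometheus will differ.
SAMPLE = """\
# TYPE leaderless_log_inflight_requests gauge
leaderless_log_inflight_requests 3
leaderless_log_s3_bucket_stored_bytes 1.048576e+06
"""

metrics = parse_samples(SAMPLE)
```

For anything beyond ad hoc inspection, an existing Prometheus client parser is likely a better choice than this sketch.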
Request body:
{
"topic_partitions": [
{
"topic": "orders",
"partition": 0,
"records": [
"alpha",
{"base64": "AAE="}
]
}
]
}

Rules:
- request body must be valid UTF-8 JSON
- top-level payload must be an object
- `topic_partitions` must be a non-empty array
- each item must contain:
  - `topic`: non-empty string
  - `partition`: non-negative integer
  - `records`: non-empty array
- each record must be either:
  - a JSON string, encoded to bytes with UTF-8
  - an object with exactly one `base64` field
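
The rules above can be expressed as a small validator. This is a sketch of the documented request shape, not the broker's actual parsing code: `record_to_bytes` and `validate_produce_payload` are hypothetical names, the error messages are invented, and rejecting JSON booleans as `partition` values is a choice made here because `bool` is a subclass of `int` in Python.

```python
import base64


def record_to_bytes(record) -> bytes:
    """Convert one JSON record into the bytes that would be appended."""
    if isinstance(record, str):
        return record.encode("utf-8")
    if (
        isinstance(record, dict)
        and set(record) == {"base64"}
        and isinstance(record["base64"], str)
    ):
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(record["base64"], validate=True)
    raise ValueError("record must be a string or an object with exactly one 'base64' field")


def validate_produce_payload(payload) -> list:
    """Return [(topic, partition, [bytes, ...]), ...] or raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("top-level payload must be an object")
    items = payload.get("topic_partitions")
    if not isinstance(items, list) or not items:
        raise ValueError("topic_partitions must be a non-empty array")
    batches = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each topic_partitions item must be an object")
        topic = item.get("topic")
        partition = item.get("partition")
        records = item.get("records")
        if not isinstance(topic, str) or not topic:
            raise ValueError("topic must be a non-empty string")
        if isinstance(partition, bool) or not isinstance(partition, int) or partition < 0:
            raise ValueError("partition must be a non-negative integer")
        if not isinstance(records, list) or not records:
            raise ValueError("records must be a non-empty array")
        batches.append((topic, partition, [record_to_bytes(r) for r in records]))
    return batches
```

Applied to the example request body, this yields one batch for `orders` partition `0` with the payloads `b"alpha"` and `b"\x00\x01"`.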
Success response:
{
"results": [
{
"topic": "orders",
"partition": 0,
"ok": true,
"start_offset": 1,
"end_offset": 2,
"count": 2,
"index_key": "llog/orders/partitions/0/index/00000000000000000002",
"wal_uri": "s3://leaderless-log-wal/llog/wal-shared/<uuid>"
}
],
"success_count": 1,
"error_count": 0
}

Failure item shape:
{
"topic": "orders",
"partition": 0,
"ok": false,
"error_type": "BackPressureRejected",
"error": "buffer full"
}

Status semantics:
- `200 OK` when every topic-partition batch succeeds
- `503 Service Unavailable` when every result failed and every failure is `BackPressureRejected`
- `409 Conflict` for any other response that contains at least one failed batch
- `400 Bad Request` for missing `Content-Length`, empty bodies, malformed JSON, or invalid request shape
- `404 Not Found` for any other `POST` path
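
The first three rules depend only on the per-batch results, so they can be written as a pure function. The sketch below assumes the request has already passed validation and routing (which is where `400` and `404` are decided); `produce_status` is a hypothetical name, not a function from the broker source.

```python
def produce_status(results: list) -> int:
    """Pick the HTTP status for a parsed /produce request from its results."""
    failures = [r for r in results if not r["ok"]]
    if not failures:
        return 200  # every topic-partition batch succeeded
    all_failed = len(failures) == len(results)
    if all_failed and all(f["error_type"] == "BackPressureRejected" for f in failures):
        return 503  # nothing was accepted, and only because of backpressure
    return 409  # at least one failure that is not pure backpressure


ok = {"topic": "orders", "partition": 0, "ok": True}
busy = {"topic": "orders", "partition": 1, "ok": False,
        "error_type": "BackPressureRejected", "error": "buffer full"}
mixed_status = produce_status([ok, busy])
```

Note that a mixed response, where some batches succeed and others hit backpressure, falls under `409` rather than `503`, so clients should inspect each result instead of relying on the status code alone.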
Operational notes:
- partitions are initialized lazily on write and cached in broker memory after the first successful initialization
- one request can contain multiple topic-partition batches
- there is no cross-partition transaction
- mixed success is allowed and returned per batch
- the endpoint is available only when the broker role includes `write`
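
From the client side, a produce call could be assembled as follows. This is a sketch using only the standard library: `build_produce_request` is a hypothetical helper, the base URL reuses the host and port from the health example, and sending `Content-Type: application/json` is an assumption, since this document only states that `Content-Length` is required.

```python
import base64
import json
import urllib.request


def build_produce_request(base_url: str, topic: str, partition: int, payloads: list):
    """Build a POST /produce request for one topic-partition batch."""
    records = []
    for payload in payloads:
        try:
            # Valid UTF-8 payloads are sent as plain JSON strings.
            records.append(payload.decode("utf-8"))
        except UnicodeDecodeError:
            # Everything else is wrapped in a single-field base64 object.
            records.append({"base64": base64.b64encode(payload).decode("ascii")})
    body = json.dumps(
        {"topic_partitions": [{"topic": topic, "partition": partition, "records": records}]}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/produce",
        data=body,
        headers={"Content-Type": "application/json", "Content-Length": str(len(body))},
        method="POST",
    )


request = build_produce_request("http://127.0.0.1:8080", "orders", 0, [b"alpha", b"\xff\xfe"])
```

Sending it with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` for the `409` and `503` cases, so the per-batch results have to be read from the error object's body in those cases.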
Request body:
{
"topic_partitions": [
{
"topic": "orders",
"partition": 0,
"fetch_offset": 1,
"partition_max_bytes": 1048576
}
],
"max_wait_ms": 250,
"min_bytes": 1,
"max_bytes": 52428800
}

Rules:
- request body must be valid UTF-8 JSON
- top-level payload must be an object
- `topic_partitions` must be a non-empty array
- each item must contain:
  - `topic`: non-empty string
  - `partition`: non-negative integer
  - `fetch_offset`: integer `>= 1`
- optional fields:
  - `partition_max_bytes`: positive integer
  - `max_wait_ms`: non-negative integer, clamped by the broker cap
  - `min_bytes`: non-negative integer
  - `max_bytes`: positive integer
Success response:
{
"results": [
{
"topic": "orders",
"partition": 0,
"ok": true,
"high_watermark": 3,
"start_offset": 1,
"end_offset": 2,
"next_fetch_offset": 3,
"record_count": 2,
"records": [
{"offset": 1, "payload": "alpha"},
{"offset": 2, "base64": "AAE="}
]
}
],
"success_count": 1,
"error_count": 0
}

Failure item shape:
{
"topic": "orders",
"partition": 0,
"ok": false,
"error_type": "BlobNotFound",
"error": "missing"
}

Status semantics:
- `200 OK` when the request is parsed and the consume service returns results
- `503 Service Unavailable` when the consume service rejects the request for backpressure
- `400 Bad Request` for missing `Content-Length`, empty bodies, malformed JSON, or invalid request shape
- `404 Not Found` when the broker role does not include `read`, or for any other `POST` path
Operational notes:
- one request can fetch multiple topic-partitions
- per-partition failures do not fail the whole request
- records are returned as UTF-8 strings when decodable, otherwise as `{"base64":"..."}`
- long polling is controlled by `max_wait_ms` and `min_bytes`
- `POST /consume` checks the local tail cache before it falls back to Oxia/S3
- a tail-cache response can also be "known empty at cached tail" with a `high_watermark` but no records, so long polls can sleep locally instead of probing Oxia/S3 again immediately
- only cache misses, gaps, or evictions fall back to `LeaderlessLogReader`
- when multiple requested partitions resolve to the same shared WAL object, the broker fetches that blob once per request and demuxes it in memory
- recent local writes are eligible for a tail-cache hit before the reader falls back to Oxia/S3; the cache is in-process for same-process `write+read` brokers, or file-backed and cross-process when `LLOG_TAIL_CACHE_DIR` points writer and reader processes at the same local directory
- `LLOG_TAIL_CACHE_MAX_BYTES=0` disables the tail cache; the default cap is `536870912`
- the endpoint is available only when the broker role includes `read`
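
A consumer has to undo the string-or-base64 encoding and carry `next_fetch_offset` forward per partition. The sketch below shows one way to do that against the response shape documented above; `record_bytes` and `apply_consume_response` are hypothetical helper names, and leaving the offset unchanged on a failed partition is a client-side choice, not broker behavior.

```python
import base64
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (topic, partition)


def record_bytes(record: dict) -> bytes:
    """Return the raw payload of one consumed record."""
    if "payload" in record:
        return record["payload"].encode("utf-8")
    return base64.b64decode(record["base64"])


def apply_consume_response(
    offsets: Dict[Key, int], response: dict
) -> List[Tuple[str, int, int, bytes]]:
    """Collect (topic, partition, offset, payload) and advance fetch offsets."""
    delivered = []
    for result in response["results"]:
        key = (result["topic"], result["partition"])
        if not result["ok"]:
            continue  # per-partition failure: keep the old offset and retry later
        for record in result.get("records", []):
            delivered.append((key[0], key[1], record["offset"], record_bytes(record)))
        if "next_fetch_offset" in result:
            offsets[key] = result["next_fetch_offset"]
    return delivered
```

The updated `offsets` mapping supplies the `fetch_offset` values for the next `POST /consume` request, which together with `max_wait_ms` and `min_bytes` forms a simple long-poll loop.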
The HTTP API is moderately defined:
- request validation is explicit in code
- status code behavior is explicit in code
- unit tests cover request parsing and status selection
- there is no OpenAPI spec, versioned schema, auth layer, or broader admin/read API yet