Perf Debugging Reference

This is the operator reference for load-test debugging. Use it when a run has unexpected throughput, latency, error, routing, or compaction behavior.

First Files To Open

Every orchestrated run writes artifacts under:

loadtest/results/<utc-timestamp>-<scenario-name>/

Start with these files:

File	Use
`summary.json`	Final collector rollup, artifact presence, reconciliation warnings, latest broker/process/host totals.
`client-summary.json`	Hammer totals, latency percentiles, per-broker routing counts, and client trace stage summaries.
`client-30s.jsonl`	Interval hammer counts, rates, latency summaries, and client trace stage summaries.
`metrics-30s.jsonl`	Interval broker JSON metrics, process samples, host samples, aggregate totals, and interval deltas.
`logs/http-hammer.log`	Hammer stdout/stderr and fatal client-side failures.
`logs/<broker-id>.log`	Broker stdout/stderr and request trace write failures.
`logs/http-hammer-request-trace.jsonl`	Per-request client trace records keyed by `request_id`.
`logs/<broker-id>-request-trace.jsonl`	Per-request broker trace records keyed by `request_id`.
`logs/<broker-id>-batch-flush-trace.jsonl`	Per-flush broker batch trace records with queue-wait summaries and nested writer stage totals.

When a latency gap appears, join the two request trace files by request_id. Client trace stages are in milliseconds. Broker trace stages and Prometheus histograms are in seconds.

Run Artifact Inventory

Artifact	Producer	Notes
`resolved-config.json`	`loadtest.orchestrate`	Effective scenario config for the run.
`stack.json`	`loadtest.orchestrate`	Generated stack resources and paths.
`processes.json`	`loadtest.orchestrate`	Process metadata, PIDs, running status, and log paths.
`brokers.json`	`loadtest.orchestrate`	Broker IDs and HTTP addresses.
`ports.json`	`loadtest.orchestrate`	Reserved local TCP ports.
`environment.json`	`loadtest.orchestrate`	Captured environment subset used by the run.
`git.json`	`loadtest.orchestrate`	Repo revision metadata.
`routing-manifest.json`	`loadtest.routing`	Deterministic topic-partition owner sets.
`partition-manifest.json`	`loadtest.precreate`	Partition pre-create attempts and per-entry status.
`client-30s.jsonl`	`loadtest/http_hammer.go`	One JSON object per hammer interval.
`client-summary.json`	`loadtest/http_hammer.go`	Final hammer summary.
`remap-events.jsonl`	`loadtest/http_hammer.go`	Routing remaps emitted by owner-set routing.
`metrics-30s.jsonl`	`loadtest.metrics_collector`	One JSON object per collector interval.
`summary.json`	`loadtest.metrics_collector`	Final collector summary and reconciliation view.
`logs/http-hammer-request-trace.jsonl`	`loadtest/http_hammer.go`	Per-request client spans and errors.
`logs/<broker-id>-request-trace.jsonl`	`server.leaderless_log_broker`	Per-request broker spans and response summaries.
`logs/<broker-id>-batch-flush-trace.jsonl`	`server.leaderless_log_writer`	Per-flush batch trace with waiter queue-wait stats, group shape, and nested writer stage totals.
`logs/http-hammer.log`	`loadtest.orchestrate`	Hammer process log.
`logs/metrics-collector.log`	`loadtest.orchestrate`	Collector process log.
`logs/<broker-id>.log`	`loadtest.orchestrate`	Broker process log.
`logs/minio.log`	`loadtest.orchestrate`	MinIO process log.
`logs/oxia-standalone.log`	`loadtest.orchestrate`	Standalone Oxia process log.
`logs/oxia-coordinator.log`	`loadtest.orchestrate`	Clustered Oxia coordinator log.
`logs/oxia-dataserver-<n>.log`	`loadtest.orchestrate`	Clustered Oxia data server logs.
`logs/oxia-settle.log`	`loadtest.orchestrate`	Oxia assignment settle progress.
`logs/precreate.log`	`loadtest.orchestrate`	Partition pre-create subprocess output.
`logs/prometheus-target-sync-<reason>.log`	`loadtest.orchestrate`	Host-loopback Prometheus target sync output, when available.

Request-Path Tracing

Hammer sends X-Llog-Request-Id on every produce request. The same ID appears in both request trace files.

Client request trace fields:

Field	Meaning
`request_id`	Correlation key shared with the broker trace.
`broker_id`	Target broker selected by routing.
`started_at_ms`	Client-side request start time.
`status_code`	HTTP status, or `0` when the request failed before a response.
`latency_ms`	Total client-observed request latency.
`topic_partition_count`	Number of topic-partitions in the request group.
`record_count`	Records attempted by this request.
`payload_bytes`	Intended payload bytes.
`stages`	Client-side span timings in milliseconds.
`error_type`, `error`	Present when the request failed.

Client trace stages:

Stage	Meaning
`get_conn`	Time from `httptrace.GetConn` to `GotConn`.
`tcp_connect`	TCP connect time for new connections. Absent for reused connections.
`got_conn_to_wrote_headers`	Delay after connection acquisition before headers are written.
`request_write`	Time from headers written to full request written.
`request_write_to_first_byte`	Time from request write completion to first response byte.
`response_body_read`	Time to read the response body after headers arrived.
`round_trip`	Time inside the Go HTTP request.
`client_total`	Total measured by the hammer worker around the request call.

Broker request trace fields:

Field	Meaning
`request_id`	Correlation key shared with the client trace.
`broker_id`	Broker process that handled the request.
`method`, `path`, `status`, `result`	HTTP summary.
`started_at_ms`	Broker handler start time.
`duration_seconds`	Broker-observed request duration.
`stages`	Broker-side span timings in seconds.
`request`	Content length, topic-partition count, record count, and payload bytes.
`response`	Success and error counts when a produce response was built.
`error_type`, `error`	Present for validation or handler errors.

Broker request stages:

Stage	Meaning
`read_body`	Read the HTTP request body.
`json_decode`	Decode the JSON request body.
`parse_batches`	Validate and convert JSON into produce batches.
`initialize_partitions`	Run partition initialization for all batches.
`append_shared`	Enqueue and wait for the shared WAL append path.
`produce_total`	Total produce service time; this includes initialization and append time.
`response_encode`	Encode the JSON response.
`response_write`	Write HTTP status, headers, and response bytes.
`request_total`	Full broker handler duration.

Batch append stages:

Stage	Meaning
`enqueue_lock_wait`	Time waiting to enter the batcher condition lock.
`enqueue_lock_hold`	Time spent inside the batcher condition lock while enqueueing.
`future_wait`	Time waiting for enqueued requests to complete after leaving the lock.

Partition initialization results:

Result	Meaning
`cached`	The topic-partition was already initialized in this broker process.
`success`	The writer initialized the topic-partition during this request.
`error`	Initialization raised an exception.

Metrics Endpoints

Broker and compactor services expose:

GET /metrics
GET /metrics/prometheus

/metrics is the full JSON process-local snapshot. It contains these main sections:

Section	Contents
`broker`	Broker identity, host, port, start time, and batch settings.
`http`	Produce request counts, status counts, in-flight requests, accepted records/bytes, request duration totals, and request-stage histograms.
`batching`	Batch enqueue counts, buffer depth, flush counts/durations, partition initialization counts/durations, queue wait, append-stage durations, backpressure, and shared WAL blob totals.
`oxia`	Offset assignment, index writes, control key operations, recoveries, client errors, operation counts, and operation duration totals.
`s3`	Put counts, uploaded bytes, failures, operation counts, and operation duration totals.
`compactor`	Service identity, discovery state, worker counts, scheduler/claim state, claim counters, and discovery/attempt durations.
`compaction`	Compaction run counts, durations, lag, pending state, recovery counts, bytes, records, stage timings, candidate shape, and index key deletion counts.

/metrics/prometheus is the scrape surface for Prometheus and Grafana Explore.

Prometheus Metrics

Gauges:

Metric	Labels	Use
`leaderless_log_inflight_requests`	none	Current in-flight broker produce requests.
`leaderless_log_batch_waiters_current`	none	Current waiters buffered in the batcher.
`leaderless_log_batch_buffer_payload_bytes`	none	Current uncommitted batch payload bytes, including an in-flight sealed flush plus queued future batches.
`leaderless_log_batch_queued_payload_bytes_current`	none	Current queued payload bytes still waiting for a flush to start.
`leaderless_log_batch_inflight_flush_bytes_current`	none	Current payload bytes in the batch actively being flushed.
`leaderless_log_batch_pending_batches_current`	none	Current count of pending batches not yet handed to the underlying writer.
`leaderless_log_batch_sealed_batches_current`	none	Current count of sealed batches waiting for the flusher.
`leaderless_log_batch_oldest_pending_batch_age_seconds`	none	Age of the oldest batch still pending in the batcher.
`leaderless_log_compactor_known_targets`	none	Known compactor target count.
`leaderless_log_compactor_discovered_targets`	none	Discovered target count beyond static config.
`leaderless_log_compactor_inflight_partitions`	none	Topic-partitions currently held by compactor workers.
`leaderless_log_compactor_workers`	none	Configured compactor worker count.
`leaderless_log_compactor_consecutive_errors`	none	Current consecutive compactor attempt errors.
`leaderless_log_compactions_inflight`	none	Current `compact_once` runs.
`leaderless_log_compaction_lag_offsets`	`topic`, `partition`	Per-partition lag, only when partition metrics are enabled.
`leaderless_log_compaction_avg_lag_offsets`	none	Average compaction lag across known partitions.
`leaderless_log_compaction_max_lag_offsets`	none	Maximum compaction lag across known partitions.
`leaderless_log_compaction_last_success_timestamp_seconds`	none	Unix timestamp of last successful or recovered compaction.
`leaderless_log_compaction_pending_current`	`state`	Current pending compactions by state.
`leaderless_log_compaction_oldest_pending_age_seconds`	none	Age of the oldest pending compaction.

Counters:

Metric	Labels	Use
`leaderless_log_produce_requests_total`	`result`	Accepted vs rejected produce requests.
`leaderless_log_batch_backpressure_rejections_total`	none	Append attempts rejected because total uncommitted payload bytes would exceed the batch buffer cap.
`leaderless_log_batch_flushes_total`	`trigger`, `result`	Batch flush outcomes by trigger.
`leaderless_log_oxia_operations_total`	`operation`, `result`	Oxia operation outcomes.
`leaderless_log_s3_operations_total`	`operation`, `result`	S3 operation outcomes.
`leaderless_log_compactor_attempts_total`	`result`	Compactor attempt outcomes.
`leaderless_log_compactor_discovery_scans_total`	none	Compactor discovery scans.
`leaderless_log_compactor_discovery_additions_total`	none	Targets added by discovery.
`leaderless_log_compactor_claim_attempts_total`	`result`	Distributed claim outcomes.
`leaderless_log_compaction_runs_total`	`result`	Compaction run outcomes.
`leaderless_log_compaction_recoveries_total`	`kind`	Compaction recovery events.
`leaderless_log_compacted_records_total`	none	Records included in compacted blobs.
`leaderless_log_compaction_input_bytes_total`	none	Bytes read by compaction.
`leaderless_log_compaction_output_bytes_total`	none	Bytes written by compaction.
`leaderless_log_compaction_index_keys_deleted_total`	none	Index keys deleted by compaction cleanup.

Histograms and summaries:

Metric	Labels	Use
`leaderless_log_request_stage_duration_seconds`	`stage`, `result`	Broker-side produce request stage timing.
`leaderless_log_batch_queue_wait_seconds`	none	Time waiting in the batcher before flush starts.
`leaderless_log_batch_seal_to_flush_start_seconds`	none	Time between sealing a batch and starting its flush.
`leaderless_log_batch_append_stage_duration_seconds`	`stage`	Request-side batch append timing.
`leaderless_log_partition_initialize_duration_seconds`	`result`	Partition initialization timing by result.
`leaderless_log_partition_initialize_lock_wait_seconds`	`result`	Time waiting for the initialization underlying lock.
`leaderless_log_batch_flush_duration_seconds`	`trigger`	Time inside one batch flush.
`leaderless_log_oxia_operation_duration_seconds`	`operation`, `result`	Oxia operation latency.
`leaderless_log_s3_operation_duration_seconds`	`operation`, `result`	S3 operation latency.
`leaderless_log_compactor_scheduler_pass_duration_seconds`	none	Compactor scheduler pass duration.
`leaderless_log_compactor_discovery_duration_seconds`	none	Compactor discovery scan duration.
`leaderless_log_compactor_attempt_duration_seconds`	`result`	Compactor attempt duration.
`leaderless_log_compactor_claim_hold_duration_seconds`	none	Time a distributed compactor claim was held.
`leaderless_log_compaction_duration_seconds`	`result`	End-to-end compaction duration.
`leaderless_log_compaction_stage_duration_seconds`	`stage`, `result`	Per-stage compaction duration.
`leaderless_log_compaction_candidate_entries`	none	Index entries selected for a compaction candidate.
`leaderless_log_compaction_candidate_records`	none	Records selected for a compaction candidate.
`leaderless_log_compaction_candidate_span_offsets`	none	Offset span selected for a compaction candidate.
`leaderless_log_shared_wal_blob_bytes`	none	Summary of encoded shared WAL blob size.

Grafana

The source dashboard is:

grafana/leaderless-log-observability.json

On the host monitoring stack, the provisioned dashboard copy is:

/home/ops/services/monitoring/grafana/provisioning/dashboards/leaderless-log-observability.json

The dashboard includes a REQUEST PATH row with avg and p99 panels for the request-stage, batch append-stage, partition initialization, and partition initialization lock-wait histograms. Use Grafana Explore or Prometheus directly for ad hoc label filtering beyond those panels.

Useful PromQL while investigating the request-path gap:

histogram_quantile(
  0.99,
  sum by (instance, le, stage, result) (
    rate(leaderless_log_request_stage_duration_seconds_bucket{job=~"$job",instance=~"$instance"}[$rate_interval])
  )
)

sum by (instance, stage) (
  rate(leaderless_log_batch_append_stage_duration_seconds_sum{job=~"$job",instance=~"$instance"}[$rate_interval])
)
/
sum by (instance, stage) (
  rate(leaderless_log_batch_append_stage_duration_seconds_count{job=~"$job",instance=~"$instance"}[$rate_interval])
)

sum by (instance, result) (
  rate(leaderless_log_partition_initialize_lock_wait_seconds_sum{job=~"$job",instance=~"$instance"}[$rate_interval])
)
/
sum by (instance, result) (
  rate(leaderless_log_partition_initialize_lock_wait_seconds_count{job=~"$job",instance=~"$instance"}[$rate_interval])
)

The request trace JSONL files are not in Grafana today unless a separate log shipping path is added.

Common Debug Flow

Open summary.json and check warnings, artifact_presence, reconciliation, latest_totals, and cumulative_totals.
Open client-summary.json and compare throughput, latency, connect_error_count, timeout_count, remaps, unroutable partitions, and client_trace_stage_ms.
Open metrics-30s.jsonl and compare broker totals and interval deltas against the client intervals.
If broker HTTP latency is high, inspect leaderless_log_request_stage_duration_seconds by stage and result.
If append_shared dominates, inspect leaderless_log_batch_append_stage_duration_seconds, leaderless_log_batch_queue_wait_seconds, leaderless_log_batch_seal_to_flush_start_seconds, leaderless_log_batch_flush_duration_seconds, leaderless_log_batch_pending_batches_current, leaderless_log_batch_sealed_batches_current, leaderless_log_batch_queued_payload_bytes_current, leaderless_log_batch_inflight_flush_bytes_current, leaderless_log_oxia_operation_duration_seconds, and leaderless_log_s3_operation_duration_seconds. Then open logs/<broker-id>-batch-flush-trace.jsonl to correlate slow flushes with waiter queue-wait, group count, payload bytes, and nested writer stage totals.
If initialize_partitions dominates, inspect leaderless_log_partition_initialize_duration_seconds and leaderless_log_partition_initialize_lock_wait_seconds.
If client latency is higher than broker request_total, join logs/http-hammer-request-trace.jsonl and logs/<broker-id>-request-trace.jsonl by request_id and compare client request_write_to_first_byte against broker request_total.
If errors or remaps spike, inspect remap-events.jsonl, logs/http-hammer.log, logs/<broker-id>.log, and Oxia logs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf Debugging Reference

First Files To Open

Run Artifact Inventory

Request-Path Tracing

Metrics Endpoints

Prometheus Metrics

Grafana

Common Debug Flow

FilesExpand file tree

PERF_DEBUGGING.md

Latest commit

History

PERF_DEBUGGING.md

File metadata and controls

Perf Debugging Reference

First Files To Open

Run Artifact Inventory

Request-Path Tracing

Metrics Endpoints

Prometheus Metrics

Grafana

Common Debug Flow