Row format msgpack writes silently dropped with 'no time data in batch' error on flush #401

@xe-nvdk

Summary

Row format MessagePack writes to /api/v1/write/msgpack are accepted with HTTP 204 but silently dropped during buffer flush. Arc logs Failed to flush aged buffer error="no time data in batch" and no Parquet files are written. Data is lost.

Columnar format writes to the same endpoint work correctly end-to-end.

Reproduction

Start Arc:

docker run -d --name arc -p 8000:8000 \
  -e ARC_AUTH_ENABLED=false \
  -e ARC_TELEMETRY_ENABLED=false \
  -e STORAGE_BACKEND=local \
  ghcr.io/basekick-labs/arc:latest

Send a row format payload (any shape works: single bare map, array of rows, or {"batch": [...]} wrapper):

package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"time"

	"github.com/klauspost/compress/zstd"
	"github.com/vmihailenco/msgpack/v5"
)

func main() {
	ts := time.Now().UnixMicro()

	// Single bare row map
	payload := map[string]any{
		"m": "row_test",
		"t": ts,
		"fields": map[string]any{
			"sensor": "temp-1",
			"value":  22.5,
		},
	}

	// Encode as MessagePack, then compress with zstd.
	// Encoding errors are ignored for brevity.
	data, _ := msgpack.Marshal(payload)
	var buf bytes.Buffer
	w, _ := zstd.NewWriter(&buf)
	w.Write(data)
	w.Close()

	req, _ := http.NewRequest("POST", "http://localhost:8000/api/v1/write/msgpack", bytes.NewReader(buf.Bytes()))
	req.Header.Set("Content-Type", "application/msgpack")
	req.Header.Set("Content-Encoding", "zstd")
	req.Header.Set("x-arc-database", "default")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err) // Arc is not reachable
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status=%d body=%s\n", resp.StatusCode, string(body))
}

Output:

status=204 body=

Wait ~6 seconds (MaxBufferAgeMS default 5000ms), then query:

curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM default.row_test"}'

Result:

{"success":false,"error":"Query execution failed","row_count":0}

SHOW TABLES FROM default returns empty. No Parquet files exist on disk.
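
For anyone scripting this check, the manual query step can be turned into an assertion. The following is a minimal sketch, not Arc client code: the queryResult struct models only the three fields visible in the response above, the rowsLanded helper is a hypothetical name, and treating "success is true and row_count is greater than zero" as the passing condition is an assumption, since this report only shows the failing response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryResult models only the fields visible in the failing response
// quoted in this issue. A successful response may carry more fields.
type queryResult struct {
	Success  bool   `json:"success"`
	Error    string `json:"error"`
	RowCount int    `json:"row_count"`
}

// rowsLanded (hypothetical helper) reports whether a query response body
// indicates that at least one row was stored. The success criterion is an
// assumption: success == true and row_count > 0.
func rowsLanded(body []byte) (bool, error) {
	var r queryResult
	if err := json.Unmarshal(body, &r); err != nil {
		return false, fmt.Errorf("decode query response: %w", err)
	}
	return r.Success && r.RowCount > 0, nil
}

func main() {
	// Response body copied from the reproduction above.
	body := []byte(`{"success":false,"error":"Query execution failed","row_count":0}`)
	ok, err := rowsLanded(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("rows landed:", ok)
}
```

With the response from this report, the helper returns false, which is the condition an end-to-end test would fail on.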

Arc logs

INF arrow_writer.go:1597 > Flushing aged buffer age=6699.66 buffer_key=default/row_test component=arrow-buffer shard=3
ERR arrow_writer.go:1607 > Failed to flush aged buffer error="no time data in batch" buffer_key=default/row_test component=arrow-buffer
ERR duckdb.go:234 > Query failed error="IO Error: No files found that match the pattern /app/data/arc/default/row_test/**/*.parquet"

Tested payload shapes (all fail)

All four variants below return 204 and are then dropped with the same error:

  1. Single bare row map: {"m": "...", "t": ..., "fields": {...}}
  2. Array of 1 row: [{"m": "...", "t": ..., "fields": {...}}]
  3. Array of 2 rows: [row1, row2]
  4. Batch wrapper: {"batch": [row1, row2]}

All use valid int64 microsecond timestamps in the t field.
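
The four shapes can be built from the same row map. This is a sketch for illustration only: the row and payloadShapes helpers are hypothetical names, and the second row's field values (temp-2, 23.0) are made up. The values are printed as JSON purely for readability; in the reproduction each one is passed to msgpack.Marshal instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// row builds one row-format record using the keys from the reproduction:
// "m" (measurement), "t" (int64 microseconds), and "fields".
func row(measurement string, ts int64, fields map[string]any) map[string]any {
	return map[string]any{"m": measurement, "t": ts, "fields": fields}
}

// payloadShapes returns the four tested shapes in the order listed above.
// The second row's values are illustrative, not taken from the report.
func payloadShapes(ts int64) []any {
	r1 := row("row_test", ts, map[string]any{"sensor": "temp-1", "value": 22.5})
	r2 := row("row_test", ts+1, map[string]any{"sensor": "temp-2", "value": 23.0})
	return []any{
		r1,                                     // 1. single bare row map
		[]any{r1},                              // 2. array of 1 row
		[]any{r1, r2},                          // 3. array of 2 rows
		map[string]any{"batch": []any{r1, r2}}, // 4. batch wrapper
	}
}

func main() {
	for i, p := range payloadShapes(time.Now().UnixMicro()) {
		b, err := json.Marshal(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("shape %d: %s\n", i+1, b)
	}
}
```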

Root cause investigation

The error no time data in batch comes from internal/ingest/arrow_writer.go:2104:

times, ok := merged.Data["time"].([]int64)
if !ok || len(times) == 0 {
    return fmt.Errorf("no time data in batch")
}

The flow for row format:

  1. decodeRow returns *models.Record with Time: time.Time set, Timestamp: 0
  2. Write() groups row records by measurement (arrow_writer.go:1177-1197)
  3. rowsToColumnar() converts to ColumnarRecord with columns["time"] = append(columns["time"], timestamp) where timestamp is int64 microseconds (arrow_writer.go:1118)
  4. writeColumnar() calls convertColumnsToTyped() which should produce typed[\"time\"] = []int64{...}
  5. On flush, merged.Data[\"time\"] is either missing or not []int64, triggering the error

Something between step 3 and step 5 is either dropping the time column or producing the wrong type. The columnar path works because it skips rowsToColumnar entirely and constructs the columns directly from the decoded payload.
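
To show why the type matters, here is a standalone sketch of the same assertion against a plain map. It is not Arc code: timeColumn is a hypothetical stand-in for the check quoted above, and the untyped-slice case is only one candidate explanation that has not been confirmed against the Arc source. A Go type assertion to []int64 succeeds only when the dynamic type is exactly []int64, so a []any holding int64 values fails the same way a missing column does.

```go
package main

import "fmt"

// timeColumn mirrors the quoted check: it succeeds only when the "time"
// entry exists, has dynamic type []int64, and is non-empty. The dynamic
// type is added to the error here to make the failure visible.
func timeColumn(data map[string]any) ([]int64, error) {
	times, ok := data["time"].([]int64)
	if !ok || len(times) == 0 {
		return nil, fmt.Errorf("no time data in batch (dynamic type %T)", data["time"])
	}
	return times, nil
}

func main() {
	ts := int64(1776124800000000) // arbitrary microsecond timestamp

	cases := []struct {
		name string
		data map[string]any
	}{
		{"typed []int64", map[string]any{"time": []int64{ts}}},
		{"untyped []any of int64", map[string]any{"time": []any{ts}}},
		{"missing column", map[string]any{}},
		{"empty []int64", map[string]any{"time": []int64{}}},
	}

	for _, c := range cases {
		_, err := timeColumn(c.data)
		fmt.Printf("%-24s err=%v\n", c.name, err)
	}
}
```

Only the first case passes. The other three all produce the same error, which is why the log line alone cannot distinguish a dropped column from a wrongly typed one.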

Impact

  • Silent data loss for any client using row format
  • x_benthos_extra tests, Telegraf when row format is selected, and any other integration using row-format msgpack are all affected
  • Discovered while building the Redpanda Connect arc output plugin (arc: add output plugin for Arc columnar database redpanda-data/connect#4236); that PR's row format integration test had to be downgraded to write-only verification because the end-to-end query check fails

Environment

  • Arc version: 26.01.2 (ghcr.io/basekick-labs/arc:latest as of 2026-04-14)
  • Storage: local
  • Default ingest config (MaxBufferAgeMS=5000, default flush workers)

Suggested next steps

  1. Add debug logging in rowsToColumnar and convertColumnsToTyped to trace what happens to the time column
  2. Add a unit test at the ArrowBuffer.Write() level that sends row records and asserts the Parquet file is produced
  3. Consider returning an error to the HTTP client when flush fails, rather than silent 204 + async drop
