Remove @pytest.mark.benchmark so the read throughput tests are included
in the default `make test` filter as parametrize-marked tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
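
For context on the commit above: `@pytest.mark.benchmark` is the marker convention used by pytest-benchmark, and default test targets commonly deselect it with a `-m` expression. A minimal sketch of such a setup — the marker registration and the `make test` invocation shown here are illustrative assumptions, not this repository's actual files:

```ini
# pytest.ini — register the marker so runs with --strict-markers stay clean
[pytest]
markers =
    benchmark: read-throughput benchmark tests, excluded from the default filter
```

A `make test` target that excludes benchmarks would then invoke something like `pytest -m "not benchmark"`; dropping the `@pytest.mark.benchmark` decorator from a parametrized test makes it match the default filter again, which is what this commit does.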
`mkdocs/docs/api.md` (8 additions, 8 deletions)
```diff
@@ -389,16 +389,16 @@ for buf in tbl.scan().to_arrow_batch_reader(order=ScanOrder.ARRIVAL, concurrent_
 
 Within each file, batch ordering always follows row order. The `limit` parameter is enforced correctly regardless of configuration.
 
-!!! tip "Which configuration should I use?"
+**Which configuration should I use?**
 
-    | Use case | Recommended config |
-    |---|---|
-    | Small tables, simple queries | Default — no extra args needed |
-    | Large tables, memory-constrained | `streaming=True` — one file at a time, minimal memory |
-    | Maximum throughput with bounded memory | `streaming=True, concurrent_files=N` — tune N to balance throughput vs memory |
-    | Fine-grained batch control | Add `batch_size=N` to any of the above |
+| Use case | Recommended config |
+|---|---|
+| Small tables, simple queries | Default — no extra args needed |
+| Large tables, memory-constrained | `streaming=True` — one file at a time, minimal memory |
+| Maximum throughput with bounded memory | `streaming=True, concurrent_files=N` — tune N to balance throughput vs memory |
+| Fine-grained batch control | Add `batch_size=N` to any of the above |
 
-    **Note:** `streaming=True` yields batches in arrival order (interleaved across files when `concurrent_files > 1`). For deterministic file ordering, use the default non-streaming mode. `batch_size` is usually an advanced tuning knob — the PyArrow default of 131,072 rows works well for most workloads.
+**Note:** `streaming=True` yields batches in arrival order (interleaved across files when `concurrent_files > 1`). For deterministic file ordering, use the default non-streaming mode. `batch_size` is usually an advanced tuning knob — the PyArrow default of 131,072 rows works well for most workloads.
 
 To avoid any type inconsistencies during writing, you can convert the Iceberg table schema to Arrow:
```
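
The arrival-order behaviour described in the note above can be sketched with a small, library-free simulation. No PyIceberg is required; the file/batch layout and the round-robin "arrival" model below are illustrative assumptions (real arrival order depends on I/O timing):

```python
from itertools import chain


def nonstreaming_order(files):
    """Default mode sketch: files are drained one after another, so
    batch order is deterministic (all of file 0, then file 1, ...)."""
    return list(chain.from_iterable(files))


def streaming_order(files, concurrent_files=1):
    """streaming=True sketch: up to `concurrent_files` files are open
    at once and batches are yielded as they 'arrive'.  Arrival is
    modelled here as a round-robin over the open files, which is one
    possible interleaving, not the only one."""
    out = []
    window = [iter(f) for f in files[:concurrent_files]]
    pending = list(files[concurrent_files:])
    while window:
        for it in list(window):
            batch = next(it, None)
            if batch is None:
                window.remove(it)
                if pending:  # a reader slot freed up: open the next file
                    window.append(iter(pending.pop(0)))
            else:
                out.append(batch)
    return out


files = [["a1", "a2"], ["b1", "b2"], ["c1"]]
print(nonstreaming_order(files))    # ['a1', 'a2', 'b1', 'b2', 'c1']
print(streaming_order(files, 1))    # ['a1', 'a2', 'b1', 'b2', 'c1']
print(streaming_order(files, 2))    # ['a1', 'b1', 'a2', 'b2', 'c1']
```

With `concurrent_files=1` the streaming order coincides with the deterministic file order; with `concurrent_files=2` batches from the first two files interleave, which is why the note recommends the non-streaming default when file ordering must be reproducible.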