@@ -38,15 +38,17 @@ All benchmarks are run via `run.py`:
38 38 python3 run.py --engine <engine> --benchmark <tpch|tpcds> [options]
39 39 ```
40 40
41- | Option | Description |
42- | -------------- | ------------------------------------------------ |
43- | ` --engine ` | Engine name (matches a TOML file in ` engines/ ` ) |
44- | ` --benchmark ` | ` tpch ` or ` tpcds ` |
45- | ` --iterations ` | Number of iterations (default: 1) |
46- | ` --output ` | Output directory (default: ` . ` ) |
47- | ` --query ` | Run a single query number |
48- | ` --no-restart ` | Skip Spark master/worker restart |
49- | ` --dry-run ` | Print the spark-submit command without executing |
41+ | Option | Description |
42+ | -------------- | -------------------------------------------------------- |
43+ | ` --engine ` | Engine name (matches a TOML file in ` engines/ ` ) |
44+ | ` --benchmark ` | ` tpch ` or ` tpcds ` |
45+ | ` --iterations ` | Number of iterations (default: 1) |
46+ | ` --output ` | Output directory (default: ` . ` ) |
47+ | ` --query ` | Run a single query number |
48+ | ` --no-restart ` | Skip Spark master/worker restart |
49+ | ` --dry-run ` | Print the spark-submit command without executing |
50+ | ` --jfr ` | Enable Java Flight Recorder profiling |
51+ | ` --jfr-dir ` | Directory for JFR output files (default: ` /results/jfr ` ) |
50 52
51 53 Available engines: `spark`, `comet`, `comet-iceberg`, `gluten`
52 54
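
The option table above could map onto an `argparse` parser roughly as follows. This is a minimal sketch, not the actual contents of `run.py`: the function name `build_parser` is hypothetical, and treating `--engine` and `--benchmark` as required is an assumption based on the usage line.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented run.py options."""
    parser = argparse.ArgumentParser(prog="run.py")
    # Assumed required, since the usage line lists them before [options].
    parser.add_argument("--engine", required=True,
                        help="Engine name (matches a TOML file in engines/)")
    parser.add_argument("--benchmark", required=True, choices=["tpch", "tpcds"])
    parser.add_argument("--iterations", type=int, default=1,
                        help="Number of iterations")
    parser.add_argument("--output", default=".", help="Output directory")
    parser.add_argument("--query", type=int, default=None,
                        help="Run a single query number")
    parser.add_argument("--no-restart", action="store_true",
                        help="Skip Spark master/worker restart")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print the spark-submit command without executing")
    parser.add_argument("--jfr", action="store_true",
                        help="Enable Java Flight Recorder profiling")
    parser.add_argument("--jfr-dir", default="/results/jfr",
                        help="Directory for JFR output files")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this sketch, `--jfr-dir` only matters when `--jfr` is also passed; whether `run.py` enforces that is not stated in the table.
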
@@ -363,3 +365,30 @@ python3 generate-comparison.py --benchmark tpch \
363 365 --title "TPC-H @ 100 GB: Parquet vs Iceberg" \
364 366 comet-tpch-*.json comet-iceberg-tpch-*.json
365 367 ```
368+
369+ ## Java Flight Recorder Profiling
370+
371+ Use the ` --jfr ` flag to capture JFR profiles from the Spark driver and executors.
372+ JFR is built into JDK 11+, so no additional dependencies are needed.
373+
374+ ``` shell
375+ python3 run.py --engine comet --benchmark tpch --jfr
376+ ```
377+
378+ JFR recordings are written to ` /results/jfr/ ` by default (configurable with
379+ ` --jfr-dir ` ). The driver writes ` driver.jfr ` and each executor writes
380+ ` executor.jfr ` (JFR appends the PID when multiple executors share a path).
381+
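
One way a flag like `--jfr` could be turned into `spark-submit` arguments is sketched below. How `run.py` actually does this is not shown in this change, so the helper name `jfr_spark_conf` is hypothetical and `dumponexit=true` is an assumed setting. The Spark properties `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` and the JVM option `-XX:StartFlightRecording` are standard.

```python
import posixpath


def jfr_spark_conf(jfr_dir="/results/jfr"):
    """Build spark-submit --conf arguments that start a JFR recording
    in the driver JVM and in each executor JVM (hypothetical helper)."""

    def start_flag(file_name):
        # posixpath keeps forward slashes, since the paths live inside containers.
        path = posixpath.join(jfr_dir, file_name)
        # dumponexit=true asks the JVM to write the recording when it exits.
        return f"-XX:StartFlightRecording=filename={path},dumponexit=true"

    return [
        "--conf", f"spark.driver.extraJavaOptions={start_flag('driver.jfr')}",
        "--conf", f"spark.executor.extraJavaOptions={start_flag('executor.jfr')}",
    ]


if __name__ == "__main__":
    print(" ".join(jfr_spark_conf()))
```

Combining `--jfr` with `--dry-run` should show the options `run.py` really passes, which may differ from this sketch.
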
382+ With Docker Compose, the ` /results ` volume is shared across all containers,
383+ so JFR files from both driver and executors are collected in
384+ ` $RESULTS_DIR/jfr/ ` on the host:
385+
386+ ``` shell
387+ docker compose -f benchmarks/tpc/infra/docker/docker-compose.yml \
388+ run --rm bench \
389+ python3 /opt/benchmarks/run.py \
390+ --engine comet --benchmark tpch --output /results --no-restart --jfr
391+ ```
392+
393+ Open the `.jfr` files with [JDK Mission Control](https://jdk.java.net/jmc/),
394+ IntelliJ IDEA's profiler, or the `jfr` CLI tool (`jfr summary driver.jfr`).
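
To summarize every recording from a run in one pass, a small script along these lines may help. It is a sketch under assumptions: the helper names `find_recordings` and `summarize` are hypothetical, the default directory is taken from the `--jfr-dir` default, and it relies only on the `jfr summary <file>` command mentioned above.

```python
import shutil
import subprocess
from pathlib import Path


def find_recordings(jfr_dir):
    """Return all .jfr files under jfr_dir, sorted by path."""
    return sorted(Path(jfr_dir).rglob("*.jfr"))


def summarize(recording):
    """Return the `jfr summary` output for one recording,
    or None if the jfr tool is not on PATH."""
    jfr = shutil.which("jfr")
    if jfr is None:
        return None
    result = subprocess.run(
        [jfr, "summary", str(recording)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for rec in find_recordings("/results/jfr"):
        print(f"== {rec}")
        print(summarize(rec) or "jfr tool not found on PATH")
```

Matching on `*.jfr` rather than fixed file names means the script also picks up executor recordings whose names carry a suffix.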