Add large-run PerfettoSQL guidance to performance analysis docs by Flamki · Pull Request #819 · precice/precice.github.io

Flamki · 2026-02-26T14:26:26Z

Summary

Add practical guidance for analyzing large preCICE traces with PerfettoSQL and scripted Trace Processor queries.

What changed

Updated content/docs/tooling/tooling-performance-analysis.md
Added a new section: Analyze larger traces with PerfettoSQL
- event inventory query
- runtime aggregation for key preCICE events
- expensive advance call query
- per-track imbalance query
- robust track_id to participant/process/thread mapping query
Added a new section: Scripting trace analysis with Python
- perfetto.trace_processor.TraceProcessor example for repeatable analysis
Added installation hints for:
- trace_processor_shell
- Python package perfetto

Scope

Content-only docs update
No JS/CSS/layout/behavior changes

Context

Relates to Add perfetto instructions for larger runs #631
This PR focuses on large-run analysis guidance (PerfettoSQL + scripting).

Validation

pre-commit run --files content/docs/tooling/tooling-performance-analysis.md (passed)
docker run --rm -v "${PWD}:/srv/jekyll" -w /srv/jekyll jekyll/jekyll:4 bash -lc "bundle install && bundle exec jekyll build" (passed)

MakisH

Thank you for the PR, @Flamki. This requires more in-depth knowledge of the profiling functionality to make sense, but some aspects of this PR are already useful for communicating that PerfettoSQL exists and what the user can do with it. Some queries, however, were not very well-motivated/clear.

I executed all queries and documented the outputs. @fsimonis since you opened the original issue and have written most of this page, please take a look. I would suggest directly editing to not leave this open much longer (or closing if it is too far from what you wanted to have).

For my outputs, I just used the flow-over-heated-plate tutorial with fluid-openfoam, solid-openfoam, all serial, and with <profiling mode="all" synchronize="true" />.

trace.json

I am not sure where you got trace-processor-shell from the linked Perfetto documentation page, but I found the equivalent trace-processor. One query also did not work for me.

MakisH · 2026-05-17T14:11:06Z

+
+For larger runs, it is often faster to query the trace data directly instead of navigating the UI.
+`trace_processor_shell` can execute PerfettoSQL on `trace.json` and gives reproducible results.
+You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/quickstart/trace-analysis#get-trace-processor).


Suggested change

You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/quickstart/trace-analysis#get-trace-processor).

You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/analysis/trace-processor).

MakisH · 2026-05-17T14:22:10Z

+First, inspect which event names are present:
+
+```console
+trace_processor_shell -q "SELECT name, COUNT(*) AS n FROM slice GROUP BY name ORDER BY n DESC LIMIT 30;" trace.json


Output:

> SELECT name, COUNT(*) AS n FROM slice GROUP BY name ORDER BY n DESC LIMIT 30; name n -------------------- -------------------- solver.advance 428 mapping 428 waitAndSendData 426 waitAndReceiveData 426 mn2.sendData.interSy 426 m2n.sendData 426 m2n.receiveData.inte 426 m2n.receiveData 426 advanceCoupling 426 advance 426 readData.Solid-Mesh_ 214 readData.Fluid-Mesh_ 214 handleExports 214 writeData.Solid-Mesh 213 writeData.Fluid-Mesh 213 sendConvergence 213 receiveConvergence 213 map.nn.mapData.FromS 213 map.nn.mapData.FromF 213 accelerate 213 m2n.exchangeVertexDi 4 m2n.createCommunicat 4 m2n.buildCommunicati 4 m2n.broadcastVertexD 4 startProfilingBacken 2 solver.initialize 2 repartitioning 2 readConfiguration 2 initialize 2 initalizeCouplingSch 2 Query executed in 3.294 ms

MakisH · 2026-05-17T14:22:52Z

+### Analyze larger traces with PerfettoSQL
+
+For larger runs, it is often faster to query the trace data directly instead of navigating the UI.
+`trace_processor_shell` can execute PerfettoSQL on `trace.json` and gives reproducible results.


I could not find any trace_processor_shell, only a trace_processor, which at first run downloads trace_processor_shell:

$ curl -LO https://get.perfetto.dev/trace_processor $ chmod +x ./trace_processor $ ./trace_processor trace.json Downloading https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v55.2/linux-amd64/trace_processor_shell

MakisH · 2026-05-17T14:23:21Z

+Then focus on preCICE events, for example to get total runtime per event:
+
+```console
+trace_processor_shell -q "SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC;" trace.json


Output:

> SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC; name n total_s avg_ms -------------------- -------------------- -------------------- -------------------- advance 426 28.813540 67.637000 solver.advance 428 27.648382 64.599000 initialize 2 3.481007 1740.504000 construction 2 0.357415 178.708000 finalize 2 0.001848 0.924000 solver.initialize 2 0.000714 0.357000 Query executed in 1.510 ms

MakisH · 2026-05-17T14:23:33Z

+trace_processor_shell -q "SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC;" trace.json
+```
+
+To find unusually expensive `advance` calls:


Suggested change

To find unusually expensive `advance` calls:

To find the most expensive `advance` calls:

MakisH · 2026-05-17T14:25:04Z

+To find unusually expensive `advance` calls:
+
+```console
+trace_processor_shell -q "SELECT ROUND(ts/1e9, 6) AS start_s, ROUND(dur/1e6, 3) AS dur_ms, track_id FROM slice WHERE name = 'advance' ORDER BY dur DESC LIMIT 20;" trace.json


Output:

> SELECT ROUND(ts/1e9, 6) AS start_s, ROUND(dur/1e6, 3) AS dur_ms, track_id FROM slice WHERE name = 'advance' ORDER BY dur DESC LIMIT 20; start_s dur_ms track_id -------------------- -------------------- -------------------- 1779026672.600307 276.929000 1 1779026673.084940 268.537000 1 1779026671.265589 218.855000 1 1779026679.218089 215.254000 1 1779026671.486101 214.927000 1 1779026673.558230 212.789000 1 1779026667.692958 209.477000 1 1779026673.970424 200.914000 1 1779026673.356204 200.625000 1 1779026672.883513 199.199000 1 1779026673.772448 196.284000 1 1779026666.334844 194.339000 1 1779026672.225841 192.858000 1 1779026667.015985 187.507000 1 1779026666.830590 184.151000 1 1779026654.190474 182.940000 1 1779026672.420540 178.525000 1 1779026672.048006 176.499000 1 1779026671.702332 173.871000 1 1779026675.158442 173.198000 1 Query executed in 0.743 ms

MakisH · 2026-05-17T14:25:25Z

+To estimate rank/participant imbalance on `solver.advance`, compare totals per track:
+
+```console
+trace_processor_shell -q "SELECT track_id, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name = 'solver.advance' GROUP BY track_id ORDER BY total_s DESC;" trace.json


Not sure what rank/participant imbalance means in this context. Do I understand correctly that this query is mainly to list total and average duration of solver.advance() for each rank?

Output:

> SELECT track_id, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name = 'solver.advance' GROUP BY track_id ORDER BY total_s DESC; track_id n total_s avg_ms -------------------- -------------------- -------------------- -------------------- 0 214 27.402214 128.048000 1 214 0.246168 1.150000 Query executed in 1.328 ms

MakisH · 2026-05-17T14:25:57Z

+If you need the mapping from `track_id` to names:
+
+```console
+trace_processor_shell -q "SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name, tt.utid AS utid, th.name AS thread_name FROM track t LEFT JOIN thread_track tt ON tt.id = t.id LEFT JOIN process_track pt ON pt.id = t.id LEFT JOIN thread th ON th.utid = tt.utid LEFT JOIN process p ON p.upid = COALESCE(tt.upid, pt.upid) ORDER BY t.id;" trace.json


This does not work for me. Output:

SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name, tt.utid AS utid, th.name AS thread_name FROM track t LEFT JOIN thread_track tt ON tt.id = t.id LEFT JOIN process_track pt ON pt.id = t.id LEFT JOIN thread th ON th.utid = tt.utid LEFT JOIN process p ON p.upid = COALESCE(tt.upid, pt.upid) ORDER BY t.id; Traceback (most recent call last): File "stdin" line 1 col 1 SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name ^ no such column: tt.upid Query executed in 0.503 ms

MakisH · 2026-05-17T14:34:53Z

+from perfetto.trace_processor import TraceProcessor
+
+tp = TraceProcessor(file_path="trace.json")
+
+query = """
+SELECT
+  name,
+  COUNT(*) AS n,
+  SUM(dur) / 1e9 AS total_s,
+  AVG(dur) / 1e6 AS avg_ms
+FROM slice
+WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize')
+GROUP BY name
+ORDER BY total_s DESC
+"""
+
+for row in tp.query(query):
+    print(row.name, row.n, row.total_s, row.avg_ms)


Output:

$ python -m venv venv && source venv/bin/activate $ pip install perfetto $ python test.py advance 426 28.81354 67.63741784037559 solver.advance 428 27.648382 64.59902336448597 initialize 2 3.481007 1740.5035 solver.initialize 2 0.000714 0.357

MakisH · 2026-05-17T14:35:26Z

+    print(row.name, row.n, row.total_s, row.avg_ms)
+```
+
+Recent Perfetto Python versions expose selected columns as row attributes (`row.name`, `row.total_s`, ...).


Since which versions? What does that mean for the script?

carme-hp · 2026-05-18T09:37:58Z

+
+for row in tp.query(query):
+    print(row.name, row.n, row.total_s, row.avg_ms)
+```


Hi, I was curious to try this snippet since I'm currently trying to plot preCICE performance in a more automated way. I'm basically looking for an easy way to extract the performance values, and such a small Python script would be great.
However, when using this, I see that it adds up the values for e.g., "advance" for my two participants. I think it would be more useful to see the sum for each participant separately. Also, it would be useful if all names are supported. Currently only the names that appear in the top level when doing analyze seem to be supported (for exampleWHERE name IN ('initialize/mapping')does not seem to work).

docs: add large-run PerfettoSQL guidance for profiling analysis

b37bc1e

MakisH added content Content-only issues GSoC Contributed in the context of the Google Summer of Code labels Feb 27, 2026

Merge branch 'master' into issue-631-perfetto-large-runs

fda23e0

MakisH reviewed May 17, 2026

View reviewed changes

carme-hp reviewed May 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add large-run PerfettoSQL guidance to performance analysis docs#819

Add large-run PerfettoSQL guidance to performance analysis docs#819
Flamki wants to merge 2 commits into
precice:masterfrom
Flamki:issue-631-perfetto-large-runs

Flamki commented Feb 26, 2026 •

edited

Loading

Uh oh!

MakisH left a comment •

edited

Loading

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

MakisH May 17, 2026

Uh oh!

carme-hp May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/quickstart/trace-analysis#get-trace-processor).
	You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/analysis/trace-processor).

	To find unusually expensive `advance` calls:
	To find the most expensive `advance` calls:

Uh oh!

Conversation

Flamki commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed

Scope

Context

Validation

Uh oh!

MakisH left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Flamki commented Feb 26, 2026 •

edited

Loading

MakisH left a comment •

edited

Loading