Add large-run PerfettoSQL guidance to performance analysis docs#819
Add large-run PerfettoSQL guidance to performance analysis docs#819Flamki wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Thank you for the PR, @Flamki. This requires more in-depth knowledge of the profiling functionality to make sense, but some aspects of this PR are already useful for communicating that PerfettoSQL exists and what the user can do with it. Some queries, however, were not very well-motivated/clear.
I executed all queries and documented the outputs. @fsimonis since you opened the original issue and have written most of this page, please take a look. I would suggest directly editing to not leave this open much longer (or closing if it is too far from what you wanted to have).
For my outputs, I just used the flow-over-heated-plate tutorial with fluid-openfoam, solid-openfoam, all serial, and with <profiling mode="all" synchronize="true" />.
I am not sure where you got trace-processor-shell from the linked Perfetto documentation page, but I found the equivalent trace-processor. One query also did not work for me.
|
|
||
| For larger runs, it is often faster to query the trace data directly instead of navigating the UI. | ||
| `trace_processor_shell` can execute PerfettoSQL on `trace.json` and gives reproducible results. | ||
| You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/quickstart/trace-analysis#get-trace-processor). |
There was a problem hiding this comment.
| You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/quickstart/trace-analysis#get-trace-processor). | |
| You can download `trace_processor_shell` from the [Perfetto releases page](https://perfetto.dev/docs/analysis/trace-processor). |
| First, inspect which event names are present: | ||
|
|
||
| ```console | ||
| trace_processor_shell -q "SELECT name, COUNT(*) AS n FROM slice GROUP BY name ORDER BY n DESC LIMIT 30;" trace.json |
There was a problem hiding this comment.
Output:
> SELECT name, COUNT(*) AS n FROM slice GROUP BY name ORDER BY n DESC LIMIT 30;
name n
-------------------- --------------------
solver.advance 428
mapping 428
waitAndSendData 426
waitAndReceiveData 426
mn2.sendData.interSy 426
m2n.sendData 426
m2n.receiveData.inte 426
m2n.receiveData 426
advanceCoupling 426
advance 426
readData.Solid-Mesh_ 214
readData.Fluid-Mesh_ 214
handleExports 214
writeData.Solid-Mesh 213
writeData.Fluid-Mesh 213
sendConvergence 213
receiveConvergence 213
map.nn.mapData.FromS 213
map.nn.mapData.FromF 213
accelerate 213
m2n.exchangeVertexDi 4
m2n.createCommunicat 4
m2n.buildCommunicati 4
m2n.broadcastVertexD 4
startProfilingBacken 2
solver.initialize 2
repartitioning 2
readConfiguration 2
initialize 2
initalizeCouplingSch 2
Query executed in 3.294 ms
| ### Analyze larger traces with PerfettoSQL | ||
|
|
||
| For larger runs, it is often faster to query the trace data directly instead of navigating the UI. | ||
| `trace_processor_shell` can execute PerfettoSQL on `trace.json` and gives reproducible results. |
There was a problem hiding this comment.
I could not find any trace_processor_shell, only a trace_processor, which at first run downloads trace_processor_shell:
$ curl -LO https://get.perfetto.dev/trace_processor
$ chmod +x ./trace_processor
$ ./trace_processor trace.json
Downloading https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v55.2/linux-amd64/trace_processor_shell
| Then focus on preCICE events, for example to get total runtime per event: | ||
|
|
||
| ```console | ||
| trace_processor_shell -q "SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC;" trace.json |
There was a problem hiding this comment.
Output:
> SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC;
name n total_s avg_ms
-------------------- -------------------- -------------------- --------------------
advance 426 28.813540 67.637000
solver.advance 428 27.648382 64.599000
initialize 2 3.481007 1740.504000
construction 2 0.357415 178.708000
finalize 2 0.001848 0.924000
solver.initialize 2 0.000714 0.357000
Query executed in 1.510 ms
| trace_processor_shell -q "SELECT name, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize', 'construction', 'finalize') GROUP BY name ORDER BY total_s DESC;" trace.json | ||
| ``` | ||
|
|
||
| To find unusually expensive `advance` calls: |
There was a problem hiding this comment.
| To find unusually expensive `advance` calls: | |
| To find the most expensive `advance` calls: |
| To find unusually expensive `advance` calls: | ||
|
|
||
| ```console | ||
| trace_processor_shell -q "SELECT ROUND(ts/1e9, 6) AS start_s, ROUND(dur/1e6, 3) AS dur_ms, track_id FROM slice WHERE name = 'advance' ORDER BY dur DESC LIMIT 20;" trace.json |
There was a problem hiding this comment.
Output:
> SELECT ROUND(ts/1e9, 6) AS start_s, ROUND(dur/1e6, 3) AS dur_ms, track_id FROM slice WHERE name = 'advance' ORDER BY dur DESC LIMIT 20;
start_s dur_ms track_id
-------------------- -------------------- --------------------
1779026672.600307 276.929000 1
1779026673.084940 268.537000 1
1779026671.265589 218.855000 1
1779026679.218089 215.254000 1
1779026671.486101 214.927000 1
1779026673.558230 212.789000 1
1779026667.692958 209.477000 1
1779026673.970424 200.914000 1
1779026673.356204 200.625000 1
1779026672.883513 199.199000 1
1779026673.772448 196.284000 1
1779026666.334844 194.339000 1
1779026672.225841 192.858000 1
1779026667.015985 187.507000 1
1779026666.830590 184.151000 1
1779026654.190474 182.940000 1
1779026672.420540 178.525000 1
1779026672.048006 176.499000 1
1779026671.702332 173.871000 1
1779026675.158442 173.198000 1
Query executed in 0.743 ms
| To estimate rank/participant imbalance on `solver.advance`, compare totals per track: | ||
|
|
||
| ```console | ||
| trace_processor_shell -q "SELECT track_id, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name = 'solver.advance' GROUP BY track_id ORDER BY total_s DESC;" trace.json |
There was a problem hiding this comment.
Not sure what rank/participant imbalance means in this context. Do I understand correctly that this query is mainly to list total and average duration of solver.advance() for each rank?
Output:
> SELECT track_id, COUNT(*) AS n, ROUND(SUM(dur)/1e9, 6) AS total_s, ROUND(AVG(dur)/1e6, 3) AS avg_ms FROM slice WHERE name = 'solver.advance' GROUP BY track_id ORDER BY total_s DESC;
track_id n total_s avg_ms
-------------------- -------------------- -------------------- --------------------
0 214 27.402214 128.048000
1 214 0.246168 1.150000
Query executed in 1.328 ms
| If you need the mapping from `track_id` to names: | ||
|
|
||
| ```console | ||
| trace_processor_shell -q "SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name, tt.utid AS utid, th.name AS thread_name FROM track t LEFT JOIN thread_track tt ON tt.id = t.id LEFT JOIN process_track pt ON pt.id = t.id LEFT JOIN thread th ON th.utid = tt.utid LEFT JOIN process p ON p.upid = COALESCE(tt.upid, pt.upid) ORDER BY t.id;" trace.json |
There was a problem hiding this comment.
This does not work for me. Output:
SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name, tt.utid AS utid, th.name AS thread_name FROM track t LEFT JOIN thread_track tt ON tt.id = t.id LEFT JOIN process_track pt ON pt.id = t.id LEFT JOIN thread th ON th.utid = tt.utid LEFT JOIN process p ON p.upid = COALESCE(tt.upid, pt.upid) ORDER BY t.id;
Traceback (most recent call last):
File "stdin" line 1 col 1
SELECT t.id AS track_id, COALESCE(t.name, tt.name, pt.name, p.name, th.name) AS track_name, p.pid AS pid, p.name AS process_name
^
no such column: tt.upid
Query executed in 0.503 ms
| from perfetto.trace_processor import TraceProcessor | ||
|
|
||
| tp = TraceProcessor(file_path="trace.json") | ||
|
|
||
| query = """ | ||
| SELECT | ||
| name, | ||
| COUNT(*) AS n, | ||
| SUM(dur) / 1e9 AS total_s, | ||
| AVG(dur) / 1e6 AS avg_ms | ||
| FROM slice | ||
| WHERE name IN ('advance', 'initialize', 'solver.advance', 'solver.initialize') | ||
| GROUP BY name | ||
| ORDER BY total_s DESC | ||
| """ | ||
|
|
||
| for row in tp.query(query): | ||
| print(row.name, row.n, row.total_s, row.avg_ms) |
There was a problem hiding this comment.
Output:
$ python -m venv venv && source venv/bin/activate
$ pip install perfetto
$ python test.py
advance 426 28.81354 67.63741784037559
solver.advance 428 27.648382 64.59902336448597
initialize 2 3.481007 1740.5035
solver.initialize 2 0.000714 0.357
| print(row.name, row.n, row.total_s, row.avg_ms) | ||
| ``` | ||
|
|
||
| Recent Perfetto Python versions expose selected columns as row attributes (`row.name`, `row.total_s`, ...). |
There was a problem hiding this comment.
Since which versions? What does that mean for the script?
|
|
||
| for row in tp.query(query): | ||
| print(row.name, row.n, row.total_s, row.avg_ms) | ||
| ``` |
There was a problem hiding this comment.
Hi, I was curious to try this snippet since I'm currently trying to plot preCICE performance in a more automated way. I'm basically looking for an easy way to extract the performance values, and such a small Python script would be great.
However, when using this, I see that it adds up the values for e.g., "advance" for my two participants. I think it would be more useful to see the sum for each participant separately. Also, it would be useful if all names are supported. Currently only the names that appear in the top level when doing analyze seem to be supported (for exampleWHERE name IN ('initialize/mapping')does not seem to work).
Summary
Add practical guidance for analyzing large preCICE traces with PerfettoSQL and scripted Trace Processor queries.
What changed
content/docs/tooling/tooling-performance-analysis.mdadvancecall querytrack_idto participant/process/thread mapping queryperfetto.trace_processor.TraceProcessorexample for repeatable analysistrace_processor_shellperfettoScope
Context
Validation
pre-commit run --files content/docs/tooling/tooling-performance-analysis.md(passed)docker run --rm -v "${PWD}:/srv/jekyll" -w /srv/jekyll jekyll/jekyll:4 bash -lc "bundle install && bundle exec jekyll build"(passed)