[tmva][sofie] Profiler code generation for SOFIE by lmoneta · Pull Request #19829 · root-project/root

lmoneta · 2025-09-05T08:37:55Z

This PR introduces a time-based profiler for the SOFIE inference engine. It provides developers with the tools to analyze the performance of generated C++ models by measuring the execution time of each individual operator, as well as the total inference time.

Changes or fixes:

- A new option, SOFIE::Options::kProfile, is added to the RModel::Generate() method to enable the feature.
- When enabled, a new RModelProfiler helper class instruments the generated doInfer() method with std::chrono timers.
- Utility functions like PrintProfilingResults() and GetOpAvgTime() are added to the generated Session struct to access the collected timing data.

Checklist:

- Tested changes locally by running inference in a loop and verifying the timing results.
- Updated the docs (if necessary)

This PR replaces #19558, fixing the conflicts by rebasing onto master.
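
For illustration, the instrumentation described above can be sketched in plain C++. This is a minimal stand-in, not the code SOFIE generates: the ProfiledSession struct, the TimeOp helper, the fTimings member, the two toy operators, and the exact signatures of GetOpAvgTime() and PrintProfilingResults() are all assumptions made for this example. Only the two function names and the use of std::chrono come from the description.

```cpp
#include <chrono>
#include <cstddef>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated Session; not the code SOFIE emits.
struct ProfiledSession {
   // Per-operator timings in microseconds, one entry per inference call.
   std::map<std::string, std::vector<double>> fTimings;

   // Run `op` and record its wall-clock duration under `name`.
   template <class Op>
   void TimeOp(const std::string &name, Op &&op)
   {
      auto start = std::chrono::steady_clock::now();
      op();
      auto stop = std::chrono::steady_clock::now();
      fTimings[name].push_back(std::chrono::duration<double, std::micro>(stop - start).count());
   }

   // Stand-in for doInfer(): each toy "operator" is wrapped in a timer.
   void doInfer(std::vector<float> &x)
   {
      TimeOp("Scale", [&] { for (auto &v : x) v *= 2.f; });
      TimeOp("Relu", [&] { for (auto &v : x) v = v > 0.f ? v : 0.f; });
   }

   // Average time of one operator in microseconds; 0 if it was never recorded.
   double GetOpAvgTime(const std::string &name) const
   {
      auto it = fTimings.find(name);
      if (it == fTimings.end() || it->second.empty()) return 0.;
      double sum = 0.;
      for (double t : it->second) sum += t;
      return sum / it->second.size();
   }

   // Print the average time and the number of recorded calls per operator.
   void PrintProfilingResults() const
   {
      for (const auto &entry : fTimings)
         std::cout << entry.first << ": " << GetOpAvgTime(entry.first) << " us (avg over "
                   << entry.second.size() << " calls)\n";
   }
};
```

Calling doInfer() in a loop and then PrintProfilingResults() mirrors the local test mentioned in the checklist.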

dpiparo · 2025-09-05T08:50:03Z

Is there perhaps a way in which we could test this new nice feature?

github-actions · 2025-09-05T11:07:53Z

Test Results

22 files 22 suites 3d 10h 52m 8s ⏱️
3 855 tests 3 855 ✅ 0 💤 0 ❌
77 037 runs 77 037 ✅ 0 💤 0 ❌

Results for commit e82aefc.

♻️ This comment has been updated with latest results.

lmoneta · 2025-09-05T12:37:00Z

Yes, I think we should add a tutorial for this, generating code instrumented with timers and then producing the results.
@olia110 can you add this new tutorial?

olia110 · 2025-09-05T13:20:06Z

Yes, thank you. I will look into how to add some testing and will add a tutorial.

guitargeek

I didn't look at the full diff yet, but this definitely requires a change in the RModel class number. Or - which I would prefer - remove IO capability from RModel altogether. What's the use case for that IO? If ROOT with SOFIE is installed to read a possible RModel from a file, you can also just recreate the model from ONNX on the fly. I think sticking to ONNX as the persistent model format is cleaner.

guitargeek · 2026-03-30T11:14:26Z

   size_t fConstantTensorSize = 0; // size  (in Bytes) of the allocated constant tensors
   size_t fWeightsTensorSize = 0;  // size  (in Bytes) of the allocated weight tensors
   size_t fOtherTensorSize = 0;    // size  (in Bytes) of intermediate tensors which are not managed by the memory pool

+   std::string fProfilerGC = "";


Adding a data member means you have to increase the class version, otherwise you break backwards-compatible IO.

But I'm happy that I have the opportunity to comment on this here! Because I think in general, supporting IO for RModel is not optimal. It bars us from changing the schema as we like, and I'm not sure what the use case is.

Furthermore, as SOFIE is still in the Experimental namespace, I think supporting persistent IO for these classes is in contradiction with the experimental namespace.

- After the changes for CLAD, the mlpf model could not be parsed anymore. The variable defining the number of non-zero elements coming from Non_Zero is now handled correctly.
- Also fixes TMVA::SOFIE::Copy for types other than float by making it a template function.
- Also adds the output shape definition to the generated code, as is done for the input.

The cast to bool was incorrect because it was done as a cast to uint8. Also fix the special case of the NonZero dynamic parameter, which is defined by the NonZero operator. Add at the end a Session data member for the parameter, which is then used when creating the output vector. Fix a bug introduced in the softmax generated code in the generic case. Fix the writing of the data in initializer lists for uint8_t types. Correctly add the new version in RModel.hxx (version 4).

- Fix Where for initialized and Shape tensors. The new implementation did not take Shape tensors into account, which caused a failure to parse the ATLAS GNN tracking model.
- Fix Slice for trivial copying. std::copy is now used, since an alias tensor can no longer be used after the change to a free function with a const Session.
- Avoid printing tensor names in the comment of the Softmax generated code. There is an issue in the function RModel::CollectTensorMemberNames, used to get tensor members from Session: if a tensor has the name "tensor_X" and is used as member "tensor_tensor_X", the function assumes that a tensor with name "X" exists. This was causing the Keras parser to crash.
- Fix an issue writing the initialized data when values are inf or NaN. The function from limits is used in this case.

…pe tensor

When the output is a param shape tensor, the tensor values were not assigned at initialization as for a constant tensor; they need to be set at run time in the infer function because they depend on the provided dynamic shape values. Also fix an issue on Windows in the new COnvertValuesTOString implementation dealing with inf values: create a specialisation for float or double that handles the infinity values using numeric limits.
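
As a rough illustration of writing inf and NaN values into generated source through std::numeric_limits, a minimal sketch follows. The function name FloatToSourceString and its exact output format are hypothetical and are not taken from the SOFIE code; only the idea of falling back to the limits functions for non-finite values comes from the commit messages.

```cpp
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: render a float as a C++ expression usable in generated code.
std::string FloatToSourceString(float v)
{
   // Non-finite values have no portable literal, so emit numeric_limits calls.
   if (std::isnan(v)) return "std::numeric_limits<float>::quiet_NaN()";
   if (std::isinf(v))
      return v > 0 ? "std::numeric_limits<float>::infinity()"
                   : "-std::numeric_limits<float>::infinity()";
   // Finite values: print with enough digits to round-trip a float.
   std::ostringstream out;
   out.precision(std::numeric_limits<float>::max_digits10);
   out << v;
   return out.str();
}
```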

Split the code generation of RModelProfiler into different functions to make it more independent of RModel and the rest of the code generation. This was needed to take into account the many changes made to make the SOFIE generated code understandable by Clad for AD. Also fix a compiler warning due to the name of ROperator.

lmoneta self-assigned this Sep 5, 2025

lmoneta requested review from bellenot and couet as code owners September 5, 2025 08:37

lmoneta mentioned this pull request Sep 5, 2025

[tmva][sofie] Profiler code generation for SOFIE #19558

Closed

2 tasks

guitargeek added the in:TMVA label Sep 5, 2025

olia110 mentioned this pull request Sep 21, 2025

[tmva][sofie] Profiler code generation for SOFIE #19933

Open

lmoneta force-pushed the olha_sofie_profiler branch 5 times, most recently from 1585356 to 32d38a2 Compare January 9, 2026 09:19

guitargeek added in:SOFIE and removed in:TMVA labels Mar 4, 2026

lmoneta force-pushed the olha_sofie_profiler branch 10 times, most recently from a881cd3 to f9ed58b Compare March 25, 2026 14:55

lmoneta requested review from guitargeek and vepadulano as code owners March 25, 2026 14:55

couet removed their request for review March 27, 2026 16:39

lmoneta force-pushed the olha_sofie_profiler branch from f9ed58b to 89635fa Compare March 29, 2026 19:33

guitargeek requested changes Mar 30, 2026

guitargeek mentioned this pull request Apr 14, 2026

[tmva][sofie] Added RModelProfiler and named ROperator s #8957

Closed

lmoneta force-pushed the olha_sofie_profiler branch 2 times, most recently from be27193 to 58b566e Compare April 20, 2026 14:33

guitargeek mentioned this pull request May 5, 2026

[tmva][sofie] Several Fixes for parsing complex models #22150

Merged

lmoneta added 7 commits May 18, 2026 10:04

[tmva][sofie] Apply fixes for shape tensors

84c2715

Fix some operators when the input and/or output is a shape tensor. Also fix Gemm when broadcasting a dynamic shape for the bias.

[tmva][sofie] Fix a bug in computing the end of life of a tensor

6cce19f

When computing the last usage of tensors, the loop on the input tensors of the added operator was not performed correctly. The loop stopped if an initialized tensor was an input.
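
One way such a bug can arise is a `break` where a `continue` is intended. The sketch below is an assumption about the shape of the problem, not the actual RModel code: OpInfo and ComputeLastUsage are hypothetical names, and the real bookkeeping is likely more involved.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical operator description: only the input tensor names matter here.
struct OpInfo {
   std::vector<std::string> inputs;
};

// For every non-initialized tensor, return the index of the last operator that reads it.
std::map<std::string, std::size_t>
ComputeLastUsage(const std::vector<OpInfo> &ops, const std::set<std::string> &initialized)
{
   std::map<std::string, std::size_t> lastUse;
   for (std::size_t i = 0; i < ops.size(); ++i) {
      for (const auto &name : ops[i].inputs) {
         // Skip weights/constants but keep scanning the remaining inputs.
         // A `break` here would stop the loop early: inputs listed after an
         // initialized tensor would never get their last usage updated.
         if (initialized.count(name)) continue;
         lastUse[name] = i;
      }
   }
   return lastUse;
}
```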

[tmva][sofie] Fix Gather for negative indices in initialized tensors

86ad783
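
The ONNX Gather operator accepts indices in the range [-s, s-1] for an axis of size s, with negative values counting from the end. A minimal sketch of normalizing such indices is shown here; the function name NormalizeGatherIndices is hypothetical, and this is not the code from the commit.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Map indices in [-axisSize, axisSize - 1] to [0, axisSize - 1].
std::vector<std::int64_t> NormalizeGatherIndices(std::vector<std::int64_t> indices, std::int64_t axisSize)
{
   for (auto &idx : indices) {
      if (idx < 0) idx += axisSize; // negative values count from the end of the axis
      if (idx < 0 || idx >= axisSize)
         throw std::out_of_range("Gather index out of range");
   }
   return indices;
}
```

For initialized (constant) index tensors this normalization can be done once at code-generation time, so the generated code only ever sees non-negative indices.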

lmoneta force-pushed the olha_sofie_profiler branch from 58b566e to 775985e Compare May 18, 2026 09:48

lmoneta and others added 4 commits May 19, 2026 10:28

[tmva][sofie] Fix adding standard library header

9e9a991

Remove the check that included a standard library header only if it was within a list of allowed ones. There is no need for that check, and it was preventing the addition of extra headers such as chrono.

Time Profiler for Sofie

5f1d668

[tmva][sofie] Improve RModel profiler

ac9bb27

Also compute the error on the average when printing results, and sort them in decreasing order of time.
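
A minimal sketch of this computation follows, assuming that "error on the average" means the standard error of the mean (sample standard deviation divided by sqrt(N)). That interpretation, the OpStats struct, and both function names are assumptions for illustration, not the profiler's actual code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of one operator's timings.
struct OpStats {
   std::string name;
   double mean = 0.;  // average time
   double error = 0.; // standard error of the mean
};

// Mean and standard error (sample stddev / sqrt(N)) of one operator's timings.
OpStats ComputeStats(const std::string &name, const std::vector<double> &times)
{
   OpStats s;
   s.name = name;
   const std::size_t n = times.size();
   if (n == 0) return s;
   for (double t : times) s.mean += t;
   s.mean /= n;
   if (n > 1) {
      double sumSq = 0.;
      for (double t : times) sumSq += (t - s.mean) * (t - s.mean);
      s.error = std::sqrt(sumSq / (n - 1)) / std::sqrt(static_cast<double>(n));
   }
   return s;
}

// Sort so that the most expensive operator comes first.
void SortByDecreasingTime(std::vector<OpStats> &stats)
{
   std::sort(stats.begin(), stats.end(),
             [](const OpStats &a, const OpStats &b) { return a.mean > b.mean; });
}
```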

lmoneta force-pushed the olha_sofie_profiler branch from 775985e to e82aefc Compare May 19, 2026 08:31

lmoneta wants to merge 11 commits into root-project:master from lmoneta:olha_sofie_profiler

4 participants
