Skip to content

[tmva][sofie] Profiler code generation for SOFIE#19829

Open
lmoneta wants to merge 11 commits into
root-project:masterfrom
lmoneta:olha_sofie_profiler
Open

[tmva][sofie] Profiler code generation for SOFIE#19829
lmoneta wants to merge 11 commits into
root-project:masterfrom
lmoneta:olha_sofie_profiler

Conversation

@lmoneta

@lmoneta lmoneta commented Sep 5, 2025

Copy link
Copy Markdown
Member

This PR introduces a time-based profiler for the SOFIE inference engine. It provides developers with the tools to analyze the performance of generated C++ models by measuring the execution time of each individual operator, as well as the total inference time.

Changes or fixes:

A new option, SOFIE::Options::kProfile, is added to the RModel::Generate() method to enable the feature.
When enabled, a new RModelProfiler helper class instruments the generated doInfer() method with std::chrono timers.
Utility functions like PrintProfilingResults() and GetOpAvgTime() are added to the generated Session struct to access the collected timing data.
Checklist:

Tested changes locally by running inference in a loop and verifying the timing results.
updated the docs (if necessary)

This PR replaces #19558 by fixing the conflicts with rebasing to master

@dpiparo

dpiparo commented Sep 5, 2025

Copy link
Copy Markdown
Member

Is there perhaps a way in which we could test this new nice feature?

@github-actions

github-actions Bot commented Sep 5, 2025

Copy link
Copy Markdown

Test Results

    22 files      22 suites   3d 10h 52m 8s ⏱️
 3 855 tests  3 855 ✅ 0 💤 0 ❌
77 037 runs  77 037 ✅ 0 💤 0 ❌

Results for commit e82aefc.

♻️ This comment has been updated with latest results.

@lmoneta

lmoneta commented Sep 5, 2025

Copy link
Copy Markdown
Member Author

Yes, I think we should add a tutorial for this, generting code instrumented with timers and then producing the results.
@olia110 can you add this new tutorial?

@olia110

olia110 commented Sep 5, 2025

Copy link
Copy Markdown
Contributor

Yes, thank you. I will look how to add some testing and will add a tutorial.

@lmoneta lmoneta force-pushed the olha_sofie_profiler branch 5 times, most recently from 1585356 to 32d38a2 Compare January 9, 2026 09:19
@lmoneta lmoneta force-pushed the olha_sofie_profiler branch 10 times, most recently from a881cd3 to f9ed58b Compare March 25, 2026 14:55
@couet couet removed their request for review March 27, 2026 16:39
@lmoneta lmoneta force-pushed the olha_sofie_profiler branch from f9ed58b to 89635fa Compare March 29, 2026 19:33

@guitargeek guitargeek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look at the full diff yet, but this definitely requires a change in the RModel class number. Or - which I would prefer - remove IO capability from RModel altogether. What's the use case for that IO? If ROOT with SOFIE is installed to read a possible RModel from a file, you can also just recreate the model from ONNX on the fly. I think sticking to ONNX as the persistent model format is cleaner.

Comment thread tmva/sofie/inc/TMVA/RModel.hxx Outdated
size_t fConstantTensorSize = 0; // size (in Bytes) of the allocated constant tensors
size_t fWeightsTensorSize = 0; // size (in Bytes) of the allocated weight tensors
size_t fOtherTensorSize = 0; // size (in Bytes) of intermediate tensors which are not managed by the memory pool

std::string fProfilerGC = "";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a data member meas you have to increase the class version, otherwise you break backwards compatible IO.

But I'm happy that I have the opportunity to comment on this here! Because I think in general, supporting IO for RModel is not optimal. It bars us from changing the schema as we like, and I'm not sure what's the use case?

Furthermore, as SOFIE is still in the Experimental namespace, I think supporting persistent IO for these classes is in contradiction with the experimental namespace.

lmoneta added 7 commits May 18, 2026 10:04
- After the changes for CLAD the mlpf modek could not be parsed anymore.
Handle now correctly the variable defining the number of non zero elements coming from Non_Zero
- Fixes also TMVA::SOFIE::Copy for different types than float making it a template function
- Add also output shape definition in generated code as it is done for the input
The casting to bool was incorrect since it was done a cast to uint8.

Fix also the special case of NonZero dynamic parameter which is defindef by NonZero operator. Add at the end a Session data member for the parameter which is then used in creating the output vector

Fix a bug introduced in softmax generated code in the generic case

Fix the writing of the data in initializer lists for uint8_t types

Add correctly new version in RModel.hxx (version 4)
- Fix Where for initialized and Shape tensors. New impelmentation was not  taking into account the Shape tensors. This caused a failure to parse the ATLAS Gnn tracking model

- Fix Slice for trivial copying. Use now std::copy since we cannot use alias tensor anymore after the change of using a free function with a const Session

- Avoid printing tensor names in the comment of Softmax generated code. There is a issue in the function RModel::CollectTensorMemberNames used to get tensor members from Session. The problem if a tensor is gaving as name "tensor_X" and used as member "tensor_tensor_X" the function assume exists a tensor with name "X". This was causing teh Keras parser to crash.

- Fix an issue writing the initialized data when are inf or NaN. Use the function from limits in this case
…pe tensor

When output is a param shape tensor the tensor values were not assigned in initialization as in a constant tensor, they need to be set at run time in the infer function
because they depend on the provided dynamic shale values

Fix also a issue on Windows in the new COnvertValuesTOString implementation dealing with inf values. Create a specialisation for float or double which will handle the infinity values in numerical limits.
Fix some operators when input and/or output is a shape tensors

Fix also in Gemm when broadcasting dynamic shape for the bias
When computing the last usage of tensors, the loop on the input tensors
of the added operator was not performed correctly. The loop was stopping if
an initialized tensor was an input
@lmoneta lmoneta force-pushed the olha_sofie_profiler branch from 58b566e to 775985e Compare May 18, 2026 09:48
lmoneta and others added 4 commits May 19, 2026 10:28
Remove check to include standard library header only if within a list of allowed ones.
There is no need for that check and this was excluding addition of extra headers like chrono
Compute also the error on the average when printing results and sort them in decreasing order in time
Split generation of code of RModelProfiler in different funcitons to make it more indipendent of RModel and the rest of teh code generation.
This was needed to take into account many changes in the code performed to make the Sofie generated code understantable by Clad for AD.

Fix also a compiler warning due to the name of ROperator.
@lmoneta lmoneta force-pushed the olha_sofie_profiler branch from 775985e to e82aefc Compare May 19, 2026 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants