This repository was archived by the owner on Feb 3, 2025. It is now read-only.
Greetings,
I am currently using TF-TRT and I want to measure the performance of my models (latency, throughput).
The TensorRT C++ API supports CUDA synchronization through the CUDA events API: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#cuda-events
Similarly, PyTorch provides torch.cuda.synchronize():
https://pytorch.org/docs/stable/generated/torch.cuda.synchronize.html
However, I can't find anything similar in the TF-TRT docs, which in my opinion is essential for measuring performance metrics correctly, since GPU work is launched asynchronously and a timer stopped without synchronizing may not include the actual execution time.
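To illustrate the measurement pattern I have in mind, here is a minimal sketch in plain Python. The names `benchmark`, `run_once`, `sync` and `fake_infer` are hypothetical and not part of any TF-TRT API. `sync` stands for whatever call blocks until the result is really available; in TensorFlow eager mode I assume that converting the output to NumPy with `.numpy()` would have that effect, but I have not found this documented for TF-TRT.

```python
import statistics
import time


def benchmark(run_once, sync, warmup=10, iters=100, batch_size=1):
    """Time run_once(); sync(result) must block until the result is ready."""
    # Warm-up runs are excluded so one-time costs (e.g. engine building)
    # do not distort the statistics.
    for _ in range(warmup):
        sync(run_once())

    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        # Synchronize BEFORE stopping the timer, otherwise only the
        # launch overhead of an asynchronous call would be measured.
        sync(run_once())
        latencies.append(time.perf_counter() - start)

    mean_s = statistics.mean(latencies)
    return {
        "mean_latency_ms": mean_s * 1e3,
        "median_latency_ms": statistics.median(latencies) * 1e3,
        "throughput_per_s": batch_size / mean_s,
    }


def fake_infer():
    # Stand-in CPU workload (hypothetical); replace with the model call.
    return sum(i * i for i in range(10_000))


# For a real model, `sync` might be `lambda out: out.numpy()` (assumption).
stats = benchmark(fake_infer, sync=lambda out: out, warmup=2, iters=20, batch_size=8)
print(stats)
```

With PyTorch, `sync` would simply call torch.cuda.synchronize(); what I am missing is the equivalent for TF-TRT.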
Have I missed anything or are there plans to integrate such functionality?
Thank you