Fix frame handoff race and eliminate redundant copies#786
Conversation
royshil
left a comment
There was a problem hiding this comment.
I remember vaguely that we had used swap in the past and reversed back to clone... iirc the reason was that it ran into mem alloc issues
Time savings are great but we can't regress functionality or introduce bugs
|
Thanks for the review! I searched the history and couldn't find evidence of You might be thinking of PR #225, which proposed replacing Our
Correction: Since swap empties |
umireon
left a comment
There was a problem hiding this comment.
Working on the CONTRIBUTING.md now. Sorry but wait a moment, please.
royshil
left a comment
There was a problem hiding this comment.
i think the changes check out. curious to see it in action
|
@xav-ie I've added checklist to your description. Feel free to ask us for question, it is added to this project just now. |
Replace clone() with cv::swap() in video_tick for both background and enhance filters. Swap is O(1) — it transfers buffer ownership to the tick thread instead of deep-copying ~8 MB per frame. In the render thread, use copyTo() instead of clone() for the stage surface data. copyTo reuses the existing cv::Mat allocation when dimensions match, avoiding a fresh heap allocation every frame. Add a newFrameAvailable flag (protected by inputBGRALock) so the tick thread only processes genuinely new frames and doesn't re-process stale data after swap empties the shared buffer. Signed-off-by: Xavier Ruiz <github@xav.ie>
c1b944d to
a47ceb3
Compare
|
Licensing works are almost completed. #797 |
|
@xav-ie are you still working o n this as a draft or want to make it a legit pull requets? |
| } | ||
| if (tf->inputBGRA.empty()) { | ||
| // No data to process | ||
| if (!tf->newFrameAvailable) { |
There was a problem hiding this comment.
The following code:
if (!tf->newFrameAvailable) {
return;
}
imageBGRA = tf->inputBGRA.clone();
cv::swap(imageBGRA, tf->inputBGRA);
tf->newFrameAvailable = false;
should be like:
if (tf->newFrameAvailable.exchange(false, std::memory_order_acq_rel)) {
cv::swap(imageBGRA, tf->inputBGRA);
}
for better performance with appropriate memory barrier.
| if (!tf->newFrameAvailable) { | ||
| return; | ||
| } | ||
| cv::swap(imageBGRA, tf->inputBGRA); | ||
| tf->newFrameAvailable = false; |
There was a problem hiding this comment.
| if (!tf->newFrameAvailable) { | |
| return; | |
| } | |
| cv::swap(imageBGRA, tf->inputBGRA); | |
| tf->newFrameAvailable = false; | |
| if (!tf->newFrameAvailable.exchange(false, std::memory_order_acq_rel)) { | |
| cv::swap(imageBGRA, tf->inputBGRA); | |
| } |
Summary
inputBGRA.empty()checks with anstd::atomic<bool> newFrameAvailableflag, fixing a data race where the tick thread could readinputBGRAoutside the mutex inenhance-filter.cppcv::swap()instead ofclone()in bothbackground-filter.cppandenhance-filter.cppfor O(1) frame handoff from render thread to tick threadcopyTo()instead ofclone()inobs-utils.cppso the destination buffer is reused when dimensions haven't changed, avoiding a heap allocation per frameBenchmark results (300 iterations, best of 3 runs)
Pull Request Checklist
Please read our latest CONTRIBUTING.md.