Skip to content

Fix frame handoff race and eliminate redundant copies#786

Draft
xav-ie wants to merge 1 commit into
royshil:mainfrom
xav-ie:pr/fix-race-and-reduce-copies
Draft

Fix frame handoff race and eliminate redundant copies#786
xav-ie wants to merge 1 commit into
royshil:mainfrom
xav-ie:pr/fix-race-and-reduce-copies

Conversation

@xav-ie
Copy link
Copy Markdown

@xav-ie xav-ie commented Feb 26, 2026

Summary

  • Replaces inputBGRA.empty() checks with an std::atomic<bool> newFrameAvailable flag, fixing a data race where the tick thread could read inputBGRA outside the mutex in enhance-filter.cpp
  • Uses cv::swap() instead of clone() in both background-filter.cpp and enhance-filter.cpp for O(1) frame handoff from render thread to tick thread
  • Uses copyTo() instead of clone() in obs-utils.cpp so the destination buffer is reused when dimensions haven't changed, avoiding a heap allocation per frame

Benchmark results (300 iterations, best of 3 runs)

Step Median (µs/frame) Change
Before (clone + empty() check) 1127.7
After (swap + copyTo + atomic flag) 675.9 −40%

Pull Request Checklist

Please read our latest CONTRIBUTING.md.

  • I have read the latest CONTRIBUTING.md.
  • I have acknowledged the licensing and patent grant policy.
  • I have included the required license headers.
  • I am confident with my code integrity.
  • I have signed off and verified all my commits.

Copy link
Copy Markdown
Owner

@royshil royshil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember vaguely that we had used swap in the past and reversed back to clone... iirc the reason was that it ran into mem alloc issues

Time savings are great but we can't regress functionality or introduce bugs

@xav-ie
Copy link
Copy Markdown
Author

xav-ie commented Feb 27, 2026

Thanks for the review! I searched the history and couldn't find evidence of cv::swap being used before:

$ git log origin/main -S "cv::swap" --oneline -- src/
(no results)

You might be thinking of PR #225, which proposed replacing cv::Mat with raw uint8_t buffers + memcpy for inter-thread communication — that was closed in favor of the mutex + clone() approach in PR #224.

Our cv::swap here is safe under the existing mutex:

  • Render thread (obs-utils.cpp): holds inputBGRALock, calls copyTo() which writes into tf->inputBGRA's buffer
  • Tick thread (background-filter.cpp): holds inputBGRALock, calls cv::swap(imageBGRA, tf->inputBGRA) which just swaps the Mat headers (pointer, size, refcount) in O(1) — no allocation, no copy
  • After the swap, the tick thread exclusively owns the pixel data and tf->inputBGRA is left empty
  • So the only allocation that ever happens is when frame dimensions change, which is the same as clone() but without re-allocating every single frame.

Correction: Since swap empties tf->inputBGRA, copyTo allocates fresh on the next render — so this halves allocations and copies (2 per cycle → 1), not eliminates them. That's where the ~40% benchmark improvement comes from.

@umireon
Copy link
Copy Markdown
Collaborator

umireon commented Feb 27, 2026

#224 and #225 are my PRs. Thanks for contributing stability!

Copy link
Copy Markdown
Collaborator

@umireon umireon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on the CONTRIBUTING.md now. Sorry but wait a moment, please.

@xav-ie xav-ie marked this pull request as draft February 27, 2026 12:39
Copy link
Copy Markdown
Owner

@royshil royshil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the changes check out. curious to see it in action

@umireon
Copy link
Copy Markdown
Collaborator

umireon commented Feb 27, 2026

@xav-ie I've added checklist to your description. Feel free to ask us for question, it is added to this project just now.

@umireon umireon dismissed their stale review February 27, 2026 17:53

Checklist added

Replace clone() with cv::swap() in video_tick for both background
and enhance filters. Swap is O(1) — it transfers buffer ownership
to the tick thread instead of deep-copying ~8 MB per frame.

In the render thread, use copyTo() instead of clone() for the stage
surface data. copyTo reuses the existing cv::Mat allocation when
dimensions match, avoiding a fresh heap allocation every frame.

Add a newFrameAvailable flag (protected by inputBGRALock) so the
tick thread only processes genuinely new frames and doesn't
re-process stale data after swap empties the shared buffer.

Signed-off-by: Xavier Ruiz <github@xav.ie>
@xav-ie xav-ie force-pushed the pr/fix-race-and-reduce-copies branch from c1b944d to a47ceb3 Compare March 8, 2026 05:07
@umireon
Copy link
Copy Markdown
Collaborator

umireon commented Mar 8, 2026

Licensing works are almost completed. #797

@umireon
Copy link
Copy Markdown
Collaborator

umireon commented Mar 10, 2026

@xav-ie #797 merged. Please update your branches and let me help you align with our guidelines.

@royshil
Copy link
Copy Markdown
Owner

royshil commented Mar 15, 2026

@xav-ie are you still working o n this as a draft or want to make it a legit pull requets?

@umireon umireon added the auto-review Turn on this when PR is ready label Mar 15, 2026
Comment thread src/background-filter.cpp
}
if (tf->inputBGRA.empty()) {
// No data to process
if (!tf->newFrameAvailable) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following code:

		if (!tf->newFrameAvailable) {
			return;
		}
		imageBGRA = tf->inputBGRA.clone();
		cv::swap(imageBGRA, tf->inputBGRA);
		tf->newFrameAvailable = false;

should be like:

		if (tf->newFrameAvailable.exchange(false, std::memory_order_acq_rel)) {
			cv::swap(imageBGRA, tf->inputBGRA);
		}

for better performance with appropriate memory barrier.

Comment thread src/enhance-filter.cpp
Comment on lines +257 to +261
if (!tf->newFrameAvailable) {
return;
}
cv::swap(imageBGRA, tf->inputBGRA);
tf->newFrameAvailable = false;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!tf->newFrameAvailable) {
return;
}
cv::swap(imageBGRA, tf->inputBGRA);
tf->newFrameAvailable = false;
if (!tf->newFrameAvailable.exchange(false, std::memory_order_acq_rel)) {
cv::swap(imageBGRA, tf->inputBGRA);
}

@umireon umireon added this to the 1.4.1 milestone Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-review Turn on this when PR is ready

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants