
Add Recursive Scene Change Detection #190

Draft
BoatsMcGee wants to merge 2 commits into rust-av:master from BoatsMcGee:recursive-decode

BoatsMcGee commented Jul 7, 2025

This is a work in progress. It needs a major refactor to support concurrency without moving certain objects into the thread closure when using std::thread::spawn(move || {}).

Decoding linearly with VapoursynthDecoder does not fully utilize the system resources. This PR aims to add a recursive method to split the decode work into a binary tree and process nodes concurrently to maximize resource utilization and potentially speed up Scene Detection.

Uses the latest changes from rust-av/av-decoders#1 to seek and decode randomly.

Process

Split the video in half and begin seeking from the middle to the next scene change. Use this scene change point as the boundary for 2 child nodes: Left and Right. Left handles the range from the start to the scene change point, and Right handles the range from the scene change point to the end. Each child node repeats this process until a stopping condition is met, such as the node being shorter than 10 seconds. Tree initialization should also be parallelized when initializing the child nodes. We can limit the concurrency by using a semaphore.
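The splitting step could be sketched roughly as follows. This is only an illustration, not the code in this PR: Node, build_tree, collect_leaves and the next_scene_change callback are hypothetical names, the callback stands in for "seek to a frame and decode until the next scene change", and min_len is assumed to be the 10 second threshold already converted to frames.

```rust
use std::ops::Range;

/// A node of the split tree. It stores only a frame range, no decoder.
#[derive(Debug)]
enum Node {
    Leaf(Range<usize>),
    Branch {
        range: Range<usize>,
        left: Box<Node>,
        right: Box<Node>,
    },
}

/// Recursively split `range` at the first scene change at or after its middle.
/// `next_scene_change(from, end)` is a hypothetical callback that returns the
/// first scene change frame in `from..end`, if there is one.
fn build_tree<F>(range: Range<usize>, min_len: usize, next_scene_change: &F) -> Node
where
    F: Fn(usize, usize) -> Option<usize>,
{
    // Stopping condition: the node is already short enough.
    if range.len() < min_len {
        return Node::Leaf(range);
    }
    let middle = range.start + range.len() / 2;
    match next_scene_change(middle, range.end) {
        // Only split when the cut lies strictly inside the range, so both
        // children are smaller than the parent and the recursion terminates.
        Some(cut) if cut > range.start && cut < range.end => {
            let left = build_tree(range.start..cut, min_len, next_scene_change);
            let right = build_tree(cut..range.end, min_len, next_scene_change);
            Node::Branch {
                range,
                left: Box::new(left),
                right: Box::new(right),
            }
        }
        // No scene change found after the middle: keep the node whole.
        _ => Node::Leaf(range),
    }
}

/// Collect the leaf ranges from left to right, i.e. in chronological order.
fn collect_leaves(node: &Node, out: &mut Vec<Range<usize>>) {
    match node {
        Node::Leaf(range) => out.push(range.clone()),
        Node::Branch { left, right, .. } => {
            collect_leaves(left, out);
            collect_leaves(right, out);
        }
    }
}
```

Because each node holds nothing but a frame range, the tree itself can be shared or sent between threads freely. The sketch builds the tree sequentially; parallel initialization and the semaphore are left out.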

Once the tree is initialized, it is traversed depth-first down to the leaf nodes, which handle the actual work of decoding and finding scene change keyframes. All leaf nodes should be processed in chronological order before any parent nodes begin merging child node results. These are the processes that should be parallelized to speed up Scene Detection.
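A minimal sketch of processing the leaves with bounded concurrency is shown below. It makes several assumptions: detect_leaves and the detect callback are hypothetical, a fixed pool of worker threads is used in place of a semaphore (the standard library has no stable semaphore type), and the parent merge step is simplified to concatenating leaf results in chronological order.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Run `detect` on every leaf using at most `max_workers` threads, then
/// concatenate the per-leaf results in leaf (chronological) order.
/// `detect` is a hypothetical stand-in for "decode this frame range and
/// return the scene change frames found in it".
fn detect_leaves<F>(leaves: &[Range<usize>], max_workers: usize, detect: F) -> Vec<usize>
where
    F: Fn(&Range<usize>) -> Vec<usize> + Sync,
{
    // Index of the next leaf that has not been claimed by a worker yet.
    let next = AtomicUsize::new(0);
    // One result slot per leaf, so completion order does not matter.
    let results: Mutex<Vec<Vec<usize>>> = Mutex::new(vec![Vec::new(); leaves.len()]);
    let workers = max_workers.clamp(1, leaves.len().max(1));

    // Scoped threads may borrow `leaves`, `detect`, `next` and `results`.
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= leaves.len() {
                    break;
                }
                // Do the expensive work outside the lock.
                let found = detect(&leaves[i]);
                results.lock().unwrap()[i] = found;
            });
        }
    });

    // All workers have finished here; flatten the slots in leaf order.
    results.into_inner().unwrap().into_iter().flatten().collect()
}
```

Leaves may finish in any order, but each result is written to the slot of its leaf, so the merged list comes out chronological as long as the leaves themselves are given in chronological order.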

Caveats

Currently, the code is entirely linear and has no concurrency yet. The challenge is passing the decoder to a spawned thread, which is nearly impossible. Instead, building the tree and working on leaf nodes should not require individual access to the decoder, and the code should be rewritten so that they do not.
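For illustration, one common way to sidestep this kind of problem is to send only plain data to each thread and open a separate decoder inside it. This is not necessarily the approach this PR will take. The sketch assumes the difficulty is that the decoder type is not Send, and that opening several decoders on the same source is acceptable. FakeDecoder and seek_in_threads are hypothetical stand-ins, not the VapoursynthDecoder API.

```rust
use std::rc::Rc;
use std::thread;

/// Hypothetical stand-in for a decoder that cannot be sent across threads.
/// The `Rc` field makes this type `!Send`, so a closure passed to
/// `thread::spawn` could not capture an existing instance.
struct FakeDecoder {
    source: Rc<String>,
    position: usize,
}

impl FakeDecoder {
    fn open(source: &str) -> Self {
        FakeDecoder {
            source: Rc::new(source.to_owned()),
            position: 0,
        }
    }

    fn seek(&mut self, frame: usize) {
        self.position = frame;
    }

    fn describe(&self) -> String {
        format!("{}@{}", self.source, self.position)
    }
}

/// Each worker receives only `Send` data (the source name and a frame
/// number) and opens its own decoder after the thread has started.
fn seek_in_threads(source: &str, starts: &[usize]) -> Vec<String> {
    thread::scope(|scope| {
        let handles: Vec<_> = starts
            .iter()
            .map(|&start| {
                scope.spawn(move || {
                    // The decoder is created and dropped on this thread only.
                    let mut decoder = FakeDecoder::open(source);
                    decoder.seek(start);
                    decoder.describe()
                })
            })
            .collect();
        // Join in spawn order so results line up with `starts`.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Whether this is practical depends on how expensive it is to open and seek a real decoder, which this sketch does not model.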

There is one case that may need to be addressed: a node that starts from a non-keyframe might miss a valid scene change because it is too close to the starting point. This can happen on a Right node when !node.start_is_keyframe and the actual scene change keyframe lies between start (which is the "blind" middle) and start + options.minimum_scene_length. If the minimum scene length option prevents a node from finding such a scene change, we can instead move the starting point backwards by the minimum scene length and analyze from there, which should avoid this condition. If multiple scene change keyframes are somehow found within that range, we should use the one closest to the start (the "blind" middle), or in other words the last one.
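The start adjustment described above might look like the following sketch. analysis_start is a hypothetical helper; its parameters mirror node.start_is_keyframe and options.minimum_scene_length, and the minimum scene length is assumed to be expressed in frames. Choosing among multiple candidate keyframes is not covered.

```rust
/// Hypothetical helper: the frame from which a node should begin analysis.
/// A node that starts on a known keyframe is analyzed from its own start;
/// otherwise the start is moved back by the minimum scene length.
fn analysis_start(start: usize, start_is_keyframe: bool, minimum_scene_length: usize) -> usize {
    if start_is_keyframe {
        start
    } else {
        // saturating_sub clamps at frame 0 near the beginning of the video.
        start.saturating_sub(minimum_scene_length)
    }
}
```

For example, with a minimum scene length of 24 frames, a Right node starting at the non-keyframe 1106 would be analyzed from frame 1082.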

Merging may result in a scene that violates options.maximum_scene_length, but splitting that scene would leave at least one of the resulting scenes violating options.minimum_scene_length. For this reason, specifying options.maximum_scene_length should either be disallowed or cause a fallback to linear decoding.

Example Output

Note: this run is linear and not yet concurrent, but it demonstrates the order in which processing occurs for both Initialization and Scene Detection.

Starting detection
Node[0-2213]: Initializing
Node[0-1106]: Initializing
Node[0-553]: Initializing
Node[0-276]: Initializing
Node[0-276]: Done initializing
Node[276-553]: Initializing
Node[276-553]: Done initializing
Node[0-553]: Done initializing
Node[553-1106]: Initializing
Node[553-829]: Initializing
Node[553-829]: Done initializing
Node[829-1106]: Initializing
Node[829-1106]: Done initializing
Node[553-1106]: Done initializing
Node[0-1106]: Done initializing
Node[1106-2213]: Initializing
Node[1106-1659]: Initializing
Node[1106-1382]: Initializing
Node[1106-1382]: Done initializing
Node[1382-1659]: Initializing
test tests::test_vapoursynth_detect has been running for over 60 seconds
Node[1382-1659]: Done initializing
Node[1106-1659]: Done initializing
Node[1659-2213]: Initializing
Node[1659-1936]: Initializing
Node[1659-1936]: Done initializing
Node[1936-2213]: Initializing
Node[1936-2213]: Done initializing
Node[1659-2213]: Done initializing
Node[1106-2213]: Done initializing
Node[0-2213]: Done initializing
Node[0-153]: Starting detection
Node[0-153]: Finished detection
Node[153-276]: Starting detection
Node[153-276]: Finished detection
Node[276-466]: Starting detection
Node[276-466]: Finished detection
Node[466-553]: Starting detection
Node[466-553]: Finished detection
Node[553-693]: Starting detection
Node[553-693]: Finished detection
Node[693-829]: Starting detection
Node[693-829]: Finished detection
Node[829-1004]: Starting detection
Node[829-1004]: Finished detection
Node[1004-1106]: Starting detection
Node[1004-1106]: Finished detection
Node[1106-1244]: Starting detection
Node[1106-1244]: Finished detection
Node[1244-1382]: Starting detection
Node[1244-1382]: Finished detection
Node[1382-1520]: Starting detection
Node[1382-1520]: Finished detection
Node[1520-1659]: Starting detection
Node[1520-1659]: Finished detection
Node[1659-1798]: Starting detection
Node[1659-1798]: Finished detection
Node[1798-1936]: Starting detection
Node[1798-1936]: Finished detection
Node[1936-2103]: Starting detection
Node[1936-2103]: Finished detection
Node[2103-2213]: Starting detection
Node[2103-2213]: Finished detection
DetectionResults { scene_changes: [0, 93, 153, 209, 310, 466, 670, 693, 1004, 1183, 1429, 1798, 1902, 1965, 2103], ...

Thanks,
- Boats M.
