Add Recursive Scene Change Detection #190
Draft
BoatsMcGee wants to merge 2 commits into
Decoding linearly with VapoursynthDecoder does not fully utilize the system resources. This PR aims to add a recursive method to split the decode work into a binary tree and process nodes concurrently to maximize resource utilization and potentially speed up Scene Detection.
Uses the latest changes from rust-av/av-decoders#1 to seek to and decode from arbitrary positions.
Process
Split the video in half and seek from the midpoint forward to the next scene change. Use this scene change as the boundary between two child nodes, Left and Right: Left covers the range from the start to the scene change, and Right covers the range from the scene change to the end. Each child node repeats this process until a stopping condition is met, such as the node being shorter than 10 seconds. Initializing the child nodes should also be parallelized, and concurrency can be limited with a semaphore.
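A minimal linear sketch of the splitting step described above, assuming frame indices instead of timestamps. `Node`, `build_tree`, `min_len`, and the `next_scene_change` callback are hypothetical names for illustration, not the PR's actual types; the callback stands in for seeking the decoder to the midpoint and scanning forward. The semaphore-limited parallel initialization is left out.

```rust
/// A node covering the half-open frame range [start, end).
/// Hypothetical type for illustration only.
#[derive(Debug)]
struct Node {
    start: usize,
    end: usize,
    /// `None` for a leaf, otherwise the (Left, Right) children.
    children: Option<Box<(Node, Node)>>,
}

/// Recursively split [start, end) at the first scene change found at or
/// after the midpoint. `next_scene_change(from, end)` is an assumed callback
/// that returns the first scene change in [from, end), if any.
fn build_tree<F>(start: usize, end: usize, min_len: usize, next_scene_change: &F) -> Node
where
    F: Fn(usize, usize) -> Option<usize>,
{
    let mut node = Node { start, end, children: None };

    // Stopping condition: the node is already short enough to be a leaf.
    if end - start < min_len {
        return node;
    }

    // Start from the "blind" middle and look for the next scene change.
    let mid = start + (end - start) / 2;
    if let Some(cut) = next_scene_change(mid, end) {
        // Only split when both halves are non-empty, so recursion terminates.
        if cut > start && cut < end {
            let left = build_tree(start, cut, min_len, next_scene_change);
            let right = build_tree(cut, end, min_len, next_scene_change);
            node.children = Some(Box::new((left, right)));
        }
    }
    node
}
```

With a threshold of 10 seconds, `min_len` would be roughly `10 * fps` frames. A node with no scene change between its midpoint and its end simply stays a leaf in this sketch.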
Once the tree is initialized, it is traversed depth-first to the leaf nodes, which handle the actual work of decoding and finding scene change keyframes. All leaf nodes should be processed in chronological order before any parent nodes begin merging child node results. These are the steps that should be parallelized to speed up Scene Detection.
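One possible shape for the leaf-processing stage, as a rough sketch using only the standard library. It collects the leaves depth-first (Left before Right, which yields chronological order), hands them to a fixed number of scoped worker threads, and concatenates the per-leaf results in leaf order. The names `Node`, `collect_leaves`, `detect_parallel`, and the `detect` callback are hypothetical. The sketch assumes each call to `detect` can do its own decoding without a shared decoder, and it uses a fixed worker count in place of a semaphore. Any merging logic beyond ordered concatenation is not modeled.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical tree node covering the half-open frame range [start, end).
struct Node {
    start: usize,
    end: usize,
    children: Option<Box<(Node, Node)>>,
}

/// Depth-first, Left-before-Right traversal: leaves come out in
/// chronological order.
fn collect_leaves(node: &Node, out: &mut Vec<(usize, usize)>) {
    match &node.children {
        Some(pair) => {
            collect_leaves(&pair.0, out);
            collect_leaves(&pair.1, out);
        }
        None => out.push((node.start, node.end)),
    }
}

/// Run `detect` on every leaf using at most `workers` threads, then
/// concatenate the results in chronological leaf order.
fn detect_parallel<F>(root: &Node, workers: usize, detect: F) -> Vec<usize>
where
    F: Fn(usize, usize) -> Vec<usize> + Sync,
{
    let mut leaves = Vec::new();
    collect_leaves(root, &mut leaves);

    // Shared cursor: each worker claims the next unprocessed leaf index.
    let next = AtomicUsize::new(0);
    // One result slot per leaf, so output order does not depend on timing.
    let results: Mutex<Vec<Vec<usize>>> = Mutex::new(vec![Vec::new(); leaves.len()]);

    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= leaves.len() {
                    break;
                }
                let (start, end) = leaves[i];
                let found = detect(start, end);
                results.lock().unwrap()[i] = found;
            });
        }
    });

    // All workers have joined here; flatten the slots in leaf order.
    results.into_inner().unwrap().into_iter().flatten().collect()
}
```

Writing each leaf's result into its own slot keeps the final keyframe list chronological regardless of which worker finishes first.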
Caveats
Currently, the code is entirely linear and has no concurrency yet. The challenge is passing the decoder to a spawned thread, which is nearly impossible. Building the tree and working on leaf nodes should therefore not require individual access to the decoder, and the code should be rewritten so that it does not.
There is one case that may need to be addressed: a node that starts from a non-keyframe might miss a valid scene change because it is too close to the starting point. If the minimum scene length option prevents a node from finding a valid scene change that is too close to the starting point, we can instead move the starting point backwards by the minimum scene length and analyze from there. This can happen on a Right node when `!node.start_is_keyframe` and the actual scene change keyframe is between `start` (which is the "blind" middle) and `start + options.minimum_scene_length`. Moving the starting point back by that distance should avoid this condition. Also, if multiple scene change keyframes are somehow found within that range, we should use the one closest to the `start` ("blind" middle), or in other words the last one.

Merging may result in a scene that violates `options.maximum_scene_length`, but splitting results in at least one of the scenes violating `options.minimum_scene_length`. For this reason, specifying `options.maximum_scene_length` should be disabled or fall back to linear decoding.

Example Output
Note: this is linear and not yet concurrent but demonstrates the order in which processing occurs for both Initialization and Scene Detection.
Thanks,
- Boats M.