fix: audio/video sync drift in fixMP4BoxFileDuration #446

Merged: hughfenghen merged 2 commits into WebAV-Tech:main from johncalvinroberts:main on Aug 1, 2025

@johncalvinroberts
Contributor

Hello! This PR fixes a small issue leading to audio/video sync drift when using fastConcatMP4 to concatenate mp4s.

The problem: if the input MP4 streams passed to fastConcatMP4 contain internal timestamps that don't start at 0, A/V drift accumulates during concatenation and the video no longer matches the audio. Both tracks are affected: the output of fastConcatMP4 has the wrong duration, shows A/V drift, and drops frames because the durations are calculated incorrectly.

This fixes it by normalizing sample timestamps within each stream to start from 0 before applying concatenation offsets, ensuring the correct duration is calculated. This also synchronizes audio timing to the video timeline to maintain correct a/v sync.
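
As a rough illustration of the normalization idea described above (not the actual code in this PR), the sketch below shifts each clip's samples so they start at 0 and then applies a running concatenation offset. The `Sample` shape and the `normalizeToZero` and `concatTrack` helpers are hypothetical names introduced only for this example, and the sketch assumes all clips share one timescale.

```typescript
// Hypothetical sample shape for illustration; real MP4 samples carry more fields.
interface Sample {
  cts: number;      // composition timestamp, in track timescale units
  dts: number;      // decode timestamp, in track timescale units
  duration: number; // sample duration, in track timescale units
}

// Shift one clip's samples so the earliest decode timestamp becomes 0.
function normalizeToZero(samples: Sample[]): Sample[] {
  if (samples.length === 0) return [];
  const base = Math.min(...samples.map((s) => s.dts));
  return samples.map((s) => ({ ...s, cts: s.cts - base, dts: s.dts - base }));
}

// Concatenate the clips of one track: normalize each clip first,
// then apply the running offset (sum of durations of all earlier clips).
function concatTrack(clips: Sample[][]): Sample[] {
  const out: Sample[] = [];
  let offset = 0;
  for (const clip of clips) {
    const normalized = normalizeToZero(clip);
    for (const s of normalized) {
      out.push({ ...s, cts: s.cts + offset, dts: s.dts + offset });
    }
    offset += normalized.reduce((sum, s) => sum + s.duration, 0);
  }
  return out;
}

// Two clips whose internal timestamps start at 1000 and 500 rather than 0.
const clipA: Sample[] = [
  { cts: 1000, dts: 1000, duration: 40 },
  { cts: 1040, dts: 1040, duration: 40 },
];
const clipB: Sample[] = [{ cts: 500, dts: 500, duration: 40 }];
const joined = concatTrack([clipA, clipB]);
console.log(joined.map((s) => s.dts)); // [0, 40, 80]
```

Because each clip is rebased before the offset is added, the internal start offsets (1000 and 500 here) never leak into the output timeline.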

I don't have a repro case I can share right now but can provide if needed.

@hughfenghen
Collaborator

Thanks for the fix.
Please provide sample files that can reproduce the issue.

@johncalvinroberts
Contributor Author

johncalvinroberts commented Jul 29, 2025

Thanks for your response!! @hughfenghen

To demo a reproduction of the bug, please follow these steps.

  1. Firstly, clone this repro: https://github.com/johncalvinroberts/webav-concat-repro
  2. Run it locally and choose one of the examples (Bars example is the most reliable way to repro)
  3. Click "concatenate videos"
  4. To visually see the audio drift, click "Sync play both" to play back the original video alongside the video concatenated with fastConcatMP4. You can also play the concatenated video alone to see the A/V drift (watch consonants like p or b to see the drift).
  5. Download the two videos (concatenated.mp4 + original.mp4)
  6. There's a bash script in the repository with an ffprobe command to compare the two videos. Run it against the downloaded videos, like: sh ./check_lengths.sh ./original.mp4 ./concatenated.mp4
  7. Inspect the output to see discrepancies between the downloaded concatenated and original videos, like so:
File                 | Duration (frames)         | Audio Duration | A/V Diff
-------------------- | ------------------------- | -------------- | ------------
concatenated.mp4     | 56400.0ms (1410 frames)   | 56872.0ms      | +472.0ms ⚠️
original.mp4         | 56720.0ms (1418 frames)   | 56746.0ms      | +26.0ms ✅

So, as you can see, the file concatenated with WebAV is missing 8 frames (1410 vs. 1418): its video track is 320 ms shorter and its audio track is 126 ms longer than in the original video.
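
For reference, the figures in the table can be re-derived with a few lines of arithmetic. This is only a sketch that reuses the numbers printed above; the `ProbeRow` type and the helper names are made up for this example and are not part of check_lengths.sh, and it assumes the A/V Diff column is audio duration minus video duration, which matches both rows.

```typescript
// Hypothetical row type holding the values copied from the table above.
interface ProbeRow {
  file: string;
  videoMs: number; // video duration in milliseconds
  frames: number;  // video frame count
  audioMs: number; // audio duration in milliseconds
}

const concatenated: ProbeRow = { file: "concatenated.mp4", videoMs: 56400, frames: 1410, audioMs: 56872 };
const original: ProbeRow = { file: "original.mp4", videoMs: 56720, frames: 1418, audioMs: 56746 };

// Assumed definition of the A/V Diff column: audio duration minus video duration.
const avDiffMs = (r: ProbeRow): number => r.audioMs - r.videoMs;
// Average time covered by one video frame.
const frameMs = (r: ProbeRow): number => r.videoMs / r.frames;

const missingFrames = original.frames - concatenated.frames;
const missingVideoMs = original.videoMs - concatenated.videoMs;
const extraAudioMs = concatenated.audioMs - original.audioMs;

console.log(avDiffMs(concatenated)); // 472
console.log(avDiffMs(original));     // 26
console.log(missingFrames);          // 8
console.log(frameMs(original));      // 40
console.log(missingVideoMs);         // 320
console.log(extraAudioMs);           // 126
```

Both files average 40 ms per frame, so the 320 ms of missing video corresponds exactly to the 8 missing frames.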

My fix addresses this by normalizing sample timestamps within each input stream to start from 0 before applying concatenation offsets. The root cause was that input MP4 streams contain internal timestamp offsets that don't start at 0, which were being preserved and accumulated during concatenation, causing progressive A/V drift. The normalization eliminates these internal offsets while maintaining proper synchronization between audio and video tracks.

@hughfenghen hughfenghen merged commit cdffcbd into WebAV-Tech:main Aug 1, 2025
2 checks passed
@github-actions github-actions Bot mentioned this pull request Aug 1, 2025