fix(ffmpeg): drop non-primary audio streams to support iPhone spatial video#91

Open
yangxin317 wants to merge 1 commit into FireRedTeam:main from yangxin317:fix/spatial-audio-apac-stream


Summary

iPhone 15/16 Pro shoots Spatial Video (.mov) with two audio streams: a regular AAC stereo track plus an Apple Spatial Audio Codec (apac) track. ffmpeg has no codec tag for apac in the mp4 container, so a stream-copy that maps every audio stream during shot segmentation fails:

[mp4]          Could not find tag for codec none in stream #2,
               codec not currently supported in container
[out#0/segment] Could not write header (incorrect codec parameters?)

The downstream LLM then receives the ffmpeg error text as the clip description and refuses to proceed, so the entire workflow stalls right after the very first user message.

Fix

Restrict the implicit audio mapping in SAFE_MAP_ARGS to the first audio stream (-map 0:a:0?). The ? suffix keeps the mapping optional for media that has no audio, preserving previous behavior on those files.

-SAFE_MAP_ARGS = ["-map", "0:v:0", "-map", "0:a?",  "-dn", "-sn"]
+SAFE_MAP_ARGS = ["-map", "0:v:0", "-map", "0:a:0?", "-dn", "-sn"]
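
As a rough illustration of where the patched constant ends up, the sketch below assembles a stream-copy segmentation command around it. The helper name `build_segment_cmd`, its parameters, and the fixed `-segment_time` value are assumptions made for this example; the actual call site in `ffmpeg_utils.py` is not shown in this PR and may differ.

```python
from typing import List

# Patched constant, copied from the diff above.
SAFE_MAP_ARGS = ["-map", "0:v:0", "-map", "0:a:0?", "-dn", "-sn"]


def build_segment_cmd(src: str, out_pattern: str, segment_seconds: float = 10.0) -> List[str]:
    """Hypothetical helper: build an ffmpeg stream-copy segmentation command.

    Only the first video stream and, if present, the first audio stream are
    mapped; data (-dn) and subtitle (-sn) streams are dropped.
    """
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-i", src,
        *SAFE_MAP_ARGS,           # output mapping options follow the input
        "-c", "copy",             # stream copy, no re-encode
        "-f", "segment",
        "-segment_time", str(segment_seconds),  # assumed fixed-length split
        "-reset_timestamps", "1",
        out_pattern,
    ]


if __name__ == "__main__":
    print(" ".join(build_segment_cmd("spatial.mov", "seg_%03d.mp4")))
```

Because `0:a:0?` selects at most one audio stream, a second audio track such as the apac one never reaches the mp4 muxer, and the trailing `?` means a file with no audio at all still maps cleanly.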

Test plan

  • Reproduced failure on a 4K HEVC Spatial Video .mov from an iPhone 15 Pro (apac audio stream at index 2)
  • After patch, ffmpeg ... -f segment ... produces valid mp4 segments and split_shots completes
  • Re-tested on plain mp4 (no spatial audio) and on a clip with no audio — semantics unchanged
  • Full workflow (split_shots → understand_clips) progresses past the failure point
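
One way the checks above could be automated is to ask ffprobe for the audio streams of a source file or an output segment and count them. The sketch below is not part of this PR: the function names are hypothetical, and the JSON strings in the usage example are hand-written stand-ins for ffprobe output, not captured from a real run.

```python
import json
from typing import List


def ffprobe_audio_cmd(path: str) -> List[str]:
    """Build an ffprobe command that lists audio streams as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",
        "-show_entries", "stream=index,codec_name",
        "-of", "json",
        path,
    ]


def audio_codecs(ffprobe_json: str) -> List[str]:
    """Return the codec name of each audio stream, 'unknown' if absent."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return [s.get("codec_name", "unknown") for s in streams]


def has_at_most_one_audio_stream(ffprobe_json: str) -> bool:
    """True when a file carries zero or one audio stream."""
    return len(audio_codecs(ffprobe_json)) <= 1


if __name__ == "__main__":
    # Hand-written sample payloads (assumed shape, for illustration only).
    source = '{"streams": [{"index": 1, "codec_name": "aac"}, {"index": 2}]}'
    segment = '{"streams": [{"index": 1, "codec_name": "aac"}]}'
    print(has_at_most_one_audio_stream(source))
    print(has_at_most_one_audio_stream(segment))
```

In practice the JSON would come from running `ffprobe_audio_cmd(path)` through `subprocess.run(..., capture_output=True, text=True)` and passing its stdout to these helpers.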

Affected files

src/open_storyline/utils/ffmpeg_utils.py (1 line)

… video

iPhone 15 Pro+ records Spatial Video with two audio streams: a regular
stereo AAC track plus an Apple Spatial Audio Codec (apac) track. ffmpeg
has no codec tag for `apac` in the mp4 container, so a stream-copy that
maps all audio streams during shot segmentation fails with:

    [mp4] Could not find tag for codec none in stream #2,
    codec not currently supported in container
    [out#0/segment] Could not write header (incorrect codec parameters?)

The downstream LLM then receives the ffmpeg error text as the clip
description and refuses to proceed, so the entire workflow stalls right
after the first user message.

Restrict the implicit audio mapping to the first audio stream
(`0:a:0?`) so the apac stream is dropped during segmentation. The `?`
suffix keeps the mapping optional for media without audio.

Repro: upload an iPhone 15/16 Pro Spatial Video .mov and run any
workflow that triggers `split_shots`.