Skip to content

Fix/s2v sekotalk only#1194

Merged
helloyongyang merged 2 commits into
mainfrom
fix/s2v-sekotalk-only
Jun 29, 2026
Merged

Fix/s2v sekotalk only#1194
helloyongyang merged 2 commits into
mainfrom
fix/s2v-sekotalk-only

Conversation

@xinyiqin

Copy link
Copy Markdown
Contributor

support 1080p sekotalk

xinyiqin added 2 commits June 29, 2026 15:27
…Talk

- wan_audio_runner: add 1080p tier to fixed_min_side resize mode
- wan_audio_runner: read resize_mode/fixed_area from input_info first, fallback to config
- wan_audio_runner: re-mux with original audio after local inference to replace 16kHz VARecorder output
- input_info: add fixed_area field to S2VInputInfo
@xinyiqin xinyiqin requested a review from helloyongyang June 29, 2026 08:05

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for 1080p resolution in the image resizing logic, refactors image resizing and spatial size resolution into helper methods to reduce duplication, and adds the fixed_area field to S2VInputInfo. Feedback highlights a potential bug in _get_image_resize_kwargs where a truthy UNSET value could bypass the fallback configuration, missing fixed_area definitions in related input classes (SekoTalkInputs and RS2VInputInfo), and a redundant cleanup workaround in run_main that should instead be resolved by making the finally block robust against uninitialized models.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +364 to +371
def _get_image_resize_kwargs(self):
input_info = getattr(self, "input_info", None)
return {
"resize_mode": (getattr(input_info, "resize_mode", None) if input_info is not None else None) or self.config.get("resize_mode", "adaptive"),
"bucket_shape": self.config.get("bucket_shape", None),
"fixed_area": (getattr(input_info, "fixed_area", None) if input_info is not None else None) or self.config.get("fixed_area", None),
"fixed_shape": self.config.get("fixed_shape", None),
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In _get_image_resize_kwargs, getattr(input_info, "resize_mode", None) can return UNSET (which is a truthy _UnsetType instance). Because UNSET is truthy, the expression getattr(...) or self.config.get(...) will evaluate to UNSET instead of falling back to the config default. This will subsequently cause an AssertionError in resize_image because UNSET is not a valid resize mode.

We should explicitly check if the attribute is UNSET or None before falling back.

Suggested change
def _get_image_resize_kwargs(self):
input_info = getattr(self, "input_info", None)
return {
"resize_mode": (getattr(input_info, "resize_mode", None) if input_info is not None else None) or self.config.get("resize_mode", "adaptive"),
"bucket_shape": self.config.get("bucket_shape", None),
"fixed_area": (getattr(input_info, "fixed_area", None) if input_info is not None else None) or self.config.get("fixed_area", None),
"fixed_shape": self.config.get("fixed_shape", None),
}
def _get_image_resize_kwargs(self):
input_info = getattr(self, "input_info", None)
resize_mode = getattr(input_info, "resize_mode", None) if input_info is not None else None
if resize_mode is None or resize_mode is UNSET:
resize_mode = self.config.get("resize_mode", "adaptive")
fixed_area = getattr(input_info, "fixed_area", None) if input_info is not None else None
if fixed_area is None or fixed_area is UNSET:
fixed_area = self.config.get("fixed_area", None)
return {
"resize_mode": resize_mode,
"bucket_shape": self.config.get("bucket_shape", None),
"fixed_area": fixed_area,
"fixed_shape": self.config.get("fixed_shape", None),
}

stream_config: dict = field(default_factory=dict)
# shape related
resize_mode: str = field(default_factory=str)
fixed_area: str = field(default_factory=str)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While fixed_area has been added to S2VInputInfo, it is missing from SekoTalkInputs (which is the actual class instantiated for s2v and rs2v tasks in init_input_info_from_args) and RS2VInputInfo. This omission means fixed_area cannot be passed or resolved correctly during runtime for these tasks.

Please add fixed_area: str | Any = UNSET to SekoTalkInputs and fixed_area: str = field(default_factory=str) to RS2VInputInfo to ensure compatibility.

Comment on lines 729 to +735
if self.va_controller.reader is None:
return super().run_main()
result = super().run_main()
# Stop VARecorder so ffmpeg finishes writing the file
if self.va_controller is not None:
self.va_controller.clear()
self.va_controller = None
return result

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The manual cleanup of self.va_controller before returning result is redundant because the finally block at the end of run_main (line 793) is guaranteed to execute when the try block exits (including via a return statement).

However, the finally block currently contains a bug: if hasattr(self.model, "inputs"): will raise an AttributeError if self.model has not been initialized yet (which is the case when self.va_controller.reader is None and we return early). This crash in the finally block is likely why this manual cleanup was added as a workaround.

Instead of adding redundant cleanup here, please make the finally block robust (e.g., if getattr(self, "model", None) is not None and hasattr(self.model, "inputs"):) and keep this early return simple.

Suggested change
if self.va_controller.reader is None:
return super().run_main()
result = super().run_main()
# Stop VARecorder so ffmpeg finishes writing the file
if self.va_controller is not None:
self.va_controller.clear()
self.va_controller = None
return result
if self.va_controller.reader is None:
return super().run_main()

@helloyongyang helloyongyang merged commit 09f6471 into main Jun 29, 2026
2 checks passed
@helloyongyang helloyongyang deleted the fix/s2v-sekotalk-only branch June 29, 2026 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants