fix(remote): unblock audio-separator-remote for typical files (#288)
* fix(remote): unblock audio-separator-remote for typical files
Two independent issues prevented `audio-separator-remote` from working
end-to-end against the GCP Cloud Run deployment:
1. **CLI hits Cloud Run's 32 MiB request body limit on real audio files.**
The underlying AudioSeparatorAPIClient already supports a `gcs_uri`
mode where the server fetches from GCS (used by karaoke-gen), but the
CLI only exposed the multipart upload path. Now the CLI detects files
>30 MiB and auto-uploads to GCS, passes `gcs_uri` to the API, and
cleans up the GCS object in a `finally` block (the bucket's 1-day
lifecycle is the safety net). Bucket configurable via `--gcs-bucket`
or `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`; defaults to the existing
`nomadkaraoke-audio-separator-outputs` (separator SA already has
objectAdmin, no infra change needed). `google-cloud-storage` is
lazy-imported with a clear install hint if missing.
2. **Cloud Run server silently runs in CPU mode, not GPU.** The image
relied on `pip install ".[gpu]"` for GPU support, which only swaps in
`onnxruntime-gpu` — the `torch>=2.3` constraint pulls PyPI's default
CPU-only PyTorch wheel. Result: `torch.cuda.is_available()` returns
False, Separator falls back to CPU, jobs run ~10x slower (50 min
instead of 5 min for the vocal_balanced preset). karaoke-gen's
audio-separation-job image already documents this gotcha in
`Dockerfile.gpu-base:100-106`; mirroring that pattern here:
install `torch==2.6.0+cu126` from the cu126 index first so
audio-separator[gpu] sees torch as already satisfied.
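One way to catch a silent CPU fallback early is to inspect the installed PyTorch build when the container starts. The sketch below is illustrative only and is not code from this commit: `gpu_status` and `require_gpu` are hypothetical names, and the `torch_module` parameter exists only so the check can be exercised without PyTorch installed.

```python
import importlib


def gpu_status(torch_module=None):
    """Summarize whether the installed PyTorch build can use CUDA.

    `torch_module` is injectable for testing (hypothetical convenience);
    by default the real `torch` package is imported.
    """
    if torch_module is None:
        torch_module = importlib.import_module("torch")
    return {
        "torch_version": torch_module.__version__,
        # torch.version.cuda is None for CPU-only builds.
        "cuda_build": torch_module.version.cuda,
        "cuda_available": bool(torch_module.cuda.is_available()),
    }


def require_gpu(status):
    """Fail loudly instead of silently running on CPU (assumed policy)."""
    if not status["cuda_available"]:
        raise RuntimeError(f"CUDA is not available: {status}")
    return status
```

Calling `require_gpu(gpu_status())` during server startup would turn the slow CPU fallback into an immediate, visible failure. Whether failing hard is preferable to falling back is a deployment decision.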
Tests: 7 new unit tests covering GCS upload helpers (blob path format,
URI parsing, error handling), bucket resolution priority (--flag > env >
default), and the integration into handle_separate_command (large/small
file, cleanup on failure, upload failure).
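The bucket resolution priority (flag, then environment variable, then default) can be sketched as below. This is a minimal illustration rather than the code in the commit: `resolve_gcs_bucket` is a hypothetical name, and treating an empty flag or empty environment value as "unset" is an assumption.

```python
import os

# Both values are quoted from the commit message.
ENV_VAR = "AUDIO_SEPARATOR_GCS_INPUT_BUCKET"
DEFAULT_BUCKET = "nomadkaraoke-audio-separator-outputs"


def resolve_gcs_bucket(flag_value=None, environ=None):
    """Pick the upload bucket: --gcs-bucket flag > env var > default."""
    environ = os.environ if environ is None else environ
    # Empty strings are falsy, so they fall through to the next source.
    return flag_value or environ.get(ENV_VAR) or DEFAULT_BUCKET
```

Passing `environ` explicitly keeps the function easy to test without mutating the real process environment.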
Bumps version to 0.44.2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test: pass gcs_bucket to handle_separate_command in integration test
Missed this call site when updating the function signature. Unit tests
in tests/unit/test_remote_cli.py were updated, but integration test
test_cli_separate_command_integration still passed only 3 args.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
audio_separator/remote/README.md: 17 additions & 0 deletions
@@ -200,6 +200,22 @@ audio-separator-remote separate audio.wav \
 --vr_aggression 10
 ```
 
+**Large files (>30 MiB):**
+
+When the deployment runs on Cloud Run, request bodies are capped at 32 MiB. For larger inputs the CLI automatically uploads the file to GCS first and tells the server to fetch from `gs://...`, bypassing the limit. This is transparent — the same `separate` command works for any file size:
+
+```bash
+# Same command, file size detected automatically
+audio-separator-remote separate big_song.wav --preset vocal_balanced
+```
+
+Requirements when the GCS path activates:
+- Application Default Credentials on the laptop (`gcloud auth application-default login`)
+- Write permission on the input bucket (defaults to `nomadkaraoke-audio-separator-outputs`)
+- The Cloud Run service account needs read permission on the same bucket (it already does for the default bucket)
+
+Override the bucket with `--gcs-bucket my-bucket` or by setting `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`. Uploaded inputs are deleted after the job finishes (success or failure); the bucket's lifecycle policy is the safety net if cleanup fails.
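The large-file fallback described in this README section can be sketched roughly as follows. This is an illustration under stated assumptions, not the CLI's actual implementation: the function names, the `cli-uploads/` object prefix, and the `submit_job(file_path=..., gcs_uri=...)` callable standing in for the API client are all hypothetical. Only the `google-cloud-storage` calls (`Client`, `bucket`, `blob`, `upload_from_filename`, `delete`) are real library API.

```python
import os
import uuid

# 30 MiB cutoff from the README text; Cloud Run caps request bodies at 32 MiB.
GCS_UPLOAD_THRESHOLD_BYTES = 30 * 1024 * 1024


def needs_gcs_upload(path):
    """True when the file is too large for the direct multipart upload."""
    return os.path.getsize(path) > GCS_UPLOAD_THRESHOLD_BYTES


def make_blob_path(path):
    """Hypothetical object naming: unique prefix plus the original file name."""
    return f"cli-uploads/{uuid.uuid4().hex}/{os.path.basename(path)}"


def parse_gcs_uri(uri):
    """Split 'gs://bucket/object' into (bucket, object)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a GCS URI: {uri!r}")
    bucket, _, blob = uri[len("gs://"):].partition("/")
    if not bucket or not blob:
        raise ValueError(f"incomplete GCS URI: {uri!r}")
    return bucket, blob


def separate_with_gcs_fallback(path, bucket_name, submit_job):
    """Send small files directly; stage large ones in GCS and always clean up."""
    if not needs_gcs_upload(path):
        return submit_job(file_path=path, gcs_uri=None)

    try:
        from google.cloud import storage  # lazy import: optional dependency
    except ImportError as exc:
        raise RuntimeError(
            "Large files need google-cloud-storage: pip install google-cloud-storage"
        ) from exc

    blob = storage.Client().bucket(bucket_name).blob(make_blob_path(path))
    blob.upload_from_filename(path)
    try:
        return submit_job(file_path=None, gcs_uri=f"gs://{bucket_name}/{blob.name}")
    finally:
        try:
            blob.delete()  # runs on success and on failure
        except Exception:
            pass  # best effort; the bucket lifecycle policy is the safety net
```

The `try`/`finally` around the job submission is what gives the "deleted after the job finishes (success or failure)" behavior. Swallowing a failed delete keeps a cleanup error from masking the job's own result.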
+- `--gcs-bucket`: Bucket used for the >30 MiB upload fallback (env: `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`, default: `nomadkaraoke-audio-separator-outputs`)
 - `--timeout`: Set timeout for polling (default: 600 seconds)
 - `--poll_interval`: Set polling interval (default: 10 seconds)