Commit fca0cf7

beveradb and claude authored
fix(remote): unblock audio-separator-remote for typical files (#288)
* fix(remote): unblock audio-separator-remote for typical files

  Two independent issues prevented `audio-separator-remote` from working end-to-end against the GCP Cloud Run deployment:

  1. **CLI hits Cloud Run's 32 MiB request body limit on real audio files.** The underlying AudioSeparatorAPIClient already supports a `gcs_uri` mode where the server fetches from GCS (used by karaoke-gen), but the CLI only exposed the multipart upload path. Now the CLI detects files >30 MiB, auto-uploads them to GCS, passes `gcs_uri` to the API, and cleans up the GCS object in a `finally` block (the bucket's 1-day lifecycle is the safety net). The bucket is configurable via `--gcs-bucket` or `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`; it defaults to the existing `nomadkaraoke-audio-separator-outputs` (the separator SA already has objectAdmin, so no infra change is needed). `google-cloud-storage` is lazy-imported, with a clear install hint if it is missing.

  2. **Cloud Run server silently runs in CPU mode, not GPU.** The image relied on `pip install ".[gpu]"` for GPU support, which only swaps in `onnxruntime-gpu`; the `torch>=2.3` constraint pulls PyPI's default CPU-only PyTorch wheel. Result: `torch.cuda.is_available()` returns False, Separator falls back to CPU, and jobs run ~10x slower (50 min instead of 5 min for the vocal_balanced preset). karaoke-gen's audio-separation-job image already documents this gotcha in `Dockerfile.gpu-base:100-106`; this change mirrors that pattern: install `torch==2.6.0+cu126` from the cu126 index first so audio-separator[gpu] sees torch as already satisfied.

  Tests: 7 new unit tests covering GCS upload helpers (blob path format, URI parsing, error handling), bucket resolution priority (--flag > env > default), and the integration into handle_separate_command (large/small file, cleanup on failure, upload failure).

  Bumps version to 0.44.2.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: pass gcs_bucket to handle_separate_command in integration test

  Missed this call site when updating the function signature. Unit tests in tests/unit/test_remote_cli.py were updated, but integration test test_cli_separate_command_integration still passed only 3 args.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
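
The routing rule described in item 1 can be sketched as a pure function. This is a minimal illustration, not the CLI's actual code: the two constants come from the commit message, while the name `choose_upload_route` and its string return values are hypothetical.

```python
# Values taken from the commit message; the function itself is a hypothetical sketch.
CLOUD_RUN_BODY_LIMIT_BYTES = 32 * 1024 * 1024  # Cloud Run request body cap
GCS_UPLOAD_THRESHOLD_BYTES = 30 * 1024 * 1024  # 2 MiB of headroom for request overhead


def choose_upload_route(size_bytes: int) -> str:
    """Return "gcs" for files strictly above the threshold, else "multipart"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "gcs" if size_bytes > GCS_UPLOAD_THRESHOLD_BYTES else "multipart"
```

A caller would typically pass `os.path.getsize(path)`. The comparison is strict, so a file of exactly 30 MiB still goes through the multipart path.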
1 parent f510043 commit fca0cf7

6 files changed

Lines changed: 337 additions & 12 deletions

Dockerfile.cloudrun

Lines changed: 11 additions & 0 deletions
@@ -52,6 +52,17 @@ RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1
     && curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12 \
     && python3 -m pip install --no-cache-dir --upgrade pip setuptools wheel
 
+# Install PyTorch with CUDA 12.6 support BEFORE audio-separator[gpu].
+# Without this, `pip install ".[gpu]"` pulls the default CPU-only PyTorch wheel
+# from PyPI and Separator silently falls back to CPU (~10× slower).
+# Cloud Run L4 GPUs have NVIDIA driver 570 (supports up to CUDA 12.8), so cu126
+# works. cu130 would fail with "NVIDIA driver is too old".
+# Installing torch first means audio-separator[gpu] sees it already satisfied.
+RUN pip install --no-cache-dir \
+    torch==2.6.0+cu126 \
+    torchvision==0.21.0+cu126 \
+    --index-url https://download.pytorch.org/whl/cu126
+
 # Install audio-separator with GPU support and API dependencies
 COPY . /tmp/audio-separator-src
 RUN cd /tmp/audio-separator-src \
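
One way to confirm which wheel ended up in an image is to inspect the local version tag that the PyTorch wheel index appends (for example `+cu126` or `+cpu`). The helper below is a hypothetical sketch that only parses a version string; it is not part of this commit, and a missing tag does not by itself prove a CPU-only build.

```python
import re
from typing import Optional


def cuda_tag_from_version(version: str) -> Optional[str]:
    """Return the CUDA local tag (e.g. "cu126") from a version string, or None.

    "2.6.0+cu126" -> "cu126"; "2.6.0+cpu" and "2.6.0" -> None.
    """
    # Everything after the first "+" is the PEP 440 local version label.
    _, _, local = version.partition("+")
    match = re.fullmatch(r"cu\d+", local)
    return match.group(0) if match else None
```

Inside the container, this could be called as `cuda_tag_from_version(torch.__version__)` and combined with `torch.cuda.is_available()`, which is the check the commit message uses to detect the CPU fallback.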

audio_separator/remote/README.md

Lines changed: 17 additions & 0 deletions
@@ -200,6 +200,22 @@ audio-separator-remote separate audio.wav \
   --vr_aggression 10
 ```
 
+**Large files (>30 MiB):**
+
+When the deployment runs on Cloud Run, request bodies are capped at 32 MiB. For larger inputs the CLI automatically uploads the file to GCS first and tells the server to fetch from `gs://...`, bypassing the limit. This is transparent — the same `separate` command works for any file size:
+
+```bash
+# Same command, file size detected automatically
+audio-separator-remote separate big_song.wav --preset vocal_balanced
+```
+
+Requirements when the GCS path activates:
+- Application Default Credentials on the laptop (`gcloud auth application-default login`)
+- Write permission on the input bucket (defaults to `nomadkaraoke-audio-separator-outputs`)
+- The Cloud Run service account needs read permission on the same bucket (it already does for the default bucket)
+
+Override the bucket with `--gcs-bucket my-bucket` or by setting `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`. Uploaded inputs are deleted after the job finishes (success or failure); the bucket's lifecycle policy is the safety net if cleanup fails.
+
 **Check job status:**
 
 ```bash
@@ -236,6 +252,7 @@ audio-separator-remote --version
 **Global Options:**
 
 - `--api_url`: Override the API URL
+- `--gcs-bucket`: Bucket used for the >30 MiB upload fallback (env: `AUDIO_SEPARATOR_GCS_INPUT_BUCKET`, default: `nomadkaraoke-audio-separator-outputs`)
 - `--timeout`: Set timeout for polling (default: 600 seconds)
 - `--poll_interval`: Set polling interval (default: 10 seconds)
 - `--debug`: Enable debug logging
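
The bucket precedence documented for `--gcs-bucket` (flag, then environment variable, then default) can be sketched as below. The function name `resolve_gcs_bucket` and the injectable `environ` parameter are assumptions made for illustration; the expression itself mirrors the one this commit adds to `cli.py`.

```python
import os
from typing import Mapping, Optional

DEFAULT_GCS_INPUT_BUCKET = "nomadkaraoke-audio-separator-outputs"
GCS_BUCKET_ENV_VAR = "AUDIO_SEPARATOR_GCS_INPUT_BUCKET"


def resolve_gcs_bucket(flag_value: Optional[str], environ: Mapping[str, str] = os.environ) -> str:
    """Resolve the upload bucket: --gcs-bucket flag > env var > built-in default."""
    # `or` skips a missing (None) or empty flag value and falls through to the env lookup.
    return flag_value or environ.get(GCS_BUCKET_ENV_VAR, DEFAULT_GCS_INPUT_BUCKET)
```

One edge case of this expression: if the environment variable is set to an empty string, the lookup returns that empty string rather than the default, so an unset variable and an empty one behave differently.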

audio_separator/remote/cli.py

Lines changed: 94 additions & 4 deletions
@@ -5,10 +5,66 @@
 import os
 import sys
 import time
+import uuid
 from importlib import metadata
 
 from audio_separator.remote import AudioSeparatorAPIClient
 
+# Cloud Run hard-limits request bodies to 32 MiB. Use 30 MiB threshold so a
+# little request overhead won't push us over. Larger files go via GCS.
+GCS_UPLOAD_THRESHOLD_BYTES = 30 * 1024 * 1024
+DEFAULT_GCS_INPUT_BUCKET = "nomadkaraoke-audio-separator-outputs"
+GCS_INPUT_PREFIX = "cli-uploads"
+
+
+def upload_to_gcs(file_path: str, bucket_name: str, logger: logging.Logger) -> str:
+    """Upload a local file to GCS and return its gs:// URI.
+
+    Requires `google-cloud-storage` and Application Default Credentials
+    (run `gcloud auth application-default login` on the laptop).
+    """
+    try:
+        from google.cloud import storage
+    except ImportError as e:
+        raise RuntimeError(
+            "google-cloud-storage is required to upload files larger than "
+            f"{GCS_UPLOAD_THRESHOLD_BYTES // (1024 * 1024)} MiB. "
+            "Install it with: pip install google-cloud-storage"
+        ) from e
+
+    filename = os.path.basename(file_path)
+    blob_path = f"{GCS_INPUT_PREFIX}/{uuid.uuid4()}-{filename}"
+    gcs_uri = f"gs://{bucket_name}/{blob_path}"
+
+    size_mib = os.path.getsize(file_path) / (1024 * 1024)
+    logger.info(f"Uploading {size_mib:.1f} MiB to {gcs_uri} (server fetches from GCS, bypasses Cloud Run 32 MiB limit)")
+
+    client = storage.Client()
+    bucket = client.bucket(bucket_name)
+    blob = bucket.blob(blob_path)
+    blob.upload_from_filename(file_path)
+
+    logger.info(f"Upload complete: {gcs_uri}")
+    return gcs_uri
+
+
+def delete_from_gcs(gcs_uri: str, logger: logging.Logger) -> None:
+    """Best-effort delete of a GCS object. Logs but doesn't raise on failure."""
+    try:
+        from google.cloud import storage
+
+        without_prefix = gcs_uri[len("gs://"):]
+        slash_idx = without_prefix.index("/")
+        bucket_name = without_prefix[:slash_idx]
+        blob_path = without_prefix[slash_idx + 1:]
+
+        client = storage.Client()
+        bucket = client.bucket(bucket_name)
+        bucket.blob(blob_path).delete()
+        logger.info(f"Cleaned up uploaded input: {gcs_uri}")
+    except Exception as e:
+        logger.warning(f"Failed to delete {gcs_uri}: {e} (bucket lifecycle will reclaim it)")
+
 
 def main():
     """Main entry point for the remote CLI."""
@@ -104,6 +160,13 @@ def main():
     parser.add_argument("-d", "--debug", action="store_true", help="Enable debug logging")
     parser.add_argument("--log_level", default="info", help="Log level (default: info)")
     parser.add_argument("--api_url", help="API URL (overrides AUDIO_SEPARATOR_API_URL env var)")
+    parser.add_argument(
+        "--gcs-bucket",
+        help=(
+            f"GCS bucket for uploading files >{GCS_UPLOAD_THRESHOLD_BYTES // (1024 * 1024)} MiB "
+            f"(overrides AUDIO_SEPARATOR_GCS_INPUT_BUCKET env var, default: {DEFAULT_GCS_INPUT_BUCKET})"
+        ),
+    )
 
     args = parser.parse_args()
 
@@ -145,9 +208,12 @@ def main():
     # Create API client
     api_client = AudioSeparatorAPIClient(api_url, logger)
 
+    # Resolve GCS bucket for large-file uploads
+    gcs_bucket = args.gcs_bucket or os.environ.get("AUDIO_SEPARATOR_GCS_INPUT_BUCKET", DEFAULT_GCS_INPUT_BUCKET)
+
     # Handle commands
     if args.command == "separate":
-        handle_separate_command(args, api_client, logger)
+        handle_separate_command(args, api_client, logger, gcs_bucket)
     elif args.command == "status":
         handle_status_command(args, api_client, logger)
     elif args.command == "models":
@@ -159,14 +225,35 @@ def main():
         sys.exit(1)
 
 
-def handle_separate_command(args, api_client: AudioSeparatorAPIClient, logger: logging.Logger):
+def handle_separate_command(args, api_client: AudioSeparatorAPIClient, logger: logging.Logger, gcs_bucket: str):
     """Handle the separate command."""
     for audio_file in args.audio_files:
-        logger.info(f"Uploading '{audio_file}' to audio separator...")
+        logger.info(f"Processing '{audio_file}'...")
+
+        # Decide upload path: small files go via multipart POST, large files via GCS
+        # to bypass the Cloud Run 32 MiB request body limit.
+        uploaded_gcs_uri = None
+        try:
+            file_size = os.path.getsize(audio_file)
+            use_gcs = file_size > GCS_UPLOAD_THRESHOLD_BYTES
+        except OSError as e:
+            logger.error(f"❌ Cannot read '{audio_file}': {e}")
+            continue
 
         try:
+            if use_gcs:
+                logger.info(
+                    f"File is {file_size / (1024 * 1024):.1f} MiB (>{GCS_UPLOAD_THRESHOLD_BYTES // (1024 * 1024)} MiB), "
+                    "uploading via GCS"
+                )
+                uploaded_gcs_uri = upload_to_gcs(audio_file, gcs_bucket, logger)
+                source_kwargs = {"file_path": None, "gcs_uri": uploaded_gcs_uri}
+            else:
+                source_kwargs = {"file_path": audio_file, "gcs_uri": None}
+
             # Prepare parameters for separation
             kwargs = {
+                **source_kwargs,
                 "model": args.model,
                 "models": args.models,
                 "preset": args.preset,
@@ -213,7 +300,7 @@ def handle_separate_command(args, api_client: AudioSeparatorAPIClient, logger: l
             }
 
             # Use the convenience method that handles everything
-            result = api_client.separate_audio_and_wait(audio_file, **kwargs)
+            result = api_client.separate_audio_and_wait(**kwargs)
 
             if result["status"] == "completed":
                 if "downloaded_files" in result:
@@ -227,6 +314,9 @@ def handle_separate_command(args, api_client: AudioSeparatorAPIClient, logger: l
 
         except Exception as e:
             logger.error(f"❌ Error processing '{audio_file}': {e}")
+        finally:
+            if uploaded_gcs_uri:
+                delete_from_gcs(uploaded_gcs_uri, logger)
 
 
 def handle_status_command(args, api_client: AudioSeparatorAPIClient, logger: logging.Logger):
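
The `gs://` URI splitting that `delete_from_gcs` performs inline can also be expressed as a small standalone helper, which makes the bucket/blob split easy to unit-test without GCS credentials. This is a sketch under assumptions: `parse_gcs_uri` is a hypothetical name, and unlike the best-effort code in the diff it raises `ValueError` on malformed input instead of relying on the surrounding `except`.

```python
from typing import Tuple


def parse_gcs_uri(gcs_uri: str) -> Tuple[str, str]:
    """Split "gs://bucket/path/to/blob" into ("bucket", "path/to/blob")."""
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {gcs_uri!r}")
    # Split on the first "/" only, so blob paths may contain further slashes.
    bucket_name, separator, blob_path = gcs_uri[len(prefix):].partition("/")
    if not bucket_name or not separator or not blob_path:
        raise ValueError(f"Expected gs://<bucket>/<blob>, got {gcs_uri!r}")
    return bucket_name, blob_path
```

For a URI in the shape that `upload_to_gcs` builds, `parse_gcs_uri("gs://my-bucket/cli-uploads/1234-song.wav")` yields the bucket `my-bucket` and the blob path `cli-uploads/1234-song.wav`.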

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
 
 [tool.poetry]
 name = "audio-separator"
-version = "0.44.1"
+version = "0.44.2"
 description = "Easy to use audio stem separation, using various models from UVR trained primarily by @Anjok07"
 authors = ["Andrew Beveridge <andrew@beveridge.uk>"]
 license = "MIT"

tests/integration/test_remote_api_integration.py

Lines changed: 1 addition & 1 deletion
@@ -487,7 +487,7 @@ def test_cli_separate_command_integration(self, mock_client_class, test_audio_fi
         logger = Mock()
 
         # Execute the command
-        handle_separate_command(args, mock_client, logger)
+        handle_separate_command(args, mock_client, logger, "test-bucket")
 
         # Verify the API client method was called
         mock_client.separate_audio_and_wait.assert_called_once()
