@@ -7,6 +7,7 @@ Considering [their pricing](https://modal.com/pricing) for execution with an Nvi
 With $30/month in free credits, that's over **1,500 GPU-accelerated audio separation jobs per month, for free!**

 **✨ Key Features:**
+
 - **Multiple Model Support**: Upload once, separate with multiple models in a single job
 - **Full Parameter Compatibility**: All local CLI parameters and architecture settings supported
 - **Efficient Processing**: Avoid repeated uploads when comparing different models
@@ -27,7 +28,7 @@ graph TD
 G --> J["📥 Download All Results<br/>Compare quality across models"]
 H --> J
 I --> J
-
+
 style A fill:#e1f5fe,color:#000
 style B fill:#f3e5f5,color:#000
 style C fill:#fff3e0,color:#000
@@ -49,16 +50,17 @@ To use the remote API functionality, you'll need to deploy the Audio Separator A
    pip install modal
    modal setup
    ```
-4. **Deploy the Audio Separator API**:
+3. **Deploy the Audio Separator API**:
    ```bash
    modal deploy audio_separator/remote/deploy_modal.py
    ```
-5. **Get your API URL** from the deployment output. It will look like:
+4. **Get your API URL** from the deployment output. It will look like:
    ```
    https://USERNAME--audio-separator-api.modal.run
    ```

 Set this API URL as an environment variable:
+
 ```bash
 export AUDIO_SEPARATOR_API_URL="https://USERNAME--audio-separator-api.modal.run"
 ```
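From Python, the same variable can be read back before a client is created. The following is a minimal sketch rather than part of the project: `resolve_api_url` is a hypothetical helper, and the precedence it applies (an explicit URL first, then `AUDIO_SEPARATOR_API_URL`) is an assumption modeled on the CLI's `--api_url` override.

```python
import os


def resolve_api_url(explicit_url=None, environ=None):
    """Return the API URL to use, or raise if none is configured.

    Hypothetical helper: an explicit URL wins, otherwise fall back to
    the AUDIO_SEPARATOR_API_URL environment variable.
    """
    environ = os.environ if environ is None else environ
    url = explicit_url or environ.get("AUDIO_SEPARATOR_API_URL", "")
    # Normalize stray whitespace and a trailing slash before use.
    url = url.strip().rstrip("/")
    if not url:
        raise ValueError("Set AUDIO_SEPARATOR_API_URL or pass an API URL explicitly")
    return url
```

Failing early with a clear message is usually easier to debug than a connection error raised later from inside a request.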
@@ -117,15 +119,15 @@ result = api_client.separate_audio_and_wait(
     # MDX parameters
     mdx_segment_size=512,
     mdx_batch_size=2,
-    # VR parameters
+    # VR parameters
     vr_aggression=10,
     vr_window_size=320,
     # And any other separator parameters...
 )

 # Advanced approach: manual job management (for custom polling logic)
 result = api_client.separate_audio(
-    "path/to/audio.wav",
+    "path/to/audio.wav",
     models=["model1.ckpt", "model2.onnx"],
     custom_output_names={"Vocals": "vocals_output", "Instrumental": "instrumental_output"}
 )
@@ -137,19 +139,19 @@ import time
 while True:
     status = api_client.get_job_status(task_id)
     print(f"Job status: {status['status']}")
-
+
     # Show progress with model information
     if "progress" in status:
         progress_info = f"Progress: {status['progress']}%"
         if "current_model_index" in status and "total_models" in status:
             model_info = f" (Model {status['current_model_index'] + 1}/{status['total_models']})"
             progress_info += model_info
         print(progress_info)
-
+
     if status["status"] == "completed":
         # Download files manually
-        for filename in status["files"]:
-            output_path = api_client.download_file(task_id, filename)
+        for filehash, filename in status["files"].items():
+            output_path = api_client.download_file_by_hash(task_id, filehash, filename)
             print(f"Downloaded: {output_path}")
         break
     elif status["status"] == "error":
@@ -174,6 +176,7 @@ Audio Separator also provides a command-line interface for interacting with remo
 #### Commands

 **Separate audio files:**
+
 ```bash
 # Separate audio file (asynchronous processing)
 audio-separator-remote separate audio.wav --model model_bs_roformer_ep_317_sdr_12.9755.ckpt
@@ -198,11 +201,13 @@ audio-separator-remote separate audio.wav \
 ```

 **Check job status:**
+
 ```bash
 audio-separator-remote status <task_id>
 ```

 **List available models:**
+
 ```bash
 # Pretty formatted list
 audio-separator-remote models
@@ -215,29 +220,34 @@ audio-separator-remote models --filter vocals
 ```

 **Download specific files:**
+
 ```bash
 audio-separator-remote download <task_id> filename1.wav filename2.wav
 ```

 **Get version information:**
+
 ```bash
 audio-separator-remote --version
 ```

 #### CLI Options

 **Global Options:**
+
 - `--api_url`: Override the API URL
 - `--timeout`: Set timeout for polling (default: 600 seconds)
 - `--poll_interval`: Set polling interval (default: 10 seconds)
 - `--debug`: Enable debug logging
 - `--log_level`: Set log level (info, debug, warning, etc.)
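
How `--timeout` and `--poll_interval` interact can be pictured with a small polling loop. This is a sketch under stated assumptions, not the CLI's actual implementation: `wait_for_job` and its `get_status` callable are hypothetical, and only the defaults (600 and 10 seconds) and the `"completed"` / `"error"` status values come from this document.

```python
import time


def wait_for_job(get_status, timeout=600, poll_interval=10,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job completes, errors, or times out.

    get_status is any zero-argument callable returning a dict with a
    "status" key (an assumed shape for illustration).
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        # Both terminal states are returned to the caller unchanged.
        if status.get("status") in ("completed", "error"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job did not finish within {timeout} seconds")
        sleep(poll_interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting; in normal use the defaults apply.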

 **Model Selection:**
+
 - `--model`: Single model to use for separation
 - `--models`: Multiple models to use for separation (space-separated)

 **Output Parameters:**
+
 - `--output_format`: Output format (default: flac)
 - `--output_bitrate`: Output bitrate
 - `--normalization`: Max peak amplitude to normalize to (default: 0.9)
@@ -251,6 +261,7 @@ audio-separator-remote --version

 **Architecture-Specific Parameters:**
 All MDX, VR, Demucs, and MDXC parameters from the local CLI are supported:
+
 - `--mdx_segment_size`, `--mdx_overlap`, `--mdx_batch_size`, etc.
 - `--vr_batch_size`, `--vr_window_size`, `--vr_aggression`, etc.
 - `--demucs_segment_size`, `--demucs_shifts`, `--demucs_overlap`, etc.
@@ -289,6 +300,7 @@ audio-separator-remote separate vocals.wav \
 #### Key Features

 The remote API client automatically handles:
+
 - **File uploading and downloading**: Seamless transfer of audio files and results
 - **Multiple model processing**: Upload once, separate with multiple models efficiently
 - **Full separator compatibility**: All local CLI parameters and architectures supported
@@ -299,12 +311,14 @@ The remote API client automatically handles:
 #### Benefits of Multiple Model Support

 When using multiple models, the remote API provides significant advantages:
+
 - **Efficiency**: Upload your audio file once, process with multiple models without re-uploading
 - **Comparison**: Easily compare results from different models (e.g., vocals vs. instrumental quality)
 - **Workflow optimization**: Process with complementary models in a single job
 - **Time savings**: Avoid repeated upload times for large audio files

 Example use cases:
+
 - Compare quality between `model_bs_roformer_ep_317_sdr_12.9755.ckpt` (high-quality vocals) and `UVR-MDX-NET-Inst_HQ_4.onnx` (high-quality instrumental)
 - Process with both 2-stem models (vocals/instrumental) and multi-stem models (vocals/drums/bass/other) in one job
 - Use different models optimized for different parts of the frequency spectrum
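
The first use case can be scripted by assembling the CLI invocation programmatically. The sketch below is illustrative only: `build_separate_command` is a hypothetical helper, and it relies solely on the `separate` subcommand and the `--model` / `--models` options documented in this README.

```python
import shlex


def build_separate_command(audio_path, models):
    """Build an audio-separator-remote argv list for one or more models."""
    if not models:
        raise ValueError("At least one model is required")
    command = ["audio-separator-remote", "separate", audio_path]
    if len(models) == 1:
        command += ["--model", models[0]]
    else:
        # --models takes a space-separated list, so each model is its own argument.
        command += ["--models", *models]
    return command


command = build_separate_command(
    "audio.wav",
    ["model_bs_roformer_ep_317_sdr_12.9755.ckpt", "UVR-MDX-NET-Inst_HQ_4.onnx"],
)
print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when file names contain spaces.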