Local-first script-to-audio production studio.
Free Public Beta: this repository contains the public launcher, Docker configuration, docs, samples, and helper scripts. Source is available for the launcher; the core Draft to Take engine/app source is private for now. Model weights are not bundled and download to your own machine when needed.
Turn your scripts into finished multi-speaker audio, complete with emotion, sound design, and timeline mixing, all running locally on your machine.
Think ElevenLabs-style script production, but Windows-local, IndexTTS2-powered, and built for creators who want control over voices, takes, emotion, SFX, ambience, music, and export.
Watch the 23-second app preview
Formerly IndexTTS Workflow Studio. This repository is the public beta home for Draft to Take, the next-generation version of the original prototype. Most Windows testers should start with the Docker launcher attached to the latest release; the native Windows installer is still experimental.
Most TTS tools are great for one line at a time. Draft to Take is built for the whole production loop:
Write or import a script -> assign prepared voices -> detect emotion -> generate takes -> lock the good ones -> add sound cues -> export the mix
Use it for audio drama, game dialogue, audiobook tests, YouTube narration, podcast sketches, horror scenes, or any project where you want a local script-to-audio workflow instead of a cloud text box.
This beta repo contains the Docker launcher, configuration, diagnostics scripts, tester docs, samples, and an experimental Windows installer preview. It does not contain the private core engine/app source code or model weights. The Docker launcher and installer both download supported model files onto your own machine.
Looking for the old prototype? The previous IndexTTS Workflow Studio code is preserved on the legacy-v2 branch and the v2-legacy-final tag.
Short generated examples:
What you are hearing: audio generated through the Draft to Take workflow using local model-backed dialogue/emotion tooling. Output quality depends on your source voices, settings, model downloads, and hardware.
This is the recommended public beta path while the native installer is still being hardened.
- Open the latest beta release: Draft to Take v3.0.0 beta 19.
- Download DraftToTake-Docker-Launcher-v3.0.0-beta.19.zip from the assets.
- Extract it somewhere simple, for example C:\DraftToTakeBeta.
- Start Docker Desktop.
- Double-click start.bat.
- Open the URL printed in the terminal, usually:
http://localhost:3000
First launch can be slow because Docker images and model files are large. A full GPU start can use roughly 25-40 GB of Docker disk space before app models download into %USERPROFILE%\DraftToTake\shared. Pull progress can also pause near 99% while Docker verifies and extracts layers; keep the terminal open and let it finish.
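Before the first pull, it can help to confirm the drive has room. The following is a minimal Python sketch, not part of the launcher. It assumes Docker Desktop keeps its data on the same drive as your user profile, which may not hold on your machine, and it uses 40 GB as the threshold because that is the upper end of the estimate above. The function name is illustrative.

```python
import shutil
from pathlib import Path

REQUIRED_GB = 40  # upper end of the 25-40 GB estimate above


def free_space_gb(path: Path) -> float:
    """Return free space, in GB (10**9 bytes), on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    # Assumption: Docker's disk image lives on the same drive as the user profile.
    drive = Path.home()
    free = free_space_gb(drive)
    status = "OK" if free >= REQUIRED_GB else "LOW"
    print(f"{status}: {free:.1f} GB free on the drive holding {drive}")
```

Model downloads into the shared folder need additional space on top of this figure.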
Docker Desktop may keep older beta image tags after updates. If your drive loses a lot of space after updating, run cleanup-docker-space.bat; it removes old Draft to Take beta image tags and dangling image layers, but keeps your shared voices, models, projects, and exports. If the script reports 0 GB cleaned and Windows still shows high disk usage, Docker Desktop may be holding free space inside its WSL virtual disk outside the Draft to Take shared folder.
If Docker reports container startup errors such as exec format error after a partial or interrupted pull, run repair-docker-images.bat, then run start.bat again. The repair script removes only Draft to Take beta containers/images and keeps your shared voices, models, projects, and exports.
Beta 19 adds the latest Script Canvas and timeline editing polish, timeline undo/history groundwork, safer clip controls, frontend state cleanup, refreshed docs, and native installer repair work on top of the recent Docker hardening.
The launcher pulls these public images from GitHub Container Registry:
ghcr.io/jayspiffy/draft-to-take-backend:v3.0.0-beta.19
ghcr.io/jayspiffy/draft-to-take-frontend:v3.0.0-beta.19
ghcr.io/jayspiffy/draft-to-take-script-llm:v3.0.0-beta.19
ghcr.io/jayspiffy/draft-to-take-omnivoice:v3.0.0-beta.19
ghcr.io/jayspiffy/draft-to-take-sfx:v3.0.0-beta.19
Use this only if you specifically want to test the dockerless installer preview. The Docker launcher above is currently more reliable for public testers.
- Open Draft to Take v3.0.0 beta 19.
- Download DraftToTake-Native-Setup-v3.0.0-beta.19.exe.
- Run the installer and choose Full Studio (recommended).
- Start Draft to Take from the Start Menu.
The installer is unsigned during beta, so Windows may show a SmartScreen warning. It does not bundle model weights; the app downloads models into your local %USERPROFILE%\DraftToTake\shared folder.
Installer checksum:
70AEDA06911EE92D74CCE8972E8FF282047802ABD1D0044F376FB2BDF24A93D2
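One way to compare a downloaded installer against this checksum is a short Python script. This is a sketch under assumptions: the Downloads location is a guess about where your browser saved the file, and the helper names are illustrative rather than anything shipped with Draft to Take.

```python
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "70AEDA06911EE92D74CCE8972E8FF282047802ABD1D0044F376FB2BDF24A93D2"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large installer is not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare case-insensitively, ignoring stray whitespace in the expected value."""
    return sha256_of(path) == expected.strip().upper()


if __name__ == "__main__":
    # Assumption: the installer was saved to the Downloads folder.
    installer = Path.home() / "Downloads" / "DraftToTake-Native-Setup-v3.0.0-beta.19.exe"
    if installer.exists():
        print("checksum OK" if matches_expected(installer) else "checksum MISMATCH")
    else:
        print(f"installer not found at {installer}")
```

If you prefer not to use Python, the built-in Windows command certutil -hashfile <file> SHA256 prints the same digest for manual comparison.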
After starting the app, either use the in-app Try Demo Project flow or import the sample scene: Blackmere Road.
Suggested first run inside the app:
- Use Try Demo Project and click Pick voices, or open Voices to prepare a reusable speaker first.
- Assign each demo/script role to a prepared voice in the Voice Workbench.
- Open the demo script, or go to Studio -> Script Canvas and import the sample Markdown file.
- Click Full Episode Timeline.
- Click Detect Active Scene Emotions.
- Click Generate Audio.
- Preview and download the mix.
The sample includes dialogue, IndexTTS2 emotion comments, ambience, music, and SFX markers so you can test the full canvas without inventing a script first.
- Script-first workflow - write scenes, chapters, pages, or speeches instead of isolated text snippets.
- Local-first production - scripts, voices, projects, and exports stay in your local shared folder unless you choose to share them.
- Take review - listen, lock strong takes, and regenerate only weak unlocked lines.
- Emotion-aware delivery - Qwen can suggest IndexTTS2 emotion vectors, and you can adjust them manually.
- Timeline export - dialogue, SFX, ambience, and music live in one embedded Script Canvas timeline.
- Reusable libraries - keep prepared voices, source clips, SFX, ambience, and music assets organized for future projects.
Home keeps the first path obvious: create or choose voices, assign readable script roles, open a demo, then place and generate the scene.
Script Canvas is the main workspace: draft or import scripts, assign prepared voices to readable speaker labels, detect emotions, clean up production lines, and send scenes to the embedded timeline.
The Voice Workbench lets you test prepared voices and map them to script roles such as narrator, host, guest, or any character name you wrote.
Review timing, generate missing takes, balance dialogue/SFX/ambience/music tracks, lock good clips, preview the mix, and export without leaving Script Canvas.
Create synthetic voices, prepare source clips, manage reusable voice assets, then assign them to readable Script Canvas role names in the Voice Workbench.
If you are testing the beta for the first time, start with the manuals:
- Docs index
- User Manual
- Tutorial Series
- Script Canvas Authoring Guide
- Script Canvas AI System Prompt
- IndexTTS2 Prompting Guide
- SFX, Ambience, And Music Smoke Test
v3.0.0-beta.19 is the latest public Docker launcher release. It uses refreshed public v3.0.0-beta.19 Docker images.
The native Windows installer preview is now attached to v3.0.0-beta.19, but it remains experimental while clean-machine startup and generation feedback is still being gathered.
All beta container images are public and pullable from GitHub Container Registry:
- draft-to-take-backend
- draft-to-take-frontend
- draft-to-take-script-llm
- draft-to-take-omnivoice
- draft-to-take-sfx
This beta is best for people comfortable testing local AI tools on Windows. The Docker launcher is currently the recommended route; the installer path is unsigned and experimental.
Good testers:
- Run Windows 11.
- Have an NVIDIA GPU, ideally with 12-16 GB VRAM.
- Can tolerate large downloads and rough edges.
- Are willing to report bugs with hardware details and safe log excerpts.
- Windows 11 recommended.
- NVIDIA GPU strongly recommended.
- 32 GB system RAM recommended for the full workflow.
- 12-16 GB VRAM recommended for the smoother local AI path.
- Plenty of disk space. First-run model downloads can be many gigabytes.
- Docker Desktop with WSL2 and NVIDIA Container Toolkit for the recommended Docker launcher path.
CPU fallback can work for some paths, but it will be much slower.
The installer:
- Installs Draft to Take under your Windows user account.
- Creates Start Menu shortcuts for launch, model setup, diagnostics, stop, and data folder access.
- Uses the native Windows runtime path, so Docker Desktop is not required for the installer preview.
- Offers Full Studio (recommended), Core dialogue only, and Custom setup choices.
- Downloads model packs after install or on first launch instead of bundling model weights in the setup file.
- Preserves %USERPROFILE%\DraftToTake\shared during updates and uninstall.
The launcher:
- Creates .env from .env.example if needed.
- Creates a persistent shared folder under your Windows user profile.
- Checks whether Docker is running.
- Checks whether Docker can see your NVIDIA GPU.
- Pulls the prebuilt beta images.
- Starts the backend, frontend, Qwen sidecar, and OmniVoice sidecar.
- Starts the SFX/music sidecar automatically when Docker GPU support is available.
- Opens the local frontend URL after a successful start.
If the port values are blank in .env, the launcher tries nearby ports and prints the actual URLs. Keep them blank unless you need stable URLs.
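The launcher's actual port-selection logic is not published in this README. As a rough illustration of what "trying nearby ports" can look like, here is a Python sketch that probes ports upward from 3000. The starting port comes from the default URL above; the attempt count and function names are assumptions.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to the port, meaning nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start: int = 3000, attempts: int = 10) -> int:
    """Try start, start+1, ... and return the first port that can be bound."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    raise RuntimeError(f"no free port in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print(f"http://localhost:{first_free_port()}")
```

A probe like this only tells you which port is free at the moment you check; the URL printed by start.bat remains the authoritative one.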
Your data is stored outside this release folder:
%USERPROFILE%\DraftToTake\shared
That means you can delete and re-download this beta repo without losing downloaded models, voices, projects, or exported audio.
Important folders:
- shared\models - downloaded model files.
- shared\models\checkpoints - IndexTTS2 checkpoints and Hugging Face cache.
- shared\models\llm - Qwen GGUF files.
- shared\audio\speakers - prepared speaker WAV files.
- shared\audio\source_clips - raw clips you want to prepare.
- shared\audio\outputs - exported mixes.
- shared\audio\sfx - generated or imported SFX assets.
- shared\audio\music - generated or imported music assets.
- shared\data - app/project data.
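To see how much space each of these folders is using, a small script can walk the shared folder. This is a sketch, not a bundled tool: the subfolder names are copied from the list above, and the helper names are illustrative.

```python
import os
from pathlib import Path

# Subfolders named in the list above, relative to the shared root.
SUBFOLDERS = [
    "models", "models/checkpoints", "models/llm",
    "audio/speakers", "audio/source_clips", "audio/outputs",
    "audio/sfx", "audio/music", "data",
]


def folder_size_bytes(folder: Path) -> int:
    """Sum the sizes of all regular files under `folder` (0 if it does not exist)."""
    if not folder.is_dir():
        return 0
    return sum(f.stat().st_size for f in folder.rglob("*") if f.is_file())


def report(shared_root: Path) -> dict[str, int]:
    """Map each known subfolder to its size in bytes."""
    return {name: folder_size_bytes(shared_root / name) for name in SUBFOLDERS}


if __name__ == "__main__":
    profile = Path(os.environ.get("USERPROFILE", str(Path.home())))
    root = profile / "DraftToTake" / "shared"
    for name, size in report(root).items():
        print(f"{name:<22} {size / 1e9:8.2f} GB")
```

The models total includes its checkpoints and llm subfolders, so the rows are not meant to be added together.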
This beta does not bundle model weights. The Windows installer, Docker launcher, and containers download configured models into your local shared folder or Hugging Face cache.
The default Windows installer path uses IndexTTS2 for dialogue and can install Full Studio model packs for Qwen, OmniVoice, SFX, ambience, and music. The Docker launcher starts managed Qwen, OmniVoice, and SFX/music sidecars when supported. SFX, ambience, and music can still be disabled because those model-backed paths are heavier and license-dependent.
| Feature area | Default model/source | Enabled by default | Where it is stored | Notes |
|---|---|---|---|---|
| Dialogue TTS | IndexTeam/IndexTTS-2 | Yes | shared\models\checkpoints | Main Script Canvas and timeline speech generation. The upstream bundle includes the IndexTTS2 checkpoints, tokenizer/BPE assets, emotion and speaker matrices, and related vocoder/runtime files used by IndexTTS2. |
| Script assistant and emotion detection | ufoym/Qwen3-8B-Q4_K_M-GGUF / qwen3-8b-q4_k_m.gguf | Yes | shared\models\llm | Managed llama.cpp sidecar used by Qwen emotion-vector detection and, only when explicitly enabled, the experimental AI Thread. |
| Reusable voice design | k2-fsa/OmniVoice | Yes | Hugging Face cache under shared\models\checkpoints\hf_cache | Creates prepared voice WAVs for the Voice Studio. Final dialogue rendering still uses IndexTTS2. |
| SFX and ambience | AEmotionStudio/woosh-models, default model Woosh-DFlow | Yes when Docker GPU support is available | shared\models\woosh | SFX/music sidecar. Woosh-Flow can be selected as a slower quality option. |
| Music beds | facebook/musicgen-small | Yes when Docker GPU support is available | Hugging Face cache under shared\models\checkpoints\hf_cache | Music generation through the SFX/music sidecar. |
| Sound-cue alignment | openai/whisper-tiny.en | Lazy/optional | Hugging Face cache under shared\models\checkpoints\hf_cache | Used only when Whisper alignment is available and sound cue markers need word-timestamp alignment. |
| Speaker similarity checks | speechbrain/spkrec-ecapa-voxceleb and funasr/campplus / campplus_cn_common.bin | Lazy/optional | shared\models\pretrained and Hugging Face cache | Used for optional speaker similarity scoring and reranking during voice prep/quality checks. |
| Neural cleanup | DeepFilterNet via the df package | Lazy/optional | Docker cache volume / package cache | Used only when DeepFilterNet cleanup is selected or available through auto cleanup mode. Classic noise reduction can be used when it is unavailable. |
Most defaults can be changed in .env. The most useful model overrides are INDTEXTS_MODEL_REPO, SCRIPT_LLM_MODEL_REPO_ID, SCRIPT_LLM_MODEL_FILENAME, OMNIVOICE_MODEL_ID, SFX_WOOSH_WEIGHTS_REPO, SFX_WOOSH_MODEL_NAME, MUSIC_MODEL_ID, and DRAFT_TO_TAKE_WHISPER_MODEL.
The Docker launcher starts these services by default:
- Main Draft to Take backend.
- Frontend UI.
- Managed Qwen sidecar, used for emotion detection and the experimental AI Thread when explicitly enabled.
- OmniVoice sidecar, used for beta testing reusable voice design.
- SFX/music sidecar, when Docker GPU support is available.
Qwen is enabled by default because emotion detection depends on it. You can turn off the AI Thread in the app settings if you do not want to use the experimental assistant workflow.
SFX/music generation is enabled by default when Docker can see an NVIDIA GPU. In the installer preview, Full Studio can download the related model packs, but the heavier native sound-design path is still a preview area. The current model-backed generators are experimental, heavier, and license-dependent, so you can still turn them off.
To disable SFX/music, edit .env and set:
INDTEXTS_SFX_ENABLED=false
Then run start.bat again. If Docker GPU support is not available, start.bat leaves SFX/music disabled unless you explicitly opt in.
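If you would rather script the .env change than edit the file by hand, something like the following Python sketch could work. It assumes .env uses plain KEY=value lines and that you run it from the extracted launcher folder; set_env_value is a hypothetical helper, not a shipped script.

```python
from pathlib import Path


def set_env_value(env_path: Path, key: str, value: str) -> None:
    """Set KEY=value in a .env file, replacing an existing line or appending a new one."""
    lines = env_path.read_text(encoding="utf-8").splitlines() if env_path.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Match only active assignments for this exact key; commented lines are left alone.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    env_path.write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Assumption: run from the extracted launcher folder, next to start.bat.
    env_file = Path(".env")
    if env_file.exists():
        set_env_value(env_file, "INDTEXTS_SFX_ENABLED", "false")
        print("INDTEXTS_SFX_ENABLED set to false")
    else:
        print("no .env found in this folder")
```

The same helper can set other keys mentioned in this README, such as HF_TOKEN.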
Only use SFX/music if your machine has enough VRAM and you understand that generated asset rights depend on the upstream model licenses and your use case.
Installer users can update by downloading the newer setup .exe from the installer release and running it. Your shared folder under %USERPROFILE%\DraftToTake\shared is preserved.
Docker launcher users can update to a newer beta:
- Run:
stop.bat
- Download the new beta repo ZIP or pull the latest repo.
- Run:
start.bat
Your shared folder under %USERPROFILE%\DraftToTake\shared is not deleted.
After the new version starts, you can reclaim old beta Docker images with:
cleanup-docker-space.bat
This keeps the current image tag and removes older Draft to Take beta image tags. It does not delete %USERPROFILE%\DraftToTake\shared.
If a new release uses a new Docker image tag, check .env and update:
DRAFT_TO_TAKE_IMAGE_TAG=v3.0.0-beta.19
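To double-check which images a given tag corresponds to, you could read the tag from .env and print the matching image references. This Python sketch assumes the naming pattern seen in the image list earlier in this README (ghcr.io/jayspiffy/draft-to-take-<service>:<tag>); whether the launcher builds the names the same way internally is an assumption.

```python
from pathlib import Path

REGISTRY = "ghcr.io/jayspiffy"
SERVICES = ["backend", "frontend", "script-llm", "omnivoice", "sfx"]


def read_env(env_path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def image_refs(tag: str) -> list[str]:
    """Build one image reference per service for the given tag."""
    return [f"{REGISTRY}/draft-to-take-{name}:{tag}" for name in SERVICES]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        tag = read_env(env_file).get("DRAFT_TO_TAKE_IMAGE_TAG", "")
        print("\n".join(image_refs(tag)) if tag else "DRAFT_TO_TAKE_IMAGE_TAG is not set")
    else:
        print("no .env found in this folder")
```

Comparing this output with the tags listed in Docker Desktop shows which local images belong to the current beta.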
Run:
stop.bat
This stops containers but does not delete shared models, voices, projects, or outputs.
If the containers are already running, use:
open.bat
To open your persistent shared folder, use:
open-shared-folder.bat
If something breaks, run:
collect-diagnostics.bat
It writes a diagnostics text file under:
%USERPROFILE%\DraftToTake\diagnostics
Review the file before posting it publicly. Do not share private scripts, voices, tokens, speaker samples, generated audio, or personal data unless you are comfortable doing so.
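A quick redaction pass can catch the most obvious secrets before you read through the file yourself. The Python sketch below is illustrative only: the patterns are guesses at what a diagnostics file might contain, the token length bound is loose, and it does not replace a manual review.

```python
import re
import sys
from pathlib import Path

# Hugging Face user access tokens start with "hf_"; the length bound is a loose guess.
HF_TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{20,}")
# Any KEY=value line whose key mentions TOKEN, SECRET, or PASSWORD.
SECRET_LINE_RE = re.compile(
    r"(?im)^([ \t]*[A-Z0-9_]*(?:TOKEN|SECRET|PASSWORD)[A-Z0-9_]*[ \t]*=[ \t]*)\S.*$"
)
# Windows profile paths such as C:\Users\alice, which reveal the account name.
USER_PATH_RE = re.compile(r"(?i)([A-Z]:\\Users\\)[^\\\s]+")


def redact(text: str) -> str:
    """Mask secret-looking assignments, bare HF tokens, and Windows user names."""
    text = SECRET_LINE_RE.sub(r"\1<redacted>", text)
    text = HF_TOKEN_RE.sub("<redacted-hf-token>", text)
    text = USER_PATH_RE.sub(r"\1<user>", text)
    return text


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python redact.py <diagnostics file>; writes a sibling .redacted.txt file.
    source = Path(sys.argv[1])
    target = source.with_suffix(".redacted.txt")
    target.write_text(redact(source.read_text(encoding="utf-8", errors="replace")), encoding="utf-8")
    print(f"wrote {target}")
```

Script text, voice names, and project titles are not covered by these patterns, so still check for them by eye.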
The installer is unsigned during beta. Windows may warn that the app is from an unknown publisher. Check that the downloaded installer matches the SHA256 checksum on the release page.
A slow first launch is expected on a fresh install. Runtime setup and model downloads can take a while, especially with Full Studio (recommended) selected. Keep the launch window open and let it finish.
Make sure Docker Desktop is running and your network can reach GitHub Container Registry.
The images are public, so docker login ghcr.io should not be required for this beta.
The first pull is large. A full GPU start can use roughly 25-40 GB of Docker disk space before app models download into %USERPROFILE%\DraftToTake\shared. Pull progress can pause near 99% while Docker verifies and extracts image layers; this can take several minutes on slower disks.
When you update between beta tags, Docker may keep the old image tags as well as the new ones. If a beta update appears to consume another 10-40 GB, run:
cleanup-docker-space.bat
This removes old Draft to Take beta image tags and dangling image layers. It does not delete %USERPROFILE%\DraftToTake\shared.
The launcher pulls each enabled service image separately and retries transient "unexpected EOF" failures. If it still fails, restart Docker Desktop and run start.bat again.
If a pull half-completes or containers later fail with exec format error or input/output error, run:
repair-docker-images.bat
Then start Docker Desktop again if needed and run:
start.bat
This repairs only Draft to Take beta containers/images. It does not delete %USERPROFILE%\DraftToTake\shared.
If Docker Desktop still shows very high disk usage after cleanup-docker-space.bat, run collect-diagnostics.bat and check the Docker Disk Usage section. Docker Desktop's built-in Troubleshoot / Clean or Purge data option can reset Docker images and containers, but you will need to run start.bat again afterwards.
The launcher will warn if Docker cannot see your NVIDIA GPU. Check Docker Desktop WSL2 integration and NVIDIA Container Toolkit support.
The app may continue in CPU mode, but generation will be much slower.
A long first start is expected on a fresh install. Docker images and model files are large, and Docker can spend several minutes extracting layers after the progress bar looks almost complete. Keep the terminal open and watch the logs before assuming it has crashed.
Check the terminal output. The launcher may choose another port if 3000 is busy, such as:
http://localhost:3001
If the browser shows "Cannot GET /" on localhost:3000, check the "Docker frontend binding:" line printed by start.bat. If it is missing or points at a different port, run collect-diagnostics.bat and include the "Compose PS" section in the issue.
Some upstream model downloads may require Hugging Face authentication depending on the model and account state.
If needed, edit .env and set:
HF_TOKEN=your_token_here
Do not post your token in public issues or screenshots.
Use this repo's Issues tab.
Good bug reports include:
- Windows version.
- Whether you used the installer or Docker launcher.
- GPU model and VRAM.
- System RAM.
- Docker Desktop version, if using Docker.
- Whether Docker GPU support works, if using Docker.
- What you clicked.
- What you expected.
- What happened.
- A safe excerpt from the diagnostics file, if relevant.
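Some of these details can be gathered with a short script. The Python sketch below collects the OS version and, if nvidia-smi is on the PATH, the GPU name and total VRAM. It is an illustration rather than a bundled tool: the field labels are made up for this example, and system RAM and Docker details are left for you to fill in by hand.

```python
import platform
import shutil
import subprocess


def gpu_summary() -> str:
    """Ask nvidia-smi for GPU name and total memory; return 'unknown' if that fails."""
    if shutil.which("nvidia-smi") is None:
        return "unknown (nvidia-smi not found)"
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=15, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return "unknown (nvidia-smi failed)"
    return result.stdout.strip() or "unknown"


def report_header(install_route: str) -> str:
    """Build the first lines of a bug report; install_route is 'installer' or 'Docker launcher'."""
    lines = [
        f"OS: {platform.system()} {platform.release()} ({platform.version()})",
        f"Install route: {install_route}",
        f"GPU / VRAM: {gpu_summary()}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(report_header("Docker launcher"))
```

Paste the output at the top of your issue, then add what you clicked, what you expected, and what happened.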
Please do not upload private scripts, paid voices, private speaker samples, tokens, or sensitive generated audio to public issues.
Useful beta feedback includes:
- Installer setup, first launch, model download, and uninstall issues.
- First-run setup problems.
- Model download problems.
- Voice preparation and speaker library issues.
- Single-line generation quality.
- Script Canvas workflow confusion.
- Timeline/export bugs.
- VRAM pressure or sidecar crashes.
- Places where the app looks frozen but is actually working.
The launcher, docs, and helper scripts in this repository are released under the MIT License.
Draft to Take does not claim ownership of scripts, projects, voices, source clips, generated audio, SFX, ambience, music, or exported mixes created or provided by users. You may use outputs you create with Draft to Take for commercial or non-commercial creative projects, provided you have the necessary rights to the input material and comply with any applicable third-party model licenses.
This beta does not sell, bundle, or grant rights to third-party model weights. The app may download models from official upstream sources onto your local machine.
The public launcher license does not grant ownership of Draft to Take, access to private source code, resale rights, redistribution rights for the full app, sublicensing rights, or the right to repackage Draft to Take as your own product.
Draft to Take container images, private source code, third-party model weights, third-party runtimes, and generated model outputs are governed separately by their own terms.
SFX/music model-backed generation is experimental and license-dependent. It can be disabled with INDTEXTS_SFX_ENABLED=false.
Draft to Take is designed around a local-first workflow. Your scripts, speaker samples, generated audio, and projects stay in your local shared folder unless you choose to share them.
For beta support, only share the minimum logs and examples needed to reproduce a problem.





