Commit 2f24343

Merge pull request #2 from LunarCommand/release/v0.1.1
Release/v0.1.1
2 parents d7ab502 + 7e999d8 commit 2f24343

17 files changed

Lines changed: 589 additions & 223 deletions

.dockerignore

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# Virtual environment and build artifacts
+.venv/
+*.egg-info/
+dist/
+build/
+__pycache__/
+*.py[cod]
+
+# Test artifacts
+.pytest_cache/
+.coverage
+htmlcov/
+coverage.xml
+
+# Type / lint caches
+.mypy_cache/
+.ruff_cache/
+
+# Secrets and local config
+.env
+assets/
+
+# Dev tooling
+.pre-commit-config.yaml
+.claude/
+.idea/
+.vscode/
+*.swp
+*.swo
+.DS_Store
+
+# Git
+.git/
+.github/
+
+# Docs and non-runtime files
+docs/
+tests/
+CHANGELOG.md
+CONTRIBUTING.md
+CLAUDE.md
+SECURITY.md
+Makefile
+README.md
+uv.lock

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 assets/
+.claude/
 
 # Python
 .venv/

CHANGELOG.md

Lines changed: 25 additions & 1 deletion
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.1.1] - 2026-03-06
+
+### Added
+
+- `combined_report.json` now includes four derived metrics: `avg_time_per_file_seconds`, `avg_time_per_mb_seconds`, `processing_speed_ratio` (real-time factor), and `words_per_audio_hour` (transcription density)
+- Slack notifications now include detailed per-stage stats (processed / skipped / failed counts) and average processing time per file
+- `make test-slack` Makefile target for validating Slack webhook integration
+- Dockerfile and `.dockerignore` for containerized deployment
+- Sentiment output directory (`<base>/sentiment/`) support in batch pipeline
+
+### Changed
+
+- Centralized Demucs scratch directory resolution in CLI — RAM disk detection and fallback confirmation now happen in one place
+- Worker status reporting and failure aggregation in `pipeline-parallel` refactored for improved accuracy
+- `python-dotenv` import in Slack notifier is now conditional — avoids import-time failure when the package is absent
+- DEPLOYMENT.md expanded: HuggingFace token setup, NVIDIA driver requirements, cloud instance guidelines, and Docker usage
+- Combined report fields documented in README under the Parallel Pipeline section
+
+### Fixed
+
+- Narrowed exception handling in `gpu_utils.py`, `transcriber.py`, and `notifier.py` to avoid masking unexpected errors
+- Typo in `SeparationError` docstring
+
 ## [0.1.0] - 2026-03-01
 
 ### Added
@@ -35,5 +58,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `transformers` capped at `<4.40.0` — versions 4.40+ use `torch.utils._pytree.register_pytree_node`, an API introduced in PyTorch 2.2, which breaks with the pinned PyTorch 2.1.2
 - `make dev-setup` now reinstalls CUDA torch wheels (`torch==2.1.2+cu121`, `torchaudio==2.1.2+cu121`) as its final step — `uv sync` resolves torch from PyPI and installs the CPU-only build, silently breaking GPU inference
 
-[Unreleased]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.0...HEAD
+[Unreleased]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.1...HEAD
+[0.1.1]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.0...v0.1.1
 [0.1.0]: https://github.com/LunarCommand/audio-refinery/releases/tag/v0.1.0
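
Review note on the "`python-dotenv` import in Slack notifier is now conditional" entry: the notifier source is not part of this excerpt, so the following is only a minimal sketch of the usual optional-import pattern, not the project's actual code. The helper name `get_webhook_url` is hypothetical; `load_dotenv` and `SLACK_WEBHOOK_URL` are the names that appear in the Makefile hunk of this commit.

```python
import os
from typing import Optional

try:
    # python-dotenv is treated as optional: if it is missing, we simply
    # skip .env loading instead of failing at import time.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_webhook_url(env_var: str = "SLACK_WEBHOOK_URL") -> Optional[str]:
    """Hypothetical helper: return the webhook URL, or None if it is unset."""
    if load_dotenv is not None:
        # By default load_dotenv() does not override variables that are
        # already present in the process environment.
        load_dotenv()
    return os.environ.get(env_var) or None
```

With this shape, a missing `python-dotenv` only means that `.env` files are ignored; an exported environment variable is still picked up.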

Dockerfile

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04
+
+# System dependencies
+RUN apt-get update && apt-get install -y \
+    python3.11 python3.11-dev python3-pip python3.11-venv \
+    ffmpeg git curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Non-root user
+RUN useradd -m -u 1000 refinery
+WORKDIR /app
+USER refinery
+
+# Install uv
+RUN pip install --user uv
+
+# Copy and install the package (resolves main deps; may pull CPU-only torch)
+COPY --chown=refinery:refinery . .
+RUN uv pip install -e .
+
+# Install WhisperX at the pinned commit — no-deps to avoid overwriting torch
+# v3.1.1 tag has the old API without device_index; use the correct commit instead
+RUN uv pip install --no-deps \
+    "whisperx @ git+https://github.com/m-bain/whisperX.git@741ab9a2a8a1076c171e785363b23c55a91ceff1"
+
+# Install pinned WhisperX runtime deps
+# transformers must stay <4.40.0 — 4.40+ uses torch.utils._pytree.register_pytree_node
+# which was added in PyTorch 2.2 and breaks with the pinned 2.1.2
+RUN uv pip install \
+    "av==16.1.0" "ctranslate2==4.7.1" "faster-whisper==1.2.1" \
+    "flatbuffers==25.12.19" "nltk==3.9.2" "onnxruntime==1.24.1" \
+    "transformers>=4.30.0,<4.40.0"
+
+# Reinstall PyTorch with CUDA 12.1 wheels last — uv pip install -e . above may have
+# pulled CPU-only builds; this guarantees the CUDA wheel is what's actually used
+RUN uv pip install torch==2.1.2+cu121 torchaudio==2.1.2+cu121 \
+    --extra-index-url https://download.pytorch.org/whl/cu121
+
+CMD ["audio-refinery", "--help"]
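
Review note on the final `torch==2.1.2+cu121` step: since the comments in this Dockerfile warn that an earlier step may pull a different torch build, a small check could confirm which wheel actually ended up installed. The sketch below is not part of the commit; the function names are hypothetical, and it assumes the installed wheel's metadata version carries the `+cu121` local version label, as the pinned requirement suggests.

```python
from importlib import metadata
from typing import Optional


def local_version_tag(version: str) -> Optional[str]:
    """Return the local version label ('cu121' in '2.1.2+cu121'), if any."""
    _, sep, local = version.partition("+")
    return local if sep else None


def is_cuda_wheel(version: str, expected_tag: str = "cu121") -> bool:
    """True when the version string carries the expected local label."""
    return local_version_tag(version) == expected_tag


if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        raise SystemExit("torch is not installed")
    if not is_cuda_wheel(installed):
        raise SystemExit(f"torch {installed} does not look like a cu121 build")
    print(f"torch {installed} looks like a CUDA 12.1 build")
```

Run as a script inside the image, this exits non-zero when the label is missing. It inspects package metadata only and does not confirm that a GPU is visible at runtime.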

Makefile

Lines changed: 11 additions & 0 deletions
@@ -79,6 +79,17 @@ dev-setup: install-dev install-whisperx install-torch-cuda pre-commit-install ##
 	@echo " 2. Run 'make test' to verify everything works"
 	@echo " 3. Run 'audio-refinery --help' to see available commands"
 
+test-slack: ## Send a test Slack notification to verify SLACK_WEBHOOK_URL is configured
+	@uv run python -c "\
+	from dotenv import load_dotenv; \
+	load_dotenv(); \
+	import os, sys, json, urllib.request; \
+	url = os.getenv('SLACK_WEBHOOK_URL') or (print('SLACK_WEBHOOK_URL is not set — add it to .env or export it') or sys.exit(1)); \
+	data = json.dumps({'text': ':white_check_mark: *Test notification* from \`audio-refinery\` — Slack integration is working.'}).encode(); \
+	req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'}); \
+	urllib.request.urlopen(req, timeout=5); \
+	print('Test notification sent — check your Slack channel')"
+
 stats: ## Show project statistics
 	@echo "Project Statistics:"
 	@echo "==================="
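
Review note on `test-slack`: the semicolon-joined one-liner is compact but hard to read and to test. As an illustration only, the same logic could live in a small script. The function names `build_request` and `main` are hypothetical, the message text is shortened, and the `.env` loading step from the recipe is left out here to keep the sketch standard-library only.

```python
import json
import os
import sys
import urllib.request


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build the JSON POST request for a Slack incoming webhook."""
    data = json.dumps({"text": text}).encode("utf-8")
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )


def main() -> int:
    url = os.getenv("SLACK_WEBHOOK_URL")
    if not url:
        print("SLACK_WEBHOOK_URL is not set", file=sys.stderr)
        return 1
    req = build_request(url, ":white_check_mark: *Test notification*")
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(req, timeout=5):
        pass
    print("Test notification sent")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Splitting out `build_request` lets the payload and headers be checked without any network access.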

README.md

Lines changed: 23 additions & 0 deletions
@@ -607,6 +607,29 @@ Options:
 --help Show this message and exit.
 ```
 
+### Combined report fields
+
+`combined_report.json` is always written after all workers finish. It contains aggregate metrics across all workers:
+
+| Field | Type | Description |
+|---|---|---|
+| `run_at` | string | ISO 8601 timestamp of run start (UTC) |
+| `total_discovered` | int | Total WAV files found in `extracted/` |
+| `total_time_seconds` | float | Wall-clock seconds from first worker start to last finish |
+| `total_audio_hours` | float | Total audio duration processed across all workers |
+| `source_audio_bytes` | int | Combined size of all input WAV files |
+| `total_words` | int | Total words transcribed across all files |
+| `total_segments` | int | Total transcript segments across all files |
+| `avg_time_per_file_seconds` | float | `total_time / total_discovered` — average wall-clock cost per file |
+| `avg_time_per_mb_seconds` | float | `total_time / source_MB` — processing seconds per MB of source audio |
+| `processing_speed_ratio` | float | `audio_seconds / wall_seconds` — real-time factor (e.g. `3.7` means the pipeline processed audio 3.7× faster than its playback duration) |
+| `words_per_audio_hour` | float | Transcription density — useful for detecting sparse/silent audio or diarization misses |
+| `gpu_temp_celsius` | object | Per-device temperature summary: `peak_celsius`, `avg_celsius`, `sample_count` |
+| `workers` | array | Per-worker label, device, exit code, and individual summary |
+| `combined_failures` | array | Aggregated failure records from all workers |
+
+`null` is written for derived metrics when the divisor is zero (e.g. `avg_time_per_file_seconds` is `null` if no files were discovered).
+
 ### Power limit / sudoers
 
 `--power-limit` invokes `sudo nvidia-smi -pl <watts>`. To allow this without a password prompt:
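
Review note on the "Combined report fields" table added in this hunk: the four derived metrics and the `null`-on-zero-divisor rule can be illustrated with a short sketch. This is not the project's implementation. The function names are hypothetical, `words_per_audio_hour` is assumed to be `total_words / total_audio_hours` (the table gives no formula for it), and one MB is assumed to be 2**20 bytes.

```python
import json
from typing import Optional


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None (JSON null) when the divisor is zero."""
    return numerator / denominator if denominator else None


def derived_metrics(
    total_time_seconds: float,
    total_discovered: int,
    source_audio_bytes: int,
    total_audio_hours: float,
    total_words: int,
) -> dict:
    source_mb = source_audio_bytes / (1024 * 1024)  # assumption: MB = 2**20 bytes
    audio_seconds = total_audio_hours * 3600
    return {
        "avg_time_per_file_seconds": safe_div(total_time_seconds, total_discovered),
        "avg_time_per_mb_seconds": safe_div(total_time_seconds, source_mb),
        "processing_speed_ratio": safe_div(audio_seconds, total_time_seconds),
        # Assumed formula: words divided by hours of audio.
        "words_per_audio_hour": safe_div(total_words, total_audio_hours),
    }


if __name__ == "__main__":
    # An empty run: every divisor is zero, so every metric serializes as null.
    print(json.dumps(derived_metrics(0.0, 0, 0, 0.0, 0), indent=2))
```

Returning `None` rather than `0.0` keeps "no data" distinguishable from a genuine zero once the report is serialized with `json.dumps`.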
