Whisper Transcriber

CLI tool for transcribing audio files with OpenAI Whisper.

Prerequisites

  • uv (install with pip install uv or pipx install uv)
  • ffmpeg (install with brew install ffmpeg on macOS or sudo apt install ffmpeg on Linux)

Setup

uv sync

Usage

Simplest case: add your video or audio file to data/input and run:

uv run python main.py

Or run the steps separately:

  • Extract MP3 from your recording (optional helper script):
./scripts/extract_audio.sh data/input/<filename>.mov
  • Transcribe the MP3:
uv run python main.py data/input/<filename>.mp3
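
The contents of scripts/extract_audio.sh are not shown here, so the following is only a sketch of what an MP3 extraction step could look like, assuming it wraps ffmpeg. The function name ffmpeg_extract_command and the chosen quality setting are hypothetical; the ffmpeg flags themselves (-i, -vn, -codec:a, -q:a) are standard.

```python
from pathlib import Path


def ffmpeg_extract_command(video: Path) -> list[str]:
    """Build an ffmpeg command that writes <name>.mp3 next to the video.

    Assumption: the real helper script may use different flags.
    """
    target = video.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-i", str(video),          # input recording
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode audio as MP3
        "-q:a", "2",               # VBR quality level (hypothetical choice)
        str(target),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is overwritten.
    print(" ".join(ffmpeg_extract_command(Path("data/input/demo.mov"))))
```

To actually run the command, the list can be passed to subprocess.run(cmd, check=True).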

By default, transcript output is saved to:

data/output/<filename>.txt
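
As an illustration of the naming rule above, this sketch derives the transcript path from the input file name. The helper transcript_path is hypothetical and not taken from main.py; it only assumes that the output keeps the input's base name and swaps the extension for .txt.

```python
from pathlib import Path


def transcript_path(audio_file: str, output_dir: str = "data/output") -> Path:
    """Return <output_dir>/<base name>.txt for the given audio file."""
    # Path.stem drops only the last extension, so "talk.final.mp3"
    # becomes "talk.final".
    return Path(output_dir) / (Path(audio_file).stem + ".txt")


if __name__ == "__main__":
    print(transcript_path("data/input/meeting.mp3").as_posix())
    print(transcript_path("data/input/meeting.mp3", "data/output/custom").as_posix())
```

The second call mirrors what --output-dir data/output/custom would be expected to produce.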

Options

uv run python main.py data/input/<filename>.mp3 --model small.en
uv run python main.py data/input/<filename>.mp3 --output-dir data/output/custom
uv run python main.py data/input/<filename>.mp3 --language Japanese --task translate --model medium
uv run python main.py data/input/<filename>.mp3 --realtime
uv run python main.py data/input/<filename>.mp3 --timestamps
uv run python main.py data/input/<filename>.mp3 --no-fast-decode

--timestamps writes lines like [00:36.000 --> 00:49.000] ... to the output file. --realtime is optional (off by default) because streaming logs can slow long CPU transcriptions.
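
The timestamp line format can be reproduced with a few lines of Python. This is a sketch, not the tool's own code: format_timestamp and format_segment are hypothetical names, and because the README only shows MM:SS.mmm values, the sketch simply lets the minutes field grow past 59 for long recordings.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as '[start --> end] text'."""
    return f"[{format_timestamp(start)} --> {format_timestamp(end)}] {text.strip()}"


if __name__ == "__main__":
    print(format_segment(36.0, 49.0, " Welcome back, everyone."))
```

Whisper's transcription result includes a list of segments with start, end, and text fields, which is the kind of data such a formatter would consume.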

Batch Mode (All In One)

Run without the audio_file argument:

uv run python main.py

Batch mode will:

  • Scan data/input for .mov files
  • Convert each .mov to .mp3 with scripts/extract_audio.sh
  • Skip conversion when the target .mp3 already exists
  • Transcribe all .mp3 files in data/input
  • Save transcripts into data/output
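
The steps above can be sketched as a small planning function. This is an illustrative reconstruction under stated assumptions, not the code in main.py: plan_batch is a hypothetical name, matching is case-sensitive, and the function only decides what to convert and transcribe without running ffmpeg or Whisper.

```python
from pathlib import Path


def plan_batch(input_dir: Path) -> tuple[list[Path], list[Path]]:
    """Return (.mov files to convert, .mp3 files to transcribe)."""
    movs = sorted(input_dir.glob("*.mov"))
    # Skip conversion when the target .mp3 already exists.
    to_convert = [m for m in movs if not m.with_suffix(".mp3").exists()]
    # After conversion, every existing and newly created .mp3 is transcribed.
    mp3s = set(input_dir.glob("*.mp3"))
    mp3s.update(m.with_suffix(".mp3") for m in to_convert)
    return to_convert, sorted(mp3s)


if __name__ == "__main__":
    convert, transcribe = plan_batch(Path("data/input"))
    for mov in convert:
        print(f"convert    {mov}")
    for mp3 in transcribe:
        print(f"transcribe {mp3}")
```

Separating the plan from the execution makes the skip rule easy to check on a directory before any long-running work starts.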

You can override paths:

uv run python main.py --input-dir data/input --output-dir data/output --extractor-script scripts/extract_audio.sh

Available models:

tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large, turbo
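
In Whisper, model names ending in .en are English-only, so they are a poor fit for other languages or for --task translate. A minimal sketch of a name check follows; the helpers is_english_only and validate_model are hypothetical, and the model list is copied from the line above.

```python
AVAILABLE_MODELS = (
    "tiny.en", "tiny", "base.en", "base", "small.en",
    "small", "medium.en", "medium", "large", "turbo",
)


def is_english_only(model_name: str) -> bool:
    """Whisper's .en checkpoints are trained on English audio only."""
    return model_name.endswith(".en")


def validate_model(model_name: str, task: str = "transcribe") -> str:
    """Reject unknown names and English-only models used for translation."""
    if model_name not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model: {model_name!r}")
    if task == "translate" and is_english_only(model_name):
        raise ValueError(f"{model_name!r} is English-only; pick a multilingual model")
    return model_name


if __name__ == "__main__":
    print(validate_model("medium", task="translate"))
```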

Notes

  • The CLI uses Typer + Rich for styled logs and progress states.
  • Transcription uses the Python Whisper API (model.transcribe) for full-audio processing.
  • See the Whisper repository for model details.
  • Translation with uv run python main.py --task translate --language sv --model tiny has not been properly tested and does not give good results, so use it with caution.
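
For readers who want to call Whisper directly, here is a minimal sketch of the Python API mentioned in the notes. whisper.load_model and model.transcribe are the library's real entry points; the wrapper names build_decode_options and transcribe_file are hypothetical, and how main.py maps its flags onto these calls is an assumption.

```python
from typing import Optional


def build_decode_options(language: Optional[str] = None, task: str = "transcribe") -> dict:
    """Collect the keyword arguments passed through to model.transcribe."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options = {"task": task}
    if language is not None:
        options["language"] = language  # omit to let Whisper detect it
    return options


def transcribe_file(audio_path: str, model_name: str,
                    language: Optional[str] = None, task: str = "transcribe") -> str:
    """Load a Whisper model and return the transcript text for one file."""
    import whisper  # imported lazily; provided by the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_decode_options(language, task))
    return result["text"].strip()


if __name__ == "__main__":
    print(build_decode_options(language="Japanese", task="translate"))
```

The first call to whisper.load_model downloads the model weights, so a small model is a sensible choice for a quick trial.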

Screenshot

screenshot