Architecture

Overview

Mofusand Synth is a Tauri 2 + SvelteKit desktop app that converts YouTube audio into 8-bit chiptune. The Rust backend downloads audio via yt-dlp and handles file I/O through three IPC commands. The Svelte frontend owns all audio processing and offers two conversion modes:

  • DSP Crush: degrades the original recording with a bitcrusher / lowpass / sample-rate-reduction chain (real-time, tweakable). Optional vocal removal.
  • True Chiptune: transcribes the song to notes with Spotify's basic-pitch ML model, then re-synthesizes those notes on square/triangle/pulse/sawtooth oscillators.

Both modes ultimately produce an AudioBuffer that flows through one shared playback engine (play/pause/seek) and one shared WAV export path.


System Layers

┌────────────────────────────────────────────────────────────────┐
│                  SvelteKit Frontend (Webview)                  │
│                                                                │
│  +page.svelte ── UrlInput ── Player ── ChiptuneControls        │
│                                                                │
│  DSP mode:      source → preGain → WaveShaper → lowpass        │
│                 → downsampler(AudioWorklet) → destination      │
│                                                                │
│  Chiptune mode: original → transcribe(basic-pitch)             │
│                 → notes → renderChiptune() → AudioBuffer       │
│                 → source → cleanGain → destination             │
└───────────────────────────┬────────────────────────────────────┘
                            │ Tauri IPC (invoke)
┌───────────────────────────▼────────────────────────────────────┐
│                     Rust / Tauri Commands                      │
│  download_audio(url)            → { path, title }              │
│  read_audio_file(path)          → Vec<u8>                      │
│  save_audio_file(bytes, name)   → ()  (native save dialog)     │
└───────────────────────────┬────────────────────────────────────┘
                            │ std::process::Command
┌───────────────────────────▼────────────────────────────────────┐
│                    yt-dlp (external binary)                    │
│  Downloads YouTube audio (mp3) to the OS temp directory        │
└────────────────────────────────────────────────────────────────┘

Directory Structure

mofusand-synth/
├── src-tauri/                       # Rust / Tauri backend
│   ├── src/
│   │   ├── main.rs                  # thin entry → lib::run()
│   │   ├── lib.rs                   # Tauri builder, registers commands
│   │   └── commands/
│   │       ├── mod.rs
│   │       ├── download.rs          # download_audio (yt-dlp)
│   │       └── file.rs              # read_audio_file, save_audio_file
│   ├── capabilities/default.json    # dialog:allow-save permission
│   ├── Cargo.toml
│   └── tauri.conf.json              # 620×800 window, csp: null
├── src/                             # SvelteKit frontend
│   ├── app.css                      # global styles + Mofusand theme
│   ├── app.html
│   ├── routes/
│   │   ├── +layout.js               # ssr = false (SPA mode)
│   │   ├── +layout.svelte           # imports app.css
│   │   └── +page.svelte             # root: state + Tauri invocations
│   └── lib/
│       ├── audio.js                 # makeBitCrushCurve, encodeWav
│       ├── transcribe.js            # basic-pitch wrapper (audio → notes)
│       ├── chiptune.js              # notes → rendered chiptune AudioBuffer
│       ├── worklet/downsampler.js   # AudioWorklet sample-rate reducer
│       └── components/
│           ├── UrlInput.svelte
│           ├── Player.svelte        # both modes, playback, download
│           └── ChiptuneControls.svelte  # DSP sliders
├── static/
│   └── model/                       # basic-pitch model (served at /model/)
│       ├── model.json
│       └── group1-shard1of1.bin
└── docs/

Data Flow

Download (both modes)

paste URL → +page.svelte invoke("download_audio", {url})
  → Rust yt-dlp → { path, title }
  → invoke("read_audio_file", {path}) → bytes
  → Player decodes → originalBuffer
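
The two IPC calls in this flow can be sketched as a single helper. This is a minimal illustration, not the code in +page.svelte: the name `fetchOriginalBytes` is hypothetical, `invoke` is passed in as a parameter so the function can run without a Tauri runtime, and the handling of the returned `Vec<u8>` (plain number array or `ArrayBuffer`) is an assumption about how the bytes arrive on the JavaScript side.

```javascript
// Hypothetical helper: chains the two IPC commands from the flow above.
// `invoke` is injected; in the app it would be the function imported from
// "@tauri-apps/api/core".
async function fetchOriginalBytes(invoke, url) {
  // Step 1: the Rust side runs yt-dlp and reports where the file landed.
  const { path, title } = await invoke("download_audio", { url });

  // Step 2: read the file back over IPC.
  // Assumption: Vec<u8> arrives as a number array or an ArrayBuffer.
  const raw = await invoke("read_audio_file", { path });
  const bytes =
    raw instanceof ArrayBuffer ? new Uint8Array(raw) : Uint8Array.from(raw);

  return { title, bytes };
}
```

The Player would then hand `bytes.buffer` to `AudioContext.decodeAudioData()` to obtain originalBuffer.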

DSP Crush mode

originalBuffer → BufferSource → [vocal removal?] → preGain
  → WaveShaper(bitcrush) → BiquadFilter(lowpass)
  → AudioWorklet(downsampler) → destination

Sliders (Bit Depth / Sample Rate / Wave Crush) update nodes live.
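
To make the two lossy stages concrete, here is a rough sketch of the math behind them on plain arrays. Both functions are hypothetical stand-ins: `bitCrushCurve` shows the kind of staircase curve a function like makeBitCrushCurve could hand to a WaveShaperNode, and `sampleAndHold` shows the sample-rate reduction idea behind the downsampler worklet. The parameter names, the default curve size of 4096, and the integer hold factor are assumptions, not the signatures in audio.js or downsampler.js.

```javascript
// Staircase transfer curve: maps inputs in [-1, 1] onto 2^bits evenly
// spaced output levels. Suitable as a WaveShaperNode `curve`.
function bitCrushCurve(bits, size = 4096) {
  const levels = 2 ** bits;
  const step = 2 / (levels - 1);
  const curve = new Float32Array(size);
  for (let i = 0; i < size; i++) {
    const x = (i / (size - 1)) * 2 - 1; // curve index -> input in [-1, 1]
    curve[i] = Math.round((x + 1) / step) * step - 1; // snap to nearest level
  }
  return curve;
}

// Sample-and-hold: keep every `factor`-th sample and repeat it until the
// next one, which emulates a lower sample rate at the original rate.
function sampleAndHold(input, factor) {
  const out = new Float32Array(input.length);
  let held = 0;
  for (let i = 0; i < input.length; i++) {
    if (i % factor === 0) held = input[i];
    out[i] = held;
  }
  return out;
}
```

Inside an AudioWorkletProcessor the hold counter would need to persist across render blocks rather than restart at each call, which this array version does not model.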

True Chiptune mode

originalBuffer → resample mono 22050Hz → basic-pitch.evaluateModel()
  → note events (pitchMidi, startTimeSeconds, durationSeconds, amplitude)
  → renderChiptune(notes) [OfflineAudioContext: oscillators + envelopes]
  → chiptuneBuffer → BufferSource → cleanGain → destination

Notes are cached; changing the waveform only re-runs renderChiptune.
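
The re-synthesis step can be approximated without Web Audio. The sketch below is not renderChiptune, which per the flow above uses an OfflineAudioContext with oscillators and envelopes; it is a plain-array illustration of turning note events into a square wave. The note field names come from the flow above, while `midiToHz`, `renderSquare`, the default sample rate, and the 0.25 headroom factor are assumptions made for the example. No envelope is applied.

```javascript
// Standard equal-temperament mapping: MIDI note 69 is A4 = 440 Hz.
const midiToHz = (midi) => 440 * 2 ** ((midi - 69) / 12);

// Hypothetical renderer: sums one square wave per note into a mono buffer.
function renderSquare(notes, sampleRate = 44100) {
  const end = notes.reduce(
    (t, n) => Math.max(t, n.startTimeSeconds + n.durationSeconds),
    0
  );
  const out = new Float32Array(Math.ceil(end * sampleRate));
  for (const n of notes) {
    const freq = midiToHz(n.pitchMidi);
    const first = Math.floor(n.startTimeSeconds * sampleRate);
    const count = Math.floor(n.durationSeconds * sampleRate);
    for (let i = 0; i < count && first + i < out.length; i++) {
      const phase = ((i * freq) / sampleRate) % 1; // position within one cycle
      // 50% duty square wave, scaled down to leave headroom for overlaps.
      out[first + i] += (phase < 0.5 ? 1 : -1) * n.amplitude * 0.25;
    }
  }
  return out;
}
```

Because the notes array is the only input, switching waveforms only requires replacing the inner `phase < 0.5` expression and rendering again, which mirrors why cached notes make waveform changes cheap.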

WAV export (both modes)

DSP mode:      OfflineAudioContext re-renders effects chain → encodeWav
Chiptune mode: chiptuneBuffer already rendered → encodeWav
  → invoke("save_audio_file", {bytes, filename}) → native dialog
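
For reference, a WAV encoder of the kind this path relies on can be quite small. The sketch below writes a canonical 44-byte RIFF header followed by interleaved 16-bit PCM. It is an illustration only: the name `encodeWav16`, its arguments (an array of per-channel `Float32Array`s plus a sample rate), and the 16-bit depth are assumptions, and the real encodeWav in audio.js may differ.

```javascript
// Hypothetical encoder: Float32 channels in [-1, 1] -> 16-bit PCM WAV bytes.
function encodeWav16(channels, sampleRate) {
  const numChannels = channels.length;
  const numFrames = channels[0].length;
  const blockAlign = numChannels * 2; // bytes per frame at 16 bits
  const dataSize = numFrames * blockAlign;
  const view = new DataView(new ArrayBuffer(44 + dataSize));
  const writeTag = (offset, tag) => {
    for (let i = 0; i < 4; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus first 8 bytes
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // format 1 = PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true); // bits per sample
  writeTag(36, "data");
  view.setUint32(40, dataSize, true);

  let offset = 44;
  for (let i = 0; i < numFrames; i++) {
    for (let c = 0; c < numChannels; c++) {
      const s = Math.max(-1, Math.min(1, channels[c][i])); // clamp
      view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7fff, true);
      offset += 2;
    }
  }
  return new Uint8Array(view.buffer);
}
```

With an AudioBuffer, the channel list could be gathered via `getChannelData(c)` for each channel index before encoding; whether the resulting bytes are sent to save_audio_file as a typed array or a plain array depends on how the Rust command declares its argument.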

Key Constraints

  • yt-dlp must be on PATH; if it is missing, the failure is surfaced as an inline error.
  • Tauri capabilities must allow dialog:allow-save.
  • CSP is disabled (csp: null) so tfjs and the local model load freely.
  • Vocal removal uses L−R channel cancellation; requires a stereo source.
  • basic-pitch transcription is best on melodic content; very dense mixes get noisy.
  • Audio is held in memory (Vec<u8> / AudioBuffer), which is fine for typical song lengths.
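
The stereo requirement for vocal removal follows directly from the arithmetic. The sketch below is a hypothetical array-level version of L−R cancellation (the function name `removeCenter` and the 0.5 scaling are assumptions; the app may implement this with Web Audio nodes instead). Anything mixed identically into both channels, typically lead vocals, cancels out, while content panned to one side survives.

```javascript
// Hypothetical L−R cancellation on raw channel data.
// Signal common to both channels (the "center") subtracts to zero.
function removeCenter(left, right) {
  if (left.length !== right.length) {
    throw new Error("channel length mismatch");
  }
  const out = new Float32Array(left.length);
  for (let i = 0; i < left.length; i++) {
    // Halve the difference so the result stays within [-1, 1].
    out[i] = 0.5 * (left[i] - right[i]);
  }
  return out;
}
```

A mono source has identical left and right channels, so the same subtraction would return silence, which is why the constraint above calls for a stereo input.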