| 1 | +# audio-encode |
| 2 | + |
| 3 | +The reverse of [audio-decode](../audio-decode): encodes AudioData (`{ channelData, sampleRate }`) into a binary audio format. |
| 4 | +Individual encoders live under the `@audio` npm org (`@audio/*-encode`) and are united by the `audio-encode` umbrella package. |
| 5 | +**Must work in both browser and Node.** |
| 6 | + |
| 7 | +## Formats |
| 8 | + |
| 9 | +All practical audio formats in use: |
| 10 | + |
| 11 | +### Tier 1 — Essential (high demand, implement first) |
| 12 | + |
| 13 | +| Format | Type | Strategy | Size | Dep | |
| 14 | +|--------|------|----------|------|-----| |
| 15 | +| **WAV** | Uncompressed PCM | Pure JS (~50 LOC). RIFF header + interleaved PCM. | 0 | none | |
| 16 | +| **MP3** | Lossy | `wasm-media-encoders` (libmp3lame). 66 KB gz. Excellent. | tiny | wasm-media-encoders | |
| 17 | +| **OGG Vorbis** | Lossy | `wasm-media-encoders` (libvorbis). 158 KB gz. | small | wasm-media-encoders | |
| 18 | +| **Opus** | Lossy | libopus WASM. Best quality/bitrate ratio. ~300 KB. Needs Ogg muxer. | medium | libopus WASM (custom build or opusscript) | |
| 19 | +| **FLAC** | Lossless | `libflacjs` (libFLAC WASM). ~500 KB. | medium | libflacjs | |
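
The WAV row above ("RIFF header + interleaved PCM") can be sketched as follows. This is a minimal illustration, not the planned implementation: `encodeWav` is a hypothetical name, and the sketch covers 16-bit integer PCM only and assumes every channel has the same length.

```js
// Sketch of a 16-bit PCM WAV writer (hypothetical helper, not the final API).
function encodeWav(channelData, { sampleRate }) {
  const numChannels = channelData.length
  const numFrames = channelData[0].length
  const blockAlign = numChannels * 2          // bytes per frame at 16 bits
  const dataSize = numFrames * blockAlign
  const out = new Uint8Array(44 + dataSize)
  const view = new DataView(out.buffer)
  const writeTag = (offset, tag) => {
    for (let i = 0; i < 4; i++) view.setUint8(offset + i, tag.charCodeAt(i))
  }

  // RIFF header; all multi-byte fields are little-endian.
  writeTag(0, 'RIFF')
  view.setUint32(4, 36 + dataSize, true)      // file size minus 8
  writeTag(8, 'WAVE')
  writeTag(12, 'fmt ')
  view.setUint32(16, 16, true)                // fmt chunk size
  view.setUint16(20, 1, true)                 // format tag 1 = integer PCM
  view.setUint16(22, numChannels, true)
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * blockAlign, true) // byte rate
  view.setUint16(32, blockAlign, true)
  view.setUint16(34, 16, true)                // bits per sample
  writeTag(36, 'data')
  view.setUint32(40, dataSize, true)

  // Interleave frame by frame: one sample per channel, clamped to [-1, 1].
  let offset = 44
  for (let i = 0; i < numFrames; i++) {
    for (let c = 0; c < numChannels; c++) {
      const s = Math.max(-1, Math.min(1, channelData[c][i]))
      view.setInt16(offset, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true)
      offset += 2
    }
  }
  return out
}
```

A 32-bit float variant would use format tag 3 and `setFloat32` in place of the clamped integer write.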
| 20 | + |
| 21 | +### Tier 2 — Important |
| 22 | + |
| 23 | +| Format | Type | Strategy | Size | Dep | |
| 24 | +|--------|------|----------|------|-----| |
| 25 | +| **AAC/M4A** | Lossy | Hardest format. No clean JS/WASM path. Options: libav.js (ffmpeg AAC, LGPL), or fdk-aac WASM (abandoned, license issues). | large | libav.js or custom fdk-aac WASM | |
| 26 | +| **AIFF** | Uncompressed PCM | Pure JS (~80 LOC). IFF/AIFF header + big-endian PCM. | 0 | none | |
| 27 | +| **WebM** | Container (Opus/Vorbis) | Opus encoder + WebM/Matroska muxer. | medium | opus encoder + muxer | |
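
For the AIFF row above, the main differences from WAV are big-endian fields and a sample rate stored as an 80-bit extended float. A rough sketch under those assumptions (`encodeAiff` and `writeExtended` are hypothetical names; 16-bit output only, equal-length channels assumed):

```js
// Writes a positive finite number as an 80-bit extended float:
// 15-bit biased exponent, then a 64-bit mantissa with an explicit leading 1.
function writeExtended(view, offset, value) {
  if (!(value > 0) || !Number.isFinite(value)) throw new Error('positive finite value expected')
  let exponent = 0
  let m = value
  while (m >= 2) { m /= 2; exponent++ }
  while (m < 1) { m *= 2; exponent-- }
  const scaled = m * 2 ** 31                  // m is in [1, 2), so this fits 32 bits
  const hi = Math.floor(scaled)
  const lo = Math.floor((scaled - hi) * 2 ** 32)
  view.setUint16(offset, 16383 + exponent)
  view.setUint32(offset + 2, hi)
  view.setUint32(offset + 6, lo)
}

// Sketch of a 16-bit AIFF writer (hypothetical helper, not the final API).
function encodeAiff(channelData, { sampleRate }) {
  const numChannels = channelData.length
  const numFrames = channelData[0].length
  const dataSize = numFrames * numChannels * 2
  const out = new Uint8Array(54 + dataSize)
  const view = new DataView(out.buffer)       // DataView defaults to big-endian
  const writeTag = (offset, tag) => {
    for (let i = 0; i < 4; i++) view.setUint8(offset + i, tag.charCodeAt(i))
  }

  writeTag(0, 'FORM')
  view.setUint32(4, 46 + dataSize)            // file size minus 8
  writeTag(8, 'AIFF')
  writeTag(12, 'COMM')
  view.setUint32(16, 18)                      // COMM chunk size
  view.setInt16(20, numChannels)
  view.setUint32(22, numFrames)
  view.setInt16(26, 16)                       // bits per sample
  writeExtended(view, 28, sampleRate)
  writeTag(38, 'SSND')
  view.setUint32(42, 8 + dataSize)            // SSND chunk size
  view.setUint32(46, 0)                       // data offset
  view.setUint32(50, 0)                       // block size

  let offset = 54
  for (let i = 0; i < numFrames; i++) {
    for (let c = 0; c < numChannels; c++) {
      const s = Math.max(-1, Math.min(1, channelData[c][i]))
      view.setInt16(offset, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff))
      offset += 2
    }
  }
  return out
}
```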
| 28 | + |
| 29 | +### Tier 3 — Niche / Legacy |
| 30 | + |
| 31 | +| Format | Type | Strategy | Notes | |
| 32 | +|--------|------|----------|-------| |
| 33 | +| **QOA** | Lossy (simple) | Pure JS. qoa-format may have encoder. | Emerging format, very simple codec | |
| 34 | +| **ALAC** | Lossless (Apple) | Apple's open-source ALAC → compile to WASM. Or libav.js. | Feasible but no existing WASM build | |
| 35 | +| **CAF** | Container (Apple) | Pure JS container writer + codec. | Wraps PCM/AAC/ALAC | |
| 36 | +| **AMR** | Lossy (telephony) | opencore-amr WASM. | Very niche | |
| 37 | +| **WMA** | Lossy (Microsoft) | ffmpeg only. No open-source encoder lib. | Legacy, low priority | |
| 38 | + |
| 39 | +### Not worth encoding (decode-only) |
| 40 | + |
| 41 | +Formats that exist only for legacy playback and that nobody intentionally encodes to: |
| 42 | +- None excluded yet: even WMA has some enterprise use cases. |
| 43 | + |
| 44 | +## API Design |
| 45 | + |
| 46 | +Mirrors audio-decode in the reverse direction. The same pattern should be applied to audio-decode as well (`decode.mp3.stream()`). |
| 47 | + |
| 48 | +- `channelData` (Float32Array[]) is the payload — passed directly, not wrapped in an object |
| 49 | +- `sampleRate` is session config — set once in options, cannot change between chunks |
| 50 | +- Format is part of the method name, not an argument |
| 51 | + |
| 52 | +### 1. Whole-file encode |
| 53 | + |
| 54 | +```js |
| 55 | +import encode from 'audio-encode' |
| 56 | + |
| 57 | +let wav = await encode.wav(channelData, { sampleRate: 44100 }) |
| 58 | +let mp3 = await encode.mp3(channelData, { sampleRate: 44100, bitrate: 128 }) |
| 59 | +let flac = await encode.flac(channelData, { sampleRate: 44100 }) |
| 60 | +// → Uint8Array |
| 61 | +``` |
| 62 | + |
| 63 | +### 2. Streaming encode |
| 64 | + |
| 65 | +```js |
| 66 | +import encode from 'audio-encode' |
| 67 | + |
| 68 | +let enc = await encode.mp3.stream({ sampleRate: 44100, bitrate: 128 }) |
| 69 | +let chunk1 = enc.encode(channelData) // → Uint8Array |
| 70 | +let chunk2 = enc.encode(channelData2) // → Uint8Array |
| 71 | +let final = enc.encode() // flush + free → Uint8Array |
| 72 | +``` |
| 73 | + |
| 74 | +StreamEncoder interface: |
| 75 | +- `.encode(channelData)` → Uint8Array (encoded chunk) |
| 76 | +- `.encode()` → Uint8Array (flush remaining + finalize + free) |
| 77 | +- `.flush()` → Uint8Array (flush without freeing) |
| 78 | +- `.free()` → void (discard without flushing) |
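
One way the StreamEncoder contract above could be wrapped around codec callbacks is sketched below. The `streamEncoder(onEncode, onFlush, onFree)` signature is an assumption made for illustration, and `toyStream` is a made-up 8-bit mono codec that buffers samples in blocks of four so that flushing has something to do.

```js
// Hypothetical wrapper producing the StreamEncoder shape described above.
function streamEncoder(onEncode, onFlush, onFree) {
  let done = false
  const EMPTY = new Uint8Array(0)
  const assertLive = () => { if (done) throw new Error('encoder already freed') }
  const enc = {
    encode(channelData) {
      assertLive()
      if (channelData) return onEncode(channelData)
      // No argument: flush what is buffered, then release resources.
      const tail = onFlush ? onFlush() : EMPTY
      enc.free()
      return tail
    },
    flush() {
      assertLive()
      return onFlush ? onFlush() : EMPTY
    },
    free() {
      if (done) return
      done = true
      if (onFree) onFree()
    }
  }
  return enc
}

// Toy codec for demonstration only: unsigned 8-bit mono, emitted in blocks of 4 samples.
function toyStream() {
  let pending = []
  const toByte = s => Math.round((Math.max(-1, Math.min(1, s)) + 1) * 127.5)
  const drain = n => Uint8Array.from(pending.splice(0, n), toByte)
  return streamEncoder(
    ([mono]) => { pending.push(...mono); return drain(pending.length - pending.length % 4) },
    () => drain(pending.length),
    () => { pending = [] }
  )
}
```

With this shape, calling `encode()` with no argument is the usual way to finish a stream, while `free()` alone discards any buffered samples.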
| 79 | + |
| 80 | +### Common options |
| 81 | + |
| 82 | +``` |
| 83 | +sampleRate — output sample rate (required) |
| 84 | +bitrate — target bitrate in kbps (lossy formats) |
| 85 | +quality — quality level 0-10 (VBR, format-specific mapping) |
| 86 | +channels — output channel count (downmix/upmix) |
| 87 | +``` |
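
The `channels` option implies some remixing step before the codec sees the data. A minimal sketch of the two simplest cases follows; `remix` is a hypothetical helper, and averaging for downmix and duplication for upmix are assumptions, since the plan does not specify a mixing rule.

```js
// Hypothetical remix step: handles N -> 1 (average) and 1 -> N (duplicate) only.
function remix(channelData, channels) {
  const n = channelData.length
  if (!channels || channels === n) return channelData
  if (channels === 1) {
    const mono = new Float32Array(channelData[0].length)
    for (const ch of channelData) {
      for (let i = 0; i < mono.length; i++) mono[i] += ch[i] / n
    }
    return [mono]
  }
  if (n === 1) return Array.from({ length: channels }, () => channelData[0].slice())
  throw new Error(`remix ${n} -> ${channels} channels is not covered by this sketch`)
}
```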
| 88 | + |
| 89 | +### Individual @audio packages |
| 90 | + |
| 91 | +Encoders are not required to exist as separate packages: the umbrella wraps whatever the underlying encoder library exposes. |
| 92 | +Dedicated @audio packages may be created for consistency where it makes sense, but the API contract lives in the umbrella. |
| 93 | +This matches audio-decode, where some decoders are external packages, some are @audio packages, and the umbrella normalizes them all. |
| 94 | + |
| 95 | +## Implementation Order |
| 96 | + |
| 97 | +### Phase 0: Scaffold ✓ |
| 98 | +* [x] `audio-encode` package.json, types, entry point skeleton |
| 99 | +* [x] `streamEncoder()` helper (mirrors `streamDecoder()` from audio-decode) |
| 100 | +* [x] `norm()` for encoding results, `merge()` for flushed chunks |
| 101 | +* [x] Test harness: round-trip test pattern (encode → decode → compare) |
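
As an illustration of the scaffold items above, a chunk-merging helper and a round-trip comparison might look like this. These are guesses at the behavior, not the actual `merge()` from the package; `maxError` and `quantize16` are hypothetical names, and `quantize16` only stands in for a real encode → decode pass through a 16-bit format.

```js
// Concatenate encoded chunks into one Uint8Array (assumed behavior of a merge helper).
function merge(chunks) {
  const total = chunks.reduce((n, c) => n + c.length, 0)
  const out = new Uint8Array(total)
  let offset = 0
  for (const c of chunks) {
    out.set(c, offset)
    offset += c.length
  }
  return out
}

// Largest absolute sample difference between two equally shaped channelData arrays.
function maxError(a, b) {
  let max = 0
  for (let c = 0; c < a.length; c++) {
    for (let i = 0; i < a[c].length; i++) {
      max = Math.max(max, Math.abs(a[c][i] - b[c][i]))
    }
  }
  return max
}

// Stand-in for encode -> decode through a 16-bit format: quantize and restore.
function quantize16(channelData) {
  return channelData.map(ch => ch.map(s => Math.round(s * 32767) / 32767))
}
```

For 16-bit formats a tolerance of about one quantization step (1/32767) is a reasonable pass criterion; lossless float formats should compare exactly, and lossy codecs need a much looser, codec-specific check.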
| 102 | + |
| 103 | +### Phase 1: Pure JS formats (zero deps, prove the API) |
| 104 | +* [ ] `@audio/wav-encode` — RIFF/WAV writer. 16-bit int + 32-bit float. Trivial. |
| 105 | +* [ ] `@audio/aiff-encode` — IFF/AIFF writer. Big-endian PCM. Trivial. |
| 106 | +* [ ] Wire into `audio-encode` umbrella, verify round-trip with audio-decode. |
| 107 | + |
| 108 | +### Phase 2: Lightweight WASM (wasm-media-encoders) |
| 109 | +* [ ] `@audio/mp3-encode` — wasm-media-encoders (libmp3lame). 66 KB gz. VBR/CBR, bitrate, quality. |
| 110 | +* [ ] `@audio/ogg-encode` — wasm-media-encoders (libvorbis). 158 KB gz. Quality-based VBR. |
| 111 | +* [ ] Wire into umbrella. Round-trip tests. |
| 112 | + |
| 113 | +### Phase 3: Medium WASM |
| 114 | +* [ ] `@audio/opus-encode` — libopus WASM + Ogg container. Evaluate: opusscript (battle-tested, 3.6M dl/wk for Discord) vs custom libopus 1.5.1 WASM build. Need Ogg muxer on top. |
| 115 | +* [ ] `@audio/flac-encode` — libflacjs. Compression levels 0-8. Verify bit-perfect round-trip. |
| 116 | +* [ ] Wire into umbrella. |
| 117 | + |
| 118 | +### Phase 4: Hard formats |
| 119 | +* [ ] `@audio/aac-encode` — Evaluate: (a) libav.js variant with ffmpeg AAC encoder, (b) compile fdk-aac to WASM ourselves (license: non-free but distributable), (c) browser MediaRecorder fallback. |
| 120 | +* [ ] `@audio/webm-encode` — Opus encoding + WebM/Matroska container muxing. Evaluate ebml-muxer or custom muxer. |
| 121 | +* [ ] Wire into umbrella. |
| 122 | + |
| 123 | +### Phase 5: Niche formats (as needed) |
| 124 | +* [ ] `@audio/qoa-encode` — qoa-format (if encoder exists) or implement from spec (simple). |
| 125 | +* [ ] `@audio/alac-encode` — Compile Apple ALAC (Apache 2.0) to WASM, mux into M4A. |
| 126 | +* [ ] `@audio/caf-encode` — CAF container writer (PCM payload). |
| 127 | +* [ ] `@audio/amr-encode` — opencore-amr WASM. |
| 128 | +* [ ] `@audio/wma-encode` — ffmpeg only. Lowest priority. |
| 129 | + |
| 130 | +### Phase 6: Polish |
| 131 | +* [ ] README, docs, examples |
| 132 | +* [ ] Benchmark: encode speed, output size vs native tools |
| 133 | +* [ ] Publish all packages |
| 134 | + |
| 135 | +## Encoder Research Summary |
| 136 | + |
| 137 | +| Format | Best option | Alternative | Pure JS? | WASM size | |
| 138 | +|--------|-----------|-------------|----------|-----------| |
| 139 | +| WAV | Custom (trivial) | node-wav | Yes | 0 | |
| 140 | +| MP3 | wasm-media-encoders | lamejs (pure JS, buggy npm ver) | Partial | 66 KB gz | |
| 141 | +| OGG | wasm-media-encoders | — | No | 158 KB gz | |
| 142 | +| Opus | opusscript or custom libopus WASM | libav.js variant-opus | No | ~300 KB | |
| 143 | +| FLAC | libflacjs | libav.js variant-flac | No | ~500 KB | |
| 144 | +| AAC | libav.js (ffmpeg AAC) | fdk-aac WASM (abandoned) | No | ~1.5 MB | |
| 145 | +| AIFF | Custom (trivial) | — | Yes | 0 | |
| 146 | +| WebM | opus + ebml muxer | libav.js | No | ~300 KB + muxer | |
| 147 | +| QOA | qoa-format or custom | — | Yes | 0 | |
| 148 | +| ALAC | Apple ALAC → WASM | libav.js | No | ~200 KB est | |
| 149 | +| WMA | ffmpeg only | — | No | ~3 MB | |
| 150 | + |
| 151 | +## Notes |
| 152 | + |
| 153 | +- Google recently published an efficient MP3 encoder configuration for better compression; track it for future integration into mp3-encode (custom LAME params or an alternative approach). |
| 154 | +- lamejs has a known `MPEGMode` bug in the npm-published version, so prefer wasm-media-encoders. |
| 155 | +- AAC is the hardest format: fdk-aac (best quality) has non-free license, ffmpeg's native AAC encoder is lower quality. No clean path. |
| 156 | +- For Opus: opusscript has 3.6M downloads/week (Discord bots) but only does raw frames — need Ogg container muxing on top. |
| 157 | +- libav.js (ffmpeg WASM, 492 stars, actively maintained) is the universal fallback for any format. |
| 158 | +- @ffmpeg/ffmpeg (17K stars, 294K dl/wk) is the nuclear option — 31 MB WASM, requires SharedArrayBuffer. |