Commit 579054a

Scaffold
1 parent 0def8eb commit 579054a

7 files changed

Lines changed: 834 additions & 0 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
/.claude
/node_modules

.work/todo.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
# audio-encode

Reverse of [audio-decode](../audio-decode): encodes AudioData (`{ channelData, sampleRate }`) into a binary format.
Individual packages live under the `@audio` org (npm: `@audio/*-encode`), united by the umbrella package `audio-encode`.
**Must work in both browser and Node.**

## Formats

All practical audio formats in use:

### Tier 1: Essential (high demand, implement first)

| Format | Type | Strategy | Size | Dep |
|--------|------|----------|------|-----|
| **WAV** | Uncompressed PCM | Pure JS (~50 LOC). RIFF header + interleaved PCM. | 0 | none |
| **MP3** | Lossy | `wasm-media-encoders` (libmp3lame). 66 KB gz. Excellent. | tiny | wasm-media-encoders |
| **OGG Vorbis** | Lossy | `wasm-media-encoders` (libvorbis). 158 KB gz. | small | wasm-media-encoders |
| **Opus** | Lossy | libopus WASM. Best quality/bitrate ratio. ~300 KB. Needs Ogg muxer. | medium | libopus WASM (custom build or opusscript) |
| **FLAC** | Lossless | `libflacjs` (libFLAC WASM). ~500 KB. | medium | libflacjs |

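The WAV row above (RIFF header + interleaved PCM) could look roughly like the sketch below. It is only an illustration, not the planned `@audio/wav-encode` code: the name `encodeWav` is hypothetical, it assumes `channelData` is a non-ragged `Float32Array[]`, and it covers 16-bit integer output only.

```javascript
// Minimal 16-bit PCM WAV writer (sketch; `encodeWav` is a hypothetical name).
function encodeWav(channelData, sampleRate) {
  const numChannels = channelData.length
  const numFrames = numChannels ? channelData[0].length : 0
  const blockAlign = numChannels * 2 // bytes per frame at 16 bits per sample
  const dataSize = numFrames * blockAlign
  const out = new Uint8Array(44 + dataSize)
  const view = new DataView(out.buffer)
  const tag = (offset, s) => { for (let i = 0; i < 4; i++) view.setUint8(offset + i, s.charCodeAt(i)) }

  // RIFF container header (all multi-byte fields are little-endian)
  tag(0, 'RIFF'); view.setUint32(4, 36 + dataSize, true); tag(8, 'WAVE')
  // fmt chunk
  tag(12, 'fmt '); view.setUint32(16, 16, true)
  view.setUint16(20, 1, true) // format tag 1 = integer PCM
  view.setUint16(22, numChannels, true)
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * blockAlign, true) // byte rate
  view.setUint16(32, blockAlign, true)
  view.setUint16(34, 16, true) // bits per sample
  // data chunk
  tag(36, 'data'); view.setUint32(40, dataSize, true)

  // Interleave: frame 0 of every channel, then frame 1, and so on.
  let pos = 44
  for (let i = 0; i < numFrames; i++) {
    for (let c = 0; c < numChannels; c++) {
      const s = Math.max(-1, Math.min(1, channelData[c][i])) // clamp to [-1, 1]
      view.setInt16(pos, Math.round(s < 0 ? s * 32768 : s * 32767), true)
      pos += 2
    }
  }
  return out
}
```

The 32-bit float variant mentioned in Phase 1 would change the format tag, the bits-per-sample field, and the sample write; it is left out here to keep the sketch short.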
### Tier 2: Important

| Format | Type | Strategy | Size | Dep |
|--------|------|----------|------|-----|
| **AAC/M4A** | Lossy | Hardest format. No clean JS/WASM path. Options: libav.js (ffmpeg AAC, LGPL), or fdk-aac WASM (abandoned, license issues). | large | libav.js or custom fdk-aac WASM |
| **AIFF** | Uncompressed PCM | Pure JS (~80 LOC). IFF/AIFF header + big-endian PCM. | 0 | none |
| **WebM** | Container (Opus/Vorbis) | Opus encoder + WebM/Matroska muxer. | medium | opus encoder + muxer |

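For the AIFF row above, the fiddly part is the sample rate, which the COMM chunk stores as an 80-bit extended float. A rough sketch, assuming 16-bit output and a positive integer sample rate; `encodeAiff` and `writeExtended` are hypothetical names, not part of this commit:

```javascript
// Write a positive integer sample rate as an 80-bit extended float (big-endian).
function writeExtended(view, offset, rate) {
  const exp = rate.toString(2).length - 1 // floor(log2(rate)) for a positive integer
  view.setUint16(offset, 16383 + exp) // sign bit 0, exponent biased by 16383
  view.setBigUint64(offset + 2, BigInt(rate) << BigInt(63 - exp)) // mantissa, leading 1 in bit 63
}

// Minimal 16-bit AIFF writer (sketch; assumes a non-ragged Float32Array[]).
function encodeAiff(channelData, sampleRate) {
  const numChannels = channelData.length
  const numFrames = numChannels ? channelData[0].length : 0
  const dataSize = numFrames * numChannels * 2
  const out = new Uint8Array(54 + dataSize)
  const view = new DataView(out.buffer) // DataView writes big-endian by default
  const tag = (offset, s) => { for (let i = 0; i < 4; i++) view.setUint8(offset + i, s.charCodeAt(i)) }

  tag(0, 'FORM'); view.setUint32(4, 46 + dataSize); tag(8, 'AIFF')
  // COMM chunk: channels, frame count, bit depth, sample rate
  tag(12, 'COMM'); view.setUint32(16, 18)
  view.setUint16(20, numChannels)
  view.setUint32(22, numFrames)
  view.setUint16(26, 16) // bits per sample
  writeExtended(view, 28, sampleRate)
  // SSND chunk: 8 bytes of offset/blockSize, then the samples
  tag(38, 'SSND'); view.setUint32(42, 8 + dataSize)
  view.setUint32(46, 0) // offset
  view.setUint32(50, 0) // block size

  let pos = 54
  for (let i = 0; i < numFrames; i++) {
    for (let c = 0; c < numChannels; c++) {
      const s = Math.max(-1, Math.min(1, channelData[c][i])) // clamp to [-1, 1]
      view.setInt16(pos, Math.round(s < 0 ? s * 32768 : s * 32767))
      pos += 2
    }
  }
  return out
}
```

Apart from byte order and the extended-float rate, the layout mirrors the WAV case: a fixed header followed by interleaved samples.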
### Tier 3: Niche / Legacy

| Format | Type | Strategy | Notes |
|--------|------|----------|-------|
| **QOA** | Lossy (simple) | Pure JS. qoa-format may have an encoder. | Emerging format, very simple codec |
| **ALAC** | Lossless (Apple) | Apple's open-source ALAC → compile to WASM. Or libav.js. | Feasible but no existing WASM build |
| **CAF** | Container (Apple) | Pure JS container writer + codec. | Wraps PCM/AAC/ALAC |
| **AMR** | Lossy (telephony) | opencore-amr WASM. | Very niche |
| **WMA** | Lossy (Microsoft) | ffmpeg only. No open-source encoder lib. | Legacy, low priority |

### Not worth encoding (decode-only)

Formats that exist only for legacy playback and that nobody intentionally encodes to:
- None excluded yet: even WMA has some enterprise use cases.

## API Design

Mirror of audio-decode, in the reverse direction. The same pattern should be applied to audio-decode as well (`decode.mp3.stream()`).

- `channelData` (Float32Array[]) is the payload: passed directly, not wrapped in an object
- `sampleRate` is session config: set once in options, cannot change between chunks
- Format is part of the method name, not an argument

### 1. Whole-file encode

```js
import encode from 'audio-encode'

let wav = await encode.wav(channelData, { sampleRate: 44100 })
let mp3 = await encode.mp3(channelData, { sampleRate: 44100, bitrate: 128 })
let flac = await encode.flac(channelData, { sampleRate: 44100 })
// each → Uint8Array
```

### 2. Streaming encode

```js
import encode from 'audio-encode'

let enc = await encode.mp3.stream({ sampleRate: 44100, bitrate: 128 })
let chunk1 = await enc.encode(channelData) // → Uint8Array
let chunk2 = await enc.encode(channelData2) // → Uint8Array
let final = await enc.encode() // flush + free → Uint8Array
```

StreamEncoder interface:
- `.encode(channelData)` → Promise<Uint8Array> (encoded chunk)
- `.encode()` → Promise<Uint8Array> (flush remaining + finalize + free)
- `.flush()` → Promise<Uint8Array> (flush without freeing)
- `.free()` → void (discard without flushing)

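A caller would typically drain a stream encoder into one buffer. The sketch below shows that loop against the interface listed above. `fakeStream` is a hypothetical stand-in for `await encode.mp3.stream(opts)` (no format is registered in this commit), and `encodeAll` is likewise an illustrative name.

```javascript
// Hypothetical stand-in encoder: "encodes" each chunk to one byte holding its
// frame count and emits a 0xFF trailer when the stream is finalized.
function fakeStream() {
  let done = false
  return {
    async encode(channelData) {
      if (channelData) {
        if (done) throw Error('Encoder already freed')
        return Uint8Array.of(channelData[0].length)
      }
      if (done) return new Uint8Array(0)
      done = true
      return Uint8Array.of(0xff)
    },
    async flush() { return new Uint8Array(0) },
    free() { done = true }
  }
}

// Drain any StreamEncoder-shaped object into a single Uint8Array.
async function encodeAll(enc, chunks) {
  const parts = []
  try {
    for (const chunk of chunks) parts.push(await enc.encode(chunk))
    parts.push(await enc.encode()) // no argument: flush + finalize + free
  } catch (e) {
    enc.free() // discard encoder state without flushing
    throw e
  }
  // Concatenate all encoded parts in order.
  const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0))
  let offset = 0
  for (const p of parts) { out.set(p, offset); offset += p.length }
  return out
}
```

Calling `free()` in the error path is safe only if it is idempotent, which is how the interface above describes it (discard without flushing).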
### Common options

```
sampleRate: output sample rate (required)
bitrate:    target bitrate in kbps (lossy formats)
quality:    quality level 0-10 (VBR, format-specific mapping)
channels:   output channel count (downmix/upmix)
```

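The plan does not say how `channels` should downmix or upmix. One possible reading, purely as an assumption: average all channels when the target is mono, duplicate the source when upmixing from mono, and reject anything else. `remix` is a hypothetical helper name.

```javascript
// Remix to a target channel count (sketch; the averaging/duplication rules
// are an assumption, not something the plan specifies).
function remix(channelData, channels) {
  const src = channelData.length
  if (!src || channels === src) return channelData
  const frames = channelData[0].length
  if (channels === 1) {
    // Downmix to mono: equal-weight average of every source channel.
    const mono = new Float32Array(frames)
    for (const ch of channelData) for (let i = 0; i < frames; i++) mono[i] += ch[i] / src
    return [mono]
  }
  // Upmix from mono: copy the single channel into each output channel.
  if (src === 1) return Array.from({ length: channels }, () => channelData[0].slice())
  throw Error(`No remix rule for ${src} -> ${channels} channels`)
}
```

Surround layouts would need explicit per-channel weights, which is why the sketch throws rather than guessing.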
### Individual @audio packages

Formats are not required to exist as separate packages: the umbrella wraps whatever the underlying encoder library exposes.
Dedicated @audio packages may be created for consistency where it makes sense, but the API contract lives in the umbrella.
This is the same as audio-decode: some decoders are external packages, some are @audio packages, and the umbrella normalizes all of them.

## Implementation Order

### Phase 0: Scaffold ✓
* [x] `audio-encode` package.json, types, entry point skeleton
* [x] `streamEncoder()` helper (mirrors `streamDecoder()` from audio-decode)
* [x] `norm()` for encoding results, `merge()` for flushed chunks
* [x] Test harness: round-trip test pattern (encode → decode → compare)

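The "compare" step of the round-trip pattern needs a tolerance, since even a lossless 16-bit container quantizes float samples. A sketch of one way to express it; `maxError` and `through16bit` are hypothetical helpers, and `through16bit` only simulates an encode → decode pass because no encoder exists yet:

```javascript
// Largest absolute per-sample difference between two channel sets.
function maxError(a, b) {
  if (a.length !== b.length) return Infinity
  let max = 0
  for (let c = 0; c < a.length; c++) {
    if (a[c].length !== b[c].length) return Infinity
    for (let i = 0; i < a[c].length; i++) max = Math.max(max, Math.abs(a[c][i] - b[c][i]))
  }
  return max
}

// Stand-in for encode → decode: quantize to 16-bit steps and back.
function through16bit(channelData) {
  return channelData.map(ch =>
    Float32Array.from(ch, s => Math.round(Math.max(-1, Math.min(1, s)) * 32767) / 32767))
}

const input = [Float32Array.from({ length: 64 }, (_, i) => Math.sin(i / 5))]
const output = through16bit(input)
const withinTolerance = maxError(input, output) <= 1 / 32767 // one 16-bit step
```

Lossy formats would need a looser, format-specific threshold, and FLAC should compare exactly against the quantized input.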
### Phase 1: Pure JS formats (zero deps, prove the API)
* [ ] `@audio/wav-encode`: RIFF/WAV writer. 16-bit int + 32-bit float. Trivial.
* [ ] `@audio/aiff-encode`: IFF/AIFF writer. Big-endian PCM. Trivial.
* [ ] Wire into `audio-encode` umbrella, verify round-trip with audio-decode.

### Phase 2: Lightweight WASM (wasm-media-encoders)
* [ ] `@audio/mp3-encode`: wasm-media-encoders (libmp3lame). 66 KB gz. VBR/CBR, bitrate, quality.
* [ ] `@audio/ogg-encode`: wasm-media-encoders (libvorbis). 158 KB gz. Quality-based VBR.
* [ ] Wire into umbrella. Round-trip tests.

### Phase 3: Medium WASM
* [ ] `@audio/opus-encode`: libopus WASM + Ogg container. Evaluate: opusscript (battle-tested, 3.6M dl/wk for Discord) vs custom libopus 1.5.1 WASM build. Need Ogg muxer on top.
* [ ] `@audio/flac-encode`: libflacjs. Compression levels 0-8. Verify bit-perfect round-trip.
* [ ] Wire into umbrella.

### Phase 4: Hard formats
* [ ] `@audio/aac-encode`: Evaluate: (a) libav.js variant with ffmpeg AAC encoder, (b) compile fdk-aac to WASM ourselves (license: non-free but distributable), (c) browser MediaRecorder fallback.
* [ ] `@audio/webm-encode`: Opus encoding + WebM/Matroska container muxing. Evaluate ebml-muxer or custom muxer.
* [ ] Wire into umbrella.

### Phase 5: Niche formats (as needed)
* [ ] `@audio/qoa-encode`: qoa-format (if encoder exists) or implement from spec (simple).
* [ ] `@audio/alac-encode`: Compile Apple ALAC (Apache 2.0) to WASM, mux into M4A.
* [ ] `@audio/caf-encode`: CAF container writer (PCM payload).
* [ ] `@audio/amr-encode`: opencore-amr WASM.
* [ ] `@audio/wma-encode`: ffmpeg only. Lowest priority.

### Phase 6: Polish
* [ ] README, docs, examples
* [ ] Benchmark: encode speed, output size vs native tools
* [ ] Publish all packages

## Encoder Research Summary

| Format | Best option | Alternative | Pure JS? | WASM size |
|--------|-------------|-------------|----------|-----------|
| WAV | Custom (trivial) | node-wav | Yes | 0 |
| MP3 | wasm-media-encoders | lamejs (pure JS, buggy npm ver) | Partial | 66 KB gz |
| OGG | wasm-media-encoders | none | No | 158 KB gz |
| Opus | opusscript or custom libopus WASM | libav.js variant-opus | No | ~300 KB |
| FLAC | libflacjs | libav.js variant-flac | No | ~500 KB |
| AAC | libav.js (ffmpeg AAC) | fdk-aac WASM (abandoned) | No | ~1.5 MB |
| AIFF | Custom (trivial) | none | Yes | 0 |
| WebM | opus + ebml muxer | libav.js | No | ~300 KB + muxer |
| QOA | qoa-format or custom | none | Yes | 0 |
| ALAC | Apple ALAC → WASM | libav.js | No | ~200 KB est |
| WMA | ffmpeg only | none | No | ~3 MB |

## Notes

- Google recently published an efficient MP3 encoder config for better compression. Track it for future integration into mp3-encode (custom LAME params or an alternative approach).
- lamejs has a known `MPEGMode` bug in the npm-published version; prefer wasm-media-encoders.
- AAC is the hardest format: fdk-aac (best quality) has a non-free license, and ffmpeg's native AAC encoder is lower quality. No clean path.
- For Opus: opusscript has 3.6M downloads/week (Discord bots) but only produces raw frames, so Ogg container muxing is needed on top.
- libav.js (ffmpeg WASM, 492 stars, actively maintained) is the universal fallback for any format.
- @ffmpeg/ffmpeg (17K stars, 294K dl/wk) is the nuclear option: 31 MB WASM, requires SharedArrayBuffer.

audio-encode.d.ts

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
export interface StreamEncoder {
  /** Encode a chunk of audio. */
  encode(channelData: Float32Array[] | Float32Array): Promise<Uint8Array>;
  /** Flush remaining data, finalize, and free resources. */
  encode(): Promise<Uint8Array>;
  /** Flush without freeing. */
  flush(): Promise<Uint8Array>;
  /** Free resources without flushing. */
  free(): void;
}

export interface EncodeOptions {
  /** Output sample rate (required). */
  sampleRate: number;
  /** Output channel count. */
  channels?: number;
  /** Target bitrate in kbps (lossy). */
  bitrate?: number;
  /** Quality 0-10 (VBR, format-specific). */
  quality?: number;
  [key: string]: any;
}

export interface FormatEncoder {
  (channelData: Float32Array[] | Float32Array, opts: EncodeOptions): Promise<Uint8Array>;
  stream(opts: EncodeOptions): Promise<StreamEncoder>;
}

/** Encoder registry. Formats attached as encode.wav, encode.mp3, etc. */
declare const encode: {
  [format: string]: FormatEncoder;
};

export default encode;

/** Wrap codec callbacks into a StreamEncoder with lifecycle management. */
export function streamEncoder(
  onEncode: (channels: Float32Array[]) => Uint8Array | Promise<Uint8Array>,
  onFlush?: (() => Uint8Array | Promise<Uint8Array>) | null,
  onFree?: (() => void) | null
): StreamEncoder;

/** Wrap a stream factory into whole-file encoder + .stream property. */
export function fmt(
  init: (opts: EncodeOptions) => Promise<StreamEncoder>
): FormatEncoder;

/** Normalize input to Float32Array[]. */
export function channels(data: Float32Array[] | Float32Array | null): Float32Array[];

/** Ensure result is Uint8Array. */
export function norm(r: any): Uint8Array;

/** Concatenate two Uint8Arrays. */
export function merge(a: Uint8Array, b: Uint8Array): Uint8Array;

audio-encode.js

Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
/**
 * Audio encoder: whole-file and streaming
 * @module audio-encode
 *
 * let buf = await encode.wav(channelData, { sampleRate: 44100 })
 *
 * let enc = await encode.mp3.stream({ sampleRate: 44100, bitrate: 128 })
 * let chunk = await enc.encode(channelData)
 * let final = await enc.encode() // flush + free
 */

const EMPTY = new Uint8Array(0)

const encode = {}
export default encode

// --- format registration ---

// encode.wav = fmt(async (opts) => streamEncoder(...))
// encode.mp3 = fmt(async (opts) => streamEncoder(...))

/**
 * Wrap a stream factory into whole-file encoder + .stream
 * @param {function} init - async (opts) => StreamEncoder
 */
function fmt(init) {
  let fn = async (data, opts = {}) => {
    if (!opts.sampleRate) throw Error('sampleRate is required')
    let ch = channels(data)
    if (!ch.length || !ch[0].length) return EMPTY
    let enc = await init(opts)
    try {
      let result = await enc.encode(ch)
      let flushed = await enc.encode()
      return merge(result, flushed)
    } catch (e) { enc.free(); throw e }
  }
  fn.stream = init
  return fn
}

// normalize input to Float32Array[]
function channels(data) {
  if (!data) return []
  if (Array.isArray(data)) {
    if (data[0] instanceof Float32Array) return data
    return []
  }
  if (data instanceof Float32Array) return [data]
  return []
}

/**
 * StreamEncoder:
 *   .encode(channelData): encode audio, returns Uint8Array
 *   .encode(): flush + finalize + free
 *   .flush(): flush without freeing
 *   .free(): release without flushing
 */
export function streamEncoder(onEncode, onFlush, onFree) {
  let done = false
  return {
    async encode(data) {
      if (data) {
        if (done) throw Error('Encoder already freed')
        let ch = channels(data)
        try { return norm(await onEncode(ch)) }
        catch (e) { done = true; onFree?.(); throw e }
      }
      // no args = end of stream
      if (done) return EMPTY
      done = true
      try {
        let result = onFlush ? norm(await onFlush()) : EMPTY
        onFree?.()
        return result
      } catch (e) { onFree?.(); throw e }
    },
    async flush() {
      if (done) return EMPTY
      return onFlush ? norm(await onFlush()) : EMPTY
    },
    free() {
      if (done) return
      done = true
      onFree?.()
    }
  }
}

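// ensure Uint8Array
function norm(r) {
  if (!r) return EMPTY
  if (r instanceof Uint8Array) return r.length ? r : EMPTY
  // other typed arrays / DataView: view the same bytes
  if (ArrayBuffer.isView(r)) return r.byteLength ? new Uint8Array(r.buffer, r.byteOffset, r.byteLength) : EMPTY
  // ArrayBuffer has byteLength but no length, so handle it before the length test
  if (r instanceof ArrayBuffer) return r.byteLength ? new Uint8Array(r) : EMPTY
  return r.length ? new Uint8Array(r) : EMPTY
}
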
// concat two Uint8Arrays
function merge(a, b) {
  if (!b?.length) return a || EMPTY
  if (!a?.length) return b || EMPTY
  let out = new Uint8Array(a.length + b.length)
  out.set(a)
  out.set(b, a.length)
  return out
}

export { fmt, channels, norm, merge }
