Commit d9b2fed

doc: /encoding.js
1 parent 9d1be7c commit d9b2fed

1 file changed

Lines changed: 131 additions & 0 deletions

README.md

@@ -173,6 +173,137 @@ Same as `windows1252toString = createSinglebyteDecoder('windows-1252')`.
##### `async toWifString({ version, privateKey, compressed })`
##### `toWifStringSync({ version, privateKey, compressed })`

### `@exodus/bytes/encoding.js`

Implements the [Encoding standard](https://encoding.spec.whatwg.org/):
[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder),
[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder),
and some [hooks](https://encoding.spec.whatwg.org/#specification-hooks) (see below).

```js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding.js'
```

#### `new TextDecoder(label = 'utf-8', { fatal = false, ignoreBOM = false })`

[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder) implementation/polyfill.

#### `new TextEncoder()`

[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder) implementation/polyfill.

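As a minimal sketch of how the two classes fit together, the round trip below uses the platform's global `TextEncoder` and `TextDecoder` (available in Node.js and browsers) rather than this package. The assumption is that the exports of `@exodus/bytes/encoding.js` behave the same way, since both follow the Encoding standard.

```js
// Assumption: the global classes stand in for the ones exported by
// '@exodus/bytes/encoding.js'; both follow the same specification.
const bytes = new TextEncoder().encode('héllo') // Uint8Array of UTF-8 bytes

// Default options: non-fatal, and a leading BOM would be stripped.
const text = new TextDecoder('utf-8').decode(bytes)

// Non-fatal mode replaces malformed input with U+FFFD.
const lenient = new TextDecoder('utf-8').decode(Uint8Array.of(0x41, 0xff))

// Fatal mode throws a TypeError on the same input instead.
let fatalError = null
try {
  new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.of(0x41, 0xff))
} catch (err) {
  fatalError = err
}

console.log(bytes.length, text, JSON.stringify(lenient), fatalError instanceof TypeError)
```

Choosing `fatal: true` is useful when malformed input should be rejected rather than silently replaced.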
#### `normalizeEncoding(label)`

Implements [get an encoding from a string `label`](https://encoding.spec.whatwg.org/#concept-encoding-get).

Converts an encoding [label](https://encoding.spec.whatwg.org/#names-and-labels) to its name,
as an ASCII-lowercased string.

If an encoding with that label does not exist, returns `null`.

This is the same as the [`decoder.encoding` getter](https://encoding.spec.whatwg.org/#dom-textdecoder-encoding),
except that it returns `null` for invalid labels instead of throwing. It is equivalent to
the following code:
```js
function normalizeEncoding(label) {
  try {
    if (!label) return null // does not default to 'utf-8'
    return new TextDecoder(label).encoding
  } catch {
    return null
  }
}
```

All encoding names are also valid labels for the corresponding encodings.

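Because every name is also a valid label, normalization is idempotent: passing a returned name back in yields the same name. The sketch below checks this with a stand-in function, `normalizeLabel` (a hypothetical name), built on the platform's global `TextDecoder` rather than on this package.

```js
// Hypothetical stand-in for normalizeEncoding, built on the global TextDecoder.
function normalizeLabel(label) {
  try {
    if (!label) return null // no implicit 'utf-8' default
    return new TextDecoder(label).encoding
  } catch {
    return null // unknown label
  }
}

const labels = ['UTF8', 'utf-16', 'not-a-real-encoding', '']
for (const label of labels) {
  const name = normalizeLabel(label)
  // A returned name is itself a valid label for the same encoding.
  const stable = name === null || normalizeLabel(name) === name
  console.log(JSON.stringify(label), '->', name, stable)
}
```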
#### `getBOMEncoding(input)`

Implements the [BOM sniff](https://encoding.spec.whatwg.org/#bom-sniff) legacy hook.

Given a `TypedArray` or an `ArrayBuffer` instance `input`, returns one of:
* `'utf-8'`, if `input` starts with a UTF-8 byte order mark.
* `'utf-16le'`, if `input` starts with a UTF-16LE byte order mark.
* `'utf-16be'`, if `input` starts with a UTF-16BE byte order mark.
* `null` otherwise.

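The byte patterns behind those three cases can be sketched as follows. `sniffBOM` is a hypothetical re-implementation written for illustration, not the package's code; it compares the leading bytes against the byte order marks defined by the Encoding standard.

```js
// Hypothetical re-implementation of the BOM sniff steps; not the package's code.
function sniffBOM(input) {
  // Accept an ArrayBuffer or any ArrayBuffer view and look at the raw bytes.
  const u8 = ArrayBuffer.isView(input)
    ? new Uint8Array(input.buffer, input.byteOffset, input.byteLength)
    : new Uint8Array(input)
  if (u8.length >= 3 && u8[0] === 0xef && u8[1] === 0xbb && u8[2] === 0xbf) return 'utf-8'
  if (u8.length >= 2 && u8[0] === 0xff && u8[1] === 0xfe) return 'utf-16le'
  if (u8.length >= 2 && u8[0] === 0xfe && u8[1] === 0xff) return 'utf-16be'
  return null
}

console.log(sniffBOM(Uint8Array.of(0xef, 0xbb, 0xbf, 0x68)))
console.log(sniffBOM(Uint8Array.of(0xff, 0xfe, 0x68, 0x00)))
console.log(sniffBOM(Uint8Array.of(0xfe, 0xff, 0x00, 0x68)))
console.log(sniffBOM(Uint8Array.of(0x68, 0x69)))
```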
#### `legacyHookDecode(input, fallbackEncoding = 'utf-8')`

Implements the [decode](https://encoding.spec.whatwg.org/#decode) legacy hook.

Given a `TypedArray` or an `ArrayBuffer` instance `input` and an optional normalized encoding
name `fallbackEncoding`, sniffs the encoding from the BOM, falling back to `fallbackEncoding`, and then
decodes `input` using that encoding, skipping the BOM if it was present.

Notes:

* The BOM-sniffed encoding takes precedence over the `fallbackEncoding` option, per spec.
  Use with care.
* `fallbackEncoding` must be an ASCII-lowercased encoding name,
  e.g. the result of a `normalizeEncoding(label)` call.
* Always operates in non-fatal [mode](https://encoding.spec.whatwg.org/#textdecoder-error-mode),
  aka replacement. It can convert different byte sequences to equal strings.

This method is similar to the following code, except that it does not support encoding labels and
expects only a lowercased encoding name:

```js
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding ?? 'utf-8').decode(input)
```

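To make the notes concrete, here is a minimal sketch of the same idea using the platform's global `TextDecoder`. Both `sniffBOM` and `decodeWithBOM` are hypothetical helpers written for this illustration, standing in for `getBOMEncoding` and `legacyHookDecode`. The sketch shows the BOM overriding the fallback, and replacement mode mapping different invalid inputs to the same string.

```js
// Hypothetical helpers standing in for getBOMEncoding / legacyHookDecode.
function sniffBOM(u8) {
  if (u8.length >= 3 && u8[0] === 0xef && u8[1] === 0xbb && u8[2] === 0xbf) return 'utf-8'
  if (u8.length >= 2 && u8[0] === 0xff && u8[1] === 0xfe) return 'utf-16le'
  if (u8.length >= 2 && u8[0] === 0xfe && u8[1] === 0xff) return 'utf-16be'
  return null
}

function decodeWithBOM(u8, fallbackEncoding = 'utf-8') {
  // The BOM wins over the fallback; TextDecoder strips a matching BOM by default.
  const encoding = sniffBOM(u8) ?? fallbackEncoding
  return new TextDecoder(encoding).decode(u8)
}

// UTF-16LE BOM followed by "hi": the 'utf-8' fallback is ignored.
const fromBOM = decodeWithBOM(Uint8Array.of(0xff, 0xfe, 0x68, 0x00, 0x69, 0x00), 'utf-8')

// No BOM: the fallback encoding is used.
const fromFallback = decodeWithBOM(Uint8Array.of(0x68, 0x69))

// Replacement mode: two different invalid UTF-8 inputs decode to the same string.
const a = decodeWithBOM(Uint8Array.of(0x80))
const b = decodeWithBOM(Uint8Array.of(0xc0))

console.log(fromBOM, fromFallback, a === b)
```

Since decoding here is lossy for malformed input, the decoded strings should not be used to compare or deduplicate the original byte sequences.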
### `@exodus/bytes/encoding-lite.js`

```js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'
```

The exact same exports as `@exodus/bytes/encoding.js` are also exported as
`@exodus/bytes/encoding-lite.js`, with the difference that the lite version does not load
multi-byte `TextDecoder` encodings by default, to reduce bundle size 10x.

The only affected encodings are: `gbk`, `gb18030`, `big5`, `euc-jp`, `iso-2022-jp`, `shift_jis`
and their [labels](https://encoding.spec.whatwg.org/#names-and-labels) when used with `TextDecoder`.

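One way an application might use this list is to decide up front whether a given encoding needs the full entry point. The sketch below is only an illustration: `LITE_UNSUPPORTED` and `needsFullDecoder` are hypothetical names, and the input is assumed to be an already normalized encoding name (or `null` for an unknown label).

```js
// Names taken from the list above; the helper itself is hypothetical.
const LITE_UNSUPPORTED = new Set(['gbk', 'gb18030', 'big5', 'euc-jp', 'iso-2022-jp', 'shift_jis'])

// Expects a normalized encoding name (for example the result of normalizeEncoding),
// or null for an unknown label.
function needsFullDecoder(name) {
  return name !== null && LITE_UNSUPPORTED.has(name)
}

console.log(needsFullDecoder('big5'), needsFullDecoder('utf-8'), needsFullDecoder(null))
```

Labels such as `sjis` would need to be normalized to their encoding name before this check.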
Legacy single-byte encodings are loaded by default in both cases.

`TextEncoder` and hooks for standards (including `normalizeEncoding`) do not have any behavior
differences in the lite version and support the full range of inputs.

To avoid inconsistencies, the exported classes and methods are exactly the same objects.

```console
> lite = require('@exodus/bytes/encoding-lite.js')
[Module: null prototype] {
  TextDecoder: [class TextDecoder],
  TextEncoder: [class TextEncoder],
  getBOMEncoding: [Function: getBOMEncoding],
  legacyHookDecode: [Function: legacyHookDecode],
  normalizeEncoding: [Function: normalizeEncoding]
}
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
Uncaught:
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support

> full = require('@exodus/bytes/encoding.js')
[Module: null prototype] {
  TextDecoder: [class TextDecoder],
  TextEncoder: [class TextEncoder],
  getBOMEncoding: [Function: getBOMEncoding],
  legacyHookDecode: [Function: legacyHookDecode],
  normalizeEncoding: [Function: normalizeEncoding]
}
> full.TextDecoder === lite.TextDecoder
true
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
```

## License

[MIT](./LICENSE)
