
Commit 38d354e

doc: /encoding.js
1 parent 9d1be7c commit 38d354e

1 file changed

Lines changed: 121 additions & 0 deletions


README.md

@@ -173,6 +173,127 @@ Same as `windows1252toString = createSinglebyteDecoder('windows-1252')`.
173173
##### `async toWifString({ version, privateKey, compressed })`
174174
##### `toWifStringSync({ version, privateKey, compressed })`
175175

176+
### `@exodus/bytes/encoding.js`
177+
178+
Implements the [Encoding standard](https://encoding.spec.whatwg.org/):
179+
[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder),
180+
[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder),
181+
and some [hooks](https://encoding.spec.whatwg.org/#specification-hooks) (see below).
182+
183+
```js
184+
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'
185+
186+
// Hooks for standards
187+
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding.js'
188+
```
189+
190+
#### `new TextDecoder(label = 'utf-8', { fatal = false, ignoreBOM = false })`
191+
192+
[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder) implementation/polyfill.
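
A minimal sketch of the two constructor options. It runs against the platform's global `TextDecoder`, on the assumption that the class exported by this module behaves the same way for these inputs, since both follow the same standard interface:

```js
// Assumption: the global TextDecoder stands in for the one exported by this module.
const withBOM = Uint8Array.of(0xef, 0xbb, 0xbf, 0x68, 0x69) // UTF-8 BOM + "hi"

// Default (ignoreBOM = false): the BOM is stripped from the output.
const stripped = new TextDecoder().decode(withBOM)

// ignoreBOM: true keeps the BOM as U+FEFF in the output.
const kept = new TextDecoder('utf-8', { ignoreBOM: true }).decode(withBOM)

// fatal: true throws a TypeError on malformed input instead of emitting U+FFFD.
let fatalError = null
try {
  new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.of(0xff))
} catch (err) {
  fatalError = err
}

console.log(stripped.length, kept.length, fatalError instanceof TypeError)
```
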
193+
194+
#### `new TextEncoder()`
195+
196+
[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder) implementation/polyfill.
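
A short sketch of the standard `TextEncoder` interface, again using the platform globals as a stand-in. That this module's `TextEncoder` also provides `encodeInto` is an assumption based on it implementing the standard interface:

```js
// Assumption: the global TextEncoder/TextDecoder stand in for this module's exports.
const encoder = new TextEncoder()

// encode() always produces UTF-8 bytes in a new Uint8Array.
const bytes = encoder.encode('h\u00e9llo') // "é" (U+00E9) takes two bytes in UTF-8

// encodeInto() writes into an existing buffer and reports how much fit:
// `read` counts UTF-16 code units consumed, `written` counts bytes produced.
const target = new Uint8Array(4)
const { read, written } = encoder.encodeInto('h\u00e9llo', target)

const roundTrip = new TextDecoder().decode(bytes)
console.log(bytes.length, read, written, roundTrip)
```
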
197+
198+
#### `normalizeEncoding(label)`
199+
200+
Converts an encoding [label](https://encoding.spec.whatwg.org/#names-and-labels) to its name,
201+
as an ASCII-lowercased string.
202+
203+
If an encoding with that label does not exist, returns `null`.
204+
205+
This is the same as `new TextDecoder(label).encoding`
206+
[getter](https://encoding.spec.whatwg.org/#dom-textdecoder-encoding),
207+
except that it does not throw for invalid labels and instead returns `null`.
208+
209+
All encoding names are also valid labels for corresponding encodings.
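
The relationship to the `encoding` getter can be sketched as follows. `normalizeEncodingSketch` is a hypothetical stand-in built on the platform `TextDecoder`, not the library's implementation, and the set of labels the platform accepts may differ from the set this module accepts:

```js
// Hypothetical stand-in: resolve a label through the platform TextDecoder and
// map the RangeError thrown for an unknown label to null.
function normalizeEncodingSketch(label) {
  try {
    return new TextDecoder(label).encoding
  } catch {
    return null
  }
}

console.log(normalizeEncodingSketch('UTF8')) // utf-8
console.log(normalizeEncodingSketch('utf-8')) // utf-8 (a name is also a valid label)
console.log(normalizeEncodingSketch('no-such-encoding')) // null
```
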
210+
211+
#### `getBOMEncoding(input)`
212+
213+
Implements the [BOM sniff](https://encoding.spec.whatwg.org/#bom-sniff) legacy hook.
214+
215+
Given a `TypedArray` or an `ArrayBuffer` instance `input`, returns one of:
216+
* `'utf-8'`, if `input` starts with a UTF-8 byte order mark.
217+
* `'utf-16le'`, if `input` starts with a UTF-16LE byte order mark.
218+
* `'utf-16be'`, if `input` starts with a UTF-16BE byte order mark.
219+
* `null` otherwise.
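
The checks listed above can be sketched as a byte-prefix comparison. `sniffBOM` is a hypothetical re-implementation for illustration only; the module's own code may differ:

```js
// Hypothetical re-implementation of the BOM sniff described above.
function sniffBOM(input) {
  // Accept both ArrayBuffer and TypedArray inputs by viewing them as bytes.
  const bytes = input instanceof ArrayBuffer
    ? new Uint8Array(input)
    : new Uint8Array(input.buffer, input.byteOffset, input.byteLength)

  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8'
  // UTF-16LE BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le'
  // UTF-16BE BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be'
  return null
}

console.log(sniffBOM(Uint8Array.of(0xef, 0xbb, 0xbf, 0x61))) // utf-8
console.log(sniffBOM(Uint8Array.of(0x61))) // null
```
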
220+
221+
#### `legacyHookDecode(input, fallbackEncoding = 'utf-8')`
222+
223+
Implements the [decode](https://encoding.spec.whatwg.org/#decode) legacy hook.
224+
225+
Given a `TypedArray` or an `ArrayBuffer` instance `input` and an optional `fallbackEncoding`
226+
(a normalized encoding name), sniffs the encoding from the BOM, falling back to `fallbackEncoding`,
227+
and then decodes `input` using that encoding, skipping the BOM if it was present.
228+
229+
Notes:
230+
231+
* BOM-sniffed encoding takes precedence over `fallbackEncoding` option per spec.
232+
Use with care.
233+
* `fallbackEncoding` must be an ASCII-lowercased encoding name,
234+
e.g. the result of a `normalizeEncoding(label)` call.
235+
* Always operates in non-fatal [mode](https://encoding.spec.whatwg.org/#textdecoder-error-mode),
236+
also known as replacement mode, so different byte sequences can decode to equal strings.
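
The last note can be illustrated with a small sketch. It uses the platform's global `TextDecoder` in its default non-fatal mode as a stand-in, assuming the hook replaces malformed bytes the same way:

```js
// Assumption: the global TextDecoder in non-fatal mode mirrors the hook's
// replacement behavior for malformed input.
const decoder = new TextDecoder('utf-8')

// Neither 0xff nor 0xfe can appear in UTF-8, so each decodes to U+FFFD.
const a = decoder.decode(Uint8Array.of(0x61, 0xff))
const b = decoder.decode(Uint8Array.of(0x61, 0xfe))

console.log(a === b) // true: distinct inputs, identical output
console.log(a === 'a\uFFFD') // true
```

Because of this, the decoded string should not be treated as a unique identifier for the original bytes.
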
237+
238+
This method is similar to the following code, except that it doesn't support encoding labels and
239+
only expects a lowercased encoding name:
240+
241+
```js
242+
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding ?? 'utf-8').decode(input)
243+
```
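
To make the BOM precedence concrete, here is a hedged sketch of that one-liner wrapped in a function. Both `sniff` and `decodeSketch` are hypothetical names, and the platform `TextDecoder` stands in for the module's class:

```js
// Local stand-in for getBOMEncoding, not the module's implementation.
function sniff(bytes) {
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8'
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le'
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be'
  return null
}

// Hypothetical helper mirroring the one-liner above for Uint8Array input.
function decodeSketch(bytes, fallbackEncoding = 'utf-8') {
  return new TextDecoder(sniff(bytes) ?? fallbackEncoding).decode(bytes)
}

// UTF-16LE BOM followed by "hi": the BOM wins over the utf-8 fallback and is skipped.
const utf16 = Uint8Array.of(0xff, 0xfe, 0x68, 0x00, 0x69, 0x00)
console.log(decodeSketch(utf16, 'utf-8')) // hi

// No BOM: the fallback encoding is used.
console.log(decodeSketch(Uint8Array.of(0x68, 0x69), 'utf-8')) // hi
```
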
244+
245+
### `@exodus/bytes/encoding-lite.js`
246+
247+
```js
248+
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
249+
250+
// Hooks for standards
251+
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'
252+
```
253+
254+
The exact same exports as `@exodus/bytes/encoding.js` are also exported as
255+
`@exodus/bytes/encoding-lite.js`, with the difference that the lite version does not load
256+
multi-byte `TextDecoder` encodings by default, which reduces bundle size 10x.
257+
258+
The only affected encodings are: `gbk`, `gb18030`, `big5`, `euc-jp`, `iso-2022-jp`, `shift_jis`
259+
and their [labels](https://encoding.spec.whatwg.org/#names-and-labels) when used with `TextDecoder`.
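
One way to decide up front whether the full entry point is needed is to check a normalized name against that list. This is only a sketch: `LITE_DISABLED` and `needsFullEncodings` are hypothetical names, and the helper expects a name that has already been normalized (for example, a non-null result of `normalizeEncoding(label)`):

```js
// Names copied from the list above; the helper itself is hypothetical.
const LITE_DISABLED = new Set(['gbk', 'gb18030', 'big5', 'euc-jp', 'iso-2022-jp', 'shift_jis'])

// Expects an ASCII-lowercased encoding name, not an arbitrary label.
function needsFullEncodings(name) {
  return LITE_DISABLED.has(name)
}

console.log(needsFullEncodings('big5')) // true
console.log(needsFullEncodings('windows-1252')) // false
```
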
260+
261+
Legacy single-byte encodings are loaded by default in both cases.
262+
263+
`TextEncoder` and hooks for standards (including `normalizeEncoding`) do not have any behavior
264+
differences in the lite version and support the full range of inputs.
265+
266+
To avoid inconsistencies, the exported classes and methods are exactly the same objects.
267+
268+
```console
269+
> lite = require('@exodus/bytes/encoding-lite.js')
270+
[Module: null prototype] {
271+
TextDecoder: [class TextDecoder],
272+
TextEncoder: [class TextEncoder],
273+
getBOMEncoding: [Function: getBOMEncoding],
274+
legacyHookDecode: [Function: legacyHookDecode],
275+
normalizeEncoding: [Function: normalizeEncoding]
276+
}
277+
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
278+
Uncaught:
279+
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support
280+
281+
> full = require('@exodus/bytes/encoding.js')
282+
[Module: null prototype] {
283+
TextDecoder: [class TextDecoder],
284+
TextEncoder: [class TextEncoder],
285+
getBOMEncoding: [Function: getBOMEncoding],
286+
legacyHookDecode: [Function: legacyHookDecode],
287+
normalizeEncoding: [Function: normalizeEncoding]
288+
}
289+
> full.TextDecoder === lite.TextDecoder
290+
true
291+
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
292+
'%'
293+
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
294+
'%'
295+
```
296+
176297
## License
177298
178299
[MIT](./LICENSE)
