Commit 36ed419

doc: /encoding.js
1 parent 9d1be7c commit 36ed419

1 file changed

Lines changed: 123 additions & 0 deletions

README.md

@@ -173,6 +173,129 @@ Same as `windows1252toString = createSinglebyteDecoder('windows-1252')`.
##### `async toWifString({ version, privateKey, compressed })`
##### `toWifStringSync({ version, privateKey, compressed })`

### `@exodus/bytes/encoding.js`

Implements the [Encoding standard](https://encoding.spec.whatwg.org/):
[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder),
[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder),
and some [hooks](https://encoding.spec.whatwg.org/#specification-hooks) (see below).

```js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding.js'
```

#### `new TextDecoder(label = 'utf-8', { fatal = false, ignoreBOM = false })`

[TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder) implementation/polyfill.

#### `new TextEncoder()`

[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder) implementation/polyfill.

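
As a rough illustration of the `fatal` and `ignoreBOM` options, the sketch below uses the platform's global `TextDecoder` and `TextEncoder`. It assumes the classes exported by this package behave the same way, since both implement the same standard; the import is left as a comment rather than exercised here.

```js
// Assumption: the package's classes can be swapped in for the globals used here, e.g.
// import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'

// TextEncoder always produces UTF-8; 'é' takes two bytes
const encoded = new TextEncoder().encode('héllo')
const roundTrip = new TextDecoder().decode(encoded)

// 0xff can never appear in well-formed UTF-8
const invalid = Uint8Array.of(0x68, 0xff, 0x69)

// Default (fatal: false): invalid bytes are replaced with U+FFFD
const replaced = new TextDecoder('utf-8').decode(invalid)

// fatal: true: invalid bytes throw a TypeError instead
let strictError = null
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid)
} catch (err) {
  strictError = err
}

// Default (ignoreBOM: false): a leading BOM is stripped from the output
const withBOM = Uint8Array.of(0xef, 0xbb, 0xbf, 0x68, 0x69)
const stripped = new TextDecoder().decode(withBOM)

// ignoreBOM: true: the BOM is kept as U+FEFF
const kept = new TextDecoder('utf-8', { ignoreBOM: true }).decode(withBOM)

console.log({ roundTrip, replaced, strictError: strictError?.name, stripped, kept })
```
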
#### `normalizeEncoding(label)`

Implements [get an encoding from a string `label`](https://encoding.spec.whatwg.org/#concept-encoding-get).

Converts an encoding [label](https://encoding.spec.whatwg.org/#names-and-labels) to its name,
as an ASCII-lowercased string.

If an encoding with that label does not exist, returns `null`.

This is the same as the `new TextDecoder(label).encoding`
[getter](https://encoding.spec.whatwg.org/#dom-textdecoder-encoding),
except that it does not throw for invalid labels and instead returns `null`.

All encoding names are also valid labels for their corresponding encodings.

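
The relationship to the `encoding` getter described above can be sketched as follows. `normalizeEncodingSketch` is a hypothetical helper built on the platform's global `TextDecoder`; it is not the library's implementation, and edge cases (for example labels that map to the spec's `replacement` encoding, or labels a given runtime does not support) may behave differently.

```js
// Hypothetical helper: approximates normalizeEncoding(label) with the global TextDecoder.
// Assumption: the runtime's TextDecoder knows the label; unknown labels throw a RangeError.
function normalizeEncodingSketch(label) {
  try {
    return new TextDecoder(label).encoding // ASCII-lowercased encoding name
  } catch {
    return null // invalid label: return null instead of throwing
  }
}

const fromLabel = normalizeEncodingSketch('UTF8') // a label, matched case-insensitively
const fromName = normalizeEncodingSketch('utf-8') // a name is also a valid label
const unknown = normalizeEncodingSketch('not-an-encoding')

console.log({ fromLabel, fromName, unknown })
```
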
#### `getBOMEncoding(input)`

Implements the [BOM sniff](https://encoding.spec.whatwg.org/#bom-sniff) legacy hook.

Given a `TypedArray` or an `ArrayBuffer` instance `input`, returns one of:
* `'utf-8'`, if `input` starts with a UTF-8 byte order mark.
* `'utf-16le'`, if `input` starts with a UTF-16LE byte order mark.
* `'utf-16be'`, if `input` starts with a UTF-16BE byte order mark.
* `null` otherwise.

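
For readers unfamiliar with BOM sniffing, the byte patterns involved are the ones listed in the Encoding standard. The sketch below is a standalone approximation with a hypothetical name (`sniffBOM`); it is not the library's code and only illustrates the return values listed above.

```js
// Hypothetical helper: a standalone approximation of BOM sniffing per the Encoding standard.
function sniffBOM(input) {
  // Accept a TypedArray/DataView (respecting its offset) or a plain ArrayBuffer
  const u8 = ArrayBuffer.isView(input)
    ? new Uint8Array(input.buffer, input.byteOffset, input.byteLength)
    : new Uint8Array(input)

  if (u8.length >= 3 && u8[0] === 0xef && u8[1] === 0xbb && u8[2] === 0xbf) return 'utf-8'
  if (u8.length >= 2 && u8[0] === 0xff && u8[1] === 0xfe) return 'utf-16le'
  if (u8.length >= 2 && u8[0] === 0xfe && u8[1] === 0xff) return 'utf-16be'
  return null
}

const utf8 = sniffBOM(Uint8Array.of(0xef, 0xbb, 0xbf, 0x68))
const utf16le = sniffBOM(Uint8Array.of(0xff, 0xfe, 0x68, 0x00))
const utf16be = sniffBOM(Uint8Array.of(0xfe, 0xff, 0x00, 0x68).buffer) // ArrayBuffer input
const none = sniffBOM(Uint8Array.of(0x68, 0x69))

console.log({ utf8, utf16le, utf16be, none })
```
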
#### `legacyHookDecode(input, fallbackEncoding = 'utf-8')`

Implements the [decode](https://encoding.spec.whatwg.org/#decode) legacy hook.

Given a `TypedArray` or an `ArrayBuffer` instance `input` and an optional `fallbackEncoding`
normalized encoding name, sniffs the encoding from the BOM, falling back to `fallbackEncoding`,
and then decodes the `input` using that encoding, skipping the BOM if it was present.

Notes:

* BOM-sniffed encoding takes precedence over the `fallbackEncoding` option per spec.
  Use with care.
* `fallbackEncoding` must be an ASCII-lowercased encoding name,
  e.g. the result of a `normalizeEncoding(label)` call.
* Always operates in non-fatal [mode](https://encoding.spec.whatwg.org/#textdecoder-error-mode),
  aka replacement. It can convert different byte sequences to equal strings.

This method is similar to the following code, except that it doesn't support encoding labels and
only expects a lowercased encoding name:

```js
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding ?? 'utf-8').decode(input)
```

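
The two notes that tend to surprise callers, BOM precedence and replacement mode, can be demonstrated with a small stand-in built on platform globals. `legacyDecodeSketch` and `bomSketch` are hypothetical names; this is an approximation of the one-liner above, not the library's implementation.

```js
// Hypothetical stand-ins using platform globals; not the library's code.
const bomSketch = (u8) =>
  u8[0] === 0xef && u8[1] === 0xbb && u8[2] === 0xbf ? 'utf-8'
  : u8[0] === 0xff && u8[1] === 0xfe ? 'utf-16le'
  : u8[0] === 0xfe && u8[1] === 0xff ? 'utf-16be'
  : null

function legacyDecodeSketch(input, fallbackEncoding = 'utf-8') {
  const u8 = ArrayBuffer.isView(input)
    ? new Uint8Array(input.buffer, input.byteOffset, input.byteLength)
    : new Uint8Array(input)
  // A sniffed BOM wins over the fallback; non-fatal TextDecoder strips the BOM by default
  return new TextDecoder(bomSketch(u8) ?? fallbackEncoding).decode(u8)
}

// BOM precedence: the UTF-16LE BOM overrides the 'utf-8' fallback
const fromBOM = legacyDecodeSketch(Uint8Array.of(0xff, 0xfe, 0x68, 0x00, 0x69, 0x00), 'utf-8')

// No BOM: the fallback encoding is used
const fromFallback = legacyDecodeSketch(Uint8Array.of(0x68, 0x00, 0x69, 0x00), 'utf-16le')

// Replacement mode: two different invalid inputs decode to the same string
const first = legacyDecodeSketch(Uint8Array.of(0x80))
const second = legacyDecodeSketch(Uint8Array.of(0xc0))

console.log({ fromBOM, fromFallback, collide: first === second })
```

Because distinct inputs can collapse to the same output, the decoded string is not a safe key for comparing or deduplicating the original bytes.
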
### `@exodus/bytes/encoding-lite.js`

```js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'

// Hooks for standards
import { getBOMEncoding, legacyHookDecode, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'
```

`@exodus/bytes/encoding-lite.js` exports exactly the same names as `@exodus/bytes/encoding.js`.
The difference is that the lite version does not load multi-byte `TextDecoder` encodings
by default, which reduces bundle size 10x.

The only affected encodings are: `gbk`, `gb18030`, `big5`, `euc-jp`, `iso-2022-jp`, `shift_jis`
and their [labels](https://encoding.spec.whatwg.org/#names-and-labels) when used with `TextDecoder`.

Legacy single-byte encodings are loaded by default in both cases.

`TextEncoder` and hooks for standards (including `normalizeEncoding`) do not have any behavior
differences in the lite version and support the full range of inputs.

To avoid inconsistencies, the exported classes and methods are exactly the same objects.
As the session below shows, once `/encoding.js` has been loaded, multi-byte decoding also works
through the lite import.

```console
> lite = require('@exodus/bytes/encoding-lite.js')
[Module: null prototype] {
  TextDecoder: [class TextDecoder],
  TextEncoder: [class TextEncoder],
  getBOMEncoding: [Function: getBOMEncoding],
  legacyHookDecode: [Function: legacyHookDecode],
  normalizeEncoding: [Function: normalizeEncoding]
}
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
Uncaught:
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support

> full = require('@exodus/bytes/encoding.js')
[Module: null prototype] {
  TextDecoder: [class TextDecoder],
  TextEncoder: [class TextEncoder],
  getBOMEncoding: [Function: getBOMEncoding],
  legacyHookDecode: [Function: legacyHookDecode],
  normalizeEncoding: [Function: normalizeEncoding]
}
> full.TextDecoder === lite.TextDecoder
true
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
```

## License

[MIT](./LICENSE)
