5 changes: 5 additions & 0 deletions .changeset/iconv-runtime-wrapper.md
@@ -0,0 +1,5 @@
---
"@tailor-platform/sdk": minor
---

Add the `@tailor-platform/sdk/iconv` runtime wrapper for character encoding conversion. It exports typed `convert`, `convertBuffer`, `decode`, `encode`, and `encodings` functions and an `Iconv` class, all of which delegate to the platform's `tailor.iconv` runtime API. Use `setupIconvMock()` from `@tailor-platform/sdk/test` to mock these calls in unit tests.
4 changes: 4 additions & 0 deletions packages/sdk/README.md
@@ -83,6 +83,10 @@ the installed SDK version. Files are copied (not symlinked) so they survive
| [Static Website](./docs/services/staticwebsite.md) | Static file hosting |
| [Secret Manager](./docs/services/secret.md) | Secure credential storage |

### Runtime Utilities

- [Character Encoding Conversion (iconv)](./docs/iconv.md) - Convert between UTF-8, Shift_JIS, EUC-JP, IBM EBCDIC, and other encodings via `@tailor-platform/sdk/iconv`

### Guides

- [Testing Guide](./docs/testing.md) - Unit and E2E testing patterns
174 changes: 174 additions & 0 deletions packages/sdk/docs/iconv.md
@@ -0,0 +1,174 @@
# Character Encoding Conversion (iconv)

`@tailor-platform/sdk/iconv` is a thin typed wrapper around the platform-provided `tailor.iconv` runtime API. It enables conversion between character encodings — useful for handling Shift_JIS / EUC-JP CSV imports, integrating with mainframes that use IBM EBCDIC variants, or normalizing legacy data into UTF-8.

For the full list of supported encodings and platform-side details, see the official [Character Encoding Conversion](https://docs.tailor.tech/reference/concepts/character-encodings.html) reference.

## Overview

The module provides:

- Stateless functions for one-off conversions: `convert`, `convertBuffer`, `decode`, `encode`, `encodings`
- A stateful `Iconv` class for repeated conversions between a fixed encoding pair (compatible with the `node-iconv` API surface)
- A typed test mock helper, `setupIconvMock`, for unit tests

All functions and the `Iconv` class delegate to `globalThis.tailor.iconv` at runtime, which is provided by the Tailor Platform Function runtime. They are intended for use inside resolvers, executors, and workflow jobs.
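
The delegation described above could look roughly like the following guard. This is a minimal sketch, not the SDK's actual source: `IconvRuntime`, `getIconvRuntime`, `encodingsSketch`, and the error message are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape of the subset of `tailor.iconv` used in this sketch.
interface IconvRuntime {
  encodings(): string[];
}

// Look up the runtime API lazily, so importing the module never throws;
// only calling a function outside the platform runtime does.
function getIconvRuntime(): IconvRuntime {
  const runtime = (globalThis as { tailor?: { iconv?: IconvRuntime } }).tailor?.iconv;
  if (!runtime) {
    // Assumed wording; the real error message may differ.
    throw new Error("tailor.iconv is not available outside the Tailor Platform runtime");
  }
  return runtime;
}

// A wrapper function that delegates to the runtime on every call.
function encodingsSketch(): string[] {
  return getIconvRuntime().encodings();
}
```

Resolving the global on each call, rather than once at import time, is what would let a test helper install and remove a mock between test cases.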

## Supported Encodings

Common encodings include:

- **Unicode**: `UTF-8`, `UTF-16`, `UTF-16BE`, `UTF-16LE`
- **Japanese**: `Shift_JIS` (aliases: `SJIS`, `CP932`), `EUC-JP`, `EUC-JP-MS`, `ISO-2022-JP`
- **Enterprise / mainframe**: IBM EBCDIC variants (`IBM037`, `IBM290`, `IBM930`, `IBM939`, `IBM943`), Hitachi KEIS, NEC JIS aliases
- **Chinese**: `GB2312`, `GBK`, `GB18030`, `Big5`, `BIG5HKSCS`
- **Korean**: `EUC-KR`, `UHC`, `JOHAB`, `ISO-2022-KR`
- **Other**: `ISO-8859-1`, `ASCII`

Call `encodings()` at runtime to get the full list supported by the platform.
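
One way to use that list is to validate an encoding name before converting. The sketch below is an illustration under assumptions: `isSupported` and `baseEncodingName` are hypothetical helpers, the case-insensitive comparison is an assumption about how names should be matched, and aliases such as `SJIS` may or may not appear in the list the platform returns. The list is passed in as a parameter so the block runs without the platform runtime; inside a function you would pass the result of `encodings()`.

```typescript
// Strip any "//IGNORE" or "//TRANSLIT..." suffix and uppercase the name.
function baseEncodingName(name: string): string {
  return name.replace(/\/\/.*$/, "").toUpperCase();
}

// Check a requested encoding against a list of supported identifiers.
function isSupported(supported: readonly string[], name: string): boolean {
  const wanted = baseEncodingName(name);
  return supported.some((entry) => entry.toUpperCase() === wanted);
}

const sampleList = ["UTF-8", "Shift_JIS", "EUC-JP"]; // stand-in for encodings()
const canUseSjis = isSupported(sampleList, "shift_jis//IGNORE"); // true
const canUseFoo = isSupported(sampleList, "FOO"); // false
```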

## API

### `convert(data, fromEncoding, toEncoding)`

Convert a string or buffer between encodings. The return type narrows based on `toEncoding`: it is `string` when `toEncoding` is `"UTF-8"` or `"UTF8"`, and `Uint8Array` otherwise.

```typescript
import { convert } from "@tailor-platform/sdk/iconv";

// UTF-8 string → Shift_JIS bytes
const sjisBytes = convert("日本語テキスト", "UTF-8", "Shift_JIS");
// ^? Uint8Array

// EUC-JP bytes → UTF-8 string
const utf8Text = convert(eucjpBuffer, "EUC-JP", "UTF-8");
// ^? string
```
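
The return-type narrowing can be expressed with a conditional type. The following is a sketch of the idea only, not the SDK's actual declaration: `Utf8Name`, `ConvertResult`, and `convertSketch` are hypothetical, and the function body is a stand-in that performs no real transcoding.

```typescript
// Encoding names that produce a string result, per the rule described above.
type Utf8Name = "UTF-8" | "UTF8";

// Resolve to `string` for UTF-8 targets and `Uint8Array` for everything else.
type ConvertResult<To extends string> = To extends Utf8Name ? string : Uint8Array;

function convertSketch<To extends string>(
  data: string | Uint8Array,
  toEncoding: To,
): ConvertResult<To> {
  // Stand-in logic: treat all input as UTF-8 and never transcode.
  const bytes = typeof data === "string" ? new TextEncoder().encode(data) : data;
  const result =
    toEncoding === "UTF-8" || toEncoding === "UTF8" ? new TextDecoder().decode(bytes) : bytes;
  return result as ConvertResult<To>;
}

// The literal argument drives the inferred return type.
const asText: string = convertSketch(new Uint8Array([0x68, 0x69]), "UTF-8");
const asBytes: Uint8Array = convertSketch("hi", "Shift_JIS");
```

When the target encoding is only known as a plain `string` at compile time, a type like this resolves to `string | Uint8Array`, so callers need a runtime check such as `typeof result === "string"`.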

### `convertBuffer(buffer, fromEncoding, toEncoding)`

Like `convert`, but accepts only a `Uint8Array | ArrayBuffer` input. Use this when you want the type system to enforce buffer input.

### `decode(buffer, encoding)`

Decode a buffer into a UTF-8 string by interpreting it with the given source encoding. Equivalent to `convert(buffer, encoding, "UTF-8")`.

```typescript
import { decode } from "@tailor-platform/sdk/iconv";

const text = decode(sjisCsvBuffer, "Shift_JIS"); // string
```

### `encode(str, encoding)`

Encode a UTF-8 string into the given target encoding. Returns `string` when the target is UTF-8, otherwise `Uint8Array`.

```typescript
import { encode } from "@tailor-platform/sdk/iconv";

const sjisBytes = encode("こんにちは", "Shift_JIS"); // Uint8Array
```

### `encodings()`

Return the list of supported encoding identifiers from the runtime.

```typescript
import { encodings } from "@tailor-platform/sdk/iconv";

const list = encodings(); // string[]
```

### `Iconv` class

Stateful converter for repeated conversions between a fixed encoding pair. Useful when you process many records with the same source/target encoding and want to avoid passing the encoding pair on every call.

```typescript
import { Iconv } from "@tailor-platform/sdk/iconv";

const conv = new Iconv("Shift_JIS", "UTF-8");
for (const row of sjisRows) {
  const utf8 = conv.convert(row); // string | Uint8Array
}
```
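
Conceptually, a fixed-pair converter stores the encoding pair once and forwards it on each call. The sketch below illustrates that pattern with an injected conversion function so it runs without the platform runtime. `FixedPairConverter` and `ConvertFn` are hypothetical names, and this is not how the SDK's `Iconv` class is implemented.

```typescript
type ConvertFn = (data: string | Uint8Array, from: string, to: string) => string | Uint8Array;

class FixedPairConverter {
  private readonly from: string;
  private readonly to: string;
  private readonly convertFn: ConvertFn;

  constructor(from: string, to: string, convertFn: ConvertFn) {
    this.from = from;
    this.to = to;
    this.convertFn = convertFn;
  }

  // Forward the stored encoding pair so callers only pass the data.
  convert(data: string | Uint8Array): string | Uint8Array {
    return this.convertFn(data, this.from, this.to);
  }
}

const seenPairs: string[] = [];
const fixed = new FixedPairConverter("Shift_JIS", "UTF-8", (data, from, to) => {
  seenPairs.push(`${from}->${to}`);
  return data; // stand-in: no real transcoding happens here
});
fixed.convert("a");
fixed.convert("b");
```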

## Error Handling Flags

Append flags to `toEncoding` to control behavior on unconvertible characters:

| Flag | Behavior |
| ----------------- | --------------------------------------------------------------------------------- |
| `//IGNORE` | Silently skip characters that cannot be represented in the target encoding |
| `//TRANSLIT` | Replace unconvertible characters with `?` (default substitute) |
| `//TRANSLIT:char` | Replace unconvertible characters with the specified replacement (e.g. `*`, `[?]`) |

```typescript
import { convert } from "@tailor-platform/sdk/iconv";

convert("Hello 世界!", "UTF-8", "ASCII//TRANSLIT:*");
// → ASCII bytes of "Hello **!" (a Uint8Array, because the target is not UTF-8)

convert("Test 日本語", "UTF-8", "ASCII//TRANSLIT:[?]");
// → ASCII bytes of "Test [?][?][?]"
```
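
To make the three behaviors concrete, the sketch below simulates them locally for an ASCII target. It is a model of the table above, not the platform's implementation: `replacementFor` and `toAsciiSketch` are hypothetical helpers, the result is shown as a string for readability, and what happens when no flag is given is deliberately left out because it is not described here.

```typescript
// Map a flag suffix to the text substituted for each unconvertible character.
function replacementFor(flag: string): string {
  if (flag === "//IGNORE") return ""; // drop the character
  if (flag === "//TRANSLIT") return "?"; // default substitute
  if (flag.startsWith("//TRANSLIT:")) return flag.slice("//TRANSLIT:".length);
  throw new Error(`Unsupported flag: ${flag}`);
}

// Keep ASCII characters and substitute everything else.
function toAsciiSketch(text: string, flag: string): string {
  const replacement = replacementFor(flag);
  // Array.from iterates by code point, so a surrogate pair counts as one character.
  return Array.from(text)
    .map((ch) => ((ch.codePointAt(0) ?? 0) < 0x80 ? ch : replacement))
    .join("");
}

const ignored = toAsciiSketch("Hello 世界!", "//IGNORE"); // "Hello !"
const defaulted = toAsciiSketch("Hello 世界!", "//TRANSLIT"); // "Hello ??!"
const starred = toAsciiSketch("Hello 世界!", "//TRANSLIT:*"); // "Hello **!"
const bracketed = toAsciiSketch("Test 日本語", "//TRANSLIT:[?]"); // "Test [?][?][?]"
```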

## Usage in a Resolver

A common pattern is to obtain bytes, decode them, and process the result. Bytes can come from a TailorDB file field via `tailordb.file.download` (or a generated helper from the [`file-utils` plugin](./plugin/index.md)), an external HTTP fetch, or a base64-encoded input, which is what the following resolver accepts.

```typescript
import { createResolver, t } from "@tailor-platform/sdk";
import { decode } from "@tailor-platform/sdk/iconv";

export default createResolver({
  name: "importSjisCsv",
  operation: "mutation",
  input: { csvBase64: t.string() },
  output: { rows: t.int() },
  body: async ({ input }) => {
    const bytes = Uint8Array.from(atob(input.csvBase64), (c) => c.charCodeAt(0));
    const text = decode(bytes, "Shift_JIS");
    const rows = text.split("\n").filter((line) => line.length > 0).length;
    // ...persist parsed rows
    return { rows };
  },
});
```
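
The resolver above splits on `"\n"` only, so for a file with Windows-style `\r\n` line endings each line would keep a trailing `\r` once you start parsing fields. A possible refinement is sketched below, with `base64ToBytes` and `splitCsvLines` as hypothetical helper names. It is still a line splitter rather than a CSV parser, so quoted fields that contain newlines are not handled.

```typescript
// Decode a base64 string into raw bytes (same approach as the resolver above).
function base64ToBytes(base64: string): Uint8Array {
  return Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
}

// Split decoded CSV text into non-empty lines, accepting both "\n" and "\r\n".
function splitCsvLines(text: string): string[] {
  return text.split(/\r?\n/).filter((line) => line.length > 0);
}

const csvLines = splitCsvLines("name,age\r\nAlice,30\r\n");
// csvLines is ["name,age", "Alice,30"], with no trailing "\r" on either line
```

As in the resolver, the header line is included in the result, so subtract one if you need a count of data rows.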

## Testing

Use `setupIconvMock()` from `@tailor-platform/sdk/test` to mock `tailor.iconv` in unit tests. The default implementation passes strings through and uses Node's `TextEncoder`/`TextDecoder` for UTF-8, which is enough for most assertions. For non-UTF-8 round trips, supply your own handler, for example `onConvert` or `onDecode`.

```typescript
import { afterEach, describe, expect, test } from "vitest";
import { setupIconvMock, unauthenticatedTailorUser } from "@tailor-platform/sdk/test";
import resolver from "./resolvers/importSjisCsv";

const TailorGlobal = globalThis as { tailor?: { iconv?: unknown } };

describe("importSjisCsv resolver", () => {
  afterEach(() => {
    delete TailorGlobal.tailor;
  });

  test("decodes Shift_JIS CSV", async () => {
    const { calls } = setupIconvMock({
      onDecode: (_buffer, encoding) => {
        expect(encoding).toBe("Shift_JIS");
        return "name,age\nAlice,30\n";
      },
    });

    const result = await resolver.body({
      input: { csvBase64: btoa("dummy bytes") },
      user: unauthenticatedTailorUser,
      env: {},
    });

    expect(result).toEqual({ rows: 2 });
    expect(calls).toHaveLength(1);
  });
});
```

`setupIconvMock` records every call in the returned `calls` array (`{ method, args }`) so you can assert that the right encoding was requested. Clean up by deleting `TailorGlobal.tailor` in `afterEach`.
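
The call-recording behavior can be pictured with a small stand-alone sketch. This is not the implementation of `setupIconvMock`: `createRecordingMock` and `RecordedCall` are hypothetical names, only `decode` is modeled, and the sketch returns the mock object instead of installing it on `globalThis.tailor`.

```typescript
interface RecordedCall {
  method: string;
  args: unknown[];
}

// Build a mock whose methods log each invocation before delegating to a handler.
function createRecordingMock(onDecode: (buffer: Uint8Array, encoding: string) => string) {
  const calls: RecordedCall[] = [];
  const iconv = {
    decode(buffer: Uint8Array, encoding: string): string {
      calls.push({ method: "decode", args: [buffer, encoding] });
      return onDecode(buffer, encoding);
    },
  };
  return { calls, iconv };
}

const recording = createRecordingMock(() => "name,age\nAlice,30\n");
const decodedText = recording.iconv.decode(new Uint8Array([0x82, 0xa0]), "Shift_JIS");
// recording.calls now holds one entry: { method: "decode", args: [<bytes>, "Shift_JIS"] }
```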
5 changes: 5 additions & 0 deletions packages/sdk/package.json
@@ -44,6 +44,11 @@
"import": "./dist/kysely/index.mjs",
"default": "./dist/kysely/index.mjs"
},
"./iconv": {
"types": "./dist/iconv/index.d.mts",
"import": "./dist/iconv/index.mjs",
"default": "./dist/iconv/index.mjs"
},
"./plugin": {
"types": "./dist/plugin/index.d.mts",
"import": "./dist/plugin/index.mjs",
107 changes: 107 additions & 0 deletions packages/sdk/src/iconv/index.test.ts
@@ -0,0 +1,107 @@
import { afterEach, describe, it, expect, expectTypeOf } from "vitest";
import { setupIconvMock } from "@/utils/test/mock";
import { convert, convertBuffer, decode, encode, encodings, Iconv } from "./index";

const TailorGlobal = globalThis as { tailor?: { iconv?: unknown } };

describe("@tailor-platform/sdk/iconv", () => {
  afterEach(() => {
    delete TailorGlobal.tailor;
  });

  describe("convert", () => {
    it("delegates to tailor.iconv.convert", () => {
      const { calls } = setupIconvMock();
      const result = convert("hello", "UTF-8", "Shift_JIS");
      expect(result).toBeInstanceOf(Uint8Array);
      expect(calls).toEqual([{ method: "convert", args: ["hello", "UTF-8", "Shift_JIS"] }]);
    });

    it("returns string when toEncoding is UTF-8", () => {
      setupIconvMock({
        onConvert: (input, _from, to) => {
          if (to === "UTF-8") return "decoded";
          return new Uint8Array([1, 2, 3]);
        },
      });
      const result = convert(new Uint8Array([0xe3, 0x81, 0x82]), "Shift_JIS", "UTF-8");
      expect(result).toBe("decoded");
    });

    it("type narrows return based on toEncoding literal", () => {
      setupIconvMock();
      // Type-level checks; only reachable when iconv mock is set up at runtime.
      expectTypeOf(convert("a", "UTF-8", "UTF-8")).toEqualTypeOf<string>();
      expectTypeOf(convert("a", "UTF-8", "UTF8")).toEqualTypeOf<string>();
      expectTypeOf(convert("a", "UTF-8", "Shift_JIS")).toEqualTypeOf<Uint8Array>();
    });
  });

  describe("convertBuffer", () => {
    it("delegates to tailor.iconv.convertBuffer", () => {
      const { calls } = setupIconvMock();
      const buf = new Uint8Array([1, 2, 3]);
      convertBuffer(buf, "Shift_JIS", "UTF-8");
      expect(calls).toEqual([{ method: "convertBuffer", args: [buf, "Shift_JIS", "UTF-8"] }]);
    });
  });

  describe("decode", () => {
    it("decodes a buffer to a UTF-8 string", () => {
      const { calls } = setupIconvMock();
      const buf = new TextEncoder().encode("hello");
      const result = decode(buf, "UTF-8");
      expect(result).toBe("hello");
      expect(calls).toEqual([{ method: "decode", args: [buf, "UTF-8"] }]);
    });
  });

  describe("encode", () => {
    it("encodes a string to a buffer", () => {
      const { calls } = setupIconvMock();
      const result = encode("hello", "Shift_JIS");
      expect(result).toBeInstanceOf(Uint8Array);
      expect(calls).toEqual([{ method: "encode", args: ["hello", "Shift_JIS"] }]);
    });

    it("returns string when encoding is UTF-8", () => {
      setupIconvMock();
      expectTypeOf(encode("a", "UTF-8")).toEqualTypeOf<string>();
      expectTypeOf(encode("a", "Shift_JIS")).toEqualTypeOf<Uint8Array>();
    });
  });

  describe("encodings", () => {
    it("returns the platform's supported encoding list", () => {
      const { calls } = setupIconvMock({ onEncodings: () => ["UTF-8", "FOO"] });
      expect(encodings()).toEqual(["UTF-8", "FOO"]);
      expect(calls).toEqual([{ method: "encodings", args: [] }]);
    });
  });

  describe("Iconv class", () => {
    it("constructs and converts via the platform Iconv class", () => {
      const { calls } = setupIconvMock();
      const conv = new Iconv("Shift_JIS", "UTF-8");
      const result = conv.convert(new Uint8Array([0xe3, 0x81, 0x82]));
      expect(typeof result).toBe("string");
      expect(calls).toHaveLength(1);
      expect(calls[0]?.method).toBe("convert");
    });

    it("reuses fixed encoding pair across calls", () => {
      const { calls } = setupIconvMock();
      const conv = new Iconv("UTF-8", "Shift_JIS");
      conv.convert("a");
      conv.convert("b");
      expect(calls).toHaveLength(2);
      expect(calls[0]?.args[1]).toBe("UTF-8");
      expect(calls[0]?.args[2]).toBe("Shift_JIS");
      expect(calls[1]?.args[1]).toBe("UTF-8");
    });
  });

  it("throws a clear runtime error when tailor.iconv is not available", () => {
    expect(() => convert("a", "UTF-8", "UTF-8")).toThrow();
  });
});