Draft
8 changes: 6 additions & 2 deletions cloudinary/src/getGenerateSignature.ts
@@ -46,8 +46,12 @@ const allowedParams = new Set(['folder', 'public_id', 'timestamp'])
 const normalizeFolder = (value: string): string => {
   let start = 0
   let end = value.length
-  while (start < end && value[start] === '/') start++
-  while (end > start && value[end - 1] === '/') end--
+  while (start < end && value[start] === '/') {
+    start++
+  }
+  while (end > start && value[end - 1] === '/') {
+    end--
+  }
   return value.slice(start, end)
 }

1 change: 1 addition & 0 deletions content-translator/CHANGELOG.md
@@ -2,6 +2,7 @@

## Unreleased

- feat: add incremental richText translation ("Translate new & changed content") that translates only new or changed paragraphs, preserves existing translations and manual edits, and reports how many paragraphs need review when a source paragraph changed under a hand-edited translation
- fix: translate rich text block-level elements as one unit using segment markers so inline formatting spans stay aligned and word order can change across languages
- fix: reconstruct OpenAI translations by input index so a merged, dropped, or reordered entry no longer shifts later translations into the wrong fields; missing entries keep their original text
- fix: abort a translation when the resolver returns a different number of texts than were sent, and guard against non-string values reaching `he.decode`
25 changes: 25 additions & 0 deletions content-translator/README.md
@@ -42,6 +42,31 @@ export default buildConfig({
})
```

## Translation modes

The translator modal offers three actions:

- **Translate all fields** — retranslates every field, discarding existing target content.
- **Translate new & changed content** — incremental mode (see below).
- **Translate only empty fields** — fills target fields that have no value yet, leaving the rest untouched.

### Incremental mode

Incremental mode translates only what actually changed and preserves existing translations, which matters most for `richText`. For a lexical field it diffs the source against the existing translation at the **paragraph / block level**:

- a paragraph whose source text is unchanged keeps its current translation (including any manual edits) and is not retranslated;
- a new or edited source paragraph is translated and placed in source order, so inserts and reorders land in the right position;
- a paragraph removed from the source is removed from the translation;
- if a source paragraph changed **and** its translation had been hand-edited, the human's version is left in place and counted: the success toast reports how many paragraphs need review, so machine output never silently overwrites manual work.
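
The rules above can be read as a small decision function. The following is a minimal sketch, not the plugin's actual implementation: `Action`, `PriorTranslation`, `classify` and `plan` are hypothetical names, and how a source paragraph is paired with its earlier translation is deliberately left out.

```typescript
// Hypothetical types for illustration only; not part of the plugin's API.
type Action = 'keep' | 'needs-review' | 'translate'

interface PriorTranslation {
  /** The target text differs from the machine output, i.e. a human edited it. */
  handEdited: boolean
  /** The source text differs from the text this translation was produced from. */
  sourceChanged: boolean
}

/** Decide what to do with one source paragraph; `prior` is undefined for a new paragraph. */
const classify = (prior: PriorTranslation | undefined): Action => {
  if (!prior) {
    return 'translate' // new paragraph
  }
  if (!prior.sourceChanged) {
    return 'keep' // unchanged source: keep the translation, manual edits included
  }
  // Source changed: retranslate unless a human has edited the translation.
  return prior.handEdited ? 'needs-review' : 'translate'
}

/**
 * Walk the source paragraphs in source order. Paragraphs removed from the
 * source never appear in this list, so they drop out of the result.
 */
const plan = (priors: (PriorTranslation | undefined)[]) => {
  const actions = priors.map(classify)
  return {
    actions,
    needsReview: actions.filter((action) => action === 'needs-review').length,
  }
}
```

In this sketch `needsReview` plays the role of the number reported in the success toast.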

**Limitation:** change detection currently applies to lexical `richText` only. Every other field type (`text`, `textarea`, `number`, `array`, `blocks`, non-lexical `richText`) behaves like "translate only empty fields" in incremental mode — an empty target is filled, but a field whose source changed _after_ it was already translated is **not** retranslated. Detecting edits on those would require storing a source hash per field (plain fields have no inline NodeState slot like lexical nodes do).

Paragraph identity is content-addressed: a hash of the source text and a hash of the machine output are stored inline on the translated node using Lexical's [NodeState](https://lexical.dev/docs/concepts/node-state) slot (`$`), under a single namespaced key — `"$": { "translator-plugin": { "srcHash": { "<sourceLocale>": … }, "outHash": … } }`. These pass through Payload saves and admin-editor edits untouched (covered by a regression test). Because identity comes from content rather than position, the diff survives inserts, deletes and reorders.

The source language is whatever the editor selects in the modal (it defaults to your `defaultLocale`), so `srcHash` is keyed **by source locale**: translations of a target from EN and from DE are tracked independently, and a paragraph translated from one source isn't mistaken for content from another. The `outHash` is a single value: it hashes the target's own text, independent of which source produced it.

The first incremental run on a field translated by an older version (no stored hashes) retranslates it once and then stamps the hashes; subsequent runs are incremental. If a future lexical/Payload release ever stopped preserving the `$` slot, the same merge can fall back to a sidecar field keyed by field path — the algorithm is identical, only the read/write of the hash changes.
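
To make the hash bookkeeping concrete, here is a minimal sketch under stated assumptions. Only the stored JSON shape (`$` → `translator-plugin` → `srcHash` / `outHash`) comes from the description above; `hash`, `stamp`, `sourceUnchanged` and `handEdited` are hypothetical helpers, FNV-1a is a stand-in because the plugin's real hash function is not stated here, and keeping the `srcHash` entries of other source locales when re-stamping is an assumption.

```typescript
// All helper names are hypothetical; only the stored shape mirrors the README.
interface TranslatorState {
  outHash: string
  srcHash: Record<string, string> // keyed by source locale
}

interface StampedNode {
  $?: { 'translator-plugin'?: TranslatorState }
  type: string
}

/** FNV-1a over UTF-16 code units; a stand-in for whatever hash the plugin uses. */
const hash = (text: string): string => {
  let h = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i)
    h = Math.imul(h, 0x01000193)
  }
  return (h >>> 0).toString(16).padStart(8, '0')
}

/** Record what a paragraph was translated from and what the machine produced. */
const stamp = (
  node: StampedNode,
  sourceLocale: string,
  sourceText: string,
  machineText: string,
): StampedNode => {
  const previous = node.$?.['translator-plugin']
  return {
    ...node,
    $: {
      ...node.$,
      'translator-plugin': {
        outHash: hash(machineText),
        // Assumption: entries for other source locales are retained.
        srcHash: { ...previous?.srcHash, [sourceLocale]: hash(sourceText) },
      },
    },
  }
}

/** False when no hash is stored for this locale, which forces a retranslation. */
const sourceUnchanged = (node: StampedNode, sourceLocale: string, sourceText: string): boolean =>
  node.$?.['translator-plugin']?.srcHash[sourceLocale] === hash(sourceText)

/** True when the current target text no longer matches the stored machine output. */
const handEdited = (node: StampedNode, currentText: string): boolean => {
  const state = node.$?.['translator-plugin']
  return state !== undefined && state.outHash !== hash(currentText)
}
```

A node with no stored state reports `sourceUnchanged` as `false`, which corresponds to the one-time retranslation of fields produced by an older version.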

## Configuration

### Plugin Options
1 change: 1 addition & 0 deletions content-translator/dev/src/collections/authors.ts
@@ -5,6 +5,7 @@ export const authorsSchema: CollectionConfig = {
admin: {
useAsTitle: 'name',
},
versions: true,
fields: [
{
name: 'name',
1 change: 1 addition & 0 deletions content-translator/dev/src/collections/media.ts
@@ -7,6 +7,7 @@ const dirname = path.dirname(filename)

export const mediaSchema: CollectionConfig = {
slug: 'media',
versions: true,
fields: [],
upload: {
staticDir: path.resolve(dirname, '../media'),
1 change: 1 addition & 0 deletions content-translator/dev/src/collections/pages.ts
@@ -2,6 +2,7 @@ import type { CollectionConfig } from 'payload'

export const pagesSchema: CollectionConfig = {
slug: 'pages',
versions: true,
fields: [
{
name: 'title',
1 change: 1 addition & 0 deletions content-translator/dev/src/collections/posts.ts
@@ -2,6 +2,7 @@ import type { CollectionConfig, CollectionSlug } from 'payload'

export const postsSchema: CollectionConfig = {
slug: 'posts',
versions: true,
fields: [
{
name: 'title',
55 changes: 53 additions & 2 deletions content-translator/dev/src/seed.ts
@@ -8,11 +8,34 @@ interface AuthorSeedData {
}

interface PageSeedData {
content?: string[]
keywords: string[]
slug: string
title: string
}

/** Build a minimal lexical richText value from plain-text paragraphs. */
const lexical = (paragraphs: string[]) => ({
root: {
type: 'root',
children: paragraphs.map((text) => ({
type: 'paragraph',
children: [
{ type: 'text', detail: 0, format: 0, mode: 'normal', style: '', text, version: 1 },
],
direction: 'ltr',
format: '',
indent: 0,
textFormat: 0,
version: 1,
})),
direction: 'ltr',
format: '',
indent: 0,
version: 1,
},
})

interface PostSeedData {
slug: string
title: string
@@ -73,6 +96,18 @@ export const seed = async (payload: Payload) => {

const pages: PageSeedData[] = [
{
// Source content for trying incremental richText translation. To see each
// case of the classification table:
// 1. open this page, switch to the German locale, "Translate all fields"
// 2. switch back to English, add / edit / reorder / delete a paragraph
// 3. switch to German, "Translate new & changed content" — only the
// changed paragraph is translated; the rest (incl. any manual edits
// you made to the German text) are preserved
content: [
'Welcome to our company. We build software that helps teams move faster.',
'Our mission is to make complex workflows feel simple and reliable.',
'Get in touch to learn how we can help your organisation.',
],
slug: 'home',
keywords: ['welcome', 'home page', 'getting started'],
title: 'Welcome to Our Website',
@@ -95,15 +130,31 @@ export const seed = async (payload: Payload) => {
   ]

   for (const pageData of pages) {
-    const { totalDocs: existingPage } = await payload.count({
+    const { docs } = await payload.find({
       collection: 'pages' as CollectionSlug,
+      depth: 0,
+      limit: 1,
       where: { slug: { equals: pageData.slug } },
     })

+    const existingPage = docs[0] as { content?: unknown; id: number | string } | undefined
+
     if (!existingPage) {
       await payload.create({
         collection: 'pages' as CollectionSlug,
-        data: pageData,
+        data: {
+          slug: pageData.slug,
+          title: pageData.title,
+          keywords: pageData.keywords,
+          ...(pageData.content ? { content: lexical(pageData.content) } : {}),
+        } as Record<string, unknown>,
       })
+    } else if (pageData.content && !existingPage.content) {
+      // Backfill demo content onto a page seeded before it had any.
+      await payload.update({
+        collection: 'pages' as CollectionSlug,
+        id: existingPage.id,
+        data: { content: lexical(pageData.content) } as Record<string, unknown>,
+      })
     }
   }
3 changes: 3 additions & 0 deletions content-translator/package.json
@@ -43,14 +43,17 @@
"react-dom": "19.2.7"
},
"devDependencies": {
"@lexical/headless": "^0.41.0",
"@payloadcms/eslint-config": "^3.28.0",
"@payloadcms/richtext-lexical": "^3.84.1",
"@swc/cli": "^0.8.1",
"@swc/core": "^1.15.41",
"@types/he": "^1.2.3",
"@types/react": "^19.2.17",
"@types/react-dom": "^19.2.3",
"copyfiles": "^2.4.1",
"eslint": "^9.39.4",
"lexical": "^0.41.0",
"prettier": "^3.8.4",
"rimraf": "^6.1.3",
"tsx": "^4.22.4",