Fix OG image issues: CJK line breaking and generic title enrichment#2879

Merged
felixkrrr merged 5 commits into main from cursor/fix-og-issues-0b52 on Apr 28, 2026

@felixkrrr felixkrrr commented Apr 27, 2026

Fixes LFE-9532 — two OG image issues.

Issue 1: CJK line break broken

Before: Japanese titles like "Langfuse Cloud 日本リージョンを開始しました" were crammed onto a single line that overflowed, because the width estimator treated CJK characters (~1em wide) identically to Latin characters (~0.48em).

After: The title correctly wraps into two lines.

Changes in app/api/og/route.tsx:

  • Added isCjkOrFullWidth() to detect CJK/full-width Unicode codepoints (Hiragana, Katakana, CJK ideographs, Hangul, etc.)
  • Added effectiveCharCount(), which weights each CJK character as 1.0 / ANALOG_CHAR_EM Latin-character units (about 2.08 with ANALOG_CHAR_EM = 0.48) instead of 1, making approxLineWidthPx() accurate for mixed-script text
  • Added tokenize() to split CJK text at individual character boundaries (since CJK text often has no whitespace between words)
  • Added joinTokens() to reconstruct display text with proper spacing between token types
  • Updated splitTwoLinesByWidth, splitTwoLines, greedyWordsToTitleRows, and splitTitleIntoBalancedLines to use CJK-aware tokenization
  • CJK titles are no longer classified as "short" (which forced single-line layout)
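
The width weighting described above can be sketched roughly as follows. This is a minimal illustration, not the code from app/api/og/route.tsx: the function names and the 0.48 constant come from this PR, but the Unicode ranges, the function bodies, and the `fontSizePx` parameter are assumptions.

```typescript
// Hypothetical sketch; only the names and ANALOG_CHAR_EM = 0.48 come from the PR.
const ANALOG_CHAR_EM = 0.48; // approximate Latin glyph width in em

// Illustrative, non-exhaustive ranges for full-width scripts (assumption).
function isCjkOrFullWidth(codePoint: number): boolean {
  return (
    (codePoint >= 0x3040 && codePoint <= 0x30ff) || // Hiragana and Katakana
    (codePoint >= 0x3400 && codePoint <= 0x4dbf) || // CJK Extension A
    (codePoint >= 0x4e00 && codePoint <= 0x9fff) || // CJK Unified Ideographs
    (codePoint >= 0xac00 && codePoint <= 0xd7af) || // Hangul syllables
    (codePoint >= 0xff01 && codePoint <= 0xff60) // Full-width forms
  );
}

// Counts text in "Latin character" units: a CJK glyph (~1em) counts as 1 / charEm.
function effectiveCharCount(text: string, charEm: number): number {
  let count = 0;
  for (const ch of Array.from(text)) {
    const cp = ch.codePointAt(0) ?? 0;
    count += isCjkOrFullWidth(cp) ? 1 / charEm : 1;
  }
  return count;
}

// Estimated rendered width in pixels at a given font size (assumed signature).
function approxLineWidthPx(text: string, fontSizePx: number): number {
  return effectiveCharCount(text, ANALOG_CHAR_EM) * ANALOG_CHAR_EM * fontSizePx;
}
```

Under these assumptions, a Latin character contributes about 0.48 × font size and a CJK character about 1.0 × font size, so a mixed-script title is no longer underestimated by a raw `.length` count.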

Japanese OG image - properly line-wrapped

Issue 2: Generic OG titles not descriptive enough

Before: Pages like /docs/prompt-management/get-started showed just "Get Started" in the OG image, and /docs showed just "Overview" — not meaningful in social shares.

After: Generic titles are automatically enriched with parent folder context:

  • "Get Started" → "Get Started with Prompt Management"
  • "Overview" (at /docs) → "Langfuse Overview"
  • "Overview" (at /docs/metrics/overview) → "Metrics Overview"
  • "Troubleshooting and FAQ" → "Prompt Management Troubleshooting and FAQ"

Pages with explicit seoTitle frontmatter are not affected.

Changes in lib/mdx-page.ts:

  • Added GENERIC_TITLES set of titles that need enrichment
  • Added slugSegmentToTitle() to convert slug segments to title case (with known abbreviation overrides for API, SDK, FAQ, LLM, MCP, UI)
  • Added enrichOgTitle() that prepends parent context for generic titles
  • buildSectionMetadata() now uses enriched titles for OG images only (the page <title> tag is unchanged)
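
A rough sketch of the enrichment rule, based on the examples in this description and the branches in the review flowchart. The signature of `enrichOgTitle`, the `sectionTitle` argument, the contents of `GENERIC_TITLES`, and the special "with" phrasing for "Get Started" are assumptions for illustration and may differ from lib/mdx-page.ts.

```typescript
// Hypothetical sketch; set contents and signatures are assumptions.
const GENERIC_TITLES = new Set(["Overview", "Get Started", "Troubleshooting and FAQ"]);

// Abbreviations listed in the PR description keep their upper-case form.
const ABBREVIATIONS: Record<string, string> = {
  api: "API", sdk: "SDK", faq: "FAQ", llm: "LLM", mcp: "MCP", ui: "UI",
};

// "prompt-management" -> "Prompt Management", "api-reference" -> "API Reference"
function slugSegmentToTitle(segment: string): string {
  return segment
    .split("-")
    .filter(Boolean)
    .map((w) => ABBREVIATIONS[w.toLowerCase()] ?? w.charAt(0).toUpperCase() + w.slice(1))
    .join(" ");
}

// slug is assumed to be the path below the section root, e.g. ["metrics", "overview"].
function enrichOgTitle(title: string, slug: string[], sectionTitle: string): string {
  if (!GENERIC_TITLES.has(title)) return title; // specific titles pass through
  let parent: string;
  if (slug.length >= 2) parent = slugSegmentToTitle(slug[slug.length - 2]);
  else if (slug.length === 1) parent = sectionTitle;
  else parent = "Langfuse";
  // Assumed phrasing rule: "Get Started" reads better with a trailing context.
  return title === "Get Started" ? `${title} with ${parent}` : `${parent} ${title}`;
}
```

With these assumptions, `enrichOgTitle("Overview", ["metrics", "overview"], "Docs")` gives "Metrics Overview", and a non-generic title is returned unchanged.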

Get Started OG image - enriched title

Langfuse Overview OG image

Linear Issue: LFE-9532

Disclaimer: Experimental PR review

Greptile Summary

This PR fixes two OG image issues: CJK text (Japanese/Korean/Chinese) now wraps correctly via a new isCjkOrFullWidth detector, effectiveCharCount width estimator, and tokenize/joinTokens helpers; and generic page titles like "Overview" or "Get Started" are automatically enriched with parent-slug context via enrichOgTitle in lib/mdx-page.ts, leaving the page <title> tag unchanged.

Confidence Score: 5/5

Safe to merge — no logic bugs found; only minor style inconsistencies flagged.

All P2 findings. The raw-length check in splitTwoLines is harmless because the pixel-budget check in tryTwoLineLayout catches any oversized candidates, and the primary splitTwoLinesByWidth path handles CJK correctly. The enrichOgTitle changes are isolated to OG metadata only.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/api/og/route.tsx | Adds CJK-aware tokenization, width estimation, and line-splitting helpers; correctly excludes CJK titles from the short single-line fast path. Minor inconsistency in splitTwoLines (raw .length check, not effectiveCharCount) and redundant hasCjk call in greedyWordsToTitleRows; both are functionally harmless but worth cleaning up. |
| lib/mdx-page.ts | Adds enrichOgTitle to prepend parent-slug context to generic page titles (Overview, Get Started, etc.) in OG images only; page title tag is unchanged. Logic is straightforward and the seoTitle guard correctly bypasses enrichment when an explicit override exists. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[fitTitleLayout] -->|isLongTitle?| B[fitTitleLayoutLong]
    A -->|isShortTitle?| C[fitTitleLayoutSingleLine]
    A --> D[splitTwoLinesByWidth\nCJK-aware pixel budget]
    A --> E[splitTwoLines\nchar budget fallback]
    A --> F[wrapWords\nchar-cap fallback]
    D --> G{tryTwoLineLayout\ntitleLinesFitRenderConstraints}
    E --> G
    F --> G
    subgraph CJK helpers
        H[hasCjk]
        I[tokenize\nper-char for CJK]
        J[joinTokens\nno space CJK-CJK]
        K[effectiveCharCount\n1/em units for CJK]
        L[approxLineWidthPx\nuses effectiveCharCount]
    end
    D --> I
    E --> I
    I --> J
    L --> K
    subgraph OG title enrichment
        M[buildSectionMetadata] --> N{pageData.seoTitle?}
        N -->|yes| O[use seoTitle as-is]
        N -->|no| P[enrichOgTitle]
        P --> Q{GENERIC_TITLES match?}
        Q -->|no| R[return title unchanged]
        Q -->|yes| S{slug.length}
        S -->|>=2| T[slugSegmentToTitle slug at -2]
        S -->|0| U[Langfuse prefix]
        S -->|1| V[sectionTitle prefix]
    end
Prompt To Fix All With AI
This is a comment left during a code review.
Path: app/api/og/route.tsx
Line: 158-178

Comment:
**`splitTwoLines` budget check ignores CJK width for mixed-script text**

The PR updated `splitTwoLines` to tokenize CJK text correctly, but the `l1.length <= maxCharsPerLine` guard still uses raw character count rather than `effectiveCharCount`. Since `maxCharsPerLine` is derived from pixel budget ÷ `ANALOG_CHAR_EM` (0.48), a CJK line of 18 raw characters would pass the guard while consuming roughly 2× the intended pixel budget (each CJK character is `1/0.48 ≈ 2.08×` the nominal char width). The downstream `tryTwoLineLayout` → `titleLinesFitRenderConstraints` check will correctly reject these candidates, so correctness is preserved, but every CJK-tokenized split this function produces will be a false positive when font sizes are explored.

```suggestion
    if (
      effectiveCharCount(l1, ANALOG_CHAR_EM) <= maxCharsPerLine &&
      effectiveCharCount(l2, ANALOG_CHAR_EM) <= maxCharsPerLine
    ) {
      const imbalance = Math.abs(
        effectiveCharCount(l1, ANALOG_CHAR_EM) -
        effectiveCharCount(l2, ANALOG_CHAR_EM)
      );
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: app/api/og/route.tsx
Line: 233-239

Comment:
**`hasCjk(title)` called twice**

`hasCjk(title)` is invoked on lines 233 and 239 — the result could be cached once into a local variable to avoid iterating over the string twice.

```suggestion
  const cjk = hasCjk(title);
  const tokens = cjk
    ? tokenize(title)
    : title.trim().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) {
    return [""];
  }
  const join = cjk ? joinTokens : (t: string[]) => t.join(" ");
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "Fix isLongTitle CJK threshold: use 105 n..."

Issue 1: CJK characters (Japanese, Chinese, Korean) are full-width (~1em)
but the width estimator treated them identically to Latin characters (~0.48em).
This caused titles like 'Langfuse Cloud 日本リージョンを開始しました' to be
crammed onto one line instead of wrapping properly.

- Add isCjkOrFullWidth() and effectiveCharCount() for accurate width estimation
- Add tokenize() to split CJK text at character boundaries (not just whitespace)
- Add joinTokens() to reconstruct display text with proper spacing
- Update all layout functions (splitTwoLinesByWidth, splitTwoLines,
  greedyWordsToTitleRows, splitTitleIntoBalancedLines) to use CJK-aware splitting
- Prevent CJK titles from being classified as 'short' (single-line)
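
A minimal sketch of the tokenization idea in the list above, assuming that each CJK character becomes its own token, that whitespace separates Latin words, and that no space is reinserted between two adjacent CJK tokens. The regex ranges and the function bodies are assumptions, not the code in route.tsx.

```typescript
// Illustrative ranges: Hiragana, Katakana, CJK ideographs, Hangul, full-width forms.
const CJK_CHAR = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af\uff01-\uff60]/;

// Splits on whitespace and additionally at every CJK character boundary.
function tokenize(text: string): string[] {
  const tokens: string[] = [];
  let word = "";
  const flush = () => {
    if (word) {
      tokens.push(word);
      word = "";
    }
  };
  for (const ch of Array.from(text)) {
    if (/\s/.test(ch)) flush();
    else if (CJK_CHAR.test(ch)) {
      flush();
      tokens.push(ch); // each CJK character is a possible break point
    } else word += ch;
  }
  flush();
  return tokens;
}

// Rebuilds display text: a space between tokens unless both neighbours are CJK.
function joinTokens(tokens: string[]): string {
  let out = "";
  tokens.forEach((tok, i) => {
    if (i > 0) {
      const bothCjk = CJK_CHAR.test(tokens[i - 1]) && CJK_CHAR.test(tok);
      if (!bothCjk) out += " ";
    }
    out += tok;
  });
  return out;
}
```

In this sketch a Latin word written directly against a CJK character (no space in the input) would gain a space on rejoin; whether the real helper preserves the original spacing in that case is not stated in the PR.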

Issue 2: Pages with generic frontmatter titles (Overview, Get Started, etc.)
now get enriched OG titles using parent folder context from the URL slug.

- 'Get Started' at /docs/prompt-management/get-started becomes
  'Get Started with Prompt Management'
- 'Overview' at /docs becomes 'Langfuse Overview'
- 'Overview' at /docs/metrics/overview becomes 'Metrics Overview'
- Pages with explicit seoTitle are unchanged

Co-authored-by: felixkrrr <felixkrrr@users.noreply.github.com>
@vercel

vercel Bot commented Apr 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| langfuse-docs | Ready | Preview, Comment | Apr 28, 2026 10:51am |

Generic frontmatter titles like 'Get Started' and 'Overview' now render
enriched in both the OG image AND the page <title> tag:
- 'Get Started - Langfuse' → 'Get Started with Prompt Management - Langfuse'
- 'Overview - Langfuse' → 'Langfuse Overview - Langfuse'

Pages with explicit seoTitle in frontmatter are unaffected.

Co-authored-by: felixkrrr <felixkrrr@users.noreply.github.com>
@felixkrrr felixkrrr marked this pull request as ready for review April 28, 2026 08:34
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 28, 2026
@github-actions

@claude review

Comment thread app/api/og/route.tsx Outdated
@felixkrrr

@greptileai

The effective char count threshold was 105/0.48 ≈ 219, which could never
fire because the t.length > 105 guard above catches all strings that long.
A 40-char CJK title (visually as wide as ~83 Latin chars) was never routed
to fitTitleLayoutLong. Using 105 directly means 'this title has the visual
weight of a 105-char Latin string.'
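
The arithmetic behind this change can be checked with a small sketch. The `isLongTitle` body, the `effectiveLength` helper, and the constant name `LONG_TITLE_THRESHOLD` are hypothetical; only the values 105 and 0.48 come from this commit message.

```typescript
// Hypothetical sketch of the threshold reasoning; not the code in route.tsx.
const ANALOG_CHAR_EM = 0.48;
const LONG_TITLE_THRESHOLD = 105; // "visual weight of a 105-char Latin string"

// Illustrative CJK ranges (assumption).
const CJK_CHAR = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af\uff01-\uff60]/;

// Length in Latin-character units: each CJK character counts as 1 / 0.48.
function effectiveLength(text: string): number {
  let total = 0;
  for (const ch of Array.from(text)) {
    total += CJK_CHAR.test(ch) ? 1 / ANALOG_CHAR_EM : 1;
  }
  return total;
}

function isLongTitle(t: string): boolean {
  return t.length > LONG_TITLE_THRESHOLD || effectiveLength(t) > LONG_TITLE_THRESHOLD;
}

// Old threshold: 105 / 0.48 = 218.75. A string with at most 105 raw characters has
// an effective length of at most 105 / 0.48, so the old check could never add anything.
const oldThreshold = LONG_TITLE_THRESHOLD / ANALOG_CHAR_EM;

// A 40-character CJK title weighs about 83.3 Latin characters: still below 105.
const fortyCjk = effectiveLength("日".repeat(40));

// A 60-character CJK title weighs 125, so it now counts as long despite length 60.
const sixtyCjkIsLong = isLongTitle("日".repeat(60));
```

Comparing the effective length against 105 directly keeps both checks in the same unit, which is what makes the second condition reachable for CJK-heavy titles.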

Addresses PR review comment from greptile.

Co-authored-by: felixkrrr <felixkrrr@users.noreply.github.com>
@felixkrrr felixkrrr added this pull request to the merge queue Apr 28, 2026
@dosubot dosubot Bot added the auto-merge This PR is set to be merged label Apr 28, 2026
Merged via the queue into main with commit 1b7a8bd Apr 28, 2026
14 checks passed
@felixkrrr felixkrrr deleted the cursor/fix-og-issues-0b52 branch April 28, 2026 13:40
@dosubot dosubot Bot removed the auto-merge This PR is set to be merged label Apr 28, 2026
2 participants