feat(seo): structured data (Organization + DefinedTermSet) and pre-render the article #592

Merged
rdmueller merged 1 commit into LLM-Coding:main from raifdmueller:feat/jsonld-defined-terms
Jun 10, 2026
raifdmueller (Contributor) commented Jun 10, 2026

Implements the structured-data half of #579 and fixes a missing pre-render route that left the article invisible to crawlers.

Changes

Structured data (#579)

  • Standalone Organization entity in index.html with @id, url, logo, sameAs (GitHub). WebSite.publisher now references it by @id instead of inlining a separate, unresolvable copy.
  • DefinedTermSet + DefinedTerm — new scripts/generate-jsonld.js reads anchors.json at build time and emits one DefinedTerm per anchor (161 total: name, canonical url, termCode, and a description extracted from the first Core Concept where cleanly available — 124/161). Injected after prerender into the home page and /all-anchors only (the canonical locations for the set), wired as the last step of npm run build.
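
As a rough sketch of the generation step described above (not the actual scripts/generate-jsonld.js): the record fields `id`, `title`, and `description`, the `BASE_URL`, and the `/anchor/<id>` URL pattern are all assumptions made for illustration. Only the schema.org types and properties (`DefinedTermSet`, `DefinedTerm`, `hasDefinedTerm`, `termCode`, `inDefinedTermSet`) are standard vocabulary.

```javascript
// Hypothetical base URL and set @id; the real values live in the project.
const BASE_URL = 'https://example.org'
const SET_ID = `${BASE_URL}/#semantic-anchors`

// Build a schema.org DefinedTermSet from a list of anchor records.
// The field names (id, title, description) are assumed, not taken from anchors.json.
function buildDefinedTermSet(anchors) {
  return {
    '@context': 'https://schema.org',
    '@type': 'DefinedTermSet',
    '@id': SET_ID,
    name: 'Semantic Anchors',
    hasDefinedTerm: anchors.map((anchor) => {
      const term = {
        '@type': 'DefinedTerm',
        name: anchor.title,
        termCode: anchor.id,
        url: `${BASE_URL}/anchor/${anchor.id}`,
        inDefinedTermSet: SET_ID,
      }
      // Only emit a description when one was cleanly extracted.
      if (anchor.description) term.description = anchor.description
      return term
    }),
  }
}

const sampleSet = buildDefinedTermSet([
  { id: 'tdd', title: 'Test-Driven Development', description: 'Write the test first.' },
  { id: 'solid', title: 'SOLID' },
])
console.log(JSON.stringify(sampleSet, null, 2))
```

Omitting `description` when extraction fails, rather than emitting an empty string, mirrors the "where cleanly available" behaviour noted above.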

Bug fix — invisible article

  • /training-data-vs-practice was absent from the prerender ROUTES list, so it shipped only as the client-rendered SPA shell — invisible to search engines and LLM crawlers. Added it; it now pre-renders like every other doc page (16 routes, was 15).
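
A small build-time guard could catch this class of omission. The sketch below is illustrative only: the `ROUTES` entry shape and the `docPages` list are hypothetical, and the real scripts/prerender-routes.js may be structured differently.

```javascript
// Hypothetical shape of a prerender route list; the real ROUTES entries
// in scripts/prerender-routes.js may carry different fields.
const ROUTES = [
  { path: '/all-anchors', title: 'All Anchors' },
  { path: '/changelog', title: 'Changelog' },
]

// Return every known page path that has no matching prerender route.
function findUnrenderedPages(pagePaths, routes) {
  const rendered = new Set(routes.map((route) => route.path))
  return pagePaths.filter((path) => !rendered.has(path))
}

// Assumed list of pages the SPA router serves.
const docPages = ['/all-anchors', '/changelog', '/training-data-vs-practice']
const missing = findUnrenderedPages(docPages, ROUTES)
console.log(missing) // [ '/training-data-vs-practice' ]
```

Failing the build when `missing` is non-empty would surface a page that ships only as the client-rendered shell.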

Why this split

The human-readable anchor definitions are already crawlable via the pre-rendered /all-anchors page. What was missing is the machine-readable entity graph that lets search/AI resolve "Semantic Anchors" as a distinct DefinedTermSet and each anchor as a defined term with a canonical URL. Crisp per-anchor "direct answer" blocks remain #580; those will later supersede the extracted descriptions here.
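
To illustrate what a resolvable entity graph means here, the sketch below models two nodes linked by `@id` and checks that every reference points at a node that exists. The URLs and names are placeholders, and `findDanglingRefs` is a hypothetical helper, not part of this PR.

```javascript
// Placeholder URLs; the real @id values are defined in website/index.html.
const graph = [
  {
    '@type': 'Organization',
    '@id': 'https://example.org/#organization',
    name: 'Example Org',
    sameAs: ['https://github.com/example'],
  },
  {
    '@type': 'WebSite',
    '@id': 'https://example.org/#website',
    // Reference by @id instead of inlining a second copy of the Organization.
    publisher: { '@id': 'https://example.org/#organization' },
  },
]

// Collect @id references (objects whose only key is '@id') that point at no node.
function findDanglingRefs(nodes) {
  const known = new Set(nodes.map((node) => node['@id']))
  const dangling = []
  const visit = (value) => {
    if (Array.isArray(value)) return value.forEach(visit)
    if (value === null || typeof value !== 'object') return
    const keys = Object.keys(value)
    if (keys.length === 1 && keys[0] === '@id' && !known.has(value['@id'])) {
      dangling.push(value['@id'])
    }
    keys.forEach((key) => visit(value[key]))
  }
  nodes.forEach(visit)
  return dangling
}

console.log(findDanglingRefs(graph)) // []
```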

A note on expectations (for #579 and #580)

Structured data helps the retrieval-grounded path (Perplexity, AI Overviews, Bing Copilot, ChatGPT-with-search) — it does not write a training-data prior. A cold model with no live search will still not cite the project from its weights; that only changes with time and coverage. Worth tightening #580's "shows the project surfaced" acceptance criterion accordingly (testable only on retrieval-grounded assistants).

Verification

Full vite build + prerender + injection run locally:

  • article page carries real content (<title>, canonical, filled #page-content);
  • home + /all-anchors carry the DefinedTermSet (161 terms); a control route (/changelog) does not;
  • both index.html JSON-LD blocks and the generated set validate;
  • injection is idempotent and </script>-escaped.
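
The last two properties can be sketched as follows. This is a minimal illustration under assumptions, not the code in scripts/generate-jsonld.js: the `SET_ID` value and both function names are hypothetical, and the real script writes files rather than returning strings.

```javascript
// Hypothetical marker; assumed to be the set's @id, which appears in the injected JSON.
const SET_ID = 'https://example.org/#semantic-anchors'

// Serialize JSON-LD for embedding in HTML. Escaping every "<" as \u003c keeps
// the JSON valid while making a literal "</script>" in a description harmless.
function toScriptTag(data) {
  const json = JSON.stringify(data).replace(/</g, '\\u003c')
  return `<script type="application/ld+json">${json}</script>`
}

// Insert the tag before </head>; return the HTML unchanged if the set is
// already present (idempotent) or there is no </head> to anchor on.
function injectJsonLd(html, data) {
  if (html.includes(SET_ID) || !html.includes('</head>')) return html
  // A replacer function avoids "$" in the JSON being read as a replace pattern.
  return html.replace('</head>', () => `${toScriptTag(data)}\n</head>`)
}

const page = '<html><head><title>t</title></head><body></body></html>'
const data = { '@id': SET_ID, description: 'ends a tag: </script>' }
const once = injectJsonLd(page, data)
const twice = injectJsonLd(once, data)
console.log(once === twice) // true
```

Because JSON parsers decode `\u003c` back to `<`, consumers of the JSON-LD still see the original description text.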

Remaining manual check for the reviewer: run the live output through the Google Rich Results Test / schema.org validator (third AC of #579) — I can't reach those from here.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Improved discoverability through structured data (Schema.org) for search engines and AI systems.
    • The organization entity and the anchor directory are now machine-readable.
  • Bug Fixes

    • A page is now pre-rendered correctly and is visible to search engines.

…coverability

Implements the structured-data half of LLM-Coding#579 and fixes a missing pre-render route.

- Standalone Organization entity (resolvable @id + sameAs) in index.html;
  WebSite.publisher now references it by @id (LLM-Coding#579/1a).
- scripts/generate-jsonld.js: build-time DefinedTermSet + a DefinedTerm per
  anchor (161 terms; name, canonical URL, termCode, and a definition where
  cleanly extractable from the .adoc) generated from anchors.json and injected
  into the home and /all-anchors pages after prerender (LLM-Coding#579/1b). The
  human-readable definitions already ship via /all-anchors; crisp answer
  blocks remain LLM-Coding#580.
- Pre-render /training-data-vs-practice: the article was absent from the
  prerender ROUTES list, so it was invisible to search engines and LLM
  crawlers. Now pre-rendered like every other doc page.

Verified with a full vite build + prerender + injection: the article page
carries real content; home and /all-anchors carry the set, other routes do
not; both index.html JSON-LD blocks and the generated set validate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai (Bot, Contributor) commented Jun 10, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: d5cbceab-e76a-46ee-9e1e-05fb9a900f19

📥 Commits

Reviewing files that changed from the base of the PR and between 86ee9ca and dd8ab77.

📒 Files selected for processing (5)
  • docs/changelog.adoc
  • scripts/generate-jsonld.js
  • scripts/prerender-routes.js
  • website/index.html
  • website/package.json

Walkthrough

This PR extends the website with Schema.org structured data for better search-engine and LLM visibility: the Organization entity is refactored, a build-time JSON-LD generation script is introduced (it extracts anchor definitions from anchors.json and injects them), the build pipeline is adjusted, and a new pre-rendered route is added.

Changes

SEO/AI Discoverability Enhancement

| Layer / File(s) | Summary |
|---|---|
| Schema.org Organization refactoring (website/index.html) | The WebSite JSON-LD references the Organization by @id instead of a nested object; a new standalone Organization entity with name, alternateName, url, logo, description, and sameAs links is added. |
| JSON-LD DefinedTermSet generation (scripts/generate-jsonld.js) | New script loads anchors.json, optionally extracts descriptions from AsciiDoc Core Concepts blocks, cleans and truncates the text, serializes it as a JSON-LD DefinedTermSet with DefinedTerm entries, and injects it idempotently before </head> in website/dist/index.html and website/dist/all-anchors/index.html. |
| Build pipeline integration (website/package.json, scripts/prerender-routes.js) | The build script is extended to run JSON-LD generation after the Vite build; the new route /training-data-vs-practice is defined in prerender-routes with fragment, title, and meta description. |
| Release notes (docs/changelog.adoc) | The changelog documents the SEO/AI structured-data enhancement (Organization + DefinedTermSet generation) and the fix for a documentation page that was not being pre-rendered, dated 2026-06-10. |
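
The extract, clean, and truncate step summarized for scripts/generate-jsonld.js above might look roughly like the sketch below. The AsciiDoc layout it assumes (a line containing "Core Concepts" followed by the text to use) is a guess for illustration, as are the function name and the 160-character default; the real script's rules for what counts as cleanly extractable are not shown in this PR page.

```javascript
// Assumed layout: a line containing "Core Concepts", then the first non-empty
// line after it holds the text to use. The real .adoc files may differ.
function extractDescription(adoc, maxLength = 160) {
  const lines = adoc.split('\n')
  const start = lines.findIndex((line) => /core concepts/i.test(line))
  if (start === -1) return null
  const first = lines.slice(start + 1).find((line) => line.trim() !== '')
  if (!first) return null

  const text = first
    .replace(/^[*\-.\s]+/, '') // leading list markers
    .replace(/[*_`]/g, '') // inline bold/italic/monospace marks
    .replace(/\s+/g, ' ')
    .trim()
  if (text === '') return null
  if (text.length <= maxLength) return text

  // Cut at the last space before the limit so no word is split.
  const cut = text.slice(0, maxLength)
  const lastSpace = cut.lastIndexOf(' ')
  return `${lastSpace > 0 ? cut.slice(0, lastSpace) : cut}…`
}

const sampleAdoc = [
  '*Core Concepts*:',
  '',
  '* *Red-green-refactor*: write a failing test first',
].join('\n')
console.log(extractDescription(sampleAdoc))
// Red-green-refactor: write a failing test first
```

Returning `null` rather than a partial string lets the caller skip the `description` property for anchors without a usable Core Concepts entry.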

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #579: Implements the same code-level improvement with a standalone Organization JSON-LD and build-time DefinedTermSet/DefinedTerm generation from anchors.json.

Possibly related PRs

  • LLM-Coding/Semantic-Anchors#586: Both PRs wire up the new route /training-data-vs-practice; this PR additionally extends scripts/prerender-routes.js with title/meta handling for this route.
  • LLM-Coding/Semantic-Anchors#378: Both PRs modify the Schema.org structured data in website/index.html (this PR: WebSite/Organization JSON-LD wiring; PR #378: JSON-LD description and meta/OG/Twitter tags).
  • LLM-Coding/Semantic-Anchors#554: This PR adds a build-time JSON-LD generator that consumes website/public/data/anchors.json; PR #554 adds new Semantic Anchors (with their docs/anchors/*.adoc paths) that the generator processes.

Suggested reviewers

  • JensGrote

@rdmueller rdmueller merged commit c5d4735 into LLM-Coding:main Jun 10, 2026
5 of 6 checks passed
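// Skip files that already contain the set, so repeated builds do not duplicate it.
if (html.includes(SET_ID)) return false // idempotent
// Without a </head> there is no injection point.
if (!html.includes('</head>')) return false
// Replacer function: a "$" inside scriptTag is not read as a replacement pattern.
html = html.replace('</head>', () => ` ${scriptTag}\n </head>`)
fs.writeFileSync(file, html, 'utf-8')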