feat(seo): structured data (Organization + DefinedTermSet) and pre-render the article #592
…coverability

Implements the structured-data half of LLM-Coding#579 and fixes a missing pre-render route.

- Standalone Organization entity (resolvable @id + sameAs) in index.html; WebSite.publisher now references it by @id (LLM-Coding#579/1a).
- scripts/generate-jsonld.js: build-time DefinedTermSet + a DefinedTerm per anchor (161 terms; name, canonical URL, termCode, and a definition where cleanly extractable from the .adoc) generated from anchors.json and injected into the home and /all-anchors pages after prerender (LLM-Coding#579/1b). The human-readable definitions already ship via /all-anchors; crisp answer blocks remain LLM-Coding#580.
- Pre-render /training-data-vs-practice: the article was absent from the prerender ROUTES list, so it was invisible to search engines and LLM crawlers. Now pre-rendered like every other doc page.

Verified with a full vite build + prerender + injection: the article page carries real content; home and /all-anchors carry the set, other routes do not; both index.html JSON-LD blocks and the generated set validate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
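The generation step described in the commit message can be sketched roughly as below. This is a minimal illustration, not the contents of scripts/generate-jsonld.js: the shape of anchors.json is not shown in this PR, so the `id`, `title`, and `definition` fields, the `https://example.org` base URL, the `/anchor/<id>` URL pattern, and the `#defined-term-set` @id are all assumptions. Only the schema.org types and properties (DefinedTermSet, DefinedTerm, hasDefinedTerm, inDefinedTermSet, termCode) come from the vocabulary itself.

```javascript
// Hypothetical input: the real anchors.json layout is not shown in this PR.
const sampleAnchors = [
  { id: 'tdd', title: 'Test-Driven Development', definition: 'Write a failing test first.' },
  { id: 'adr', title: 'Architecture Decision Record' }, // no cleanly extractable definition
]

// Build one DefinedTermSet holding one DefinedTerm per anchor.
function buildDefinedTermSet(anchors, baseUrl) {
  const setId = `${baseUrl}/#defined-term-set` // assumed @id scheme
  return {
    '@context': 'https://schema.org',
    '@type': 'DefinedTermSet',
    '@id': setId,
    name: 'Semantic Anchors',
    url: `${baseUrl}/all-anchors`,
    hasDefinedTerm: anchors.map((anchor) => {
      const term = {
        '@type': 'DefinedTerm',
        name: anchor.title,
        url: `${baseUrl}/anchor/${anchor.id}`, // assumed canonical URL pattern
        termCode: anchor.id,
        inDefinedTermSet: setId,
      }
      // Emit a description only where a definition was available.
      if (anchor.definition) term.description = anchor.definition
      return term
    }),
  }
}

const termSet = buildDefinedTermSet(sampleAnchors, 'https://example.org')
console.log(JSON.stringify(termSet, null, 2))
```

Omitting `description` when no definition is available mirrors the "where cleanly extractable" wording above, so a term never carries an empty or placeholder definition.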
Walkthrough
This PR extends the website with Schema.org structured data for better search-engine and LLM visibility: the Organization entity is refactored, a build-time JSON-LD generation script is introduced (it extracts anchor definitions from anchors.json and injects them), the build pipeline is adjusted, and a new pre-rendered route is added.
Changes: SEO/AI Discoverability Enhancement
Estimated code review effort: 3 (Moderate), ~25 minutes
if (html.includes(SET_ID)) return false // idempotent
if (!html.includes('</head>')) return false
html = html.replace('</head>', ` ${scriptTag}\n </head>`)
fs.writeFileSync(file, html, 'utf-8')
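For readers without the full file, the idempotent injection in the fragment above can be sketched as a pure function on an HTML string. This is an assumption-laden illustration, not the PR's code: `INJECTED_SET_ID`, `toScriptTag`, and `injectJsonLd` are hypothetical names, the @id value is a placeholder, and escaping `<` as `\u003c` is just one common way to keep a JSON payload from closing its script element early.

```javascript
const INJECTED_SET_ID = 'https://example.org/#defined-term-set' // hypothetical @id

// Serialize JSON-LD so the payload cannot terminate the surrounding <script> element.
function toScriptTag(data) {
  const json = JSON.stringify(data).replace(/</g, '\\u003c')
  return `<script type="application/ld+json">${json}</script>`
}

// Returns the updated HTML, or null when nothing was injected.
function injectJsonLd(html, data) {
  if (html.includes(INJECTED_SET_ID)) return null // already injected: idempotent
  if (!html.includes('</head>')) return null // no insertion point
  // A replacer function keeps `$` sequences in the JSON from being read as replacement patterns.
  return html.replace('</head>', () => `  ${toScriptTag(data)}\n</head>`)
}

const page = '<html><head><title>Home</title></head><body></body></html>'
const payload = { '@id': INJECTED_SET_ID, name: 'a</script>b' }
const injected = injectJsonLd(page, payload)
console.log(injected)
console.log(injectJsonLd(injected, payload)) // second run is a no-op
```

Passing a function as the second argument to `replace` is a deliberate choice in this sketch: with a plain string replacement, sequences such as `$&` inside a description would be interpreted as replacement patterns.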
Implements the structured-data half of #579 and adds the pre-render route whose absence kept the article hidden from crawlers.
Changes
Structured data (#579)
- Organization entity in index.html with @id, url, logo, sameAs (GitHub). WebSite.publisher now references it by @id instead of inlining a separate, unresolvable copy.
- DefinedTermSet + DefinedTerm: new scripts/generate-jsonld.js reads anchors.json at build time and emits one DefinedTerm per anchor (161 total: name, canonical url, termCode, and a description extracted from the first Core Concept where cleanly available, which covers 124/161). Injected after prerender into the home page and /all-anchors only (the canonical locations for the set), wired as the last step of npm run build.

Bug fix: invisible article
- /training-data-vs-practice was absent from the prerender ROUTES list, so it shipped only as the client-rendered SPA shell, invisible to search engines and LLM crawlers. Added it; it now pre-renders like every other doc page (16 routes, was 15).

Why this split
The human-readable anchor definitions are already crawlable via the pre-rendered /all-anchors page. What was missing is the machine-readable entity graph that lets search/AI resolve "Semantic Anchors" as a distinct DefinedTermSet and each anchor as a defined term with a canonical URL. Crisp per-anchor "direct answer" blocks remain #580; those will later supersede the extracted descriptions here.

A note on expectations (for #579 and #580)
Structured data helps the retrieval-grounded path (Perplexity, AI Overviews, Bing Copilot, ChatGPT-with-search) — it does not write a training-data prior. A cold model with no live search will still not cite the project from its weights; that only changes with time and coverage. Worth tightening #580's "shows the project surfaced" acceptance criterion accordingly (testable only on retrieval-grounded assistants).
Verification
Full vite build + prerender + injection run locally:

- the article page carries real content (<title>, canonical, filled #page-content);
- home and /all-anchors carry the DefinedTermSet (161 terms); a control route (/changelog) does not;
- both index.html JSON-LD blocks and the generated set validate;
- the injected JSON-LD is </script>-escaped.

Remaining manual check for the reviewer: run the live output through the Google Rich Results Test / schema.org validator (third AC of #579); I can't reach those from here.
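The placement check from the verification list (set present on home and /all-anchors, absent on a control route) could be automated along these lines. This is a hedged sketch only: `checkSetPlacement`, `ROUTES_WITH_SET`, the in-memory `pages` map, and the `https://example.org/#defined-term-set` marker are hypothetical, and a real check would read the prerendered files from the build output directory instead.

```javascript
// Routes expected to carry the DefinedTermSet, per the PR description.
const ROUTES_WITH_SET = ['/', '/all-anchors']

// pages maps a route to its prerendered HTML; returns a list of problems (empty when all is well).
function checkSetPlacement(pages, setId) {
  const problems = []
  for (const [route, html] of Object.entries(pages)) {
    const hasSet = html.includes(setId)
    const shouldHaveSet = ROUTES_WITH_SET.includes(route)
    if (hasSet && !shouldHaveSet) problems.push(`${route}: unexpected DefinedTermSet`)
    if (!hasSet && shouldHaveSet) problems.push(`${route}: missing DefinedTermSet`)
  }
  return problems
}

// Hypothetical stand-ins for prerendered output.
const marker = 'https://example.org/#defined-term-set'
const samplePages = {
  '/': `<head><script type="application/ld+json">{"@id":"${marker}"}</script></head>`,
  '/all-anchors': `<head><script type="application/ld+json">{"@id":"${marker}"}</script></head>`,
  '/changelog': '<head><title>Changelog</title></head>',
}

const placementProblems = checkSetPlacement(samplePages, marker)
console.log(placementProblems.length === 0 ? 'placement ok' : placementProblems.join('\n'))
```

A string check like this only confirms where the set was injected; it does not replace the schema.org or Rich Results validation left for the reviewer.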
🤖 Generated with Claude Code