Commit d904631

slayerjain and nehagup authored
seo: HowTo schema, glossary DefinedTermSet, docs landing fixes (#857)
* docs(seo): add HowTo schema, glossary DefinedTermSet, docs landing fixes

Audit findings on keploy.io/docs and follow-up coverage.

Critical fixes:

- /docs/ landing was rendering zero H1, with title "Keploy Documentation"
  (20 chars) and description "API Test Generator Tool" (23 chars). Both were
  too short to capture docs intent (install, capture, replay, SDK). Added an
  sr-only H1 plus a longer Layout title/description on src/pages/index.js
  ("Keploy Documentation — Install, Capture & Replay API Tests" + a 159-char
  description covering install, capture, CI replay, SDK references).
- src/pages/about.js shipped zero JSON-LD because src/pages/* are not covered
  by the docs schema plugin. Inlined Article + BreadcrumbList JSON-LD via
  @docusaurus/Head.
- src/pages/concepts/reference/glossary.js shipped zero JSON-LD too. Now emits
  a single DefinedTermSet from the entire glossaryEntries data, with one
  DefinedTerm per glossary entry. Mirrors the pattern in landing's
  /what-is-api-testing layout.
- Docusaurus sitemap noIndex on 1.0.0 / 2.0.0 archives + ignorePatterns to
  drop /tags/**, /1.0.0/** and /2.0.0/** from the generated sitemap. Reduces
  crawl-budget dilution; preserves the recently-added priority bucket comments
  above the sitemap config.

HowTo schema rollout:

- New src/components/HowTo.js wrapper. API:

    <HowTo name="Install Keploy on Linux" totalTime="PT5M" tools={[...]}
      supplies={[...]} steps={[{name, text, url}, ...]}
      visible={true|false} />

  Emits valid HowTo JSON-LD via @docusaurus/Head and (when visible) renders a
  numbered <ol>. Use visible={false} on pages that already render the steps in
  prose to avoid duplicate UI.
- Applied to versioned_docs/version-4.0.0/server/installation.md (visible
  HowTo above the existing prose).
- Applied to all 32 quickstart pages in
  versioned_docs/version-4.0.0/quickstart with visible={false} (existing
  tutorial prose remains; only schema is added).

SDK-installation title alignment:

- /docs/server/sdk-installation/go|python|javascript pages had H1/title
  "Merge Test Coverage Data" while the URL says "sdk-installation". Title
  rewritten to "Keploy [Language] SDK — Install & Merge Test Coverage" so URL
  and content topic align without renaming the route or moving files. java.md
  is unchanged here; upstream main has rewritten that file as the Enterprise
  dynamic-deduplication agent guide, so the SDK-install framing no longer
  applies to it.

Built + verified: `npx docusaurus build` produces a clean static build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* docs(seo): address PR #857 review comments

- versioned_docs/version-4.0.0/server/installation.md: add explicit
  {#capturing-testcases} / {#running-testcases} anchors on the H2s and update
  HowTo step.url to those anchors. The previous `#-capturing-testcases` slug
  relied on Docusaurus's auto-slugger behavior with leading emojis and was
  fragile.
- versioned_docs/version-4.0.0/quickstart/k8s-proxy.md: HowTo `tools` list now
  includes Kind, kubectl, Helm, the prerequisites the page itself calls out
  below the schema.
- versioned_docs/version-4.0.0/quickstart/samples-express-mongoose.md: HowTo
  `name` is now title-cased ("Sample Course-Selling API (Express) — Record and
  Replay Tests with Keploy") instead of all lowercase.
- versioned_docs/version-4.0.0/quickstart/samples-node-mongo.md: fix typo
  "Intoduction" → "Introduction".
- src/pages/about.js: drop unused `useBaseUrl` import. JSON-LD URLs now carry
  trailing slashes to match Docusaurus `trailingSlash: true` config: both the
  Article `url`/`@id` and the BreadcrumbList `item`s.
- src/pages/concepts/reference/glossary.js: same trailing-slash fix for
  DefinedTermSet `@id`/`url`, per-term DefinedTerm `url`s, and the
  BreadcrumbList items. Centralized via a `withTrailingSlash` helper so future
  glossary entries inherit the canonical form.
- src/pages/index.js: Article schema's `headline` and `description` now derive
  from `docsHomeTitle` / `docsHomeDescription` (the same values used for the
  rendered <title>, meta description, and sr-only H1) instead of the old short
  `siteConfig.title` / `siteConfig.tagline`. Schema and on-page metadata now
  agree.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* docs(seo): address remaining PR #857 review comments

- src/components/HowTo.js: when `visible={true}`, the rendered <ol> no longer
  derives <li id> from `s.url` alone (multiple steps can share the same anchor
  and that produced duplicate ids in the DOM). The id now suffixes the step
  position: `${slug}-step-${i+1}`.
- versioned_docs/version-4.0.0/server/sdk-installation/python.md: meta
  description rephrased; it was a sentence fragment ending "Combined reports
  seamlessly." and is now a complete clause that reads cleanly in SERP
  snippets.
- versioned_docs/version-4.0.0/server/sdk-installation/javascript.md: same fix
  for the "Combined integration + unit-test reports." fragment.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* docs(seo): drop HowTo li id, fix glossary typo + guard JSON-LD URLs

- src/components/HowTo.js: stop deriving the visible <li> `id` from `s.url`.
  In docs usage step.url often points at an existing heading anchor on the
  page (e.g. `#capturing-testcases`), so the list-item id would clash with the
  h2 anchor whenever `visible` is enabled. The list is just the readable view;
  `step.url` in the JSON-LD already carries the schema linkage.
- static/data/glossaryEntries.js: fix the "Stubs" entry which had `ink:`
  instead of `link:`.
- src/pages/concepts/reference/glossary.js: defensively filter glossary
  entries missing a valid `link` before mapping into DefinedTerm.url, so a
  future similar gap can't ship a malformed `https://keploy.ioundefined` URL
  into the JSON-LD.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* fix(seo): drop wrong HowTo step anchor, rephrase Go SDK description

The first install-Linux HowTo step ("Download and install the Keploy binary")
was pointing at #capturing-testcases; that anchor exists on the page but it's
the wrong section for the install command. There's no installation-specific
anchor to point to (the install runs inline above the H2s), so drop the url
field for that step rather than emit a misleading deep link.

Go SDK meta description was a noun phrase tail of "Graceful shutdown setup,
-cover flag, combined reports." Rephrase as a complete sentence so AI engines
and snippet renderers can quote it cleanly.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* fix(seo): prefix sitemap ignorePatterns with /docs/, validate HowTo steps

baseUrl is /docs/, so Docusaurus emits sitemap routes as /docs/tags/...,
/docs/1.0.0/... and /docs/2.0.0/... . The previous bare patterns (/tags/**,
/1.0.0/**, /2.0.0/**) couldn't match those, so tag indexes and the legacy doc
versions were quietly slipping into the generated sitemap despite the noIndex
headers elsewhere. Add the prefixed patterns; keep the bare ones as
defence-in-depth in case baseUrl is ever flattened.

HowTo.js was emitting steps with empty `text` and auto-generated "Step N"
names whenever authors omitted those fields, producing low-quality HowTo
structured data the rich-results test flags. Filter to entries that carry both
`name` and `text`; drop the schema entirely if none qualify, and render the
visible <ol> from the same filtered list so the markup and JSON-LD stay in
sync.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* style(seo): apply prettier 2.8.8 to changed quickstart docs

The PR's editor (and the post-prettier action) had drifted from the repo's
pinned prettier 2.8.8: trailing commas, JSX prop indentation inside MDX, and a
few `<HowTo>` block layouts didn't match. CI's `prettier --check` over the
changed file list was flagging 37 files; this commit runs `prettier --write`
over the same list so the formatter check passes without rewriting any of the
actual content.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* chore(vale): skip <HowTo> MDX blocks when linting prose

Vale was treating JSX prop values inside <HowTo> blocks as narrative prose and
flagging Google.Quotes (commas inside `"Keploy CLI", "Docker"` arrays),
Google.EmDash (the " — " separators in display names), and Vale.Spelling on
identifier-like step text. None of those are prose; they're structured
component data. Add BlockIgnores patterns covering both self-closing and
paired forms so the linter stays focused on the actual tutorial copy.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* chore(vale): match <HowTo> across newlines and `>` inside JSX values

The previous BlockIgnores pattern `[^>]*?` stopped on the first `>`, which
appears inside JSX prop values like `Linux kernel >= 5.10` and inside the `=>`
arrow callbacks generated by prettier. As a result the ignore never spanned
the full HowTo element and Vale was still linting its body. Switch to
`[\s\S]*?` (any char, lazy) so the regex spans multiple lines and tolerates
literal `>` characters in prop values.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* chore(vale): accept "Sanic" (Python web framework name)

Vale was flagging "Sanic" in the Sanic-MongoDB quickstart's introduction prose
as a misspelling. It's the proper noun of the Python framework the sample app
is built on. Add to the accept vocab.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* fix(seo): align about-page metadata, JSON-LD, and visible H1

src/pages/about.js was shipping three different copies of the page's title and
description: Layout `title`/`description` ("About the docs" / "User General
Information about Keploy's Documentation"), the Article JSON-LD
`headline`/`description` ("About the Keploy Documentation" / "Information
about Keploy's documentation, contribution guidelines, and licensing."), and
the visible H1 ("About the docs"). Snippet generators and rich-result
renderers see a mismatch between the rendered meta tags and the structured
data, which confuses the title they pick for SERP displays.

Hoist the title and description into ABOUT_TITLE / ABOUT_DESCRIPTION constants
and feed them into all three sites: Layout props, Article JSON-LD, and the
visible H1. One source of truth.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* fix(seo): drop unused imports flagged by Copilot on /docs landing

src/pages/about.js previously imported useDocusaurusContext but the
ABOUT_TITLE/ABOUT_DESCRIPTION refactor removed every read of context /
siteConfig. The hook call was sitting dead in the component body. Drop both
the call and the import.

src/pages/index.js was importing KeployCloud, Resources, QuickStart, and
Products (the latter via the separate Product subpath import) only to
reference them in commented-out JSX. Strip the dead imports; the commented JSX
stays put so the references are still discoverable when those sections are
re-enabled.

No behaviour change. Quiets ESLint no-unused-vars on these files and keeps the
bundle from pulling those component modules into the docs landing chunk.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

* fix(seo): correct k8s-proxy HowTo, express casing, about.js URL source

Addresses @amaan-bhati's pre-merge review on #857.

1. k8s-proxy.md HowTo described the wrong workflow. The page is the Kubernetes
   live record/replay guide (Kind + kubectl + Helm + the Keploy Dashboard
   proxy), but the schema steps walked through the standard CLI flow
   (`keploy record -c "CMD_TO_RUN_APP"`), which does not apply here. Google
   would have ingested factually wrong HowTo structured data for this URL.
   Rewrote name/description/tools/steps to the real flow: install prereqs +
   clone, create Kind cluster, build & load images, kubectl apply +
   port-forward, connect cluster in the dashboard, install the proxy via Helm,
   start recording, generate tests with AI. Dropped "Keploy CLI" from tools
   (this page uses the proxy, not the CLI); kept visible={false} since the
   prose renders the tutorial.
2. samples-express-mongoose.md HowTo name used Title Case for the "— Record
   and Replay Tests with Keploy" suffix while all 30+ other quickstarts use
   lowercase "— record and replay tests with Keploy". Lowercased the suffix to
   match; kept "API" / "Express" capitalized (the earlier Copilot ask).
3. about.js hardcoded `https://keploy.io/docs/...` in the Article url, @id,
   logo, and every breadcrumb item. Refactored to a single SITE constant +
   derived path constants, mirroring glossary.js, so the structured data
   tracks the domain/baseUrl in one place instead of going stale
   field-by-field.

Verified with `docusaurus build` (Generated static files, no errors) and
prettier 2.5.1.

Signed-off-by: Neha Gupta <gneha21@yahoo.in>

---------

Signed-off-by: Neha Gupta <gneha21@yahoo.in>
Co-authored-by: Neha Gupta <gneha21@yahoo.in>
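Note: glossary.js is not among the diffs shown on this page, so the following is only a rough sketch of the behaviour the message describes (one DefinedTerm per entry, a `withTrailingSlash` helper, and a filter for entries without a valid `link`). The entry fields `term` and `description`, the function name `buildDefinedTermSet`, the set name "Keploy Glossary", and the sample links are assumptions made for illustration; only `withTrailingSlash`, `glossaryEntries`, and `link` are named in the commit message.

```javascript
// Illustrative sketch only; not the code shipped in glossary.js.
const SITE = "https://keploy.io";

// Append a trailing slash when the URL does not already end with one.
function withTrailingSlash(url) {
  return url.endsWith("/") ? url : `${url}/`;
}

// Assumed entry shape: {term, description, link}.
function buildDefinedTermSet(glossaryEntries) {
  const setUrl = withTrailingSlash(`${SITE}/docs/concepts/reference/glossary`);
  const terms = glossaryEntries
    // Skip entries without a usable `link` so no "...undefined" URL is emitted.
    .filter((e) => typeof e.link === "string" && e.link.startsWith("/"))
    .map((e) => ({
      "@type": "DefinedTerm",
      name: e.term,
      description: e.description,
      url: withTrailingSlash(`${SITE}${e.link}`),
      inDefinedTermSet: setUrl,
    }));
  return {
    "@context": "https://schema.org",
    "@type": "DefinedTermSet",
    "@id": setUrl,
    url: setUrl,
    name: "Keploy Glossary", // assumed display name
    hasDefinedTerm: terms,
  };
}

// The second entry reproduces the `ink:` typo and is therefore dropped.
const sampleSet = buildDefinedTermSet([
  {term: "Mocks", description: "...", link: "/docs/concepts/reference/glossary/mocks"},
  {term: "Stubs", description: "...", ink: "/docs/concepts/reference/glossary/stubs"},
]);
console.log(sampleSet.hasDefinedTerm.map((t) => t.url));
```

Filtering before mapping keeps a single malformed entry from producing an invalid URL, at the cost of silently omitting that term from the structured data.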
1 parent f475fd0 commit d904631

44 files changed

Lines changed: 1483 additions & 43 deletions

.vale.ini

Lines changed: 12 additions & 0 deletions
@@ -8,6 +8,18 @@ Vocab = Base
 # Enable Markdown-specific styles.
 BasedOnStyles = Vale, Google
 
+# Skip MDX component blocks. The <HowTo> wrapper carries JSX prop values
+# (step text, tool/supply names, em-dash separators in display names) that
+# Vale otherwise lints as prose — flagging quote/punctuation rules and
+# em-dash spacing on what is structured component data, not narrative
+# copy. The two patterns cover both self-closing (<HowTo ... />) and
+# paired (<HowTo>...</HowTo>) usages so future authors can use either.
+# Use [\s\S] instead of `.` (Vale's regex doesn't honour `(?s)` reliably
+# in BlockIgnores) and intentionally allow `>` inside the prop body —
+# `[^>]` would break on JSX values like `Linux kernel >= 5.10`.
+BlockIgnores = (?s)<HowTo\b[\s\S]*?/>, \
+  (?s)<HowTo\b[\s\S]*?</HowTo>
+
 # Customize specific rules based on your needs.
 List.Capitalization = YES
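Note: the difference between the two character classes can be checked outside Vale. The sketch below uses JavaScript regular expressions purely as an illustration (Vale uses its own regex engine, and the full text of the earlier pattern is not shown here, so `stopsAtFirstGt` is an assumed reconstruction around the `[^>]*?` fragment quoted in the commit message). The MDX sample is invented.

```javascript
// An invented <HowTo> block with a literal `>` inside a prop value.
const mdx = [
  "<HowTo",
  '  name="Install Keploy on Linux"',
  '  supplies={["Linux kernel >= 5.10"]}',
  "/>",
  "",
  "Tutorial prose continues here.",
].join("\n");

// Assumed shape of the earlier pattern: cannot cross a literal `>`.
const stopsAtFirstGt = /<HowTo\b[^>]*?\/>/;
// Shape of the current pattern: lazy match over any character, newlines included.
const spansAnything = /<HowTo\b[\s\S]*?\/>/;

const oldMatches = stopsAtFirstGt.test(mdx);
const newMatch = spansAnything.exec(mdx);

console.log(oldMatches); // false: the `>` in `>= 5.10` blocks the match
console.log(newMatch[0].includes(">= 5.10")); // true: the whole element is covered
```

Because the quantifier is lazy, the match still ends at the first `/>`, so the prose after the element stays visible to the linter.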

docusaurus.config.js

Lines changed: 23 additions & 0 deletions
@@ -365,11 +365,13 @@ module.exports = {
     label: "1.0.0",
     path: "1.0.0",
     banner: "unmaintained",
+    noIndex: true,
   },
   "2.0.0": {
     label: "2.0.0",
     path: "2.0.0",
     banner: "unmaintained",
+    noIndex: true,
   },
 },
 onlyIncludeVersions: ["1.0.0", "2.0.0", "4.0.0"],
@@ -464,6 +466,27 @@ module.exports = {
 //   0.5 → /docs/concepts/reference/glossary/* (long-tail
 //         glossary; noindexed legacy versions excluded via
 //         netlify headers + robots.txt)
+//
+// Also exclude auto-generated tag indexes and the unmaintained
+// 1.0.0 / 2.0.0 doc versions from the sitemap. Those versions
+// additionally carry `noIndex: true` via their `versions` config
+// above; excluding from the sitemap signals that they should not
+// be ranked at all.
+//
+// Docusaurus matches `ignorePatterns` against the full route path
+// including `baseUrl` (`/docs/`), so the patterns must carry that
+// prefix — bare `/tags/**` and `/1.0.0/**` would never match the
+// emitted `/docs/tags/...` and `/docs/1.0.0/...` routes. Bare
+// patterns are kept as defence-in-depth in case `baseUrl` is ever
+// flattened to `/`.
+ignorePatterns: [
+  "/docs/tags/**",
+  "/docs/1.0.0/**",
+  "/docs/2.0.0/**",
+  "/tags/**",
+  "/1.0.0/**",
+  "/2.0.0/**",
+],
 createSitemapItems: async (params) => {
   const {defaultCreateSitemapItems, ...rest} = params;
   const items = await defaultCreateSitemapItems(rest);
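Note: the reason the bare patterns never matched can be illustrated with a toy matcher. The `globToRegExp` function below is a simplified stand-in written for this note (it handles only `*` and `**`); it is not the matcher Docusaurus uses, and the sample routes are invented.

```javascript
// Toy glob matcher: `**` matches any characters, `*` matches any
// characters except `/`. A stand-in for illustration only.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const source = escaped
    .replace(/\*\*/g, "\u0000") // protect `**` before handling single `*`
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${source}$`);
}

const matchesAny = (patterns, route) =>
  patterns.map(globToRegExp).some((re) => re.test(route));

const barePatterns = ["/tags/**", "/1.0.0/**", "/2.0.0/**"];
const prefixedPatterns = ["/docs/tags/**", "/docs/1.0.0/**", "/docs/2.0.0/**"];

// Routes carry the `/docs/` baseUrl, so only the prefixed patterns match.
console.log(matchesAny(barePatterns, "/docs/tags/api-testing/")); // false
console.log(matchesAny(prefixedPatterns, "/docs/tags/api-testing/")); // true
console.log(matchesAny(prefixedPatterns, "/docs/quickstart/k8s-proxy/")); // false
```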

src/components/HowTo.js

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
+import React from "react";
+import Head from "@docusaurus/Head";
+
+/**
+ * HowTo schema.org wrapper for Docusaurus MDX pages.
+ *
+ * Emits valid schema.org/HowTo JSON-LD into <head> and (optionally) renders a
+ * matching numbered <ol> of visible steps. Authors can pass `visible={false}`
+ * when the prose below already renders the steps so the JSON-LD is the only
+ * change to the page.
+ *
+ * Required HowTo fields per Google: name, step (array of HowToStep with name + text).
+ * Optional: totalTime (ISO 8601 duration), estimatedCost (MonetaryAmount), tool, supply.
+ *
+ * Example:
+ *   <HowTo
+ *     name="Install Keploy on Linux"
+ *     totalTime="PT5M"
+ *     estimatedCost={{currency: "USD", value: "0"}}
+ *     tools={["bash", "curl"]}
+ *     supplies={["Linux machine with kernel >= 5.10"]}
+ *     steps={[
+ *       {name: "Download", text: "Run: curl ...", url: "#download"},
+ *       {name: "Install", text: "Run: sudo install ...", url: "#install"},
+ *     ]}
+ *     visible={false}
+ *   />
+ */
+export default function HowTo({
+  name,
+  description,
+  totalTime,
+  estimatedCost,
+  tools,
+  supplies,
+  image,
+  steps,
+  visible = true,
+}) {
+  if (!name || !Array.isArray(steps) || steps.length === 0) {
+    // Component is a no-op without the minimum required fields.
+    return null;
+  }
+
+  // Filter to steps that carry both `name` and `text` per Google's HowTo
+  // requirements. Auto-generating "Step N" placeholders or emitting empty
+  // `text` produces low-quality structured data that the rich-results test
+  // flags. If the author gave us nothing usable, drop the schema entirely
+  // rather than ship a hollow HowTo.
+  const validSteps = steps.filter(
+    (s) =>
+      typeof s.name === "string" &&
+      s.name.trim() &&
+      typeof s.text === "string" &&
+      s.text.trim()
+  );
+  if (validSteps.length === 0) {
+    return null;
+  }
+
+  const schema = {
+    "@context": "https://schema.org",
+    "@type": "HowTo",
+    name,
+    step: validSteps.map((s, i) => {
+      const step = {
+        "@type": "HowToStep",
+        position: i + 1,
+        name: s.name,
+        text: s.text,
+      };
+      if (s.url) step.url = s.url;
+      if (s.image) step.image = s.image;
+      return step;
+    }),
+  };
+
+  if (description) schema.description = description;
+  if (totalTime) schema.totalTime = totalTime;
+  if (image) schema.image = image;
+  if (estimatedCost && estimatedCost.value !== undefined) {
+    schema.estimatedCost = {
+      "@type": "MonetaryAmount",
+      currency: estimatedCost.currency || "USD",
+      value: String(estimatedCost.value),
+    };
+  }
+  if (Array.isArray(tools) && tools.length > 0) {
+    schema.tool = tools.map((t) =>
+      typeof t === "string" ? {"@type": "HowToTool", name: t} : t
+    );
+  }
+  if (Array.isArray(supplies) && supplies.length > 0) {
+    schema.supply = supplies.map((s) =>
+      typeof s === "string" ? {"@type": "HowToSupply", name: s} : s
+    );
+  }
+
+  return (
+    <>
+      <Head>
+        <script type="application/ld+json">{JSON.stringify(schema)}</script>
+      </Head>
+      {visible && (
+        <section
+          aria-label={name}
+          style={{
+            border: "1px solid var(--ifm-color-emphasis-200)",
+            borderRadius: "12px",
+            padding: "1rem 1.25rem",
+            margin: "1rem 0 1.5rem",
+            background: "var(--ifm-background-surface-color)",
+          }}
+        >
+          <h3 style={{marginTop: 0}}>{name}</h3>
+          {description && <p>{description}</p>}
+          <ol>
+            {/* Don't derive an `id` from `s.url`. In docs usage `step.url`
+                often points at an existing heading anchor on the page (e.g.
+                `#capturing-testcases`), so reusing that as a list-item id
+                would produce duplicate ids in the DOM whenever `visible`
+                is enabled. The list is the readable view; `step.url` in
+                the JSON-LD already covers the schema linkage. */}
+            {validSteps.map((s, i) => (
+              <li key={i}>
+                <strong>{s.name}</strong>
+                <div>{s.text}</div>
+              </li>
+            ))}
+          </ol>
+        </section>
+      )}
+    </>
+  );
+}
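Note: the step filtering and position numbering can be exercised without React. The sketch below lifts that logic into a plain function; `buildHowToSchema` is a hypothetical name for this note and not an export of the component, it covers only a subset of the props, and the "Verify" step is invented sample data.

```javascript
// Plain-function sketch of the schema-building logic in HowTo.js (no React).
function buildHowToSchema({name, steps, totalTime, tools}) {
  if (!name || !Array.isArray(steps)) return null;

  // Keep only steps that carry non-empty `name` and `text` strings.
  const validSteps = steps.filter(
    (s) =>
      typeof s.name === "string" &&
      s.name.trim() &&
      typeof s.text === "string" &&
      s.text.trim()
  );
  if (validSteps.length === 0) return null;

  const schema = {
    "@context": "https://schema.org",
    "@type": "HowTo",
    name,
    // Positions are assigned after filtering, so they stay contiguous.
    step: validSteps.map((s, i) => ({
      "@type": "HowToStep",
      position: i + 1,
      name: s.name,
      text: s.text,
      ...(s.url ? {url: s.url} : {}),
    })),
  };
  if (totalTime) schema.totalTime = totalTime;
  if (Array.isArray(tools) && tools.length > 0) {
    schema.tool = tools.map((t) =>
      typeof t === "string" ? {"@type": "HowToTool", name: t} : t
    );
  }
  return schema;
}

const howTo = buildHowToSchema({
  name: "Install Keploy on Linux",
  totalTime: "PT5M",
  tools: ["bash", "curl"],
  steps: [
    {name: "Download", text: "Run: curl ...", url: "#download"},
    {name: "Install", text: ""}, // dropped: empty text
    {name: "Verify", text: "Run the binary once."}, // invented sample step
  ],
});

console.log(howTo.step.map((s) => `${s.position}. ${s.name}`));
// [ '1. Download', '2. Verify' ]
```

Because the same filtered list feeds both the JSON-LD and the visible list in the component, a step with missing text disappears from both rather than leaving the two out of sync.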

src/pages/about.js

Lines changed: 91 additions & 7 deletions
@@ -1,19 +1,103 @@
 import React from "react";
 import Layout from "@theme/Layout";
-import useDocusaurusContext from "@docusaurus/useDocusaurusContext";
-import useBaseUrl from "@docusaurus/useBaseUrl";
+import Head from "@docusaurus/Head";
+
+// Custom React pages under src/pages/ are not covered by the docs schema
+// plugin — add Article + BreadcrumbList JSON-LD inline so the page is
+// machine-readable for search engines and AI crawlers.
+//
+// Site config sets `trailingSlash: true`, so canonical URLs in the JSON-LD
+// must carry the trailing slash to match the actual emitted href and avoid
+// duplicate URL variants in structured data.
+//
+// Single source of truth for the page's title and description: the Layout
+// `title`/`description` props, the visible H1, and the Article JSON-LD
+// `headline`/`description` all read from these constants. Previously the
+// page shipped Layout title "About the docs" / description "User General
+// Information about..." while the JSON-LD claimed headline "About the
+// Keploy Documentation" / a different description, which confuses snippet
+// generators and leaves rich-result text out of sync with the meta tags.
+const ABOUT_TITLE = "About the Keploy Documentation";
+const ABOUT_DESCRIPTION =
+  "Information about Keploy's documentation, contribution guidelines, and licensing.";
+
+// Derive every canonical URL from a single `SITE` + path constants instead
+// of hardcoding `https://keploy.io/docs/...` in each field — mirrors the
+// pattern in concepts/reference/glossary.js. If the domain or docs baseUrl
+// ever changes, the Article/BreadcrumbList structured data updates in one
+// place instead of going stale field-by-field.
+//
+// Site config sets `trailingSlash: true`, so paths that map to a page carry
+// a trailing slash to match the canonical href and avoid duplicate-URL
+// variants in structured data.
+const SITE = "https://keploy.io";
+const HOME_URL = `${SITE}/`;
+const DOCS_URL = `${SITE}/docs/`;
+const ABOUT_URL = `${SITE}/docs/about/`;
+const LOGO_URL = `${SITE}/docs/img/favicon.png`;
+
+const aboutStructuredData = [
+  {
+    "@context": "https://schema.org",
+    "@type": "Article",
+    headline: ABOUT_TITLE,
+    description: ABOUT_DESCRIPTION,
+    url: ABOUT_URL,
+    publisher: {
+      "@type": "Organization",
+      name: "Keploy",
+      logo: {
+        "@type": "ImageObject",
+        url: LOGO_URL,
+      },
+    },
+    mainEntityOfPage: {
+      "@type": "WebPage",
+      "@id": ABOUT_URL,
+    },
+  },
+  {
+    "@context": "https://schema.org",
+    "@type": "BreadcrumbList",
+    itemListElement: [
+      {
+        "@type": "ListItem",
+        position: 1,
+        name: "Home",
+        item: HOME_URL,
+      },
+      {
+        "@type": "ListItem",
+        position: 2,
+        name: "Docs",
+        item: DOCS_URL,
+      },
+      {
+        "@type": "ListItem",
+        position: 3,
+        name: "About",
+        item: ABOUT_URL,
+      },
+    ],
+  },
+];
 
 function About() {
-  const context = useDocusaurusContext();
-  const {siteConfig = {}} = context;
   return (
     <Layout
-      title="About the docs"
+      title={ABOUT_TITLE}
       permalink="/about"
-      description="User General Information about Keploy's Documentation"
+      description={ABOUT_DESCRIPTION}
     >
+      <Head>
+        {aboutStructuredData.map((schema, i) => (
+          <script key={i} type="application/ld+json">
+            {JSON.stringify(schema)}
+          </script>
+        ))}
+      </Head>
       <main className="margin-vert--lg container">
-        <h1>About the docs</h1>
+        <h1>{ABOUT_TITLE}</h1>
         <div className="margin-bottom--lg">
           <h2 id="latest">Documentation SLA</h2>
           <p>
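Note: the three hand-written ListItem objects follow a regular pattern, so one possible variation (not what the commit does) is to generate them from an ordered trail. `breadcrumbList` below is a hypothetical helper for illustration; the `SITE` value and the paths are taken from the diff above.

```javascript
const SITE = "https://keploy.io";

// Hypothetical helper: build a BreadcrumbList from ordered [name, path] pairs.
// Paths are expected to already carry their trailing slash.
function breadcrumbList(trail) {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: trail.map(([name, path], i) => ({
      "@type": "ListItem",
      position: i + 1,
      name,
      item: `${SITE}${path}`,
    })),
  };
}

const crumbs = breadcrumbList([
  ["Home", "/"],
  ["Docs", "/docs/"],
  ["About", "/docs/about/"],
]);

console.log(JSON.stringify(crumbs.itemListElement[2]));
```

Deriving `position` from the array index removes one way for the list to drift out of order when a level is added or removed.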
