diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 6bfe1eaf78d..179f1036d5c 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -32,7 +32,7 @@ When creating PRs, follow `.github/pull_request_template.md`: | Workflow | Trigger | Purpose | |---|---|---| -| `build-asciidoc.yml` | Push to main/release | Builds AsciiDoc docs and deploys to GitHub Pages. Cleans up merged PR preview branches. | +| `build-asciidoc.yml` | Push to main/release | Builds AsciiDoc docs and deploys to GitHub Pages (deploy includes cleanup of merged/closed PRs and deleted branches). | | `pr.yml` | PR | Builds HTML preview, runs CQA checks, deploys to `gh-pages`, posts preview URL and CQA checklist as PR comments. Build scripts sourced from base branch. | | `style-guide.yml` | PR | Runs Vale linter on `assemblies/` for style guide compliance. | | `shellcheck.yml` | PR (`*.sh`) | Runs shellcheck on changed shell scripts via reviewdog. | diff --git a/.github/workflows/build-asciidoc.yml b/.github/workflows/build-asciidoc.yml index 2cb7ff3b3bc..48c81fa017e 100644 --- a/.github/workflows/build-asciidoc.yml +++ b/.github/workflows/build-asciidoc.yml @@ -68,39 +68,10 @@ jobs: run: | echo "Building branch ${{ env.GIT_BRANCH }}" touch .lycheecache - build/scripts/build-ccutil.sh -b ${{ env.GIT_BRANCH }} + node build/scripts/build-orchestrator.js -b ${{ env.GIT_BRANCH }} --no-cqa - name: Deploy to the gh-pages branch env: GITHUB_TOKEN: ${{ secrets.RHDH_BOT_TOKEN }} GITHUB_REPOSITORY: ${{ github.repository }} run: bash build/scripts/deploy-gh-pages.sh ./titles-generated --message "Deploy ${{ env.GIT_BRANCH }}" - - - name: Cleanup merged PR branches - run: | - PULL_URL="https://api.github.com/repos/redhat-developer/red-hat-developers-documentation-rhdh/pulls" - GITHUB_TOKEN="${{ secrets.RHDH_BOT_TOKEN }}" - git config user.name "rhdh-bot service account" - git config user.email "rhdh-bot@redhat.com" - - git checkout gh-pages; git pull || true - dirs=$(find . 
-maxdepth 1 -name "pr-*" -type d | sed -r -e "s|^\./pr-||") - refs=$(cat pulls.html | grep pr- | sed -r -e "s|.+.html>pr-([0-9]+).+|\1|") - for d in $(echo -e "$dirs\n$refs" | sort -uV); do - PR="${d}" - echo -n "Check merge status of PR $PR ... " - PR_JSON=$(curl -sSL -H "Accept: application/vnd.github+json" -H "Authorization: Bearer $GITHUB_TOKEN" "$PULL_URL/$PR") - if [[ $(echo "$PR_JSON" | grep merged\") == *"merged\": true"* ]]; then - echo "merged, can delete from pulls.html and remove folder $d" - git rm -fr --quiet "pr-${d}" || rm -fr "pr-${d}" - sed -r -e "/pr-$PR\/index.html>pr-$PRpr-$PR ---- @@ -101,9 +117,11 @@ PRs have a link to the generated HTML attached as a comment. The publication workflow has two stages: -. The link:.github/workflows/pr.yml[PR workflow] and link:.github/workflows/build-asciidoc.yml[GitHub Pages workflow] build HTML from AsciiDoc sources and push the output to the `gh-pages` branch using `build/scripts/deploy-gh-pages.sh`, which handles concurrent pushes with automatic retry. +. The link:.github/workflows/pr.yml[PR workflow] and link:.github/workflows/build-asciidoc.yml[GitHub Pages workflow] build HTML from AsciiDoc sources and push the output to the `gh-pages` branch using `build/scripts/deploy-gh-pages.sh`, which handles concurrent pushes with automatic retry, cleanup of stale PR/branch directories, and index regeneration. . The link:https://github.com/redhat-developer/red-hat-developers-documentation-rhdh/actions/workflows/pages/pages-build-deployment[GitHub Pages build and deployment] workflow, managed by GitHub, detects pushes to the `gh-pages` branch and publishes the content to GitHub Pages. +See link:docs/github-publication-workflow.md[GitHub Publication Workflow Architecture] for the full technical reference. + ## Reviews All PRs are reviewed for technical accuracy by an SME and writing quality by another tech writer. 
diff --git a/build/scripts/README.md b/build/scripts/README.md new file mode 100644 index 00000000000..c4004364cb2 --- /dev/null +++ b/build/scripts/README.md @@ -0,0 +1,35 @@ +# Build Scripts + +Build, deploy, and content quality tooling for the RHDH documentation project. + +## Scripts + +| Script | Purpose | +|---|---| +| `build-ccutil.sh` | Wrapper that delegates to `build-orchestrator.js`. Used as a fallback in `pr.yml` on older branches and for local builds. | +| `build-orchestrator.js` | Parallel documentation build orchestrator. Runs ccutil title builds, lychee link validation, and CQA assessment. Produces `build-report.json`. Supports `--no-cqa` and `--no-lychee` flags to skip phases. | +| `deploy-gh-pages.sh` | Deploys build output to the `gh-pages` branch. Handles cleanup of stale PR/branch directories, index regeneration with release notes links, and retry with rebase on push conflicts. | +| `error-patterns.json` | Regex patterns for classifying ccutil build errors into structured messages with cause and fix fields. | +| `update-cqa-resources.sh` | Fetches upstream Red Hat style guide resources into `.claude/resources/`. | + +## CQA (`cqa/`) + +Content Quality Assessment framework with 19 checks (CQA-00a through CQA-17). + +```bash +node build/scripts/cqa/index.js titles/<title>/master.adoc # report +node build/scripts/cqa/index.js --fix titles/<title>/master.adoc # auto-fix +node build/scripts/cqa/index.js --check 14 titles/<title>/master.adoc # single check +node build/scripts/cqa/index.js --all # all titles +``` + +See `.claude/plugins/project-cqa/resources/cqa-spec.md` for the full specification. 
+ +## Workflows + +These scripts are called by GitHub Actions workflows in `.github/workflows/`: + +- **`build-asciidoc.yml`** (push to main/release) -- `build-orchestrator.js --no-cqa` + `deploy-gh-pages.sh` +- **`pr.yml`** (pull requests) -- `build-orchestrator.js` (or `build-ccutil.sh` on older branches) + `deploy-gh-pages.sh` + +See `docs/github-publication-workflow.md` for the full architecture documentation. diff --git a/build/scripts/build-cqa.sh b/build/scripts/build-cqa.sh deleted file mode 100755 index e1b101cd0f8..00000000000 --- a/build/scripts/build-cqa.sh +++ /dev/null @@ -1,15 +0,0 @@ -#!/usr/bin/env bash -# -# Copyright (c) Red Hat, Inc. -# This program and the accompanying materials are made -# available under the terms of the Eclipse Public License 2.0 -# which is available at https://www.eclipse.org/legal/epl-2.0/ -# -# SPDX-License-Identifier: EPL-2.0 -# -# Requires: Node.js - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -pushd "${SCRIPT_DIR}" >/dev/null || exit -node "cqa/index.js" --all # "$@" -popd >/dev/null || exit diff --git a/build/scripts/build-orchestrator.js b/build/scripts/build-orchestrator.js index 4c8ca1f7312..ce3a47d7b9b 100755 --- a/build/scripts/build-orchestrator.js +++ b/build/scripts/build-orchestrator.js @@ -9,6 +9,7 @@ * node build/scripts/build-orchestrator.js -b main * node build/scripts/build-orchestrator.js -b pr-123 --verbose * node build/scripts/build-orchestrator.js -b main --jobs 4 + * node build/scripts/build-orchestrator.js -b main --no-cqa --no-lychee */ import { existsSync, readFileSync, writeFileSync, mkdirSync, rmSync, readdirSync, renameSync, copyFileSync } from 'node:fs'; @@ -16,7 +17,6 @@ import { resolve, dirname, join } from 'node:path'; import { spawn } from 'node:child_process'; import { cpus } from 'node:os'; import { fileURLToPath } from 'node:url'; -import { get as httpsGet } from 'node:https'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); @@ 
-32,12 +32,14 @@ const SAFE_PATH = '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' // ── Argument parsing ───────────────────────────────────────────────────────── function parseArgs(argv) { - const args = { branch: 'main', verbose: false, jobs: cpus().length }; + const args = { branch: 'main', verbose: false, jobs: cpus().length, lychee: true, cqa: true }; for (let i = 2; i < argv.length; i++) { switch (argv[i]) { case '-b': args.branch = argv[++i]; break; case '--verbose': args.verbose = true; break; case '--jobs': args.jobs = Number.parseInt(argv[++i], 10); break; + case '--no-lychee': args.lychee = false; break; + case '--no-cqa': args.cqa = false; break; } } return args; @@ -437,51 +439,6 @@ function generateBranchIndex(branch, results, repoRoot) { writeFileSync(join(indexDir, 'index.html'), html); } -function fetchUrl(url) { - return new Promise((resolve, reject) => { - httpsGet(url, (res) => { - if (res.statusCode >= 300 && res.statusCode < 400 && res.headers.location) { - fetchUrl(res.headers.location).then(resolve, reject); - return; - } - if (res.statusCode !== 200) { - res.resume(); - reject(new Error(`HTTP ${res.statusCode}`)); - return; - } - let data = ''; - res.on('data', (chunk) => { data += chunk; }); - res.on('end', () => resolve(data)); - res.on('error', reject); - }).on('error', reject); - }); -} - -async function updateRootIndex(branch, repoRoot) { - const isPR = branch.startsWith('pr-'); - const indexFile = isPR ? 
'pulls.html' : 'index.html'; - const indexPath = join(repoRoot, 'titles-generated', indexFile); - const url = `${PAGES_BASE}/${indexFile}`; - - // Fetch existing index from GitHub Pages - try { - const data = await fetchUrl(url); - writeFileSync(indexPath, data); - } catch { - // If fetch fails, create a minimal file - writeFileSync(indexPath, '<html><body><ul>\n</ul></body></html>'); - } - - const content = readFileSync(indexPath, 'utf8'); - const link = `./${branch}/index.html`; - if (!content.includes(link)) { - console.log(`Building root index for ${branch} in titles-generated/${indexFile} ...`); - const entry = `<li><a href=${link}>${branch}</a></li>`; - const updated = content.replace('</ul>', `${entry}\n</ul>`); - writeFileSync(indexPath, updated); - } -} - // ── Summary output ─────────────────────────────────────────────────────────── function printFailedTitle(r) { @@ -653,31 +610,39 @@ async function main() { // Generate branch index HTML (only for passed titles) generateBranchIndex(args.branch, buildResults, repoRoot); - // Update root index - await updateRootIndex(args.branch, repoRoot); - // Run lychee link validation - console.log('\nRunning link validation (lychee)...'); - const lycheeResult = await runLychee(repoRoot, args.branch, args.verbose); - if (lycheeResult.errors.length === 0) { - lycheeResult.errors = classifyErrors(lycheeResult.output, patterns); + const skippedResult = { status: 'skipped', duration: 0, output: '', stats: { total: 0, successful: 0, errors: 0, excludes: 0, timeouts: 0 }, errors: [] }; + let lycheeResult; + if (args.lychee) { + console.log('\nRunning link validation (lychee)...'); + lycheeResult = await runLychee(repoRoot, args.branch, args.verbose); + if (lycheeResult.errors.length === 0) { + lycheeResult.errors = classifyErrors(lycheeResult.output, patterns); + } + } else { + console.log('\nSkipping link validation (--no-lychee)'); + lycheeResult = { ...skippedResult }; } // Run CQA content quality assessment - // Skip 
when CQA_RUNNING env is set (CQA-14 recursion guard) - const cqaResult = (process.env.CQA_RUNNING) - ? { status: 'skipped', duration: 0, output: '', stats: { total: 0, pass: 0, fail: 0 } } - : await (async () => { - console.log('\nRunning CQA content quality assessment...'); - return runCqa(repoRoot, args.verbose); - })(); + const skippedCqa = { status: 'skipped', duration: 0, output: '', stats: { total: 0, pass: 0, fail: 0 } }; + let cqaResult; + if (!args.cqa || process.env.CQA_RUNNING) { + if (!args.cqa) console.log('\nSkipping CQA (--no-cqa)'); + cqaResult = skippedCqa; + } else { + // Write preliminary report so CQA-14 can read lychee results without rebuilding + const pendingCqa = { status: 'pending', duration: 0, output: '', stats: { total: 0, pass: 0, fail: 0 } }; + writeReport(args.branch, buildResults, lycheeResult, pendingCqa, args.jobs, 0, repoRoot); + + console.log('\nRunning CQA content quality assessment...'); + process.env.CQA_RUNNING = '1'; + cqaResult = await runCqa(repoRoot, args.verbose); + delete process.env.CQA_RUNNING; + } const totalDuration = Math.round((Date.now() - totalStart) / 1000); - - // Print summary printSummary(buildResults, lycheeResult, cqaResult, patterns, totalDuration); - - // Write JSON report writeReport(args.branch, buildResults, lycheeResult, cqaResult, args.jobs, totalDuration, repoRoot); // Exit with error if any builds, lychee, or CQA failed diff --git a/build/scripts/build.sh b/build/scripts/build.sh deleted file mode 100755 index d0315b21ddd..00000000000 --- a/build/scripts/build.sh +++ /dev/null @@ -1,80 +0,0 @@ -#!/usr/bin/env bash -# -# Copyright (c) Red Hat, Inc. 
-# This program and the accompanying materials are made -# available under the terms of the Eclipse Public License 2.0 -# which is available at https://www.eclipse.org/legal/epl-2.0/ -# -# SPDX-License-Identifier: EPL-2.0 -# -# Utility script build html previews with referenced images -# Requires: asciidoctor - see https://docs.asciidoctor.org/asciidoctor/latest/install/linux-packaging/ -# input: titles/ -# output: titles-generated/ and titles-generated/$BRANCH/ - -# grep regex for title folders to exclude from processing below -EXCLUDED_TITLES="rhdh-plugins-reference" -BRANCH="main" - -while [[ "$#" -gt 0 ]]; do - case $1 in - '-b') BRANCH="$2"; shift 1;; - esac - shift 1 -done - -rm -fr titles-generated/; -mkdir -p titles-generated/"${BRANCH}"; -echo "<html><head><title>Red Hat Developer Hub Documentation Preview - ${BRANCH}" >> titles-generated/"${BRANCH}"/index.html - -# shellcheck disable=SC2143 -if [[ $BRANCH == "pr-"* ]]; then - # fetch the existing https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/index.html to add prs and branches - curl -sSL https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/pulls.html -o titles-generated/pulls.html - if [[ -z $(grep "./${BRANCH}/index.html" titles-generated/pulls.html) ]]; then - echo "Building root index for $BRANCH in titles-generated/pulls.html ..."; - echo "
<li><a href=./${BRANCH}/index.html>${BRANCH}</a></li>" >> titles-generated/pulls.html - fi -else - # fetch the existing https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/index.html to add prs and branches - curl -sSL https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/index.html -o titles-generated/index.html - if [[ -z $(grep "./${BRANCH}/index.html" titles-generated/index.html) ]]; then - echo "Building root index for $BRANCH in titles-generated/index.html ..."; - echo "
<li><a href=./${BRANCH}/index.html>${BRANCH}</a></li>" >> titles-generated/index.html - fi -fi diff --git a/build/scripts/cqa/checks/cqa-14-no-broken-links.js b/build/scripts/cqa/checks/cqa-14-no-broken-links.js index c30e32e673b..c39a7071c8d 100644 --- a/build/scripts/cqa/checks/cqa-14-no-broken-links.js +++ b/build/scripts/cqa/checks/cqa-14-no-broken-links.js @@ -88,18 +88,25 @@ function getLycheeIssues(root) { if (_lycheeIssuesCache !== null) return _lycheeIssuesCache; _lycheeIssuesCache = []; - try { - // Run build orchestrator (builds fresh HTML + runs lychee with remapping) - // Set CQA_RUNNING to prevent build-orchestrator from running CQA again (recursion) - execFileSync('node', ['build/scripts/build-orchestrator.js', '-b', 'main'], { // NOSONAR — fixed args, no user input - cwd: root, - stdio: 'pipe', - timeout: 600000, // 10 minutes - env: { ...process.env, CQA_RUNNING: '1' }, - }); - } catch { - // Build may exit non-zero if lychee finds broken links — that's expected + if (!process.env.CQA_RUNNING) { + // Standalone mode: build current state to get lychee results. + // Detect current branch for correct output directory naming and link remapping. 
+ let branch = 'main'; + try { + branch = execFileSync('git', ['rev-parse', '--abbrev-ref', 'HEAD'], { cwd: root, encoding: 'utf8' }).trim(); // NOSONAR + } catch { /* fall back to main */ } + try { + execFileSync('node', ['build/scripts/build-orchestrator.js', '-b', branch, '--no-cqa'], { // NOSONAR — fixed args, no user input + cwd: root, + stdio: 'pipe', + timeout: 600000, + env: { ...process.env, CQA_RUNNING: '1' }, + }); + } catch { + // Orchestrator exits non-zero when lychee finds broken links; the report is still written + } } + // When CQA_RUNNING is set: preliminary report already exists with lychee results // Read the build report const reportPath = join(root, 'build-report.json'); diff --git a/build/scripts/deploy-gh-pages.sh b/build/scripts/deploy-gh-pages.sh index aa8839e9388..48f06f3a5bf 100755 --- a/build/scripts/deploy-gh-pages.sh +++ b/build/scripts/deploy-gh-pages.sh @@ -7,8 +7,18 @@ # # SPDX-License-Identifier: EPL-2.0 # -# Deploy files to the gh-pages branch with retry on push rejection. -# Replaces peaceiris/actions-gh-pages to handle concurrent builds. +# Deploy build output (titles-generated/) to the gh-pages branch. +# +# Flow: +# 1. Create a temp git repo, fetch gh-pages (shallow) +# 2. Copy --publish-dir content into the working tree +# 3. For branch deploys: clean up stale PR and branch directories +# 4. Regenerate index.html (branch list) and pulls.html (PR list) +# 5. Commit everything (content + cleanup + indexes) and push +# 6. On push rejection: rebase and retry (max 3 attempts) +# +# Branch deploys clean up merged/closed PR dirs and deleted branch dirs. +# PR deploys only update content and pulls.html — no cleanup. 
# # Usage: deploy-gh-pages.sh [--message ] # @@ -16,6 +26,11 @@ set -euo pipefail +MAX_RETRIES=3 +RELEASE_NOTES_BASE="https://red-hat-developers-documentation.pages.redhat.com/red-hat-developer-hub-release-notes" + +# ── Parse arguments ────────────────────────────────────────────────────────── + PUBLISH_DIR="${1:?Usage: deploy-gh-pages.sh [--message ]}" shift @@ -28,73 +43,246 @@ while [[ $# -gt 0 ]]; do done PUBLISH_DIR="$(cd "$PUBLISH_DIR" && pwd)" +: "${GITHUB_TOKEN:?GITHUB_TOKEN is required}" +: "${GITHUB_REPOSITORY:?GITHUB_REPOSITORY is required}" -: "${GITHUB_TOKEN:?GITHUB_TOKEN is required (set by GitHub Actions)}" -: "${GITHUB_REPOSITORY:?GITHUB_REPOSITORY is required (set by GitHub Actions)}" +# Detect branch directory (first non-hidden top-level dir in publish dir) +BRANCH_DIR="" +for d in "$PUBLISH_DIR"/*/; do + [[ -d "$d" ]] || continue + name="$(basename "$d")" + [[ "$name" == .* ]] && continue + BRANCH_DIR="$name" + break +done + +if [[ -z "$BRANCH_DIR" ]]; then + echo "No top-level directory found in publish dir" >&2 + exit 1 +fi -# ── Diagnostics: log PUBLISH_DIR contents before deploying ── echo "PUBLISH_DIR: $PUBLISH_DIR" -echo "Top-level entries in PUBLISH_DIR:" -find "$PUBLISH_DIR" -maxdepth 1 -not -path "$PUBLISH_DIR" -printf '%f\n' +echo "Branch directory: $BRANCH_DIR" + +# ── Set up temp deploy repo ────────────────────────────────────────────────── -MAX_RETRIES=3 DEPLOY_DIR="$(mktemp -d)" trap 'rm -rf "$DEPLOY_DIR"' EXIT -cd "$DEPLOY_DIR" -git init -q -git config user.name "github-actions[bot]" -git config user.email "github-actions[bot]@users.noreply.github.com" -git remote add origin "https://x-access-token:${GITHUB_TOKEN}@github.com/${GITHUB_REPOSITORY}.git" - -# ── Fetch gh-pages and prepare working tree ── -fetch_output=$(git fetch origin gh-pages --depth=1 2>&1) && fetch_ok=true || fetch_ok=false -if [[ "$fetch_ok" == "true" ]]; then - git checkout -B gh-pages FETCH_HEAD -elif echo "$fetch_output" | grep -qi "not found\|couldn't 
find\|no such remote ref"; then - echo "gh-pages branch does not exist, creating orphan" - git checkout --orphan gh-pages - git rm -rf . 2>/dev/null || true -else - echo "ERROR: Failed to fetch gh-pages: $fetch_output" >&2 - exit 1 -fi +git -C "$DEPLOY_DIR" init -q +git -C "$DEPLOY_DIR" config user.name "github-actions[bot]" +git -C "$DEPLOY_DIR" config user.email "github-actions[bot]@users.noreply.github.com" -# ── Copy content and stage ── -cp -a "$PUBLISH_DIR"/. . +REPO_URL="https://github.com/${GITHUB_REPOSITORY}.git" +git -C "$DEPLOY_DIR" remote add origin "$REPO_URL" +# Auth via http.extraHeader keeps the token out of the remote URL (avoids leaking in logs) +CREDENTIALS="$(printf 'x-access-token:%s' "$GITHUB_TOKEN" | base64 -w0)" +git -C "$DEPLOY_DIR" config "http.${REPO_URL}.extraHeader" "Authorization: Basic ${CREDENTIALS}" -# Force-add only the files from PUBLISH_DIR (bypasses .gitignore) -publish_entries=() -while IFS= read -r entry; do - publish_entries+=("$entry") -done < <(find "$PUBLISH_DIR" -maxdepth 1 -not -path "$PUBLISH_DIR" -printf '%f\n') +# ── Core functions ─────────────────────────────────────────────────────────── -echo "Staging ${#publish_entries[@]} entries from PUBLISH_DIR..." -git add --force -- "${publish_entries[@]}" +fetch_gh_pages() { + git -C "$DEPLOY_DIR" fetch origin gh-pages --depth=1 + git -C "$DEPLOY_DIR" checkout -B gh-pages FETCH_HEAD + return 0 +} -if git diff --cached --quiet; then - echo "No changes to deploy" - exit 0 -fi +apply_content() { + cp -a "$PUBLISH_DIR"/. 
"$DEPLOY_DIR"/ + return 0 +} + +# ── Cleanup (branch deploys only) ──────────────────────────────────────────── + +get_pr_state() { + local pr_number="$1" + local owner="${GITHUB_REPOSITORY%%/*}" + local repo="${GITHUB_REPOSITORY##*/}" + local response status merged + + response="$(curl -sf \ + -H "Authorization: Bearer $GITHUB_TOKEN" \ + -H "Accept: application/vnd.github+json" \ + "https://api.github.com/repos/${owner}/${repo}/pulls/${pr_number}" 2>/dev/null)" || { echo "unknown"; return; } + + status="$(printf '%s' "$response" | grep -o '"state": *"[^"]*"' | head -1 | grep -o '"[^"]*"$' | tr -d '"')" + merged="$(printf '%s' "$response" | grep -o '"merged": *[a-z]*' | head -1 | grep -o '[a-z]*$')" + + if [[ "$status" == "closed" ]]; then + [[ "$merged" == "true" ]] && echo "merged" || echo "closed" + else + echo "${status:-unknown}" + fi +} + +cleanup() { + # PR cleanup: remove directories for merged/closed PRs + for d in "$DEPLOY_DIR"/pr-*/; do + [[ -d "$d" ]] || continue + local dir_name pr_number state + dir_name="$(basename "$d")" + pr_number="${dir_name#pr-}" + [[ "$pr_number" =~ ^[0-9]+$ ]] || continue + + state="$(get_pr_state "$pr_number")" + if [[ "$state" == "merged" || "$state" == "closed" ]]; then + echo "Removing $dir_name (PR $state)" + rm -rf "$d" + fi + done + + # Branch cleanup: remove directories for deleted remote branches + local remote_branches + remote_branches="$(git -C "$DEPLOY_DIR" ls-remote --heads origin 2>/dev/null | awk '{print $2}' | sed 's|refs/heads/||')" + + for d in "$DEPLOY_DIR"/*/; do + [[ -d "$d" ]] || continue + local dir_name + dir_name="$(basename "$d")" + [[ "$dir_name" == pr-* || "$dir_name" == .* ]] && continue + + if ! 
grep -qx "$dir_name" <<< "$remote_branches"; then + echo "Removing $dir_name (branch no longer exists on remote)" + rm -rf "$d" + fi + done + return 0 +} -echo "Staged files:" -git diff --cached --stat +# ── Index generation ───────────────────────────────────────────────────────── -git commit -q -m "$COMMIT_MSG" +# See also: getReleaseNotesLink() in build-orchestrator.js (per-title links) +release_notes_url() { + local branch="$1" + if [[ "$branch" == "main" ]]; then + echo "${RELEASE_NOTES_BASE}/main/index.html" + elif [[ "$branch" =~ ^release-([0-9]+)\.([0-9]+)$ ]]; then + local major="${BASH_REMATCH[1]}" minor="${BASH_REMATCH[2]}" + if (( major > 1 || minor >= 9 )); then + echo "${RELEASE_NOTES_BASE}/release-${major}-${minor}/index.html" + fi + fi + return 0 +} + +regenerate_indexes() { + local branch_items="" pr_items="" + + for d in "$DEPLOY_DIR"/*/; do + [[ -d "$d" ]] || continue + local name + name="$(basename "$d")" + [[ "$name" == .* ]] && continue + + if [[ "$name" == pr-* ]]; then + pr_items+="
  • ${name}
  • "$'\n' + else + local entry="
  • ${name}" + local rn_url + rn_url="$(release_notes_url "$name")" + [[ -n "$rn_url" ]] && entry+=" | Release Notes" + branch_items+="${entry}
  • "$'\n' + fi + done + + # Branch deploys regenerate both; PR deploys regenerate pulls.html only + if [[ "$BRANCH_DIR" != pr-* ]]; then + cat > "$DEPLOY_DIR/index.html" <RHDH Documentation - Documentation Branches + +
      +${branch_items}
    + +EOF + fi + + cat > "$DEPLOY_DIR/pulls.html" <RHDH Documentation - PR Previews + +
      +${pr_items}
    + +EOF + return 0 +} + +# ── Stage, commit, push ───────────────────────────────────────────────────── + +stage_and_commit() { + regenerate_indexes + + # Force-add publish entries and indexes (.gitignore may exclude them) + local to_stage=() + for e in "$PUBLISH_DIR"/*/; do + [[ -d "$e" ]] || continue + local name + name="$(basename "$e")" + [[ "$name" == .* ]] && continue + to_stage+=("$name") + done + [[ -f "$DEPLOY_DIR/index.html" ]] && to_stage+=("index.html") + [[ -f "$DEPLOY_DIR/pulls.html" ]] && to_stage+=("pulls.html") + + git -C "$DEPLOY_DIR" add --force -- "${to_stage[@]}" + git -C "$DEPLOY_DIR" add -A + + if git -C "$DEPLOY_DIR" diff --cached --quiet; then + echo "No changes to deploy" + return 1 + fi + + echo "Staged files:" + git -C "$DEPLOY_DIR" diff --cached --stat || true + git -C "$DEPLOY_DIR" commit -q -m "$COMMIT_MSG" +} + +# On push rejection (concurrent deploy), try rebase first. +# If rebase conflicts (e.g. both touched index.html), reset and rebuild. +try_rebase_and_push() { + local attempt="$1" + if git -C "$DEPLOY_DIR" pull --rebase origin gh-pages 2>/dev/null; then + if git -C "$DEPLOY_DIR" push origin gh-pages 2>/dev/null; then + echo "Deployed successfully (attempt $attempt, after rebase)" + return 0 + fi + echo "Push failed after rebase, will rebuild" + else + echo "Rebase conflict — resetting to remote" + git -C "$DEPLOY_DIR" rebase --abort 2>/dev/null || true + fi + fetch_gh_pages + return 1 +} + +# ── Main ───────────────────────────────────────────────────────────────────── + +fetch_gh_pages +apply_content + +# Cleanup runs once before retries (avoids redundant API calls) +if [[ "$BRANCH_DIR" != pr-* ]]; then + cleanup +fi -# ── Push with pull-before-push retry ── for attempt in $(seq 1 "$MAX_RETRIES"); do - if git push origin gh-pages; then + if [[ "$attempt" -gt 1 ]]; then + apply_content + fi + + if ! 
stage_and_commit; then + exit 0 + fi + + if git -C "$DEPLOY_DIR" push origin gh-pages 2>/dev/null; then echo "Deployed successfully (attempt $attempt)" exit 0 fi echo "Push rejected (attempt $attempt/$MAX_RETRIES)" - if [[ $attempt -lt $MAX_RETRIES ]]; then - echo "Pulling remote changes before retrying..." - git pull --rebase origin gh-pages + if [[ "$attempt" -lt "$MAX_RETRIES" ]]; then + try_rebase_and_push "$attempt" && exit 0 fi done -echo "ERROR: Deploy failed after $MAX_RETRIES attempts" +echo "Deploy failed after $MAX_RETRIES attempts" >&2 exit 1 diff --git a/build/scripts/lint-scripts.sh b/build/scripts/lint-scripts.sh deleted file mode 100755 index 62e25b91257..00000000000 --- a/build/scripts/lint-scripts.sh +++ /dev/null @@ -1,35 +0,0 @@ -#!/usr/bin/env bash -# -# Copyright (c) Red Hat, Inc. -# This program and the accompanying materials are made -# available under the terms of the Eclipse Public License 2.0 -# which is available at https://www.eclipse.org/legal/epl-2.0/ -# -# SPDX-License-Identifier: EPL-2.0 -# -# lint-scripts.sh — Run shellcheck on CQA scripts -# -# Usage: ./build/scripts/lint-scripts.sh - -set -e - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" - -echo "## Lint: CQA scripts" -echo "" - -errors=0 -for script in "$SCRIPT_DIR"/*.sh; do - if ! shellcheck -S warning -e SC2034,SC2329,SC1091 "$script" 2>/dev/null; then - errors=$((errors + 1)) - fi -done - -echo "" -if [[ $errors -eq 0 ]]; then - echo "All scripts pass shellcheck." -else - echo "$errors script(s) have shellcheck warnings." 
-fi - -[[ $errors -gt 0 ]] && exit 1 || exit 0 diff --git a/docs/github-publication-workflow.md b/docs/github-publication-workflow.md new file mode 100644 index 00000000000..23d089fd7ec --- /dev/null +++ b/docs/github-publication-workflow.md @@ -0,0 +1,213 @@ +# GitHub Publication Workflow Architecture + +## Overview + +The RHDH documentation project uses GitHub Actions to build AsciiDoc documentation and deploy HTML previews to GitHub Pages via the `gh-pages` branch. Two workflows handle this: + +- **`build-asciidoc.yml`** -- triggered on branch pushes, builds and deploys production documentation. +- **`pr.yml`** -- triggered on pull requests, builds preview HTML and posts a PR comment with a preview link and CQA checklist. + +Both workflows produce HTML output under `titles-generated/<branch>/`, then push the result to the `gh-pages` branch using `deploy-gh-pages.sh`. + +## Triggers and Branch Matrix + +| Workflow | Event | Branches | Build Script | +|---|---|---|---| +| `build-asciidoc.yml` | `push` | `main`, `release-1.**`, `rhdh-1.**`, `1.**.x` | `build-orchestrator.js --no-cqa` | +| `pr.yml` | `pull_request_target` | `main`, `release-1.**`, `release-2.**` | release-1.9+/main: `build-orchestrator.js`; release-1.8: `build-ccutil.sh` (base branch scripts) | + +The `build-asciidoc.yml` workflow calls `build-orchestrator.js --no-cqa` (CQA results aren't surfaced in branch builds, only in PR comments). The `pr.yml` workflow detects whether `build-orchestrator.js` exists on the base branch and uses it when available (release-1.9+, main), falling back to `build-ccutil.sh` on older branches (release-1.8). The orchestrator wraps ccutil with parallel execution, lychee link validation, CQA assessment, and JSON reporting. + +## Security Model + +The `pr.yml` workflow uses `pull_request_target` instead of `pull_request` so it can access repository secrets (needed for `RHDH_BOT_TOKEN` to push to `gh-pages` and post PR comments). 
This event runs workflow code from the base branch, not the PR, which avoids exfiltration of secrets from untrusted PRs. + +### Two-checkout pattern + +To separate trusted code from untrusted content: + +1. **Trusted checkout** -- checks out `build/scripts` from the base branch (`sparse-checkout: build/scripts`) into `trusted-scripts/`. +2. **Content checkout** -- checks out the full PR head into `pr-content/`. +3. **Merge** -- the workflow replaces `pr-content/build/scripts` with `trusted-scripts/build/scripts` via `rsync`, then runs the build from `pr-content/`. + +Build scripts are always sourced from the base branch, never from the PR. This prevents a malicious PR from modifying build scripts to exfiltrate secrets. + +### Authorization gate + +The workflow enforces team-based authorization before running the build: + +1. `check-commit-author` -- uses a GitHub App token to check if the PR author is a member of the `rhdh` team in the `redhat-developer` organization. +2. `authorize` -- selects the `internal` or `external` environment: + - **Internal**: PR author is in the `rhdh` team, or the PR is from the same repository (not a fork). Runs immediately. + - **External**: fork PRs from non-team members. The `external` environment requires manual approval from the `rhdh-content` team before the build proceeds. +3. `adoc_build` -- depends on `authorize`, so it only runs after the gate passes. + +## Build Pipeline + +### build-orchestrator.js + +The orchestrator replaces the sequential `build-ccutil.sh` with parallel title builds, structured error reporting, and a JSON report. + +**Phases:** + +1. **Title discovery** -- scans `titles/` for directories containing `master.adoc`, excluding `rhdh-plugins-reference`. +2. **Parallel builds** -- runs `podman run ... ccutil compile` for each title, limited by a semaphore (`--jobs`, defaults to CPU count). Each title produces HTML under `titles-generated/<branch>/<title>/`. +3. 
**Image copy** -- parses each generated `index.html` to find image references and copies them into the output directory. +4. **Branch index** -- generates `titles-generated/<branch>/index.html` listing all successfully built titles, with an optional release notes link. +5. **Lychee link validation** -- runs `lychee` against `titles-generated/` with cross-title link remapping (rewrites `docs.redhat.com` links to local file paths). Broken links are traced back to `.adoc` source files via `grep`. +6. **Preliminary report** -- writes `build-report.json` with lychee results and CQA status "pending". This allows CQA-14 to read lychee results without triggering a rebuild. +7. **CQA assessment** -- sets `CQA_RUNNING=1` in `process.env`, then runs `node build/scripts/cqa/index.js --all`. The env var propagates to CQA-14, which skips its internal orchestrator call and reads the preliminary report instead. +8. **Final report** -- overwrites `build-report.json` with completed CQA results. + +**Error classification:** the orchestrator loads `build/scripts/error-patterns.json`, which maps regex patterns to structured error messages with `cause` and `fix` fields. These appear in the JSON report and PR comment. + +**CQA-14 recursion guard:** CQA-14 (lychee link validation check) can trigger the orchestrator internally. To prevent infinite recursion, the orchestrator sets `CQA_RUNNING=1` when invoking CQA, so CQA-14 reads existing lychee results from the report instead of triggering a full rebuild. + +**CLI usage:** + +```bash +node build/scripts/build-orchestrator.js -b <branch> +node build/scripts/build-orchestrator.js -b pr-123 --verbose +node build/scripts/build-orchestrator.js -b main --jobs 4 +node build/scripts/build-orchestrator.js -b main --no-cqa --no-lychee +``` + +The `-b` flag determines the output directory name under `titles-generated/`. 
`--no-cqa` and `--no-lychee` skip the CQA and lychee phases respectively (`build-asciidoc.yml` passes `--no-cqa` because CQA results aren't surfaced in branch builds). 
The orchestrator exits with code 1 if any enabled phase fails. + +## Deploy Pipeline + +### deploy-gh-pages.sh + +Handles deployment of built content to the `gh-pages` branch, including cleanup and index regeneration in a single commit. + +**Sequence:** + +1. Detect the branch directory from `<publish_dir>` (single top-level directory, e.g., `main/`, `pr-123/`). +2. Create a temporary git repo with `github-actions[bot]` identity. +3. Fetch `gh-pages` (shallow, depth=1). +4. Copy `<publish_dir>` contents into the working tree. +5. For branch deploys: run cleanup (see Cleanup section below). +6. Regenerate indexes from current directories on `gh-pages` (see below). +7. Stage all changes (content + cleanup deletions + indexes), commit, and push. + +**Index regeneration:** rebuilds HTML indexes from directories present on `gh-pages`: +- `index.html` -- lists all non-`pr-*` directories with optional release notes links (for `release-1.9+` and `main`). +- `pulls.html` -- lists all `pr-*` directories. + +Branch deploys regenerate both indexes. PR deploys regenerate `pulls.html` only. + +**Retry logic:** on push rejection, the script attempts `git pull --rebase`. If rebase succeeds, it pushes immediately. If rebase conflicts, it aborts, re-fetches `gh-pages`, re-applies content and cleanup, and retries. Maximum 3 attempts. + +**Invocation:** + +```bash +# Branch deploy +bash build/scripts/deploy-gh-pages.sh ./titles-generated --message "Deploy main" + +# PR deploy (from pr.yml, using trusted scripts) +bash trusted-scripts/build/scripts/deploy-gh-pages.sh ./pr-content/titles-generated --message "Deploy PR 123 preview" +``` + +### Branch vs PR deploys + +- **Branch deploys** (`build-asciidoc.yml`): deploy content + cleanup stale PRs/branches + regenerate both `index.html` and `pulls.html` → single commit. +- **PR deploys** (`pr.yml`): deploy content under `pr-<N>/` + regenerate `pulls.html` only → single commit. No cleanup runs. 
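
The split between branch and PR deploys can be sketched as a small planning function. This is a minimal illustration in Node.js, not the logic in `deploy-gh-pages.sh`: the function names (`hasReleaseNotesLink`, `planIndexes`) and the returned object shape are hypothetical, and reading "`release-1.9+`" as "`release-X.Y` with version 1.9 or later" is an assumption.

```javascript
// Hypothetical sketch: the names and data shapes below are illustrative
// assumptions, not taken from deploy-gh-pages.sh.

// Assumption: "release-1.9+" means release-X.Y with version >= 1.9.
// Suffixed names such as "release-1.9-post-cqa" get no link in this sketch.
function hasReleaseNotesLink(branch) {
  if (branch === "main") return true;
  const match = /^release-(\d+)\.(\d+)$/.exec(branch);
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > 1 || (major === 1 && minor >= 9);
}

// Decide which indexes to regenerate and what each one lists.
// `dirs` holds the top-level directory names on gh-pages.
// `deployType` is "branch" or "pr".
function planIndexes(dirs, deployType) {
  const prs = dirs.filter((d) => d.startsWith("pr-"));
  const branches = dirs.filter((d) => !d.startsWith("pr-"));

  // Both deploy types regenerate pulls.html.
  const plan = { "pulls.html": prs };

  // Only branch deploys also regenerate index.html.
  if (deployType === "branch") {
    plan["index.html"] = branches.map((name) => ({
      name,
      releaseNotes: hasReleaseNotesLink(name),
    }));
  }
  return plan;
}

const dirs = ["main", "release-1.8", "release-1.10", "pr-123", "pr-9"];
console.log(JSON.stringify(planIndexes(dirs, "branch"), null, 2));
console.log(JSON.stringify(planIndexes(dirs, "pr"), null, 2));
```

Comparing major and minor versions as numbers rather than strings keeps `release-1.10` ordered after `release-1.9`, which a plain string comparison would get wrong.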
+ +## gh-pages Branch Structure + +``` +gh-pages/ +|-- index.html # Links to branch builds + release notes +|-- pulls.html # Links to PR preview builds +|-- main/ # Main branch build +| |-- index.html # Per-branch title listing +| +-- <title>/ # Individual title HTML +| +-- index.html +|-- release-1.9/ # Release branch build +|-- release-1.8/ # Legacy release branch ++-- pr-123/ # PR preview build + |-- index.html + +-- <title>/ +``` + +**Preview URL pattern:** + +``` +https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/<branch-or-pr>/ +``` + +For PR previews: + +``` +https://redhat-developer.github.io/red-hat-developers-documentation-rhdh/pr-<N>/ +``` + +## PR Preview Lifecycle + +1. PR opened, synchronized, reopened, or marked ready for review -- `pr.yml` triggers. +2. Authorization gate checks team membership. Fork PRs from non-team members require manual approval via the `external` environment. +3. Trusted build scripts are checked out from the base branch. PR content is checked out separately. +4. Build scripts from the base replace `pr-content/build/scripts`. The orchestrator (or `build-ccutil.sh` on older branches) runs with `-b pr-<N>`. +5. If HTML was successfully generated (checked via `build-report.json`), `deploy-gh-pages.sh` pushes the output to `gh-pages` under `pr-<N>/`. +6. A consolidated PR comment is posted (or updated) with: + - Build status (passed/failed) with title counts and duration. + - Preview link (marked stale if title build failed). + - Build error details with classified causes and fixes. + - CQA checklist with pass/fail counts (when available). + - Link to full CI logs. +7. Old standalone CQA comments (from the previous two-comment format) are cleaned up. +8. When the PR is merged or closed, the next branch deploy cleans up the `pr-<N>/` directory from `gh-pages`. + +**Concurrency:** the workflow uses `concurrency` groups keyed on the PR number. 
If a new push arrives while a build is in progress, the in-progress run is cancelled. + +## Cleanup + +Cleanup is integrated into `deploy-gh-pages.sh` and runs during branch deploys only, not during PR deploys. It executes before index regeneration so indexes reflect the cleaned-up state. Cleanup, content deployment, and index regeneration are committed together in a single commit. + +### PR cleanup + +1. List `pr-*` directories on `gh-pages`. +2. For each, query the GitHub API (`GET /repos/{owner}/{repo}/pulls/{number}`). +3. If the PR is merged or closed, remove the `pr-<N>/` directory. + +### Branch cleanup + +1. List non-`pr-*` directories on `gh-pages`. +2. For each, run `git ls-remote --heads origin <branch>`. +3. If the remote branch no longer exists, remove the directory. This cleans up directories for deleted branches (e.g., `release-1.9-post-cqa`). + +## Local Development + +### Build all titles + +```bash +node build/scripts/build-orchestrator.js -b main +``` + +Requires Podman. Builds all titles in parallel, runs lychee link validation, runs CQA, and writes `build-report.json`. + +### Run CQA standalone + +```bash +# All checks on a single title +node build/scripts/cqa/index.js titles/<title>/master.adoc + +# Auto-fix issues +node build/scripts/cqa/index.js --fix titles/<title>/master.adoc + +# Run a specific check +node build/scripts/cqa/index.js --check NN titles/<title>/master.adoc + +# All checks on all titles +node build/scripts/cqa/index.js --all +``` + +CQA-14 (lychee link validation) in standalone mode runs the orchestrator internally. It sets `CQA_RUNNING=1` to prevent recursion -- the orchestrator skips CQA when this variable is set, so CQA-14 reads the lychee results from the existing `build-report.json` instead of triggering another full build. + +### Legacy build + +```bash +build/scripts/build-ccutil.sh -b <branch> +``` + +Used on `release-1.8` and as a fallback on branches where `build-orchestrator.js` does not exist. 
Runs title builds sequentially without lychee or CQA.
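
The fallback between the orchestrator and the legacy script could be expressed as a file-existence check. The following is a rough sketch under stated assumptions, not the selection logic used by the workflows: `selectBuildCommand` is a hypothetical helper, and the injectable `exists` parameter is there only so that the function can be exercised without a real checkout.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: returns the build command (as an argv array) for a
// checkout rooted at `repoRoot`. Prefers the orchestrator and falls back to
// the legacy script when build-orchestrator.js is absent.
function selectBuildCommand(repoRoot, branch, exists = fs.existsSync) {
  const orchestrator = path.join(repoRoot, "build", "scripts", "build-orchestrator.js");
  if (exists(orchestrator)) {
    return ["node", "build/scripts/build-orchestrator.js", "-b", branch];
  }
  // Legacy path: sequential title builds, no lychee, no CQA.
  return ["build/scripts/build-ccutil.sh", "-b", branch];
}

// Example: print the command that would run for the current directory.
console.log(selectBuildCommand(process.cwd(), "main").join(" "));
```

Returning an argv array rather than a single string avoids shell quoting issues if the command is later passed to `child_process.spawn`.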