docs: restructure site, add changelogs, improve content #6098
Redesign the documentation site structure and content for clarity:

Site structure:
- Top nav: Home, User Guide, Blog Posts, Resources
- User Guide: vLLM, vLLM-Omni, Ray (each multi-page with sidebar nav)
- Resources: Reference (Image Access, Available Images, Region Availability, Support Policy, Release Notifications) + Security

Home page:
- New landing page with DLC intro, quick start example, use-case cards
- Replaces the previous README copy

Framework pages (vLLM, vLLM-Omni, Ray):
- Split monolithic pages into Overview, Supported Models, Deployment (EC2/EKS/SageMaker), Configuration, and Changelog sub-pages
- Add changelogs with real release content from PRs
- Remove auto-generated vllm-server release notes (covered by changelog)
- Update SageMaker docs with standard-supervisor features

Reference:
- Separate Region Availability into its own page
- Trim Image Access to essentials
- Simplify Release Notifications

Tests:
- Update test_generate_available_images.py for removed Region Availability section from available_images template
User Guide additions:
- Add Base image guide (overview + changelog) under docs/base/
- Add PyTorch image guide (overview + EC2/SageMaker deployment + changelog) under docs/pytorch/
- Wire both into top-level User Guide nav
- Expand vLLM/vLLM-Omni/Ray content (What's Included, API endpoints, port columns, model coverage labels, fact-check fixes)

Release-notes pipeline removal:
- Delete docs/releasenotes/ tree (output had no nav entry, dead surface)
- Remove generate_release_notes() and helpers from docs/src/generate.py
- Remove release-notes templates, table config, tests
- Strip announcements/packages from 91 data YAMLs
- Trim scripts/autocurrency/docs-pr.sh to drop docker introspection and announcement/packages emission; update autocurrency-tracker.yml

Net: User Guide changelogs are the single source of truth; ~2100 lines removed.
- CI: switch docs-test.yml to `mkdocs build --strict`. Catches broken
internal links, missing nav targets, broken anchors, and orphan-page
warnings that the existing test_links.py tests don't cover (anchors
in particular).
- Fix pre-existing strict-mode warnings:
  - Add /README.md, /tutorials/README.md, /DEVELOPMENT.md to mkdocs.yaml
    exclude_docs (these conflicted with index.md or were not in nav).
    Anchored with leading / so per-tutorial README files still build.
  - Remove broken `available_images.md#tensorflow-training` anchor link
    from the home page (TensorFlow Training section was previously
    removed from the available_images table).
- Sidebar: darken section titles ("User Guide", "vLLM", etc.) to pure
black/white in light/dark mode for readability.
- Home page: switch use-case grid to 3 columns (3 + 2) to accommodate
the new "Build Your Own Image" Base card.
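
To illustrate the kind of anchor breakage that strict mode is described as catching (and that plain link-target checks miss), here is a minimal sketch of an anchor check. It is not how MkDocs or `test_links.py` works internally: the `slugify` rule is a simplified approximation, link targets are treated as plain keys rather than resolved relative paths, and the page contents are hypothetical.

```python
import re

# Markdown ATX headings, e.g. "## PyTorch Training"
HEADING_RE = re.compile(r"^#{1,6}\s+(.*?)\s*$", re.MULTILINE)
# Inline links of the form [text](page.md#anchor)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)#([^)\s]+)\)")


def slugify(heading: str) -> str:
    """Approximate a heading's anchor: lowercase, drop punctuation, hyphenate spaces."""
    text = re.sub(r"[^\w\s-]", "", heading.lower())
    return re.sub(r"\s+", "-", text.strip())


def broken_anchors(pages: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (source page, target page, anchor) for every anchor with no matching heading."""
    anchors = {
        name: {slugify(h) for h in HEADING_RE.findall(body)}
        for name, body in pages.items()
    }
    broken = []
    for name, body in pages.items():
        for target, anchor in LINK_RE.findall(body):
            if anchor not in anchors.get(target, set()):
                broken.append((name, target, anchor))
    return broken


# Hypothetical page contents, loosely modelled on the home-page fix above.
pages = {
    "index.md": (
        "# Home\n"
        "[PyTorch](available_images.md#pytorch-training)\n"
        "[TensorFlow](available_images.md#tensorflow-training)\n"
    ),
    "available_images.md": "# Available Images\n\n## PyTorch Training\n",
}

print(broken_anchors(pages))
```

The target file exists in both links, so a check on `.md` targets alone would pass; only the anchor comparison flags the second one.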
- vLLM Inference → LLM Serving using vLLM DLC
- vLLM-Omni Inference → Multimodal Serving using vLLM-Omni DLC
- Ray Serve Inference → ML Serving using Ray DLC
- PyTorch Training → ML Training using PyTorch DLC
- Base Inference → Build Custom Images using Base DLC

Sidebar nav labels (vLLM / vLLM-Omni / Ray / PyTorch / Base) are unchanged; only the H1 title on each guide's overview page is updated.
… rendering

- Each guide overview (vLLM, vLLM-Omni, Ray, PyTorch, Base) now points to its respective ECR Public Gallery page next to the existing Image Access reference.
- Ray Example Deployments table: drop inline-code wrapping on path links so they render as plain links (the code-block background made the text hard to read against the link color).
…-and-content

# Conflicts:
#	scripts/autocurrency/docs-pr.sh
Eren-Jeager123
left a comment
autocurrency/docspr part lgtm
Per review feedback: while the DLC image doesn't read RAYSERVE_NUM_GPUS, the mnist-direct-app example's deployment.py uses it to parameterize ray_actor_options.num_gpus. Restore the env var in the example, with a comment clarifying it's a user-side convention rather than a DLC contract. Also extend docs/ray/deployment/ec2.md Direct App Import section to call out this pattern: env vars consumed by the user's deployment.py are valid; they're just not defined by the DLC.
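
A minimal sketch of the user-side convention described above, assuming a helper name (`actor_options_from_env`) and a default of `0` GPUs that are illustrative only; the actual mnist-direct-app `deployment.py` is not shown in this PR conversation. Ray is not imported, so the snippet runs anywhere.

```python
import os


def actor_options_from_env(env=None) -> dict:
    """Build ray_actor_options from RAYSERVE_NUM_GPUS.

    The variable is read by the user's own deployment code; it is a
    user-side convention, not something the DLC image defines or reads.
    """
    env = os.environ if env is None else env
    # Assumed default: no GPU when the variable is unset.
    num_gpus = float(env.get("RAYSERVE_NUM_GPUS", "0"))
    return {"num_gpus": num_gpus}


# In a real deployment.py the dict would be handed to Ray Serve, e.g.
#   @serve.deployment(ray_actor_options=actor_options_from_env())
print(actor_options_from_env({"RAYSERVE_NUM_GPUS": "1"}))  # {'num_gpus': 1.0}
```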
sirutBuasai
left a comment
One small thing: the release notes used to list the main package versions (Python, CUDA, PyTorch, etc.). The new changelogs don't show which version or commit those packages come from, only the source commit of the main package (e.g. vllm).
Another small nit: we should use the variables from global.yml (the EC2/ECS/EKS/SageMaker ones, for example) as much as possible, so the docs stay consistent and future changes are easy to make.
Base images use nvidia/cuda:*-{base,runtime,devel}-amzn2023, not the
-cudnn flavors. cuDNN is not installed in v1 or v2.
- Use global.yml variables ({{ ec2_short }}, {{ eks_short }}, {{ sm_short }},
{{ sagemaker }}) for AWS service names in guide pages and docs/index.md,
user_guide/index.md instead of hardcoded "EC2", "EKS", "SageMaker",
"Amazon SageMaker AI" so future renames update everywhere
- Add "Bundled versions" line per release in docs/vllm/changelog/index.md
(CUDA, Python, FlashInfer, DeepEP) so the changelog conveys per-release
framework state, matching the existing PyTorch/Ray changelog format
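
To show why the `{{ ... }}` variables make renames a one-line change, here is a small sketch of the substitution idea. It is not the docs build itself: the `GLOBALS` mapping is an assumption inferred from the hardcoded strings listed above, the real values live in global.yml, and `render` is a hypothetical stand-in for whatever templating the site uses.

```python
import re

# Assumed mapping; the authoritative values are defined in global.yml.
GLOBALS = {
    "ec2_short": "EC2",
    "eks_short": "EKS",
    "sm_short": "SageMaker",
    "sagemaker": "Amazon SageMaker AI",
}

# Matches {{ name }} with optional inner whitespace.
VAR_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(text: str, variables: dict) -> str:
    """Replace each {{ name }} with its value; fail loudly on unknown names."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined doc variable: {name}")
        return variables[name]

    return VAR_RE.sub(substitute, text)


print(render("Deploy on {{ ec2_short }}, {{ eks_short }}, or {{ sm_short }}.", GLOBALS))
# Deploy on EC2, EKS, or SageMaker.
```

Raising on an undefined name, rather than leaving the placeholder in the page, is a design choice for this sketch so that typos in variable names surface at build time.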
@sirutBuasai thanks for the review. Both points addressed in
1. Per-release version info on changelogs. PyTorch and Ray changelogs already include per-release framework versions; only the vLLM changelog was light. Added a
2. Use
Image tag URLs (
One follow-up thought: it's worth weighing whether this level of templating justifies the effort across the doc tree. Writing
Summary
Major documentation site restructure for clarity, navigation, and content quality.
Site Structure
Home Page
Framework Pages
runai-streamer, OpenAI-compatible API endpoints
Release-Notes Pipeline Removal
The auto-generated `docs/releasenotes/` pages were not wired into the site nav and had no inbound links, so they were effectively dead surface. The manually maintained User Guide changelogs now own image history.
- Delete `docs/releasenotes/` tree
- Remove `generate_release_notes()` and helpers from `docs/src/generate.py`
- Strip `announcements:` and `packages:` from 91 data YAMLs (no longer consumed)
- Trim `scripts/autocurrency/docs-pr.sh` to drop docker-image introspection and the dead `announcements:`/`packages:` emit
- Remove `docs_packages:` from `.github/config/autocurrency-tracker.yml`
- Update `test_generate.py` to drop the release-notes mock
Reference
Fact-Check Findings Fixed
- `RAYSERVE_NUM_GPUS` was a phantom variable in Ray deployment docs (no script reads it); removed
- … (`CA_REPOSITORY_ARN`) was incorrectly implied to work on Ray EC2 image; corrected to SageMaker-only
CI Hardening
- `mkdocs build` → `mkdocs build --strict` in `docs-test.yml`. PRs that introduce broken internal links, missing nav targets, broken anchors, or page conflicts will now fail CI (existing `test_links.py` covered `.md` link targets but not anchors).
- Add `/README.md`, `/tutorials/README.md`, `/DEVELOPMENT.md` to `mkdocs.yaml` `exclude_docs` so contributor docs don't conflict with the published site.
Test plan
- `mkdocs build --strict` passes locally
- … (`pytest test/docs/`)
- `pre-commit run --all-files` passes (except `actionlint`, which needs network in the local env)
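
On the `exclude_docs` entries under CI Hardening: the leading `/` matters because the patterns follow gitignore-style rules, where a pattern without a slash matches at any depth. The sketch below models only those two cases for literal patterns (no wildcards, no negation) and is not the matcher MkDocs uses; `is_excluded` and the `tutorials/vllm/README.md` path are hypothetical.

```python
from pathlib import PurePosixPath


def is_excluded(path: str, patterns: list[str]) -> bool:
    """Simplified gitignore-style match for literal patterns.

    "/name" matches only that exact path relative to the docs root;
    "name" (no slash) matches a file with that name at any depth.
    """
    for pattern in patterns:
        if pattern.startswith("/"):
            if path == pattern[1:]:
                return True
        elif PurePosixPath(path).name == pattern:
            return True
    return False


ANCHORED = ["/README.md", "/tutorials/README.md", "/DEVELOPMENT.md"]

print(is_excluded("README.md", ANCHORED))                     # True
print(is_excluded("tutorials/README.md", ANCHORED))           # True
print(is_excluded("tutorials/vllm/README.md", ANCHORED))      # False: still built
print(is_excluded("tutorials/vllm/README.md", ["README.md"])) # True: unanchored
```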