This document describes the end-to-end process for cutting and shipping a new version of NeMo Platform.
Who can release? Any team member can trigger the workflow. The `release-stable` GitHub Actions environment requires approval from a member of the `nmp_devops` team before the workflow proceeds.
The examples in this document use `0.1.2` as the version being released.
trigger release-stable.yaml with source_sha + version → nmp_devops approval → workflow tags source_sha → Platform-Deploy publishes to PyPI
Artifacts published on a stable release:
Nightly builds go to pypi.nvidia.com (NVIDIA's internal/public PyPI mirror), not public PyPI.
Release and nightly wheel versions are resolved at build time. The release workflow runs .github/scripts/stamp_sdk_version.py, then passes the resolved version to Hatch through UV_DYNAMIC_VERSIONING_BYPASS.
Dynamic versioning is intentionally limited to packages that need release/nightly wheel metadata:
- `packages/nemo_platform` (`nemo-platform`)
- `packages/nemo_platform_plugin` (`nemo-platform-plugin`)
- `sdk/python/nemo-platform` (`nemo-platform-sdk`, consumed by the released wrappers and SDK tooling)
All other first-party workspace packages use static stub versions, normally 0.0.0, because they are implementation packages rather than independently released artifacts. Do not add nmp-dynamic-versioning to another package unless that package is added to the release catalog or otherwise needs published wheel metadata.
packages/nmp_build_tools centralizes the Hatch version source and its defaults, but that package itself is also an internal stub-version package. The OpenAPI specs are schema inputs for SDK generation and intentionally keep a fixed info.version: 0.0.0; package release versions should not be copied into the specs.
Pick the full 40-character commit SHA on main that should be released, plus the SemVer core version to publish, for example 0.1.2. The stable workflow creates the release tag at source_sha, and the wheel build receives the package version from the workflow input.
If the API surface changed since the last SDK update, regenerate the OpenAPI spec and SDKs before releasing:

    make update-sdk

This runs `make refresh-openapi` (regenerates `openapi/openapi.yaml` and plugin specs) and then syncs the Python and web SDKs via Stainless. It requires `STAINLESS_API_KEY` to be set; see `sdk/README.md` for setup instructions. The generated OpenAPI specs should keep `info.version: 0.0.0`.
To find the right SHA:

    # --pretty=oneline prints the full 40-character SHA (--oneline would abbreviate it).
    git log --pretty=oneline main | head -5
    # Pick the commit to release and copy its full 40-character SHA.

Navigate to the `release-stable.yaml` workflow and click Run workflow.
| Input | Required | Description |
|---|---|---|
| `source_sha` | Yes | The full 40-character commit SHA to release from (must be on `main`). |
| `version` | Yes | SemVer core version string to release, e.g. `0.1.2`. This becomes the stable git tag and wheel version. |
| `release_date` | No | `YYYY-MM-DD`. Provide only on the first run for a given version; leave blank on reruns. |
| `release_scope` | No | `all` (default) releases every catalog SDK and container. Use `sdks`, `containers`, or `custom` for narrower releases. |
| `sdk_ids` | No | Comma-separated SDK IDs for `release_scope: custom`; must exist in `release/assets.yaml`. |
| `container_ids` | No | Comma-separated container IDs for `release_scope: custom`; must exist in `release/assets.yaml`. |
The workflow runs from the main branch by default. The source_sha must be reachable from that branch.
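Before triggering the workflow, it can help to sanity-check the inputs locally. The following is a minimal sketch of such a check, assuming lowercase hex SHAs and the input formats listed in the table; it is not the validation the workflow itself performs, and the function name `validate_inputs` is hypothetical. It cannot confirm that the SHA is reachable from `main`.

```python
import re
from datetime import date

# Formats taken from the input table; lowercase-only SHAs are an assumption.
SHA_RE = re.compile(r"[0-9a-f]{40}")
SEMVER_CORE_RE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")
DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
SCOPES = {"all", "sdks", "containers", "custom"}


def validate_inputs(source_sha: str, version: str,
                    release_date: str = "", release_scope: str = "all") -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not SHA_RE.fullmatch(source_sha):
        errors.append("source_sha must be a full 40-character hex commit SHA")
    if not SEMVER_CORE_RE.fullmatch(version):
        errors.append("version must be a SemVer core version such as 0.1.2")
    if release_date:  # optional; left blank on reruns
        try:
            if not DATE_RE.fullmatch(release_date):
                raise ValueError(release_date)
            date.fromisoformat(release_date)  # rejects impossible dates
        except ValueError:
            errors.append("release_date must be a real date in YYYY-MM-DD form")
    if release_scope not in SCOPES:
        errors.append("release_scope must be one of: " + ", ".join(sorted(SCOPES)))
    return errors


for problem in validate_inputs("abc1234", "v0.1.2"):
    print(problem)
```

An abbreviated SHA or a `v`-prefixed version is reported here, which catches the two most likely copy-paste mistakes before an approver is asked to review the run.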
What the workflow does:
- Validates inputs and previews the release.
- Pauses at the `approve-stable-release` gate; a member of the `nmp_devops` team must approve in the GitHub environment UI.
- Creates and pushes a git tag (e.g. `0.1.2`) at `source_sha`.
- Builds Python wheels for each SDK in `release/assets.yaml` using `.github/actions/build-nemo-platform-wheel`.
- Assembles a release bundle with checksums and metadata.
- Dispatches a `release-bundle-produced` event to the Platform-Deploy repository (`CI_DISPATCH_REPO` secret), which handles the actual PyPI publish.
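To make the "bundle with checksums" step more concrete, here is a small sketch of how per-wheel checksums could be computed. The document does not state the hash algorithm or the bundle layout, so SHA-256, the `*.whl` glob, and the helper names are all assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(dist_dir: Path) -> list[str]:
    """Return '<hex digest>  <file name>' lines for every wheel in dist_dir.

    The two-space separator mirrors the common `sha256sum` output format;
    the real bundle may record checksums differently.
    """
    return [f"{sha256_of(p)}  {p.name}" for p in sorted(dist_dir.glob("*.whl"))]
```

Calling `checksum_lines(Path("dist"))` on a directory of built wheels yields lines that `sha256sum --check` can verify, which is one simple way for a consumer to confirm the bundle arrived intact.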
If the PyPI publishing service is returning 5xx errors, the publish step in Platform-Deploy will fail. Wait for the service to recover and re-run the workflow with the same `source_sha` and `version`. The stable tag is already reserved, so re-running is safe.
Once the workflow completes, verify the release landed correctly:

    uv tool upgrade nemo-platform
    nemo --version
    # Expected: nemo version <version>

Also check:
- pypi.org/project/nemo-platform — version and description updated.
- pypi.org/project/nemo-platform-plugin — version updated.
- GitHub: a tag (e.g. `0.1.2`) exists on the release commit.
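The PyPI checks above can also be scripted. The sketch below uses PyPI's public JSON endpoint (`https://pypi.org/pypi/<project>/json`) and looks for the version among the listed releases; the function names are hypothetical, and this is only one possible way to automate the check.

```python
import json
import urllib.request


def fetch_project(project: str) -> dict:
    """Download the PyPI JSON metadata for a project (requires network access)."""
    url = f"https://pypi.org/pypi/{project}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def is_published(payload: dict, version: str) -> bool:
    """Return True if `version` is among the releases listed in the payload."""
    return version in payload.get("releases", {})


# Example usage (left commented out because it needs network access):
# for project in ("nemo-platform", "nemo-platform-plugin"):
#     print(project, is_published(fetch_project(project), "0.1.2"))
```

Keeping the network call separate from the membership check makes the logic easy to test offline with a saved payload.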
The container: list in release/assets.yaml declares which container
images are eligible for release publishing. The bundle workflow records the
selected containers as container-typed entries in release-manifest.json,
and the release consumer stages those images after the SDK publish, reading
this list from this repository at the release ref. Eligibility is therefore
version-pinned: re-staging an old tag publishes the container set declared at
that commit.
release_scope controls what a release includes (default all):
| Scope | Includes |
|---|---|
| `all` | every catalog SDK + every catalog container (default) |
| `sdks` | every catalog SDK, no containers |
| `containers` | every catalog container, no SDKs |
| `custom` | exactly the comma-separated `sdk_ids` + `container_ids` (either may be empty) |
custom enables single-artifact or arbitrary-subset releases (for example a
patch release of one container via release_scope: custom,
container_ids: nmp-automodel-tasks); containers releases the whole
container set with no SDK wheels.
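The scope rules above can be summarized in a few lines of code. This is a sketch under stated assumptions, not the workflow's actual implementation: the function names are hypothetical, the catalog is passed in as two plain lists instead of being read from `release/assets.yaml`, and the SDK IDs in the example are illustrative.

```python
def split_ids(raw: str) -> list[str]:
    """Split a comma-separated input, dropping blanks and surrounding spaces."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def resolve_scope(scope: str, catalog_sdks: list[str], catalog_containers: list[str],
                  sdk_ids: str = "", container_ids: str = "") -> tuple[list[str], list[str]]:
    """Return (sdks, containers) selected by a release_scope value."""
    if scope == "all":
        return list(catalog_sdks), list(catalog_containers)
    if scope == "sdks":
        return list(catalog_sdks), []
    if scope == "containers":
        return [], list(catalog_containers)
    if scope == "custom":
        sdks, containers = split_ids(sdk_ids), split_ids(container_ids)
        # Custom IDs must exist in the catalog.
        unknown = [i for i in sdks if i not in catalog_sdks]
        unknown += [i for i in containers if i not in catalog_containers]
        if unknown:
            raise ValueError(f"IDs not in the release catalog: {unknown}")
        return sdks, containers
    raise ValueError(f"unknown release_scope: {scope!r}")


# Illustrative catalog; only nmp-automodel-tasks is named in this document.
SDKS = ["nemo-platform", "nemo-platform-plugin"]
CONTAINERS = ["nmp-automodel-tasks", "example-container"]
print(resolve_scope("custom", SDKS, CONTAINERS, container_ids="nmp-automodel-tasks"))
```

The final call mirrors the single-container patch release described above and selects no SDK wheels.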
Adding an image here also requires a catalog metadata entry on the consumer side. Images are built into the dev registry tagged with this repository's commit SHA on every merge to main; release SHAs that predate that build trigger need a manual image build first.
Nightly builds run automatically at 20:00 PT and publish to pypi.nvidia.com. They use the HEAD of main and version strings like 0.1.3.dev20260101120000. No action required from the team.
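For readers wondering how a string like `0.1.3.dev20260101120000` is shaped, the sketch below composes one. The document does not say how the nightly workflow picks the base version or which clock it uses, so the rule shown here (next patch after the last stable version, plus a `.devYYYYMMDDHHMMSS` suffix) is purely an assumption inferred from the example, and `nightly_version` is a hypothetical name.

```python
from datetime import datetime


def nightly_version(last_stable: str, built_at: datetime) -> str:
    """Compose a dev-release version string from a stable version and a timestamp.

    Assumption: nightlies target the next patch release. The real workflow
    may derive the base version differently.
    """
    major, minor, patch = (int(part) for part in last_stable.split("."))
    return f"{major}.{minor}.{patch + 1}.dev{built_at:%Y%m%d%H%M%S}"


print(nightly_version("0.1.2", datetime(2026, 1, 1, 12, 0, 0)))
# prints 0.1.3.dev20260101120000
```

A version in this shape is a valid developmental release under the Python version specifiers standard, so installers order it before the final `0.1.3` release.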
To trigger a nightly manually: release-nightly.yaml → Run workflow (no inputs required). Leave send_notifications enabled for real reruns; disable it only for quiet smoke/ad-hoc runs.