feature: prime train routes full_finetune TOMLs to hosted endpoint by JannikSt · Pull Request #592 · PrimeIntellect-ai/prime

JannikSt · 2026-05-02T23:30:48Z

Note

Medium Risk
Changes how expensive training jobs are launched and deleted (hosted vs LoRA dispatch) and handles per-run credentials; mistakes could send the wrong workload or delete path, though behavior is covered by tests and explicit confirmation for dispatch.

Overview
prime train now detects full-parameter (prime-rl) TOMLs before the LoRA schema runs and dispatches them to POST /v1/training/runs via a new HostedTrainingClient, while unchanged LoRA configs still use the shared RFT path.

Detection covers type = "full_finetune" and [deployment] signals (including multi-node num_train_nodes / num_infer_nodes, which previously fell through to LoRA). The hosted path ships the full TOML as config, applies team_id and secrets from env files, strips local env_file paths, and does not print the minted API token in CLI/JSON output.

RLRun gains optional kind and makes LoRA-centric fields optional so dedicated full-FT runs list/get cleanly; UI shows placeholders when those fields are absent.

prime train delete tries the hosted full-FT delete endpoint first and falls back to LoRA on typed NotFoundError (404 from the HTTP client).

prime train logs adds trainer / inference components and routes most env-server reads through unified /rft/runs/.../logs with component + env_name; name/N qualifiers still use the legacy env-server-logs endpoint.

^{Reviewed by Cursor Bugbot for commit bc982f4. Bugbot is set up for automated code reviews on this repo. Configure here.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b88b5f7d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

JannikSt · 2026-06-01T12:50:19Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b07791f22c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…o hosted endpoint `prime train <toml>` stays the single entry point. When the TOML carries `type = "full_finetune"` (or a `[hosted]` block, or a `[deployment]` block matching prime-rl's qwen30b_math/rl.toml shape), the CLI routes to the new public API at /api/v1/training/runs instead of the LoRA shared-cluster path. Backwards compatible: configs without these markers run through the existing flow unchanged. * api/training.py: new HostedTrainingClient + build_payload_from_toml (whitelist-maps prime-rl example fields onto the API payload). * api/rl.py: surface `kind` on RLRun so `prime train delete` can route to the right endpoint based on run kind. * commands/rl.py: peek the TOML before strict RLConfig parse; on full-FT hand off to _dispatch_full_finetune_run with shared env-file/secrets plumbing. Delete looks up kind and dispatches via HostedTrainingClient for DEDICATED_FULL_FT runs. Tested end-to-end against local backend on rft-freyr cluster: dispatch + status mirroring + completion + delete all clean.

Drop the friction of looking up a cluster cuid for the common single- cluster setup. Backend now auto-picks the first uncordoned PrimeCluster when the field is omitted, so `prime train backend/examples/training/ reverse_text.toml` is zero-config.

…eyRef The platform materialises the per-run RFT API token into the per-run k8s Secret on dispatch and the chart binds it to the orchestrator pod's PRIME_API_KEY env var. The token already lives where prime-monitor needs it — surfacing it on stdout just makes it easy to leak into shared shell history or CI logs.

- Drop prime_cluster_id from CLI plumbing entirely. Backend auto-picks the first uncordoned PrimeCluster; the CLI never threads a cluster id, removing a footgun (mistargeting a config to the wrong cluster). Drops the [hosted] discriminator path too — type = 'full_finetune' or a [deployment] block remain the only triggers. - codex P1 / cursor: --output json on the full-FT path no longer short-circuits with a {would_dispatch} preview. Mirrors the LoRA path's 'create then format' contract — automation that pipes the JSON to grab run_id now actually dispatches the run. - codex P2: env_file (deprecated, singular) is loaded BEFORE env_files (canonical, plural) so env_files wins on key collision. Matches the LoRA path's documented precedence.

build_payload_from_toml used to whitelist ~12 individual fields; the backend rebuilt a minimal TOML from them, dropping anything outside the whitelist (custom optim schedules, eval configs, custom scheduler params, …). E2E prime-rl runs that depended on those knobs silently diverged from `uv run rl @ rl.toml` behaviour. Now: ship the whole TOML as 'config' (companion to platform PR #1824 faa934d56). The backend's build_values takes the config dict directly so the same TOML works locally with prime-rl and remote-dispatched through prime-cli with no fork in behaviour.

Unrelated to the full-FT training payload change; accidentally picked up by 'git add -A'. Restoring inference.py to its 4b9be8f state. The streaming improvements will land in a sibling PR off main.

Cursor caught: when rl_client.get_run fails (e.g. pydantic ValidationError because a DEDICATED_FULL_FT row doesn't carry the LoRA-required RLRun fields), the prior except APIError block silently set kind=None and routed delete to the LoRA endpoint. The hosted helm release + namespace would stay live with no signal back. Restructure: try hosted_client.delete_run first; on HTTP 404 the backend's kind gate told us 'not a DEDICATED_FULL_FT', so we fall through to the LoRA path. Removes the get_run discriminator roundtrip entirely — and any pydantic surprise it could have raised.

Cursor flagged: distinguishing 404 by 'HTTP 404' not in str(e) is fragile — depends on the message format never changing AND on no unrelated error body coincidentally containing the substring. - Add NotFoundError(APIError) subclass; APIClient raises it for 404 responses (sync + async paths). - HostedTrainingClient.create_run / delete_run no longer catch Exception and rewrap into a generic APIError — typed APIError subclasses propagate so callers can branch by class. - prime train delete uses except NotFoundError as the 'not a hosted run, try LoRA' fallback signal.

Cursor caught: `if output != "json" and not yes` meant a researcher piping JSON for scripting would have full-FT runs auto-launched without ack. The LoRA path always prompts unless --yes regardless of output format. Match that contract — gate confirmation purely on --yes.

`_dispatch_full_finetune_run` was building the payload without a teamId, so every dispatched run landed on the caller's personal account even when `prime config view` showed a team selected. The LoRA path (rl.py:1216) already passes `app_config.team_id` — match that behaviour for the full-FT path. Backend's `CreateDedicatedRunRequest.team_id` is Optional, so the prior omission was silently wrong instead of returning a 400.

Backend's /api/v1/rft/runs/{run_id}/logs now accepts component + env_name params (dedicated full-FT). Surface them through the CLI: prime train logs <run_id> -c trainer prime train logs <run_id> -c inference prime train logs <run_id> -c env-server --env <name> Legacy --env <name>/<idx> still routes through the env-server-logs endpoint (shared-RFT pods, cluster_id-backed lookup). Dedicated env-server (slug, no slash) goes through the unified /logs route. Per-rank --pod-index intentionally not exposed yet: the chart's torchrun --local-ranks-filter=0 already collapses in-pod rank fan-out to rank 0 stdout, and Loki's pod-label indexing in this tenant doesn't actually filter the prime-job-* streams — per-pod inspection on multi-node runs is kubectl + the PVC log files for now.

JannikSt · 2026-06-02T10:14:23Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5c0e2f79a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-02T10:18:09Z

+
+    Cluster targeting is backend-side (auto-pick first uncordoned).
+    """
+    payload: Dict[str, Any] = {"config": cfg}


Strip CLI-only env file keys before dispatch

When a full-finetune TOML uses the supported env_file/env_files keys, _dispatch_full_finetune_run reads those files into secrets but then passes the original raw_cfg here unchanged, so the backend receives config.env_file(s) as part of the prime-rl config. Those local paths will not exist in the hosted pod (and can also be rejected as unknown prime-rl settings), so runs that rely on config-level env files can fail even though the secrets were already extracted; copy the config and remove these CLI-only keys before putting it under config.

Useful? React with 👍 / 👎.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit a41e4b8. Configure here.}

cursor · 2026-06-02T10:30:19Z

+        name=name,
+        team_id=team_id,
+        wandb_api_key=secrets.get("WANDB_API_KEY"),
+        hf_token=secrets.get("HF_TOKEN"),


Full-FT path silently drops non-WANDB/HF secrets

Medium Severity

The _dispatch_full_finetune_run function collects all secrets from -e/--env-file via collect_env_vars, but only forwards WANDB_API_KEY and HF_TOKEN to build_payload_from_toml. Any other user-provided secrets (e.g., -e CUSTOM_API_KEY=value) are silently discarded without warning. In contrast, the LoRA path passes all secrets via secrets=secrets. A user relying on custom env vars reaching the training container would experience silent runtime failures.

Additional Locations (1)

packages/prime/src/prime_cli/api/training.py#L81-L93

^{Reviewed by Cursor Bugbot for commit a41e4b8. Configure here.}

…r_nodes

chatgpt-codex-connector Bot reviewed May 2, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 2, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

cursor Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py

JannikSt requested review from JohannesHa, burnpiro, d42me, kcoopermiller and willccbb as code owners May 19, 2026 00:05

willccbb previously approved these changes May 19, 2026

View reviewed changes

JannikSt dismissed willccbb’s stale review via 2fe14da May 19, 2026 07:31

cursor Bot reviewed May 19, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/commands/rl.py

chatgpt-codex-connector Bot reviewed Jun 1, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/api/rl.py

Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated

JannikSt added 10 commits June 1, 2026 14:54

fix: move inference reasoning-content streaming changes off this PR

3bc37f9

Unrelated to the full-FT training payload change; accidentally picked up by 'git add -A'. Restoring inference.py to its 4b9be8f state. The streaming improvements will land in a sibling PR off main.

JannikSt force-pushed the feature/train-hosted-full-finetune branch from b07791f to 8aafec7 Compare June 1, 2026 13:00

JannikSt force-pushed the feature/train-hosted-full-finetune branch from 8aafec7 to feea570 Compare June 1, 2026 13:02

cursor Bot reviewed Jun 1, 2026

View reviewed changes

Comment thread packages/prime/src/prime_cli/api/rl.py

Comment thread packages/prime/src/prime_cli/commands/rl.py

fix: pod_index=0 reaches backend; drop dead _parse_env_qualifier

e5c0e2f

chatgpt-codex-connector Bot reviewed Jun 2, 2026

View reviewed changes

fix: strip env_file/env_files before shipping full-FT config to backend

a41e4b8

cursor Bot reviewed Jun 2, 2026

View reviewed changes

fix: detect multi-node full-FT configs via num_train_nodes / num_infe…

bc982f4

…r_nodes

Conversation

JannikSt commented May 2, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

JannikSt commented Jun 1, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

JannikSt commented Jun 2, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Jun 2, 2026

Choose a reason for hiding this comment

Full-FT path silently drops non-WANDB/HF secrets

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JannikSt commented May 2, 2026 •

edited by cursor Bot

Loading