Commit b579c9a

style: apply /style-guide pass to serverless-rl (#2667)
## Summary

This PR applies the `/style-guide` skill (Google Developer Style Guide + CoreWeave conventions) to the `serverless-rl` section. The run was fully automated; only style-level edits were made.

## Files edited

- `serverless-rl/api-reference.mdx`
- `serverless-rl/prerequisites.mdx`
- `serverless-rl/sft.mdx`
- `serverless-rl/sft/usage.mdx`
- `serverless-rl/usage-limits.mdx`
- `serverless-rl/usage.mdx`
- `serverless-rl/use-trained-models.mdx`

## Recommendations for technical review

### Prerequisites

- Confirm whether a brief link to `/serverless-rl/prerequisites` should appear at the top of the Authentication section in `api-reference.mdx`, not only under Related resources.
- In `prerequisites.mdx`, confirm whether the `ApiKeyCreate` snippet covers W&B account sign-up, or whether a separate sign-up link/instruction is needed so the page is self-contained.
- In `sft.mdx`, verify whether ART-specific prerequisites (Python version, OpenPipe account, ART package install) belong on the page or in the linked prerequisites doc.
- In `sft/usage.mdx`, consider adding a Prerequisites section listing the required W&B account, access to Serverless Inference, Models, Artifacts, optional Weave, and the ART library so the page is self-contained.
- In `usage.mdx`, confirm whether tutorial-style inline prerequisites should be present per CoreWeave tutorial guidelines, rather than only linking out.
- In `use-trained-models.mdx`, consider a labeled Prerequisites section covering a completed training run, a known saved step, and any IAM/permissions scope required on the W&B API key; also introduce `WANDB_API_KEY` before its first use in the cURL example.
- Across pages, confirm that key terms (Serverless RL, LoRA, LoRA adapter, checkpoint, base model, trajectory, reward function, post-train, ART framework) are defined on linked prerequisite pages or add short inline definitions/links.

### Verification steps

- In `api-reference.mdx`, consider a short cURL example against `GET /v1/health` or `GET /v1/system-check` so readers can confirm credentials work.
- In `prerequisites.mdx`, consider a verification cue confirming the API key works and the project was created before "Next steps."
- In `sft/usage.mdx`, confirm whether a brief "what you'll have accomplished" / expected-outcome sentence is appropriate after training completes (e.g., a deployed LoRA available via Serverless Inference and stored as an artifact).
- In `usage.mdx`, consider adding an in-doc "what success looks like" statement after the linked ART quickstart.
- In `use-trained-models.mdx`, add a short "expected result" note after each example (response shape, status, or how to confirm the model is serving).

### Technical accuracy

- **`api-reference.mdx`**: `/v1/chat/completions` and `/v1/chat/completions/` both link to "Create Chat Completion" (slugs `create-chat-completion-1` and `create-chat-completion`); likely a duplicate from the OpenAPI spec. Needs SME input on whether one should be hidden.
- **`api-reference.mdx`**: Auto-generated bullet titles render as "Create Sft Training Job" / "Create Rl Training Job"; should be "Create SFT Training Job" / "Create RL Training Job" (upstream OpenAPI fix).
- **`api-reference.mdx`**: The bullet for `POST /v1/preview/models/{model_id}/log` is just "Log", which is ambiguous; the OpenAPI summary should be clarified.
- **`api-reference.mdx`**: Endpoint paths mix `/v1/...` (health, chat) with `/v1/preview/...` (models, training-jobs); consider whether "preview" status should be called out.
- **`prerequisites.mdx`**: Confirm the link `https://docs.wandb.ai/models/track/project-page` is the canonical "Projects guide" URL (path suggests "Project page," which may not match the displayed link text). Also confirm `/serverless-rl/api-reference` resolves to the intended destination.
- **`prerequisites.mdx` (line 6)**: "Serverless RL **currently** supports the following foundation models for training." Confirm intent: a list as of publication (remove "currently") or a live catalog (remove as redundant with the auto-generation note).
- **`sft.mdx`**: Confirm `https://art.openpipe.ai/fundamentals/sft-training` is the canonical, current Serverless SFT entry point, and that `https://art.openpipe.ai/getting-started/about` is the correct ART "about" landing.
- **`sft/usage.mdx`**: "Save them locally or to a third party for backup": "third party" is vague; name supported options or link to a reference.
- **`usage-limits.mdx`**: Verify "Training is free during the public preview period" (line 20) is still accurate and plan how the page should be updated when the preview ends.
- **`usage-limits.mdx`**: Confirm the 2,000 / 6,000 concurrency numbers (line 30) are current, and that `support@wandb.com` is still the right contact for limit-increase requests.
- **`usage-limits.mdx`**: Verify the "5 GB / ~30 LoRAs" model-storage estimate (line 24) is current.
- **`usage.mdx`**: Verify the Google Colab notebook URL (`openpipe/art-notebooks/.../2048/2048.ipynb`) still resolves to the intended 2048 example. Confirm whether "Serverless RL" should remain as a product name or get a formal first-use expansion ("Serverless reinforcement learning (RL)").
- **`use-trained-models.mdx`**: Slash inconsistency in `wandb-artifact://` scheme: the cURL example (line 41) uses `wandb-artifact://[ENTITY]/...` (two slashes), while the schema (line 16) and Python example (line 66) use `wandb-artifact:///` (three slashes). Confirm the accepted form and standardize.
- **`use-trained-models.mdx`**: Step prefix ambiguity: schema uses `:step25`, Python uses `:step100`, cURL uses `[STEP]` without clarifying whether `step` is part of the value. Confirm and document.
- **`use-trained-models.mdx` (line 21)**: "Your W&B entity's (team) name" reads awkwardly; confirm whether "entity" and "team" are synonymous in this context and rephrase consistently.

### Missing content

- **`api-reference.mdx`**: No mention of rate limits, pagination, error response format, versioning policy, or SDKs/client libraries; the "OpenAI-compatible" claim for chat completions isn't elaborated (which OpenAI client configurations work and any deviations).
- **`prerequisites.mdx`**: Clarify whether `mailto:support@wandb.ai` is for (a) requesting unlisted models, (b) requesting access/onboarding for listed models, or (c) both. Consider a brief legend explaining "Type," "Context Window," and "Parameters" (especially `Active-Total` MoE notation). The HTML comment about auto-generation is invisible to readers; consider a visible "Last updated" or "Source" note.
- **`prerequisites.mdx` ("Create a project in W&B")**: Consider noting whether the project name must match anything referenced later in ART or API configuration, and whether the section should be an explicit single-step procedure or merged with the API-key step.
- **`prerequisites.mdx` ("Next steps")**: Only two items; consider additional follow-on resources (tutorial, examples repo, pricing/limits).
- **`sft.mdx`**: Indicate which models or model families are supported via Serverless SFT and any scale/quota considerations distinct from Serverless RL; add a brief statement of how Serverless SFT relates to Serverless RL for readers landing from search.
- **`sft/usage.mdx`**: Consider a high-level overview of the workflow (what the user does in ART, what comes back into W&B) so readers know what they're committing to before leaving the page. Define or link "checkpoint" on first use.
- **`usage-limits.mdx`**: Geographic restrictions bullet (line 32) links to the Terms of Service without indicating whether an in-docs list of supported/unsupported regions exists; add one if available. Consider whether `429` concurrency-limit behavior warrants a `<Note>` or `<Warning>` callout rather than inline prose.
- **`usage.mdx`**: No "Resources" or "Related" section; if more references are added later (Colab notebooks, API reference, blog posts), consider reintroducing it, provided the inline links in the intro are restructured to avoid duplication. "Post-train" and "ART framework" are referenced without on-page definition (ART is linked; "post-train" may warrant a glossary or concept link).
- **`use-trained-models.mdx`**: No mention of how to find the entity, project, model name, or step values from the W&B UI or API; link to the relevant reference. Consider adding a parent H2 (such as "Make an inference request") with the two examples as H3 children, plus procedural subheadings ("Construct the endpoint", "Send a request") for scannability. Standardize the Python example to bracket-style uppercase placeholders so both examples teach the same naming convention.

## How to review

- Each file's changes are style edits only. Compare side-by-side and flag any that change technical meaning.
- Approve and merge to accept the edits, or close to reject them.
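
The `wandb-artifact://` slash inconsistency flagged under Technical accuracy can be checked mechanically before an SME standardizes it. The following is a rough sketch only: it assumes the three-slash form is the target (the PR leaves that question open), and `find_scheme_mismatches` is a hypothetical helper name, not part of any W&B tooling.

```python
import re

# Matches the scheme plus however many slashes follow it.
SCHEME = re.compile(r"wandb-artifact:(/+)")


def find_scheme_mismatches(text, expected_slashes=3):
    """Return (line_number, matched_prefix) for each scheme occurrence
    whose slash count differs from expected_slashes.

    Assumption: expected_slashes=3 mirrors the schema and Python example;
    the accepted form still needs confirmation.
    """
    mismatches = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SCHEME.finditer(line):
            if len(match.group(1)) != expected_slashes:
                mismatches.append((lineno, match.group(0)))
    return mismatches
```

Running this over the contents of `use-trained-models.mdx` would list the cURL line that uses two slashes, which gives reviewers a concrete location to fix once the accepted form is known.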
1 parent d033e71 commit b579c9a

7 files changed

Lines changed: 66 additions & 49 deletions

serverless-rl/api-reference.mdx

Lines changed: 11 additions & 7 deletions
@@ -1,52 +1,56 @@
11
---
22
title: API overview
33
description: "Browse the Serverless RL API endpoints for chat completions, models, training jobs, and health checks with authentication details."
4+
keywords: ["OpenAI-compatible", "create training job", "list model checkpoints", "SFT training job", "reinforcement learning API"]
45
---
56

67
<Note>
7-
The Serverless RL API provides endpoints for managing and interacting with training jobs, including serverless reinforcement learning (RL) and supervised fine-tuning (SFT). The API is OpenAI-compatible for chat completions.
8+
The Serverless RL API provides endpoints for managing training jobs, including serverless reinforcement learning (RL) and supervised fine-tuning (SFT). The API is OpenAI-compatible for chat completions.
89
</Note>
910

11+
Use this reference to look up authentication requirements, the base URL, and the available endpoints for chat completions, models, training jobs, and health checks.
12+
1013
## Authentication
1114

1215
All API requests require authentication using your W&B API key. Create an API key at [wandb.ai/settings](https://wandb.ai/settings).
1316

1417
Include your API key in the `Authorization` header:
1518

16-
```
19+
```http
1720
Authorization: Bearer YOUR_API_KEY
1821
```
1922

2023
## Base URL
2124

22-
```
25+
```text
2326
https://api.training.wandb.ai/v1
2427
```
2528

2629
## Available endpoints
2730

31+
The following sections describe the available endpoints, grouped by resource type.
2832

29-
### chat-completions
33+
### Chat completions
3034

3135
- **[POST /v1/chat/completions](https://docs.wandb.ai/serverless-rl/api-reference/chat-completions/create-chat-completion-1)** - Create Chat Completion
3236
- **[POST /v1/chat/completions/](https://docs.wandb.ai/serverless-rl/api-reference/chat-completions/create-chat-completion)** - Create Chat Completion
3337

34-
### models
38+
### Models
3539

3640
- **[POST /v1/preview/models](https://docs.wandb.ai/serverless-rl/api-reference/models/create-model)** - Create Model
3741
- **[DELETE /v1/preview/models/{model_id}](https://docs.wandb.ai/serverless-rl/api-reference/models/delete-model)** - Delete Model
3842
- **[DELETE /v1/preview/models/{model_id}/checkpoints](https://docs.wandb.ai/serverless-rl/api-reference/models/delete-model-checkpoints)** - Delete Model Checkpoints
3943
- **[GET /v1/preview/models/{model_id}/checkpoints](https://docs.wandb.ai/serverless-rl/api-reference/models/list-model-checkpoints)** - List Model Checkpoints
4044
- **[POST /v1/preview/models/{model_id}/log](https://docs.wandb.ai/serverless-rl/api-reference/models/log)** - Log
4145

42-
### training-jobs
46+
### Training jobs
4347

4448
- **[POST /v1/preview/sft-training-jobs](https://docs.wandb.ai/serverless-rl/api-reference/training-jobs/create-sft-training-job)** - Create Sft Training Job
4549
- **[POST /v1/preview/training-jobs](https://docs.wandb.ai/serverless-rl/api-reference/training-jobs/create-rl-training-job)** - Create Rl Training Job
4650
- **[GET /v1/preview/training-jobs/{training_job_id}](https://docs.wandb.ai/serverless-rl/api-reference/training-jobs/get-training-job)** - Get Training Job
4751
- **[GET /v1/preview/training-jobs/{training_job_id}/events](https://docs.wandb.ai/serverless-rl/api-reference/training-jobs/get-training-job-events)** - Get Training Job Events
4852

49-
### health
53+
### Health
5054

5155
- **[GET /v1/health](https://docs.wandb.ai/serverless-rl/api-reference/health/health-check)** - Health Check
5256
- **[GET /v1/system-check](https://docs.wandb.ai/serverless-rl/api-reference/health/system-check)** - System Check
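
The PR description recommends a short request against `GET /v1/health` so readers can confirm that their credentials work. A minimal Python sketch of that check follows. It assumes the base URL and `Authorization: Bearer` header shown in this file; the function names are hypothetical, and because the response body format isn't documented here, only the HTTP status is inspected.

```python
import urllib.error
import urllib.request

BASE_URL = "https://api.training.wandb.ai/v1"  # base URL from this page


def build_health_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for /v1/health."""
    return urllib.request.Request(
        f"{BASE_URL}/health",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def check_health(api_key: str, timeout: float = 10.0) -> int:
    """Send the health request and return the HTTP status code."""
    request = build_health_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (for example, a rejected key) surface here.
        return err.code
```

Calling `check_health` with the key from your environment (for example, the `WANDB_API_KEY` variable used in the cURL example in `use-trained-models.mdx`) returns the status code, which you can compare against the documented behavior of the Health Check endpoint.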

serverless-rl/prerequisites.mdx

Lines changed: 6 additions & 5 deletions
@@ -1,6 +1,7 @@
11
---
22
title: Prerequisites
33
description: "Set up your environment for Serverless RL by creating an account, generating an API key, and configuring a project."
4+
keywords: ["OpenPipe ART", "wandb login", "install wandb", "ART framework setup"]
45
---
56

67
import ApiKeyCreate from "/snippets/_includes/api-key-create.mdx";
@@ -13,17 +14,17 @@ Before starting, review the [usage information and limits](/serverless-rl/usage-
1314

1415
## Sign up and create an API key
1516

16-
To authenticate your machine with W&B, you must first generate an API key.
17+
If you don't already have a W&B account, sign up before continuing. To authenticate your machine with W&B, you must first generate an API key.
1718

1819
<ApiKeyCreate/>
1920

2021
## Create a project in W&B
2122

22-
Create a project in your W&B account to track usage, record training metrics, and save trained models. See the [Projects guide](https://docs.wandb.ai/models/track/project-page) for more information.
23+
Create a project in your W&B account to track usage, record training metrics, and save trained models. For more information, see the [Projects guide](https://docs.wandb.ai/models/track/project-page).
2324

2425
## Next steps
2526

26-
After completing the prerequisites:
27+
After you complete the prerequisites, try the following resources:
2728

28-
* Check the [API reference](/serverless-rl/api-reference) to learn about available endpoints
29-
* Try the [ART quickstart](https://art.openpipe.ai/getting-started/quick-start)
29+
* Check the [API reference](/serverless-rl/api-reference) to learn about available endpoints.
30+
* Try the [ART quickstart](https://art.openpipe.ai/getting-started/quick-start).

serverless-rl/sft.mdx

Lines changed: 18 additions & 15 deletions
@@ -1,34 +1,37 @@
11
---
22
title: Serverless SFT
33
description: Learn how to fine-tune models using supervised fine-tuning (SFT) on W&B
4+
keywords: ["LoRA adapter", "distillation", "warmup before RL", "ART framework", "managed training cluster"]
45
---
56

6-
Use Serverless SFT to fine-tune LLMs with supervised learning on curated datasets. Serverless SFT is now in public preview. W&B provisions the training infrastructure ([on CoreWeave](https://docs.coreweave.com/docs/platform)) for you while allowing full flexibility in your environment's setup. You get instant access to a managed training cluster that elastically auto-scales to handle your training workloads.
7+
Use Serverless SFT to fine-tune LLMs with supervised learning on curated datasets. Serverless SFT is in public preview. W&B provisions the training infrastructure ([on CoreWeave](https://docs.coreweave.com/docs/platform)) for you and gives you full flexibility to set up your environment. You get instant access to a managed training cluster that auto-scales to handle your training workloads.
78

8-
Serverless SFT is ideal for tasks like:
9-
* **Distillation**: Transferring knowledge from a larger, more capable model into a smaller, faster one
10-
* **Teaching output style and format**: Training a model to follow specific response formats, tone, or structure
11-
* **Warmup before RL**: Pre-training a model with supervised examples before applying reinforcement learning for further refinement
9+
Serverless SFT is ideal for tasks such as:
10+
* **Distillation**: Transferring knowledge from a larger, more capable model into a smaller, faster one.
11+
* **Teaching output style and format**: Training a model to follow specific response formats, tone, or structure.
12+
* **Warmup before RL**: Pre-training a model with supervised examples before applying reinforcement learning for further refinement.
1213

1314
Serverless SFT trains low-rank adapters (LoRAs) to specialize a model for your specific task. W&B automatically stores the LoRAs you train as artifacts in your account. You can also save them locally or to a third party for backup. Serverless Inference also automatically hosts models that you train through Serverless SFT.
1415

15-
See the ART [Serverless SFT docs](https://art.openpipe.ai/fundamentals/sft-training) to get started.
16+
To begin training a model with Serverless SFT, see the ART [Serverless SFT docs](https://art.openpipe.ai/fundamentals/sft-training).
1617

17-
## Why Serverless SFT?
18+
## Why Serverless SFT
1819

1920
Supervised fine-tuning (SFT) is a training technique where a model learns from curated input-output examples. Serverless SFT on W&B provides the following advantages:
2021

21-
* **Lower training costs**: By multiplexing shared infrastructure across many users, skipping the setup process for each job, and scaling your GPU costs down to 0 when you're not actively training, Serverless SFT reduces training costs significantly.
22-
* **Faster training time**: By immediately provisioning training infrastructure when you need it, Serverless SFT speeds up your training jobs and lets you iterate faster.
23-
* **Automatic deployment**: Serverless SFT automatically deploys every checkpoint you train, so you do not need to manually set up hosting infrastructure. You can access and test trained models immediately in local, staging, or production environments.
22+
* **Lower training costs**: Serverless SFT multiplexes shared infrastructure across many users, skips the setup process for each job, and scales your GPU costs down to zero when you aren't actively training. This reduces training costs.
23+
* **Faster training time**: Serverless SFT immediately provisions training infrastructure when you need it. This speeds up your training jobs and lets you iterate faster.
24+
* **Automatic deployment**: Serverless SFT automatically deploys every checkpoint you train, so you don't need to manually set up hosting infrastructure. You can access and test trained models immediately in local, staging, or production environments.
2425

2526
## How Serverless SFT uses W&B services
2627

2728
Serverless SFT uses a combination of the following W&B components to operate:
2829

29-
* [Inference](/inference): To run your models
30-
* [Models](/models): To track performance metrics during the LoRA adapter's training
31-
* [Artifacts](/models/artifacts): To store and version the LoRA adapters
32-
* [Weave (optional)](/weave): To gain observability into how the model responds at each step of the training loop
30+
* [Inference](/inference): To run your models.
31+
* [Models](/models): To track performance metrics during the LoRA adapter's training.
32+
* [Artifacts](/models/artifacts): To store and version the LoRA adapters.
33+
* [Weave](/weave) (Optional): To gain observability into how the model responds at each step of the training loop.
3334

34-
Serverless SFT is in public preview. During the preview, W&B charges you only for inference usage and artifact storage. W&B does not charge for adapter training during the preview period.
35+
<Note>
36+
Serverless SFT is in public preview. During the preview, W&B charges you only for inference usage and artifact storage. W&B doesn't charge for adapter training during the preview period.
37+
</Note>

serverless-rl/sft/usage.mdx

Lines changed: 5 additions & 3 deletions
@@ -1,9 +1,11 @@
11
---
22
title: How to use Serverless SFT
33
description: "Fine-tune models with Serverless SFT using OpenPipe's ART framework and the Serverless RL API for supervised learning."
4+
keywords: ["run SFT job", "ART SFT training", "SFT tutorial"]
45
---
56

6-
Use Serverless SFT through [OpenPipe's ART framework](https://art.openpipe.ai/getting-started/about) and the [Serverless RL API](/serverless-rl/api-reference).
7+
Use Serverless supervised fine-tuning (SFT) through [OpenPipe's ART framework](https://art.openpipe.ai/getting-started/about) and the Serverless RL API.
78

8-
To start using Serverless SFT, satisfy the [prerequisites](/serverless-rl/prerequisites) to use W&B tools, and then go through the ART [Serverless SFT docs](https://art.openpipe.ai/fundamentals/sft-training).
9-
- To learn about Serverless SFT's API endpoints, see the [Serverless RL API reference](/serverless-rl/api-reference).
9+
To start using Serverless SFT, satisfy the [prerequisites](/serverless-rl/prerequisites) for W&B tools, and then see the [ART Serverless SFT documentation](https://art.openpipe.ai/fundamentals/sft-training).
10+
11+
To learn about Serverless SFT's API endpoints, see the [Serverless RL API reference](/serverless-rl/api-reference).

serverless-rl/usage-limits.mdx

Lines changed: 9 additions & 4 deletions
@@ -1,15 +1,18 @@
11
---
22
title: Usage information and limits
33
description: Understand pricing, usage limits, and account restrictions for W&B Serverless RL
4+
keywords: ["concurrency limits", "geographic restrictions", "LoRA checkpoint storage", "rate limit"]
45
---
56

7+
This page describes the pricing model, concurrency limits, and geographic restrictions that apply to W&B Serverless RL. Review this information to estimate costs and to understand the constraints that affect how you run training and inference workloads.
8+
69
## Pricing
710

8-
Pricing has three components: inference, training, and storage. For specific billing rates, visit our [pricing page](https://site.wandb.ai/pricing/training).
11+
Pricing has three components: inference, training, and storage. For specific billing rates, visit our [pricing page](https://site.wandb.ai/pricing/training). The following sections describe each component.
912

1013
### Inference
1114

12-
Pricing for Serverless RL inference requests matches Serverless Inference pricing. See [model-specific costs](https://site.wandb.ai/pricing/reinforcement-learning) for more details. Learn more about purchasing credits, account tiers, and usage caps in the [Serverless Inference docs](/inference/usage-limits#purchase-more-credits).
15+
Pricing for Serverless RL inference requests matches Serverless Inference pricing. See [model-specific costs](https://site.wandb.ai/pricing/reinforcement-learning). Learn more about purchasing credits, account tiers, and usage caps in the [Serverless Inference docs](/inference/usage-limits#purchase-more-credits).
1316

1417
### Training
1518

@@ -19,10 +22,12 @@ Training is free during the public preview period.
1922

2023
### Model storage
2124

22-
Serverless RL stores checkpoints of your trained LoRAs so you can evaluate, serve, or continue training them at any time. W&B bills storage monthly based on total checkpoint size and your [pricing plan](https://wandb.ai/site/pricing). Every plan includes at least 5GB of free storage, which is enough for roughly 30 LoRAs. Delete low-performing LoRAs to save space. See the [ART SDK](https://art.openpipe.ai/features/checkpoint-deletion) for instructions.
25+
Serverless RL stores checkpoints of your trained LoRAs so you can evaluate, serve, or continue training them at any time. W&B bills storage monthly based on total checkpoint size and your [pricing plan](https://wandb.ai/site/pricing). Every plan includes at least 5 GB of free storage, which is enough for roughly 30 LoRAs. To save space, delete low-performing LoRAs. See the [ART SDK](https://art.openpipe.ai/features/checkpoint-deletion) for instructions.
2326

2427
## Limits
2528

26-
* **Inference concurrency limits**: By default, Serverless RL currently supports up to 2000 concurrent requests per user and 6000 per project. If you exceed your rate limit, the Inference API returns a `429 Concurrency limit reached for requests` response. To avoid this error, reduce the number of concurrent requests your training job or production workload makes at once. If you need a higher rate limit, you can request one at support@wandb.com.
29+
The following limits apply to Serverless RL usage. Review them when sizing workloads or when planning to use the service from a new region.
30+
31+
* **Inference concurrency limits**: By default, Serverless RL supports up to 2,000 concurrent requests per user and 6,000 per project. If you exceed your rate limit, the Inference API returns a `429 Concurrency limit reached for requests` response. To avoid this error, reduce the number of concurrent requests your training job or production workload makes at once. If you need a higher rate limit, request one at support@wandb.com.
2732

2833
* **Geographic restrictions**: Serverless RL is only available in supported geographic locations. For more information, see the [Terms of Service](https://site.wandb.ai/terms/).
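
The concurrency bullet above advises reducing the number of concurrent requests to avoid `429` responses. One way to do that on the client side is sketched below, under stated assumptions: `send` stands for any hypothetical zero-argument callable that returns a `(status_code, body)` tuple, the retry-with-backoff step is an added technique that the page itself doesn't prescribe, and the default of 8 workers is an arbitrary illustration, not a recommended value.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def send_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Waits with exponential backoff plus jitter between attempts.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay))
        status, body = send()
    return status, body


def run_bounded(calls, max_concurrent=8):
    """Run the callables with at most max_concurrent in flight at once.

    Results come back in the same order as the input callables.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(send_with_backoff, calls))
```

Capping `max_concurrent` well below the documented per-user limit keeps a single job from consuming the whole allowance, while the backoff handles the occasional `429` that still gets through.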

serverless-rl/usage.mdx

Lines changed: 4 additions & 3 deletions
@@ -1,10 +1,11 @@
11
---
22
title: How to use Serverless RL
33
description: "Post-train models with Serverless RL using OpenPipe's ART framework and the Serverless RL API for reinforcement learning."
4+
keywords: ["Colab notebook", "ART quickstart", "post-training tutorial"]
45
---
56

67
Use Serverless RL through [OpenPipe's ART framework](https://art.openpipe.ai/getting-started/about) and the [Serverless RL API](/serverless-rl/api-reference).
78

8-
To start using Serverless RL, satisfy the [prerequisites](/serverless-rl/prerequisites) to use W&B tools, and then go through the ART [quickstart](https://art.openpipe.ai/getting-started/quick-start).
9-
- For code examples and workflows, see the [Google Colab notebook](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb).
10-
- To learn about Serverless RL's API endpoints, see the [Serverless RL API reference](/serverless-rl/api-reference).
9+
To start using Serverless RL, you must satisfy the [prerequisites](/serverless-rl/prerequisites) to use W&B tools, and then go through the ART [quickstart](https://art.openpipe.ai/getting-started/quick-start).
10+
11+
For code examples and workflows, see the [Google Colab notebook](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb).

serverless-rl/use-trained-models.mdx

Lines changed: 13 additions & 12 deletions
@@ -1,44 +1,45 @@
11
---
22
title: Use your trained models
33
description: Make inference requests to the models you've trained
4+
keywords: ["wandb-artifact", "model endpoint", "OpenAI SDK", "cURL", "checkpoint step"]
45
---
56

6-
After you train a model with Serverless RL, it is automatically available for inference.
7+
After you train a model with Serverless RL, it's automatically available for inference. This page shows you how to construct the endpoint for a trained model and send inference requests to it. Use this endpoint to integrate your model into your application or evaluation workflows.
78

8-
To send requests to your trained model, you need:
9+
To send requests to your trained model, you need the following:
910
* Your [W&B API key](https://wandb.ai/settings)
1011
* The [Serverless RL API's](/serverless-rl/api-reference) base URL, `https://api.training.wandb.ai/v1/`
1112
* Your model's endpoint
1213

1314
The model's endpoint uses the following schema:
1415

15-
```
16-
wandb-artifact:///<entity>/<project>/<model-name>:<step>
16+
```text
17+
wandb-artifact:///[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]
1718
```
1819

1920
The schema consists of:
2021

2122
* Your W&B entity's (team) name
2223
* The name of the project associated with your model
2324
* The trained model's name
24-
* The training step of the model you want to deploy (this is usually the step where the model performed best in your evaluations)
25+
* The training step of the model you want to deploy. This is usually the step where the model performed best in your evaluations.
2526

26-
For example, if your W&B team is named `email-specialists`, your project is called `mail-search`, your trained model is named `agent-001`, and you wanted to deploy it on step 25, the endpoint looks like this:
27+
For example, if your W&B team is named `email-specialists`, your project is called `mail-search`, your trained model is named `agent-001`, and you want to deploy it on step 25, the endpoint looks like this:
2728

28-
```
29+
```text
2930
wandb-artifact:///email-specialists/mail-search/agent-001:step25
3031
```
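
The schema and worked example above can be expressed as a small helper. This is a sketch under two assumptions that the PR description flags as unconfirmed: that the scheme takes three slashes and that the step value carries a literal `step` prefix, both taken from the `agent-001:step25` example. The function name and the validation rules are hypothetical.

```python
def model_endpoint(entity: str, project: str, model_name: str, step: int) -> str:
    """Assemble a model endpoint string from its four parts.

    Assumptions: three slashes after the scheme and a "step" prefix on the
    step number, mirroring the worked example on this page.
    """
    parts = {"entity": entity, "project": project, "model_name": model_name}
    for label, value in parts.items():
        # Illustrative validation only: reject empty parts and characters
        # that would collide with the separators used in the schema.
        if not value or "/" in value or ":" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    if step < 0:
        raise ValueError(f"step must be non-negative, got {step}")
    return f"wandb-artifact:///{entity}/{project}/{model_name}:step{step}"
```

With the values from the example, `model_endpoint("email-specialists", "mail-search", "agent-001", 25)` produces the same string as the block above.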
3132

32-
Once you have your endpoint, you can integrate it into your normal inference workflows. The following examples show how to make inference requests to your trained model using a cURL request or the [Python OpenAI SDK](https://github.com/openai/openai-python).
33+
After you have your endpoint, you can integrate it into your normal inference workflows. The following examples show how to make inference requests to your trained model using a cURL request or the [Python OpenAI SDK](https://github.com/openai/openai-python). Choose the example that matches your environment.
3334

34-
### cURL
35+
## cURL
3536

36-
```shell
37+
```bash
3738
curl https://api.training.wandb.ai/v1/chat/completions \
3839
-H "Authorization: Bearer $WANDB_API_KEY" \
3940
-H "Content-Type: application/json" \
4041
-d '{
41-
"model": "wandb-artifact://<entity>/<project>/<model-name>:<step>",
42+
"model": "wandb-artifact://[ENTITY]/[PROJECT]/[MODEL-NAME]:[STEP]",
4243
"messages": [
4344
{"role": "system", "content": "You are a helpful assistant."},
4445
{"role": "user", "content": "Summarize our training run."}
@@ -48,7 +49,7 @@ curl https://api.training.wandb.ai/v1/chat/completions \
4849
}'
4950
```
5051

51-
### OpenAI SDK
52+
## OpenAI SDK
5253

5354
```python
5455
from openai import OpenAI
