Commit 8b37277
style: apply /style-guide pass to models/launch (#2685)
## Summary
This PR applies the `/style-guide` skill (Google Developer Style Guide +
CoreWeave conventions) to documentation under `models/launch`. The run
was automated; no technical content was intentionally changed.
## Files edited
- `models/launch/evaluate-hosted-model.mdx`
- `models/launch/evaluate-model-checkpoint.mdx`
- `models/launch/evaluations.mdx`
## Recommendations for technical review
**Prerequisites**
- Confirm whether it is still accurate that `WANDB_API_KEY` is required
but unused for Serverless Inference (`evaluate-hosted-model.mdx` line
23), and whether a Note callout should explain this behavior.
- Prerequisite 3 of `evaluate-hosted-model.mdx` (line 24) mentions a
team-scoped secret holding the model's API key but doesn't specify a
required name/format or where it's selected in the flow. Add guidance if
needed.
- In `evaluate-model-checkpoint.mdx`, the prerequisite item references
"OpenAPI API key" but surrounding text and the secret name use OpenAI
(`OPENAI_API_KEY`). Likely a typo — confirm and correct.
- Confirm whether a specific W&B role (beyond team-admin for secrets) is
required to launch evaluation jobs. Add to prerequisites if so.
- In `evaluations.mdx`, the credentials section only links to the
secrets doc. Consider also linking to role/permission documentation and
to the "Evaluate a model checkpoint" / "Evaluate a hosted API model"
pages from the credentials section (not only from "Next steps").
**Verification steps**
- After "Click **Launch**" in both `evaluate-hosted-model.mdx` (line 44)
and `evaluate-model-checkpoint.mdx` (step 11), no confirmation cue
(toast, queued status, log location) is described before directing the
reader to the recent-run modal. Consider adding a confirmation
description.
- `evaluate-model-checkpoint.mdx` doesn't describe expected outcome
states for the evaluation job (running, succeeded, failed) or how the
reader confirms successful completion.
- Neither launch page has troubleshooting guidance for failed
benchmarks, missing secrets, unreachable model URLs, or runs that don't
appear in the recent runs list.
**Technical accuracy**
- `evaluate-hosted-model.mdx` line 38: "custom **OpenAPI-compliant**
model" — verify whether this should be "OpenAI-compatible" (matching the
rest of the page) or whether "OpenAPI" is intentional.
- `evaluate-hosted-model.mdx` line 38: the custom-model syntax
`openai-api/wandb/[MODEL-NAME]` appears identical to the Serverless
Inference syntax on line 36. Confirm whether the custom case should use
a different prefix.
- Confirm the "AI Security Institute" attribution in
`evaluate-hosted-model.mdx` (line 35) is current and correct.
- In `evaluate-model-checkpoint.mdx`, "VLLM-compatible format" is used
but not defined or linked. Add a definition or link.
- Confirm the `AutoModelForCausalLM.from_pretrained` / `save_pretrained`
example in `evaluate-model-checkpoint.mdx` produces VLLM-compatible
output for all supported architectures, or document when additional
steps are required.
- `evaluate-model-checkpoint.mdx` step 5 mentions an "up to four
benchmarks" limit without explanation or reference. Link or document the
limit.
- In `evaluations.mdx`, confirm the OpenAI Scorer column shows `Yes`
(not `true`) in the product UI — Pass 8 aligned the prose to match the
table cells; verify both reflect what users see.
- Confirm the exact field labels `Scorer API key` and `Hugging Face
Token` in `evaluations.mdx` match the strings in the product UI.
- The hidden HTML comment in `evaluations.mdx` (lines 23–26) points to
source-of-truth files in other repos. Confirm those URLs are current and
the catalog is in sync.
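
The "OpenAPI" versus "OpenAI" questions above could be checked mechanically before a human confirms the intended term. The following is a minimal sketch, not part of this PR: the function name `find_openapi_mentions` and the sample text are hypothetical, and it assumes the page content is already loaded as a string.

```python
import re

# Matches "OpenAPI" as a whole word. "OpenAI" is deliberately not matched,
# so only the candidates for the suspected typo are reported.
OPENAPI_PATTERN = re.compile(r"\bOpenAPI\b")


def find_openapi_mentions(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, stripped line) for each line containing "OpenAPI"."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if OPENAPI_PATTERN.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical sample text; in practice, read the .mdx file contents instead.
    sample = (
        "Evaluate any OpenAI-compatible endpoint.\n"
        "Select a custom OpenAPI-compliant model.\n"
    )
    for number, line in find_openapi_mentions(sample):
        print(f"line {number}: {line}")
```

Each reported line still needs a reviewer's judgment, because "OpenAPI" is a legitimate term in other contexts.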
**Missing content**
- `evaluate-hosted-model.mdx` doesn't link to a secrets-management page
on first mention of "team-scoped secret". Consider linking for
self-containment.
- `evaluate-hosted-model.mdx` has no "Next steps" pointer beyond
imported snippets. Optional editorial improvement.
- The `wandb.init` call in `evaluate-model-checkpoint.mdx` uses
placeholder strings but doesn't clarify whether `entity` and `project`
must already exist or are created on the fly.
- In `evaluations.mdx`, `SOSBench` in the Safety table is the only entry
without a hyperlink on its name — likely an oversight.
- The "Next steps" list in `evaluations.mdx` mixes link-label items with
one longer descriptive item. Consider rephrasing for parallel structure
(e.g., "Browse all benchmarks at AISI Inspect Evals").
- Some benchmark descriptions in the `evaluations.mdx` tables don't
fully expand their acronyms on first use within the cell. Confirm
whether the table is self-explanatory or if a glossary link would help.
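
The missing-hyperlink check for benchmark names (the `SOSBench` item above) could also be automated. This is a rough sketch under stated assumptions: the helper name `unlinked_first_cells`, the sample table, and its URLs are hypothetical, and the tables are assumed to be plain Markdown pipe tables with the benchmark name in the first column.

```python
import re

# A Markdown inline link such as [label](target).
LINK_PATTERN = re.compile(r"\[[^\]]+\]\([^)]+\)")
# A header separator cell such as ---, :---, or :---:.
SEPARATOR_PATTERN = re.compile(r"^:?-+:?$")


def unlinked_first_cells(table: str) -> list[str]:
    """Return first-column values of body rows that contain no Markdown link."""
    names = []
    for index, line in enumerate(table.strip().splitlines()):
        # Drop the outer pipes, then split into cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        first = cells[0]
        # Skip the header row, empty cells, and the separator row.
        if index == 0 or not first or SEPARATOR_PATTERN.match(first):
            continue
        if not LINK_PATTERN.search(first):
            names.append(first)
    return names


if __name__ == "__main__":
    # Hypothetical table; the linked row and its URL are placeholders.
    sample = """
| Benchmark | Description |
|---|---|
| [ExampleBench](https://example.com/bench) | Placeholder row |
| SOSBench | Row with no link on the name |
"""
    print(unlinked_first_cells(sample))
```

The split on `|` is naive and would miscount cells that contain escaped pipes, so treat the result as a list of candidates to inspect rather than a definitive report.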
## How to review
- Each file's changes are style edits only. Compare side-by-side and
flag any that change technical meaning.
- Approve and merge to accept the edits, or close to reject them.

1 parent 37769c9 commit 8b37277
3 files changed
Lines changed: 45 additions & 31 deletions