cache benchmark runs by vastzuby · Pull Request #417 · vast-ai/vast-cli

vastzuby · 2026-06-10T22:32:02Z

`vastai run benchmarks` rents a real GPU, measures perf, and tears it down. Every run's perf is reported to the benchmarks table, so the table is a cross-user record of measured perf per spec.

This PR uses that table as a cache. Before renting, we check for an
existing benchmark with the same GPU type, num_gpus, and template,
reported in the last 30 days.

- Cache hit: serve the median of the reported perf values and skip the
  rental. We also show the spread (range + sample count), since perf
  varies machine-to-machine for some workloads.
- Cache miss: fall back to the current rent-and-measure flow.
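
For reviewers, the hit/miss rule above can be sketched roughly as follows. This is an illustration, not the code in this PR: the function name `summarize_cached_perf` and the row keys (`gpu_name`, `num_gpus`, `template`, `perf`, `reported_at`) are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from statistics import median

CACHE_MAX_AGE = timedelta(days=30)  # "reported in the last 30 days"


def summarize_cached_perf(rows, gpu_name, num_gpus, template, now=None):
    """Return median/range/count for matching rows, or None on a cache miss.

    Row keys are hypothetical; the real benchmarks table may name them differently.
    """
    now = now or datetime.now(timezone.utc)
    perfs = [
        row["perf"]
        for row in rows
        if row["gpu_name"] == gpu_name
        and row["num_gpus"] == num_gpus
        and row["template"] == template
        and now - row["reported_at"] <= CACHE_MAX_AGE
    ]
    if not perfs:
        return None  # cache miss: caller falls back to rent-and-measure
    return {
        "median": median(perfs),
        "min": min(perfs),
        "max": max(perfs),
        "n": len(perfs),
    }
```

Returning `None` on a miss keeps the existing rent-and-measure path as the default, so the cache only ever short-circuits a rental and never changes what a fresh run does.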

Cached rows report perf only. We deliberately do not show $/hr or perf/$
for them: the price would come from a different machine than the one
benchmarked, so a cached perf/$ wouldn't correspond to anything real.
perf/$ stays accurate only on freshly measured (--no-cache) rows.

If a cached spec has no offers available to rent right now, we flag it
("no offers available to rent right now") rather than implying it's
rentable.
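
A rough sketch of how a result row could be rendered under these rules. The function `render_row`, its parameters, and the column order are hypothetical and only illustrate the behaviour described above: cached rows show "-" for $/hr and perf/$, and a cached spec with no current offers gets the flag text.

```python
NO_OFFERS_NOTE = "no offers available to rent right now"


def render_row(spec, perf, price_per_hr=None, cached=False, offers_available=True):
    """Build the display cells for one benchmark row (hypothetical layout).

    Columns: spec, perf, $/hr, perf/$, note.
    """
    if cached or price_per_hr is None:
        # A cached perf was measured on some other machine, so any price
        # shown next to it would not describe the benchmarked machine.
        price_cell, value_cell = "-", "-"
    else:
        price_cell = f"{price_per_hr:.3f}"
        value_cell = f"{perf / price_per_hr:.1f}"
    note = NO_OFFERS_NOTE if cached and not offers_available else ""
    return [spec, f"{perf:.1f}", price_cell, value_cell, note]
```

For example, `render_row("1x RTX 5090", 110.0, cached=True, offers_available=False)` yields dashes in both price columns plus the flag text, while a freshly measured row with a price fills in both.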

Reuse the benchmarks table as a cross-user cache for `vastai run benchmarks`:
before renting, look for a recent (<=30d) row matching the same template +
GPU + count and serve its median measured perf instead of renting and
re-measuring.

- Perf-only cache: cached rows carry no $/hr, since a cached row's price
  would come from a different machine than the one benchmarked; $/hr and
  perf/$ render "-" for cached rows.
- Show the perf spread (median, range, n, age) so real machine-to-machine
  variance is visible rather than a single misleading number.
- On a cache hit, prompt to reuse the result or run a fresh benchmark;
  -y/--raw are non-interactive and reuse the cache. --no-cache always
  re-measures.
- Flag cache hits that have no current matching offers as not rentable.
- Fix endpoint_name: drop the parens around the uuid suffix; the backend
  rejects shell metacharacters in endpoint_name, so any real rental 400'd
  (pre-existing bug in the base command).
- Lower the default --timeout from 60m to 30m.
- search_benchmarks type hint accepts dict queries.
- Drop leading underscores from module helpers/constants to match the rest
  of the cli commands; tidy comments, docstrings, and constant grouping.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JKE6tJgMRhV5tcLtYBYKnQ
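
The flag interaction in the commit message (prompt on a hit, `-y`/`--raw` reuse without prompting, `--no-cache` always re-measures) could look something like the sketch below. `choose_action` and its return values are hypothetical, and treating an empty answer as "reuse" is an assumption rather than something the commit message states.

```python
def choose_action(cache_hit, no_cache=False, yes=False, raw=False, ask=input):
    """Decide between reusing a cached result and measuring afresh.

    Returns "reuse" or "measure". `ask` is injectable so the prompt can be
    replaced in tests or scripts.
    """
    if no_cache or not cache_hit:
        return "measure"  # --no-cache, or nothing cached: rent and measure
    if yes or raw:
        return "reuse"  # non-interactive modes never prompt
    answer = ask("Reuse cached result? [Y/n] ").strip().lower()
    # Assumption: pressing Enter accepts the cached result.
    return "reuse" if answer in ("", "y", "yes") else "measure"
```

Checking `no_cache` first means the flag wins even when combined with `-y`, which matches "--no-cache always re-measures".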

LucasArmandVast · 2026-06-30T20:11:21Z

+    Template is matched client-side: rows carry template_hash or template_id
+    depending on how the benchmarked workergroup was created.
+    """
+    query = {


We should include a limit on this query, since it could return thousands of rows for common template/num_gpus/gpu_name combos (e.g. ComfyUI on 1x 5090).
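
One possible shape for this, as a sketch only: cap the rows in the query and also cap the client-side template match in case the backend ignores the cap. The `limit` key, the `{"eq": ...}` operator shape, the cap value, and both function names are assumptions here; whether the benchmarks endpoint accepts a `limit` field would need to be checked against the backend.

```python
MAX_CACHE_ROWS = 200  # hypothetical cap; pick whatever the backend tolerates


def build_cache_query(gpu_name, num_gpus, limit=MAX_CACHE_ROWS):
    """Build a bounded query dict (key names and "limit" are assumed)."""
    return {
        "gpu_name": {"eq": gpu_name},
        "num_gpus": {"eq": num_gpus},
        "limit": limit,
    }


def match_template(rows, template_hash=None, template_id=None, limit=MAX_CACHE_ROWS):
    """Client-side template match; rows carry template_hash or template_id."""
    matched = []
    for row in rows:
        by_hash = template_hash is not None and row.get("template_hash") == template_hash
        by_id = template_id is not None and row.get("template_id") == template_id
        if by_hash or by_id:
            matched.append(row)
            if len(matched) >= limit:
                break  # stop early even if the server returned more rows
    return matched
```

Because the template filter runs client-side, a server-side cap applies before template matching, so a small cap could hide matching rows behind rows for other templates; the cap would need to be generous or paired with a recency ordering.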

vastzuby requested a review from LucasArmandVast June 24, 2026 01:07

vastzuby force-pushed the AUTO-1452-benchmarks-run-cache branch from e97b98e to 9ebf5d5 Compare June 24, 2026 01:09

vastzuby changed the title ~~feat(AUTO-1452): use benchmarks table as cache for the run command~~ cache benchmark runs Jun 30, 2026

vastzuby force-pushed the AUTO-1452-benchmarks-run-cache branch from 88582c7 to abeda07 Compare June 30, 2026 19:54

LucasArmandVast reviewed Jun 30, 2026

cache benchmark runs #417

vastzuby wants to merge 1 commit into master from AUTO-1452-benchmarks-run-cache
