cache benchmark runs #417
Open
vastzuby wants to merge 1 commit into
Reuse the benchmarks table as a cross-user cache for `vastai run benchmarks`: before renting, look for a recent (<=30d) row matching the same template + GPU + count and serve its median measured perf instead of renting and re-measuring.

- Perf-only cache: cached rows carry no $/hr, since a cached row's price would come from a different machine than the one benchmarked; $/hr and perf/$ render "-" for cached rows.
- Show the perf spread (median, range, n, age) so real machine-to-machine variance is visible rather than a single misleading number.
- On a cache hit, prompt to reuse the result or run a fresh benchmark; -y/--raw are non-interactive and reuse the cache. --no-cache always re-measures.
- Flag cache hits that have no current matching offers as not rentable.
- Fix endpoint_name: drop the parens around the uuid suffix; the backend rejects shell metacharacters in endpoint_name, so any real rental 400'd (pre-existing bug in the base command).
- Lower the default --timeout from 60m to 30m.
- search_benchmarks type hint accepts dict queries.
- Drop leading underscores from module helpers/constants to match the rest of the cli commands; tidy comments, docstrings, and constant grouping.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JKE6tJgMRhV5tcLtYBYKnQ
    Template is matched client-side: rows carry template_hash or template_id
    depending on how the benchmarked workergroup was created.
    """
    query = {
Contributor
We should include a limit on this query, since it could return thousands of rows for common template/num_gpu/gpu_name combos (e.g. ComfyUI on 1x5090).
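A minimal sketch of what a bounded query could look like. The helper name, the `MAX_CACHE_ROWS` value, and the `order`/`limit`/`created_at` keys are all assumptions for illustration; the actual keys accepted by the benchmarks endpoint may differ.

```python
from typing import Any

# Hypothetical cap; the right value depends on how many rows the median needs.
MAX_CACHE_ROWS = 200


def build_cache_query(gpu_name: str, num_gpus: int,
                      limit: int = MAX_CACHE_ROWS) -> dict[str, Any]:
    """Build a bounded benchmarks query (key names are assumptions, not the real API)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {
        "gpu_name": {"eq": gpu_name},
        "num_gpus": {"eq": num_gpus},
        # Assumed keys: newest rows first, so the cap drops the oldest results.
        "order": [["created_at", "desc"]],
        "limit": limit,
    }
```

Because the template is matched client-side, a server-side cap is applied before template filtering, so ordering newest-first and keeping the cap generous would help avoid discarding the rows that actually match.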
`vastai run benchmarks` rents a real GPU, measures perf, and tears it down. Every run's perf is reported to the benchmarks table, so the table is a cross-user record of measured perf per spec.

This PR uses that table as a cache. Before renting, we check for an
existing benchmark with the same GPU type, num_gpus, and template,
reported in the last 30 days. On a hit, we serve its median measured perf
instead of doing a rental. We also show the spread (range + sample count), since perf
varies machine-to-machine for some workloads.
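As a rough sketch of the lookup and spread described above (not the PR's actual code): filter rows to the 30-day window, match the template client-side on either identifier, and summarize what is left. The function name and the row keys (`perf`, `created_at`, `template_hash`, `template_id`) are assumptions.

```python
import statistics
import time

MAX_AGE_SECONDS = 30 * 24 * 3600  # the 30-day window described above


def summarize_cached_perf(rows, template_hash=None, template_id=None, now=None):
    """Return median/min/max/n/age for matching recent rows, or None on a miss.

    Row keys ("perf", "created_at", "template_hash", "template_id") are assumed.
    """
    now = time.time() if now is None else now
    perfs, newest = [], None
    for row in rows:
        # Template is matched client-side: a row may carry either identifier.
        same_template = (
            (template_hash is not None and row.get("template_hash") == template_hash)
            or (template_id is not None and row.get("template_id") == template_id)
        )
        # A row without a timestamp is treated as too old to trust.
        age = now - row.get("created_at", 0)
        perf = row.get("perf")
        if not same_template or perf is None or age > MAX_AGE_SECONDS:
            continue
        perfs.append(perf)
        newest = row["created_at"] if newest is None else max(newest, row["created_at"])
    if not perfs:
        return None  # cache miss: the caller falls back to a real rental
    return {
        "median": statistics.median(perfs),
        "min": min(perfs),
        "max": max(perfs),
        "n": len(perfs),
        "age_days": (now - newest) / 86400,  # age of the most recent sample
    }
```

Reporting `min`, `max`, and `n` next to the median is what lets a reader judge whether a single cached number is representative.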
Cached rows report perf only. We deliberately do not show $/hr or perf/$
for them: the price would come from a different machine than the one
benchmarked, so a cached perf/$ wouldn't correspond to anything real.
perf/$ stays accurate only on freshly measured (`--no-cache`) rows.

If a cached spec has no offers available to rent right now, we flag it
("no offers available to rent right now") rather than implying it's
rentable.
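The rendering rules above could be sketched roughly as follows. `render_row` and its parameters are hypothetical names, and the column layout is illustrative rather than the CLI's real output format.

```python
def render_row(spec, perf, dollars_per_hour=None, cached=False, offers_available=True):
    """Format one result line; cached rows are perf-only."""
    if cached or not dollars_per_hour:
        # A cached row's price would come from a different machine than the
        # one benchmarked, so price-derived columns are left blank.
        price, perf_per_dollar = "-", "-"
    else:
        price = f"{dollars_per_hour:.3f}"
        perf_per_dollar = f"{perf / dollars_per_hour:.1f}"
    line = f"{spec}  perf={perf:.1f}  $/hr={price}  perf/$={perf_per_dollar}"
    if cached and not offers_available:
        line += "  (no offers available to rent right now)"
    return line
```

A fresh measurement would be rendered with `cached=False` and a real price, while a cache hit passes `cached=True` so any price passed in by accident is ignored.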