Add Gemma 4 RTX 4090 backend helpers #152
adybag14-cyber commented May 11, 2026
Gemma 4 31B it (abliterated) running at 60 tok/s on an RTX 4090.
3 issues found across 4 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/Start-LuceboxGemma4090.ps1">
<violation number="1" location="scripts/Start-LuceboxGemma4090.ps1:6">
P2: Hard-coding the default repo path to one workstation makes this helper fail by default on other machines.</violation>
</file>
<file name="scripts/verify_gemma4_4090.py">
<violation number="1" location="scripts/verify_gemma4_4090.py:117">
P2: `--runs` is unchecked, so 0/negative values can make aggregation crash on an empty result set.</violation>
</file>
<file name="scripts/lucebox-gemma4-4090.sh">
<violation number="1" location="scripts/lucebox-gemma4-4090.sh:62">
P2: Readiness timeout can be bypassed because the health probe has no curl timeout, so a single stalled request can block `wait_ready()` indefinitely.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
    [string] $Command = 'Start',

    [string] $Distro = '',
    [string] $RepoPath = '/mnt/c/Users/adyba/src/lucebox-hub',
P2: Hard-coding the default repo path to one workstation makes this helper fail by default on other machines.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/Start-LuceboxGemma4090.ps1, line 6:
<comment>Hard-coding the default repo path to one workstation makes this helper fail by default on other machines.</comment>
<file context>
@@ -0,0 +1,66 @@
+ [string] $Command = 'Start',
+
+ [string] $Distro = '',
+ [string] $RepoPath = '/mnt/c/Users/adyba/src/lucebox-hub',
+ [int] $WaitSeconds = 300
+)
</file context>
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default="http://127.0.0.1:18191")
    parser.add_argument("--threshold", type=float, default=60.0)
    parser.add_argument("--runs", type=int, default=3)
P2: --runs is unchecked, so 0/negative values can make aggregation crash on an empty result set.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/verify_gemma4_4090.py, line 117:
<comment>`--runs` is unchecked, so 0/negative values can make aggregation crash on an empty result set.</comment>
<file context>
@@ -0,0 +1,162 @@
+ parser = argparse.ArgumentParser()
+ parser.add_argument("--base-url", default="http://127.0.0.1:18191")
+ parser.add_argument("--threshold", type=float, default=60.0)
+ parser.add_argument("--runs", type=int, default=3)
+ parser.add_argument("--n-predict", type=int, default=256)
+ parser.add_argument("--wait", type=float, default=300.0)
</file context>
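One possible way to address the `--runs` finding is to reject non-positive values at parse time, so the aggregation step never sees an empty result set. The sketch below is a minimal illustration, not the script's actual code: `positive_int` and `build_parser` are hypothetical names, and only the three arguments quoted above are reproduced.

```python
import argparse


def positive_int(value: str) -> int:
    """argparse type callable that accepts only integers >= 1 (hypothetical helper)."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected an integer, got {value!r}")
    if number < 1:
        raise argparse.ArgumentTypeError(f"must be >= 1, got {number}")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the arguments shown in the review context above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default="http://127.0.0.1:18191")
    parser.add_argument("--threshold", type=float, default=60.0)
    # Using a validating type makes argparse exit with a usage error
    # for 0, negative, or non-integer input instead of crashing later.
    parser.add_argument("--runs", type=positive_int, default=3)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"runs={args.runs}")
```

With this in place, `--runs 0` would stop at argument parsing with argparse's standard error exit rather than failing during aggregation.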
    }

    health() {
      curl -fsS "$(url)/health"
P2: Readiness timeout can be bypassed because the health probe has no curl timeout, so a single stalled request can block wait_ready() indefinitely.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/lucebox-gemma4-4090.sh, line 62:
<comment>Readiness timeout can be bypassed because the health probe has no curl timeout, so a single stalled request can block `wait_ready()` indefinitely.</comment>
<file context>
@@ -0,0 +1,194 @@
+}
+
+health() {
+ curl -fsS "$(url)/health"
+}
+
</file context>
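The script under review is shell, so the sketch below only illustrates the underlying idea in Python: every individual probe gets its own timeout, capped by the time left before the overall deadline, so one stalled request cannot outlive the readiness window. The function names, the `per_probe_cap` and `interval` parameters, and the injectable `probe` argument are assumptions made for illustration and do not come from the PR.

```python
import time
import urllib.request


def probe_health(base_url: str, timeout: float) -> bool:
    """Return True if GET {base_url}/health answers with a 2xx status.

    Note: urlopen's timeout bounds each blocking socket operation,
    not the total transfer time.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_ready(base_url, wait_seconds, probe=probe_health,
               interval=1.0, per_probe_cap=5.0):
    """Poll until the probe succeeds or wait_seconds elapse; return success."""
    deadline = time.monotonic() + wait_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never give a single probe more time than the deadline allows.
        if probe(base_url, min(per_probe_cap, remaining)):
            return True
        # Sleep between probes, but not past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
```

A call such as `wait_ready("http://127.0.0.1:18191", 300)` would then return `False` at roughly the deadline even if the server accepts connections but never answers.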
@adybag14-cyber thanks for your contribution! @dusterbloom can you take a look at this?
Thanks for the contribution and the time you put into the scripts. We can't take this one as-is. A few specific reasons:
The Gemma 4 path in this repo is […]
The submodule pin is intentional and non-negotiable.
The scripts hardcode paths from a workstation we don't have access to […]
We're moving toward a declarative config layout […]
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/probe_gemma4_context.py">
<violation number="1" location="scripts/probe_gemma4_context.py:155">
P2: Threshold validation can falsely pass when every run lacks a numeric `predicted_per_second` metric.</violation>
</file>
    "cache_type_v": args.cache_type_v,
    "threshold": args.threshold,
    "all_ok": all(r["ok"] for r in results),
    "all_ge_threshold": (all(rate >= args.threshold for rate in rates) if args.threshold > 0 else None),
P2: Threshold validation can falsely pass when every run lacks a numeric predicted_per_second metric.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/probe_gemma4_context.py, line 155:
<comment>Threshold validation can falsely pass when every run lacks a numeric `predicted_per_second` metric.</comment>
<file context>
@@ -0,0 +1,176 @@
+ "cache_type_v": args.cache_type_v,
+ "threshold": args.threshold,
+ "all_ok": all(r["ok"] for r in results),
+ "all_ge_threshold": (all(rate >= args.threshold for rate in rates) if args.threshold > 0 else None),
+ "min_predicted_per_second": min(rates) if rates else None,
+ "avg_predicted_per_second": (sum(rates) / len(rates) if rates else None),
</file context>
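The false pass flagged here follows from `all()` returning `True` for an empty iterable: if no run reports a numeric rate, `rates` is empty and the check succeeds vacuously. The sketch below shows one way to guard against that. It assumes each result is a dict with an `ok` flag and an optional `predicted_per_second` value, which is inferred from the quoted context rather than confirmed; `summarize` is a hypothetical name, and requiring a rate from every run is a stricter design choice than the review strictly asks for.

```python
from numbers import Real


def summarize(results, threshold):
    """Aggregate per-run rates without letting an empty list pass the threshold."""
    rates = [
        r["predicted_per_second"]
        for r in results
        if isinstance(r.get("predicted_per_second"), Real)
        and not isinstance(r.get("predicted_per_second"), bool)
    ]
    if threshold > 0:
        # all([]) is True, so insist on at least one run and on a numeric
        # rate from every run before comparing against the threshold.
        all_ge_threshold = (
            len(rates) > 0
            and len(rates) == len(results)
            and all(rate >= threshold for rate in rates)
        )
    else:
        all_ge_threshold = None
    return {
        "threshold": threshold,
        "all_ok": all(r.get("ok", False) for r in results),
        "all_ge_threshold": all_ge_threshold,
        "min_predicted_per_second": min(rates) if rates else None,
        "avg_predicted_per_second": (sum(rates) / len(rates) if rates else None),
    }
```

A looser variant could drop the `len(rates) == len(results)` condition and only require that at least one rate was collected.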