Skip to content

feat: add platform gating to ModelConfig and UI#657

Open
neuron-tech-ai wants to merge 2 commits into
jamiepine:mainfrom
neuron-tech-ai:feat/platform-gating
Open

feat: add platform gating to ModelConfig and UI#657
neuron-tech-ai wants to merge 2 commits into
jamiepine:mainfrom
neuron-tech-ai:feat/platform-gating

Conversation

@neuron-tech-ai
Copy link
Copy Markdown

@neuron-tech-ai neuron-tech-ai commented May 14, 2026

Some models only run on specific platforms (e.g., MLX models require macOS with Apple Silicon; some CUDA-only models won't run on macOS). Currently these models show up in the UI on all platforms and fail at download or load time with unhelpful errors.

This adds a platforms field to ModelConfig so each model can declare which platforms it supports. The UI uses this to hide or disable models that won't run on the current platform — users only see models that will actually work for them.

Reduces confusion and support burden from "why won't this model load" issues that are really just platform incompatibility.

Summary by CodeRabbit

  • New Features
    • Hardware platform compatibility checking for models and engines with visual lock icons indicating incompatibility
    • Required hardware information now displayed via tooltips for incompatible options
    • Automatic detection of available compute platforms on your system (CUDA, ROCm, MPS, XPU, CPU)
    • Incompatible engines are now disabled and marked as unavailable based on detected platform support

Review Change Stack

Add `requires: list[str]` field to `ModelConfig` so engines can declare
hardware requirements (cuda/mps/xpu/rocm/cpu). Add `get_supported_platforms()`
in platform_detect.py that returns what the current machine supports.

`/models/status` now includes `platform_compatible` and `requires` per model.
`load_engine_model()` hard-refuses incompatible engines with a clear HTTP 400.

Frontend: `ModelManagement` shows a lock icon + requires badge for
incompatible models; `EngineModelSelector` disables incompatible options
with a tooltip explaining the hardware requirement.

All existing engines have `requires=[]` (cross-platform); the field is
ready for CUDA-only engines (e.g. VoxCPM) to be shipped safely.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Warning

Rate limit exceeded

@neuron-tech-ai has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minute and 24 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 269e95ac-08f4-41e4-a17e-477505fffc1b

📥 Commits

Reviewing files that changed from the base of the PR and between 4082f27 and 62ebad3.

📒 Files selected for processing (3)
  • app/src/components/Generation/EngineModelSelector.tsx
  • backend/backends/__init__.py
  • backend/utils/platform_detect.py
📝 Walkthrough

Walkthrough

This PR introduces a complete platform and hardware compatibility system. It detects available compute backends on the machine, stores engine hardware requirements, exposes compatibility data via the /models/status API, validates engine selection at runtime, and displays compatibility status in the frontend UI.

Changes

Platform & Hardware Compatibility System

Layer / File(s) Summary
Platform Detection Utility
backend/utils/platform_detect.py
New get_supported_platforms() function probes PyTorch for available compute backends (CUDA, ROCm, MPS, XPU) and returns supported platform list, falling back to CPU when no accelerators available.
Backend Data Models
backend/backends/__init__.py, backend/models.py
ModelConfig dataclass extended with requires: list[str] field; ModelStatus response schema extended with platform_compatible: bool and requires: List[str] fields.
Model Status API Endpoint
backend/routes/models.py
/models/status endpoint now computes get_supported_platforms() once and populates platform_compatible and requires for each model by matching engine requirements against available platforms.
Engine Runtime Validation
backend/backends/__init__.py
New is_engine_platform_compatible() helper determines engine compatibility; load_engine_model() now validates platform compatibility before loading and raises 400 HTTPException with detailed error message if engine requires unsupported platforms.
Frontend API Types
app/src/lib/api/types.ts
ModelStatus interface extended with platform_compatible: boolean and requires: string[] to match backend response schema.
Frontend Compatibility UI
app/src/components/ServerSettings/ModelManagement.tsx, app/src/components/Generation/EngineModelSelector.tsx
ModelManagement displays lock icon for incompatible models and renders requirement badges with tooltips. EngineModelSelector fetches model status via React Query, derives per-engine compatibility, and disables incompatible engine options with tooltips and lock icons.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through platforms new,
Detecting CUDA, ROCm too,
With CUDA checks and lock icons bright,
Engines now know which ones take flight!
Requirements shown with helpful care,
Cross-platform magic fills the air! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add platform gating to ModelConfig and UI' accurately and concisely summarizes the main change: adding platform compatibility gating across both the backend configuration and frontend UI components.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (5)
backend/utils/platform_detect.py (1)

59-66: ⚡ Quick win

Simplify ROCm detection to match official PyTorch pattern.

The torch.version.hip approach is officially recommended by PyTorch documentation. However, the hasattr guard is unnecessary—torch.version.hip is always present as an attribute (returning a version string if ROCm is enabled, or None otherwise). Simplify to:

is_rocm = torch.version.hip is not None

This aligns with the official PyTorch documentation pattern for ROCm detection.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/utils/platform_detect.py` around lines 59 - 66, The ROCm detection in
the torch.cuda.is_available() block should drop the redundant hasattr guard;
replace the is_rocm assignment that uses hasattr(torch.version, "hip") with a
direct check using torch.version.hip is not None, so the code uses is_rocm =
torch.version.hip is not None before appending "rocm" or "cuda" to the supported
list.
app/src/components/Generation/EngineModelSelector.tsx (4)

134-153: ⚡ Quick win

Remove unused code.

Lines 134-153 create incompatibleEngines and engineRequires but never use them. The actual compatibility logic is implemented later at lines 155-182 using engineCompatibility. This appears to be an abandoned first attempt.

🧹 Remove the unused code
-  // Build a set of engine names that are platform-incompatible.
-  const incompatibleEngines = new Set<string>();
-  const engineRequires: Record<string, string[]> = {};
-  if (modelStatus?.models) {
-    for (const m of modelStatus.models) {
-      if (!m.platform_compatible && m.requires && m.requires.length > 0) {
-        incompatibleEngines.add(m.model_name.split('-')[0]); // coarse key
-        // Map by engine derived from model_name patterns
-        // We'll match by checking the option engine field below
-      }
-    }
-    // More precise: map from engine name to requires via model entries
-    for (const m of modelStatus.models) {
-      if (m.requires && m.requires.length > 0 && !m.platform_compatible) {
-        // Derive engine name from model — use the requires list keyed by display
-        // We rely on the fact that incompatible means requires is non-empty
-        // Store by hf_repo_id to later map to engine
-      }
-    }
-  }
-
   // Build engine -> {compatible, requires} from model status
   const engineCompatibility: Record<string, { compatible: boolean; requires: string[] }> = {};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 134 -
153, Remove the dead/unused compatibility-prep code: delete the declarations and
loops that build incompatibleEngines and engineRequires and any references to
modelStatus.models within that block (the code that starts with const
incompatibleEngines = new Set<string>(); and the subsequent for-loops), since
compatibility is handled by engineCompatibility later; ensure no remaining
references to incompatibleEngines or engineRequires remain so the component
compiles cleanly.

160-180: ⚖️ Poor tradeoff

Model name derivation is tightly coupled to backend naming conventions.

The logic at lines 161-172 derives backend model names from frontend engine identifiers using hardcoded assumptions. If the backend changes its model naming scheme, this mapping will silently fail, and incompatible engines won't be properly disabled.

Consider having the backend return an engine_type field in the ModelStatus response to avoid this fragile string matching, or verify the current matching logic against the actual backend model names.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 160 -
180, The current loop in EngineModelSelector that derives optModelName from
ENGINE_OPTIONS uses brittle string manipulation on opt.value and opt.engine and
compares against m.model_name, which couples frontend to backend naming; update
the logic to prefer a backend-provided field (e.g., m.engine_type) in the
ModelStatus response and use that to set engineCompatibility[opt.engine] and
platform compatibility; if backend change is not available immediately, replace
the inline derivation with a single source-of-truth mapping object (e.g.,
ENGINE_TO_MODEL_NAME or MODEL_NAME_PATTERNS) referenced by ENGINE_OPTIONS and
use explicit pattern matching against m.model_name and m.engine_type as a safe
fallback, updating the check that sets engineCompatibility[opt.engine] to rely
on these fields instead of the current string-manipulation branch.

216-216: ⚡ Quick win

Remove redundant key prop.

The SelectItem at line 216 has a key={opt.value} prop, but the parent Tooltip at line 212 already has the same key. React only needs the key on the outermost element in the map.

♻️ Remove redundant key
               <Tooltip key={opt.value}>
                 <TooltipTrigger asChild>
                   <div>
                     <SelectItem
-                      key={opt.value}
                       value={opt.value}
                       className={`${itemClass ?? ''} opacity-50 cursor-not-allowed`}
                       disabled
                     >
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/Generation/EngineModelSelector.tsx` at line 216, Remove
the redundant key prop from the inner SelectItem element: the Tooltip component
already uses key={opt.value} in the mapped list, so delete key={opt.value} from
the SelectItem inside Generation/EngineModelSelector.tsx (referencing the
SelectItem and Tooltip elements and the opt.value usage) to avoid duplicate keys
and keep the key only on the outermost mapped element.

229-229: ⚡ Quick win

Add internationalization for tooltip message.

The tooltip message "Requires {requiresLabel} hardware" is hardcoded in English. For consistency with the rest of the application, this should use the i18n translation system.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/Generation/EngineModelSelector.tsx` at line 229, Replace
the hardcoded tooltip text "Requires {requiresLabel} hardware" in
EngineModelSelector with a call to the app's i18n translation function (e.g.,
useTranslation/t) and pass requiresLabel as an interpolation variable; update
the component to import and use the translation hook/utility (e.g., const { t }
= useTranslation()) and change the tooltip content to t('requires_hardware', {
requiresLabel }) or the project's equivalent translation key so the message is
localized.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 161-171: The nested ternary building optModelName is hard to read;
extract it into a small helper (e.g., getModelName(opt) or computeOptModelName)
that returns opt.engine + '-tts-' + part after ':' when opt.value includes ':'
otherwise maps engines ('kokoro' -> 'kokoro', 'luxtts' -> 'luxtts', 'chatterbox'
-> 'chatterbox-tts', 'chatterbox_turbo' -> 'chatterbox-turbo') with a default of
''. Replace the ternary block in EngineModelSelector (where optModelName is set)
with a call to that helper to improve readability and maintainability.

In `@backend/backends/__init__.py`:
- Line 24: Move the relative import "from ..utils.platform_detect import
get_backend_type, get_supported_platforms" into the module's top import block so
it follows module-level import ordering and fixes Ruff E402; locate the import
in backend/backends/__init__.py and place it with the other top-level imports
(ensuring you don't introduce a circular import), then run the linter to confirm
the E402 warning is resolved.
- Around line 521-535: is_engine_platform_compatible currently picks configs[0]
and ignores per-variant `requires`; update it to consider the correct variant or
all variants: use the provided `model_size` (or accept one) to find the matching
config from get_tts_model_configs() and read its `requires`, or if `model_size`
isn’t available, compute compatibility across all configs for the engine (e.g.,
return True if any variant's `requires` is satisfied). Reference:
is_engine_platform_compatible, get_tts_model_configs(), configs, requires,
get_supported_platforms(), and the caller load_engine_model(engine, model_size)
so the per-variant check can use model_size when available.

In `@backend/utils/platform_detect.py`:
- Around line 71-77: Remove the unnecessary import of
intel_extension_for_pytorch and the try/except wrapper; directly check torch.xpu
availability by using if hasattr(torch, "xpu") and torch.xpu.is_available():
then append "xpu" to the supported list (the symbols to update are the import
intel_extension_for_pytorch, the try/except block, and the conditional using
torch.xpu and supported.append("xpu")).

---

Nitpick comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 134-153: Remove the dead/unused compatibility-prep code: delete
the declarations and loops that build incompatibleEngines and engineRequires and
any references to modelStatus.models within that block (the code that starts
with const incompatibleEngines = new Set<string>(); and the subsequent
for-loops), since compatibility is handled by engineCompatibility later; ensure
no remaining references to incompatibleEngines or engineRequires remain so the
component compiles cleanly.
- Around line 160-180: The current loop in EngineModelSelector that derives
optModelName from ENGINE_OPTIONS uses brittle string manipulation on opt.value
and opt.engine and compares against m.model_name, which couples frontend to
backend naming; update the logic to prefer a backend-provided field (e.g.,
m.engine_type) in the ModelStatus response and use that to set
engineCompatibility[opt.engine] and platform compatibility; if backend change is
not available immediately, replace the inline derivation with a single
source-of-truth mapping object (e.g., ENGINE_TO_MODEL_NAME or
MODEL_NAME_PATTERNS) referenced by ENGINE_OPTIONS and use explicit pattern
matching against m.model_name and m.engine_type as a safe fallback, updating the
check that sets engineCompatibility[opt.engine] to rely on these fields instead
of the current string-manipulation branch.
- Line 216: Remove the redundant key prop from the inner SelectItem element: the
Tooltip component already uses key={opt.value} in the mapped list, so delete
key={opt.value} from the SelectItem inside Generation/EngineModelSelector.tsx
(referencing the SelectItem and Tooltip elements and the opt.value usage) to
avoid duplicate keys and keep the key only on the outermost mapped element.
- Line 229: Replace the hardcoded tooltip text "Requires {requiresLabel}
hardware" in EngineModelSelector with a call to the app's i18n translation
function (e.g., useTranslation/t) and pass requiresLabel as an interpolation
variable; update the component to import and use the translation hook/utility
(e.g., const { t } = useTranslation()) and change the tooltip content to
t('requires_hardware', { requiresLabel }) or the project's equivalent
translation key so the message is localized.

In `@backend/utils/platform_detect.py`:
- Around line 59-66: The ROCm detection in the torch.cuda.is_available() block
should drop the redundant hasattr guard; replace the is_rocm assignment that
uses hasattr(torch.version, "hip") with a direct check using torch.version.hip
is not None, so the code uses is_rocm = torch.version.hip is not None before
appending "rocm" or "cuda" to the supported list.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3af1819b-93a8-4bec-9b88-88d4885b74d8

📥 Commits

Reviewing files that changed from the base of the PR and between b35b909 and 4082f27.

📒 Files selected for processing (7)
  • app/src/components/Generation/EngineModelSelector.tsx
  • app/src/components/ServerSettings/ModelManagement.tsx
  • app/src/lib/api/types.ts
  • backend/backends/__init__.py
  • backend/models.py
  • backend/routes/models.py
  • backend/utils/platform_detect.py

Comment thread app/src/components/Generation/EngineModelSelector.tsx Outdated
Comment thread backend/backends/__init__.py Outdated
Comment thread backend/backends/__init__.py
Comment thread backend/utils/platform_detect.py Outdated
- Move platform_detect import to top of __init__.py to fix Ruff E402
- Remove unnecessary intel_extension_for_pytorch import; torch.xpu.is_available() is native since PyTorch 2.7+
- Fix is_engine_platform_compatible to check across all variants instead of only configs[0]
- Fix load_engine_model to use per-variant requires based on model_size
- Extract nested ternary for optModelName into deriveModelName() helper function
@neuron-tech-ai
Copy link
Copy Markdown
Author

Addressed all four CodeRabbit review comments:

  1. Nested ternary refactor: Extracted the optModelName logic into a deriveModelName(opt) helper function with a ENGINE_TO_MODEL_NAME lookup map. The loop body now calls deriveModelName(opt) instead of the 10-line nested ternary.

  2. Ruff E402 import ordering: Moved from ..utils.platform_detect import get_backend_type, get_supported_platforms into the top import block (immediately after the hf_offline_patch import), before the constant definitions.

  3. Per-variant platform requirements: Updated is_engine_platform_compatible to check across all variants for an engine (returns True if any variant's requirements are met). Updated load_engine_model to select the config matching model_size before reading requires, falling back to configs[0] only if no match is found.

  4. Remove unnecessary intel_extension_for_pytorch import: Removed the try/except IPEX import. Native torch.xpu.is_available() (PyTorch 2.7+) is used directly with a hasattr(torch, 'xpu') guard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant