
Commit 2b1f91f

nowgnuesLee and claude authored and committed
docs(FR-2571): document Model Store V2 and Auto Scaling Rules V2 (26.4.0+)
Append V2 subsections to Model Store, Admin Model Store Management, and Auto Scaling Rules in all four languages, documenting the redesigned UI in Backend.AI manager 26.4.0+ while preserving legacy prose for older installs.

- Add 7 new V2 screenshots (Auto Scaling Rules list, Kernel/Prometheus editor modals, Model Store page, Model Card drawer, Deploy modal, Admin Model Card list)
- Augment 3 ripple sites with dual-label callouts: legacy `Run this model` stays, `Deploy` added for 26.4.0+
- Fix broken cross-reference `[Data](#data)` → `[Data](../vfolder/vfolder.md)`
- Keep legacy screenshots and prose intact; version cutover via `:::note` admonitions and `#...-version-26-4-0-and-later` anchors

Capability gate verified at `src/lib/backend.ai-client-esm.ts:895` (manager 26.4.0 for both `model-card-v2` and `prometheus-auto-scaling-rule`). UI labels verified against `resources/i18n/{en,ko,ja,th}.json`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent f02fa57 commit 2b1f91f

32 files changed

Lines changed: 618 additions & 25 deletions

packages/backend.ai-webui-docs/src/en/model_serving/model_serving.md

Lines changed: 153 additions & 9 deletions
@@ -72,7 +72,7 @@ To use the Model Service, you need to follow the steps below:
7272
:::tip
7373
As an alternative workflow, you can browse pre-configured models in the
7474
[Model Store](#model-store) and deploy them with a single click using the
75-
`Run this model` button.
75+
`Run this model` button (renamed `Deploy` on version 26.4.0 and later).
7676
:::
7777

7878
<a id="model-definition-guide"></a>
@@ -293,10 +293,10 @@ please refer to the [Explore Folder](#explore-folder) section.
293293
The service definition file (`service-definition.toml`) allows administrators to pre-configure the resources, environment, and runtime settings required for a model service. When this file is present in a model folder, the system uses these settings as default values when creating a service.
294294
295295
Both `model-definition.yaml` and `service-definition.toml` must be present in the
296-
model folder to enable the `Run this model` button on the Model Store page. These two
297-
files work together: the model definition specifies the model and inference server
298-
configuration, while the service definition specifies the runtime environment, resource
299-
allocation, and environment variables.
296+
model folder to enable the `Run this model` button (`Deploy` on version 26.4.0 and
297+
later) on the Model Store page. These two files work together: the model definition
298+
specifies the model and inference server configuration, while the service definition
299+
specifies the runtime environment, resource allocation, and environment variables.
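The two-file requirement above can be checked before publishing a model folder. The following is a minimal sketch, assuming the folder is available on a local filesystem; the helper name `missing_definition_files` is hypothetical and is not part of Backend.AI.

```python
from pathlib import Path
from typing import List

# File names taken from the paragraph above.
REQUIRED_FILES = ("model-definition.yaml", "service-definition.toml")


def missing_definition_files(folder: Path) -> List[str]:
    """Return the required definition files that are absent from `folder`.

    An empty list means both files exist, which the text above describes
    as the precondition for the `Run this model` / `Deploy` button.
    """
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

For example, `missing_definition_files(Path("./my-model"))` returns an empty list only when both files sit at the top level of the (illustrative) `./my-model` folder.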
300300
301301
The service definition file follows the TOML format with sections organized by runtime variant. Each section configures a specific aspect of the service:
302302
@@ -340,10 +340,10 @@ selected runtime variant when creating the service.
340340
:::
341341

342342
:::note
343-
When a service is created from the Model Store using the `Run this model` button,
344-
the settings from `service-definition.toml` are applied automatically. If you later
345-
need to adjust the resource allocation, you can modify the service through the
346-
Model Serving page.
343+
When a service is created from the Model Store using the `Run this model` button
344+
(`Deploy` on version 26.4.0 and later), the settings from `service-definition.toml`
345+
are applied automatically. If you later need to adjust the resource allocation, you
346+
can modify the service through the Model Serving page.
347347
:::
348348

349349
## Serving Page Overview
@@ -587,6 +587,48 @@ where you can add a rule. Each field in the modal is described below:
587587

588588
![](../images/auto_scaling_rules_modal.png)
589589

590+
:::note
591+
From Backend.AI version **26.4.0** onwards, Auto Scaling Rules have been redesigned with
592+
Prometheus preset support and a new condition model. If you are on an older version, the
593+
description above still applies. Otherwise, refer to
594+
[Auto Scaling Rules (version 26.4.0 and later)](#auto-scaling-rules-version-26-4-0-and-later) below.
595+
:::
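The version cutover in this note can be read as a capability check against the manager version. The sketch below is illustrative only: the feature names `model-card-v2` and `prometheus-auto-scaling-rule` and the 26.4.0 threshold come from the commit message, while the table layout and function names are assumptions and do not mirror the actual client code.

```python
from typing import Dict, Tuple

# Minimum manager version per feature (values from the commit message;
# the dictionary itself is a hypothetical structure).
MIN_MANAGER_VERSION: Dict[str, Tuple[int, int, int]] = {
    "model-card-v2": (26, 4, 0),
    "prometheus-auto-scaling-rule": (26, 4, 0),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '26.4.0'.

    Short versions are padded with zeros so that '26.4' equals '26.4.0'.
    Suffixes such as 'rc1' are not handled in this sketch.
    """
    parts = tuple(int(part) for part in text.split("."))
    return parts + (0,) * (3 - len(parts))


def supports(feature: str, manager_version: str) -> bool:
    """Return True when the manager is new enough for `feature`."""
    return parse_version(manager_version) >= MIN_MANAGER_VERSION[feature]
```

Comparing integer tuples rather than strings keeps `26.10.0` newer than `26.4.0`, which a plain string comparison would get wrong.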
596+
597+
<a id="auto-scaling-rules-version-26-4-0-and-later"></a>
598+
599+
#### Auto Scaling Rules (version 26.4.0 and later)
600+
601+
On Backend.AI version 26.4.0 and later, Auto Scaling Rules are redesigned with a Prometheus metric source, a segmented condition control, and a richer rule list.
602+
603+
![](../images/auto_scaling_rules_v2.png)
604+
605+
The rule list provides:
606+
607+
- A property filter bar to filter rules by **Created At** and **Last Triggered** datetime ranges.
608+
- Server-side pagination.
609+
- The following columns: **Metric Source**, **Condition**, **Time Window**, **Step Size**, **Min / MAX Replicas**, **Created At**, and **Last Triggered**. The **Step Size** column automatically shows `+`, `−`, or `±` based on the direction derived from the thresholds you have set, so you no longer choose **Scale Out** or **Scale In** explicitly.
610+
- Per-row edit and delete icons shown next to the condition summary in each row.
611+
612+
Click the `Add Rules` button to open the **Add Auto Scaling Rule** editor. To modify an existing rule, click the edit icon on its row; the **Edit Auto Scaling Rule** editor opens with the rule's values pre-filled. The editor contains the following fields in order:
613+
614+
- **Metric Source**: Select one of `Kernel`, `Inference Framework`, or `Prometheus`.
615+
- **Metric Name**: For `Kernel` and `Inference Framework`, enter a metric name. For `Kernel`, a list of common metrics (such as `cpu_util`, `mem`, `net_rx`, and `net_tx`) is offered as autocomplete suggestions, and you can also type a custom name freely.
616+
- **Metric Name (Prometheus Preset)**: Shown only when **Metric Source** is `Prometheus`. Select a preset from the dropdown; the preset's metric name, query template, and (when defined) **Time Window** are filled in automatically. Below the selector, a **Current value** preview shows the latest value returned by the preset, with a refresh button. When multiple series are returned, the preview shows the number of series and the most recent value; if no data is available, it shows **No data available**.
617+
- **Condition**: A segmented control with two modes:
618+
619+
- **Single**: Defines a single comparison `Metric <op> Threshold`, where `<op>` is either `>` or `<`.
620+
- **Range**: Defines a range `Min Threshold < Metric < Max Threshold`. Both thresholds are required; the minimum must be less than the maximum.
621+
622+
- **Step Size**: A positive integer specifying how many replicas to add or remove per scaling event. The direction (add or remove) is derived automatically from which threshold is configured, so you only specify the magnitude.
623+
- **Time Window**: The time window, in seconds, over which the metric is aggregated and evaluated for scaling. This replaces the legacy `CoolDown Seconds` field and has a different meaning.
624+
- **Min Replicas** and **Max Replicas**: The lower and upper bounds that auto-scaling enforces on the replica count. Auto-scaling will not reduce the number of replicas below **Min Replicas** or increase it above **Max Replicas**.
625+
626+
![](../images/auto_scaling_rules_modal_v2.png)
627+
628+
When **Metric Source** is set to `Prometheus`, the editor shows the preset selector and the live **Current value** preview.
629+
630+
![](../images/auto_scaling_rules_modal_prometheus_v2.png)
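The condition model described above can be sketched in a few lines. This is a simplified illustration, not the manager's implementation. It assumes that exceeding the maximum threshold scales out, that dropping below the minimum threshold scales in, and that `metric` is already aggregated over the **Time Window**. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Condition:
    """Thresholds of one rule (hypothetical shape, not the real schema)."""
    min_threshold: Optional[float] = None  # assumed: scale in when metric < min
    max_threshold: Optional[float] = None  # assumed: scale out when metric > max

    def validate(self) -> None:
        if self.min_threshold is None and self.max_threshold is None:
            raise ValueError("at least one threshold is required")
        if (self.min_threshold is not None and self.max_threshold is not None
                and self.min_threshold >= self.max_threshold):
            raise ValueError("the minimum must be less than the maximum")


def step_sign(cond: Condition) -> str:
    """Derive the sign shown in the Step Size column from the thresholds."""
    cond.validate()
    if cond.min_threshold is not None and cond.max_threshold is not None:
        return "±"  # Range mode: both directions are possible
    return "+" if cond.max_threshold is not None else "−"


def next_replicas(current: int, metric: float, cond: Condition,
                  step_size: int, min_replicas: int, max_replicas: int) -> int:
    """Apply one scaling step and clamp to the Min/Max Replicas bounds."""
    cond.validate()
    if step_size < 1:
        raise ValueError("step size must be a positive integer")
    target = current
    if cond.max_threshold is not None and metric > cond.max_threshold:
        target = current + step_size
    elif cond.min_threshold is not None and metric < cond.min_threshold:
        target = current - step_size
    return max(min_replicas, min(max_replicas, target))
```

With a range rule of 20 to 80, a step size of 2, and bounds of 1 to 4 replicas, a metric of 95 moves 3 replicas to 4 rather than 5, because the result is clamped to **Max Replicas**.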
631+
590632
<a id="generating-tokens"></a>
591633

592634
### Generating Tokens
@@ -750,6 +792,65 @@ When a service is created from the Model Store, the settings from
750792
service later through the Serving page.
751793
:::
752794

795+
:::note
796+
From Backend.AI version **26.4.0** onwards, the Model Store has been redesigned. If you are on
797+
an older version, the description above still applies. Otherwise, refer to
798+
[Model Store (version 26.4.0 and later)](#model-store-version-26-4-0-and-later) below.
799+
:::
800+
801+
<a id="model-store-version-26-4-0-and-later"></a>
802+
803+
#### Model Store (version 26.4.0 and later)
804+
805+
On Backend.AI version 26.4.0 and later, the Model Store is redesigned with a simplified browsing experience, a card-detail drawer, and a streamlined deploy flow that replaces the legacy browse/detail/run workflow.
806+
807+
![](../images/model_store_page_v2.png)
808+
809+
The page uses a search and sort layout at the top:
810+
811+
- **Search Models**: Use the **Filter By Name** property filter to search model cards by name.
812+
- **Sort**: Choose how results are ordered. The available options are `Name (A→Z)`, `Name (Z→A)`, `Oldest first`, and `Newest first`.
813+
- **Refresh**: Click the refresh button to reload the card list.
814+
815+
Each card displays the model brand icon, title (or name when no title is set), task tag, relative creation time, and the author with an icon. Cards that have **no compatible presets** for the current project are shown at 50% opacity. You can still open such a card to view its details, but its **Deploy** button is disabled and an error alert is shown in the drawer: *No compatible presets available. This model cannot be deployed.*
816+
817+
If the MODEL_STORE project is not set up on the server, the page shows a *Model Store project not found* message with instructions to contact an administrator. If no model cards match your filters, the page displays *No models found*.
818+
819+
The list is paginated at the bottom. You can change the page size between `10`, `20`, and `50` entries.
820+
821+
Click a card to open the model card drawer on the right side of the page. The drawer shows the model title and description at the top, followed by the task, category, labels, and license tags, and then a details list with the following items:
822+
823+
- **Author**
824+
- **Architecture**
825+
- **Framework** (each framework is shown with an icon)
826+
- **Version**
827+
- **Created** and **Last Modified** timestamps
828+
- **Model Folder**: A clickable link that opens the folder explorer for the model storage folder
829+
- **Min Resource**: The minimum resource requirements (CPU, memory, GPU)
830+
831+
If the model card includes a README, it is rendered as a `README.md` card at the bottom of the drawer.
832+
833+
![](../images/model_card_detail_drawer.png)
834+
835+
To clone a model folder in version 26.4.0 and later, use the [Data](../vfolder/vfolder.md) page directly, since the Model Store drawer no longer provides a dedicated Clone button.
836+
837+
Click the **Deploy** button in the drawer header to deploy the model as a service. The deploy flow behaves in one of two ways:
838+
839+
- **Auto-deploy**: If the model has exactly one available preset and the current project has exactly one accessible resource group, the deployment is created immediately, without showing a modal. After the endpoint becomes queryable, you are navigated to its endpoint detail page.
840+
- **Deploy Model modal**: Otherwise, a **Deploy Model** modal opens with the following required fields:
841+
842+
- **Preset**: A dropdown of available resource presets. When presets span multiple runtime variants, options are grouped by runtime variant name; otherwise the options are shown as a flat list.
843+
- **Resource Group**: The resource group where the service will run.
844+
845+
Click the **Deploy** button in the modal to start the deployment. A success toast confirms that the model has been deployed, and you are navigated to the endpoint detail page.
846+
847+
![](../images/model_card_deploy_modal.png)
848+
849+
:::note
850+
If the selected model has no compatible presets for the current project, the drawer's
851+
**Deploy** button is disabled and deployment is blocked until a compatible preset is available.
852+
:::
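The deploy decision described in this section can be summarized as a small function. This is a sketch based on the prose only, and the function names, return values, and the `(preset name, runtime variant)` tuple shape are all hypothetical. The case of zero accessible resource groups is not covered by the text and simply falls through to the modal here.

```python
from typing import Dict, List, Sequence, Tuple, Union

Preset = Tuple[str, str]  # (preset name, runtime variant), hypothetical shape


def choose_deploy_flow(presets: Sequence[Preset],
                       resource_groups: Sequence[str]) -> str:
    """Return 'disabled', 'auto-deploy', or 'modal' for a model card."""
    if not presets:
        return "disabled"  # no compatible presets: Deploy button is disabled
    if len(presets) == 1 and len(resource_groups) == 1:
        return "auto-deploy"  # nothing to choose, so no modal is shown
    return "modal"


def preset_options(presets: Sequence[Preset]) -> Union[Dict[str, List[str]], List[str]]:
    """Group preset names by runtime variant only when variants differ."""
    by_variant: Dict[str, List[str]] = {}
    for name, variant in presets:
        by_variant.setdefault(variant, []).append(name)
    if len(by_variant) > 1:
        return by_variant
    return [name for name, _ in presets]
```

For instance, a single preset combined with two resource groups yields `"modal"`, since the user still has to pick a **Resource Group**.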
853+
753854
## Admin Features
754855

755856
### Admin Serving Page
@@ -801,3 +902,46 @@ You can delete individual model cards by clicking the delete icon in the **Contr
801902
#### Scanning Project Model Cards
802903

803904
Click the `Scan Project Model Cards` button to automatically scan a project's model folders and create model cards for any folders that contain valid model definitions. The scan results show the number of model cards created and updated.
905+
906+
:::note
907+
The **Scan Project Model Cards** button is not available on Backend.AI version 26.4.0 and later.
908+
:::
909+
910+
:::note
911+
From Backend.AI version **26.4.0** onwards, the Admin Model Store Management tab has been
912+
redesigned. If you are on an older version, the description above still applies. Otherwise,
913+
refer to [Admin Model Store Management (version 26.4.0 and later)](#admin-model-store-management-version-26-4-0-and-later) below.
914+
:::
915+
916+
<a id="admin-model-store-management-version-26-4-0-and-later"></a>
917+
918+
#### Admin Model Store Management (version 26.4.0 and later)
919+
920+
On Backend.AI version 26.4.0 and later, the **Model Store Management** tab presents a redesigned model card list.
921+
922+
![](../images/admin_model_card_list_v2.png)
923+
924+
The list provides the following columns:
925+
926+
- **Name**: The unique identifier of the model card.
927+
- **Title**: The human-readable display name.
928+
- **Category**: The model category (e.g., LLM).
929+
- **Task**: The inference task type (e.g., text-generation).
930+
- **Access Level**: Shows a green `Public` tag when the model card is publicly accessible, or a default `Private` tag otherwise.
931+
- **Domain**: The domain that owns the model card.
932+
- **Project**: The project that owns the model card.
933+
- **Created At**: The timestamp when the model card was created.
934+
935+
You can filter the list by **Name** using the property filter bar at the top. Edit and delete action icons are shown directly in the **Name** cell of each row.
936+
937+
To delete multiple model cards at once, select the rows you want to remove using the checkboxes and click the red trash-bin button next to the selection count. A confirmation dialog appears before the cards are deleted.
938+
939+
:::note
940+
The Create, Edit, and Delete dialogs for individual model cards are the same as in the legacy
941+
version. See [Creating a Model Card](#creating-a-model-card),
942+
[Editing a Model Card](#editing-a-model-card), and [Deleting Model Cards](#deleting-model-cards).
943+
:::
944+
945+
:::note
946+
The **Scan Project Model Cards** button is not available on Backend.AI version 26.4.0 and later.
947+
:::