This repository was archived by the owner on Oct 15, 2025. It is now read-only.

How do I use PVC in msvc? #375

@GGGsk

I replaced the model address (hf://xxxxxxxx) with a PVC URI (pvc://pvc_name/model_path), but the PVC is not mounted into the decode and prefill pods that get started. Is there anything else that needs to be configured?

This is the YAML of my ModelService (msvc):

apiVersion: llm-d.ai/v1alpha1
kind: ModelService
metadata:
  creationTimestamp: "2025-07-27T02:16:44Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Sail
  name: llm-3
  namespace: admin-default
  resourceVersion: "89810862"
  uid: b2194e6d-d622-4050-978a-6cc4b29c5477
spec:
  baseConfigMapRef:
    name: basic-sim-preset
  decode:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  decoupleScaling: false
  endpointPicker:
    containers:
    - name: epp
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  modelArtifacts:
    uri: pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct
  prefill:
    containers:
    - args:
      - --model
      - /models
      name: vllm
      resources:
        limits:
          cpu: 1000m
          memory: 1024Mi
        requests:
          cpu: 1000m
          memory: 1024Mi
    replicas: 1
  routing:
    modelName: admin-default/Qwen2.5-VL-3B-Instruct
status:
  conditions:
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: PrefillAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-prefill-746cb9b4" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: PrefillProgressing
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: DecodeAvailable
  - lastTransitionTime: "2025-07-27T02:16:51Z"
    message: ReplicaSet "llm-3-decode-5bc7dfd998" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: DecodeProgressing
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: EppAvailable
  - lastTransitionTime: "2025-07-27T02:16:56Z"
    message: ReplicaSet "llm-3-epp-7569bd4c5c" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: EppProgressing
  decodeAvailable: 1
  decodeDeploymentRef: llm-3-decode
  decodeReady: 1/1
  eppAvailable: 1
  eppDeploymentRef: llm-3-epp
  eppReady: 1/1
  eppRoleBinding: llm-3-epp-rolebinding
  inferenceModelRef: llm-3
  inferencePoolRef: llm-3-inference-pool
  prefillAvailable: 1
  prefillDeploymentRef: llm-3-prefill
  prefillReady: 1/1
  prefillServiceAccountRef: llm-3-epp-sa
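
To make the URI format concrete: my understanding is that a pvc:// URI is read as the claim name followed by a path inside that volume. Below is a minimal Python sketch of that split, applied to the modelArtifacts.uri value from the spec above. The helper name split_pvc_uri is hypothetical and not part of llm-d, and the way the controller actually parses the URI is an assumption here.

```python
from urllib.parse import urlparse


def split_pvc_uri(uri: str) -> tuple[str, str]:
    """Split pvc://<claim-name>/<path> into (claim_name, path_in_volume).

    Hypothetical helper for illustration only; it is not llm-d code.
    """
    parsed = urlparse(uri)
    # Reject anything that is not pvc:// or that lacks a claim name.
    if parsed.scheme != "pvc" or not parsed.netloc:
        raise ValueError(f"not a pvc:// URI: {uri!r}")
    # netloc holds the claim name; the rest is the path inside the volume.
    return parsed.netloc, parsed.path.lstrip("/")


claim, model_path = split_pvc_uri(
    "pvc://pvc-45690f60056e40698a99669d9d060543/Qwen2.5-VL-3B-Instruct"
)
print(claim)
print(model_path)
```

With the URI from my spec, the claim name would be pvc-45690f60056e40698a99669d9d060543 and the path inside the volume would be Qwen2.5-VL-3B-Instruct.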
