Commit 811a82d

Add support for deepseek-r1-distill-qwen-1.5b and deepseek-r1-distill-qwen-7b local model variants (#4137)
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-1.5B
* Upload files for DeepSeek-R1-Distill-Qwen-1.5B
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-1.5B-cuda-gpu
* Add files for DeepSeek-R1-Distill-Qwen-1.5B-cuda-gpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-1.5B-generic-cpu
* Add files for DeepSeek-R1-Distill-Qwen-1.5B-generic-cpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-1.5B-generic-gpu
* Add files for DeepSeek-R1-Distill-Qwen-1.5B-generic-gpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-1.5B-qnn-npu
* Add files for DeepSeek-R1-Distill-Qwen-1.5B-qnn-npu
* Update and rename model.yml to model.yaml
* Update and rename spec.yml to spec.yaml
* Update description.md
* Update and rename model.yml to model.yaml
* Update and rename spec.yml to spec.yaml
* Update and rename description (1).md to description.md
* Update and rename model (1).yaml to model.yaml
* Update and rename spec (1).yaml to spec.yaml
* Update and rename description (2).md to description.md
* Update and rename model (2).yaml to model.yaml
* Update and rename spec (2).yaml to spec.yaml
* Update and rename description (3).md to description.md
* Update and rename model (3).yaml to model.yaml
* Update and rename spec (3).yaml to spec.yaml
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-7B
* Add files for DeepSeek-R1-Distill-Qwen-7B
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-7B-cuda-gpu
* Add files for DeepSeek-R1-Distill-Qwen-7B-cuda-gpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-7B-generic-cpu
* Add files for DeepSeek-R1-Distill-Qwen-7B-generic-cpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-7B-generic-gpu
* Add files for DeepSeek-R1-Distill-Qwen-7B-generic-gpu
* Create asset.yaml for DeepSeek-R1-Distill-Qwen-7B-qnn-npu
* Add files for DeepSeek-R1-Distill-Qwen-7B-qnn-npu
* Update and rename description (5).md to description.md
* Update and rename model (5).yaml to model.yaml
* Update and rename spec (5).yaml to spec.yaml
* Update and rename description (6).md to description.md
* Update and rename model (6).yaml to model.yaml
* Update and rename spec (6).yaml to spec.yaml
* Update and rename description (7).md to description.md
* Update and rename model (7).yaml to model.yaml
* Update and rename spec (7).yaml to spec.yaml
* Update and rename description (8).md to description.md
* Update and rename model (8).yaml to model.yaml
* Update and rename spec (8).yaml to spec.yaml
* Update spec.yaml
* Update and rename description (4).md to description.md
* Update and rename model (4).yaml to model.yaml
* Update and rename spec (4).yaml to spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update spec.yaml
* Update spec.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml
* Update spec.yaml
* Update model.yaml for 1.5b cuda gpu
* Update model.yaml for 1.5b generic cpu
* Update model.yaml for 1.5b generic gpu
* Delete assets/models/system/DeepSeek-R1-Distill-Qwen-1.5B-qnn-npu directory
* Update model.yaml
* Update model.yaml for 7b cuda gpu
* Update model.yaml for 7b generic cpu
* Update model.yaml for 7b generic gpu
* Delete assets/models/system/DeepSeek-R1-Distill-Qwen-7B-qnn-npu directory
* Update model.yaml
* Update foundrylocal tag to foundryLocal
1 parent e9d9b0a commit 811a82d

32 files changed

Lines changed: 354 additions & 0 deletions

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+extra_config: model.yaml
+spec: spec.yaml
+type: model
+categories: ["Local"]
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on CUDA GPUs. This model uses RTN quantization.
+
+# Model Description
+- **Developed by:** Microsoft
+- **Model type:** ONNX
+- **License:** MIT
+- **Model Description:** This is a conversion of the DeepSeek-R1-Distill-Qwen-1.5B for local inference on CUDA GPUs.
+- **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model.
+
+# Base Model Information
+See Hugging Face model [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for details.
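The description above says the model uses RTN (round-to-nearest) quantization, and the blob path in this commit mentions `int4` and `block-32`. As a rough illustration of what block-wise RTN means, here is a minimal sketch in plain Python. It assumes a symmetric scheme with one scale per block; the actual conversion tooling may differ (for example by using zero points or packed storage), and the function names are hypothetical.

```python
def rtn_quantize_block(values, bits=4):
    """Symmetric round-to-nearest quantization of one block of floats.

    ASSUMPTION: symmetric, one scale per block. This is an illustration,
    not the scheme confirmed by the commit.
    """
    qmax = 2 ** (bits - 1) - 1        # 7 for int4
    qmin = -(2 ** (bits - 1))         # -8 for int4
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest > 0 else 1.0
    # Round each scaled value to the nearest integer code, then clamp.
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale


def rtn_dequantize_block(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def rtn_quantize(weights, block_size=32, bits=4):
    """Quantize a flat list of weights block by block."""
    return [
        rtn_quantize_block(weights[start:start + block_size], bits)
        for start in range(0, len(weights), block_size)
    ]


# Hypothetical weights, two blocks of 32 values each.
weights = [i / 10 - 3.2 for i in range(64)]
blocks = rtn_quantize(weights)
```

Each block stores small integer codes plus one floating-point scale, which is why the quantized files are much smaller than the original weights, at the cost of a reconstruction error of at most half a scale step per value.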
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+path:
+  container_name: models
+  container_path: foundrylocal/foundry-local/deepseek-r1-distill-qwen-1.5b/onnx/cuda/cuda-int4-rtn-block-32
+  storage_name: automlcesdkdataresources
+  type: azureblob
+publish:
+  description: description.md
+  type: custom_model
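The `container_path` values in this commit appear to follow a `<root>/<model>/<format>/<target>/<variant>` layout. The sketch below splits a path on that assumption. The layout is inferred only from the two paths visible here, not from any documented scheme, and `BlobVariantPath` and `split_container_path` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobVariantPath:
    root: str
    model: str
    model_format: str
    target: str
    variant: str


def split_container_path(container_path):
    """Split a container_path into its apparent components.

    ASSUMPTION: the last four segments are model, format, target, and
    variant; everything before them is treated as the root prefix.
    """
    parts = container_path.strip("/").split("/")
    if len(parts) < 5:
        raise ValueError(f"unexpected container_path layout: {container_path!r}")
    *root, model, model_format, target, variant = parts
    return BlobVariantPath("/".join(root), model, model_format, target, variant)


cuda = split_container_path(
    "foundrylocal/foundry-local/deepseek-r1-distill-qwen-1.5b/onnx/cuda/cuda-int4-rtn-block-32"
)
cpu = split_container_path(
    "foundrylocal/foundry-local/deepseek-r1-distill-qwen-1.5b/onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4"
)
```

Under this reading, the CUDA and CPU variants share the same root and model segments and differ only in the target and variant segments.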
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+$schema: https://azuremlschemas.azureedge.net/latest/model.schema.json
+name: deepseek-r1-distill-qwen-1.5b-cuda-gpu
+version: 1
+path: ./
+tags:
+  foundryLocal: ""
+  license: "MIT"
+  licenseDescription: "This model is provided under the License Terms available at <https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/LICENSE>."
+  author: Microsoft
+  inputModalities: "text"
+  outputModalities: "text"
+  task: chat-completion
+  maxOutputTokens: 2048
+type: custom_model
+variantInfo:
+  parents:
+    - assetId: azureml://registries/azureml/models/deepseek-r1-distill-qwen-1.5b/versions/1
+  variantMetadata:
+    modelType: 'ONNX'
+    quantization: ['RTN']
+    device: 'gpu'
+    executionProvider: 'CUDAExecutionProvider'
+    fileSizeBytes: 1073741824
+    vRamFootprintBytes: 1362861314
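Two fields in the spec above are easier to read once decoded: the byte counts under `variantMetadata` and the parent `assetId`. The following sketch converts bytes to GiB and splits the asset ID with a regular expression. The ID pattern is inferred from the single example in this commit, and `format_gib` and `parse_asset_id` are hypothetical helper names.

```python
import re

# ASSUMPTION: pattern inferred from the one assetId shown in this commit.
ASSET_ID = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<name>[^/]+)"
    r"/versions/(?P<version>[^/]+)$"
)


def format_gib(num_bytes):
    """Render a byte count in GiB (2**30 bytes) with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


def parse_asset_id(asset_id):
    """Return the registry, model name, and version from an asset ID."""
    match = ASSET_ID.match(asset_id)
    if match is None:
        raise ValueError(f"unrecognized asset ID: {asset_id!r}")
    return match.groupdict()


file_size = format_gib(1073741824)        # "1.00 GiB"
vram_footprint = format_gib(1362861314)   # "1.27 GiB"
parent = parse_asset_id(
    "azureml://registries/azureml/models/deepseek-r1-distill-qwen-1.5b/versions/1"
)
```

The `fileSizeBytes` value for the CUDA variant is exactly 2**30, so it reads as 1.00 GiB, while the declared VRAM footprint is about 1.27 GiB.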
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+extra_config: model.yaml
+spec: spec.yaml
+type: model
+categories: ["Local"]
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on CPUs. This model uses RTN quantization.
+
+# Model Description
+- **Developed by:** Microsoft
+- **Model type:** ONNX
+- **License:** MIT
+- **Model Description:** This is a conversion of the DeepSeek-R1-Distill-Qwen-1.5B for local inference on CPUs.
+- **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model.
+
+# Base Model Information
+See Hugging Face model [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for details.
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+path:
+  container_name: models
+  container_path: foundrylocal/foundry-local/deepseek-r1-distill-qwen-1.5b/onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4
+  storage_name: automlcesdkdataresources
+  type: azureblob
+publish:
+  description: description.md
+  type: custom_model
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+$schema: https://azuremlschemas.azureedge.net/latest/model.schema.json
+name: deepseek-r1-distill-qwen-1.5b-generic-cpu
+version: 1
+path: ./
+tags:
+  foundryLocal: ""
+  license: "MIT"
+  licenseDescription: "This model is provided under the License Terms available at <https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/LICENSE>."
+  author: Microsoft
+  inputModalities: "text"
+  outputModalities: "text"
+  task: chat-completion
+  maxOutputTokens: 2048
+type: custom_model
+variantInfo:
+  parents:
+    - assetId: azureml://registries/azureml/models/deepseek-r1-distill-qwen-1.5b/versions/1
+  variantMetadata:
+    modelType: 'ONNX'
+    quantization: ['RTN']
+    device: 'cpu'
+    executionProvider: 'CPUExecutionProvider'
+    fileSizeBytes: 1964944541
+    vRamFootprintBytes: 1965162961
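With two `variantMetadata` blocks now visible (CUDA and CPU), a small consistency check can catch copy-paste mistakes between variants, such as a CPU spec that still declares a GPU device. The sketch below is a hypothetical helper, not part of the repository's tooling. Its provider-to-device table lists only the two pairs that appear in this commit, and any other provider is left unchecked.

```python
# ASSUMPTION: only the two provider/device pairs visible in this commit.
PROVIDER_DEVICE = {
    "CUDAExecutionProvider": "gpu",
    "CPUExecutionProvider": "cpu",
}


def check_variant_metadata(meta):
    """Return a list of human-readable problems found in one metadata dict."""
    problems = []
    provider = meta.get("executionProvider")
    expected_device = PROVIDER_DEVICE.get(provider)
    # Providers outside the table are not checked, since their device is unknown here.
    if expected_device is not None and meta.get("device") != expected_device:
        problems.append(
            f"{provider} expects device {expected_device!r}, got {meta.get('device')!r}"
        )
    for key in ("fileSizeBytes", "vRamFootprintBytes"):
        value = meta.get(key)
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer, got {value!r}")
    return problems


cuda_meta = {
    "modelType": "ONNX",
    "quantization": ["RTN"],
    "device": "gpu",
    "executionProvider": "CUDAExecutionProvider",
    "fileSizeBytes": 1073741824,
    "vRamFootprintBytes": 1362861314,
}
cpu_meta = {
    "modelType": "ONNX",
    "quantization": ["RTN"],
    "device": "cpu",
    "executionProvider": "CPUExecutionProvider",
    "fileSizeBytes": 1964944541,
    "vRamFootprintBytes": 1965162961,
}
cuda_problems = check_variant_metadata(cuda_meta)
cpu_problems = check_variant_metadata(cpu_meta)
```

Both metadata blocks from this commit pass the check, so the returned lists are empty.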
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+extra_config: model.yaml
+spec: spec.yaml
+type: model
+categories: ["Local"]
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on GPUs. This model uses RTN quantization.
+
+# Model Description
+- **Developed by:** Microsoft
+- **Model type:** ONNX
+- **License:** MIT
+- **Model Description:** This is a conversion of the DeepSeek-R1-Distill-Qwen-1.5B for local inference on GPUs.
+- **Disclaimer:** Model is only an optimization of the base model, any risk associated with the model is the responsibility of the user of the model. Please verify and test for your scenarios. There may be a slight difference in output from the base model with the optimizations applied. Note that optimizations applied are distinct from fine tuning and thus do not alter the intended uses or capabilities of the model.
+
+# Base Model Information
+See Hugging Face model [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for details.
