Support NVIDIA-Nemotron-3-Nano-4B-BF16

**Is your feature request related to a problem? Please describe.**
Add support for the Nemotron 3 Nano 4B dense model: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16

**Describe the solution you'd like**
Loading, training, and inference via the `.generate()` function should all work. It should be possible to re-use the KV cache I contributed for Nano 30B-A3B.

**Describe alternatives you've considered**
N/A

**Additional context**
N/A


Support NVIDIA-Nemotron-3-Nano-4B-BF16 #2004
