Is your feature request related to a problem? Please describe.
Add support for the Nemotron 3 4B dense model: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
Describe the solution you'd like
Loading, training, and inference via the .generate() function should all work. It should be possible to reuse the KV cache I contributed for Nano 30B-A3B.
Describe alternatives you've considered
N/A
Additional context
N/A