End-to-end tutorials for Minitron structured pruning followed by knowledge distillation, quantization, evaluation, and vLLM deployment.
Each subdirectory covers a specific source model and target size, including the full data blend, pruning config, distillation hyperparameters, evaluation results, and throughput benchmarks.