Enhancement Request
Currently, the llm-benchmark-concurrent-load.yml playbook accepts only one workload type per run, passed through the base_workload parameter:
ansible-playbook -i inventory/hosts.yml \
llm-benchmark-concurrent-load.yml \
-e "test_model=meta-llama/Llama-3.2-1B-Instruct" \
-e "base_workload=chat" \
-e "requested_cores=16"
Proposed Enhancement
Support base_workload=all to automatically loop over all available workload types:
ansible-playbook -i inventory/hosts.yml \
llm-benchmark-concurrent-load.yml \
-e "test_model=meta-llama/Llama-3.2-1B-Instruct" \
-e "base_workload=all" \
-e "requested_cores=16"
This would automatically run all workloads:
chat - Chat workload (512:256)
rag - RAG workload (4096:512)
code - Code generation (512:4096)
summarization - Summarization (1024:256)
short_codegen - Short code generation
Benefits
- Comprehensive testing: Easy way to run all workload types for a model
- Reduced manual intervention: No need to run the playbook multiple times
- Better coverage: Ensures all workload types are tested consistently
Implementation Notes
The playbook would need to:
- Detect when base_workload=all is specified
- Loop over all available workload types
- Run the 3-phase testing for each workload
- Collect and organize results by workload type
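The steps above could be sketched roughly as follows. This is only an illustration, not the playbook's actual structure: the variable names (all_workloads, workloads_to_run, current_workload, results_root, workload_results_dir), the hosts value, and the included task file run-workload-phases.yml are all hypothetical. Only the workload names come from the list in this request.

```yaml
# Sketch only: names below are assumptions, not taken from the real playbook.
- name: Concurrent load benchmark (sketch of base_workload=all handling)
  hosts: all            # placeholder; the real playbook's target hosts may differ
  gather_facts: false
  vars:
    # Workload types listed in this request.
    all_workloads:
      - chat
      - rag
      - code
      - summarization
      - short_codegen
    # Hypothetical results location; "/" in the model name is replaced
    # so that it can be used as a directory name.
    results_root: "results/{{ test_model | replace('/', '_') }}"
  tasks:
    - name: Expand base_workload=all into the full workload list
      ansible.builtin.set_fact:
        workloads_to_run: "{{ all_workloads if base_workload == 'all' else [base_workload] }}"

    - name: Run the 3-phase test once per workload
      # Hypothetical task file that would hold the existing 3-phase logic.
      ansible.builtin.include_tasks: run-workload-phases.yml
      loop: "{{ workloads_to_run }}"
      loop_control:
        loop_var: current_workload
      vars:
        # Keeps each workload's results in its own directory.
        workload_results_dir: "{{ results_root }}/{{ current_workload }}"
```

With this shape, passing a single workload such as base_workload=chat would still run one iteration, so existing invocations would not need to change.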
Related file: automation/test-execution/ansible/llm-benchmark-concurrent-load.yml