Commit f4be903

abrichr and claude authored
refactor(benchmarks): extract library API from CLI (#21)
* refactor(benchmarks): extract library API from CLI for programmatic usage

Extract core VM management and pool lifecycle logic from cli.py into importable modules (azure_vm.py, pool.py) with clean Python APIs.

- Add AzureVMManager class with Azure SDK primary path + az CLI fallback
- Add PoolManager class for pool create/wait/run/cleanup lifecycle
- Add configurable resource_group via Settings, env var, or --resource-group flag
- Support DefaultAzureCredential for enterprise SSO/service principals
- CLI handlers become thin wrappers delegating to library classes
- Add agent_factory parameter stub on PoolManager.run() for pluggable agents

All 327 tests pass, CLI surface unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix pre-existing ruff lint errors in pool_viewer and resource_tracker

Remove unused import `json` and unused variable `worker_re` in pool_viewer.py, and unused import `Optional` in resource_tracker.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: run ruff formatter on benchmarks modules

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(azure_vm): add SDK path for set_auto_shutdown via generic resource API

Auto-shutdown schedules are Microsoft.DevTestLab/schedules resources. Use azure-mgmt-resource (already a dependency) to create them via the generic resource client, with az CLI fallback if SDK fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
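
The commit message describes an "SDK primary path + az CLI fallback" for AzureVMManager. The real implementation is not shown on this page, so the following is only a minimal sketch of that general pattern. The names `with_fallback`, `sdk_path`, and `cli_path` are hypothetical and do not come from the commit.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def with_fallback(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Try the primary callable; if it raises, log and use the fallback."""
    try:
        return primary()
    except Exception as exc:
        logger.warning("Primary path failed (%s); using fallback", exc)
        return fallback()


def sdk_path() -> str:
    # Hypothetical stand-in for an Azure SDK call; the failure is simulated.
    raise RuntimeError("SDK unavailable")


def cli_path() -> str:
    # Hypothetical stand-in for shelling out to the az CLI.
    return "vm-created-via-cli"


outcome = with_fallback(sdk_path, cli_path)
```

Catching a broad `Exception` keeps the sketch short. A real implementation would likely narrow this to the specific errors the primary path is expected to raise.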
1 parent c1f4b98 commit f4be903

9 files changed

Lines changed: 2446 additions & 1090 deletions

Lines changed: 21 additions & 6 deletions
@@ -1,7 +1,10 @@
 """Benchmark integration for openadapt-ml.
 
-This module provides ML-specific agents for benchmark evaluation.
-These agents wrap openadapt-ml internals (trained policies, API adapters).
+This module provides:
+
+1. ML-specific agents for benchmark evaluation (PolicyAgent, APIBenchmarkAgent, etc.)
+2. Azure VM management with clean Python API (AzureVMManager)
+3. Pool management for parallel WAA evaluation (PoolManager)
 
 For benchmark infrastructure (adapters, runners, viewers), use openadapt-evals:
 ```python
@@ -12,20 +15,32 @@
 )
 ```
 
-ML-specific agents (only available in openadapt-ml):
-- PolicyAgent: Wraps openadapt_ml.runtime.policy.AgentPolicy
-- APIBenchmarkAgent: Uses openadapt_ml.models.api_adapter.ApiVLMAdapter
-- UnifiedBaselineAgent: Uses openadapt_ml.baselines adapters
+Library usage (programmatic, no CLI):
+```python
+from openadapt_ml.benchmarks import PoolManager, AzureVMManager
+
+vm = AzureVMManager(resource_group="my-rg")
+manager = PoolManager(vm_manager=vm)
+pool = manager.create(workers=4)
+manager.wait()
+result = manager.run(tasks=10)
+manager.cleanup(confirm=False)
+```
 """
 
 from openadapt_ml.benchmarks.agent import (
     APIBenchmarkAgent,
     PolicyAgent,
     UnifiedBaselineAgent,
 )
+from openadapt_ml.benchmarks.azure_vm import AzureVMManager
+from openadapt_ml.benchmarks.pool import PoolManager, PoolRunResult
 
 __all__ = [
     "PolicyAgent",
     "APIBenchmarkAgent",
     "UnifiedBaselineAgent",
+    "AzureVMManager",
+    "PoolManager",
+    "PoolRunResult",
 ]
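
The library-usage docstring in this diff calls create, wait, run, and cleanup in sequence. If an earlier step raises, cleanup would be skipped, and that could leave VMs running. The sketch below shows one way a caller might guard against this with `try`/`finally`. `StubPoolManager` and `run_pool` are hypothetical stand-ins written for illustration. They copy only the method names and keyword arguments visible in the docstring and say nothing about how the real PoolManager behaves.

```python
from dataclasses import dataclass, field


@dataclass
class StubPoolManager:
    """Hypothetical stand-in that records calls; not the real PoolManager."""

    calls: list = field(default_factory=list)

    def create(self, workers: int) -> None:
        self.calls.append(f"create({workers})")

    def wait(self) -> None:
        self.calls.append("wait")

    def run(self, tasks: int) -> dict:
        self.calls.append(f"run({tasks})")
        return {"tasks": tasks}

    def cleanup(self, confirm: bool = True) -> None:
        self.calls.append("cleanup")


def run_pool(manager, workers: int, tasks: int):
    """Create, wait, and run; always attempt cleanup, even if a step raises."""
    try:
        manager.create(workers=workers)
        manager.wait()
        return manager.run(tasks=tasks)
    finally:
        manager.cleanup(confirm=False)


stub = StubPoolManager()
result = run_pool(stub, workers=4, tasks=10)
```

With the real classes, a caller would pass a `PoolManager` built as in the docstring instead of the stub, assuming its methods accept the same keyword arguments shown there.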