Commit e04fe9d
test: add unit tests for KubeflowExecutor and KubeflowScheduler

63 tests covering:
- Executor: defaults, kubeconfig fallback, nnodes, nproc_per_node resolution,
  assign, manifest generation for PyTorchJob and TrainJob (structure, resources,
  volumes, env_vars, env_list, labels, image_pull_secrets, tolerations, affinity,
  pod_spec_overrides, spec_kwargs, container_kwargs), launch (success, wait,
  timeout, conflict), status (all states + API errors), cancel (plain, 404,
  wait=True, wait timeout), fetch_logs (no-follow, follow, TrainJob label selector)
- Scheduler: create, dryrun, schedule, describe (all states + UNKNOWN→PENDING
  regression), cancel, log_iter (list + str), persistence (new file, merge,
  missing file), state map

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>

1 parent 4b3603c
2 files changed
Lines changed: 872 additions & 0 deletions