Commit 41812a1
ci: CI online test (#596)
* ci: add ci_test GitHub workflows
* fix: avoid cross-platform CUDA probing in tests
* ci: target master and setup Python in CI
* ci: use python module pip for CI dependency
* ci: update CI submodule for failure logs
* ci: update CI submodule for ci_ref scheduler
* ci: update CI submodule for source-mounted scheduler
* ci: update CI submodule for Unit generator fix
* ci: update CI submodule for tag model configs
* ci: update CI submodule for Unit failure logs
* ci: pass explicit devices for XPU unit jobs
* ci: standardize CI config extension to yml
* ci: update CI submodule for concise job names
* ci: update CI submodule for skipped job names
* Remove obsolete CI and lint config files
* ci: add manual platform dispatch
* ci: remove smoke and performance pipeline jobs
* Update Moore CI deployment fixes
* ci: rerun PR checks
* ci: default PR tests to nvidia
* ci: rerun nvidia check
* ci: update nvidia unit workflow
* ci: run PR checks on active platforms
* ci: register iluvatar platform
* ci: trigger checks on ci online branch
* ci: enable ascend online runner
* ci: rerun with metax scheduler fix
* ci: rerun metax after cancel
* ci: skip ascend image rebuild
* ci: rerun ascend with encoded args
* ci: rerun ascend after runner cleanup
* ci: rerun iluvatar with timeout guard
* ci: cancel stale online runs
* ci: cap metax unit runtime
* ci: match ascend runner label
* ci: avoid queued platforms blocking ascend
* ci: rerun ascend with runner proxy
* ci: rerun ascend after python compatibility fix
* ci: rerun ascend with scheduler image
* ci: rerun ascend locally
* ci: run metax quick operator subset
* ci: install ascend build dependencies
* ci: rerun iluvatar after scheduler fix
* ci: rerun metax quick subset
* ci: rerun with safe matrix output
* ci: rerun after matrix output fix
* ci: rerun after matrix output fix
* ci: rerun iluvatar after report fix
* ci: rerun ascend accepting docker 137
* ci: limit metax online smoke cases
* ci: rerun metax after busy gpu filter
* ci: rerun full ci online
* ci: address pr feedback
* ci: use prebuilt ascend test image
* test: generate fallback randint data on cpu
* test: format gemm skip reason as markdown
* ci: build ascend test image from dockerfile
* ci: update ci tooling submodule
* ci: opt ascend into buildkit
* ci: keep default repo branch on master
* ci: run ascend tests on free npu
* ci: let ascend pick an available npu
* ci: update dynamic ascend allocation tooling
* ci: update ascend npu allocation parser
* ci: update ascend logical device mapping
* ci: use nvidia base compatible with runner
* ci: align nvidia test command with master
* ci: run nvidia tests on compatible base image
* ci: address review comments
* ci: update moore resource locking
* ci: update scheduler stale lock cleanup
* ci: update nvidia gpu allocation
* ci: add v2 shadow workflow
* ci: handle unavailable v2 shadow agents
* ci: add v2 agent installer
* ci: match v2 runner labels
* ci: enforce v2 shadow checks
* ci: update v2 runner user agent
* ci: default v2 shadow to active platforms
* ci: limit v2 agent queue wait to ten minutes
* ci: use self-healing v2 agent workflow
* ci: use transient state dir fallback
* ci: use platform lock probe workflow
* ci: use checkout-free self-hosted workflow
* ci: use nested junit result detection
* ci: use per-job checked-out agent
* ci: use metax resource allocation fix
* test: keep tests aligned with master
* ci: update iluvatar ci tooling
* ci: use early-exit v2 queue watchdog
* ci: handle queued runners and update platform sets
* ci: add iluvatar runner filesystem repair script
* ci: enable iluvatar in legacy workflow
* ci: skip host gpu probing for iluvatar
* ci: pin iluvatar local runner support
* ci: pin shadow workflow ci ref
* Fix Iluvatar CI container setup
* Include Iluvatar CI build backend dependency
* ci: remove local iluvatar repair script
* ci: preflight runner availability before jobs
* ci: pin runner preflight token fix
* ci: pin best-effort runner preflight
---------
Co-authored-by: zhangyue207 <zhangyue207@users.noreply.github.com>
Co-authored-by: Vincent777 <140055255+Vincent777@users.noreply.github.com>
Co-authored-by: zkjh <zkjh@localhost.localdomain>
1 parent fc5aecb commit 41812a1
27 files changed
Lines changed: 132 additions & 5474 deletions
File tree
- .ci
  - images
    - ascend
    - cambricon
    - iluvatar
    - metax
    - moore
    - nvidia
  - tests
- .github
  - workflows