Commit 6980317

[NV] llm-d: resolve DI_REPO_DIR from sbatch submit dir, not staged $0
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
1 parent c501599 commit 6980317

1 file changed

Lines changed: 6 additions & 2 deletions

benchmarks/multi_node/llm-d/job.slurm

@@ -12,9 +12,13 @@ set -euo pipefail
 echo "=== llm-d job start ==="
 echo "UTC: $(TZ=UTC date '+%Y-%m-%d %H:%M:%S %Z')"
 
-# Repo root (benchmarks/multi_node/llm-d/job.slurm -> ../../..)
-DI_REPO_DIR=$(cd "$(dirname "$0")/../../.." && pwd)
+# Repo root. $(pwd) = the sbatch submit dir, which the wrapper sets to
+# benchmarks/multi_node/llm-d/ before invoking submit.sh, so 3 up =
+# repo root. Using $(dirname "$0") would resolve to a SLURM staging
+# copy under /var/spool/... and miss the checkout entirely.
+DI_REPO_DIR=$(cd "$(pwd)/../../.." && pwd)
 export DI_REPO_DIR
+echo "REPO DIR: ${DI_REPO_DIR}"
 
 ALL_NODES=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
 TOTAL_NODES=$(echo "$ALL_NODES" | wc -l)
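
The change above relies on the working directory at job start being the submit directory. A slightly more defensive variant is sketched below. It is not part of this commit: the function name `resolve_repo_root` and the sanity check on the job script path are hypothetical. `SLURM_SUBMIT_DIR` is the variable sbatch sets to the directory it was invoked from, and the sketch assumes, as the commit comment states, that this directory is benchmarks/multi_node/llm-d/.

```shell
# Hypothetical helper, not part of the commit: resolve the repo root from the
# sbatch submit directory instead of from the staged copy of the script ($0).
resolve_repo_root() {
    # SLURM_SUBMIT_DIR is set by sbatch; fall back to the current directory
    # when the script is run outside SLURM.
    local submit_dir="${SLURM_SUBMIT_DIR:-$PWD}"
    local root
    # Assumption taken from the commit comment: the submit dir is
    # benchmarks/multi_node/llm-d/, so three levels up is the repo root.
    root=$(cd "${submit_dir}/../../.." && pwd) || return 1
    # Assumed sanity check: the job script must exist at its known path.
    if [ ! -f "${root}/benchmarks/multi_node/llm-d/job.slurm" ]; then
        echo "ERROR: ${root} does not look like the repo root" >&2
        return 1
    fi
    printf '%s\n' "$root"
}
```

In job.slurm this could be used as `DI_REPO_DIR=$(resolve_repo_root)`; with `set -euo pipefail` in effect, a wrong submit directory would then abort the job early instead of surfacing later as a missing file.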
