README.md
TLDR: Search for `todo` and update all occurrences to your desired name

Docker and Singularity are not a must unless you cannot install some dependencies locally in the HPC shell environment due to permission issues
### Base Repository

1. Change [LICENSE](LICENSE) if necessary
### Docker Config

Continue on a machine where you have docker permission; HPC clusters usually restrict docker access for security reasons

1. Modify `todo-docker-user`, `todo-image-name`, `todo-image-user` in [.env](.env)
- [.env](.env) will be loaded when you use docker compose for build/run/push/...
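A hypothetical sketch of what such an [.env](.env) might contain — the actual key names in this template may differ; the `todo-*` strings are the placeholder values you replace:

```shell
# Hypothetical .env sketch; the real keys in this template may differ.
DOCKER_USER=todo-docker-user     # your docker registry user
IMAGE_USER=todo-image-user       # user inside the image
IMAGE_NAME=todo-image-name       # image/repository name
CODE_FOLDER=/workspace/code      # mount target for the repository
```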
1. [build_docker_image.sh](scripts/build_docker_image.sh) to build and test the image locally in your machine's architecture

- The script uses buildx to build a multi-arch image; you can disable this by removing redundant archs in [docker-compose.yml](docker-compose.yml)
- The building stage does not have GPU access; if some of your dependencies need GPU, build them inside a running container and commit to the final image

1. [run_docker_container.sh](scripts/run_docker_container.sh) or `docker compose up -d` to run and test a built image

- The service by default mounts the whole repository onto `CODE_FOLDER` inside the container, so any modification inside also takes effect outside, which is useful when you use the VS Code remote extension to develop inside a running container with a remote docker context
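The multi-arch build and repository mount described above map onto a compose file roughly like this — a sketch, not this template's exact [docker-compose.yml](docker-compose.yml); the service name and variables are assumptions:

```yaml
# Hypothetical fragment; the real docker-compose.yml may differ.
services:
  app:
    image: ${IMAGE_NAME}
    build:
      context: .
      platforms:          # remove entries here to disable multi-arch
        - linux/amd64
        - linux/arm64
    volumes:
      - .:${CODE_FOLDER}  # mount the whole repository into the container
```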
### Singularity Config

Continue on the actual HPC cluster environment

1. [pull_singularity_image.sh](scripts/pull_singularity_image.sh) to build the singularity image locally

- The singularity image can be built upon an existing docker image
- You should see the image `todo-image-name_latest.def` after a successful build
1. [run_singularity_instance.sh](scripts/run_singularity_instance.sh) to test the image

- Add additional volume binding options to the script, such as dataset directories; best practice is to define them in [.env](.env) then export them in [variables.sh](scripts/variables.sh) with `resolve_host_path` to turn a relative path into an absolute real path
- Singularity instances by default have less environment isolation than docker containers unless you specify additional options, as the script does
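A minimal sketch of what a `resolve_host_path` helper could look like — a hypothetical re-implementation, not this template's [variables.sh](scripts/variables.sh) — used to compose a bind option from a relative path:

```shell
# Hypothetical re-implementation of resolve_host_path: turn a relative
# path into an absolute real path (symlinks resolved).
resolve_host_path() {
  ( cd "$1" 2>/dev/null && pwd -P ) || readlink -f "$1"
}

# Compose an extra volume-binding option for the instance from it.
DATA_BIND="$(resolve_host_path ./data):/data"
echo "--bind $DATA_BIND"
```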
### Job Config

1. Modify job specifications under `jobs/`
- Each (HPC) Slurm environment has different, often heterogeneous partition definitions; you can query them with `sinfo` and its options
- `--ntasks-per-node` specifies the degree of parallelization, and it's convenient to tie other resources to tasks, e.g., `--gpus-per-task`, `--cpus-per-task`, `--mem-per-gpu`, so that you only need to increase ntasks to scale up on a node
- All the jobs have the `-l` (login) option in the shebang so that any command working in your current shell environment should also run as a job

1. `sbatch jobs/your-cluster/your-job.job` or `jobs/your-cluster/your-job.job` to submit jobs
- You should see a file `todo_your_job_name_slurm_job_id.out` in the base folder of this repository, which contains the job logs
1. Recommend [turm](https://github.com/kabouzeid/turm) for monitoring jobs outside the job itself; use `turm -u your-slurm-user` after installation
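Putting the points above together, a job file under `jobs/` could look roughly like this — a sketch only; the partition name, resource sizes, and the final command are assumptions, not this template's actual job:

```shell
#!/bin/bash -l
#SBATCH --job-name=todo_your_job_name
#SBATCH --partition=gpu           # cluster-specific; check with `sinfo`
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2       # raise this to scale up on the node
#SBATCH --gpus-per-task=1         # resources tied to tasks, as above
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-gpu=32G
#SBATCH --output=todo_%x_%j.out   # %x = job name, %j = Slurm job id

srun your-command-here            # anything that works in your login shell
```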