Description of the issue:
The UCI supercomputer has 40 cores per node, so I set up an 80-task decomposition to spread POP across 2 nodes when running gx3v7. Users there have since tried running with 120 and 160 tasks and ran into some issues (described at the end of this section). The auto-decomp tool doesn't provide a very good distribution for those task counts at this resolution, so I'd like to add
<decomp nproc="120" res="gx3v7" >
<maxblocks >1</maxblocks>
<bsize_x >10</bsize_x>
<bsize_y >10</bsize_y>
<nx_blocks >10</nx_blocks>
<ny_blocks >12</ny_blocks>
<decomptype>cartesian</decomptype>
</decomp>
<decomp nproc="160" res="gx3v7" >
<maxblocks >1</maxblocks>
<bsize_x >10</bsize_x>
<bsize_y >8</bsize_y>
<nx_blocks >10</nx_blocks>
<ny_blocks >16</ny_blocks>
<decomptype>cartesian</decomptype>
</decomp>
to bld/generate_pop_decomp.xml at some point.
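Before entries like these are committed, it may be worth sanity-checking them. The following is only a rough sketch in Python, not part of POP or its build scripts: `check_decomp` is a hypothetical helper, and it assumes that blocks tile the grid (so each block count should equal the grid extent divided by the block size, rounded up) and that the total block count must fit within `nproc * maxblocks`. The gx3v7 grid size of 100 x 116 is also an assumption and should be confirmed against the POP grid definition.

```python
import math


def check_decomp(nx_global, ny_global, bsize_x, bsize_y,
                 nx_blocks, ny_blocks, nproc, maxblocks=1):
    """Return a list of problems found in one <decomp> entry (empty if none).

    Hypothetical helper: the rules below are assumptions about how a
    cartesian decomposition should be laid out, not POP's own checks.
    """
    problems = []

    # Number of blocks of the given size needed to cover each dimension.
    need_x = math.ceil(nx_global / bsize_x)
    need_y = math.ceil(ny_global / bsize_y)

    if nx_blocks != need_x:
        problems.append(
            f"nx_blocks={nx_blocks}, but {need_x} blocks of width "
            f"{bsize_x} cover nx_global={nx_global}")
    if ny_blocks != need_y:
        problems.append(
            f"ny_blocks={ny_blocks}, but {need_y} blocks of height "
            f"{bsize_y} cover ny_global={ny_global}")

    # Every block needs a slot on some task.
    total_blocks = nx_blocks * ny_blocks
    if total_blocks > nproc * maxblocks:
        problems.append(
            f"{total_blocks} blocks do not fit on {nproc} tasks "
            f"with maxblocks={maxblocks}")

    return problems


# ASSUMPTION: gx3v7 is taken to be 100 x 116 grid points; verify before use.
NX_GLOBAL, NY_GLOBAL = 100, 116

# Values copied from the two proposed <decomp> entries.
proposed = {
    120: dict(bsize_x=10, bsize_y=10, nx_blocks=10, ny_blocks=12),
    160: dict(bsize_x=10, bsize_y=8, nx_blocks=10, ny_blocks=16),
}

for nproc, entry in proposed.items():
    issues = check_decomp(NX_GLOBAL, NY_GLOBAL, nproc=nproc, maxblocks=1, **entry)
    print(nproc, issues if issues else "consistent")
```

If the assumed grid size or tiling rule is wrong, the reported mismatches mean nothing, so treat the output as a prompt to double-check rather than a verdict.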
The "issues" I mentioned above seem to occur only on their machine, which uses Intel 2018.0.3: there is a crash in running_means_mod.F90 when ladjust_bury_coeff = .true., but I can't reproduce it on any other machine I have access to. I'm hopeful that it is a compiler bug that a more recent compiler will fix; if it persists, I'll open a new issue ticket (I have some ideas on how to investigate if needed).
Version:
- CESM: 2.2.0, but working on moving to the latest 2.3 beta tag (waiting on ESMF library on their machine)
- POP2: this just needs to go in the latest, no need to add to 2.2 release tags
Machine/Environment Description:
GreenPlanet (UCI supercomputer)
Any xml/namelist changes or SourceMods:
n/a