add note on lmod cache when loading an EESSI module #731

Open
smoors wants to merge 1 commit into EESSI:main from smoors:lmod-cache

@smoors

@smoors smoors commented Apr 4, 2026

Contributor
$ module unuse $MODULEPATH
$ module use /cvmfs/software.eessi.io/init/modules
$ module load EESSI/2025.06
Lmod has detected the following error:  Unable to load module because of error when evaluating modulefile:
     /cvmfs/software.eessi.io/init/modules/EESSI/2025.06.lua: /usr/share/lmod/lmod/libexec/ModuleA.lua:676: Did not find mpath in mA

     Please check the modulefile and especially if there is a line number specified in the above message
If you don't understand the warning or error, contact the helpdesk at hpc@vub.be 
While processing the following module(s):
    Module fullname  Module Filename
    ---------------  ---------------
    EESSI/2025.06    /cvmfs/software.eessi.io/init/modules/EESSI/2025.06.lua
$ LMOD_IGNORE_CACHE=1 module load EESSI/2025.06
Module for EESSI/2025.06 loaded successfully
$ module --version

Modules based on Lua: Version 8.7.65 2025-08-05 10:24 -06:00
    by Robert McLay mclay@tacc.utexas.edu

@smoors smoors changed the title add note on lmod cache when loading a module add note on lmod cache when loading an EESSI module Apr 4, 2026
@boegel

boegel commented Apr 7, 2026

Contributor

@ocaisa Is that the best we can do here?

@casparvl

casparvl commented Apr 7, 2026

Collaborator

Ha, I just pinged Alan as well. His words: disabling the cache will be very expensive.

@smoors I'm wondering why you are running into this issue. I have the feeling that, rather than just disabling the cache, there is an actual issue that should be fixed instead.

@ocaisa

ocaisa commented Apr 7, 2026

Member

I think we need to understand what is going wrong here first; ignoring the cache is very expensive for things like module avail.

My suspicion would be that there is a user spider cache there for /cvmfs/software.eessi.io/init/modules that was generated with the Lmod from EESSI, and that Lmod version is too new to be compatible with the system Lmod. If you delete the local cache, the old Lmod will create a new one and things should work again.
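
A minimal sketch of the cache reset suggested here, assuming the per-user spider cache lives in `~/.cache/lmod` (the path that is removed by hand later in this thread; some Lmod setups may use a different location). The function name `clear_user_lmod_cache` is hypothetical and not part of Lmod.

```shell
# Remove a per-user Lmod spider cache so that Lmod regenerates it on next use.
# Assumption: the cache directory is ~/.cache/lmod unless a path is passed in.
clear_user_lmod_cache() {
    cache_dir="${1:-$HOME/.cache/lmod}"
    if [ -d "$cache_dir" ]; then
        rm -r -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "no user cache at $cache_dir"
    fi
}
```

Calling `clear_user_lmod_cache` without arguments targets the assumed default path; passing a directory makes it easy to try the function on a scratch copy first.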

@smoors

smoors commented Apr 7, 2026

Contributor Author

> My suspicion would be that there is a user spider cache there for /cvmfs/software.eessi.io/init/modules that was generated with the Lmod from EESSI, and that Lmod version is too new to be compatible with the system Lmod. If you delete the local cache, the old Lmod will create a new one and things should work again.

there is no local cache in my account, which is only generated if there is no system cache, and we do have a system cache. our Lmod is more recent than the one from EESSI.

@ocaisa

ocaisa commented Apr 7, 2026

Member

Oh, that's worse, now I would really like to know what is going on.

@smoors

smoors commented Apr 7, 2026

Contributor Author

note that i only have to disable the cache when loading an EESSI module, not when subsequently loading any modules in the EESSI stack. this works perfectly:

$ LMOD_IGNORE_CACHE=1 ml EESSI/2025.06
Module for EESSI/2025.06 loaded successfully
$ ml GROMACS/2025.2-foss-2025a

@ocaisa

ocaisa commented Apr 7, 2026

Member

This looks relevant: TACC/Lmod#780, and specifically TACC/Lmod#780 (comment)

@ocaisa

ocaisa commented Apr 7, 2026

Member

I think the less impactful suggestion here is export LMOD_CACHED_LOADS=no, does that work for you?

@smoors

smoors commented Apr 7, 2026

Contributor Author

> I think the less impactful suggestion here is export LMOD_CACHED_LOADS=no, does that work for you?

that doesn't work for me.

i really don't think setting LMOD_IGNORE_CACHE=1 only for loading an EESSI module is "impactful", as there are only 2 modules available at that point.

@casparvl

casparvl commented Apr 7, 2026

Collaborator

That may be true, the risk is that people misinterpret and think "you know what, I'll just export LMOD_IGNORE_CACHE=1 in my .bashrc so I don't need to remember every time". Plus, if there's a fundamental bug, we should fix the bug rather than tell people to work around it :)

Also, understanding the issue better will help us to give more targeted advice. I.e. if it was the issue that Alan linked to, we could give much more concrete advice "if you see XXX and are running with LMOD version YYY or older, either update to version ZZZ or load the EESSI module with ... WARNING: do NOT export LMOD_IGNORE_CACHE=1 as a global setting, as this would make interaction with the modules provided by EESSI extremely slow."
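
To illustrate the difference between a one-off setting and a global `export`, here is a small sketch. It replaces Lmod's `module` shell function with a stand-in that only reports what it sees, so it can be run anywhere; the stand-in is an assumption for illustration and does nothing Lmod-specific. The behaviour shown is that of bash in its default (non-POSIX) mode.

```shell
# Stand-in for Lmod's `module` shell function (illustration only):
# it just reports whether LMOD_IGNORE_CACHE is visible to the command.
module() {
    echo "module $*: LMOD_IGNORE_CACHE=${LMOD_IGNORE_CACHE:-unset}"
}

unset LMOD_IGNORE_CACHE

# One-off: the variable applies to this single command only.
LMOD_IGNORE_CACHE=1 module load EESSI/2025.06

# Later commands no longer see it, so the cache would be used again.
module load GROMACS/2025.2-foss-2025a
```

With `export LMOD_IGNORE_CACHE=1` instead, every later `module` command in the session would also see the variable, which is the slow global setting warned about above.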

@smoors

smoors commented Apr 7, 2026

Contributor Author

> That may be true, the risk is that people misinterpret and think "you know what, I'll just export LMOD_IGNORE_CACHE=1 in my .bashrc so I don't need to remember every time". Plus, if there's a fundamental bug, we should fix the bug rather than tell people to work around it :)
>
> Also, understanding the issue better will help us to give more targeted advice. I.e. if it was the issue that Alan linked to, we could give much more concrete advice "if you see XXX and are running with LMOD version YYY or older, either update to version ZZZ or load the EESSI module with ... WARNING: do NOT export LMOD_IGNORE_CACHE=1 as a global setting, as this would make interaction with the modules provided by EESSI extremely slow."

of course, i fully agree! i wasn't suggesting we shouldn't try to fix/understand it better, i was merely responding to Alan's "less impactful" suggestion.

@ocaisa

ocaisa commented Apr 7, 2026

Member

I do think it is pretty likely that someone would export LMOD_IGNORE_CACHE=1 and then things will slow down for them. If it had worked, export LMOD_CACHED_LOADS=no would just affect loads and would be harder to notice.

@ocaisa

ocaisa commented Apr 8, 2026

Member

Just to note, we do have

if ( mode() ~= "spider" ) then
    prepend_path("MODULEPATH", eessi_module_path)
    eessiDebug("Adding " .. eessi_module_path .. " to MODULEPATH")
end

(https://github.com/EESSI/software-layer-scripts/blob/f453fe9f897dc240199c7f886459ea54c58724e4/init/modules/EESSI/2023.06.lua#L158-L162)

in our module file, which matches what is seen in TACC/Lmod#780 (comment). You could put version protection around that, but it is there for a reason as otherwise a cache will be created/updated for the EESSI paths. If we can find a way to create a cache for /cvmfs/software.eessi.io/init/modules then I think this wouldn't be necessary. This is not trivial though as I don't know if spider actually knows what the architecture-specific update to the MODULEPATH looks like.

@smoors

smoors commented Apr 9, 2026

Contributor Author

i did some testing and digging.

it's an issue with Lmod itself, specifically in the file ModuleA.lua, starting from 8.7.65 and fixed in 9.0.5

removing the original cache didn't help, and adding a cache for the EESSI modules didn't help either.
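
A rough sketch of how the reported version range (8.7.65 up to and including 9.0.4) could be checked from a shell. It parses the banner format shown in the `module --version` output earlier in this thread and relies on GNU `sort -V`; the function names are hypothetical and the range itself is taken from the comment above, not verified independently.

```shell
# Extract the version number from the banner printed by `module --version`.
lmod_version_from_banner() {
    sed -n 's/^Modules based on Lua: Version \([0-9][0-9.]*\).*/\1/p'
}

# True if version $1 >= version $2 (needs GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# True for 8.7.65 up to and including 9.0.4, the range reported in this thread.
lmod_in_affected_range() {
    version_ge "$1" 8.7.65 && ! version_ge "$1" 9.0.5
}

# Example using the banner line quoted earlier in the conversation.
banner='Modules based on Lua: Version 8.7.65 2025-08-05 10:24 -06:00'
version=$(printf '%s\n' "$banner" | lmod_version_from_banner)
if lmod_in_affected_range "$version"; then
    echo "Lmod $version is in the affected range"
fi
```

On a live system the banner would come from `module --version`; its output may need to be redirected with `2>&1` before piping, since Lmod tends to write such messages to stderr.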

@ocaisa

ocaisa commented Apr 9, 2026

Member

This is a tricky thing to fix or work around. One thing we could do is have the module print a load message for that version range, that's better than doing it in the docs because they can act on it straight away.

@ocaisa

ocaisa commented Apr 9, 2026

Member

Hmm, there may be a way to get the behaviour we would need with haveDynamicMPATH(), let me see if I can reproduce

@ocaisa

ocaisa commented Apr 9, 2026

Member

I can't reproduce this, I've tried a few different Lmod versions in the range.

ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module purge
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ rm ~/.cache/lmod/ -r
rm: cannot remove '/home/ocaisa/.cache/lmod/': No such file or directory
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module unuse $MODULEPATH
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module use /cvmfs/software.eessi.io/init/modules
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module load EESSI/2025.06
Module for EESSI/2025.06 loaded successfully
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module --version

Modules based on Lua: Version 8.7.65 2025-08-05 10:24 -06:00
    by Robert McLay mclay@tacc.utexas.edu

@smoors

smoors commented Apr 9, 2026

Contributor Author

i finally found what's causing this in our site: it's our lmod_config.lua:

require("strict")
local cosmic = require("Cosmic"):singleton()
cosmic:assign("LMOD_ADMIN_FILE", "/etc/lmod/admin.list")
cosmic:assign("LMOD_AUTO_SWAP", "no")
cosmic:assign("LMOD_CACHED_LOADS", "yes")
cosmic:assign("LMOD_CASE_INDEPENDENT_SORTING", "yes")
cosmic:assign("LMOD_DISABLE_SAME_NAME_AUTOSWAP", "yes")
cosmic:assign("LMOD_EXTENDED_DEFAULT", "no")
cosmic:assign("LMOD_PACKAGE_PATH", "/etc/lmod")
cosmic:assign("LMOD_PIN_VERSIONS", "yes")
cosmic:assign("LMOD_REDIRECT", "yes")
cosmic:assign("LMOD_SHORT_TIME", 86400)
cosmic:assign("LMOD_SITE_MSG_FILE", "/etc/lmod/lang.lua")
cosmic:assign("LMOD_SITE_NAME", "VUB_HPC")

if i comment out the line with LMOD_CACHED_LOADS, it works, but strangely enough just export LMOD_CACHED_LOADS=no does not override it.
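
A small sketch for checking whether a site configuration forces cached loads in the way shown above. It only looks for an active (not commented-out) `cosmic:assign("LMOD_CACHED_LOADS", "yes")` line in a file passed as an argument. The function name is hypothetical, and where `lmod_config.lua` lives on a given system is an assumption to verify locally.

```shell
# Return success if the given lmod_config.lua sets LMOD_CACHED_LOADS to "yes"
# on a line that is not a Lua comment (lines starting with "--" are skipped).
cached_loads_forced_in_config() {
    config_file="$1"
    grep -Ev '^[[:space:]]*--' "$config_file" \
        | grep -Eq 'cosmic:assign\("LMOD_CACHED_LOADS",[[:space:]]*"yes"\)'
}
```

For example, `cached_loads_forced_in_config /path/to/lmod_config.lua && echo "cached loads enabled by site config"`, with the path replaced by the site's actual file.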

@smoors

smoors commented Apr 9, 2026

Contributor Author

@ocaisa can you try to repro with export LMOD_CACHED_LOADS=yes?

@ocaisa

ocaisa commented Apr 9, 2026

Member

Similar behaviour for me (in a way), if I export that variable it seems to have no effect and I can still successfully load the module

@ocaisa

ocaisa commented Apr 9, 2026

Member

No, sorry, I am wrong, this does indeed reproduce the problem

ocaisa@~/Lmod((HEAD detached at 8.7.65))$ export LMOD_CACHED_LOADS=yes
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module purge
ocaisa@~/Lmod((HEAD detached at 8.7.65))$ module load EESSI/2025.06
Lmod has detected the following error:  Unable to load module because of error when evaluating modulefile:
     /cvmfs/software.eessi.io/init/modules/EESSI/2025.06.lua: /home/ocaisa/lmod/lmod/lmod/libexec/ModuleA.lua:676: Did not find mpath in mA

     Please check the modulefile and especially if there is a line number specified in the above message
While processing the following module(s):
    Module fullname  Module Filename
    ---------------  ---------------
    EESSI/2025.06    /cvmfs/software.eessi.io/init/modules/EESSI/2025.06.lua

@boegel

boegel commented Apr 9, 2026

Contributor

@ocaisa We also have cached loads enabled on the HPC-UGent infrastructure, just in case that's helpful to debug...

@smoors

smoors commented Apr 9, 2026

Contributor Author

> @ocaisa We also have cached loads enabled on the HPC-UGent infrastructure, just in case that's helpful to debug...

but the version in your systems is not affected, only 8.7.65 to 9.0.4

@ocaisa

ocaisa commented Apr 9, 2026

Member

haveDynamicMPATH() does not seem to help

@casparvl

Collaborator

We could just document this as a known issue for versions X-Y, when LMOD_CACHED_LOADS=yes is configured. And then advise that for these versions, people use module load --ignore_cache EESSI/<version> (instead of setting LMOD_IGNORE_CACHE, which could easily be exported 'by mistake'). And stress that the --ignore_cache argument should only be passed when loading the EESSI module - not when loading any subsequent module.

Even better would be if we can detect the LMOD version and LMOD_CACHED_LOADS setting in the EESSI module - and print helpful advice from there. But I don't know if that's possible/easy.

@smoors

smoors commented Apr 13, 2026

Contributor Author

unfortunately, module --ignore_cache load EESSI/<version> doesn't work..

@casparvl

Collaborator

> unfortunately, module --ignore_cache load EESSI/<version> doesn't work..

That's... baffling. I would have assumed it to be identical to LMOD_IGNORE_CACHE :\

@ocaisa

ocaisa commented Apr 14, 2026

Member

Is there something else special about your Lmod setup? It is indeed pretty strange that the cached loads is not affected by the envvar that is supposed to affect it, and the command line option equivalent of LMOD_IGNORE_CACHE also does not work.

@smoors

smoors commented Apr 14, 2026

Contributor Author

for LMOD_CACHED_LOADS, i think the lmod_config.lua file takes precedence over the envvar.

for --ignore_cache, it definitely works when loading "normal" modules, so i suspect this is related to the magic that is happening in the EESSI module?

anyway, it looks like an unhappy coincidence of a lot of factors, so maybe not worth adding this to the docs?

@ocaisa

ocaisa commented May 13, 2026

Member

Perhaps a FAQ entry...if we had such a thing
