Download pseudos and make artifacts on-the-fly in make_artifacts.jl by azadoks · Pull Request #16 · JuliaMolSim/PseudoLibrary

azadoks · 2026-04-17T09:43:44Z

This is kind of hacky but works!

I've broken the fixed Dojo v0.5.
I'll work on fixing it by hosting only the modified pseudos in the repo and writing a builder function in add_psedodojo.jl.

I see two benefits to doing it this way:

- Working with the repo is much nicer (no huge size on clone/checkout, no huge commits for new families).
- We won't run into storage limits on the repo when adding many versions, file formats, or variants of new families.

And a few drawbacks:

1. Obviously, the pseudos are no longer stored here, so we're reliant on the pseudo owners/maintainers/distributors to keep their links alive.
2. I'm relying on the add_*.jl convention and the CLI provided by the scripts.
3. I'm calling the add_*.jl scripts via the shell, which requires precompilation each time.
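
To make the last two drawbacks concrete, here is a minimal sketch of what "discover the add_*.jl scripts and run each one via the shell" could look like. It is not the code from this PR: the function names `find_add_scripts` and `script_command`, the `--project` choice, and the single positional output-directory argument are all assumptions made for illustration.

```julia
# Hypothetical sketch; names and CLI arguments are assumptions, not the PR's code.

# Collect the builder scripts that follow the add_*.jl naming convention.
function find_add_scripts(dir::AbstractString)
    names = filter(f -> startswith(f, "add_") && endswith(f, ".jl"), readdir(dir))
    return joinpath.(dir, sort(names))
end

# Build the command that runs one script in a fresh Julia process.
# Every fresh process has to load (and possibly precompile) its
# dependencies again, which is the cost mentioned in the last drawback.
function script_command(script::AbstractString, outdir::AbstractString)
    julia = Base.julia_cmd()
    return `$julia --project=$(dirname(script)) $script $outdir`
end

# Usage (assumed layout): run every builder, writing into "artifacts/".
# for script in find_add_scripts("scripts")
#     run(script_command(script, "artifacts"))
# end
```

Calling the builders in-process (for example with `include`) would avoid the repeated startup cost, but it couples the artifact script more tightly to the internals of each add_*.jl script.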

Release sizes are still capped to the Git LFS limit, but that would have been a problem in any case.

mfherbst · 2026-04-19T07:27:13Z

Sorry, this commit is so huge (due to all the removed files) that I'm unable to see what you actually did. Could you point me to the relevant changes, ideally with direct links to lines in the files in your fork? GitHub understandably has issues if you remove 12M lines of code in one commit.

Given the above, take what I write with a grain of salt:
When setting up this repo, I also thought of simply calling the add scripts during the artifact build. I decided against it to keep the mechanics as simple as possible. My point is that managing such a pseudo repo takes a lot of time, effort and responsibility (and it's not fun science!), and the Julia community is small, so we should really make sure not to put load on future us.

I see your point about storage, but to me there is a clear benefit to having a "locked-in" version in a repo like this. Some of the parsing does quite a lot (and involves decisions) that should be reproducible. If all this happens automagically in a CI run, it gets very hard to figure out what went wrong if all of a sudden you get a different number when seemingly using the same pseudos. So broken magic here potentially has a huge impact on scientific outcomes, which requires some care and, in my opinion, a human in the loop.
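
As an illustration of what "locked-in" could mean if files were still fetched at build time, one option is to pin a SHA-256 digest per file and fail loudly on any mismatch. This is only a sketch under that assumption: the `verify_file` helper is hypothetical and is not part of the repository or this PR. It uses the `SHA` standard library.

```julia
using SHA

# Hypothetical helper: compare a file's SHA-256 digest against a pinned value.
# A mismatch means the upstream file changed and needs a human to look at it.
function verify_file(path::AbstractString, expected::AbstractString)
    actual = bytes2hex(open(sha256, path))
    if actual != lowercase(expected)
        error("Checksum mismatch for $path: expected $expected, got $actual")
    end
    return true
end
```

A check like this detects silent upstream changes, but it does not help when a link disappears entirely, and it does not pin the parsing decisions themselves.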

My main concern is your first drawback, the reliance on upstream links. Given the state of the pseudo ecosystem, I think it is very likely, close to 100%, that a repo will just disappear in the future. We definitely need resilience against that.

Is storage such a big issue? Can this not be solved by using multiple git subrepos that we control?

azadoks · 2026-04-20T12:48:09Z

I guess storage is not the main issue for me per se but rather the pain of dealing with a repo with so many large files.

I definitely agree that we should guard ourselves against repos disappearing (see, e.g., old versions of the full GBRV table).

In this case maybe the best response is, as you say, subrepos.

azadoks added 3 commits April 17, 2026 11:39

Download pseudos and make artifacts on-the-fly in make_artifacts.jl

f454b3c

Set git user.name

33aa8f1

Set git user.email

3696572

azadoks requested a review from mfherbst April 17, 2026 10:01

azadoks closed this Apr 20, 2026

azadoks wants to merge 3 commits into JuliaMolSim:master from azadoks:on-the-fly