Thanks for the clarification! The problem is that when an FP task crashes with a non-zero exit code, dpdispatcher treats it as a job failure, and once retry_count is exhausted it raises an exception that stops the whole run. Neither ratio_unfinished nor ratio_failed helps at this stage, because the error is raised at the dpdispatcher level, before dpgen's ratio_failed logic ever runs.

Here's what you need to do. Both steps are required:

Step 1: Force the FP command to always return exit code 0

In your machine.json, append || true to your FP command so that crashes don't produce a non-zero exit code:

{
  "fp": {
    "command": "vasp_std || true",
    ...
  }
}

This way, even if the DFT co…
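
To see the effect of `|| true` in isolation, here is a minimal sketch that is independent of dpgen and dpdispatcher. It assumes a POSIX shell, and uses the standard `false` utility as a stand-in for a crashing FP command; the helper name `exit_code` is made up for illustration.

```python
import subprocess


def exit_code(command: str) -> int:
    """Run a command line through the shell and return its exit status."""
    return subprocess.run(command, shell=True).returncode


# `false` is a standard POSIX utility that always exits with a non-zero
# status; it stands in here for an FP command that crashes.
crashed = exit_code("false")

# With `|| true` appended, the shell runs `true` after the failure,
# so the status of the whole command line becomes 0.
masked = exit_code("false || true")

print("without || true:", crashed)
print("with    || true:", masked)
```

Keep in mind that this masks every failure of the command, so the exit status alone no longer tells a completed calculation apart from a crashed one.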

Answer selected by whisper-to