
dpdisp submit implicitly deletes remote working directory via undocumented clean=True default in Submission.run_submission #595

@qchempku2017

The current implementation of Submission.run_submission() (link) introduces a clean parameter that defaults to True.

When enabled, this triggers machine.context.clean() after job completion. For SSHContext, that call removes the entire remote working directory (i.e., everything under the submission-specific remote root) via recursive deletion (rmtree):

(clean)
(rmtree)
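To make the mechanism concrete, here is a minimal local sketch of the control flow described above. It does not use dpdispatcher: StandInContext, this run_submission function, and demo are hypothetical stand-ins, and a temporary local directory plays the role of the remote working directory. Only the clean=True default and the recursive delete are taken from the description.

```python
import shutil
import tempfile
from pathlib import Path


class StandInContext:
    """Hypothetical stand-in for a remote context (not dpdispatcher's SSHContext)."""

    def __init__(self, remote_root: Path) -> None:
        self.remote_root = remote_root

    def clean(self) -> None:
        # Recursively delete the whole submission-specific root.
        shutil.rmtree(self.remote_root, ignore_errors=True)


def run_submission(context: StandInContext, *, clean: bool = True) -> None:
    """Pretend to run a job, then clean up unless the caller opts out."""
    (context.remote_root / "result.out").write_text("job output\n")
    if clean:  # default is True, mirroring the behavior reported in this issue
        context.clean()


def demo(**kwargs) -> bool:
    """Return True if the working directory still exists after the run."""
    base = Path(tempfile.mkdtemp())
    root = base / "submission_root"
    root.mkdir()
    try:
        run_submission(StandInContext(root), **kwargs)
        return root.exists()
    finally:
        # Remove the temporary sandbox regardless of the outcome.
        shutil.rmtree(base, ignore_errors=True)


if __name__ == "__main__":
    print("survives with default:", demo())
    print("survives with clean=False:", demo(clean=False))
```

In this sketch the directory only survives when the caller passes clean=False explicitly, which is the option the CLI path does not offer.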

When using the CLI entrypoint:

dpdisp submit submission.json

the clean parameter is neither exposed nor configurable, whether through the JSON schema or through command-line options:
(submit.py)

As a result, dpdisp submit always executes with clean=True, leading to implicit deletion of all remote job artifacts after completion.
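For illustration, one possible way a CLI could surface this choice is a paired --clean / --no-clean flag. The sketch below is an assumption about what such an option might look like, not an existing dpdisp flag: the parser, the dpdisp-sketch program name, and parse_clean are all hypothetical. It relies only on argparse.BooleanOptionalAction from the standard library (Python 3.9+).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser; this is not the real dpdisp CLI."""
    parser = argparse.ArgumentParser(prog="dpdisp-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    submit = sub.add_parser("submit", help="submit jobs described in a JSON file")
    submit.add_argument("filename", help="path to the submission JSON file")
    # BooleanOptionalAction generates both --clean and --no-clean.
    submit.add_argument(
        "--clean",
        action=argparse.BooleanOptionalAction,
        default=True,  # keeps the current default, but makes it visible in --help
        help="delete the remote working directory after the job finishes",
    )
    return parser


def parse_clean(argv: list[str]) -> bool:
    """Return the value that would be forwarded as the clean argument."""
    return build_parser().parse_args(argv).clean


if __name__ == "__main__":
    print(parse_clean(["submit", "submission.json"]))
    print(parse_clean(["submit", "submission.json", "--no-clean"]))
```

Whether the default should stay True is a separate design question; the point of the sketch is only that the destructive step becomes visible and overridable from the command line.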

Impact

This behavior has several serious implications:

  • Loss of traceability and auditability
    Users cannot inspect intermediate or full results on the remote HPC environment after job completion.
  • Silent destructive default
    The cleanup is destructive (a recursive delete, comparable to rm -rf) and occurs without explicit user consent.
  • Agent / automation incompatibility
    For automated workflows (e.g., LLM/agent-based pipelines), this implicit side effect is particularly problematic because it is not inferable from the CLI, not declared in configuration, and not documented.

Documentation Gap

The following are currently not documented in Submission.run_submission():

  • The existence of the clean parameter
  • Its default value (True)
  • Its destructive effect on remote directories
  • The fact that CLI submission always enables it

This significantly increases the risk of unintended data loss.
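As a rough idea of what closing this gap could look like, the stub below documents the four points listed above in a NumPy-style docstring. It is a hypothetical stand-in that carries only the clean parameter; the real method's other parameters and body are deliberately left out, and the wording is a suggestion rather than the project's documentation.

```python
import inspect


def run_submission(*, clean: bool = True) -> None:
    """Stand-in stub showing how the clean parameter could be documented.

    Parameters
    ----------
    clean : bool, default=True
        If True, clean up the remote context after the jobs finish.
        For SSH contexts this recursively deletes the entire remote
        working directory of the submission, including all job
        artifacts. Pass False to keep the remote files for inspection.

    Notes
    -----
    The dpdisp submit command does not expose this parameter, so CLI
    submissions always run with clean=True.
    """


# The documented default can be checked against the actual signature.
documented_default = inspect.signature(run_submission).parameters["clean"].default
```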

Labels: bug, documentation
