
add support for template values in scontrol command #317

@boegel

I would like to look into adding support for a templated scontrol command, for example by using this in the bot configuration (app.cfg):

scontrol_command = /usr/bin/scontrol --clusters=%(cluster)s

We have a use case for this at HPC-UGent, where our Slurm setup consists of a set of separate clusters (not partitions).

This is not a problem for the squeue command, where we can use squeue --clusters=ALL to query all jobs across all clusters, but for scontrol we need to specify a specific cluster via the --clusters option (ALL doesn't work there).
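As a minimal sketch of how such a template could be resolved (not the bot's actual implementation), the snippet below reads the setting and fills in the cluster name with Python's `%`-style formatting. The `[job_manager]` section name, the `get_scontrol_command` helper, and the cluster name `mycluster` are assumptions made for illustration. One detail worth noting: Python's `configparser` treats `%(name)s` as its own interpolation syntax by default, so the value has to be read with `raw=True` to keep the template intact.

```python
import configparser

# Hypothetical app.cfg fragment; the section name is an assumption.
APP_CFG = """
[job_manager]
scontrol_command = /usr/bin/scontrol --clusters=%(cluster)s
"""


def get_scontrol_command(cfg_text, cluster):
    """Return the scontrol command with the %(cluster)s template filled in."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # raw=True stops configparser from trying to interpolate %(cluster)s
    # itself, which would fail because no 'cluster' option is defined.
    template = parser.get("job_manager", "scontrol_command", raw=True)
    # With a mapping on the right-hand side, a command without any
    # template is returned unchanged.
    return template % {"cluster": cluster}


command = get_scontrol_command(APP_CFG, "mycluster")
```

Because the formatting step is a no-op for commands without a template, existing configurations that use a plain `/usr/bin/scontrol` would keep working under this approach.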

To harvest the value to be used for the %(cluster)s template, the way we run the squeue command should be changed a bit.

We currently run something like:

/usr/bin/squeue --long --noheader --user $USER

and then harvest the job ID, job state, and job (state) reason from that (by extracting the 0th/4th/8th columns).

We can selectively harvest more fields, and also use a better parsing approach, by running something like:

/usr/bin/squeue --long --noheader --user $USER --Format JobId,Cluster,Partition,State,Reason

That way we also control the order of the fields it reports, which is better than assuming that the value at index 4 corresponds to job state.

In the case of HPC-UGent, our bot is configured to use "/usr/bin/squeue --clusters=ALL ..." instead of "/usr/bin/squeue ...".

In this case, the 2nd column provides extra info (the cluster), which can be used to complete the potentially templated scontrol command. Likewise, the 3rd column would provide the value for a %(partition)s template.
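The parsing idea could be sketched roughly as follows, assuming the squeue output has exactly the five requested fields per line, separated by whitespace, and that no field value is empty or contains spaces. The function names, the field keys, and the sample output (including the cluster and partition names) are made up for illustration and are not real squeue output.

```python
# Field order must match the --Format list passed to squeue:
# JobId,Cluster,Partition,State,Reason
SQUEUE_FIELDS = ("jobid", "cluster", "partition", "state", "reason")


def parse_squeue_output(output):
    """Map job ID -> dict of named fields, one entry per matching line."""
    jobs = {}
    for line in output.splitlines():
        values = line.split()
        if len(values) != len(SQUEUE_FIELDS):
            # Skip blank lines and lines that do not match the layout.
            continue
        job = dict(zip(SQUEUE_FIELDS, values))
        jobs[job["jobid"]] = job
    return jobs


def scontrol_command_for(job, template):
    """Fill the %(cluster)s / %(partition)s templates for one job."""
    return template % {"cluster": job["cluster"], "partition": job["partition"]}


# Made-up sample in the assumed layout (not real squeue output).
SAMPLE_OUTPUT = """\
1234  clusterA  part1  PENDING  Priority
5678  clusterB  part2  RUNNING  None
"""

jobs = parse_squeue_output(SAMPLE_OUTPUT)
command = scontrol_command_for(jobs["5678"], "/usr/bin/scontrol --clusters=%(cluster)s")
```

Looking fields up by name rather than by position means that adding another field to the `--Format` list only requires extending `SQUEUE_FIELDS` in the same order.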
