Commit 7635cd4

TA: CS-2229 deleted_by: reschedule (reschedule_unknown, qmod -r) — set JR_deleted_by on the pseudo job report
1 parent 1f47520 commit 7635cd4

6 files changed

Lines changed: 111 additions & 83 deletions

doc/markdown/man/man1/qacct.md

Lines changed: 3 additions & 2 deletions
@@ -70,8 +70,9 @@ printed. If neither a name nor an ID is given all jobs are enlisted. This option
 `qacct`. If activated, CPU times are no longer accumulated but the "raw" accounting information is printed in a
 formatted form instead. See xxqs_name_sxx_accounting(5) for an explanation of the displayed information.
 
-For a job that was deleted, the output additionally contains a `deleted_by` line identifying who deleted the
-job (see the *deleted_by* attribute in xxqs_name_sxx_accounting(5)).
+For a job that was deleted, killed for exceeding a resource limit, or whose run was
+rescheduled by qmaster or by `qmod -r`, the output additionally contains a `deleted_by`
+line identifying the actor (see the *deleted_by* attribute in xxqs_name_sxx_accounting(5)).
 
 ## -l *attr=val,...*
 A resource requirement specification which must be met by the queues in which the jobs being accounted were
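
The changed paragraph says the `qacct` output gains a `deleted_by` line for such jobs. As a rough illustration only, the following Python sketch pulls that value out of captured `qacct` text. It assumes the output consists of one whitespace-separated `name value` pair per line, which this diff does not show; the helper name `find_deleted_by` and the sample text are hypothetical.

```python
def find_deleted_by(qacct_output: str):
    """Return the value of the first deleted_by line, or None if there is none.

    Assumption: each line of the output is "<name><whitespace><value>".
    """
    for line in qacct_output.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2 and parts[0] == "deleted_by":
            return parts[1].strip()
    return None


# Hypothetical sample, not real qacct output.
sample = """\
jobnumber    42
exit_status  137
deleted_by   alice@submit01
"""

print(find_deleted_by(sample))
```

A job that was neither deleted nor rescheduled has no such line, so the helper returns `None` in that case.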

doc/markdown/man/man5/sge_accounting.md

Lines changed: 8 additions & 4 deletions
@@ -48,14 +48,18 @@ The department which was assigned to the job.
 
 ## *deleted_by*
 
-Information about who deleted the job, in the form `<user>@<host>`.
+Information about who deleted (or terminated) the current run of the job, in the form
+`<user>@<host>`.
 
 When the job was deleted with the `qdel` command it contains the user and the host of the
 `qdel` request. When the job was killed because it exceeded a resource limit (e.g. `h_rt`) it
 contains `execd@<host>` if the limit was enforced by the execution daemon, or `qmaster@<host>`
-if it was enforced by the qmaster.
+if it was enforced by the qmaster. When the run was terminated because qmaster rescheduled
+the job (for example after a `reschedule_unknown` timeout fired on the execution host) it
+contains `qmaster@<master-host>`. When a run was rescheduled via `qmod -rj`/`qmod -rq` it
+contains the user and host that issued the `qmod` request.
 
-The attribute is absent if the job was not deleted.
+The attribute is absent if the job was not deleted or rescheduled.
 
 (JSONL only)
 
@@ -236,7 +240,7 @@ Values are contained in the following structure and order:
 * category
 * failed
 * exit_status
-* deleted_by (optional, present if the job was deleted)
+* deleted_by (optional, present if the job was deleted or rescheduled)
 * usage - array containing all rusage values
   * rusage
     * ru_wallclock
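
The rules documented above for the `deleted_by` value can be sketched as a small classifier. This is a minimal illustration, not code from the commit: it assumes the accounting record has already been parsed into a Python dict with `deleted_by` as a top-level key, and the function name `describe_deleted_by` and the sample record are hypothetical. Note that `qmaster@<host>` alone does not distinguish a resource-limit kill from an automatic reschedule, and a real user account named `execd` or `qmaster` would be misclassified.

```python
import json


def describe_deleted_by(record: dict) -> str:
    """Map a deleted_by value of the form <user>@<host> to a short description."""
    value = record.get("deleted_by")
    if value is None:
        # The attribute is absent if the job was not deleted or rescheduled.
        return "not deleted or rescheduled"
    actor, _, host = value.partition("@")
    if actor == "execd":
        return f"resource limit enforced by execd on {host}"
    if actor == "qmaster":
        # Either a limit enforced by qmaster or an automatic reschedule.
        return f"limit enforced or run rescheduled by qmaster on {host}"
    # Any other actor is the user behind a qdel or qmod request.
    return f"qdel or qmod request by {actor} from {host}"


# Hypothetical record; only the deleted_by key matters here.
record = json.loads('{"exit_status": 137, "deleted_by": "execd@node07"}')
print(describe_deleted_by(record))
```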

doc/markdown/man/man5/sge_reporting.md

Lines changed: 4 additions & 1 deletion
@@ -175,7 +175,10 @@ For the contents and structure of the accounting records see xxqs_name_sxx_accou
 
 When a job was deleted before it finished, the acct record additionally contains a *deleted_by*
 attribute holding `<user>@<host>` of the `qdel` request, or `execd@<host>` / `qmaster@<host>` when
-the job was killed for exceeding a resource limit. This attribute is emitted only to the JSONL
+the job was killed for exceeding a resource limit. The same attribute is also written when
+qmaster rescheduled the run automatically (e.g. after a `reschedule_unknown` timeout) — in that
+case it holds `qmaster@<master-host>` — or when an operator issued `qmod -rj`/`qmod -rq`, in which
+case it holds `<user>@<host>` of the `qmod` caller. This attribute is emitted only to the JSONL
 reporting format.
 
 ## online_usage
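
Because the attribute is emitted only to the JSONL reporting format, a consumer could tally terminated runs per actor while streaming the file. The sketch below assumes one JSON object per line and a top-level `deleted_by` key; neither the exact record layout nor any record-type field is shown in this diff, so both the function `count_actors` and the sample lines are hypothetical.

```python
import io
import json
from collections import Counter


def count_actors(lines) -> Counter:
    """Count records per actor (the part of deleted_by before the '@')."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        value = record.get("deleted_by")
        if value:  # absent when the job was not deleted or rescheduled
            counts[value.partition("@")[0]] += 1
    return counts


# Hypothetical stream standing in for a JSONL reporting file.
stream = io.StringIO(
    '{"job_number": 1, "deleted_by": "qmaster@master01"}\n'
    '{"job_number": 2}\n'
    '{"job_number": 3, "deleted_by": "alice@submit01"}\n'
    '{"job_number": 4, "deleted_by": "qmaster@master01"}\n'
)
print(count_actors(stream))
```

Passing any iterable of lines works, so an open file handle can replace the `io.StringIO` stand-in.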
