Skip to content

Commit ae018b7

Browse files
authored
EH: CS-1333 add reporting of online usage to reporting file (#80)
* TA: CS-2203 Add online_usage parsing to reporting_params * TA: CS-2199 Add create_online_usage_record writer plumbing + JSONL implementation * TA: CS-2200 Hook online_usage emission into the qmaster job-report path * TA: CS-2202 Document online_usage in sge_conf.5 and sge_reporting.5 * corrected description of valid variable names for online_usag * do not create online_usage records with an empty usage list * improved validation of online_usage, I18N * Potential fix for pull request finding * fix: clear online_usage defaults when reloading reporting_params
1 parent a75fdb2 commit ae018b7

11 files changed

Lines changed: 411 additions & 4 deletions

doc/markdown/man/man5/sge_conf.md

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1307,6 +1307,40 @@ Examples:
13071307
usage_patterns=gpu:nvidia.*|amd.*
13081308
usage_patterns=gpu:nvidia.*;power:power-*
13091309

1310+
***online_usage***
1311+
1312+
When set to a non-empty value, xxQS_NAMExx writes a continuous stream of *online_usage* records to the
1313+
JSONL reporting file. One record is generated for every job report received by xxqs_name_sxx_qmaster(8)
1314+
from xxqs_name_sxx_execd(8) for a running job. The value of this parameter selects which usage variables
1315+
shall be included in each record. The format is:
1316+
1317+
<var>[|<var>[|...]]
1318+
1319+
Variables are separated by `|`. There is no closed list of accepted names: any token that is a valid
1320+
xxQS_NAMExx *complex_name* (see xxqs_name_sxx_types(1)) is accepted by the configuration parser.
1321+
A record will carry only those configured variables that the execution daemon actually reports for
1322+
the job — names the execd does not report are silently skipped. The set of variables reported by
1323+
xxqs_name_sxx_execd(8) today includes *cpu*, *mem*, *io*, *iow*, *ioops*, *vmem*, *maxvmem*, *rss*,
1324+
*maxrss*, *pss*, *maxpss*, *smem*, *pmem*, *wallclock*.
1325+
1326+
Examples:
1327+
1328+
online_usage=cpu|mem|maxvmem|wallclock
1329+
online_usage=cpu|mem|io|iow|ioops|vmem|rss|maxrss
1330+
1331+
If the parameter is absent or set to an empty value, no *online_usage* records will be written. This
1332+
is the default — the feature must be opted into explicitly.
1333+
1334+
For tightly-integrated parallel jobs the granularity of *online_usage* records follows the same
1335+
convention as the accounting record: if the parallel environment has *accounting_summary* set to
1336+
*TRUE*, one record per ja_task is written and pe_task usage is aggregated into it; otherwise one
1337+
record is written per pe_task (and one for the master task).
1338+
1339+
*online_usage* records are written only to the new JSONL reporting file. The deprecated colon-separated
1340+
reporting file (see *old_reporting*) does not carry *online_usage* records.
1341+
1342+
See xxqs_name_sxx_reporting(5) for the *online_usage* record schema.
1343+
13101344
## finished_jobs
13111345

13121346
Note: Deprecated, may be removed in future release. xxQS_NAMExx stores a certain number of *just finished* jobs to

doc/markdown/man/man5/sge_reporting.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -173,6 +173,44 @@ fine-grained resource usage monitoring over time.
173173

174174
For the contents and structure of the accounting records see xxqs_name_sxx_accounting(5).
175175

176+
## online_usage
177+
178+
Records of type online_usage are written for every job report that xxqs_name_sxx_qmaster(8) receives
179+
from xxqs_name_sxx_execd(8) for a running job, when *online_usage* in *reporting_params* has been
180+
configured with a non-empty list of usage variables (see xxqs_name_sxx_sge_conf(5)). They provide a
181+
continuous time series of scaled usage values per running job and are intended for live monitoring
182+
and for being consumed by a downstream database writer.
183+
184+
For tightly-integrated parallel jobs, the granularity is driven by the parallel environment's
185+
*accounting_summary* attribute (see xxqs_name_sxx_sge_pe(5)):
186+
187+
* `accounting_summary=TRUE` — one record per ja_task. Usage values are aggregated across all pe_tasks
188+
of the ja_task — the same convention the *acct* record follows.
189+
* `accounting_summary=FALSE` — one record per pe_task and one for the master task.
190+
191+
online_usage records have the following fields:
192+
193+
* job_number
194+
The job number.
195+
196+
* task_number
197+
The array task id. 0 for non-array jobs, otherwise the index of the array task.
198+
199+
* pe_taskid
200+
The task id of a parallel sub-task. Present only when the record describes the usage of a single
201+
pe_task. Absent when the record represents the master task or an aggregated ja_task-level record
202+
for a PE with *accounting_summary=TRUE*.
203+
204+
* usage
205+
A JSON object containing the usage variables configured in
206+
`reporting_params=...online_usage=<var>[|<var>...]` that are present in the job report. Each entry
207+
is the scaled usage value of one variable. Variables that are configured but not yet reported by
208+
the execd are silently skipped (no `null` value is emitted). For the list of valid variable names
209+
see the description of *online_usage* in xxqs_name_sxx_sge_conf(5).
210+
211+
The online_usage record type is emitted only to the JSONL reporting file. The deprecated colon-separated
212+
reporting format does not carry online_usage records.
213+
176214
## queue
177215

178216
Records of type queue contain state information for queues (queue instances). A queue record has the following fields:

source/daemons/qmaster/configuration_qmaster.cc

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,8 @@
3535
#include <cerrno>
3636
#include <cstring>
3737
#include <cstdlib>
38+
#include <string>
39+
#include <vector>
3840
#include <unistd.h>
3941
#include <pwd.h>
4042

@@ -480,6 +482,27 @@ check_config(lList **alpp, lListElem *conf) {
480482
answer_list_add(alpp, SGE_EVENT, STATUS_EEXIST, ANSWER_QUALITY_ERROR);
481483
DRETURN(STATUS_EEXIST);
482484
}
485+
} else if (!strcmp(name, "reporting_params")) {
486+
// Validate parameters whose grammar we can fail-fast on. Today this
487+
// covers only online_usage; other reporting_params tokens are
488+
// sanity-checked at merge time in sge_conf.cc::merge_configuration().
489+
struct saved_vars_s *context = nullptr;
490+
bool rp_valid = true;
491+
for (const char *s = sge_strtok_r(value, PARAMS_DELIMITER, &context);
492+
rp_valid && s != nullptr;
493+
s = sge_strtok_r(nullptr, PARAMS_DELIMITER, &context)) {
494+
std::string online_usage_str;
495+
if (parse_string_param(s, "online_usage", online_usage_str)) {
496+
std::vector<std::string> discard;
497+
if (!parse_online_usage_value(alpp, online_usage_str.c_str(), discard)) {
498+
rp_valid = false;
499+
}
500+
}
501+
}
502+
sge_free_saved_vars(context);
503+
if (!rp_valid) {
504+
DRETURN(STATUS_EEXIST);
505+
}
483506
} else if (!strcmp(name, "admin_user")) {
484507
struct passwd pw_struct;
485508
char *buffer;

source/daemons/qmaster/job_report_qmaster.cc

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@
2727
*
2828
* All Rights Reserved.
2929
*
30-
* Portions of this software are Copyright (c) 2023-2025 HPC-Gridware GmbH
30+
* Portions of this software are Copyright (c) 2023-2026 HPC-Gridware GmbH
3131
*
3232
************************************************************************/
3333
/*___INFO__MARK_END__*/
@@ -43,6 +43,7 @@
4343
#include "sgeobj/ocs_DataStore.h"
4444
#include "sgeobj/sge_answer.h"
4545
#include "sgeobj/sge_ja_task.h"
46+
#include "sgeobj/sge_pe.h"
4647
#include "sgeobj/sge_pe_task.h"
4748
#include "sgeobj/sge_usage.h"
4849
#include "sgeobj/sge_host.h"
@@ -383,6 +384,32 @@ void process_job_report(lListElem *report, lListElem *hep, char *rhost, char *co
383384
}
384385
}
385386

387+
/*
388+
* online_usage record: emit a JSONL snapshot of the scaled
389+
* usage on every job report from execd, gated by the
390+
* reporting_params=online_usage list. Honour the PE's
391+
* accounting_summary flag with the same convention as
392+
* sge_write_rusage / create_acct_records:
393+
* - pe_task JR + accounting_summary TRUE -> skip
394+
* (aggregated record will be written on the master JR)
395+
* - pe_task JR + accounting_summary FALSE -> per-pe_task
396+
* - master JR -> ja_task,
397+
* summing pe_task usage when accounting_summary TRUE
398+
*/
399+
if (ocs::ReportingFileWriter::is_online_usage_required()) {
400+
const bool do_accounting_summary =
401+
pe_do_accounting_summary(lGetObject(jatep, JAT_pe_object));
402+
if (pe_task_id_str != nullptr) {
403+
if (petask != nullptr && !do_accounting_summary) {
404+
ocs::ReportingFileWriter::create_online_usage_records(nullptr, jr, jep, jatep,
405+
petask, false);
406+
}
407+
} else {
408+
ocs::ReportingFileWriter::create_online_usage_records(nullptr, jr, jep, jatep,
409+
nullptr, do_accounting_summary);
410+
}
411+
}
412+
386413
/*
387414
* once a day write an intermediate usage record to the
388415
* reporting file to have correct daily usage reporting with

source/daemons/qmaster/ocs_JsonReportingFileWriter.cc

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232
#include "sgeobj/sge_qinstance.h"
3333
#include "sgeobj/sge_qinstance_state.h"
3434
#include "sgeobj/sge_str.h"
35+
#include "sgeobj/sge_usage.h"
3536

3637
#include "uti/sge_lock.h"
3738
#include "uti/sge_log.h"
@@ -76,6 +77,128 @@ namespace ocs {
7677
DRETURN(ret);
7778
}
7879

80+
/**
81+
* Emit a single JSONL online_usage record for the running job.
82+
*
83+
* When pe_task is non-null the record carries that pe_task's scaled
84+
* usage and a pe_taskid field. When pe_task is null the record carries
85+
* the ja_task's scaled usage; if aggregate_pe_tasks is set, scaled
86+
* usage of every pe_task in `ja_task->JAT_task_list` is summed into the
87+
* ja_task value (matching the convention used by sge_write_rusage when
88+
* the PE has `accounting_summary` set). Only the variables listed in
89+
* `reporting_params=online_usage=...` are emitted under `usage`;
90+
* variables missing from the scaled usage list are silently skipped
91+
* (no `null` field).
92+
*
93+
* If none of the configured variables have any data yet (e.g. right
94+
* after job start, before the execd has produced its first usage
95+
* report), no record is emitted at all — empty-`usage` records would
96+
* clutter the reporting file without adding information.
97+
*
98+
* @see ReportingFileWriter::create_online_usage_records()
99+
*/
100+
bool
101+
JsonReportingFileWriter::create_online_usage_record(lList **answer_list, lListElem *job_report,
102+
lListElem *job, lListElem *ja_task, lListElem *pe_task,
103+
bool aggregate_pe_tasks) {
104+
DENTER(TOP_LAYER);
105+
106+
if (job == nullptr || ja_task == nullptr) {
107+
DRETURN(true);
108+
}
109+
110+
// take a snapshot of the configured variable names
111+
std::vector<std::string> vars;
112+
sge_mutex_lock(config_mutex_name.c_str(), __func__, __LINE__, &config_mutex);
113+
vars = online_usage_vars;
114+
sge_mutex_unlock(config_mutex_name.c_str(), __func__, __LINE__, &config_mutex);
115+
116+
if (vars.empty()) {
117+
DRETURN(true);
118+
}
119+
120+
// pe_task != nullptr: report scaled usage of the pe_task,
121+
// otherwise the (possibly aggregated) scaled usage on the ja_task
122+
const lList *primary_usage = (pe_task != nullptr)
123+
? lGetList(pe_task, PET_scaled_usage)
124+
: lGetList(ja_task, JAT_scaled_usage_list);
125+
126+
// aggregation only applies to ja_task-level records
127+
const bool aggregate = (pe_task == nullptr) && aggregate_pe_tasks;
128+
129+
// Suppress empty-usage records: if none of the configured variables
130+
// have data yet, do not emit anything.
131+
bool any_data = false;
132+
for (const auto &name : vars) {
133+
if (lGetElemStr(primary_usage, UA_name, name.c_str()) != nullptr) {
134+
any_data = true;
135+
break;
136+
}
137+
if (aggregate) {
138+
const lListElem *pe_task_iter;
139+
for_each_ep(pe_task_iter, lGetList(ja_task, JAT_task_list)) {
140+
if (lGetSubStr(pe_task_iter, UA_name, name.c_str(), PET_scaled_usage) != nullptr) {
141+
any_data = true;
142+
break;
143+
}
144+
}
145+
if (any_data) {
146+
break;
147+
}
148+
}
149+
}
150+
if (!any_data) {
151+
DRETURN(true);
152+
}
153+
154+
rapidjson::StringBuffer stringBuffer;
155+
rapidjson::Writer<rapidjson::StringBuffer> writer(stringBuffer);
156+
157+
writer.StartObject();
158+
write_json(writer, "time", sge_get_gmt64());
159+
write_json(writer, "type", "online_usage");
160+
write_json(writer, "job_number", lGetUlong(job, JB_job_number));
161+
int task_number = job_is_array(job) ? (int) lGetUlong(ja_task, JAT_task_number) : 0;
162+
write_json(writer, "task_number", task_number);
163+
if (pe_task != nullptr) {
164+
write_json(writer, "pe_taskid", lGetString(pe_task, PET_id));
165+
}
166+
167+
writer.Key("usage");
168+
writer.StartObject();
169+
for (const auto &name : vars) {
170+
double total = 0.0;
171+
bool found_any = false;
172+
173+
const lListElem *u = lGetElemStr(primary_usage, UA_name, name.c_str());
174+
if (u != nullptr) {
175+
total = lGetDouble(u, UA_value);
176+
found_any = true;
177+
}
178+
179+
if (aggregate) {
180+
const lListElem *pe_task_iter;
181+
for_each_ep(pe_task_iter, lGetList(ja_task, JAT_task_list)) {
182+
const lListElem *pu = lGetSubStr(pe_task_iter, UA_name, name.c_str(), PET_scaled_usage);
183+
if (pu != nullptr) {
184+
total += lGetDouble(pu, UA_value);
185+
found_any = true;
186+
}
187+
}
188+
}
189+
190+
if (found_any) {
191+
write_json(writer, name.c_str(), total);
192+
}
193+
}
194+
writer.EndObject();
195+
196+
writer.EndObject();
197+
create_record(stringBuffer);
198+
199+
DRETURN(true);
200+
}
201+
79202
void
80203
JsonReportingFileWriter::create_record(rapidjson::StringBuffer &stringBuffer) {
81204
stringBuffer.Put('\n');

source/daemons/qmaster/ocs_JsonReportingFileWriter.h

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
/*___INFO__MARK_BEGIN_NEW__*/
33
/***************************************************************************
44
*
5-
* Copyright 2024 HPC-Gridware GmbH
5+
* Copyright 2024,2026 HPC-Gridware GmbH
66
*
77
* Licensed under the Apache License, Version 2.0 (the "License");
88
* you may not use this file except in compliance with the License.
@@ -48,6 +48,10 @@ namespace ocs {
4848
create_acct_record(lList **answer_list, lListElem *job_report, lListElem *job,
4949
lListElem *ja_task, bool intermediate) override;
5050

51+
bool
52+
create_online_usage_record(lList **answer_list, lListElem *job_report, lListElem *job,
53+
lListElem *ja_task, lListElem *pe_task, bool aggregate_pe_tasks) override;
54+
5155
bool
5256
create_host_record(lList **answer_list, const lListElem *host, u_long64 report_time) override;
5357

0 commit comments

Comments
 (0)