Skip to content

DM-52886: Update bps report function to aggregate panda task slices into task labels#97

Merged
zhaoyuyoung merged 3 commits into
mainfrom
tickets/DM-52886
Oct 24, 2025
Merged

DM-52886: Update bps report function to aggregate panda task slices into task labels#97
zhaoyuyoung merged 3 commits into
mainfrom
tickets/DM-52886

Conversation

@zhaoyuyoung
Copy link
Copy Markdown
Contributor

…abels

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 15, 2025

Codecov Report

❌ Patch coverage is 4.34783% with 110 lines in your changes missing coverage. Please review.
✅ Project coverage is 35.26%. Comparing base (110c3fb) to head (f0d77a3).
⚠️ Report is 4 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
python/lsst/ctrl/bps/panda/panda_service.py 0.00% 65 Missing ⚠️
python/lsst/ctrl/bps/panda/utils.py 10.00% 45 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #97      +/-   ##
==========================================
- Coverage   36.26%   35.26%   -1.01%     
==========================================
  Files          13       13              
  Lines        1249     1296      +47     
  Branches      204      212       +8     
==========================================
+ Hits          453      457       +4     
- Misses        776      819      +43     
  Partials       20       20              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zhaoyuyoung zhaoyuyoung force-pushed the tickets/DM-52886 branch 2 times, most recently from 664c5a3 to 7a6a3fb Compare October 16, 2025 18:24
@mxk62 mxk62 changed the title Update bps report function to aggregate panda task slices into task l… DM-52886: Update bps report function to aggregate panda task slices into task labels Oct 19, 2025
Comment thread doc/changes/DM-52866.feature.rst Outdated
@@ -0,0 +1 @@
Update bps report function to aggregate panda task slices into task labels
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nitpick, could you please enclose bps report in double-back quotes (``bps report``).

aggregated_run: `str`
A semicolon-separated string with aggregated job counts
by base label.
"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rewrite the Returns section as follow:

    Returns
    -------
    aggregated_jobs : `dict` [`str`, `dict` [`str`, `int`]]
        A dictionary mapping each base label to the summed job state counts
        across all matching labels.
    aggregated_exits : `dict` [`str`, `list` [`int`]]
        A dictionary mapping each base label to a combined list of exit codes
        from all matching labels.
    aggregated_run : `str`
        A semicolon-separated string with aggregated job counts by base label.

(See an example in the LSST Dev Guide.)

Returns
-------
`str`:
The extracted task name as per the rules above.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rewrite the Returns section as something like:

taskname: `str`
    The extracted task name as per the rules above.

According to the LSST Dev Guide, the short-hand syntax that avoids naming a returned value will not render properly (see the warning in this section).

Comment thread python/lsst/ctrl/bps/panda/utils.py Outdated

Parameters
----------
s: `str`
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single space before and after the colon (:) is required in the parameter declaration (src). Please fix and elsewhere if needed.

def extract_taskname(s: str) -> str:
"""
Extract the task name from a string that follows a pattern
CampaignName_timestamp_TaskNumber_TaskLabel_ChunkNumber.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short summary should immediately follow the opening triple quotes, i.e.:

    """Extract the task name from a string that follows a pattern
    CampaignName_timestamp_TaskNumber_TaskLabel_ChunkNumber.

Please fix here and in aggregate_by_basename().

raise RuntimeError(
f"Error to get workflow status: {new_ret} for id: {wms_workflow_id}"
f"Error getting workflow status: {new_ret} for id: {wms_workflow_id}"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid code repetition, you may consider encapsulating the logic of querying the iDDS (lines 176-182 and 282-293) in a separate function.

if nfailed > 0 and len(exit_codes_all[tasklabel]) == 0:
_LOG.warning(
f"No exit codes in iDDS task info for workload {transform_workload_id}"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since DM-51261 was merged, ctrl_bps performs a similar check and prints out a warning if such a situation occurs. For this reason, we may consider using DEBUG level here to avoid printing two separate warnings concerning the same thing .

totaljobs = task.get("output_total_files", 0)
wms_report.total_number_jobs += totaljobs

taskstatus = {}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I missed something, it doesn't look like taskstatus is used in this for loop at all. If that's true, can we move this line closer to the place where taskstatus is actually used. For example, can we put it after the comment in line 313?

# Sort tasks by workload_id or fallback
try:
tasks.sort(key=lambda x: x["transform_workload_id"])
except Exception:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we narrow the exception clause to KeyError? Exception seems too broad in this context.

Comment thread python/lsst/ctrl/bps/panda/panda_service.py
@zhaoyuyoung zhaoyuyoung merged commit 6894bb1 into main Oct 24, 2025
16 of 18 checks passed
@zhaoyuyoung zhaoyuyoung deleted the tickets/DM-52886 branch October 24, 2025 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants