Skip to content

Generate single await-push command per buffer for all local chunks#324

Merged
PeterTh merged 2 commits intomasterfrom
generate-single-await-push
Jul 14, 2025
Merged

Generate single await-push command per buffer for all local chunks#324
PeterTh merged 2 commits intomasterfrom
generate-single-await-push

Conversation

@psalz
Copy link
Copy Markdown
Member

@psalz psalz commented Dec 20, 2024

This brings await-pushes in line with pushes, where we already compute the union of all regions required by remote chunks executed on the same node.

@github-actions
Copy link
Copy Markdown

Check-perf-impact results: (241536968c1afd627708e72c171c89f6)

🚀 Significant speedup (<0.80x) in some microbenchmark results: benchmark intrusive graph dependency handling with N nodes - 100 / creating nodes

Relative execution time per category: (mean of relative medians)

  • command-graph : 1.02x
  • graph-nodes : 0.97x
  • grid : 0.98x
  • instruction-graph : 1.00x
  • scheduler : 0.98x
  • system : 1.05x
  • task-graph : 1.02x

@coveralls
Copy link
Copy Markdown

coveralls commented Dec 20, 2024

Pull Request Test Coverage Report for Build 16268225707

Details

  • 20 of 20 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall first build on generate-single-await-push at 95.058%

Totals Coverage Status
Change from base Build 15412343516: 95.1%
Covered Lines: 7156
Relevant Lines: 7261

💛 - Coveralls

Copy link
Copy Markdown
Contributor

@GagaLP GagaLP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the small comment i made this looks good.

if(!box.empty() && !wcs.is_fresh()) { missing_parts_boxes.push_back(box); }
}

// There is data we don't yet have locally. Generate an await push command for it.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this comment now seems a bit out of place since we don't generate the await push command in this if.

Copy link
Copy Markdown
Contributor

@fknorr fknorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the correct thing to do!


void add(const box_vector<Dims>& boxes) & {
m_boxes.reserve(m_boxes.size() + boxes.size());
for(const auto& b : boxes) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have a comment here that we use a loop rather than insert(end) to skip over empty boxes, something which add(region) does not need to do.

[only if my comment below does not apply]

Comment on lines +409 to +408
auto& required_boxes = per_buffer_required_boxes[bid]; // allow default-insert
required_boxes.add(missing_parts_boxes);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this just be moved into the loop in L401? Then we don't need missing_parts_boxes (or region_builder::add(box_vector) for that matter)

@GagaLP GagaLP force-pushed the generate-single-await-push branch from 9a70a5b to 12fb858 Compare June 3, 2025 12:16
This brings await-pushes in line with pushes, where we already compute
the union of all regions required by remote chunks executed on the same
node.
@PeterTh PeterTh force-pushed the generate-single-await-push branch from 12fb858 to 0df4997 Compare July 14, 2025 12:55
@PeterTh
Copy link
Copy Markdown
Contributor

PeterTh commented Jul 14, 2025

(Fixup'ed Gabriel's comment improvements before expected merge)

@PeterTh PeterTh force-pushed the generate-single-await-push branch from 0df4997 to f9a8656 Compare July 14, 2025 13:29
@github-actions
Copy link
Copy Markdown

Check-perf-impact results: (ae6918621b46271c2f10d6eb978fe95d)

✔️ No significant performance change in the microbenchmark set. You are good to go!

Relative execution time per category: (mean of relative medians)

  • command-graph : 0.95x
  • graph-nodes : 0.98x
  • grid : 1.00x
  • instruction-graph : 1.02x
  • scheduler : 1.01x
  • system : 1.03x
  • task-graph : 0.98x

@PeterTh PeterTh merged commit 6b90ee3 into master Jul 14, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants