Week-1 pgr_maximumWeightedMatching #552
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
💤 Files with no reviewable changes (2)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughAdds maximum weighted matching: a C SRF bridge and public SQL wrapper, a Boost-based C++ matching implementation, a C driver and headers to expose results to Postgres, and wiring to call the C driver from the SRF. ChangesMaximum Weighted Matching
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@sql/matching/_maximumWeightedMatching.sql`:
- Around line 1-2: Add the project's standard license/copyright header at the
top of the new SQL file _maximumWeightedMatching.sql to satisfy repository
license checks; open _maximumWeightedMatching.sql and prepend the exact
multi-line header block used elsewhere in the repo (copy from another SQL file)
including year and copyright holder, then commit so CI recognizes the header.
In `@sql/matching/maximumWeightedMatching.sql`:
- Around line 1-2: Add the project's standard license/header to the top of
maximumWeightedMatching.sql: insert the required copyright line, license
identifier (e.g., SPDX or full license text) and any contributor or year
metadata used by other SQL files in the repo so the file passes the mandatory
header checks; place this header above the existing "--v4.1" line and match the
exact formatting used in other SQL files to satisfy the automated check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0979b4db-49e9-4e0f-a941-7ccbf4a90900
📒 Files selected for processing (2)
sql/matching/_maximumWeightedMatching.sqlsql/matching/maximumWeightedMatching.sql
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/max_flow/maximumWeighted_matching_driver.cpp`:
- Around line 54-63: The driver currently constructs an empty
std::vector<Basic_edge> edges and never reads the SQL input (edges_sql), so
maximumWeightedMatching_process(edges) always runs on an empty graph; fix by
executing/parsing the SQL result identified by edges_sql, iterate over each
returned row, convert each row into a Basic_edge (populate the edge fields used
by Basic_edge such as source/target/weight or the names used in your struct),
push_back into the edges vector, handle SQL errors and type conversions, then
call maximumWeightedMatching_process(edges) as before so the returned matching
reflects the queried data.
- Line 31: The include directive uses the wrong header name; replace the old
include "drivers/max_flow/maximumWeighted_matching_driver.h" with the correct
new header filename "drivers/max_flow/maximum_weighted_matching_driver.h" in the
file that contains the include (referenced symbol: the `#include` line in
maximumWeighted_matching_driver.cpp) so case-sensitive builds can find
maximum_weighted_matching_driver.h.
In `@src/max_flow/maximumWeightedMatching_process.cpp`:
- Around line 56-75: The code is using raw SQL vertex IDs as graph indices which
can create huge sparse graphs and misinterpret mate results; before creating the
Boost graph, collect all unique IDs from edges (edge.source and edge.target),
build a contiguous index mapping (e.g., unordered_map<OriginalId, size_t>
id_to_idx and vector<size_t> idx_to_id), then allocate vertices using
boost::add_vertex(graph) for each mapped index and add edges using
boost::add_edge(id_to_idx[edge.source], id_to_idx[edge.target], edge.cost,
graph); after running the matching, translate mate/index descriptors back to
original IDs via idx_to_id when returning mates.
In `@src/max_flow/maximumWeightedMatching.c`:
- Line 50: Remove the redundant leading blank lines that cpplint flags at the
start of blocks in this file: delete the empty lines that precede block starts
inside the maximumWeightedMatching function and its helper functions (e.g., any
blank line before an if/for/while/return block or before helper function blocks
such as augmenting path helpers), so each block opens immediately without an
extra blank line; run cpplint locally to confirm the whitespace/blank_line
warnings are resolved.
- Line 39: The include at the top of maximumWeightedMatching.c is referencing
the old header name "maximumWeighted_matching_driver.h"; update the `#include` to
use the new header filename "maximum_weighted_matching_driver.h" so the
preprocessor pulls the correct header (look for the existing include statement
in maximumWeightedMatching.c and replace the old header name with
maximum_weighted_matching_driver.h).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f1c61e47-9c74-4522-bb7e-5300ca86bc32
📒 Files selected for processing (5)
include/drivers/max_flow/maximum_weighted_matching_driver.hinclude/max_flow/maximumWeightedMatching.hppsrc/max_flow/maximumWeightedMatching.csrc/max_flow/maximumWeightedMatching_process.cppsrc/max_flow/maximumWeighted_matching_driver.cpp
✅ Files skipped from review due to trivial changes (1)
- include/max_flow/maximumWeightedMatching.hpp
| std::vector<Basic_edge> edges; | ||
|
|
||
| /* | ||
| * TODO: | ||
| * Fetch edges from SQL query | ||
| */ | ||
|
|
||
| std::vector<int64_t> result = | ||
| maximumWeightedMatching_process(edges); | ||
|
|
There was a problem hiding this comment.
Driver currently ignores SQL input and always returns an empty matching.
edges_sql is never parsed/fetched, so edges stays empty and the function returns success with no rows. This breaks the feature contract.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/max_flow/maximumWeighted_matching_driver.cpp` around lines 54 - 63, The
driver currently constructs an empty std::vector<Basic_edge> edges and never
reads the SQL input (edges_sql), so maximumWeightedMatching_process(edges)
always runs on an empty graph; fix by executing/parsing the SQL result
identified by edges_sql, iterate over each returned row, convert each row into a
Basic_edge (populate the edge fields used by Basic_edge such as
source/target/weight or the names used in your struct), push_back into the edges
vector, handle SQL errors and type conversions, then call
maximumWeightedMatching_process(edges) as before so the returned matching
reflects the queried data.
| size_t max_vertex = 0; | ||
|
|
||
| for (const auto &edge : edges) { | ||
| max_vertex = std::max( | ||
| max_vertex, | ||
| static_cast<size_t>( | ||
| std::max(edge.source, edge.target))); | ||
| } | ||
|
|
||
| for (size_t i = 0; i <= max_vertex; ++i) { | ||
| boost::add_vertex(graph); | ||
| } | ||
|
|
||
| for (const auto &edge : edges) { | ||
|
|
||
| boost::add_edge( | ||
| edge.source, | ||
| edge.target, | ||
| edge.cost, | ||
| graph); |
There was a problem hiding this comment.
Map vertex IDs to contiguous indices before building the vecS graph.
This code treats SQL vertex IDs as direct graph indices. With sparse/large IDs, Line 65 can allocate an enormous graph, and returned mates are index descriptors rather than guaranteed original IDs.
Suggested direction
+// Build bijection: original vid <-> contiguous index [0..n)
+std::unordered_map<int64_t, size_t> vid_to_idx;
+std::vector<int64_t> idx_to_vid;
+auto ensure_idx = [&](int64_t vid) -> size_t {
+ auto it = vid_to_idx.find(vid);
+ if (it != vid_to_idx.end()) return it->second;
+ size_t idx = idx_to_vid.size();
+ vid_to_idx.emplace(vid, idx);
+ idx_to_vid.push_back(vid);
+ boost::add_vertex(graph);
+ return idx;
+};
+
+for (const auto &edge : edges) {
+ auto u = ensure_idx(edge.source);
+ auto v = ensure_idx(edge.target);
+ boost::add_edge(u, v, edge.cost, graph);
+}
...
- result.push_back(static_cast<int64_t>(i));
- result.push_back(static_cast<int64_t>(mate[i]));
+ result.push_back(idx_to_vid[i]);
+ result.push_back(idx_to_vid[mate[i]]);Also applies to: 88-99
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/max_flow/maximumWeightedMatching_process.cpp` around lines 56 - 75, The
code is using raw SQL vertex IDs as graph indices which can create huge sparse
graphs and misinterpret mate results; before creating the Boost graph, collect
all unique IDs from edges (edge.source and edge.target), build a contiguous
index mapping (e.g., unordered_map<OriginalId, size_t> id_to_idx and
vector<size_t> idx_to_id), then allocate vertices using boost::add_vertex(graph)
for each mapped index and add edges using
boost::add_edge(id_to_idx[edge.source], id_to_idx[edge.target], edge.cost,
graph); after running the matching, translate mate/index descriptors back to
original IDs via idx_to_id when returning mates.
|
Actionable comments posted: 0 |
Summary by CodeRabbit