Commit e1372d3

[CFG] Dump CFG to /tmp on assert_structural_invariants failure
When a structural-invariant check fires, the assert message now points at a fresh file under <tmp>/cfg_invariant_failures/structural_<ns>.txt containing the CFG state at the moment of violation -- node-by-node ranges, prev/next edges, intra-node next/prev links, the offending nodes' statements. Designed for post-mortem on the kind of failure that shows up after a pre-CFG IR transform (e.g. an unstructured-control-flow normaliser) produces a CFG that build_cfg cannot canonicalise: instead of "assert fired, now bisect from scratch" the user has a viewable graph at the exact failure point.

The dumper is deliberately tolerant of malformed state. It is invoked only after a violation has been detected so the graph is by construction not internally consistent: null entries print as <ERASED>, per-node dump calls are individually try/catch wrapped so a single bad pointer doesn't take down the whole dump, and the outer try/catch handles filesystem failures. Returns the path on success or empty if the dump itself failed; the assert message picks "<dump failed>" in that case.

Also fold the previously-one-per-violation QD_ASSERT_INFO calls into a single comprehensive message at the end -- the endpoint phase now collects all violations before bailing, so the user sees every broken endpoint at once instead of fixing them serially. Edge-consistency phase still short-circuits on first violation (subsequent edges may share corrupted pointers).
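
As a rough illustration of the collect-then-report pattern the message describes for the endpoint phase, here is a minimal standalone sketch. `ToyNode`, `ToyGraph`, `collect_endpoint_errors` and `join_errors` are hypothetical stand-ins, not the real `CFGNode` / `ControlFlowGraph` API, and plain string concatenation is used in place of `fmt::format` to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real CFG types (assumption, not the project API).
struct ToyNode {};

struct ToyGraph {
  std::vector<std::unique_ptr<ToyNode>> nodes;
  int start_node = 0;
  int final_node = 0;
};

// Formats one "index out of range" message.
std::string range_error(const std::string &name, int index, int size) {
  return name + " " + std::to_string(index) + " out of range [0, " + std::to_string(size) + ")";
}

// Collects every endpoint violation instead of stopping at the first one.
std::vector<std::string> collect_endpoint_errors(const ToyGraph &g) {
  std::vector<std::string> errors;
  const int n = static_cast<int>(g.nodes.size());
  if (g.nodes.empty()) {
    errors.push_back("nodes is empty");
  }
  if (g.start_node < 0 || g.start_node >= n) {
    errors.push_back(range_error("start_node", g.start_node, n));
  }
  if (g.final_node < 0 || g.final_node >= n) {
    errors.push_back(range_error("final_node", g.final_node, n));
  }
  // Only index into `nodes` once both indices are known to be in range.
  if (errors.empty()) {
    if (!g.nodes[g.start_node]) {
      errors.push_back("start_node entry (index " + std::to_string(g.start_node) + ") is null");
    }
    if (!g.nodes[g.final_node]) {
      errors.push_back("final_node entry (index " + std::to_string(g.final_node) + ") is null");
    }
  }
  return errors;
}

// Joins the collected messages into one bulleted block for a single assert message.
std::string join_errors(const std::vector<std::string> &errors) {
  std::string joined;
  for (const auto &e : errors) {
    joined += "\n - ";
    joined += e;
  }
  return joined;
}
```

The null checks are nested under `errors.empty()` so that the sketch never indexes `nodes` with an out-of-range value, which mirrors the guard in the diff below.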
1 parent 53d44ad commit e1372d3

2 files changed

Lines changed: 129 additions & 26 deletions

quadrants/ir/control_flow_graph.cpp

Lines changed: 117 additions & 25 deletions
@@ -1,5 +1,6 @@
 #include "quadrants/ir/control_flow_graph.h"
 
+#include <chrono>
 #include <queue>
 #include <unordered_set>
 #include <fstream>
@@ -1034,45 +1035,88 @@ void ControlFlowGraph::assert_structural_invariants() const {
   // worklist seeding `nodes[start_node]->reach_gen.insert(...)`, the DP scratch buffer indexed by
   // start_node, the dump-graph loop that skips start_node / final_node by index). If they ever
   // fail, the code following will segfault or silently corrupt -- catch it here instead.
+  //
+  // Collect every violation we can detect cheaply, then dump + assert in one go. The dump goes
+  // to <tmp>/cfg_invariant_failures/structural_<ns>.txt and its path is included in the assert
+  // message so a post-mortem can pull up the actual CFG state that violated.
 
-  // (1) Endpoint invariants -- O(1).
-  QD_ASSERT_INFO(!nodes.empty(), "ControlFlowGraph has no nodes");
-  QD_ASSERT_INFO(start_node >= 0 && start_node < (int)nodes.size(), "start_node out of range");
-  QD_ASSERT_INFO(final_node >= 0 && final_node < (int)nodes.size(), "final_node out of range");
-  QD_ASSERT_INFO(nodes[start_node] != nullptr, "start_node entry is null");
-  QD_ASSERT_INFO(nodes[final_node] != nullptr, "final_node entry is null");
+  std::vector<std::string> errors;
+
+  // (1) Endpoint invariants -- O(1). Cheap; collect all violations before bailing.
+  if (nodes.empty()) {
+    errors.emplace_back("nodes is empty");
+  }
+  if (start_node < 0 || start_node >= (int)nodes.size()) {
+    errors.emplace_back(fmt::format("start_node {} out of range [0, {})", start_node, nodes.size()));
+  }
+  if (final_node < 0 || final_node >= (int)nodes.size()) {
+    errors.emplace_back(fmt::format("final_node {} out of range [0, {})", final_node, nodes.size()));
+  }
+  if (errors.empty()) {
+    if (nodes[start_node] == nullptr) {
+      errors.emplace_back(fmt::format("start_node entry (index {}) is null", start_node));
+    }
+    if (nodes[final_node] == nullptr) {
+      errors.emplace_back(fmt::format("final_node entry (index {}) is null", final_node));
+    }
+  }
 
   // (2) Edge consistency -- O(V+E). For each forward edge n->next[k] == m, we require the
   // corresponding back edge m->prev to contain n, and vice versa. This catches the dangling-
   // pointer / asymmetric-edge corruption that surfaces when a pre-CFG pass (e.g. an unstructured
-  // control-flow normaliser like structure_continues) produces malformed IR -- precisely the
-  // failure mode that currently shows up as a segfault deep in worklist propagation rather than
-  // at the boundary. Null entries are tolerated: `erase()` clears entries before `simplify_graph`
-  // compacts, so a non-compact `nodes` vector with embedded nulls is a legal intermediate state.
-  for (std::size_t i = 0; i < nodes.size(); ++i) {
+  // control-flow normaliser) produces malformed IR -- precisely the failure mode that surfaces
+  // as a segfault deep in worklist propagation rather than at the boundary. Null entries are
+  // tolerated: `erase()` clears entries before `simplify_graph` compacts, so a non-compact
+  // `nodes` vector with embedded nulls is a legal intermediate state. We short-circuit on the
+  // first edge violation: subsequent edges may share corrupted pointers and dereferencing them
+  // could itself segfault.
+  for (std::size_t i = 0; i < nodes.size() && errors.empty(); ++i) {
     if (!nodes[i]) {
       continue;
     }
     CFGNode *n = nodes[i].get();
     for (CFGNode *m : n->next) {
-      QD_ASSERT_INFO(m != nullptr, "CFG node {} has a null entry in `next`", i);
-      const bool back_link =
-          std::find(m->prev.begin(), m->prev.end(), n) != m->prev.end();
-      QD_ASSERT_INFO(back_link,
-                     "CFG edge asymmetry: node {} -> next contains a successor whose `prev` does "
-                     "not list back to node {}",
-                     i, i);
+      if (m == nullptr) {
+        errors.emplace_back(fmt::format("node {} has a null entry in `next`", i));
+        break;
+      }
+      if (std::find(m->prev.begin(), m->prev.end(), n) == m->prev.end()) {
+        errors.emplace_back(
+            fmt::format("edge asymmetry: node {} has a successor whose `prev` does not "
+                        "list back to node {}",
+                        i, i));
+        break;
+      }
+    }
+    if (!errors.empty()) {
+      break;
     }
     for (CFGNode *m : n->prev) {
-      QD_ASSERT_INFO(m != nullptr, "CFG node {} has a null entry in `prev`", i);
-      const bool fwd_link =
-          std::find(m->next.begin(), m->next.end(), n) != m->next.end();
-      QD_ASSERT_INFO(fwd_link,
-                     "CFG edge asymmetry: node {} <- prev contains a predecessor whose `next` "
-                     "does not list forward to node {}",
-                     i, i);
+      if (m == nullptr) {
+        errors.emplace_back(fmt::format("node {} has a null entry in `prev`", i));
+        break;
+      }
+      if (std::find(m->next.begin(), m->next.end(), n) == m->next.end()) {
+        errors.emplace_back(
+            fmt::format("edge asymmetry: node {} has a predecessor whose `next` does not "
+                        "list forward to node {}",
+                        i, i));
+        break;
+      }
     }
   }
+
+  if (errors.empty()) {
+    return;
+  }
+  const std::filesystem::path dump_path = dump_invariant_failure_to_temp_path("structural");
+  std::string joined;
+  for (const auto &e : errors) {
+    joined += "\n - ";
+    joined += e;
+  }
+  QD_ASSERT_INFO(false, "CFG structural invariant failure(s):{}\nCFG state dumped to: {}", joined,
+                 dump_path.empty() ? std::string("<dump failed>") : dump_path.string());
 }
 
 void ControlFlowGraph::erase(int node_id) {
@@ -1182,6 +1226,54 @@ void write_cfg_node_statements(std::ostream &out, const CFGNode *node) {
 
 } // namespace
 
+std::filesystem::path ControlFlowGraph::dump_invariant_failure_to_temp_path(
+    const std::string &reason) const {
+  // Deliberately tolerant of malformed state (null entries, dangling edges, broken back-pointers):
+  // we are *only* called from `assert_structural_invariants` after a violation has been detected,
+  // so the graph state we are dumping is, by construction, not internally consistent. Any single
+  // stmt-write that throws is caught and replaced with a placeholder line so the rest of the dump
+  // still lands. The outer try/catch handles filesystem failures and similar.
+  try {
+    namespace fs = std::filesystem;
+    const fs::path dir = fs::temp_directory_path() / "cfg_invariant_failures";
+    fs::create_directories(dir);
+    const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
+                        std::chrono::system_clock::now().time_since_epoch())
+                        .count();
+    const fs::path filename = dir / fmt::format("{}_{}.txt", reason, ns);
+    std::ofstream out(filename);
+    if (!out) {
+      return {};
+    }
+    out << "# CFG invariant failure dump\n";
+    out << "# reason: " << reason << "\n";
+    out << fmt::format("# start_node: {}\n", start_node);
+    out << fmt::format("# final_node: {}\n", final_node);
+    out << fmt::format("# nodes.size(): {}\n\n", nodes.size());
+
+    std::unordered_map<CFGNode *, int> to_index;
+    to_index.reserve(nodes.size());
+    for (std::size_t i = 0; i < nodes.size(); ++i) {
+      to_index[nodes[i].get()] = static_cast<int>(i);
+    }
+    for (std::size_t i = 0; i < nodes.size(); ++i) {
+      if (!nodes[i]) {
+        out << "NODE " << i << ": <ERASED>\n\n";
+        continue;
+      }
+      try {
+        write_cfg_node_header(out, static_cast<int>(i), nodes[i].get(), to_index);
+        write_cfg_node_statements(out, nodes[i].get());
+      } catch (...) {
+        out << "NODE " << i << ": <DUMP FAILED -- pointer state likely corrupt>\n\n";
+      }
+    }
+    return filename;
+  } catch (...) {
+    return {};
+  }
+}
+
 void ControlFlowGraph::dump_graph_to_file(const CompileConfig &config,
                                           const std::string &kernel_name,
                                           const std::string &suffix) const {
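
The edge-consistency phase of `assert_structural_invariants` in this file can be read as a symmetry check that stops at the first problem it finds. The sketch below shows that idea in isolation, under stated assumptions: `EdgeNode` and `first_edge_violation` are hypothetical names, raw pointers are assumed to be either null or valid, and `std::optional` stands in for the `errors` vector used in the real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical node type with forward and backward edge lists (assumption).
struct EdgeNode {
  std::vector<EdgeNode *> next;
  std::vector<EdgeNode *> prev;
};

// Returns a description of the first asymmetric or null edge, or nullopt if
// every forward edge has a matching back edge and vice versa.
std::optional<std::string> first_edge_violation(
    const std::vector<std::unique_ptr<EdgeNode>> &nodes) {
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (!nodes[i]) {
      continue;  // erased slot: tolerated, as in the diff above
    }
    EdgeNode *n = nodes[i].get();
    const std::string id = std::to_string(i);
    for (EdgeNode *m : n->next) {
      if (m == nullptr) {
        return "node " + id + " has a null entry in `next`";
      }
      if (std::find(m->prev.begin(), m->prev.end(), n) == m->prev.end()) {
        return "edge asymmetry: successor of node " + id + " does not list it in `prev`";
      }
    }
    for (EdgeNode *m : n->prev) {
      if (m == nullptr) {
        return "node " + id + " has a null entry in `prev`";
      }
      if (std::find(m->next.begin(), m->next.end(), n) == m->next.end()) {
        return "edge asymmetry: predecessor of node " + id + " does not list it in `next`";
      }
    }
  }
  return std::nullopt;
}
```

Returning on the first violation follows the reasoning given in the commit message: once one edge is known to be bad, later edges may share the same corrupted pointers, so continuing the walk is not worth the risk.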

quadrants/ir/control_flow_graph.h

Lines changed: 12 additions & 1 deletion
@@ -1,5 +1,6 @@
 #pragma once
 
+#include <filesystem>
 #include <optional>
 #include <unordered_set>
 
@@ -195,9 +196,19 @@ class ControlFlowGraph {
   // cannot canonicalise (O(V+E)).
   // Called from each public driver. Always-on (release builds too); the cost is dominated by the
   // analyses that follow and the alternative on violation is silent corruption / a segfault deep
-  // in worklist processing.
+  // in worklist processing. On any violation, dumps the current CFG state to a temp file (see
+  // `dump_invariant_failure_to_temp_path`) and includes the path in the assertion message.
   void assert_structural_invariants() const;
 
+  // Best-effort dump of the current CFG state to a unique file under
+  // <tmp>/cfg_invariant_failures/<reason>_<ns>.txt. Called from `assert_structural_invariants`
+  // when a violation is about to fire, so the dump runs against possibly-corrupt state -- the
+  // dump is deliberately tolerant of nulls, dangling edges, and broken back-pointers (it
+  // prints placeholders and continues, rather than segfaulting). Catches and swallows all
+  // exceptions; the caller is about to abort. Returns the dump path on success, or an empty
+  // path if the dump itself failed.
+  std::filesystem::path dump_invariant_failure_to_temp_path(const std::string &reason) const;
+
  public:
   struct LiveVarAnalysisConfig {
     // This is mostly useful for SFG task-level dead store elimination. SFG may
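
The "returns the dump path on success, or an empty path if the dump itself failed" contract documented in the header can be sketched on its own as follows. `write_dump_to_temp` is a hypothetical helper, not part of the commit; it takes the dump body as a string instead of walking a CFG, and assumes C++17 for `std::filesystem`.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: writes `body` to <tmp>/<subdir>/<reason>_<ns>.txt.
// Returns the path on success, or an empty path on any failure.
std::filesystem::path write_dump_to_temp(const std::string &subdir,
                                         const std::string &reason,
                                         const std::string &body) {
  try {
    namespace fs = std::filesystem;
    const fs::path dir = fs::temp_directory_path() / subdir;
    fs::create_directories(dir);
    // Nanoseconds since the epoch, used as a (very likely) unique suffix.
    const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                        std::chrono::system_clock::now().time_since_epoch())
                        .count();
    const fs::path file = dir / (reason + "_" + std::to_string(ns) + ".txt");
    std::ofstream out(file);
    if (!out) {
      return {};  // could not open the file for writing
    }
    out << body;
    out.flush();
    if (!out) {
      return {};  // the write itself failed
    }
    return file;
  } catch (...) {
    // Filesystem errors (or anything else) are swallowed: the caller is
    // already on a failure path and only needs to know the dump is missing.
    return {};
  }
}
```

A caller can then pick the message text with `path.empty() ? "<dump failed>" : path.string()`, as the assert in the .cpp diff does. Note that a timestamp suffix is unique only in practice: the clock's actual resolution may be coarser than a nanosecond, so two dumps taken in the same tick would share a filename.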
