
Commit 6cdc22f

bjjwwang and claude committed
Actually use AbstractStateManager (drop local helper re-implementations)
The previous sync (deb5cb1) wired stateMgr into AbstractExecution but
only as a placeholder for AbsExtAPI. Reads/writes still went through a
separate postAbsTrace map and the GEP/load/size helpers were ported by
hand. That defeated the point: the C++ side held a stateMgr it never
actually called, and the Python side never even built one.

This commit makes both languages use the stateMgr as the authoritative
post-trace store and routes the GEP helpers through its API.

C++ Assignment-3:
- Drop the standalone Map<const ICFGNode*, AbstractState> postAbsTrace.
  Replace with `Map<...>& postAbsTrace() { return svfStateMgr->getTrace(); }`
  so existing call sites still read like `postAbsTrace()[node]`.
- getAbsStateFromTrace returns `(*svfStateMgr)[node]`.
- All call sites of `as.getByteOffset(gep)` (now removed upstream) and
  the intermediate `bufOverflowHelper.getByteOffset(as, gep)` shim go
  directly to `svfStateMgr->getGepByteOffset(gep)`.
- Drop the in-out trace sync that used to wrap
  `utils->handleExtAPI(...)`. AbsExtAPI now reads and writes through the
  same stateMgr that backs the post trace, so there is nothing to sync.
- Drop the 60-line `AbstractExecutionHelper::getByteOffset`
  re-implementation; it has no remaining caller.

Python Assignment-3:
- Construct `self.ander = pysvf.AndersenWaveDiff(svfir)` and
  `self.svf_state_mgr = pysvf.AbstractStateManager(svfir, self.ander)`
  in `AbstractExecution.__init__`.
- Alias `self.post_abs_trace = self.svf_state_mgr` so every existing
  `self.post_abs_trace[node]`, `node in self.post_abs_trace`, and
  `self.post_abs_trace[node] = state` call site continues to work --
  pysvf.AbstractStateManager's new __getitem__/__setitem__/__contains__
  bindings carry the dict protocol.
- Pass the stateMgr into AbstractExecutionHelper.__init__.
- AbstractExecutionHelper.getByteOffset/getGepObjAddrs/getAllocaInstByteSize
  shrink to one-liners that delegate to the stateMgr's upstream impl,
  matching the C++ shape `svfStateMgr->getGepByteOffset(gep)`.
  getPointeeElement keeps a local impl because the upstream signature
  takes (ObjVar, ICFGNode) while the existing call sites only have a
  NodeID for what is typically a ValVar pointer -- not 1:1 convertible.

Net asymmetry left: 1 local helper on the Python side
(getPointeeElement); everything else now flows through stateMgr on both
languages.

Locally: bin/ass3 builds at 100%; Python imports clean; helper methods
visibly delegate to self.svf_state_mgr.

Depends on bjjwwang/SVF-Python sync-llvm-21 head bb03d79 for the new
__setitem__/__contains__ on AbstractStateManager.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
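The dict-protocol alias described in the Python notes can be illustrated with a stand-in class. This is a minimal sketch, not the pysvf binding: `StateManagerStandIn` and its create-on-read behaviour are assumptions made for illustration only.

```python
class StateManagerStandIn:
    """Hypothetical stand-in for pysvf.AbstractStateManager (not the real binding)."""

    def __init__(self):
        self._trace = {}  # node -> abstract state (a plain dict here)

    def __getitem__(self, node):
        # Assumption: mirror C++ operator[] and create an empty state on first read.
        return self._trace.setdefault(node, {})

    def __setitem__(self, node, state):
        self._trace[node] = state

    def __contains__(self, node):
        # Membership must not create an entry as a side effect.
        return node in self._trace


mgr = StateManagerStandIn()
post_abs_trace = mgr  # alias, in the spirit of AbstractExecution.__init__

post_abs_trace["n1"] = {"x": (0, 10)}   # existing write call sites keep working
post_abs_trace["n1"]["y"] = (1, 1)      # existing read-then-mutate call sites too
seen_n1 = "n1" in post_abs_trace
seen_n2 = "n2" in post_abs_trace
```

Defining `__contains__` matters in this shape: without it (and without `__iter__`), Python's `in` operator falls back to calling `__getitem__` with integer indices 0, 1, 2, ..., which does not answer the membership question for node keys.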
1 parent deb5cb1 commit 6cdc22f

5 files changed

Lines changed: 54 additions & 150 deletions

File tree

Assignment-3/CPP/Assignment_3.cpp

Lines changed: 3 additions & 3 deletions
@@ -76,7 +76,7 @@ void AbstractExecution::bufOverflowDetection(const SVF::SVFStmt* stmt) {
         AbstractState& as = getAbsStateFromTrace(gep->getICFGNode());
         NodeID lhs = gep->getLHSVarID();
         NodeID rhs = gep->getRHSVarID();
-        updateGepObjOffsetFromBase(as, as[lhs].getAddrs(), as[rhs].getAddrs(), bufOverflowHelper.getByteOffset(as, gep));
+        updateGepObjOffsetFromBase(as, as[lhs].getAddrs(), as[rhs].getAddrs(), svfStateMgr->getGepByteOffset(gep));
 
         /// TODO: your code starts from here
 
@@ -264,9 +264,9 @@ void AbstractExecution::updateStateOnCall(const CallPE* callPE) {
         {
             NodeID curId = callPE->getOpVarID(i);
             const ICFGNode* opICFGNode = callPE->getOpCallICFGNode(i);
-            if (postAbsTrace.count(opICFGNode))
+            if (postAbsTrace().count(opICFGNode))
             {
-                AbstractState& opAs = postAbsTrace[opICFGNode];
+                AbstractState& opAs = postAbsTrace()[opICFGNode];
                 rhs.join_with(opAs[curId]);
             }
         }

Assignment-3/CPP/Assignment_3.h

Lines changed: 9 additions & 6 deletions
@@ -109,7 +109,7 @@ namespace SVF {
 
         /// Return its abstract state given an ICFGNode
         AbstractState& getAbsStateFromTrace(const ICFGNode* node) {
-            return postAbsTrace[node];
+            return (*svfStateMgr)[node];
         }
 
         /// Update the offset of a GEP (GetElementPtr) object from its base address
@@ -129,9 +129,10 @@ namespace SVF {
         /// SVFIR and ICFG
         SVFIR* svfir;
         ICFG* icfg;
-        /// Adapter that lets us reuse AbsExtAPI (which now requires an
-        /// AbstractStateManager) without giving up our own pre/postAbsTrace.
-        /// Trace is synced in/out around AbsExtAPI calls.
+        /// Owns the abstract trace immediately after an ICFGNode (post trace).
+        /// AbsExtAPI and the GEP/load/store helpers (getGepByteOffset etc.)
+        /// read and write through this manager; we don't keep a separate
+        /// postAbsTrace map any more.
         AbstractStateManager* svfStateMgr = nullptr;
 
         /// Map a function to its corresponding WTO
@@ -140,8 +141,10 @@ namespace SVF {
         Set<const FunObjVar*> recursiveFuns;
         /// Abstract trace immediately before an ICFGNode.
         Map<const ICFGNode*, AbstractState> preAbsTrace;
-        /// Abstract trace immediately after an ICFGNode.
-        Map<const ICFGNode*, AbstractState> postAbsTrace;
+        /// Convenience alias: the "post" trace lives inside svfStateMgr.
+        Map<const ICFGNode*, AbstractState>& postAbsTrace() {
+            return svfStateMgr->getTrace();
+        }
 
     private:
         AbstractExecutionHelper bufOverflowHelper;

Assignment-3/CPP/Assignment_3_Helper.cpp

Lines changed: 13 additions & 19 deletions
@@ -81,13 +81,13 @@ IntervalValue AbstractExecution::getAccessOffset(NodeID objId, const GepStmt* ge
     // Field-insensitive base object
     if (SVFUtil::isa<BaseObjVar>(obj)) {
         // get base size
-        IntervalValue accessOffset = bufOverflowHelper.getByteOffset(as, gep);
+        IntervalValue accessOffset = svfStateMgr->getGepByteOffset(gep);
         return accessOffset;
     }
     // A sub object of an aggregate object
     else if (SVFUtil::isa<GepObjVar>(obj)) {
         IntervalValue accessOffset =
-            bufOverflowHelper.getGepObjOffsetFromBase(SVFUtil::cast<GepObjVar>(obj)) + bufOverflowHelper.getByteOffset(as, gep);
+            bufOverflowHelper.getGepObjOffsetFromBase(SVFUtil::cast<GepObjVar>(obj)) + svfStateMgr->getGepByteOffset(gep);
         return accessOffset;
     }
     else{
@@ -221,12 +221,12 @@ bool AbstractExecution::mergeStatesFromPredecessors(const ICFGNode* block, Abstr
     // Iterate over all incoming edges of the given block
     for (auto& edge : block->getInEdges()) {
         // Check if the source node of the edge has a post-execution state recorded
-        if (postAbsTrace.find(edge->getSrcNode()) != postAbsTrace.end()) {
+        if (postAbsTrace().find(edge->getSrcNode()) != postAbsTrace().end()) {
             const IntraCFGEdge* intraCfgEdge = SVFUtil::dyn_cast<IntraCFGEdge>(edge);
 
             // If the edge is an intra-block edge and has a condition
             if (intraCfgEdge && intraCfgEdge->getCondition()) {
-                AbstractState tmpEs = postAbsTrace[edge->getSrcNode()];
+                AbstractState tmpEs = postAbsTrace()[edge->getSrcNode()];
                 // Check if the branch condition is feasible
                 if (isBranchFeasible(intraCfgEdge, tmpEs)) {
                     as.joinWith(tmpEs); // Merge the state with the current state
@@ -236,7 +236,7 @@ bool AbstractExecution::mergeStatesFromPredecessors(const ICFGNode* block, Abstr
             }
             else {
                 // For non-conditional edges, directly merge the state
-                as.joinWith(postAbsTrace[edge->getSrcNode()]);
+                as.joinWith(postAbsTrace()[edge->getSrcNode()]);
                 inEdgeNum++;
             }
         }
@@ -492,8 +492,8 @@ bool AbstractExecution::isBranchFeasible(const IntraCFGEdge* intraEdge, Abstract
 void AbstractExecution::handleGlobalNode() {
     AbstractState as;
     const ICFGNode* node = icfg->getGlobalICFGNode();
-    postAbsTrace[node] = preAbsTrace[node];
-    postAbsTrace[node][0] = AddressValue();
+    postAbsTrace()[node] = preAbsTrace[node];
+    postAbsTrace()[node][0] = AddressValue();
     // Global Node, we just need to handle addr, load, store, copy and gep
     for (const SVFStmt* stmt : node->getSVFStmts()) {
         updateAbsState(stmt);
@@ -585,8 +585,8 @@ bool AbstractExecution::handleICFGNode(const ICFGNode* node) {
     }
     preAbsTrace[node] = tmpEs;
     // Store the last abstract state, used to check if the abstract state has reached a fixpoint
-    AbstractState last_as = postAbsTrace[node];
-    postAbsTrace[node] = preAbsTrace[node];
+    AbstractState last_as = postAbsTrace()[node];
+    postAbsTrace()[node] = preAbsTrace[node];
     for (const SVFStmt* stmt : node->getSVFStmts()) {
         updateAbsState(stmt);
         bufOverflowDetection(stmt);
@@ -596,7 +596,7 @@ bool AbstractExecution::handleICFGNode(const ICFGNode* node) {
         handleCallSite(callNode);
     }
     // If the abstract state is the same as the last abstract state, return false because we have reached fixpoint
-    if (postAbsTrace[node] == last_as) {
+    if (postAbsTrace()[node] == last_as) {
         return false;
     }
     return true;
@@ -682,22 +682,16 @@ void AbstractExecution::handleCallSite(const CallICFGNode* callNode) {
     }
     else if (fun_name == "nd" || fun_name == "rand") {
         NodeID lhsId = callNode->getRetICFGNode()->getActualRet()->getId();
-        postAbsTrace[callNode][lhsId] = AbstractValue(IntervalValue::top());
+        postAbsTrace()[callNode][lhsId] = AbstractValue(IntervalValue::top());
     }
     else if (isExternalCallForAssignment(callee)) {
         // implement external calls for the assignment
         updateStateOnExtCall(callNode);
     }
     else if (SVFUtil::isExtCall(callee)) {
-        // handle external API calls — sync our trace into the stateMgr so
-        // AbsExtAPI sees the right state, then copy any updates back out.
-        for (const auto& kv : postAbsTrace) {
-            svfStateMgr->updateAbstractState(kv.first, kv.second);
-        }
+        // handle external API calls — AbsExtAPI reads/writes through the
+        // same svfStateMgr that backs postAbsTrace(), so no sync needed.
         utils->handleExtAPI(callNode);
-        for (const auto& kv : svfStateMgr->getTrace()) {
-            postAbsTrace[kv.first] = kv.second;
-        }
     }
     else if (recursiveFuns.find(callee) != recursiveFuns.end()) {
         // skip recursive functions
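The removed copy-in/copy-out loops and the single shared store can be compared in a small sketch. It uses plain dictionaries rather than SVF types, and `ext_api_write` is a hypothetical stand-in for an external-call handler such as the one behind `utils->handleExtAPI(callNode)`.

```python
def ext_api_write(store, node, var, value):
    """Hypothetical external-call handler: writes into whatever store it is given."""
    store.setdefault(node, {})[var] = value


# Old shape: the analysis keeps its own trace and syncs around the call.
post_trace = {"call": {"p": 1}}
mgr_trace = {}
for node, state in post_trace.items():      # sync in
    mgr_trace[node] = dict(state)
ext_api_write(mgr_trace, "call", "ret", 42)
for node, state in mgr_trace.items():       # sync out
    post_trace[node] = dict(state)

# New shape: one store backs both the analysis and the handler.
shared_trace = {"call": {"p": 1}}
ext_api_write(shared_trace, "call", "ret", 42)
```

Both shapes end with the same contents, but the first copies every recorded state twice per external call, while the second does no copying at all.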

Assignment-3/CPP/Assignment_3_Helper.h

Lines changed: 0 additions & 56 deletions
@@ -34,62 +34,6 @@
 namespace SVF {
 class AbstractExecutionHelper {
 public:
-    /// Compute the byte offset of a GepStmt against the given abstract state.
-    /// Replaces the upstream-removed `AbstractState::getByteOffset(GepStmt*)`.
-    /// Mirrors `AbstractStateManager::getGepByteOffset` but reads non-constant
-    /// indices directly from `as` (dense trace), since Assignment-3 manages
-    /// its own per-node trace separately from any AbstractStateManager.
-    IntervalValue getByteOffset(const AbstractState& as, const GepStmt* gep) {
-        if (gep->isConstantOffset())
-            return IntervalValue((s64_t)gep->accumulateConstantByteOffset());
-
-        IntervalValue res(0);
-        for (int i = gep->getOffsetVarAndGepTypePairVec().size() - 1; i >= 0; i--) {
-            const ValVar* idxOperandVar = gep->getOffsetVarAndGepTypePairVec()[i].first;
-            const SVFType* idxOperandType = gep->getOffsetVarAndGepTypePairVec()[i].second;
-
-            if (SVFUtil::isa<SVFArrayType>(idxOperandType) || SVFUtil::isa<SVFPointerType>(idxOperandType)) {
-                u32_t elemByteSize = 1;
-                if (const SVFArrayType* arrTy = SVFUtil::dyn_cast<SVFArrayType>(idxOperandType))
-                    elemByteSize = arrTy->getTypeOfElement()->getByteSize();
-                else if (SVFUtil::isa<SVFPointerType>(idxOperandType))
-                    elemByteSize = gep->getAccessPath().gepSrcPointeeType()->getByteSize();
-                else
-                    assert(false && "idxOperandType must be ArrType or PtrType");
-
-                if (const ConstIntValVar* op = SVFUtil::dyn_cast<ConstIntValVar>(idxOperandVar)) {
-                    s64_t lb = (double)Options::MaxFieldLimit() / elemByteSize >= op->getSExtValue()
-                                   ? op->getSExtValue() * elemByteSize
-                                   : Options::MaxFieldLimit();
-                    res = res + IntervalValue(lb, lb);
-                }
-                else {
-                    AbstractState& mut_as = const_cast<AbstractState&>(as);
-                    IntervalValue idxVal = mut_as[idxOperandVar->getId()].getInterval();
-                    if (idxVal.isBottom())
-                        res = res + IntervalValue(0, 0);
-                    else {
-                        s64_t ub = (idxVal.ub().getIntNumeral() < 0) ? 0
-                                   : (double)Options::MaxFieldLimit() / elemByteSize >= idxVal.ub().getIntNumeral()
-                                       ? elemByteSize * idxVal.ub().getIntNumeral()
-                                       : Options::MaxFieldLimit();
-                        s64_t lb = (idxVal.lb().getIntNumeral() < 0) ? 0
-                                   : (double)Options::MaxFieldLimit() / elemByteSize >= idxVal.lb().getIntNumeral()
-                                       ? elemByteSize * idxVal.lb().getIntNumeral()
-                                       : Options::MaxFieldLimit();
-                        res = res + IntervalValue(lb, ub);
-                    }
-                }
-            }
-            else if (const SVFStructType* structTy = SVFUtil::dyn_cast<SVFStructType>(idxOperandType)) {
-                res = res + IntervalValue(gep->getAccessPath().getStructFieldOffset(idxOperandVar, structTy));
-            }
-            else {
-                assert(false && "gep type pair only support arr/ptr/struct");
-            }
-        }
-        return res;
-    }
 
     /// Add a detected bug to the bug reporter and print the report
     ///@{
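For reference, the per-index arithmetic in the deleted helper's non-constant branch can be written as a small pure function. This is a sketch of that one branch only, on plain integers; the function names and the limit value 512 are illustrative assumptions, and the upstream `getGepByteOffset` may differ in detail.

```python
def scaled_bound(idx: int, elem_byte_size: int, max_field_limit: int) -> int:
    """Scale one index bound to a byte offset, clamping as the removed helper did."""
    if idx < 0:
        return 0  # negative bounds are clamped to offset 0
    if max_field_limit / elem_byte_size >= idx:
        return elem_byte_size * idx  # the scaled offset stays within the limit
    return max_field_limit  # otherwise saturate at the field limit


def index_byte_interval(lb: int, ub: int, elem_byte_size: int, max_field_limit: int):
    """Byte-offset interval contributed by one array/pointer index interval [lb, ub]."""
    return (
        scaled_bound(lb, elem_byte_size, max_field_limit),
        scaled_bound(ub, elem_byte_size, max_field_limit),
    )


LIMIT = 512  # illustrative value, standing in for Options::MaxFieldLimit()
small = index_byte_interval(-2, 5, 4, LIMIT)     # (0, 20)
large = index_byte_interval(0, 1000, 4, LIMIT)   # (0, 512)
```

The total offset of a GEP is then the interval sum of these contributions across all index operands, plus the struct-field offsets handled in the separate branch.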

Assignment-3/Python/Assignment_3_Helper.py

Lines changed: 29 additions & 66 deletions
@@ -204,7 +204,7 @@ class AbstractExecutionHelper:
     managing GEP object offsets, and other utilities.
     """
 
-    def __init__(self, svfir: pysvf.SVFIR):
+    def __init__(self, svfir: pysvf.SVFIR, svf_state_mgr: pysvf.AbstractStateManager = None):
         """
         Initialize member variables.
         """
@@ -216,6 +216,9 @@ def __init__(self, svfir: pysvf.SVFIR):
         # Map to store exception information for each ICFGNode
         self.node_to_bug_info = {}
         self.svfir = svfir
+        # Optional: if a stateMgr is provided, getByteOffset delegates to its
+        # getGepByteOffset (the C++ side does the same via svfStateMgr->...).
+        self.svf_state_mgr = svf_state_mgr
 
     # ------------------------------------------------------------------
     # Helpers that used to live as instance methods on `pysvf.AbstractState`.
@@ -225,54 +228,19 @@ def __init__(self, svfir: pysvf.SVFIR):
     # Python side mirrors the C++ side (`AbstractExecutionHelper::getByteOffset`).
     # ------------------------------------------------------------------
     def getByteOffset(self, abstract_state: pysvf.AbstractState, gep: pysvf.GepStmt) -> pysvf.IntervalValue:
-        if gep.isConstantOffset():
-            return pysvf.IntervalValue(gep.getConstantByteOffset())
-        max_field_limit = pysvf.Options.max_field_limit()
-        res = pysvf.IntervalValue(0)
-        pairs = gep.getOffsetVarAndGepTypePairVec()
-        for i in reversed(range(len(pairs))):
-            idx_var, idx_type = pairs[i]
-            if idx_type.isArrayType() or idx_type.isPointerType():
-                if idx_type.isArrayType():
-                    elem_byte_size = idx_type.asArrayType().getTypeOfElement().getByteSize()
-                else:
-                    elem_byte_size = gep.getSrcPointeeType().getByteSize()
-                if isinstance(idx_var, pysvf.ConstIntValVar):
-                    val = idx_var.getSExtValue()
-                    lb = val * elem_byte_size if (max_field_limit / elem_byte_size) >= val else max_field_limit
-                    res = res + pysvf.IntervalValue(lb, lb)
-                else:
-                    idx_val = abstract_state[idx_var.getId()].getInterval()
-                    if idx_val.isBottom():
-                        res = res + pysvf.IntervalValue(0, 0)
-                    else:
-                        ub_int = idx_val.ub().getNumeral()
-                        lb_int = idx_val.lb().getNumeral()
-                        ub = 0 if ub_int < 0 else (
-                            elem_byte_size * ub_int if (max_field_limit / elem_byte_size) >= ub_int
-                            else max_field_limit)
-                        lb = 0 if lb_int < 0 else (
-                            elem_byte_size * lb_int if (max_field_limit / elem_byte_size) >= lb_int
-                            else max_field_limit)
-                        res = res + pysvf.IntervalValue(lb, ub)
-            elif idx_type.isStructType():
-                res = res + pysvf.IntervalValue(gep.getStructFieldOffset(idx_var, idx_type.asStructType()))
-            else:
-                raise AssertionError("gep type pair only supports arr/ptr/struct")
-        return res
+        # Delegates to the stateMgr's upstream impl, mirroring the C++ side
+        # `svfStateMgr->getGepByteOffset(gep)`. The `abstract_state` argument
+        # is kept in the signature for symmetry with the call-site shape but
+        # is not consulted here -- the mgr reads non-constant indices from
+        # its own trace, which is the same trace this helper writes to.
+        return self.svf_state_mgr.getGepByteOffset(gep)
 
     def getGepObjAddrs(self, abstract_state: pysvf.AbstractState, var_id: int, offset: pysvf.IntervalValue) -> pysvf.AddressValue:
-        gep_addrs = pysvf.AddressValue()
-        max_field_limit = pysvf.Options.max_field_limit()
-        lb = min(offset.lb().getNumeral(), max_field_limit)
-        ub = min(offset.ub().getNumeral(), max_field_limit)
-        addrs = abstract_state[var_id].getAddrs()
-        for i in range(lb, ub + 1):
-            for addr in addrs:
-                base_obj = abstract_state.getIDFromAddr(addr)
-                gep_obj = self.svfir.getGepObjVar(base_obj, i)
-                gep_addrs.insert(pysvf.AbstractState.getVirtualMemAddress(gep_obj))
-        return gep_addrs
+        # Delegates to the stateMgr's upstream impl. mgr.getGepObjAddrs takes
+        # a ValVar* (and infers the ICFGNode from it), so we look the var up
+        # by id. Matches the C++ side `svfStateMgr->getGepObjAddrs(...)`.
+        pointer = self.svfir.getGNode(var_id)
+        return self.svf_state_mgr.getGepObjAddrs(pointer, offset)
 
     def getPointeeElement(self, abstract_state: pysvf.AbstractState, var_id: int):
         ptr_val = abstract_state[var_id]
@@ -286,23 +254,10 @@ def getPointeeElement(self, abstract_state: pysvf.AbstractState, var_id: int):
         return None
 
     def getAllocaInstByteSize(self, abstract_state: pysvf.AbstractState, addr: pysvf.AddrStmt) -> int:
-        rhs = addr.getRHSVar()
-        if not isinstance(rhs, pysvf.ObjVar):
-            raise AssertionError("Addr rhs value is not ObjVar")
-        base = self.svfir.getBaseObject(rhs.getId())
-        if base.isConstantByteSize():
-            return base.getByteSizeOfObj()
-        max_field_limit = pysvf.Options.max_field_limit()
-        sizes = addr.getArrSize()
-        res = 1
-        for value in sizes:
-            sz_val = abstract_state[value.getId()].getInterval()
-            if sz_val.isBottom():
-                ub = max_field_limit
-            else:
-                ub = sz_val.ub().getNumeral()
-            res = res * ub if res * ub <= max_field_limit else max_field_limit
-        return int(res)
+        # Delegates to the stateMgr's upstream impl. mgr.getAllocaInstByteSize
+        # takes the AddrStmt directly (it derives node + sizes itself). Matches
+        # the C++ side `svfStateMgr->getAllocaInstByteSize(addr)`.
+        return self.svf_state_mgr.getAllocaInstByteSize(addr)
 
     def reportBufOverflow(self, node, msg):
         """
@@ -487,8 +442,16 @@ def __init__(self, pag: pysvf.SVFIR):
         self.func_to_wto = {}
         self.recursive_funs = set()
         self.pre_abs_trace = {}
-        self.post_abs_trace = {}
-        self.buf_overflow_helper = AbstractExecutionHelper(self.svfir)
+        # Owns the post-trace and is the backing store for AbsExtAPI as well
+        # as the GEP/load/store helpers (getGepByteOffset etc.). Replaces
+        # the old `self.post_abs_trace` dict so reads/writes on
+        # `self.post_abs_trace[node]` go through the mgr's trace.
+        self.ander = pysvf.AndersenWaveDiff(self.svfir)
+        self.svf_state_mgr = pysvf.AbstractStateManager(self.svfir, self.ander)
+        # Alias preserved so existing call-sites `self.post_abs_trace[node]`
+        # keep working. The mgr supports __getitem__/__setitem__/__contains__.
+        self.post_abs_trace = self.svf_state_mgr
+        self.buf_overflow_helper = AbstractExecutionHelper(self.svfir, self.svf_state_mgr)
         self.assert_points = set()
         self.widen_delay = 3
         self.addressMask = 0x7f000000
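Because `svf_state_mgr` defaults to `None` in the helper's constructor while the delegating methods call it unconditionally, a caller that omits the manager would hit an `AttributeError` on first use. One possible guard is sketched below; `HelperSketch`, `_mgr`, and `FakeStateMgr` are hypothetical names, and the commit itself does not include such a check.

```python
class FakeStateMgr:
    """Hypothetical stand-in exposing only the one method used below."""

    def getGepByteOffset(self, gep):
        return ("offset-of", gep)


class HelperSketch:
    """Illustrative helper that delegates to an optional state manager."""

    def __init__(self, svf_state_mgr=None):
        self.svf_state_mgr = svf_state_mgr

    def _mgr(self):
        # Fail with a clear message instead of an AttributeError on None.
        if self.svf_state_mgr is None:
            raise RuntimeError("a state manager is required for this helper call")
        return self.svf_state_mgr

    def getByteOffset(self, abstract_state, gep):
        # abstract_state is unused here, as in the delegating version above.
        return self._mgr().getGepByteOffset(gep)


with_mgr = HelperSketch(FakeStateMgr()).getByteOffset(None, "gep0")
```

An alternative with the same effect would be to make the manager a required constructor argument, since every delegating method depends on it.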
