Commit deb5cb1

bjjwwang and claude committed
Sync Assignment-3 + CI with upstream LLVM 21.1.0 + Semi-Sparse refactor
Upstream commits 899d00a (Port SVF to LLVM 21) and 0aa951d (Semi-Sparse infrastructure) together broke Assignment-3:

1. AbstractState::getByteOffset(GepStmt*) was deleted (moved to AbstractStateManager::getGepByteOffset).
2. AbsExtAPI's constructor changed from AbsExtAPI(Map<const ICFGNode*, AbstractState>&) to AbsExtAPI(AbstractStateManager*).
3. The same set of methods was deleted on the Python pysvf binding.

CI / packaging:
- Dockerfile + build.yml: bump llvm_version 18.1.0 -> 21.1.0 to match the npm svf-lib package after SVF-npm sync-llvm-21 republishes.

Assignment-3 C++:
- Assignment_3_Helper.h: add AbstractExecutionHelper::getByteOffset (header-only). Body is a faithful port of the upstream AbstractStateManager::getGepByteOffset, reading non-constant indices from `as[idxVar.getId()]` instead of going through a stateMgr -- works because Assignment-3 keeps a dense per-node trace.
- Assignment_3.h / _Helper.cpp: own a lazily-constructed AbstractStateManager* svfStateMgr so AbsExtAPI(svfStateMgr) compiles. Around the single utils->handleExtAPI(callNode) site, sync postAbsTrace into the mgr and copy any updates back, since AbsExtAPI now reads abstract values exclusively through the mgr.
- Migrate the 3 stale call sites: as.getByteOffset(gep) -> bufOverflowHelper.getByteOffset(as, gep)

Assignment-3 Python (mirrors C++ shape):
- Assignment_3_Helper.py: add 4 helpers on AbstractExecutionHelper that port the upstream behavior: getByteOffset, getGepObjAddrs, getPointeeElement, getAllocaInstByteSize. Three of them need svfir (already a member of the helper); getByteOffset uses pysvf.Options.max_field_limit() and gep.getStructFieldOffset(...) which SVF-Python sync-llvm-21 newly exposes.
- Migrate 9 stale call sites: abstract_state.<method>(...) -> self.buf_overflow_helper.<method>(abstract_state, ...) (or self.<method>(...) when the caller is already inside the helper). Symmetric with the C++ bufOverflowHelper.<method>(as, ...) pattern.

Locally: SVF builds clean against brew llvm@21 (21.1.4) on darwin/arm64; SSA builds 100% (bin/ass3 produced); pysvf imports + Assignment-3 helpers import + Options.max_field_limit() and AbstractStateManager are visible. test-ae.{cpp,py} need a real .bc fixture to run end-to-end and have not been exercised yet.

Depends on (publish in this order):
1. SVF-npm sync-llvm-21 (republish svf-lib)
2. SVF-Python sync-llvm-21 (republish pysvf to TestPyPI)
3. this branch

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 63c0e59 commit deb5cb1

8 files changed

Lines changed: 177 additions & 16 deletions

.github/workflows/build.yml

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ jobs:
       - name: build
         run: |
           export SVF_DIR=$(npm root)/SVF
-          export LLVM_DIR=$(npm root)/llvm-18.1.0.obj
+          export LLVM_DIR=$(npm root)/llvm-21.1.0.obj
           export Z3_DIR=$(npm root)/z3.obj
           echo "SVF_DIR="$SVF_DIR
           echo "LLVM_DIR="$LLVM_DIR

Assignment-3/CPP/Assignment_3.cpp

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ void AbstractExecution::bufOverflowDetection(const SVF::SVFStmt* stmt) {
     AbstractState& as = getAbsStateFromTrace(gep->getICFGNode());
     NodeID lhs = gep->getLHSVarID();
     NodeID rhs = gep->getRHSVarID();
-    updateGepObjOffsetFromBase(as, as[lhs].getAddrs(), as[rhs].getAddrs(), as.getByteOffset(gep));
+    updateGepObjOffsetFromBase(as, as[lhs].getAddrs(), as[rhs].getAddrs(), bufOverflowHelper.getByteOffset(as, gep));

     /// TODO: your code starts from here

Assignment-3/CPP/Assignment_3.h

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,7 @@
  */
 #include "Assignment_3_Helper.h"
 #include "AE/Svfexe/AbsExtAPI.h"
+#include "AE/Svfexe/AbstractStateManager.h"
 #include "SVFIR/SVFIR.h"

 namespace SVF {
@@ -121,12 +122,17 @@ namespace SVF {

     /// Destructor
     virtual ~AbstractExecution() {
+        delete svfStateMgr;
     }

 protected:
     /// SVFIR and ICFG
     SVFIR* svfir;
     ICFG* icfg;
+    /// Adapter that lets us reuse AbsExtAPI (which now requires an
+    /// AbstractStateManager) without giving up our own pre/postAbsTrace.
+    /// Trace is synced in/out around AbsExtAPI calls.
+    AbstractStateManager* svfStateMgr = nullptr;

     /// Map a function to its corresponding WTO
     Map<const FunObjVar*, ICFGWTO*> funcToWTO;

Assignment-3/CPP/Assignment_3_Helper.cpp

Lines changed: 13 additions & 4 deletions
@@ -81,13 +81,13 @@ IntervalValue AbstractExecution::getAccessOffset(NodeID objId, const GepStmt* ge
     // Field-insensitive base object
     if (SVFUtil::isa<BaseObjVar>(obj)) {
         // get base size
-        IntervalValue accessOffset = as.getByteOffset(gep);
+        IntervalValue accessOffset = bufOverflowHelper.getByteOffset(as, gep);
         return accessOffset;
     }
     // A sub object of an aggregate object
     else if (SVFUtil::isa<GepObjVar>(obj)) {
         IntervalValue accessOffset =
-            bufOverflowHelper.getGepObjOffsetFromBase(SVFUtil::cast<GepObjVar>(obj)) + as.getByteOffset(gep);
+            bufOverflowHelper.getGepObjOffsetFromBase(SVFUtil::cast<GepObjVar>(obj)) + bufOverflowHelper.getByteOffset(as, gep);
         return accessOffset;
     }
     else{
@@ -543,7 +543,9 @@ void AbstractExecution::ensureAllAssertsValidated() {
 void AbstractExecution::analyse() {
     // Init WTOs for all functions, and handle Global ICFGNode of SVFModule
     initWTO();
-    utils = new AbsExtAPI(postAbsTrace);
+    AndersenWaveDiff* ander = AndersenWaveDiff::createAndersenWaveDiff(svfir);
+    svfStateMgr = new AbstractStateManager(svfir, ander);
+    utils = new AbsExtAPI(svfStateMgr);

     // Handle the global node
     handleGlobalNode();
@@ -687,8 +689,15 @@ void AbstractExecution::handleCallSite(const CallICFGNode* callNode) {
         updateStateOnExtCall(callNode);
     }
     else if (SVFUtil::isExtCall(callee)) {
-        // handle external API calls
+        // handle external API calls — sync our trace into the stateMgr so
+        // AbsExtAPI sees the right state, then copy any updates back out.
+        for (const auto& kv : postAbsTrace) {
+            svfStateMgr->updateAbstractState(kv.first, kv.second);
+        }
         utils->handleExtAPI(callNode);
+        for (const auto& kv : svfStateMgr->getTrace()) {
+            postAbsTrace[kv.first] = kv.second;
+        }
     }
     else if (recursiveFuns.find(callee) != recursiveFuns.end()) {
         // skip recursive functions
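
Reviewer note: the sync-in / call / sync-out shape used around utils->handleExtAPI(callNode) can be sketched in isolation. This is a minimal illustration only, not SVF code: StateManager, call_through_manager and fake_ext_api are hypothetical stand-ins, and abstract states are modelled as plain dicts.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins (not pysvf types): a trace maps a node id to that
# node's abstract state, modelled as a dict from variable name to (lo, hi).
State = Dict[str, Tuple[int, int]]
Trace = Dict[int, State]


class StateManager:
    """Toy manager that owns its own trace (stand-in for AbstractStateManager)."""

    def __init__(self) -> None:
        self._trace: Trace = {}

    def update_abstract_state(self, node: int, state: State) -> None:
        # Store a copy so the caller's trace and ours stay independent.
        self._trace[node] = dict(state)

    def get_trace(self) -> Trace:
        return self._trace


def call_through_manager(post_trace: Trace, mgr: StateManager,
                         handler: Callable[[StateManager], None]) -> None:
    # 1. Sync in: push every state we own into the manager.
    for node, state in post_trace.items():
        mgr.update_abstract_state(node, state)
    # 2. Run the callee, which reads and writes only through the manager.
    handler(mgr)
    # 3. Sync out: copy whatever the callee changed back into our trace.
    for node, state in mgr.get_trace().items():
        post_trace[node] = dict(state)


def fake_ext_api(mgr: StateManager) -> None:
    # Pretend an external call at node 7 sets `ret` to the interval [0, 10].
    mgr.get_trace().setdefault(7, {})["ret"] = (0, 10)


post_trace: Trace = {7: {"x": (1, 1)}}
call_through_manager(post_trace, StateManager(), fake_ext_api)
print(post_trace)
```

In this sketch both copy loops walk the whole trace, so each external call costs time proportional to the trace size; that may be worth measuring on large inputs.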

Assignment-3/CPP/Assignment_3_Helper.h

Lines changed: 59 additions & 0 deletions
@@ -28,10 +28,69 @@
 #include "AE/Core/AbstractState.h"
 #include "AE/Svfexe/AEDetector.h"
 #include "AE/Core/ICFGWTO.h"
+#include "SVFIR/SVFStatements.h"
+#include "Util/Options.h"
 #include "Util/SVFBugReport.h"
 namespace SVF {
 class AbstractExecutionHelper {
 public:
+    /// Compute the byte offset of a GepStmt against the given abstract state.
+    /// Replaces the upstream-removed `AbstractState::getByteOffset(GepStmt*)`.
+    /// Mirrors `AbstractStateManager::getGepByteOffset` but reads non-constant
+    /// indices directly from `as` (dense trace), since Assignment-3 manages
+    /// its own per-node trace separately from any AbstractStateManager.
+    IntervalValue getByteOffset(const AbstractState& as, const GepStmt* gep) {
+        if (gep->isConstantOffset())
+            return IntervalValue((s64_t)gep->accumulateConstantByteOffset());
+
+        IntervalValue res(0);
+        for (int i = gep->getOffsetVarAndGepTypePairVec().size() - 1; i >= 0; i--) {
+            const ValVar* idxOperandVar = gep->getOffsetVarAndGepTypePairVec()[i].first;
+            const SVFType* idxOperandType = gep->getOffsetVarAndGepTypePairVec()[i].second;
+
+            if (SVFUtil::isa<SVFArrayType>(idxOperandType) || SVFUtil::isa<SVFPointerType>(idxOperandType)) {
+                u32_t elemByteSize = 1;
+                if (const SVFArrayType* arrTy = SVFUtil::dyn_cast<SVFArrayType>(idxOperandType))
+                    elemByteSize = arrTy->getTypeOfElement()->getByteSize();
+                else if (SVFUtil::isa<SVFPointerType>(idxOperandType))
+                    elemByteSize = gep->getAccessPath().gepSrcPointeeType()->getByteSize();
+                else
+                    assert(false && "idxOperandType must be ArrType or PtrType");
+
+                if (const ConstIntValVar* op = SVFUtil::dyn_cast<ConstIntValVar>(idxOperandVar)) {
+                    s64_t lb = (double)Options::MaxFieldLimit() / elemByteSize >= op->getSExtValue()
+                               ? op->getSExtValue() * elemByteSize
+                               : Options::MaxFieldLimit();
+                    res = res + IntervalValue(lb, lb);
+                }
+                else {
+                    AbstractState& mut_as = const_cast<AbstractState&>(as);
+                    IntervalValue idxVal = mut_as[idxOperandVar->getId()].getInterval();
+                    if (idxVal.isBottom())
+                        res = res + IntervalValue(0, 0);
+                    else {
+                        s64_t ub = (idxVal.ub().getIntNumeral() < 0) ? 0
+                                   : (double)Options::MaxFieldLimit() / elemByteSize >= idxVal.ub().getIntNumeral()
+                                   ? elemByteSize * idxVal.ub().getIntNumeral()
+                                   : Options::MaxFieldLimit();
+                        s64_t lb = (idxVal.lb().getIntNumeral() < 0) ? 0
+                                   : (double)Options::MaxFieldLimit() / elemByteSize >= idxVal.lb().getIntNumeral()
+                                   ? elemByteSize * idxVal.lb().getIntNumeral()
+                                   : Options::MaxFieldLimit();
+                        res = res + IntervalValue(lb, ub);
+                    }
+                }
+            }
+            else if (const SVFStructType* structTy = SVFUtil::dyn_cast<SVFStructType>(idxOperandType)) {
+                res = res + IntervalValue(gep->getAccessPath().getStructFieldOffset(idxOperandVar, structTy));
+            }
+            else {
+                assert(false && "gep type pair only support arr/ptr/struct");
+            }
+        }
+        return res;
+    }
+
     /// Add a detected bug to the bug reporter and print the report
     ///@{
     void addBugToReporter(const AEException& e, const ICFGNode* node) {
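
Reviewer note: the clamping that getByteOffset applies to a non-constant array/pointer index can be checked with plain integers. The sketch below is an assumption-laden illustration, not the helper itself: clamp_scaled and index_interval_to_bytes are hypothetical names, intervals are (lb, ub) tuples, and the limit of 512 is an assumed value rather than something read from Options::MaxFieldLimit().

```python
from typing import Tuple

MAX_FIELD_LIMIT = 512  # assumed value, for illustration only


def clamp_scaled(index: int, elem_byte_size: int,
                 limit: int = MAX_FIELD_LIMIT) -> int:
    """Scale an index to bytes: negatives become 0, large values saturate."""
    if index < 0:
        return 0
    # Same comparison shape as the helper: compare limit / size against the
    # index, then multiply only when the product stays within the limit.
    if limit / elem_byte_size >= index:
        return index * elem_byte_size
    return limit


def index_interval_to_bytes(lb: int, ub: int, elem_byte_size: int,
                            limit: int = MAX_FIELD_LIMIT) -> Tuple[int, int]:
    """Map an index interval [lb, ub] to a byte-offset interval."""
    return (clamp_scaled(lb, elem_byte_size, limit),
            clamp_scaled(ub, elem_byte_size, limit))


# An index in [-2, 3] over 4-byte elements: the negative bound is floored.
print(index_interval_to_bytes(-2, 3, 4))
# An index in [0, 200] over 4-byte elements: the upper bound saturates.
print(index_interval_to_bytes(0, 200, 4))
```

Note that this models only the non-constant branch. In the diff above, the ConstIntValVar branch has no `< 0` guard, so a negative constant index is scaled rather than floored to 0.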

Assignment-3/Python/Assignment_3.py

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ def bufOverflowDetection(self, stmt: pysvf.SVFStmt):
         # Update GEP object offset from base
         self.buf_overflow_helper.updateGepObjOffsetFromBase(abstract_state,
             abstract_state[lhs].getAddrs(), abstract_state[rhs].getAddrs(),
-            abstract_state.getByteOffset(stmt)
+            self.buf_overflow_helper.getByteOffset(abstract_state, stmt)
         )

         # TODO: your code starts from here

Assignment-3/Python/Assignment_3_Helper.py

Lines changed: 95 additions & 8 deletions
@@ -217,6 +217,93 @@ def __init__(self, svfir: pysvf.SVFIR):
         self.node_to_bug_info = {}
         self.svfir = svfir

+    # ------------------------------------------------------------------
+    # Helpers that used to live as instance methods on `pysvf.AbstractState`.
+    # Upstream (Semi-Sparse refactor) moved them to `AbstractStateManager`,
+    # which requires a sparsity-aware trace we don't keep here. We re-implement
+    # the dense-mode behavior using only public AbstractState surface so the
+    # Python side mirrors the C++ side (`AbstractExecutionHelper::getByteOffset`).
+    # ------------------------------------------------------------------
+    def getByteOffset(self, abstract_state: pysvf.AbstractState, gep: pysvf.GepStmt) -> pysvf.IntervalValue:
+        if gep.isConstantOffset():
+            return pysvf.IntervalValue(gep.getConstantByteOffset())
+        max_field_limit = pysvf.Options.max_field_limit()
+        res = pysvf.IntervalValue(0)
+        pairs = gep.getOffsetVarAndGepTypePairVec()
+        for i in reversed(range(len(pairs))):
+            idx_var, idx_type = pairs[i]
+            if idx_type.isArrayType() or idx_type.isPointerType():
+                if idx_type.isArrayType():
+                    elem_byte_size = idx_type.asArrayType().getTypeOfElement().getByteSize()
+                else:
+                    elem_byte_size = gep.getSrcPointeeType().getByteSize()
+                if isinstance(idx_var, pysvf.ConstIntValVar):
+                    val = idx_var.getSExtValue()
+                    lb = val * elem_byte_size if (max_field_limit / elem_byte_size) >= val else max_field_limit
+                    res = res + pysvf.IntervalValue(lb, lb)
+                else:
+                    idx_val = abstract_state[idx_var.getId()].getInterval()
+                    if idx_val.isBottom():
+                        res = res + pysvf.IntervalValue(0, 0)
+                    else:
+                        ub_int = idx_val.ub().getNumeral()
+                        lb_int = idx_val.lb().getNumeral()
+                        ub = 0 if ub_int < 0 else (
+                            elem_byte_size * ub_int if (max_field_limit / elem_byte_size) >= ub_int
+                            else max_field_limit)
+                        lb = 0 if lb_int < 0 else (
+                            elem_byte_size * lb_int if (max_field_limit / elem_byte_size) >= lb_int
+                            else max_field_limit)
+                        res = res + pysvf.IntervalValue(lb, ub)
+            elif idx_type.isStructType():
+                res = res + pysvf.IntervalValue(gep.getStructFieldOffset(idx_var, idx_type.asStructType()))
+            else:
+                raise AssertionError("gep type pair only supports arr/ptr/struct")
+        return res
+
+    def getGepObjAddrs(self, abstract_state: pysvf.AbstractState, var_id: int, offset: pysvf.IntervalValue) -> pysvf.AddressValue:
+        gep_addrs = pysvf.AddressValue()
+        max_field_limit = pysvf.Options.max_field_limit()
+        lb = min(offset.lb().getNumeral(), max_field_limit)
+        ub = min(offset.ub().getNumeral(), max_field_limit)
+        addrs = abstract_state[var_id].getAddrs()
+        for i in range(lb, ub + 1):
+            for addr in addrs:
+                base_obj = abstract_state.getIDFromAddr(addr)
+                gep_obj = self.svfir.getGepObjVar(base_obj, i)
+                gep_addrs.insert(pysvf.AbstractState.getVirtualMemAddress(gep_obj))
+        return gep_addrs
+
+    def getPointeeElement(self, abstract_state: pysvf.AbstractState, var_id: int):
+        ptr_val = abstract_state[var_id]
+        if not ptr_val.isAddr():
+            return None
+        for addr in ptr_val.getAddrs():
+            obj_id = abstract_state.getIDFromAddr(addr)
+            if obj_id == 0:
+                continue
+            return self.svfir.getBaseObject(obj_id).getType()
+        return None
+
+    def getAllocaInstByteSize(self, abstract_state: pysvf.AbstractState, addr: pysvf.AddrStmt) -> int:
+        rhs = addr.getRHSVar()
+        if not isinstance(rhs, pysvf.ObjVar):
+            raise AssertionError("Addr rhs value is not ObjVar")
+        base = self.svfir.getBaseObject(rhs.getId())
+        if base.isConstantByteSize():
+            return base.getByteSizeOfObj()
+        max_field_limit = pysvf.Options.max_field_limit()
+        sizes = addr.getArrSize()
+        res = 1
+        for value in sizes:
+            sz_val = abstract_state[value.getId()].getInterval()
+            if sz_val.isBottom():
+                ub = max_field_limit
+            else:
+                ub = sz_val.ub().getNumeral()
+            res = res * ub if res * ub <= max_field_limit else max_field_limit
+        return int(res)
+
     def reportBufOverflow(self, node, msg):
         """
         Record an overflow node and its associated exception.
@@ -277,7 +364,7 @@ def handleMemcpy(self, abstractState: pysvf.AbstractState, dst: pysvf.SVFVar, sr
         if dst.getType().isArrayTy():
             elemSize = dst.getType().getTypeOfElement().getByteSize()
         elif dst.getType().isPointerTy():
-            elemType = abstractState.getPointeeElement(dstId)
+            elemType = self.getPointeeElement(abstractState, dstId)
             if elemType.isArrayTy():
                 elemSize = elemType.getTypeOfElement().getByteSize()
             else:
@@ -288,8 +375,8 @@ def handleMemcpy(self, abstractState: pysvf.AbstractState, dst: pysvf.SVFVar, sr
         range_val = size/elemSize
         if abstractState.inVarToAddrsTable(dstId) and abstractState.inVarToAddrsTable(srcId):
             for index in range(0, int(range_val)):
-                expr_src = abstractState.getGepObjAddrs(srcId, pysvf.IntervalValue(index))
-                expr_dst = abstractState.getGepObjAddrs(dstId, pysvf.IntervalValue(index + start_idx))
+                expr_src = self.getGepObjAddrs(abstractState, srcId, pysvf.IntervalValue(index))
+                expr_dst = self.getGepObjAddrs(abstractState, dstId, pysvf.IntervalValue(index + start_idx))
                 for addr_src in expr_src:
                     for addr_dst in expr_dst:
                         objId = abstractState.getIDFromAddr(addr_src)
@@ -320,15 +407,15 @@ def getStrlen(self, abstractState, strValue):
             icfg_node = base_object.getICFGNode()
             for stmt in icfg_node.getSVFStmts():
                 if isinstance(stmt, pysvf.AddrStmt):
-                    dst_size = abstractState.getAllocaInstByteSize(stmt)
+                    dst_size = self.getAllocaInstByteSize(abstractState, stmt)

         length = 0
         elem_size = 1

         # Calculate the string length
         if abstractState.getVar(value_id).isAddr():
             for index in range(dst_size):
-                expr0 = abstractState.getGepObjAddrs(value_id, pysvf.IntervalValue(index))
+                expr0 = self.getGepObjAddrs(abstractState, value_id, pysvf.IntervalValue(index))
                 val = pysvf.AbstractValue()

                 for addr in expr0:
@@ -343,7 +430,7 @@ def getStrlen(self, abstractState, strValue):
         if strValue.getType().isArrayTy():
             elem_size = strValue.getType().getTypeOfElement().getByteSize()
         elif strValue.getType().isPointerTy():
-            elem_type = abstractState.getPointeeElement(value_id)
+            elem_type = self.getPointeeElement(abstractState, value_id)
             if elem_type:
                 if elem_type.isArrayTy():
                     elem_size = elem_type.getTypeOfElement().getByteSize()
@@ -1071,14 +1158,14 @@ def getAccessOffset(self, objId: int, gep: pysvf.GepStmt) -> pysvf.IntervalValue
         # Field-insensitive base object
         if isinstance(obj, pysvf.BaseObjVar):
             # Get base size
-            access_offset = abstract_state.getByteOffset(gep)
+            access_offset = self.buf_overflow_helper.getByteOffset(abstract_state, gep)
             return access_offset

         # A sub-object of an aggregate object
         elif isinstance(obj, pysvf.GepObjVar):
             access_offset = (
                 self.buf_overflow_helper.getGepObjOffsetFromBase(obj)
-                + abstract_state.getByteOffset(gep)
+                + self.buf_overflow_helper.getByteOffset(abstract_state, gep)
             )
             return access_offset
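
Reviewer note: the non-constant branch of getAllocaInstByteSize is a saturating product over the upper bounds of the dimension sizes. A minimal sketch of just that arithmetic follows, under stated assumptions: alloca_byte_size is a hypothetical name, None stands in for a bottom interval, and the limit of 512 is an assumed value rather than the result of pysvf.Options.max_field_limit().

```python
from typing import Iterable, Optional, Tuple

# None models a bottom interval; otherwise an interval is (lb, ub).
Interval = Optional[Tuple[int, int]]


def alloca_byte_size(dim_sizes: Iterable[Interval], limit: int = 512) -> int:
    """Multiply the upper bound of each dimension, saturating at `limit`."""
    res = 1
    for interval in dim_sizes:
        # An unknown (bottom) size is treated as the largest allowed size.
        ub = limit if interval is None else interval[1]
        # Saturate instead of letting the running product exceed the limit.
        res = res * ub if res * ub <= limit else limit
    return res


print(alloca_byte_size([(4, 4), (8, 8)]))        # two known dimensions
print(alloca_byte_size([None]))                  # one unknown dimension
print(alloca_byte_size([(100, 100), (100, 100)]))  # product over the limit
```

Because an unknown dimension maps to the limit itself, any bottom size makes the result saturate, which is the conservative choice for an overflow check.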

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ ARG TARGETPLATFORM
 RUN set -e

 # Define LLVM version.
-ENV llvm_version=18.1.0
+ENV llvm_version=21.1.0

 # Define home directory
 ENV HOME=/home/SVF-tools
