Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
f653249
MiniSat: add set_limit_incremental_simplification
tautschnig Apr 8, 2026
1413542
Fix CaDiCaL l_get crash for var_no 0
tautschnig Apr 7, 2026
a5fbd61
Add --wide-pointer-encoding to bv_pointerst
tautschnig Mar 31, 2026
73ed698
Add alternative test descriptor for wide pointer encoding
tautschnig Apr 1, 2026
d05e05c
Address-based pointer equality and comparison (fixes #8200, #8161)
tautschnig Apr 1, 2026
0cedf2f
Defer I2P backward constraints to finish_eager_conversion
tautschnig Apr 1, 2026
a56aab1
Add object_base_address expression and encoding infrastructure
tautschnig Apr 2, 2026
3330912
Fix boolbv_widtht for wide pointer encoding
tautschnig Apr 2, 2026
1a3e853
Fix byte-level pointer representation for wide encoding
tautschnig Apr 2, 2026
4e04ba9
Add --model-stack-layout for stack growth and adjacent placement
tautschnig Apr 3, 2026
2ac85cf
Fix #2117: address reuse after free with wide pointer encoding
tautschnig Apr 3, 2026
400d893
Optimize backward I2P constraints for performance
tautschnig Apr 3, 2026
387d3bd
Add regression test for #8200: cast-to-pointer
tautschnig Apr 8, 2026
66ba67a
Add regression test for #8161: inconsistent pointer comparison
tautschnig Apr 8, 2026
948f12a
Add regression test for #8103: I2P dereference
tautschnig Apr 8, 2026
156de78
Add regression test for #2117: address reuse after free
tautschnig Apr 8, 2026
e5169e5
Add regression test for --model-stack-layout
tautschnig Apr 8, 2026
22688ed
Fix Pointer_comparison3 wide descriptor: expect 13 failures, not 11
tautschnig Apr 8, 2026
8521863
Add documentation for wide pointer encoding
tautschnig Apr 3, 2026
a186af2
Integrate backward I2P refinement into bv_refinementt::check_SAT
tautschnig Apr 8, 2026
53cbe27
Fix simplifier dynamic object detection for malloc_may_alias
tautschnig Apr 8, 2026
141e1b7
Refine value_set_dereference comment for pointer-containing types
tautschnig Apr 8, 2026
cce177b
Fix crashes and correctness: boolbv_width, byte_extract, backward I2P
tautschnig Apr 8, 2026
75bb4b5
Fix havoc_slice/test_struct_raw_bytes descriptor: expect 11 failures
tautschnig Apr 8, 2026
a1c6844
Add wide-pointer-encoding descriptors for Malloc10, Malloc14, null8
tautschnig Apr 8, 2026
f2f29a0
Tag tests with no-wide-pointer-encoding for different expected results
tautschnig Apr 8, 2026
65eeb1f
Fix MiniSat crash: freeze variables from finish_eager_conversion
tautschnig Apr 7, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
149 changes: 149 additions & 0 deletions doc/architectural/wide-pointer-encoding.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
# Wide Pointer Encoding

## Overview

The `--wide-pointer-encoding` option extends CBMC's pointer bitvector
representation to include a flat integer address alongside the standard
object/offset encoding. This enables precise modeling of pointer-to-integer
casts, integer-to-pointer casts, pointer comparison, and address reuse
after free.

## Pointer Layout

Standard encoding (64-bit platform):
```
[offset(56) | object(8)] = 64 bits
```

Wide encoding (64-bit platform):
```
[object(64) | offset(64) | address(64)] = 192 bits
```

Each component has the full platform pointer width, removing the
`--object-bits` limitation entirely.

## What It Fixes

### Issue #881: Object-bits limitation

The standard encoding splits the pointer width between object bits and
offset bits. With the default 8 object bits, only 256 objects can be
tracked, and offsets are limited to 56 bits. The wide encoding gives
each component the full 64 bits.

### Issue #8200: Cast-to-pointer misses error

```c
char *x = "";
char *ptr = (char *)0x55a8a2e6b007;
assert(ptr != x); // Standard: SUCCESS (wrong). Wide: FAILURE (correct).
```

The wide encoding compares pointers by their flat address, not by
object identity. Two pointers from different objects can have the
same address.

### Issue #8161: Inconsistent pointer comparison

```c
int *n = &m;
if((unsigned long)n >= (unsigned long)(-4095))
assert(...); // Standard: inconsistent. Wide: consistent.
```

Pointer-to-integer casts return the flat address, making arithmetic
on pointer-derived integers consistent with plain integer arithmetic.

### Issue #8103: Integer-to-pointer casting

```c
uint64_t addr = constrained_to_be_within(buffer);
uint8_t *p = (void *)addr;
assert(*p == expected); // Standard: FAILURE (wrong). Wide: SUCCESS (correct).
```

The wide encoding adds address-based dereference dispatch: when a
pointer comes from an integer-to-pointer cast, the dereference reads
from the object whose address range contains the pointer's flat address.

### Issue #2117: Address reuse after free

```c
int *x = malloc(sizeof(int));
free(x);
int *y = malloc(sizeof(int));
if(x == y) assert(0); // Standard: unreachable. Wide: reachable.
```

The wide encoding allows `malloc` to return the same address after
`free` by relaxing the non-overlapping constraint for dynamic objects
and preventing the simplifier from assuming different dynamic objects
have different addresses.

## Additional Features

### Stack Layout Modeling (`--model-stack-layout`)

When combined with `--wide-pointer-encoding`, this option places stack
variables adjacently with architecture-appropriate growth direction
(downward for x86/ARM, upward for HPPA):

```c
int a, b, c;
assert(&a > &b); // Stack growth direction
assert(&a - &b == sizeof(b)); // Adjacent placement
char *p = (char *)&a;
p[sizeof(a)] = 42;
assert(*(char *)&b == 42); // Buffer overflow modeling
```

## Technical Details

### Pointer Equality and Comparison

With the wide encoding, `p == q` compares flat addresses (not object
identity). Relational comparisons (`<`, `>`, `<=`, `>=`) also use
flat addresses, making cross-object comparisons well-defined.

### Pointer-to-Integer Cast (P2I)

Returns the flat address component of the pointer. This is the
byte-level address that the program would see on a real machine.

### Integer-to-Pointer Cast (I2P)

Creates a pointer with fresh object/offset variables. Forward
constraints link the object to the address: `obj == i => base[i] + off == addr`.
Backward constraints are added lazily via a refinement loop in
`dec_solve`: if the SAT model assigns an object inconsistent with
the address range, backward constraints are added and the formula
is re-solved.

### Byte Operations

Byte-extract and byte-update on pointer types operate on the 64-bit
address component only. The object/offset bits are internal encoding
and not byte-visible. Pointer arrays store 64-bit addresses per
element, matching symex's byte-level element size.

### Non-Overlapping Constraints

Objects are ordered by index with a chain constraint:
`base[0] + size[0] <= base[1] <= ...` (O(n) instead of O(n²)).
For consecutive dynamic objects of the same size, address reuse is
allowed: `base[next] == base[prev] OR base[next] >= end[prev]`.

### Performance

The wide encoding creates 3× wider pointer bitvectors, increasing
the SAT formula size. Typical overhead is 1.5× on the regression
suite. Programs with many integer-to-pointer casts may be slower
due to the backward constraint refinement.

## Command-Line Options

- `--wide-pointer-encoding`: Enable the wide pointer encoding.
- `--model-stack-layout`: Place stack variables adjacently with
architecture-appropriate growth direction (requires
`--wide-pointer-encoding`).
11 changes: 11 additions & 0 deletions regression/cbmc/Malloc10/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CORE no-new-smt wide-pointer-encoding
main.c

^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$
--
^warning: ignoring
--
Wide pointer encoding: address reuse allows a malloc'd pointer to
equal a previously seen pointer (elem), triggering the assertion.
2 changes: 1 addition & 1 deletion regression/cbmc/Malloc10/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE no-new-smt
CORE no-new-smt no-wide-pointer-encoding
main.c

^EXIT=0$
Expand Down
13 changes: 13 additions & 0 deletions regression/cbmc/Malloc14/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE wide-pointer-encoding
main.c

^EXIT=10$
^SIGNAL=0$
^\[main\.assertion\.1\] .* same memory allocated twice: FAILURE$
^VERIFICATION FAILED$
--
^warning: ignoring
--
Wide pointer encoding: two mallocs can return the same address
(address reuse after free). The assertion that they differ is
unsound.
2 changes: 1 addition & 1 deletion regression/cbmc/Malloc14/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.c

^EXIT=0$
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
CORE wide-pointer-encoding
main.i
--32 --little-endian
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
15 changes: 15 additions & 0 deletions regression/cbmc/Pointer_array4/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
CORE wide-pointer-encoding
main.i
--32
^EXIT=10$
^SIGNAL=0$
^\[main.overflow.1\] line 8 arithmetic overflow on signed - in iThird - iFirst: FAILURE$
^\*\* 1 of 2 failed
^VERIFICATION FAILED$
--
^warning: ignoring
--
Wide pointer encoding: addresses can be in the upper half of the
32-bit address space, so (int)pointer can be negative and the
signed subtraction iThird - iFirst can overflow. This is correct
behavior — casting pointers to signed int is implementation-defined.
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_array4/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.i
--32
^EXIT=0$
Expand Down
11 changes: 11 additions & 0 deletions regression/cbmc/Pointer_array6/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
KNOWNBUG wide-pointer-encoding
main.c
--no-malloc-may-fail
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
--
Wide pointer encoding: byte-level operations on pointer arrays
see the 192-bit encoding instead of the 64-bit flat address.
Requires endianness map fix for pointer byte representation.
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
CORE no-new-smt wide-pointer-encoding
main.c
--no-malloc-may-fail
^\[main.assertion.5\] line 28 assertion __CPROVER_POINTER_OBJECT\(p\) == __CPROVER_POINTER_OBJECT\(p1\): FAILURE$
^\*\* 13 of 59 failed
^VERIFICATION FAILED$
^EXIT=10$
^SIGNAL=0$
--
^warning: ignoring
--
Wide pointer encoding: relational pointer comparisons use flat
addresses, so p < p1+1 does not imply same_object(p, p1).
Address reuse for dynamic objects adds further failures.
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_comparison3/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE no-new-smt
CORE no-new-smt no-wide-pointer-encoding
main.c
--no-malloc-may-fail
^\[main.assertion.3\] line 21 always false for different objects: FAILURE$
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
CORE broken-smt-backend no-new-smt wide-pointer-encoding
main.c

^EXIT=10$
^SIGNAL=0$
^\[main.assertion.1\] line 6 correct: SUCCESS
^\[main.pointer.1\] line 8 same object violation in array - other_array: FAILURE$
^\[main.assertion.3\] line 13 undefined behavior: FAILURE$
^\[main.assertion.8\] line 28 end plus 2 is nondet: (UNKNOWN|FAILURE)$
^\[main.pointer_arithmetic.\d+\] line 28 pointer relation: pointer outside object bounds in p: FAILURE$
^\*\* [56] of \d+ failed
^VERIFICATION FAILED$
--
^warning: ignoring
^CONVERSION ERROR$
--
Wide pointer encoding variant: fewer overflow failures because
pointer subtraction overflow is checked on the 56-bit offset
component, not the full 64-bit packed representation.
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_difference2/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE broken-smt-backend no-new-smt
CORE broken-smt-backend no-new-smt no-wide-pointer-encoding
main.c

^\[main.assertion.1\] line 6 correct: SUCCESS
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
KNOWNBUG thorough-paths no-new-smt wide-pointer-encoding
test.c
--no-simplify --unwind 300 --object-bits 8
^EXIT=6$
^SIGNAL=0$
--
^warning: ignoring
--
Wide pointer encoding: the object-bits limitation is removed
(full 64-bit object width), so the "too many addressed objects"
error is never triggered. This test is not applicable.
2 changes: 1 addition & 1 deletion regression/cbmc/address_space_size_limit1/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE thorough-paths no-new-smt
CORE thorough-paths no-new-smt no-wide-pointer-encoding
test.c
--no-simplify --unwind 300 --object-bits 8
too many addressed objects
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE thorough-smt-backend no-new-smt wide-pointer-encoding
main.i
--32 --little-endian --object-bits 26
^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$
^.* dereference failure: pointer outside object bounds in .*: FAILURE$
--
^warning: ignoring
--
Wide pointer encoding: the object-bits limitation is removed but
the bounds violation is still detected. Slower than standard
encoding due to 3x pointer width (requires thorough tag).
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE wide-pointer-encoding broken-cprover-smt-backend no-new-smt
test_struct_raw_bytes.c

^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$
^\*\* 11 of
--
^.*expecting SUCCESS: FAILURE$
^.*dereference .*: FAILURE$
--
Wide pointer encoding: same results as standard encoding for
struct byte operations.
3 changes: 2 additions & 1 deletion regression/cbmc/memory_allocation1/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE broken-smt-backend
CORE broken-smt-backend no-wide-pointer-encoding
main.c

^EXIT=10$
Expand All @@ -12,3 +12,4 @@ main.c
^warning: ignoring
--
This test only fails when using SMT solvers as back-end on Windows.
Broken under --wide-pointer-encoding: __CPROVER_allocated_memory has no body. Needs further work.
4 changes: 3 additions & 1 deletion regression/cbmc/mm_io1/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.c
--no-standard-checks
^Replaced MMIO operations: 3 read\(s\), 1 write\(s\)
Expand All @@ -7,3 +7,5 @@ main.c
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
--
Broken under --wide-pointer-encoding: memory-mapped I/O dispatch not yet supported. Needs further work.
13 changes: 13 additions & 0 deletions regression/cbmc/null8/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE wide-pointer-encoding
main.c
--verbosity 8
Generated 1 VCC\(s\), [01] remaining after simplification
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
--
Wide pointer encoding: the VCC may not be simplified away because
forward propagation of dynamic object addresses is blocked (needed
for address reuse modeling). The assertion is still proved correct.
2 changes: 1 addition & 1 deletion regression/cbmc/null8/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.c
--verbosity 8
Generated 1 VCC\(s\), 0 remaining after simplification
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
KNOWNBUG wide-pointer-encoding
main.c
--show-points-to-sets --json-ui --no-standard-checks
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
--
Wide pointer encoding: the address-based dereference dispatch adds
extra entries to the Value field in the JSON output, changing the
expected format.
2 changes: 1 addition & 1 deletion regression/cbmc/points-to-sets/test_json.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.c
--show-points-to-sets --json-ui --no-standard-checks
activate-multi-line-match
Expand Down
Loading
Loading