@@ -199,6 +199,92 @@ Query counters again from the host with:
199199python3 ./zettai_host_dcd_gfam_test.py --query
200200```
201201
202+ ### Zettai Type2 tmatmul and CXL.mem ioctl test
203+
204+ The Zettai switch CCI device (`7a74:a123`) creates a guest char device such as
205+ `/dev/zettai_cxl0d003`. The current Linux driver ABI for this device is
206+ `ioctl()`, not `io_uring_cmd`; `/tmp/zettai-qmp.sock` remains a host-side QMP
207+ socket used for bind/add/query orchestration.
208+
209+ Build the guest helper:
210+
211+ ``` bash
212+ gcc -O2 -Wall -Wextra -o zettai_tmatmul_ctl zettai_tmatmul_ctl.c
213+ ```
214+
215+ Check whether QEMU exposed the tmatmul CSR block:
216+
217+ ``` bash
218+ ./zettai_tmatmul_ctl --dev /dev/zettai_cxl0d003 --info
219+ ```
220+
221+ If dmesg reports `tmatmul=0` or the tool prints `tmatmul_present=no`, QEMU only
222+ exposed the switch CCI BAR and tmatmul smoke runs will return `ENODEV`. CXL.mem
223+ read/write can still be tested by passing a real nonzero HPA base from a CXL
224+ region or decoder resource:
225+
226+ ``` bash
227+ cxl list -R -u
228+ ./zettai_tmatmul_ctl --dev /dev/zettai_cxl0d003 \
229+ --mem-write --hpa-base 0xYOUR_REGION_RESOURCE --hpa-size 0x10000000 \
230+ --offset 0 --size 4096 --pattern 0x5a
231+ ./zettai_tmatmul_ctl --dev /dev/zettai_cxl0d003 \
232+ --mem-read --hpa-base 0xYOUR_REGION_RESOURCE --hpa-size 0x10000000 \
233+ --offset 0 --size 64
234+ ```
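Before issuing a `--mem-write` or `--mem-read`, it can help to confirm that the requested range lies inside the HPA window. The sketch below is not part of `zettai_tmatmul_ctl`; `fits_hpa_window` is a hypothetical helper name, and it assumes `--offset` is relative to `--hpa-base` and that `--hpa-size` is the window length in bytes.

```bash
# Hypothetical helper (not shipped with the tool): succeeds when the byte range
# [offset, offset + size) fits inside a window of the given length.
# Arguments may be decimal or 0x-prefixed hex; shell arithmetic accepts both.
fits_hpa_window() {
    _off=$(( $1 )); _size=$(( $2 )); _win=$(( $3 ))
    # Reject empty accesses and anything that runs past the end of the window.
    [ "$_size" -gt 0 ] && [ "$_off" -ge 0 ] && [ $(( $_off + $_size )) -le "$_win" ]
}

# Values from the commands above: 4096 bytes at offset 0 in a 0x10000000-byte window.
if fits_hpa_window 0 4096 0x10000000; then
    echo "in range"
else
    echo "out of range"
fi
```

The check is purely arithmetic and does not touch the device, so it can run on the host or in the guest before the real command.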
235+
236+ Once the QEMU Zettai device exposes a BAR large enough for the tmatmul CSR window
237+ at `BAR0 + 0x1c0000`, run:
238+
239+ ``` bash
240+ ./zettai_tmatmul_ctl --dev /dev/zettai_cxl0d003 \
241+ --smoke --hpa-base 0xYOUR_REGION_RESOURCE --hpa-size 0x10000000
242+ ```
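A script can gate the smoke run on the `--info` output instead of waiting for `ENODEV`. This is a minimal sketch: `tmatmul_available` is a hypothetical helper, and it relies only on the `tmatmul_present=no` string mentioned above. Since the positive form of that line is not documented here, any output without that string (including empty output) is treated as "not reported absent".

```bash
# Hypothetical helper: reads --info text on stdin and succeeds unless it
# contains the tmatmul_present=no marker described above.
tmatmul_available() {
    ! grep -q 'tmatmul_present=no'
}

DEV=/dev/zettai_cxl0d003
# Only query the device when both the char device and the built helper exist.
if [ -c "$DEV" ] && [ -x ./zettai_tmatmul_ctl ]; then
    if ./zettai_tmatmul_ctl --dev "$DEV" --info | tmatmul_available; then
        echo "tmatmul not reported absent; --smoke can be attempted"
    else
        echo "tmatmul_present=no; skipping --smoke (it would return ENODEV)"
    fi
fi
```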
243+
244+ ### Zettai benchmark harness
245+
246+ For a repeatable host-side smoke benchmark, use:
247+
248+ ``` bash
249+ QEMU_NET_MODE=none \
250+ KERNEL_IMAGE=/path/to/bzImage \
251+ DISK_IMAGE=/path/to/rootfs.img \
252+ ./zettai_benchmark.sh --launch --keep-qemu
253+ ```
254+
255+ The harness launches QEMU with a QMP socket, binds `cxl-dcd0`, adds a 256 MiB
256+ DCD extent, queries CXLMemSim DCD/GFAM counters, and writes logs under
257+ `build/zettai-bench/`. If QEMU is already running, omit `--launch` and keep the
258+ same `ZETTAI_QMP_SOCKET` value used by `QEMU_EXTRA_ARGS`.
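Whether to pass `--launch` can be decided by checking for the QMP socket. The sketch below is an illustration, not part of `zettai_benchmark.sh`: `launch_arg` is a hypothetical helper, and the default path is assumed to be the `/tmp/zettai-qmp.sock` socket mentioned earlier. A stale socket file left by a crashed QEMU would also pass the test, so treat the result as a hint.

```bash
# Assumption: fall back to the socket path mentioned earlier in this README.
ZETTAI_QMP_SOCKET="${ZETTAI_QMP_SOCKET:-/tmp/zettai-qmp.sock}"

# Hypothetical helper: prints "--launch" when no socket exists at the given
# path, and an empty string when one does (QEMU is presumably running).
launch_arg() {
    if [ -S "$1" ]; then
        echo ""
    else
        echo "--launch"
    fi
}

echo "suggested extra argument: '$(launch_arg "$ZETTAI_QMP_SOCKET")'"
```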
259+
260+ To include the in-guest DCD region setup and Type2 fabric-memory BAR benchmark,
261+ provide SSH access to the guest:
262+
263+ ``` bash
264+ ZETTAI_GUEST_SSH="ssh root@192.168.122.10" \
265+ ZETTAI_GUEST_DIR=/root/CXLMemSim/qemu_integration \
266+ ./zettai_benchmark.sh --guest --run-type2-bench
267+ ```
268+
269+ The Type2 benchmark is `guest_libcuda/cxl_bar_benchmark.c`. It discovers the
270+ `cxl-type2` endpoint (`8086:0d92`), reports BAR register and data-region
271+ latency/bandwidth, then exercises the Zettai fabric-memory controls exposed by
272+ QEMU: `DCD_GET_INFO`, optional DCD add/release when free capacity exists,
273+ `GFAM_GET_INFO`, and `MHSLD_GET_INFO/SET_HEAD`.
274+
275+ For a bounded CXL.cache command-path check, build the optional static binary
276+ and run only the prefetch section:
277+
278+ ``` bash
279+ make -C guest_libcuda static
280+ sudo ./guest_libcuda/cxl_bar_benchmark.static \
281+ --prefetch-only --prefetch-iters 5
282+ ```
283+
284+ This mode is useful when the guest is reached through a serial shell because it
285+ avoids the full BAR bandwidth suite while still exercising read- and
286+ write-intent `CACHE_PREFETCH`.
287+
202288## Features
203289
204290- **Cacheline-granular access**: All memory operations are performed at 64-byte cacheline granularity