Commit a2fb2e4

Mangle buffer names in MchanDma chunked-1D fallback
When a contiguous DMA transfer exceeds the 17-bit mchan command-size field, the fallback splits it into `_MAX_1D_TRANSFER_BYTES` chunks and emits one mchan_transfer_1d(cmd, loc, ext) per chunk.

The opRepr values were built with `f"((char*){buf.name} + {offset})"`, but ExecutionBlock._mangleOpRepr only rewrites values that ctxt.is_buffer() recognises. A formatted string slips through unchanged, so the emitted C refers to `TILING_CODEGEN_L1_foo_ref` while the declaration is `DeeployNetwork_TILING_CODEGEN_L1_foo_ref`.

Seen on ResNet8 tiled L1>=200KB L3 for layer3.conv2: the 147456-byte weight load splits into 131072+16384, and clang fails with "use of undeclared identifier 'TILING_CODEGEN_L1_node_23_..._weight_ref'".

Fix: call ctxt._mangle() on the buffer names once before the chunking loop so the emitted strings match the declared variables. The same bug is present in the GAP9 MchanDma port, so patch both. This was blocking end-to-end gvsoc runs for any configuration whose weight tiles exceed 128KB.
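The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Deeploy's actual code: `ToyContext` and `mangle_op_repr` are hypothetical stand-ins for `NetworkContext` and `ExecutionBlock._mangleOpRepr`, and the only behaviour assumed is what the commit message states (values recognised by `is_buffer()` get the `DeeployNetwork_` prefix, everything else passes through).

```python
from typing import Dict, Set

PREFIX = "DeeployNetwork_"  # prefix quoted in the commit message


class ToyContext:
    """Hypothetical stand-in for NetworkContext (illustration only)."""

    def __init__(self, buffers: Set[str]) -> None:
        self._buffers = buffers

    def is_buffer(self, value: object) -> bool:
        # Only an exact, registered buffer name counts as a buffer.
        return isinstance(value, str) and value in self._buffers

    def _mangle(self, name: str) -> str:
        return PREFIX + name


def mangle_op_repr(ctxt: ToyContext, op_repr: Dict[str, object]) -> Dict[str, object]:
    """Toy version of the mangling pass: rewrite only recognised buffer names."""
    return {
        key: ctxt._mangle(value) if ctxt.is_buffer(value) else value
        for key, value in op_repr.items()
    }


name = "TILING_CODEGEN_L1_foo_ref"
ctxt = ToyContext({name})

# A plain buffer name is recognised and prefixed by the pass.
plain = mangle_op_repr(ctxt, {"loc": name})

# A formatted expression is not a registered name, so it passes through unmangled.
broken = mangle_op_repr(ctxt, {"loc": f"((char*){name} + 0)"})

# Mangling before formatting gives the declared identifier; the pass leaves it alone.
fixed = mangle_op_repr(ctxt, {"loc": f"((char*){ctxt._mangle(name)} + 0)"})

print(plain["loc"])
print(broken["loc"])
print(fixed["loc"])
```

In this sketch the pre-mangled string is not double-prefixed, because the whole expression is still not a registered buffer name and the pass skips it.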
1 parent d3fa7a1 commit a2fb2e4

2 files changed

Lines changed: 20 additions & 4 deletions

Deeploy/Targets/GAP9/DMA/MchanDma.py

Lines changed: 8 additions & 2 deletions
@@ -105,14 +105,20 @@ def transfer(self, ctxt: NetworkContext, externalBuffer: VariableBuffer, localBu
         mchanFlags += (1 << 1) # increment addresses
         mchanFlags += (1 << 3) # event enable
         template = self._transferTemplates[1]
+        # Explicitly mangle the buffer names: see Siracusa MchanDma for the
+        # same fix — _mangleOpRepr only rewrites plain buffer names, so a
+        # string like "((char*)foo + 0)" would ship without the
+        # DeeployNetwork_ prefix and fail to compile.
+        locName = ctxt._mangle(localBuffer.name)
+        extName = ctxt._mangle(externalBuffer.name)
         chunks: List[CodeSnippet] = []
         offset = 0
         while offset < totalSize:
             chunkSize = min(self._MAX_1D_TRANSFER_BYTES, totalSize - offset)
             cmd = (mchanFlags << 17) + chunkSize
             opRepr: OperatorRepresentation = {
-                "loc": f"((char*){localBuffer.name} + {offset})",
-                "ext": f"((char*){externalBuffer.name} + {offset})",
+                "loc": f"((char*){locName} + {offset})",
+                "ext": f"((char*){extName} + {offset})",
                 "future": future.name,
                 "cmd": cmd,
                 "size": chunkSize,

Deeploy/Targets/PULPOpen/DMA/MchanDma.py

Lines changed: 12 additions & 2 deletions
@@ -91,14 +91,24 @@ def transfer(self, ctxt: NetworkContext, externalBuffer: VariableBuffer, localBu
         mchanFlags += (1 << 1) # increment addresses
         mchanFlags += (1 << 3) # event enable
         template = self._transferTemplates[1]
+        # Use the network-prefixed buffer name so the emitted C references
+        # the variable that was actually declared (e.g.
+        # DeeployNetwork_TILING_CODEGEN_L1_foo_ref) rather than the raw
+        # buffer.name. The mangling normally happens inside
+        # ExecutionBlock._mangleOpRepr, but that pass only rewrites values
+        # that match is_buffer() — a formatted string like
+        # "((char*)foo + 0)" escapes the rewrite and produces undeclared
+        # identifier build errors for any weight/tile >131072 bytes.
+        locName = ctxt._mangle(localBuffer.name)
+        extName = ctxt._mangle(externalBuffer.name)
         chunks: List[CodeSnippet] = []
         offset = 0
         while offset < totalSize:
             chunkSize = min(self._MAX_1D_TRANSFER_BYTES, totalSize - offset)
             cmd = (mchanFlags << 17) + chunkSize
             opRepr: OperatorRepresentation = {
-                "loc": f"((char*){localBuffer.name} + {offset})",
-                "ext": f"((char*){externalBuffer.name} + {offset})",
+                "loc": f"((char*){locName} + {offset})",
+                "ext": f"((char*){extName} + {offset})",
                 "future": future.name,
                 "cmd": cmd,
             }
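
The chunking arithmetic in both hunks can be checked against the ResNet8 case from the commit message. This is a rough sketch under stated assumptions: `MAX_1D_TRANSFER_BYTES = 131072` is inferred from the reported 131072+16384 split rather than read from the source, the `offset += size` step is assumed because the loop tail falls outside the visible hunks, and `split_1d` and `chunk_operands` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# Assumed value, inferred from the 131072+16384 split in the commit message.
MAX_1D_TRANSFER_BYTES = 131072


def split_1d(total_size: int, max_chunk: int = MAX_1D_TRANSFER_BYTES) -> List[Tuple[int, int]]:
    """Return (offset, size) pairs covering total_size in chunks of at most max_chunk bytes."""
    chunks: List[Tuple[int, int]] = []
    offset = 0
    while offset < total_size:
        size = min(max_chunk, total_size - offset)
        chunks.append((offset, size))
        offset += size  # assumed loop step; not visible in the diff hunks
    return chunks


def chunk_operands(loc_name: str, ext_name: str, total_size: int) -> List[Dict[str, object]]:
    """Build per-chunk operand strings from names that are already mangled."""
    return [
        {
            "loc": f"((char*){loc_name} + {offset})",
            "ext": f"((char*){ext_name} + {offset})",
            "size": size,
        }
        for offset, size in split_1d(total_size)
    ]


# The 147456-byte weight load from layer3.conv2 yields two chunks.
weight_chunks = split_1d(147456)
operands = chunk_operands("DeeployNetwork_loc_ref", "DeeployNetwork_ext_ref", 147456)
print(weight_chunks)
print(operands[1]["loc"])
```

The sketch takes names that are already mangled, mirroring the patch, which calls `ctxt._mangle()` once before the loop instead of once per chunk.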
