
Commit 78a05d4

runwangdl and claude committed
fix(redmule): unmap Conv from RedMulE engine; drop weight-layout pass
The Siracusa+RedMulE training CI on 1782a88 got past Python codegen but failed at link time:

    ld.lld: error: undefined symbol: Conv2d_Im2Col_fp32_fp32_fp32_HWC_8_Redmule
    >>> referenced by TrainingNetwork.c:5386 in _node_1_tokenizer_..._Conv_cluster_fork

The original RedMulE PR (pulp-platform/Deeploy#67) shipped only the matmul kernel TargetLibraries/PULPOpen/src/Matmul_fp32_Redmule.c. The ConvTemplate references a `Conv2d_Im2Col_..._8_Redmule` kernel that has no corresponding source in the tree, and 67b754b already deleted the testFloat2DConvolution / testFloat2dConvLarge fixtures that would have exercised the Redmule Conv path. So the Conv binding has always been load-bearing only for non-test models like CCT_train, and on those it breaks the link.

Two coupled changes route Conv through the existing PULPClusterEngine (which has a working PULP_Conv2d_Im2Col_fp32_fp32_fp32_HWC):

- Drop 'Conv' from RedmuleMapping. Without it Conv falls through to the second engine in RedmulePlatform's engine list (PULPCluster).
- Drop RedMuleAdjustWeightMemoryLayoutPass from the lowering passes. That pass transposed Conv weights from [F,H,W,Cin] to [H,W,Cin,F] for the RedMulE accelerator's expected layout; once Conv is on the PULPCluster engine, PULP expects [F,H,W,Cin] and the pre-applied transpose makes Tiling produce out-of-bounds tile rectangles (locally repro'd: AssertionError "Rectangle offset should be zero when the dimensions are the same. Received rectangle HyperRectangle(offset=(3, 0, 0, 0), dims=(3, 3, 3, 32))" in TilingCodegen.minimizeRectangle).

Both are clearly marked in-source as "restore when the RedMulE Conv kernel lands."

Locally validated end-to-end:

- testMVPTraining.py -> exit 0 (TrainingNetwork.c emits PULP_Conv2d_Im2Col_fp32_fp32_fp32_HWC for the tokenizer Conv).
- testMVPOptimizer.py -> exit 0.

Matmul / Gemm continue to bind to RedMulE as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
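The link failure above comes from a kernel symbol that a template references but no C source defines. A minimal sketch of how such a gap could be caught before linking is shown here. It is not part of Deeploy: the helper names `is_defined` and `find_undefined`, the demo symbol names, and the in-memory source snippets are all hypothetical, and the regex is a crude heuristic rather than a C parser.

```python
import re


def is_defined(symbol: str, source_text: str) -> bool:
    """Heuristic: True if source_text appears to contain a function body for symbol.

    A definition is taken to be a line-leading return type, the symbol, a
    parameter list, and an opening brace. A prototype ends in ';' and is
    therefore not counted.
    """
    pattern = re.compile(
        r"^[\w\s\*]+\b" + re.escape(symbol) + r"\s*\([^;{]*\)\s*\{",
        re.MULTILINE,
    )
    return pattern.search(source_text) is not None


def find_undefined(referenced, sources):
    """Return the referenced symbols that no source text defines."""
    return [
        sym for sym in referenced
        if not any(is_defined(sym, text) for text in sources.values())
    ]


# Hypothetical stand-ins for a header and a kernel source file.
demo_sources = {
    "kernels.h": (
        "void MatMul_Demo_Redmule(const float *a);\n"
        "void Conv_Demo_Redmule(const float *a);\n"
    ),
    "MatMul_Demo_Redmule.c": (
        "void MatMul_Demo_Redmule(const float *a) {\n"
        "    (void)a;\n"
        "}\n"
    ),
}

missing = find_undefined(
    ["MatMul_Demo_Redmule", "Conv_Demo_Redmule"], demo_sources
)
print(missing)  # ['Conv_Demo_Redmule']
```

In this toy setup the Conv symbol is declared in the header but never given a body, which mirrors the declared-but-undefined situation the commit describes.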
1 parent 1782a88 commit 78a05d4

2 files changed

Lines changed: 19 additions & 4 deletions


Deeploy/Targets/Redmule/Deployer.py

Lines changed: 9 additions & 3 deletions
@@ -30,8 +30,7 @@
 from Deeploy.AbstractDataTypes import Pointer
 from Deeploy.DeeployTypes import DeploymentPlatform, TopologyOptimizer
 from Deeploy.Targets.PULPOpen.Deployer import PULPDeployer
-from Deeploy.Targets.Redmule.TopologyOptimizationPasses.Passes import RedMuleAdjustWeightMemoryLayoutPass, \
-    RedMuleGEMMTransposePass
+from Deeploy.Targets.Redmule.TopologyOptimizationPasses.Passes import RedMuleGEMMTransposePass
 
 
 class RedmuleDeployer(PULPDeployer):
@@ -50,6 +49,13 @@ def __init__(self,
                          default_channels_first, deeployStateDir, inputOffsets)
 
         self.loweringOptimizer.passes += [
-            RedMuleAdjustWeightMemoryLayoutPass("Redmule"),
+            # RedMuleAdjustWeightMemoryLayoutPass is intentionally not
+            # registered: it transposes Conv weights from [F,H,W,Cin] to
+            # [H,W,Cin,F] for the RedMulE accelerator, but Conv has been
+            # routed to the PULPCluster engine (see Engine.RedmuleMapping)
+            # because no `Conv2d_Im2Col_*_Redmule` kernel is defined yet.
+            # PULP's Conv expects [F,H,W,Cin] layout, so applying the
+            # transpose breaks tiling for those nodes. Restore alongside
+            # the Conv mapping when a real RedMulE Conv kernel lands.
             RedMuleGEMMTransposePass("Redmule")
         ]
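The layout change that the removed pass applied can be illustrated with NumPy. This is a sketch of the [F,H,W,Cin] to [H,W,Cin,F] transpose only, not the pass's actual implementation. The sizes F=32, H=W=3, Cin=3 are an assumption chosen to be consistent with the dims=(3, 3, 3, 32) quoted in the commit message.

```python
import numpy as np

# Assumed sizes, picked to match the shape in the quoted AssertionError.
F, H, W, CIN = 32, 3, 3, 3

# Weights in the layout the PULP Conv kernel expects: [F, H, W, Cin].
w_fhwc = np.arange(F * H * W * CIN, dtype=np.float32).reshape(F, H, W, CIN)

# Layout described for the RedMulE accelerator: [H, W, Cin, F].
# Axis 0 (F) moves to the end; axes 1..3 shift forward.
w_hwcf = np.transpose(w_fhwc, (1, 2, 3, 0))

print(w_fhwc.shape)  # (32, 3, 3, 3)
print(w_hwcf.shape)  # (3, 3, 3, 32)

# Same data, different index order.
assert w_hwcf[1, 2, 0, 5] == w_fhwc[5, 1, 2, 0]

# A consumer that still reads axis 0 as the filter count sees 3, not 32.
print(w_hwcf.shape[0])  # 3
```

Any downstream stage that still interprets axis 0 as the filter dimension would be working with the wrong extents after the transpose, which is the kind of mismatch the in-source comment warns about.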

Deeploy/Targets/Redmule/Engine.py

Lines changed: 10 additions & 1 deletion
@@ -39,7 +39,16 @@
 
 RedmuleMapping = {
     'MatMul': MatMulLayer([MatMulRedmuleMapper]),
-    'Conv': ConvLayer([Conv2DRedmuleMapper]),
+    # 'Conv' is intentionally not mapped here: the Redmule ConvTemplate
+    # references the kernel symbol Conv2d_Im2Col_fp32_fp32_fp32_HWC_8_Redmule,
+    # which is *declared* by the template but never *defined* in any source
+    # file under TargetLibraries/. Letting Conv fall through to the next
+    # engine (PULPClusterEngine, which has a working
+    # PULP_Conv2d_Im2Col_fp32_fp32_fp32_HWC implementation) keeps the
+    # Siracusa+RedMulE link step from failing on undefined symbols. When
+    # a real RedMulE-accelerated Conv kernel lands, restore the mapping:
+    #
+    #     'Conv': ConvLayer([Conv2DRedmuleMapper]),
     'Gemm': GEMMLayer([GEMMMRedmuleMapper]),
 }
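The fall-through behaviour this change relies on can be sketched as a first-match lookup over an ordered engine list. This is a toy model under stated assumptions, not Deeploy's API: `ToyEngine`, `select_engine`, and the string placeholders standing in for layer objects are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Hypothetical stand-in for an engine: a name plus an op-type mapping."""
    name: str
    mapping: dict = field(default_factory=dict)


def select_engine(op_type: str, engines: list) -> str:
    """Return the name of the first engine whose mapping contains op_type."""
    for engine in engines:
        if op_type in engine.mapping:
            return engine.name
    raise KeyError(f"no engine maps operator {op_type!r}")


# After this commit the RedMulE mapping has no 'Conv' entry.
redmule = ToyEngine("Redmule", {"MatMul": "matmul-layer", "Gemm": "gemm-layer"})
cluster = ToyEngine(
    "PULPCluster",
    {"Conv": "conv-layer", "MatMul": "matmul-layer", "Gemm": "gemm-layer"},
)

# Order matters: RedMulE is consulted first, PULPCluster second.
engines = [redmule, cluster]

print(select_engine("MatMul", engines))  # Redmule
print(select_engine("Gemm", engines))    # Redmule
print(select_engine("Conv", engines))    # PULPCluster
```

Restoring the 'Conv' key in the first engine's mapping would make the lookup stop there again, which is why the mapping and the weight-layout pass are meant to be restored together.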
