Commit a571229

Improve Docstring and Debugging (#160)
This PR adds many missing docstring comments and improves debugging, especially when using a GUI debugger, by providing a more helpful `__repr__()` for the `_ReferenceBuffer` class. Additionally, it moves the `MemoryAwareClosureGeneration` and `MemoryAwarePrint*` passes from the `CommonExtensions` to the `MemoryLevelExtension`.

## Added
- Add many missing docstrings
- Add `__repr__()` function for `_ReferenceBuffer` class

## Changed
- Move `MemoryAwareClosureGeneration` pass to `MemoryLevelExtension`
- Move `MemoryAwarePrint*` passes to `MemoryLevelExtension`
- Make `sizeInBytes` a class property instead of a function
- Move `AnnotateNeurekaWeightMemoryLevel` to `Neureka` specific folder
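To illustrate why a custom `__repr__()` helps in a GUI debugger, here is a minimal sketch. It does not reproduce Deeploy's `_ReferenceBuffer`; the class `ToyReferenceBuffer` and its fields (`name`, `shape`, `reference_name`) are hypothetical and chosen only for illustration.

```python
class ToyReferenceBuffer:
    """Hypothetical stand-in; this is NOT Deeploy's _ReferenceBuffer."""

    def __init__(self, name, shape, reference_name):
        self.name = name
        self.shape = shape
        self.reference_name = reference_name  # assumed field, for illustration

    def __repr__(self):
        # Debuggers typically show repr() in the variable view, so surface
        # the fields a reader most likely wants to see at a glance.
        return (f"{type(self).__name__}(name={self.name!r}, "
                f"shape={self.shape!r}, reference={self.reference_name!r})")


buf = ToyReferenceBuffer("tile_in", (1, 16), "input_0")
print(repr(buf))
```

Without `__repr__()`, the same object would appear as something like `<ToyReferenceBuffer object at 0x...>`, which says nothing about its contents.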
1 parent 62e5f1a commit a571229

24 files changed

Lines changed: 3342 additions & 244 deletions

File tree

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -5,6 +5,7 @@ This file contains the changelog for the Deeploy project. The changelog is divid


 ### List of Pull Requests
+- Improve Docstring and Debugging [#160](https://github.com/pulp-platform/Deeploy/pull/160)
 - Add GAP9 Container Support [#163](https://github.com/pulp-platform/Deeploy/pull/163)
 - Extend Codeowners [#164](https://github.com/pulp-platform/Deeploy/pull/164)
 - Support for MaxPool1D and RQSConv1D for PULPOpen [#146](https://github.com/pulp-platform/Deeploy/pull/146)
@@ -13,13 +14,19 @@ This file contains the changelog for the Deeploy project. The changelog is divid
 - Update CLI interface Across Project, Fix Tutorial, and Remove Legacy Test [#157](https://github.com/pulp-platform/Deeploy/pull/157)

 ### Added
+- Add many missing docstrings
+- Add `__repr__()` function for `_ReferenceBuffer` class
 - GAP9 Container Support with ARM64 architecture support
 - `zsh` and `oh-my-zsh` plugin installation in containers
 - Shell Format pre-commit hook
 - Add integer MaxPool1D for Generic platform and RQSConv1D support for PULPOpen, with corresponding kernel tests.
 - Added GAP9 Platform Support: Deployer, Bindings, Templates, Tiler, DMA (L3Dma/MchanDma), target library, CI workflows

 ### Changed
+- Move `MemoryAwareClosureGeneration` pass to `MemoryLevelExtension`
+- Move `MemoryAwarePrint*` passes to `MemoryLevelExtension`
+- Make `sizeInBytes` a class property instead of a function
+- Move `AnnotateNeurekaWeightMemoryLevel` to `Neureka` specific folder
 - Cleaned up Docker flow to use a temporary build folder
 - Switch CI to use pre-commit for linting
 - Update `pulp-nnx` and `pulp-nn-mixed` submodules to their latest versions
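The changelog entry about `sizeInBytes` changes how callers access the value. The sketch below shows the general method-to-property pattern, assuming the change uses Python's built-in `@property`; the class `ToyBuffer` and its fields are hypothetical and not taken from Deeploy.

```python
class ToyBuffer:
    """Hypothetical buffer; NOT a Deeploy class."""

    def __init__(self, num_elements, bytes_per_element):
        self.num_elements = num_elements
        self.bytes_per_element = bytes_per_element

    @property
    def sizeInBytes(self):
        # Computed on access; callers read it like an attribute.
        return self.num_elements * self.bytes_per_element


buf = ToyBuffer(num_elements=16, bytes_per_element=4)
size = buf.sizeInBytes  # new style: no parentheses

# Old call sites written as buf.sizeInBytes() now fail, because the
# property already returned an int and an int is not callable.
try:
    buf.sizeInBytes()
    old_style_still_works = True
except TypeError:
    old_style_still_works = False
```

One practical consequence of such a change is that every existing `sizeInBytes()` call site has to drop its parentheses.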

Deeploy/CommonExtensions/CodeTransformationPasses/Closure.py

Lines changed: 188 additions & 47 deletions
@@ -53,19 +53,76 @@


 class ClosureExecutionBlock(ExecutionBlock):
+    """
+    Execution block wrapper for closure-based code generation.
+
+    This class extends ExecutionBlock to support closure-based code generation
+    patterns, where functions are wrapped in closures with argument structures.
+    It maintains a reference to the base execution block that contains the
+    actual code to be wrapped.
+
+    Notes
+    -----
+    This class is used in the closure generation process to maintain the
+    relationship between the closure wrapper and the original execution block.
+    """

     def __init__(self, nodeTemplate = None, closureBlock: Optional[ExecutionBlock] = None):
+        """
+        Initialize a ClosureExecutionBlock.
+
+        Parameters
+        ----------
+        nodeTemplate : NodeTemplate, optional
+            The node template for this execution block. Default is None.
+        closureBlock : ExecutionBlock, optional
+            The execution block to be wrapped in a closure. Default is None.
+        """
         super().__init__(nodeTemplate)
         self.closureBlock = closureBlock

     @property
     def baseBlock(self):
+        """
+        Get the base execution block, unwrapping nested closures.
+
+        Recursively unwraps ClosureExecutionBlock instances to find the
+        underlying base execution block that contains the actual code.
+
+        Returns
+        -------
+        ExecutionBlock
+            The base execution block without closure wrappers.
+
+        Notes
+        -----
+        This property handles nested closures by recursively calling
+        baseBlock until a non-ClosureExecutionBlock is found.
+        """
         if isinstance(self.closureBlock, ClosureExecutionBlock):
             return self.closureBlock.baseBlock
         return self.closureBlock


 class ClosureGeneration(CodeTransformationPass, IntrospectiveCodeTransformationMixIn):
+    """
+    Code transformation pass for generating function closures.
+
+    This class transforms execution blocks into closure-based code patterns
+    where functions are wrapped with argument structures. It generates the
+    necessary struct definitions, closure functions, and call sites to
+    enable closure-based execution patterns in generated code.
+
+
+    Notes
+    -----
+    The closure generation process involves:
+    1. Analyzing the execution block to identify dynamic references
+    2. Creating a struct type to hold closure arguments
+    3. Generating the closure function definition
+    4. Replacing the original call with a closure call
+    5. Optionally generating argument writeback code
+    """

     closureStructArgType: Dict[str, Type[Union[Pointer, Immediate, Struct]]]
     closureStructArgs: Dict[str, Union[Pointer, Immediate, Struct]]
@@ -75,6 +132,22 @@ def __init__(self,
                  closureSuffix = "_closure",
                  writeback: bool = True,
                  generateStruct: bool = True):
+        """
+        Initialize the ClosureGeneration transformation pass.
+
+        Parameters
+        ----------
+        closureCallTemplate : NodeTemplate, optional
+            Template for generating closure function calls. Default is the
+            global _closureCallTemplate.
+        closureSuffix : str, optional
+            Suffix to append to closure function names. Default is "_closure".
+        writeback : bool, optional
+            Whether to generate writeback code for closure arguments.
+            Default is True.
+        generateStruct : bool, optional
+            Whether to generate argument structure definitions. Default is True.
+        """
         super().__init__()
         self.closureSuffix = closureSuffix
         self.closureTemplate = _closureTemplate
@@ -86,6 +159,31 @@ def __init__(self,

     # Don't override this
     def _generateClosureStruct(self, ctxt: NetworkContext, executionBlock: ExecutionBlock):
+        """
+        Generate the closure argument structure.
+
+        Analyzes the execution block to identify dynamic references and creates
+        a struct type to hold all closure arguments. This struct will be used
+        to pass arguments to the closure function.
+
+        Parameters
+        ----------
+        ctxt : NetworkContext
+            The network context containing buffer information.
+        executionBlock : ExecutionBlock
+            The execution block to analyze for dynamic references.
+
+        Notes
+        -----
+        This method populates the following instance attributes:
+        - closureStructArgType: The struct class type for closure arguments
+        - closureStructArgs: The struct instance with argument mappings
+
+        The method handles different buffer types:
+        - TransientBuffer: Mapped to void pointers
+        - StructBuffer: Excluded from closure arguments
+        - Other buffers: Use their native types
+        """

         # Add closure struct info to operatorRepresentation
         closureStructArgsType: Dict[str, Type[Union[Pointer, Immediate, Struct]]] = {}
@@ -108,6 +206,31 @@ def _generateClosureStruct(self, ctxt: NetworkContext, executionBlock: Execution

     # Don't override this
     def _generateClosureCtxt(self, ctxt: NetworkContext, nodeName: str) -> NetworkContext:
+        """
+        Generate closure context and global definitions.
+
+        Creates the closure function definition and struct type definition,
+        then hoists them to the global scope. This includes generating
+        the actual closure function code and the argument struct typedef.
+
+        Parameters
+        ----------
+        ctxt : NetworkContext
+            The network context to modify with global definitions.
+        nodeName : str
+            The name of the node for tracking dependencies.
+
+        Returns
+        -------
+        NetworkContext
+            The modified network context with closure definitions added.
+
+        Notes
+        -----
+        This method generates and hoists the following global definitions:
+        - Closure argument struct typedef
+        - Closure function definition with argument casting and optional writeback
+        """

         ret = ctxt.hoistStruct(self.closureStructArgs, self.closureName + "_args", self.closureStructArgType)
         ctxt.lookup(ret)._users.append(nodeName)
@@ -133,6 +256,36 @@ def _generateClosureCtxt(self, ctxt: NetworkContext, nodeName: str) -> NetworkCo
     # Don't override this
     def _generateClosureCall(self, ctxt: NetworkContext, executionBlock: ExecutionBlock,
                              nodeName: str) -> Tuple[NetworkContext, ExecutionBlock]:
+        """
+        Generate the closure call and replace the original execution block.
+
+        Creates a new ClosureExecutionBlock that wraps the original execution
+        with closure call code. This includes the closure function call and
+        optional argument writeback code.
+
+        Parameters
+        ----------
+        ctxt : NetworkContext
+            The network context for code generation.
+        executionBlock : ExecutionBlock
+            The original execution block to wrap with closure calls.
+        nodeName : str
+            The name of the node for struct generation.
+
+        Returns
+        -------
+        Tuple[NetworkContext, ExecutionBlock]
+            A tuple containing:
+            - The modified network context
+            - The new ClosureExecutionBlock with closure calls
+
+        Notes
+        -----
+        This method replaces the original function call with:
+        1. A closure function call (added to the left)
+        2. Optional argument writeback code (added to the right if enabled)
+        3. Optional argument struct generation
+        """

         allArgs = {
             "closureName": self.closureName,
@@ -158,57 +311,45 @@ def apply(self,
               executionBlock: ExecutionBlock,
               name: str,
               verbose: CodeGenVerbosity = _NoVerbosity) -> Tuple[NetworkContext, ExecutionBlock]:
+        """
+        Apply the closure generation transformation.
+
+        Transforms the given execution block into a closure-based pattern
+        by generating the necessary struct, closure function, and call site.
+        This is the main entry point for the closure transformation.
+
+        Parameters
+        ----------
+        ctxt : NetworkContext
+            The network context containing buffer and type information.
+        executionBlock : ExecutionBlock
+            The execution block to transform into a closure pattern.
+        name : str
+            The base name for generating closure-related identifiers.
+        verbose : CodeGenVerbosity, optional
+            The verbosity level for code generation. Default is _NoVerbosity.
+
+        Returns
+        -------
+        Tuple[NetworkContext, ExecutionBlock]
+            A tuple containing:
+            - The modified network context with closure definitions
+            - The new ClosureExecutionBlock with closure call patterns
+
+        Notes
+        -----
+        The transformation process includes:
+        1. Generating a unique closure name with the specified suffix
+        2. Capturing the original function call code
+        3. Creating the closure argument struct
+        4. Generating the closure function definition in global scope
+        5. Replacing the original call with a closure call pattern
+        """
+
         # Prepend underscore to avoid name issues when beginning with problematic characters (like numbers)
         self.closureName = "_" + name + self.closureSuffix
         self.functionCall = executionBlock.generate(ctxt)
         self._generateClosureStruct(ctxt, executionBlock)
         ctxt = self._generateClosureCtxt(ctxt, name)
         ctxt, executionBlock = self._generateClosureCall(ctxt, executionBlock, name)
         return ctxt, executionBlock
-
-
-class MemoryAwareClosureGeneration(ClosureGeneration):
-
-    def __init__(self,
-                 closureCallTemplate: NodeTemplate = _closureCallTemplate,
-                 closureSuffix = "_closure",
-                 writeback: bool = True,
-                 generateStruct: bool = True,
-                 startRegion: str = "L2",
-                 endRegion: str = "L1"):
-        super().__init__(closureCallTemplate, closureSuffix, writeback, generateStruct)
-        self.startRegion = startRegion
-        self.endRegion = endRegion
-
-    # Don't override this
-    def _generateClosureStruct(self, ctxt: NetworkContext, executionBlock: ExecutionBlock):
-
-        # Add closure struct info to operatorRepresentation
-        closureStructArgsType = {}
-        closureStruct = {}
-        makoDynamicReferences = self.extractDynamicReferences(ctxt, executionBlock, unrollStructs = True)
-
-        filteredMakoDynamicReferences = []
-
-        for ref in makoDynamicReferences:
-            buf = ctxt.lookup(ref)
-            if not hasattr(buf, "_memoryLevel") or buf._memoryLevel is None:
-                filteredMakoDynamicReferences.append(ref)
-                continue
-
-            if buf._memoryLevel == self.startRegion or buf._memoryLevel != self.endRegion:
-                filteredMakoDynamicReferences.append(ref)
-
-        for arg in list(dict.fromkeys(filteredMakoDynamicReferences)):
-            ref = ctxt.lookup(arg)
-            if isinstance(ref, TransientBuffer):
-                closureStructArgsType[ctxt._mangle(arg)] = PointerClass(VoidType)
-            elif not isinstance(ref, StructBuffer):
-                closureStructArgsType[ctxt._mangle(arg)] = ref._type
-
-            if not isinstance(ref, StructBuffer):
-                closureStruct[ctxt._mangle(arg)] = arg
-
-        structClass = StructClass(self.closureName + "_args_t", closureStructArgsType)
-        self.closureStructArgType = structClass
-        self.closureStructArgs = self.closureStructArgType(closureStruct, ctxt)
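The `baseBlock` docstring in this file describes recursive unwrapping of nested closures. The following sketch reproduces that logic with toy classes so it can run without Deeploy installed; `ToyBlock` and `ToyClosureBlock` are hypothetical stand-ins for `ExecutionBlock` and `ClosureExecutionBlock`.

```python
class ToyBlock:
    """Hypothetical stand-in for ExecutionBlock."""

    def __init__(self, label):
        self.label = label


class ToyClosureBlock(ToyBlock):
    """Hypothetical stand-in for ClosureExecutionBlock."""

    def __init__(self, label, closureBlock=None):
        super().__init__(label)
        self.closureBlock = closureBlock

    @property
    def baseBlock(self):
        # Keep unwrapping while the wrapped block is itself a closure wrapper.
        if isinstance(self.closureBlock, ToyClosureBlock):
            return self.closureBlock.baseBlock
        return self.closureBlock


inner = ToyBlock("kernel")
wrapped = ToyClosureBlock("outer", ToyClosureBlock("middle", inner))
print(wrapped.baseBlock.label)  # label of the innermost, non-closure block
```

Note that, as in the original property, a wrapper created with `closureBlock = None` returns `None` from `baseBlock`.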

Deeploy/CommonExtensions/CodeTransformationPasses/CycleMeasurement.py

Lines changed: 55 additions & 0 deletions
@@ -9,12 +9,67 @@


 class ProfilingCodeGeneration(CodeTransformationPass):
+    """
+    Code transformation pass for inserting cycle measurement profiling code.
+
+    This class extends CodeTransformationPass to automatically insert profiling
+    code around execution blocks. It adds cycle counting instrumentation before
+    and after the target code, enabling performance measurement and analysis
+    of individual operations during runtime.
+
+    The generated profiling code uses a `getCycles()` function to measure
+    execution time and prints the results to stdout. This is useful for
+    performance analysis, optimization, and debugging of neural network
+    operations.
+
+    Notes
+    -----
+    This transformation requires that the target platform provides a
+    `getCycles()` function that returns the current cycle count as a uint32_t.
+    The transformation also assumes printf functionality is available for
+    output formatting.
+
+    The profiling code is non-intrusive and can be easily enabled or disabled
+    by including or excluding this transformation pass from the compilation
+    pipeline.
+    """

     def apply(self,
               ctxt: NetworkContext,
               executionBlock: ExecutionBlock,
               name: str,
               verbose: CodeGenVerbosity = _NoVerbosity) -> Tuple[NetworkContext, ExecutionBlock]:
+        """
+        Apply cycle measurement profiling to an execution block.
+
+        Wraps the given execution block with cycle counting code that measures
+        and reports the execution time. The profiling code is added before
+        (left) and after (right) the original execution block.
+
+        Parameters
+        ----------
+        ctxt : NetworkContext
+            The network context for code generation. This parameter is passed
+            through unchanged as cycle measurement doesn't modify the context.
+        executionBlock : ExecutionBlock
+            The execution block to instrument with cycle measurement code.
+            The original block remains unchanged, with profiling code added
+            around it.
+        name : str
+            The name of the operation being profiled. This name is used to
+            generate unique variable names and is included in the output
+            message for identification.
+        verbose : CodeGenVerbosity, optional
+            The verbosity level for code generation. Default is _NoVerbosity.
+            This parameter is not used by the cycle measurement transformation.
+
+        Returns
+        -------
+        Tuple[NetworkContext, ExecutionBlock]
+            A tuple containing:
+            - The unchanged network context
+            - The modified execution block with profiling code added
+        """
         executionBlock.addLeft(NodeTemplate("""
         uint32_t ${op}_cycles = getCycles();
         """), {"op": name})

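The `apply` docstring says profiling code is added to the left and right of the execution block. The sketch below models that wrapping with a toy block so it runs standalone. `ToyExecutionBlock` is hypothetical, `string.Template` stands in for the Mako-based `NodeTemplate`, and the right-hand `printf` snippet is an assumption, since the diff shown here is cut off after the `addLeft` call.

```python
from collections import deque
from string import Template


class ToyExecutionBlock:
    """Hypothetical stand-in for ExecutionBlock; holds ordered code snippets."""

    def __init__(self, body):
        self.snippets = deque([(Template(body), {})])

    def addLeft(self, template, rep):
        # Snippet is emitted before everything already in the block.
        self.snippets.appendleft((Template(template), rep))

    def addRight(self, template, rep):
        # Snippet is emitted after everything already in the block.
        self.snippets.append((Template(template), rep))

    def generate(self):
        return "\n".join(t.substitute(rep) for t, rep in self.snippets)


def add_profiling(block, name):
    block.addLeft("uint32_t ${op}_cycles = getCycles();", {"op": name})
    # Assumed closing snippet; the real template is not visible in the diff.
    block.addRight('printf("${op} took %u cycles\\n", getCycles() - ${op}_cycles);',
                   {"op": name})
    return block


code = add_profiling(ToyExecutionBlock("conv_kernel(in, out);"), "conv0").generate()
print(code)
```

Because the operation name is substituted into the variable name, profiling several operations in one function does not cause C identifier clashes, provided the names are unique.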