Feature/add oversampled polyphase channelizer#1151
Add support for channelize_poly with decimation factors (D) less than the number of output channels (M). The case with D < M corresponds to oversampling where a set of M outputs is produced for each D inputs.
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Using a direct pointer for dense filter tensors offers a performance benefit, but adds complexity. It would be better to pursue adding traits to MatX to optimize ALU usage for memory access.
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Added flags to control the phase rotation assumption for the first output in the oversampled case.
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Greptile Summary
This PR adds oversampling support to the polyphase channelizer, enabling decimation factors (D) less than the number of channels (M).
Confidence Score: 5/5
Safe to merge; all previously flagged P0/P1 issues are resolved, and the remaining findings are P2 style and comment improvements.
All prior P0/P1 concerns (division by zero, smem alignment, loop-invariant hoisting, misleading inequality comment) are addressed. The new kernel logic, phase-rotation math, and dispatch hierarchy are sound. The remaining findings are a silently ignored size parameter in the Python test generator and a misleading test comment; neither affects the correctness of the production code.
test/test_vectors/generators/00_transforms.py: harris2003_oversampled_operators::channelize() ignores self.size
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["channelize_poly_impl(in, filter, M, D)"] --> B{"real input\n& real filter?"}
B -- yes --> C{"D==M and M<=6?"}
B -- no --> D{"D==M and M<=6?"}
C -- yes --> E["FusedChan kernel"]
C -- no --> F{"D==M and Smem fits?"}
F -- yes --> G["Smem kernel"]
F -- no --> H{"SmemTiled eligible?"}
H -- yes --> I["SmemTiled kernel"]
H -- no --> J["Generic ChannelizePoly1D"]
D -- yes --> E
D -- no --> K{"D==M and Smem fits?"}
K -- yes --> G
K -- no --> L{"SmemTiled eligible?"}
L -- yes --> I
L -- no --> J
I --> M{"MaximallyDecimated\nD==M?"}
M -- yes --> N["Fixed phase per channel\nincremental buf_row advance"]
M -- no --> O["phase = (c + t*D) % M\nK-rotation filter cache"]
Reviews (5): Last reviewed commit: "Make ::detail constants visible for host..."
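The oversampled branch of the flowchart indexes the filter phase as `(c + t*D) % M`. The following is a minimal Python sketch of that index arithmetic only; it is not the MatX kernel. The function names `phase_schedule` and `rotation_period` are hypothetical, and the sketch assumes `c` is the channel index and `t` the output time step, which is how the flowchart labels read but is not confirmed elsewhere in this PR.

```python
from math import gcd


def phase_schedule(M, D, c, num_steps):
    """Return (c + t*D) % M for t = 0 .. num_steps-1.

    M: number of channels, D: decimation factor (assumed 0 < D <= M),
    c: channel index (assumed meaning), num_steps: number of output steps.
    """
    if not 0 < D <= M:
        raise ValueError("this sketch assumes 0 < D <= M")
    return [(c + t * D) % M for t in range(num_steps)]


def rotation_period(M, D):
    """Number of output steps after which (t*D) % M repeats."""
    return M // gcd(M, D)


if __name__ == "__main__":
    # Oversampled by 2x: channel 3 alternates between two phases.
    print(phase_schedule(20, 10, 3, 4))
    # Maximally decimated: the phase stays fixed at c.
    print(phase_schedule(8, 8, 5, 3))
    print(rotation_period(8, 6))
```

When D == M the phase never changes, which matches the "Fixed phase per channel" branch. The period M / gcd(M, D) is a property of modular arithmetic; whether it equals the K in "K-rotation filter cache" is not stated in this PR.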
Add support for oversampling to the polyphase channelizer.
This update adds support for decimation factors (D) that are lower than the number of channels (M) in the polyphase channelizer. In all cases, the channelizer generates M outputs for every D inputs; any remaining partial set of inputs is zero-padded to D elements. The case D == M is the maximally decimated, or critically sampled, case, which was already supported. With maximal decimation, the channel frequency bands partition the frequency space, so the channels' frequency supports are adjacent but do not overlap. The oversampled cases (D < M) keep the same channel center frequencies, but adjacent channels overlap.
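The relationship between channel spacing and per-channel output rate can be sketched in a few lines of Python. This is an illustration only: the helper name `channel_plan` is hypothetical, the input sample rate `fs` is normalized to 1.0 by default, and the channel centers are assumed to sit at k * fs / M (the usual DFT-channelizer convention, which this PR does not spell out).

```python
def channel_plan(M, D, fs=1.0):
    """Summarize channel spacing and output rate for M channels, decimation D.

    Assumes 0 < D <= M and channel centers at k * fs / M (an assumption).
    """
    if not 0 < D <= M:
        raise ValueError("this sketch assumes 0 < D <= M")
    spacing = fs / M    # distance between adjacent channel centers
    out_rate = fs / D   # sample rate of each output channel
    return {
        "centers": [k * spacing for k in range(M)],
        "spacing": spacing,
        "out_rate": out_rate,
        "oversampling": M / D,  # 1.0 means maximally decimated
    }


if __name__ == "__main__":
    crit = channel_plan(20, 20)
    over = channel_plan(20, 10)
    # Same centers in both cases; only the per-channel output rate changes.
    print(crit["centers"] == over["centers"])
    print(crit["out_rate"], over["out_rate"], over["oversampling"])
```

With D == M the output rate equals the channel spacing, so the bands tile the spectrum. With D < M each channel is sampled faster than the spacing, so each output can carry a band wider than the spacing, and adjacent bands overlap.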
The per-channel length of the output tensor is (input_len + D - 1) / D using integer division, i.e. ceil(input_len / D). For example, with M=20 and D=10 the oversampling factor is 2x, and the output tensor has twice as many samples per channel as with M=20 and D=20.
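The length formula and the zero-padding rule can be checked with a short Python sketch. The helper names `channelizer_output_len` and `zero_pad_to_multiple` are hypothetical and are not part of the MatX API; they only mirror the arithmetic described above.

```python
def channelizer_output_len(input_len, D):
    """Per-channel output length: ceil(input_len / D) via integer math."""
    if D <= 0:
        raise ValueError("D must be positive")
    return (input_len + D - 1) // D


def zero_pad_to_multiple(samples, D):
    """Zero-pad a sequence so that its length is a multiple of D."""
    pad = (-len(samples)) % D
    return list(samples) + [0] * pad


if __name__ == "__main__":
    # M = 20 channels in both cases; only D changes.
    print(channelizer_output_len(1000, 20))  # maximally decimated
    print(channelizer_output_len(1000, 10))  # 2x oversampled
    # A trailing partial block of inputs still yields one more output.
    print(channelizer_output_len(1001, 10))
```

For a 1000-sample input, D=20 gives 50 samples per channel and D=10 gives 100. A 1001-sample input with D=10 gives 101, because the final single sample is padded out to a full block of D.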