
Feature/add oversampled polyphase channelizer #1151

Merged
tbensonatl merged 14 commits into main from feature/add-oversampled-polyphase-channelizer on Apr 13, 2026


@tbensonatl
Collaborator

Add support for oversampling to the polyphase channelizer.

This update adds support for decimation factors (D) that are lower than the number of channels (M) with the polyphase channelizer. In all cases, the channelizer generates M outputs for every D inputs, and any remaining partial set of inputs is zero-padded to D elements. The case D == M is the maximally decimated, or critically sampled, case, which was previously supported. With maximal decimation, the channel frequency bands partition the frequency space, so the channel frequency supports are adjacent but not overlapping. The oversampled cases D < M maintain the same channel center frequencies, but the channels have some overlap.
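To make the critically sampled versus oversampled distinction concrete, here is a small sketch that compares channel spacing with the per-channel output rate. It assumes the common convention that channel k is centered at k/M of the input sample rate; the description above only states that the centers are unchanged, so that convention and the helper name `channel_layout` are illustrative assumptions, not MatX API.

```python
from fractions import Fraction


def channel_layout(num_channels: int, decimation: int):
    """Return (centers, output_rate, oversampling), all relative to the input rate.

    Hypothetical helper: assumes channel k is centered at k / M.
    """
    if not 0 < decimation <= num_channels:
        raise ValueError("expected 0 < D <= M")
    # Center frequencies depend only on M, never on D.
    centers = [Fraction(k, num_channels) for k in range(num_channels)]
    # Each channel emits one output per D inputs.
    output_rate = Fraction(1, decimation)
    # Ratio of per-channel output rate (1/D) to channel spacing (1/M).
    oversampling = Fraction(num_channels, decimation)
    return centers, output_rate, oversampling


centers_crit, rate_crit, os_crit = channel_layout(20, 20)
centers_over, rate_over, os_over = channel_layout(20, 10)
```

With D == M the output rate equals the channel spacing, so the bands tile the spectrum edge to edge. With D < M the output rate exceeds the spacing, which is where the overlap between neighbouring channels comes from; how much of that overlap carries signal depends on the prototype filter.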

The per-channel length of the output tensor is (input_len + D - 1) / D using integer division, i.e. input_len / D rounded up. Thus, for example, with M=20 and D=10, there is a 2x oversampling factor and the output tensor will have twice as many samples per channel as with M=20 and D=20.
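The length formula and the zero-padding rule can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function names are hypothetical and do not correspond to anything in MatX.

```python
def channelizer_output_len(input_len: int, decimation: int) -> int:
    """Per-channel output length: input_len / D rounded up."""
    if input_len < 0 or decimation <= 0:
        raise ValueError("input_len must be >= 0 and D must be > 0")
    return (input_len + decimation - 1) // decimation


def zero_pad_to_multiple(samples, decimation: int):
    """Pad the trailing partial block with zeros so the length is a multiple of D."""
    padded_len = channelizer_output_len(len(samples), decimation) * decimation
    return list(samples) + [0] * (padded_len - len(samples))


len_crit = channelizer_output_len(1000, 20)   # M=20, D=20
len_over = channelizer_output_len(1000, 10)   # M=20, D=10
```

Note that M does not appear in the length formula: the number of output samples per channel is set by D alone, while M sets how many channels there are.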

Add support for channelize_poly with decimation factors (D) less than
the number of output channels (M). The case with D < M corresponds to
oversampling where a set of M outputs is produced for each D inputs.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Using a direct pointer for dense filter tensors offers a performance benefit, but adds complexity. It would be better to pursue adding
traits to MatX to optimize ALU usage for memory access.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Added flags to control the phase rotation assumption for the first output
in the oversampled case.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl tbensonatl self-assigned this Apr 11, 2026
@copy-pr-bot

copy-pr-bot Bot commented Apr 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tbensonatl
Collaborator Author

/build

@greptile-apps
Contributor

greptile-apps Bot commented Apr 11, 2026

Greptile Summary

This PR adds oversampling support to the polyphase channelizer, enabling decimation factors D < M. It introduces a new tiled shared-memory kernel (ChannelizePoly1D_SmemTiled) supporting both the maximally-decimated and oversampled paths, a fused complex MAC helper (channelize_cmac), comprehensive tests covering integer/rational oversampling, large-M tiling, and batch modes, and a Python reference generator for cross-validation. Previously flagged P0/P1 issues (division-by-zero, alignment, loop-invariant s, misleading comment) are all addressed in the current HEAD.
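The summary mentions a fused complex multiply-accumulate helper. The source does not show its signature, so the following is only a generic sketch of the technique: a complex `acc += a * b` expanded into four real multiply-add steps, each of which could map to one FMA instruction on a GPU. The name `cmac` and its argument layout are hypothetical, and plain Python arithmetic stands in for hardware FMA.

```python
def cmac(acc_re, acc_im, a_re, a_im, b_re, b_im):
    """Return acc + a * b, with the complex product split into real multiply-adds."""
    # Real part: acc_re + a_re*b_re - a_im*b_im
    acc_re = acc_re + a_re * b_re
    acc_re = acc_re - a_im * b_im
    # Imaginary part: acc_im + a_re*b_im + a_im*b_re
    acc_im = acc_im + a_re * b_im
    acc_im = acc_im + a_im * b_re
    return acc_re, acc_im


# (1 + 2j) * (3 + 4j) accumulated onto zero.
result = cmac(0.0, 0.0, 1.0, 2.0, 3.0, 4.0)
```

Writing the update as a chain of multiply-adds keeps the accumulator in registers and avoids forming the intermediate product as a separate complex value.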

Confidence Score: 5/5

Safe to merge; all previously flagged P0/P1 issues are resolved and remaining findings are P2 style/comment improvements

All prior P0/P1 concerns (division-by-zero, smem alignment, loop-invariant hoisting, misleading inequality comment) are addressed. The new kernel logic, phase-rotation math, and dispatch hierarchy are sound. Remaining findings are a silently-ignored size parameter in the Python test generator and a misleading test comment — neither affects correctness of the production code.

test/test_vectors/generators/00_transforms.py — harris2003_oversampled_operators::channelize() ignores self.size

Important Files Changed

| Filename | Overview |
| --- | --- |
| include/matx/kernels/channelize_poly.cuh | Adds MaximallyDecimated template parameter to ChannelizePoly1D, new ChannelizePoly1D_SmemTiled kernel (D==M and D<M paths), and channelize_cmac FMA helper; logic appears correct with proper circular-buffer and phase-rotation handling |
| include/matx/transforms/channelize_poly.h | Dispatch logic updated to route oversampled inputs through SmemTiled or generic kernels; fused-DFT and Smem kernels correctly gated to D==M only |
| include/matx/operators/channelize_poly.h | Output dimension formula correctly updated from num_channels to decimation_factor; documentation substantially expanded with Harris convention explanation and reference |
| test/00_transform/ChannelizePoly.cu | Thorough new test suite (identity filter, integer/rational oversampling, large-M tiling, batched, complex filter, fallback path); one test comment incorrectly describes what triggers the fallback |
| test/test_vectors/generators/00_transforms.py | Adds channelize_oversampled reference and harris2003_oversampled_operators; harris class hardcodes M/D/input_len constants and silently ignores the size parameter passed from C++ |
| examples/channelize_poly_bench.cu | Refactored to accept explicit -M/-D flags with proper validation guards; atol() used for parsing without detecting invalid string inputs |
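The phase-rotation handling noted for the kernel file appears in the flowchart of this summary as `phase = (c + t*D) % M` for the oversampled path. The sketch below evaluates that index, reading `c` as the channel index and `t` as the output time index; that reading and the function names are assumptions made for illustration.

```python
from math import gcd


def filter_phase(channel: int, output_index: int, num_channels: int, decimation: int) -> int:
    """Evaluate phase = (c + t*D) % M from the flowchart."""
    return (channel + output_index * decimation) % num_channels


def phase_period(num_channels: int, decimation: int) -> int:
    """Number of outputs after which the phase sequence repeats."""
    return num_channels // gcd(num_channels, decimation)


# D == M: the phase never changes, matching the "fixed phase per channel" branch.
fixed = [filter_phase(3, t, 20, 20) for t in range(4)]
# Integer oversampling (M=20, D=10): the phase alternates between two values.
integer_os = [filter_phase(3, t, 20, 10) for t in range(4)]
# Rational oversampling (M=8, D=6): the phase cycles through four values.
rational_os = [filter_phase(0, t, 8, 6) for t in range(5)]
```

The repeat period M / gcd(M, D) is 1 when D == M, which is why the maximally decimated path can use a fixed phase per channel while the oversampled path needs a rotating one.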

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["channelize_poly_impl(in, filter, M, D)"] --> B{"real input\n& real filter?"}
    B -- yes --> C{"D==M and M<=6?"}
    B -- no --> D{"D==M and M<=6?"}
    C -- yes --> E["FusedChan kernel"]
    C -- no --> F{"D==M and Smem fits?"}
    F -- yes --> G["Smem kernel"]
    F -- no --> H{"SmemTiled eligible?"}
    H -- yes --> I["SmemTiled kernel"]
    H -- no --> J["Generic ChannelizePoly1D"]
    D -- yes --> E
    D -- no --> K{"D==M and Smem fits?"}
    K -- yes --> G
    K -- no --> L{"SmemTiled eligible?"}
    L -- yes --> I
    L -- no --> J
    I --> M{"MaximallyDecimated\nD==M?"}
    M -- yes --> N["Fixed phase per channel\nincremental buf_row advance"]
    M -- no --> O["phase = (c + t*D) % M\nK-rotation filter cache"]
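Read as straight-line code, the dispatch order in the flowchart reduces to a short chain of checks. The sketch below is a paraphrase of the diagram, not the MatX implementation: the function name and the two boolean inputs are hypothetical stand-ins for the "Smem fits" and "SmemTiled eligible" tests, whose actual conditions are not given here. The diagram branches on real versus non-real types first, but both branches then apply the same tests, so that split is omitted.

```python
def select_kernel(num_channels: int, decimation: int,
                  smem_fits: bool, smem_tiled_eligible: bool) -> str:
    """Return the kernel name the flowchart would choose (illustrative only)."""
    maximally_decimated = decimation == num_channels
    # Fused and Smem kernels are gated to D == M.
    if maximally_decimated and num_channels <= 6:
        return "FusedChan"
    if maximally_decimated and smem_fits:
        return "Smem"
    # SmemTiled handles both D == M and D < M.
    if smem_tiled_eligible:
        return "SmemTiled"
    # Fallback: generic ChannelizePoly1D.
    return "Generic"


choice_small = select_kernel(4, 4, smem_fits=True, smem_tiled_eligible=True)
choice_oversampled = select_kernel(20, 10, smem_fits=True, smem_tiled_eligible=True)
```

An oversampled configuration can therefore only reach the SmemTiled or generic kernels, which matches the summary's note that the fused-DFT and Smem kernels are gated to D == M.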

Reviews (5): Last reviewed commit: "Make ::detail constants visible for host..."

Comment thread examples/channelize_poly_bench.cu Outdated
Comment thread include/matx/kernels/channelize_poly.cuh Outdated
Comment thread include/matx/kernels/channelize_poly.cuh
Comment thread include/matx/kernels/channelize_poly.cuh
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
Comment thread examples/channelize_poly_bench.cu Outdated
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl
Collaborator Author

/build

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl
Collaborator Author

/build

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl
Collaborator Author

/build

@tbensonatl tbensonatl requested a review from cliffburdick April 12, 2026 22:14
@coveralls

Coverage Status

Coverage is 91.829% (feature/add-oversampled-polyphase-channelizer into main). No base build found for main.

Comment thread examples/channelize_poly_bench.cu
Comment thread include/matx/kernels/channelize_poly.cuh
Collaborator

@cliffburdick cliffburdick left a comment


Awesome work!

@tbensonatl tbensonatl merged commit 013c856 into main Apr 13, 2026
1 check passed
@cliffburdick cliffburdick deleted the feature/add-oversampled-polyphase-channelizer branch April 13, 2026 16:13

3 participants