Better handle types and overflows in softmax (especially lagacy and stable)#1476
Better handle types and overflows in softmax (especially lagacy and stable)#1476jmitrevs wants to merge 6 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR focuses on making the Vivado softmax implementations more robust against overflow by improving internal type handling (notably for legacy and latency variants), while also improving default type choices for stable softmax and adding auto-precision inference support for Softmax accumulators.
Changes:
- Updated Vivado softmax templates to use accumulator types for summation / inversion-table addressing and to split exp/inv table types (legacy).
- Updated Vivado stream softmax templates accordingly (latency/stable/legacy).
- Added Softmax-specific
accum_tinference inInferPrecisionTypesand adjusted FPGA backend Softmax default type attributes (unsigned table defaults, newinp_norm_tattribute).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| hls4ml/templates/vivado/nnet_utils/nnet_activation.h | Aligns softmax (latency/stable/legacy) internal types to reduce overflow risk and clarifies exp/inv table type usage. |
| hls4ml/templates/vivado/nnet_utils/nnet_activation_stream.h | Mirrors softmax type changes for streaming implementations (but currently has inconsistencies that need fixing). |
| hls4ml/model/optimizer/passes/infer_precision.py | Adds Softmax-specific accumulator type inference when precision is set to auto. |
| hls4ml/backends/fpga/fpga_backend.py | Updates Softmax type attribute defaults (unsigned tables, adds inp_norm_t, adjusts accumulator handling). |
Comments suppressed due to low confidence (1)
hls4ml/templates/vivado/nnet_utils/nnet_activation_stream.h:295
- This legacy stream implementation still uses
CONFIG_T::table_sizeto compute/clampindex, butexp_tableis now allocated withCONFIG_T::exp_table_size. If these differ (and they can, since softmax exposes separate exp/inv table sizes),indexcan go out of bounds or the scaling will be wrong. Useexp_table_sizeconsistently for exp table addressing here.
auto data_round = (data_cache[j] - data_cache[i]) * CONFIG_T::table_size / 16;
int index = data_round + 8 * CONFIG_T::table_size / 16;
if (index < 0)
index = 0;
if (index > CONFIG_T::table_size - 1)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I should also update the oneAPI backend. Truthfully that is a bit more involved. I will look into it this month, but have to work some hardware tests the next two weeks, so this for the moment will be lower priority. |
|
On the oneAPI (and Quartus) side there are already some softmax changes in PR #1432 so we need to coordinate. |
Description
The legacy softmax was prone to overflow, so this PR made it use an accumulator type. Also, for stable, the default types are better chosen. There are some changes in latency, as well, to avoid overflow, but my feeling is that latency can often behave badly, so this doesn't fix that.
Type of change
Tests
The standard softmax tests should still pass. One could consider adding specific overflow tests, like we have in the current version of the brevitas tutorial.
Checklist
pre-commiton the files I edited or added.