Support for floating point types by vloncar · Pull Request #1307 · fastmachinelearning/hls4ml

vloncar · 2025-06-08T21:24:05Z

Description

Currently, hls4ml supports only arbitrary-precision integer and fixed-point types. This PR adds support for floating-point types. Floating-point types are defined in ap_float.h and ac_float.h of the respective libraries and cover two distinct cases: the IEEE floating-point standard (basically the C/C++ types) and a general floating-point implementation (any combination of mantissa and exponent widths). The AC types library is broader and more flexible than the AP types library. The PR covers both cases by introducing FloatPrecisionType for the general case (covered by ac_float) and StandardFloatPrecisionType for the standard case (ap_float<W,E> and ac_std_float<W,E>). In principle one could cram everything into a single type, but that makes it hard to track the actual intended use, especially because of the 1-bit difference between AC and AP types.

To use this, the user can specify float, double, half or bfloat16 as the type. This results in StandardFloatPrecisionType objects being used, and those C++ types appear in the generated code. Note that half and bfloat16 aren't supported out of the box and require the user to tweak the code to make it compile, since how these types are exposed depends on the compiler. If the user specifies std_float<W,E>, then ap_float<W,E> or ac_std_float<W,E> will be generated from the same object. Finally, for full control of the AC type, the user can specify ac_float<W,I,E,Q>, which uses the FloatPrecisionType class and emits the corresponding type in the code.
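As a rough illustration of how the type strings described above split into a "standard" and a "general" case, here is a minimal Python sketch. It is not the parser added by this PR: the names StandardFloat, GeneralFloat, NAMED_FORMATS and parse_float_type are hypothetical, and the ac_float template arguments are kept as raw strings rather than interpreted.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardFloat:
    """Hypothetical stand-in for a standard float spec (not hls4ml's class)."""
    width: int     # total number of bits
    exponent: int  # number of exponent bits


@dataclass(frozen=True)
class GeneralFloat:
    """Hypothetical stand-in for a general ac_float spec (not hls4ml's class)."""
    args: tuple    # raw template arguments, left uninterpreted


# Total and exponent widths of the named formats
# (IEEE 754 binary16/binary32/binary64, plus bfloat16).
NAMED_FORMATS = {
    'half': StandardFloat(16, 5),
    'bfloat16': StandardFloat(16, 8),
    'float': StandardFloat(32, 8),
    'double': StandardFloat(64, 11),
}

_TEMPLATE = re.compile(r'^(std_float|ac_float)\s*<(.*)>$')


def parse_float_type(spec):
    """Map a type string to one of the two illustrative spec classes."""
    spec = spec.strip()
    if spec in NAMED_FORMATS:
        return NAMED_FORMATS[spec]
    match = _TEMPLATE.match(spec)
    if match is None:
        raise ValueError(f'Not a floating-point type string: {spec!r}')
    name = match.group(1)
    args = tuple(a.strip() for a in match.group(2).split(','))
    if name == 'std_float':
        if len(args) != 2:
            raise ValueError(f'std_float expects <W,E>, got: {spec!r}')
        return StandardFloat(int(args[0]), int(args[1]))
    # ac_float<...>: keep the arguments as written; a real parser would
    # validate and interpret them.
    return GeneralFloat(args)
```

With this sketch, parse_float_type('float') and parse_float_type('std_float<32,8>') yield equal StandardFloat values, while an ac_float<...> string goes down the separate general path. That mirrors the reason given above for keeping two classes instead of one.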

The PR is somewhat incomplete, as there are numerous nuances to full support of the general case and of half/bfloat16, but it is a good starting point and is self-contained. A problem remains with AP types: there is no public version of ap_float.h (we'll ask AMD about open-sourcing it), so if the user uses a general floating-point type, local compilation with compile() won't work. In the future we can tackle the include issue (also for half and bfloat16 on host compilers), as well as look into optimizations of algorithms (for example, using the accumulator type for the CMVM). The intention right now is not to make this a first-class supported feature, but rather an exotic option for users who know what they want and are aware of the caveats and rough edges. More crucially, it allows us to avoid silly test failures due to bitwidth issues. This could be advertised as an experimental feature, but I see we're completely lacking any documentation on type setting, so that may come as a separate PR.

Type of change

New feature (non-breaking change which adds functionality)

Tests

Tests for parsing the new types have been added to test_types.py.

Checklist

Yeah, yeah, I did all the things in the checklist.

vloncar · 2025-06-27T15:08:23Z

I'll add ap_float.h and a mapping to C++23 types. Converting to draft until I make the change.

vloncar · 2025-07-28T18:57:12Z

 #include "nnet_utils/nnet_types.h"
 #include <cstddef>
 #include <cstdio>
+#if __cplusplus > 202002L


This check should be #if __cplusplus >= 202302L but since GCC doesn't fully support C++23 the constant is not set yet to that value, hence we check for "newer than C++20". Alternatively, we could check __STDCPP_BFLOAT16_T__ and __STDCPP_FLOAT16_T__ but these may also not be available even if the implementation is there...

JanFSchulte · 2025-07-29

I am running GCC version 11.5, which apparently supports C++23 but does not support <stdfloat>. So this guard does not protect against compilation failures in my tests. Checking __STDCPP_BFLOAT16_T__ works, but then I run into further problems with ap_float not naming a type.

JanFSchulte · 2025-07-30

It seems that we need an #include "nnet_utils/nnet_types.h" in this file. With that, compilation works.

vloncar · 2025-07-30

I don't understand, nnet_types.h is already included? As for the other issues, I am thinking of including <stdfloat> only if the type is used, and not otherwise. Same for ap_float.h, because that one is not available in Vivado 2020.1. Such a mess with these types...

JanFSchulte · 2025-07-30

Ah sorry, I copied the wrong line. I had to add #include "ap_types/ap_float.h" in defines.h for it to work.
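The idea discussed in this thread, pulling in the floating-point headers only when a type actually needs them, could look roughly like the following Python sketch on the code-generation side. This is an assumption-laden illustration, not the logic implemented in the PR: the function name required_float_includes is hypothetical, the header paths are simply the ones mentioned in this thread, and the mapping assumes a backend that uses AP types.

```python
def required_float_includes(type_names):
    """Return the extra #include lines needed by the given type strings.

    Hypothetical helper for illustration only. Plain float/double and
    non-floating types need no extra header in this sketch.
    """
    includes = []
    for name in type_names:
        # Strip template arguments: 'std_float<32,8>' -> 'std_float'
        base = name.split('<', 1)[0].strip()
        if base in ('half', 'bfloat16'):
            line = '#include <stdfloat>'
        elif base == 'std_float':
            # Assumed AP-types backend; path as quoted in the thread above.
            line = '#include "ap_types/ap_float.h"'
        else:
            continue
        if line not in includes:  # keep first-seen order, no duplicates
            includes.append(line)
    return includes
```

For a model that uses only float and fixed-point types the list is empty, so neither <stdfloat> nor ap_float.h is included and older toolchains are unaffected.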

JanFSchulte · 2025-07-30T12:49:40Z

I've been playing with this a bit and see that when running synthesis in Vitis I get this error

../../../../firmware/nnet_utils/nnet_helpers.h:287:13: error: use of overloaded operator '<<' is ambiguous (with operand types 'std::ostream' (aka 'basic_ostream<char>') and 'ap_float<32, 8, 0>')

Looks like we need an implementation of the << operator in the ap_types.h

JanFSchulte · 2025-07-30T18:54:18Z

I can now synthesize a model in Vitis 2024.1 when I use float as the data type. For ap_float there are still some issues; this is the full error output I'm getting:

/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:24:3: warning: use of this statement in a constexpr function is a C++14 extension [-Wc++14-extensions]
  if (val <= 0)
  ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:29:3: warning: multiple return statements in constexpr function is a C++14 extension [-Wc++14-extensions]
  return log2_ceil((val + 1) / 2) + 1;
  ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:25:5: note: previous return statement is here
    return std::numeric_limits<int>::lowest();
    ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:27:5: note: previous return statement is here
    return 0;
    ^
In file included from ../../../../myproject_test.cpp:1:
In file included from /home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/algorithm:61:
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:324:18: error: no viable overloaded '='
              *__result = *__first;
              ~~~~~~~~~ ^ ~~~~~~~~
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:386:22: note: in instantiation of function template specialization 'std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<const float *, ap_float<32, 8> *>' requested here
                              _Category>::__copy_m(__first, __last, __result);
                                          ^
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:422:23: note: in instantiation of function template specialization 'std::__copy_move_a<false, const float *, ap_float<32, 8> *>' requested here
      return _OI(std::__copy_move_a<_IsMove>(std::__niter_base(__first),
                      ^
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:454:20: note: in instantiation of function template specialization 'std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<const float *, std::vector<float>>, ap_float<32, 8> *>' requested here
      return (std::__copy_move_a2<__is_move_iterator<_II>::__value>
                   ^
../../../../firmware/nnet_utils/nnet_helpers.h:255:10: note: in instantiation of function template specialization 'std::copy<__gnu_cxx::__normal_iterator<const float *, std::vector<float>>, ap_float<32, 8> *>' requested here
    std::copy(in_begin, in_end, dst);
         ^
../../../../myproject_test.cpp:63:13: note: in instantiation of function template specialization 'nnet::copy_data<float, ap_float<32, 8>, 0UL, 300UL>' requested here
      nnet::copy_data<float, input_t, 0, N_INPUT_1_1*N_INPUT_2_1*N_INPUT_3_1>(in, x);
            ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:162:13: note: candidate function not viable: no known conversion from 'const float' to 'ap_float<32, 8>' for 1st argument
  ap_float& operator=(ap_float &&) = default;
            ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:163:13: note: candidate function not viable: no known conversion from 'const float' to 'const ap_float<32, 8>' for 1st argument
  ap_float& operator=(const ap_float &) = default;
            ^
In file included from ../../../../myproject_test.cpp:1:
In file included from /home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/algorithm:61:
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:754:11: error: no viable overloaded '='
        *__first = __tmp;
        ~~~~~~~~ ^ ~~~~~
/home/tools/Xilinx/Vitis_HLS/2024.1/tps/lnx64/gcc-8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../include/c++/8.3.0/bits/stl_algobase.h:789:23: note: in instantiation of function template specialization 'std::__fill_n_a<ap_float<32, 8> *, unsigned long, double>' requested here
      return _OI(std::__fill_n_a(std::__niter_base(__first), __n, __value));
                      ^
../../../../firmware/nnet_utils/nnet_helpers.h:304:79: note: in instantiation of function template specialization 'std::fill_n<ap_float<32, 8> *, unsigned long, double>' requested here
template <class data_T, size_t SIZE> void fill_zero(data_T data[SIZE]) { std::fill_n(data, SIZE, 0.); }
                                                                              ^
../../../../myproject_test.cpp:95:19: note: in instantiation of function template specialization 'nnet::fill_zero<ap_float<32, 8>, 300UL>' requested here
            nnet::fill_zero<input_t, N_INPUT_1_1*N_INPUT_2_1*N_INPUT_3_1>(x);
                  ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:162:13: note: candidate function not viable: no known conversion from 'const double' to 'ap_float<32, 8>' for 1st argument
  ap_float& operator=(ap_float &&) = default;
            ^
/home/tools/Xilinx/Vitis_HLS/2024.1/include/ap_float.h:163:13: note: candidate function not viable: no known conversion from 'const double' to 'const ap_float<32, 8>' for 1st argument
  ap_float& operator=(const ap_float &) = default;
            ^
2 warnings and 2 errors generated.
make: *** [csim.mk:87: obj/myproject_test.o] Error 1
ERROR: [SIM 211-100] 'csim_design' failed: compilation error(s).

vloncar · 2025-08-01T18:18:33Z

Well, it turns out ap_float has many, many limitations on how it can be used: assignment of literals doesn't work (as in array initializations), comparison operators with literals don't either (like > 0 in ReLU), etc. No idea how to solve this. I'm thinking we either can it for now, or leave it in but not advertise it. After all, most of this was developed so that I can write tests with float as the type and not worry about the f***ing mismatches.

JanFSchulte · 2025-08-01T18:51:09Z

Dang, that's frustrating. I think we should merge it as is and not advertise it. It all works fine from the hls4ml side, and it would be nice to just have it in there: if ap_float ever gets developed into a fully working data type, we will already have support for it.

* Support for floating point types
* Use C++23 types for bfloat16 and half
* Implement << op
* Use floating-point headers only if the types require them
* Print a warning if new C++ types (half and bloat16) are used
* Fix typo

Support for floating point types

be9c4fd

vloncar requested review from calad0i and jmitrevs June 8, 2025 21:24

vloncar added the "please test" label (Trigger testing by creating local PR branch) Jun 9, 2025

vloncar marked this pull request as draft June 27, 2025 15:08

vloncar added 2 commits July 28, 2025 17:13

Merge remote-tracking branch 'upstream/main' into float_type

4b668eb

Use C++23 types for bfloat16 and half

1c7dbf9

vloncar marked this pull request as ready for review July 28, 2025 18:50

vloncar commented Jul 28, 2025

vloncar added 3 commits July 30, 2025 19:44

Implement << op

cbc770f

Use floating-point headers only if the types require them

70f945e

Print a warning if new C++ types (half and bloat16) are used

98124b2

JanFSchulte approved these changes Aug 1, 2025

Merge remote-tracking branch 'upstream/main' into float_type

684315e

jmitrevs reviewed Aug 4, 2025

Comment thread test/pytest/test_types.py Outdated

Fix typo

b1c847f

JanFSchulte merged commit 544a16d into fastmachinelearning:main Aug 6, 2025
5 checks passed

JanFSchulte merged 8 commits into fastmachinelearning:main from vloncar:float_type
