From e0c4c9aa8276fc3c3f3cacf0bffd45017014106b Mon Sep 17 00:00:00 2001 From: serge-sans-paille Date: Fri, 28 Nov 2025 23:58:19 +0100 Subject: [PATCH] Improve documentation - Add some note on integration, as a followup to #1207 - Harmonize capitalization of titles And generic documentation improvements to make things, hopefully, easier to understand. --- docs/Doxyfile | 4 +- docs/source/api/aligned_allocator.rst | 6 +-- docs/source/api/arch.rst | 4 +- docs/source/api/arithmetic_index.rst | 6 ++- docs/source/api/batch_index.rst | 4 +- docs/source/api/batch_manip.rst | 2 +- docs/source/api/bitwise_operators_index.rst | 2 +- docs/source/api/cast_index.rst | 2 +- docs/source/api/comparison_index.rst | 2 +- docs/source/api/data_transfer.rst | 6 ++- docs/source/api/instr_macros.rst | 4 +- docs/source/api/math_index.rst | 2 +- docs/source/api/reducer_index.rst | 2 +- docs/source/api/xsimd_batch.rst | 2 +- docs/source/api/xsimd_batch_bool.rst | 4 +- docs/source/api/xsimd_batch_complex.rst | 10 +++-- docs/source/api/xsimd_batch_constant.rst | 2 +- docs/source/basic_usage.rst | 29 ++++++++++----- docs/source/index.rst | 41 ++++++++++++++------- docs/source/installation.rst | 6 +-- docs/source/integration.rst | 37 +++++++++++++++++++ docs/source/vectorized_code.rst | 14 ++++--- 22 files changed, 132 insertions(+), 59 deletions(-) create mode 100644 docs/source/integration.rst diff --git a/docs/Doxyfile b/docs/Doxyfile index cfbd5156a..72cd9c32e 100644 --- a/docs/Doxyfile +++ b/docs/Doxyfile @@ -45,4 +45,6 @@ WARN_AS_ERROR = NO ENABLE_PREPROCESSING = YES MACRO_EXPANSION = YES EXPAND_ONLY_PREDEF = YES -PREDEFINED = XSIMD_NO_DISCARD= XSIMD_INLINE=inline DOXYGEN_SHOULD_SKIP_THIS= +PREDEFINED = XSIMD_NO_DISCARD= \ + XSIMD_INLINE=inline \ + DOXYGEN_SHOULD_SKIP_THIS= diff --git a/docs/source/api/aligned_allocator.rst b/docs/source/api/aligned_allocator.rst index fb0189528..c91a38b60 100644 --- a/docs/source/api/aligned_allocator.rst +++ b/docs/source/api/aligned_allocator.rst @@ 
-4,17 +4,17 @@ The full license is in the file LICENSE, distributed with this software. -Alignment manipulation +Alignment Manipulation ====================== -Aligned memory allocator +Aligned Memory Allocator ------------------------ .. doxygenclass:: xsimd::aligned_allocator :project: xsimd :members: -Alignement checker +Alignment Checker ------------------ .. doxygenfunction:: xsimd::is_aligned diff --git a/docs/source/api/arch.rst b/docs/source/api/arch.rst index f434feed4..d6c9bbca7 100644 --- a/docs/source/api/arch.rst +++ b/docs/source/api/arch.rst @@ -4,7 +4,7 @@ The full license is in the file LICENSE, distributed with this software. -Architecture manipulation +Architecture Manipulation ========================= xsimd provides an high level description of the instruction sets it manipulates. @@ -19,7 +19,7 @@ The best available architecture is available at compile time through :members: -Emulated mode +Emulated Mode ------------- When compiled with the macro ``XSIMD_WITH_EMULATED`` set to ``1``, xsimd also diff --git a/docs/source/api/arithmetic_index.rst b/docs/source/api/arithmetic_index.rst index a6f95d623..429600cb3 100644 --- a/docs/source/api/arithmetic_index.rst +++ b/docs/source/api/arithmetic_index.rst @@ -26,8 +26,10 @@ } -Arithmetic operators -==================== +.. _Arithmetic Operations: + +Arithmetic Operations +===================== Binary operations: diff --git a/docs/source/api/batch_index.rst b/docs/source/api/batch_index.rst index ac4e56029..4b934e60b 100644 --- a/docs/source/api/batch_index.rst +++ b/docs/source/api/batch_index.rst @@ -4,7 +4,9 @@ The full license is in the file LICENSE, distributed with this software. -Batch types +.. _Batch Types: + +Batch Types =========== .. 
toctree:: diff --git a/docs/source/api/batch_manip.rst b/docs/source/api/batch_manip.rst index f83f7a6a0..a4b98c588 100644 --- a/docs/source/api/batch_manip.rst +++ b/docs/source/api/batch_manip.rst @@ -4,7 +4,7 @@ The full license is in the file LICENSE, distributed with this software. -Conditional expression +Conditional Expression ====================== +------------------------------+-------------------------------------------+ diff --git a/docs/source/api/bitwise_operators_index.rst b/docs/source/api/bitwise_operators_index.rst index fe769e61f..e4631dcb3 100644 --- a/docs/source/api/bitwise_operators_index.rst +++ b/docs/source/api/bitwise_operators_index.rst @@ -26,7 +26,7 @@ } -Bitwise operators +Bitwise Operators ================= +---------------------------------------+----------------------------------------------------+ diff --git a/docs/source/api/cast_index.rst b/docs/source/api/cast_index.rst index dae96bb72..6b692deb7 100755 --- a/docs/source/api/cast_index.rst +++ b/docs/source/api/cast_index.rst @@ -27,7 +27,7 @@ -Type conversion +Type Conversion =============== Cast: diff --git a/docs/source/api/comparison_index.rst b/docs/source/api/comparison_index.rst index 8425faea6..23d75288c 100644 --- a/docs/source/api/comparison_index.rst +++ b/docs/source/api/comparison_index.rst @@ -26,7 +26,7 @@ } -Comparison operators +Comparison Operators ==================== Ordering: diff --git a/docs/source/api/data_transfer.rst b/docs/source/api/data_transfer.rst index 2ce62e010..815f56293 100644 --- a/docs/source/api/data_transfer.rst +++ b/docs/source/api/data_transfer.rst @@ -4,8 +4,10 @@ The full license is in the file LICENSE, distributed with this software. -Data transfer -============= +.. 
_Data Transfer: + +Data Transfers +============== From memory: diff --git a/docs/source/api/instr_macros.rst b/docs/source/api/instr_macros.rst index e90bb977c..0d0723e71 100644 --- a/docs/source/api/instr_macros.rst +++ b/docs/source/api/instr_macros.rst @@ -26,7 +26,7 @@ } -Instruction set macros +Instruction Set Macros ====================== Each of these macros corresponds to an instruction set supported by XSIMD. They @@ -36,7 +36,7 @@ can be used to filter arch-specific code. :project: xsimd :content-only: -Changing Default architecture +Changing Default Architecture ***************************** You can change the default instruction set used by xsimd (when none is provided diff --git a/docs/source/api/math_index.rst b/docs/source/api/math_index.rst index 40fe40193..97b04576b 100644 --- a/docs/source/api/math_index.rst +++ b/docs/source/api/math_index.rst @@ -27,7 +27,7 @@ -Mathematical functions +Mathematical Functions ====================== Basic functions: diff --git a/docs/source/api/reducer_index.rst b/docs/source/api/reducer_index.rst index f99c7f1a7..d6c504441 100644 --- a/docs/source/api/reducer_index.rst +++ b/docs/source/api/reducer_index.rst @@ -26,7 +26,7 @@ } -Reduction operators +Reduction Operators =================== +---------------------------------------+----------------------------------------------------+ diff --git a/docs/source/api/xsimd_batch.rst b/docs/source/api/xsimd_batch.rst index 7324b20f5..c8d5a5756 100644 --- a/docs/source/api/xsimd_batch.rst +++ b/docs/source/api/xsimd_batch.rst @@ -4,7 +4,7 @@ The full license is in the file LICENSE, distributed with this software. -Batch of scalars +Batch of Scalars ================ .. 
_xsimd-batch-ref: diff --git a/docs/source/api/xsimd_batch_bool.rst b/docs/source/api/xsimd_batch_bool.rst index d588740ea..3a3488f3d 100644 --- a/docs/source/api/xsimd_batch_bool.rst +++ b/docs/source/api/xsimd_batch_bool.rst @@ -4,7 +4,7 @@ The full license is in the file LICENSE, distributed with this software. -Batch of conditions +Batch of Conditions =================== .. _xsimd-batch-bool-ref: @@ -13,7 +13,7 @@ Batch of conditions :project: xsimd :members: -Logical operators +Logical Operators ----------------- .. doxygengroup:: batch_bool_logical diff --git a/docs/source/api/xsimd_batch_complex.rst b/docs/source/api/xsimd_batch_complex.rst index a398f9e1e..b0419e0d0 100644 --- a/docs/source/api/xsimd_batch_complex.rst +++ b/docs/source/api/xsimd_batch_complex.rst @@ -4,24 +4,26 @@ The full license is in the file LICENSE, distributed with this software. -Batch of complex numbers +Batch of Complex Numbers ======================== .. doxygenclass:: xsimd::batch< std::complex< T >, A > :project: xsimd :members: -Operations specific to batches of complex numbers +Operations Specific to Batches of Complex Numbers ------------------------------------------------- .. doxygengroup:: batch_complex :project: xsimd :content-only: -XTL complex support +XTL Complex Support ------------------- If the preprocessor token ``XSIMD_ENABLE_XTL_COMPLEX`` is defined, ``xsimd`` provides constructors of ``xsimd::batch< std::complex< T >, A >`` from -``xtl::xcomplex``, similar to those for ``std::complex``. This requires ``xtl`` +``xtl::xcomplex``, similar to those for ``std::complex``. This requires `XTL`_ to be installed. + +.. _XTL: https://github.com/xtensor-stack/xtl diff --git a/docs/source/api/xsimd_batch_constant.rst b/docs/source/api/xsimd_batch_constant.rst index c86656929..e9e75ea88 100644 --- a/docs/source/api/xsimd_batch_constant.rst +++ b/docs/source/api/xsimd_batch_constant.rst @@ -4,7 +4,7 @@ The full license is in the file LICENSE, distributed with this software. 
-Batch of constants +Batch of Constants ================== .. _xsimd-batch-constant-ref: diff --git a/docs/source/basic_usage.rst b/docs/source/basic_usage.rst index 479ad39d4..3a14408f8 100644 --- a/docs/source/basic_usage.rst +++ b/docs/source/basic_usage.rst @@ -4,10 +4,10 @@ The full license is in the file LICENSE, distributed with this software. -Basic usage +Basic Usage =========== -Manipulating abstract batches +Manipulating Abstract Batches ----------------------------- Here is an example that computes the mean of two batches, using the best @@ -15,22 +15,29 @@ architecture available, based on compile time informations: .. literalinclude:: ../../test/doc/manipulating_abstract_batches.cpp -The batch can be a batch of 4 single precision floating point numbers (e.g. on -Neon) or a batch of 8 (e.g. on AVX2). +There is no explicit architectural information in the code: it is +deduced from the compiler target and its vector instruction support. If several +vector instruction sets are supported, the one with the widest registers and +the most operations is picked (e.g. AVX2 over AVX over SSE4.1). -Manipulating parametric batches +There is no explicit register size information in the code either: it +depends solely on the architecture picked, as stated above. The batch can be a +batch of 4 single precision floating point numbers (e.g. on Neon) or a batch of +8 (e.g. on AVX2). + +Manipulating Parametric Batches ------------------------------- -The previous example can be made fully parametric, both in the batch type and -the underlying architecture. This is achieved as described in the following -example: +The implicit architectural information from the previous example can be made +explicit, and the type used can be parametric. This is achieved as described in +the following example: .. 
literalinclude:: ../../test/doc/manipulating_parametric_batches.cpp At its core, a :cpp:class:`xsimd::batch` is bound to the scalar type it contains, and to the instruction set it can use to operate on its values. -Explicit use of an instruction set extension +Explicit Use of an Instruction Set Extension -------------------------------------------- Here is an example that loads two batches of 4 double floating point values, and @@ -38,7 +45,9 @@ computes their mean, explicitly using the AVX extension: .. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set.cpp -Note that in that case, the instruction set is explicilty specified in the batch type. +Note that in that case, the instruction set is explicitly specified in the batch +type. The flags passed down to the compiler must make it possible for this +architecture to be used. This example outputs: diff --git a/docs/source/index.rst b/docs/source/index.rst index 63bdbbe64..e6f0fcf40 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -16,10 +16,16 @@ Introduction on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor vendors and compilers. -`xsimd` provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of scalar and complex numbers with the same arithmetic -operators and common mathematical functions as for single values. +`xsimd` provides a unified means for using these features for library authors. -There are several ways to use `xsimd`: +The core of the library consists of a parametrized vector type, :ref:`Batch Types`, +and a set of operations to perform :ref:`Arithmetic Operations`, +:ref:`Data Transfer`, and many common mathematical functions, as for single +values. 
+ + +There are several ways to use `xsimd` using those :ref:`Batch Types` and +operations: - one can write a generic, vectorized, algorithm and compile it as part of their application build, with the right architecture flag; @@ -34,20 +40,13 @@ There are several ways to use `xsimd`: Of course, nothing prevents the combination of several of those approach, but more about this in section :ref:`Writing vectorized code`. -You can find out more about this implementation of C++ wrappers for SIMD intrinsics at the `The C++ Scientist`_. The mathematical functions are a +You can find out more about this implementation of C++ wrappers for SIMD +intrinsics at the `The C++ Scientist`_. The mathematical functions are a lightweight implementation of the algorithms also used in `boost.SIMD`_. -`xsimd` requires a C++11 compliant compiler. The following C++ compilers are supported: -+-------------------------+-------------------------------+ -| Compiler | Version | -+=========================+===============================+ -| Microsoft Visual Studio | MSVC 2015 update 2 and above | -+-------------------------+-------------------------------+ -| g++ | 4.9 and above | -+-------------------------+-------------------------------+ -| clang | 3.7 and above | -+-------------------------+-------------------------------+ +Compiler and Architecture Support +--------------------------------- The following SIMD instruction set extensions are supported: @@ -69,6 +68,19 @@ The following SIMD instruction set extensions are supported: | PowerPC | VSX | +--------------+---------------------------------------------------------+ + +`xsimd` requires a C++11 compliant compiler. 
The following C++ compilers are supported: + ++-------------------------+-------------------------------+ +| Compiler | Version | ++=========================+===============================+ +| Microsoft Visual Studio | MSVC 2015 update 2 and above | ++-------------------------+-------------------------------+ +| g++ | 4.9 and above | ++-------------------------+-------------------------------+ +| clang | 3.7 and above | ++-------------------------+-------------------------------+ + Licensing --------- @@ -90,6 +102,7 @@ This software is licensed under the BSD-3-Clause license. See the LICENSE file f basic_usage vectorized_code + integration .. toctree:: :caption: API REFERENCE diff --git a/docs/source/installation.rst b/docs/source/installation.rst index b1200dca0..c6475594f 100644 --- a/docs/source/installation.rst +++ b/docs/source/installation.rst @@ -28,7 +28,7 @@ Besides the `xsimd` headers, all these methods place the ``cmake`` project confi .. image:: conda.svg -Using the conda-forge package +Using the conda-forge Package ----------------------------- A package for `xsimd` is available for the `mamba `_ (or `conda `_) package manager. @@ -39,7 +39,7 @@ A package for `xsimd` is available for the `mamba .. image:: spack.svg -Using the Spack package +Using the Spack Package ----------------------- A package for `xsimd` is available on the `Spack `_ package manager. @@ -51,7 +51,7 @@ A package for `xsimd` is available on the `Spack `_ package ma .. image:: cmake.svg -From source with cmake +From Source with cmake ---------------------- You can install `xsimd` from source with `cmake `_. On Unix platforms, from the source directory: diff --git a/docs/source/integration.rst b/docs/source/integration.rst new file mode 100644 index 000000000..73abf4d38 --- /dev/null +++ b/docs/source/integration.rst @@ -0,0 +1,37 @@ +.. Copyright (c) 2025, Serge Guelton + + Distributed under the terms of the BSD 3-Clause License. 
+ + The full license is in the file LICENSE, distributed with this software. + +Integration +=========== + +When Targeting a Single Architecture +------------------------------------ + +If you compile your whole project for a single architecture, you can rely on the +implicit architecture parameter for :cpp:class:`xsimd::batch`. Just add your +source using `xsimd` to your project build system, pass down the +appropriate flags and the magic should happen. + +It's very common though to have a base application with minimal architectural +constraints, while still wanting to benefit from the acceleration of better +instruction sets if those are available. + +When Targeting Multiple Architectures +------------------------------------- + +It's very common, especially when targeting Intel hardware, to set a minimal +baseline, say SSE2, for the base application, while still shipping computation +kernels specialized for SSE4.2, AVX2 or AVX512BF. + +In that case one can write specific kernels for each targeted instruction set +(or a generic one that's instantiated for each targeted instruction set). Those +kernels must then be compiled with the appropriate flags independently, and +linked into the application. + +`xsimd` provides a generic dispatch mechanism that can be used from the *base +application* to pick the best kernel *at runtime* based on runtime detection of the +supported architectures, as described in more detail in :ref:`Arch +Dispatching`. diff --git a/docs/source/vectorized_code.rst b/docs/source/vectorized_code.rst index d536fa816..e5afe7154 100644 --- a/docs/source/vectorized_code.rst +++ b/docs/source/vectorized_code.rst @@ -4,7 +4,9 @@ The full license is in the file LICENSE, distributed with this software. -Writing vectorized code +.. 
_Writing Vectorized Code: + +Writing Vectorized Code ======================= Assume that we have a simple function that computes the mean of two vectors, something like: @@ -13,7 +15,7 @@ Assume that we have a simple function that computes the mean of two vectors, som How can we use `xsimd` to take advantage of vectorization? -Explicit use of an instruction set +Explicit Use of an Instruction Set ---------------------------------- `xsimd` provides the template class :cpp:class:`xsimd::batch` parametrized by ``T`` and ``A`` types where ``T`` is the type of the values involved in SIMD @@ -22,12 +24,14 @@ of ``batch``. For instance, assuming the AVX instruction set is available, the p .. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set_mean.cpp +Note that the code is written in a form that's independent from the actual +vector register width. However, if you want to write code that is portable, you cannot rely on the use of ``batch``. Indeed this won't compile on a CPU where only SSE2 instruction set is available for instance. Fortunately, if you don't set the second template parameter, `xsimd` picks the best architecture among the one available, based on the compiler flag you use. -Aligned vs unaligned memory +Aligned vs Unaligned Memory --------------------------- In the previous example, you may have noticed the :cpp:func:`xsimd::batch::load_unaligned` and :cpp:func:`xsimd::batch::store_unaligned` functions. These @@ -42,7 +46,7 @@ with STL containers. Let's change the previous code so it can take advantage of .. literalinclude:: ../../test/doc/explicit_use_of_an_instruction_set_mean_aligned.cpp -Memory alignment and tag dispatching +Memory Alignment and Tag Dispatching ------------------------------------ You may need to write code that can operate on any type of vectors or arrays, not only the STL ones. 
In that @@ -60,7 +64,7 @@ of a ``get_alignment_tag`` meta-function in the code, the previous code can be i mean(a, b, res, get_alignment_tag()); -Writing arch-independent code +Writing Arch-Independent Code ----------------------------- If your code may target either SSE2, AVX2 or AVX512 instruction set, `xsimd`