You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -51,11 +40,11 @@ The use of multidimensional arrays is widespread across various fields, particul
51
40
52
41
Many programming languages commonly used in scientific computing support multidimensional arrays in different ways. Fortran, a longstanding choice in the field, and Julia, a more recent language, both natively support these data structures. In contrast, the Python ecosystem relies on the popular NumPy library’s `numpy.Array`[@harris2020array]. Meanwhile, C++23 introduced `std::mdspan` to the standard library. This container was inspired by `Kokkos::View` from the Kokkos library which also serves as the foundation of DDC.
53
42
54
-
Despite their importance, multidimensional arrays introduce several practical challenges. In a sense, they encourage the usage of implicit information in the source code. A frequent source of errors is the inadvertent swapping of indices when accessing elements. Such errors can be difficult to detect, especially given the common convention of using single-letter variable names like `i` and `j` for indexing. Another challenge in medium to large codebases is the lack of semantic clarity in function signatures when using raw multidimensional arrays. When array dimensions carry specific meanings, this information is not explicitly represented in the source code, leaving it up to the user to ensure that dimensions are ordered correctly according to implicit expectations. For example it is quite usual to use the same index for multiple interpretations: looping over mesh cells identified by `i` and interpreting `i+1` as the face to the right. Another example is slicing that removes dimensions, this can shift the positions of remaining dimensions, altering the correspondence between axis indices and their semantic meanings.
43
+
Despite their importance, multidimensional arrays introduce several practical challenges. In a sense, they encourage the usage of implicit information in the source code. A frequent source of errors is the inadvertent swapping of indices when accessing elements. Such errors can be difficult to detect, especially given the common convention of using single-letter variable names like `i` and `j` for indexing. Another challenge in medium-to-large codebases is the lack of semantic clarity in function signatures when using raw multidimensional arrays. When array dimensions carry specific meanings, this information is not explicitly represented in the source code, leaving it up to the user to ensure that dimensions are ordered correctly according to implicit expectations. For example it is quite usual to use the same index for multiple interpretations: looping over mesh cells identified by `i` and interpreting `i+1` as the face to the right. Another example is slicing that removes dimensions, this can shift the positions of remaining dimensions, altering the correspondence between axis indices and their semantic meanings.
55
44
56
-
Solutions have been proposed to address these issues. For example in Python, the Xarray [@hoyer2017xarray] library allows users to label dimensions that can then be used to perform computations. Following a similar approach, the "Discrete Domain Computation" (DDC) library aims to bring equivalent functionality to the C++ ecosystem. It uses a zero overhead abstraction approach, i.e. with labels fixed at compile-time, on top of different performant portable libraries, such as: Kokkos [@9485033, @9502936], Kokkos Kernels [@rajamanickam2021kokkos], kokkos-fft [@kokkos-fft] and Ginkgo [@GinkgoJoss2020]. Labelling at compile time is achieved by strongly typing dimensions, an approach similar to that used in units libraries such as mp-units [@Pusz_mp-units_2024], which strongly type quantities rather than dimensions.
45
+
Solutions have been proposed to address these issues. For example, in Python, the Xarray [@hoyer2017xarray] library allows users to label dimensions that can then be used to perform computations. Following a similar approach, the "Discrete Domain Computation" (DDC) library aims to bring equivalent functionality to the C++ ecosystem. It uses a zero overhead abstraction approach, i.e., with labels fixed at compile-time, on top of different performant portable libraries, such as Kokkos [@9485033, @9502936], Kokkos Kernels [@rajamanickam2021kokkos], kokkos-fft [@kokkos-fft], and Ginkgo [@GinkgoJoss2020]. Labelling at compile time is achieved by strongly typing dimensions, an approach similar to that used in units libraries such as mp-units [@Pusz_mp-units_2024], which strongly type quantities rather than dimensions.
57
46
58
-
The library is actively used to modernize the Fortran-based Gysela plasma simulation code [@Bourne_Gyselalib]. This simulation code relies heavily on high-dimensional arrays. While the data stored in the arrays has 7 dimensions, each dimension can have multiple representations, including Fourier, spline, Cartesian, and various curvilinear meshes. The legacy Fortran implementation, used to manipulate multi-dimensional arrays that stored slices of all the possible dimensions with very limited information about which dimensions were actually represented to enforce correctness at the API level. DDC enables a more explicit, strongly-typed representation of these arrays, ensuring at compile-time that function calls respect the expected dimensions. This reduces indexing errors and improves code maintainability, particularly in large-scale scientific software.
47
+
The library is actively used to modernize the Fortran-based Gysela plasma simulation code [@Bourne_Gyselalib]. This simulation code relies heavily on high-dimensional arrays. While the data stored in the arrays has 7 dimensions, each dimension can have multiple representations, including Fourier, spline, Cartesian, and various curvilinear meshes. The legacy Fortran implementation was used to manipulate multi-dimensional arrays that stored slices of all the possible dimensions with very limited information about which dimensions were actually represented to enforce correctness at the API level. DDC enables a more explicit, strongly-typed representation of these arrays, ensuring at compile-time that function calls respect the expected dimensions. This reduces indexing errors and improves code maintainability, particularly in large-scale scientific software.
59
48
60
49
## DDC Core key features
61
50
@@ -78,7 +67,7 @@ In a DDC container, `DiscreteElement` indices represent absolute positions, whil
78
67
79
68

80
69
81
-
For example consider \autoref{fig:domains} that illustrates a two-dimensional data chunk with axes `X` and `Y`. Here `chunk_r` is a container defined over the red area and `chunk_b` is a slice of `chunk_r` over the blue area. Let us define
70
+
For example, consider \autoref{fig:domains} that illustrates a two-dimensional data chunk with axes `X` and `Y`. Here `chunk_r` is a container defined over the red area and `chunk_b` is a slice of `chunk_r` over the blue area. Let us define
82
71
83
72
-`DiscreteElement<X, Y> e(x_c, y_b)`,
84
73
-`DiscreteVector<X, Y> v(2, 1)`,
@@ -95,7 +84,7 @@ This highlights the fact that `DiscreteElement` provides a globally consistent i
95
84
96
85
### Sets of `DiscreteElement`
97
86
98
-
The semantics of DDC containers associates data to a set of `DiscreteElement` indices. Let us note that the set of all possible `DiscreteElement` has a total order that is typically established once and for all at program initialization. Thus to be able to construct a DDC container one must provide a multidimensional set of `DiscreteElement` indices, only these indices can be later used to access the container’s data.
87
+
The semantics of DDC containers associates data to a set of `DiscreteElement` indices. Let us note that the set of all possible `DiscreteElement` has a total order that is typically established once and for all at program initialization. Thus, to be able to construct a DDC container, one must provide a multidimensional set of `DiscreteElement` indices, where only these indices can be later used to access the container’s data.
99
88
100
89
The library provides several ways to group `DiscreteElement` into sets, each represented as a Cartesian product of per-dimension sets. These sets offer a lookup function to retrieve the position of a multi-index relative to the front of the set. The performance of container data access depends significantly on the compile-time properties of the set used.
0 commit comments