
Commit ab93816

Adds a post processing for reuse index tapes to improve adjoint cache efficiency.
The MR adds several features to make the post processing possible:

1. Tapes and index handlers can now add arbitrary data to the active type in CoDiPack. This additional data is kept away from the user. This is a breaking interface change but usually not seen, since the standard tapes do not define additional data.
2. The tagging tapes have been adapted to use the new possibility to define additional data.
3. General iterator functions for input and output identifiers have been added to low level functions and external functions.
4. General tape iteration functions have been added to all tapes. This allows custom tape evaluations programmed by the user.
5. A tape cache optimizer uses the iterator functionality to access the tape and modify the identifiers for better cache usage.

Merge pull request #77 from 'feature/tape_cache_optimization' into develop

Reviewed-by: Jan Rottmayer <jan.rottmayer@scicomp.uni-kl.de>
2 parents 076fc38 + 616586d commit ab93816

113 files changed

Lines changed: 4929 additions & 509 deletions

Makefile

Lines changed: 3 additions & 0 deletions
@@ -146,6 +146,9 @@ doc:
 	@mkdir -p $(BUILD_DIR)/documentation
 	CODI_VERSION=$(CODI_VERSION) doxygen

+single_header:
+	quom --include_directory include include/codi.hpp $(BUILD_DIR)/codi_single.hpp
+
 .PHONY: format
 format:
 	find include tests/general/include tests/general/src tests/events/include tests/events/src -type f -exec $(CLANG_FORMAT) -i {} \;

documentation/Changelog.md

Lines changed: 16 additions & 0 deletions
@@ -1,6 +1,22 @@
 Changelog {#Changelog}
 ===========================

+### v3.?.? - ????-??-??
+ - Features:
+   * It is now possible to define custom tape evaluators for all CoDiPack tapes. The evaluators have access to the full
+     statement data and functionality for low level functions.
+   * Low level functions and external functions can now iterate over their input and output identifiers.
+   * New tool for optimizing the cache access of reuse index tapes. See \ref Example_29_Tape_cache_optimization.
+
+ - Internal:
+   * Restructure of per value tape data handling. Each tape and each index manager in CoDiPack can now define data that
+     is stored in each value. This is a breaking interface change but it will not affect the default CoDiPack tapes.
+     This change is mostly used for debugging.
+
+ - Bugfix:
+   * Explicitly set the language of CoDiPack to C++ for CMake.
+   * Add missing setter functions to complex numbers.
+
 ### v3.0.0 - 2025-07-08
  - General:
    * Raised default cpp version of CoDiPack to 17. If you require a lower cpp version please use CoDiPack 2.*.

documentation/CoDiLayout.xml

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
       <tab type="user" title="Example 26 - Jacobian tape readers" url="@ref Example_26_Jacobian_Tape_Readers"/>
       <tab type="user" title="Example 27 - Primal tape readers" url="@ref Example_27_Primal_Tape_Readers"/>
       <tab type="user" title="Example 28 - Complex numbers" url="@ref Example_28_Complex_numbers"/>
+      <tab type="user" title="Example 29 - Tape cache optimization" url="@ref Example_29_Tape_cache_optimization"/>
     </tab>
     <tab type="user" title="Papers" url="@ref Papers"/>
     <tab type="user" title="Taping strategies" url="@ref TapingStrategy"/>

documentation/developer/simpleTape.cpp

Lines changed: 14 additions & 5 deletions
@@ -46,6 +46,7 @@ struct SimpleTape : public codi::ReverseTapeInterface<double, double, int> {
   using Real = double;
   using Gradient = double;
   using Identifier = int;
+  using ActiveTypeTapeData = int;

   //! [Data stream - Type definition]
   using OperatorData = codi::ChunkedData<codi::Chunk1<OperatorCode>>;
@@ -176,17 +177,25 @@ struct SimpleTape : public codi::ReverseTapeInterface<double, double, int> {

   static bool constexpr AllowJacobianOptimization = false; // If certain operations can be hidden from the tape.

-  //! [Identifiers - Initialization]
+  //! [Identifiers - Initialization and handling]
   template<typename Real>
-  void initIdentifier(Real& value, Identifier& identifier) {
-    identifier = 0; // Initialize with zero we perform an online activity analysis.
+  void initTapeData(Real& value, ActiveTypeTapeData& data) {
+    data = 0; // Initialize with zero we perform an online activity analysis.
   }

   template<typename Real>
-  void destroyIdentifier(Real& value, Identifier& identifier) {
+  void destroyTapeData(Real& value, ActiveTypeTapeData& data) {
     // Do nothing: Identifiers are not reused.
   }
-  //! [Identifiers - Initialization]
+
+  Identifier const& getIdentifier(ActiveTypeTapeData const& data) {
+    return data;
+  }
+
+  Identifier& getIdentifier(ActiveTypeTapeData& data) {
+    return data;
+  }
+  //! [Identifiers - Initialization and handling]

   //! [Storing - Entry]
   template<typename Lhs, typename Rhs>
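
The `getIdentifier` overloads in this diff are identities because `ActiveTypeTapeData` is the identifier itself. As a rough sketch of the other case, where a tape keeps extra per-value data next to the identifier, the accessors would select the identifier member instead. Everything below (`TaggedData`, `TaggedTapeSketch`, the `tag` field, `currentTag`, `usageSketch`) is hypothetical and only mirrors the function names from the diff; it is not CoDiPack's tagging tape.

```cpp
#include <cassert>

using Identifier = int;

// Hypothetical per-value data: the identifier plus one extra field.
struct TaggedData {
  Identifier id;
  int tag;
};

// Hypothetical tape sketch; only the function names mirror the diff.
struct TaggedTapeSketch {
  using ActiveTypeTapeData = TaggedData;

  int currentTag = 0;  // Assumed tape state that is copied into new values.

  template<typename Real>
  void initTapeData(Real& /*value*/, ActiveTypeTapeData& data) {
    data.id = 0;  // Zero marks a passive value, as in the simple tape.
    data.tag = currentTag;
  }

  template<typename Real>
  void destroyTapeData(Real& /*value*/, ActiveTypeTapeData& data) {
    data.id = 0;  // Nothing to release in this sketch.
  }

  // No longer the identity: select the identifier member.
  Identifier const& getIdentifier(ActiveTypeTapeData const& data) {
    return data.id;
  }

  Identifier& getIdentifier(ActiveTypeTapeData& data) {
    return data.id;
  }
};

// Small usage check of the sketch.
inline void usageSketch() {
  TaggedTapeSketch tape;
  tape.currentTag = 42;
  double value = 1.0;
  TaggedData data;
  tape.initTapeData(value, data);
  assert(tape.getIdentifier(data) == 0);
  tape.getIdentifier(data) = 5;  // Writes through to data.id.
  assert(data.id == 5 && data.tag == 42);
}
```

The point of the two overloads is that code working with identifiers never needs to know what else the tape stores in a value.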

documentation/developer/simpleTape.md

Lines changed: 5 additions & 3 deletions
@@ -97,11 +97,13 @@ left hand side identifier from the data stream in a linear index management sche
 \snippet developer/simpleTape.cpp Identifiers - Registration

 The identifiers are stored in the AD type provided by CoDiPack. The initialization of the identifier in
-the AD value is done by the function `initIdentifier` required by the codi::InternalStatementRecordingTapeInterface. We
+the AD value is done by the function `initTapeData` required by the codi::InternalStatementRecordingTapeInterface. We
 implement an online activity analysis in this tape. Therefore, all identifiers in the AD values can be initialized with
 zero. The zero identifier is used in our implementation to track _passive_ values. These are values that do not depend
-on the input values. How this is done is explained in the next section.
-\snippet developer/simpleTape.cpp Identifiers - Initialization
+on the input values. How this is done is explained in the next section. In addition, the
+codi::IdentifierInformationTapeInterface requires the function `getIdentifier` for a const and a non-const argument. Since
+we do not have any tape specific data besides the identifier, these functions are the identity.
+\snippet developer/simpleTape.cpp Identifiers - Initialization and handling

 #### Storing of expressions/operators

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+//! [Example 29 - Tape cache optimization]
+#include <codi.hpp>
+#include <iostream>
+
+//! [Function]
+template<typename Real>
+void func(const Real* x, size_t l, Real* y) {
+  y[0] = 0.0;
+  y[1] = 1.0;
+  for(size_t i = 0; i < l; ++i) {
+    y[0] += x[i];
+    y[1] *= x[i];
+  }
+}
+//! [Function]
+
+int main(int nargs, char** args) {
+
+  using Real = codi::RealReverseIndex;
+  using Identifier = typename Real::Identifier;
+  using Tape = typename Real::Tape;
+
+  Real x[5];
+  Real y[2];
+  x[0] = 1.0;
+  x[1] = 2.0;
+  x[2] = 3.0;
+  x[3] = 4.0;
+  x[4] = 5.0;
+
+  // Step 1: Record the tape.
+  Tape& tape = Real::getTape();
+  tape.setActive();
+
+  for(size_t i = 0; i < 5; ++i) {
+    tape.registerInput(x[i]);
+  }
+
+  func(x, 5, y);
+
+  tape.registerOutput(y[0]);
+  tape.registerOutput(y[1]);
+
+  tape.setPassive();
+
+  // Step 2: Gather the input and output identifiers.
+  Identifier xIds[5];
+  Identifier yIds[2];
+  for(int i = 0; i < 5; i += 1) {
+    xIds[i] = x[i].getIdentifier();
+  }
+  for(int i = 0; i < 2; i += 1) {
+    yIds[i] = y[i].getIdentifier();
+  }
+
+  // Step 3: Define the input and output iterators.
+  auto iterX = [&xIds](auto&& func) {
+    for(size_t i = 0; i < 5; ++i) {
+      func(xIds[i]);
+    }
+  };
+  auto iterY = [&yIds](auto&& func) {
+    for(size_t i = 0; i < 2; ++i) {
+      func(yIds[i]);
+    }
+  };
+
+  // Step 4: Apply the optimization.
+  codi::IdentifierCacheOptimizerHotCold<Tape> co{tape};
+  co.eval(iterX, iterY);
+
+  // Step 5: Do a tape evaluation with the translated ids.
+  codi::Jacobian<double> jacobian(2,5);
+  for(size_t curY = 0; curY < 2; curY += 1) {
+    tape.gradient(yIds[curY]) = 1.0;
+    tape.evaluate();
+
+    for(size_t curX = 0; curX < 5; curX += 1) {
+      jacobian(curY,curX) = tape.gradient(xIds[curX]);
+      tape.gradient(xIds[curX]) = 0.0;
+    }
+  }
+
+  std::cout << "Reverse Jacobian:" << std::endl;
+  std::cout << "f(1 .. 5) = (" << y[0] << ", " << y[1] << ")" << std::endl;
+  std::cout << "df/dx (1 .. 5) = \n" << jacobian << std::endl;
+
+  tape.reset();
+
+  return 0;
+}
+//! [Example 29 - Tape cache optimization]
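
The Jacobian that this example prints can be cross-checked without any AD tool, since `func` computes a sum and a product. The following is a minimal sketch in plain `double`; `analyticJacobian` is a helper name introduced here for illustration and is not part of CoDiPack or of the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same function as in the example, usable with plain double.
template<typename Real>
void func(const Real* x, std::size_t l, Real* y) {
  y[0] = 0.0;
  y[1] = 1.0;
  for (std::size_t i = 0; i < l; ++i) {
    y[0] += x[i];
    y[1] *= x[i];
  }
}

// Hypothetical helper: analytic 2 x n Jacobian of func.
// Row 0: d(sum)/dx_i = 1. Row 1: d(prod)/dx_i = product of all x_j with j != i.
std::vector<std::vector<double>> analyticJacobian(const std::vector<double>& x) {
  assert(!x.empty());
  std::vector<std::vector<double>> jac(2, std::vector<double>(x.size(), 1.0));
  for (std::size_t i = 0; i < x.size(); ++i) {
    double prod = 1.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
      if (j != i) {
        prod *= x[j];
      }
    }
    jac[1][i] = prod;
  }
  return jac;
}
```

For the inputs of the example, x = (1, 2, 3, 4, 5), this gives f = (15, 120), a first Jacobian row of ones and a second row of (120, 60, 40, 30, 24). The tape evaluation with the translated identifiers should reproduce these values.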
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+Example 29 - Tape cache optimization {#Example_29_Tape_cache_optimization}
+=======
+
+**Goal:** Apply the identifier cache optimization to a reuse index tape for faster reverse evaluations.
+
+**Prerequisite:** \ref Tutorial_02_Reverse_mode_AD, \ref Example_02_Custom_adjoint_vector_evaluation
+
+**Function:**
+\snippet examples/Example_29_Tape_Cache_Optimization.cpp Function
+
+**Full code:**
+\snippet examples/Example_29_Tape_Cache_Optimization.cpp Example 29 - Tape cache optimization
+
+**Additional information:**
+The cache optimizer performs a lifetime analysis of the identifiers on the tape. The identifiers are redistributed
+for better cache performance during the reverse evaluation.
+
+Since identifiers are redistributed, the identifiers held by values like `x` are no longer valid after the optimization
+and should not be used. Instead, the identifier of `x` should be stored and then the stored value should be given to the optimizer.
+
+The cache optimization is only meaningful if the tape is evaluated at least 10 times, e.g., as in a reverse
+accumulation process.
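
Why the stored identifiers have to be handed to the optimizer can be illustrated with a toy renumbering. The sketch below is not CoDiPack's lifetime analysis: it uses plain reference counts as a stand-in, and `Statement` and `toyRemap` are hypothetical names. It only shows the callback pattern, where the caller's stored identifiers are visited by reference and rewritten together with the tape. Treating zero as the passive identifier is an assumption of this toy.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using Identifier = int;

// Toy statement: a left hand side identifier and its argument identifiers.
struct Statement {
  Identifier lhs;
  std::vector<Identifier> rhs;
};

// Hypothetical helper (not a CoDiPack function): renumbers all nonzero
// identifiers so that the most often referenced ones get the smallest values.
template<typename IterIn, typename IterOut>
void toyRemap(std::vector<Statement>& tape, IterIn&& iterIn, IterOut&& iterOut) {
  // Pass 1: count how often each identifier is referenced.
  std::map<Identifier, int> uses;
  auto count = [&uses](Identifier const& id) {
    if (id != 0) { uses[id] += 1; }
  };
  for (Statement const& s : tape) {
    count(s.lhs);
    for (Identifier const& id : s.rhs) { count(id); }
  }
  iterIn(count);
  iterOut(count);

  // Pass 2: order by descending use count; ties keep ascending old identifier.
  std::vector<std::pair<Identifier, int>> order(uses.begin(), uses.end());
  std::stable_sort(order.begin(), order.end(),
                   [](std::pair<Identifier, int> const& a,
                      std::pair<Identifier, int> const& b) {
                     return a.second > b.second;
                   });

  std::map<Identifier, Identifier> translate;
  Identifier next = 1;  // Assumption: zero stays reserved for passive values.
  for (auto const& entry : order) { translate[entry.first] = next++; }

  // Pass 3: rewrite in place, on the tape and in the caller's stored ids.
  auto apply = [&translate](Identifier& id) {
    if (id != 0) { id = translate.at(id); }
  };
  for (Statement& s : tape) {
    apply(s.lhs);
    for (Identifier& id : s.rhs) { apply(id); }
  }
  iterIn(apply);
  iterOut(apply);
}
```

With iterators written like `iterX` and `iterY` in the full code, the arrays of stored identifiers hold the new values after the call, which is what makes them usable for seeding and reading gradients afterwards. An identifier that was not passed through the iterators would keep its old, now meaningless value.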

documentation/user/Tutorials.md

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ or pointers to other features.
 | \subpage Example_26_Jacobian_Tape_Readers "" | Reading Jacobian tapes from disk. |
 | \subpage Example_27_Primal_Tape_Readers "" | Reading primal value tapes from disk. |
 | \subpage Example_28_Complex_numbers "" | How to use complex numbers in CoDiPack. |
+| \subpage Example_29_Tape_cache_optimization "" | Applying a cache optimization to the tape for faster reverse evaluations. |

 The graph shows how the tutorials and examples are connected. Usually it is better to understand first the prerequisites
 of a tutorial/example before reading the actual example.

documentation/user/TutorialsGraph.dot

Lines changed: 3 additions & 0 deletions
@@ -76,6 +76,8 @@ digraph Tutorials {

   E28 [label="E28 - Complex numbers"];

+  E29 [label="E29 - Tape cache optimization"];
+
   // Edges (sorted)
   E02:e -> E08:w;
   E02:e -> E09:w;
@@ -102,6 +104,7 @@ digraph Tutorials {
   T02:e -> E23:w;
   T02:e -> E25:w;
   T02:e -> E28:w;
+  T02:e -> E29:w;
   T02:e -> T03:w;
   T02:e -> T04:w;
   T02:e -> T05:w;

include/codi.hpp

Lines changed: 3 additions & 0 deletions
@@ -48,6 +48,7 @@
 #include "codi/tapes/data/blockData.hpp"
 #include "codi/tapes/data/chunkedData.hpp"
 #include "codi/tapes/forwardEvaluation.hpp"
+#include "codi/tapes/indices/debugMultiUseIndexManager.hpp"
 #include "codi/tapes/indices/linearIndexManager.hpp"
 #include "codi/tapes/indices/multiUseIndexManager.hpp"
 #include "codi/tapes/jacobianLinearTape.hpp"
@@ -72,6 +73,8 @@
 #include "codi/tools/helpers/preaccumulationHelper.hpp"
 #include "codi/tools/helpers/statementPushHelper.hpp"
 #include "codi/tools/helpers/tapeHelper.hpp"
+#include "codi/tools/identifierCacheOptimizer.hpp"
+#include "codi/tools/io/writeConnectivityData.hpp"
 #include "codi/tools/lowlevelFunctions/lowLevelFunctionCreationUtilities.hpp"
 #include "codi/traits/computationTraits.hpp"
 #include "codi/traits/numericLimits.hpp"
