Commit b8ce8a0

Authored by franzpoeschel, Pöschel, and pre-commit-ci[bot]
Unify Random-Access API and Streaming API into Series::snapshots() (#1592)
* Introduce SharedAttributableData
* Add AbstractSeriesIterator
* Derive SeriesIterator from AbstractSeriesIterator
* Little fix
* Introduce Snapshots.hpp
* Make AbstractSeriesIterator non-virtual
* Working commit for Series::snapshots()
* No virtual operator[]
* Remove random-accessing from iterator
* Introduce AbstractSnapshotsContainer
* basic random-access iteration
* RandomAccessSnapshots.hpp -> snapshots/RandomAccessIterator.hpp
* ReadIterations.hpp -> snapshots/StatefulIterator.hpp
* SeriesIterator.hpp -> snapshots/IteratorTraits.hpp
* Snapshots.hpp -> snapshots/Snapshots.hpp
* Move AbstractSnapshotsContainer to ContainerTraits.hpp
* Move Container implementations to ContainerImpls.(h|c)pp
* Fix: parsePreference is not set in file-based iteratione encoding
* Temporarily fix test
* Const iteration
* Extract stuff to .cpp
* Reverse iteration
* Commit missing Snapshots.cpp file
* empty()
* Revert wrong renaming ReadIterations/StatefulIterator
* Rename SeriesIterator -> StatefulIterator
* Add ::at, operator[]
* beginStep(): always return relevant iteration indices
* Basically working example for snapshots() in write access
* Extract some methods to .cpp
* Fully replace WriteIterations class with the new one
* Fix nullpointer issue
* Little fixes
* Add some further API calls
* Some postfix form transformations
* Use snapshots() in read example 2
* Simplify ReadIterations implementation
* Further cleanup
* Change representation of iterations in current step
* Initiate reading of group/variable-based encoding with nextStep()
* Prepare internal representation to be aware of steps
* Windows fixes
* Adapt tests
* Unify close status
* Add basic test for opening after closing
* Add new end() iterator representations
* Reopening logic in Iterator, not yet in Series itself
* Reopening fundamentally working in READ_LINEAR
* Extend test still sth wrong in append_mode test, but see about this next week
* For now, adapt the append_mode test
* fixes
* BUGFIX: modifiable attributes, maybe extract this to dev
* Ensure that iterations are never parsed twice
* Move currently_available_iterations to During_t
* Revert "For now, adapt the append_mode test" This reverts commit 19b68ee.
* Remember where we saw what iteration
* Bit of cleanup
* [wip] Groupbased writing: close and reopen
* Further test and implement reopening of Iterations
* Unused variable
* some fixes to groupbased reopen test
* Filebased reopen in ADIOS2 (no READ_WRITE support yet)
* Now supports READ_WRITE too in filebased mode
* Some exceptions for unimplemented stuff
* Works in JSON and HDF5 now too
* CI fixes
* Virtual destructors
* CI fixes continued
* Some fixes for noexcept specifications
* Further CI Fixes
* CI FIXES
* Fixes for ADIOS2 v2.7
* placate the intel compiler
* noexcept details for MSVC
* Fix ulimit test
* Fix after rebase: dirtyRecursive
* Fixes after rebase
* remove conflict markers...
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Better defaults?
* Parameterize Series::snapshots()
* Use enum class for last commit
* Add some missing minor function implementations
* Don't use globbing
* Add missing include
* Better include structure, put Legacy stuff to Legacy headers
* Bugfix
* Documentation, cleanup
* Add check_recursive_include script
* Fixes after rebase
* Fix bug that hindered files from being properly closed
* Will this fix the Windows CI errors I dont think so
* Use macro instead of function Proper return() is supported beginning with CMake 3.25 only
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Better document reopening options
* Update close_iteration_test
* Documentation

---------

Co-authored-by: Pöschel <poesch58@ad.fz-rossendorf.de>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 77a55a3 commit b8ce8a0

49 files changed, +4438 -1358 lines changed

CMakeLists.txt

Lines changed: 23 additions & 4 deletions
@@ -397,13 +397,12 @@ set(CORE_SOURCE
         src/Mesh.cpp
         src/ParticlePatches.cpp
         src/ParticleSpecies.cpp
-        src/ReadIterations.cpp
         src/Record.cpp
+        src/ReadIterations.cpp
         src/RecordComponent.cpp
         src/Series.cpp
         src/UnitDimension.cpp
         src/version.cpp
-        src/WriteIterations.cpp
         src/auxiliary/Date.cpp
         src/auxiliary/Filesystem.cpp
         src/auxiliary/JSON.cpp
@@ -415,11 +414,19 @@ set(CORE_SOURCE
         src/backend/PatchRecordComponent.cpp
         src/backend/Writable.cpp
         src/benchmark/mpi/OneDimensionalBlockSlicer.cpp
-        src/helper/list_series.cpp)
+        src/helper/list_series.cpp
+        src/snapshots/ContainerImpls.cpp
+        src/snapshots/ContainerTraits.cpp
+        src/snapshots/IteratorHelpers.cpp
+        src/snapshots/IteratorTraits.cpp
+        src/snapshots/RandomAccessIterator.cpp
+        src/snapshots/Snapshots.cpp
+        src/snapshots/StatefulIterator.cpp)
 set(IO_SOURCE
         src/IO/AbstractIOHandler.cpp
         src/IO/AbstractIOHandlerImpl.cpp
         src/IO/AbstractIOHandlerHelper.cpp
+        src/IO/Access.cpp
         src/IO/DummyIOHandler.cpp
         src/IO/IOTask.cpp
         src/IO/FlushParams.cpp
@@ -773,8 +780,20 @@ if(openPMD_BUILD_TESTING)
         target_compile_definitions(CatchRunner PUBLIC openPMD_HAVE_MPI=1)
     endif()

+    macro(additional_testing_sources test_name out_list)
+        if(${test_name} STREQUAL "SerialIO")
+            list(APPEND ${out_list}
+                test/Files_SerialIO/close_and_reopen_test.cpp
+                test/Files_SerialIO/filebased_write_test.cpp
+            )
+        endif()
+    endmacro()
+
     foreach(testname ${openPMD_TEST_NAMES})
-        add_executable(${testname}Tests test/${testname}Test.cpp)
+        set(ADDITIONAL_SOURCE_FILES "")
+        additional_testing_sources(${testname} ADDITIONAL_SOURCE_FILES)
+        add_executable(${testname}Tests test/${testname}Test.cpp ${ADDITIONAL_SOURCE_FILES})
+        target_include_directories(${testname}Tests PRIVATE test/Files_${testname}/)
         openpmd_cxx_required(${testname}Tests)
         set_target_properties(${testname}Tests PROPERTIES
             COMPILE_PDB_NAME ${testname}Tests

examples/10_streaming_read.cpp

Lines changed: 11 additions & 8 deletions
@@ -18,6 +18,15 @@ int main()
         return 0;
     }

+    // Access the Series linearly. This means that upon opening the Series, no
+    // data is accessed yet. Instead, the single Iterations are processed
+    // collectively, one after the other, and data access only happens upon
+    // explicitly accessing an Iteration from `Series::snapshots()`. Note that
+    // the Container API of `Series::snapshots()` will work in a restricted mode
+    // compared to the `READ_RANDOM_ACCESS` access type, refer also to the
+    // documentation of the `Snapshots` class in `snapshots/Snapshots.hpp`. This
+    // restricted workflow enables performance optimizations in the backends,
+    // and more importantly is compatible with streaming I/O.
     Series series = Series("electrons.sst", Access::READ_LINEAR, R"(
 {
   "adios2": {
@@ -29,15 +38,9 @@ int main()
   }
 })");

-    // `Series::writeIterations()` and `Series::readIterations()` are
-    // intentionally restricted APIs that ensure a workflow which also works
-    // in streaming setups, e.g. an iteration cannot be opened again once
-    // it has been closed.
-    // `Series::iterations` can be directly accessed in random-access workflows.
-    for (IndexedIteration iteration : series.readIterations())
+    for (auto &[index, iteration] : series.snapshots())
     {
-        std::cout << "Current iteration: " << iteration.iterationIndex
-                  << std::endl;
+        std::cout << "Current iteration: " << index << std::endl;
         Record electronPositions = iteration.particles["e"]["position"];
         std::array<RecordComponent::shared_ptr_dataset_types, 3> loadedChunks;
         std::array<Extent, 3> extents;
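
The read example above switches from `IndexedIteration::iterationIndex` to structured bindings over index/iteration pairs. As a minimal, library-free sketch of that loop pattern, the block below uses a `std::map` as a stand-in for whatever `Series::snapshots()` returns. `ToyIteration`, `ToySnapshots`, and `collectIndices` are hypothetical names introduced for illustration only and are not part of openPMD. Requires C++17.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for openPMD::Iteration; not the real class.
struct ToyIteration
{
    std::string note;
};

// Hypothetical stand-in for the snapshots container: index -> iteration.
using ToySnapshots = std::map<std::uint64_t, ToyIteration>;

// Visit all snapshots in ascending index order and return the indices seen.
std::vector<std::uint64_t> collectIndices(ToySnapshots &snapshots)
{
    std::vector<std::uint64_t> seen;
    for (auto &[index, iteration] : snapshots)
    {
        // `index` binds to the (const) key, `iteration` to the mutable value.
        iteration.note = "visited";
        seen.push_back(index);
    }
    return seen;
}
```

With a real `Series`, the loop body would access `iteration.particles` and friends exactly as in the diff above; only the way the index is obtained changes.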

examples/10_streaming_write.cpp

Lines changed: 8 additions & 6 deletions
@@ -1,3 +1,5 @@
+#include "openPMD/Series.hpp"
+#include "openPMD/snapshots/Snapshots.hpp"
 #include <openPMD/openPMD.hpp>

 #include <algorithm>
@@ -38,12 +40,12 @@ int main()
     std::shared_ptr<position_t> local_data(
         new position_t[length], [](position_t const *ptr) { delete[] ptr; });

-    // `Series::writeIterations()` and `Series::readIterations()` are
-    // intentionally restricted APIs that ensure a workflow which also works
-    // in streaming setups, e.g. an iteration cannot be opened again once
-    // it has been closed.
-    // `Series::iterations` can be directly accessed in random-access workflows.
-    WriteIterations iterations = series.writeIterations();
+    // Create the Series with synchronous snapshots, i.e. one Iteration after
+    // the other. The alternative would be random-access where multiple
+    // Iterations can be accessed independently from one another. This more
+    // restricted mode enables performance optimizations in the backends, and
+    // more importantly is compatible with streaming I/O.
+    auto iterations = series.snapshots(SnapshotWorkflow::Synchronous);
     for (size_t i = 0; i < 100; ++i)
     {
         Iteration iteration = iterations[i];

examples/2_read_serial.cpp

Lines changed: 3 additions & 3 deletions
@@ -34,13 +34,13 @@ int main()
     cout << "Read a Series with openPMD standard version " << series.openPMD()
          << '\n';

-    cout << "The Series contains " << series.iterations.size()
+    cout << "The Series contains " << series.snapshots().size()
          << " iterations:";
-    for (auto const &i : series.iterations)
+    for (auto const &i : series.snapshots())
         cout << "\n\t" << i.first;
     cout << '\n';

-    Iteration i = series.iterations[100];
+    Iteration i = series.snapshots()[100];
     cout << "Iteration 100 contains " << i.meshes.size() << " meshes:";
     for (auto const &m : i.meshes)
         cout << "\n\t" << m.first;

include/openPMD/IO/ADIOS/ADIOS2Auxiliary.hpp

Lines changed: 15 additions & 0 deletions
@@ -66,6 +66,21 @@ namespace adios_defs
         Yes,
         No
     };
+
+    /*
+     * Necessary to implement the `reopen` flag of
+     * `Parameter<Operation::OPEN_FILE>`. The distinction between Open and
+     * Reopen is necessary for Write workflows in file-based encoding. In order
+     * to write new data to an Iteration that was created and closed previously,
+     * the only applicable access mode is Append mode, ideally in conjunction
+     * with `SetParameter("FlattenSteps", "ON")`.
+     */
+    enum class OpenFileAs
+    {
+        Create,
+        Open,
+        ReopenFileThatWeCreated
+    };
 } // namespace adios_defs

 /*
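
The comment on `OpenFileAs` says that writing more data to a previously created and closed Iteration requires Append mode. A rough sketch of how such a distinction could drive the choice of access mode follows. `ToyMode` is a hypothetical stand-in for `adios2::Mode`, and the mapping for `Create` and `Open` is an assumption made for illustration. The real `adios2AccessMode()` presumably also takes the Series-level access type into account, which this sketch ignores.

```cpp
// Mirrors the enum added in this commit.
enum class OpenFileAs
{
    Create,
    Open,
    ReopenFileThatWeCreated
};

// Hypothetical stand-in for adios2::Mode, so the sketch needs no ADIOS2.
enum class ToyMode
{
    Write,
    Read,
    Append
};

// Assumed mapping for a write-capable workflow in file-based encoding.
ToyMode toyAccessMode(OpenFileAs how)
{
    switch (how)
    {
    case OpenFileAs::Create:
        return ToyMode::Write; // assumption: a fresh file is written
    case OpenFileAs::Open:
        return ToyMode::Read; // assumption: a file found on disk is read
    case OpenFileAs::ReopenFileThatWeCreated:
        return ToyMode::Append; // per the comment: only Append applies
    }
    return ToyMode::Read; // not reached for valid enumerators
}
```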

include/openPMD/IO/ADIOS/ADIOS2File.hpp

Lines changed: 4 additions & 1 deletion
@@ -229,7 +229,10 @@ class ADIOS2File

     using AttributeMap_t = std::map<std::string, adios2::Params>;

-    ADIOS2File(ADIOS2IOHandlerImpl &impl, InvalidatableFile file);
+    ADIOS2File(
+        ADIOS2IOHandlerImpl &impl,
+        InvalidatableFile file,
+        adios_defs::OpenFileAs);

     ~ADIOS2File();

include/openPMD/IO/ADIOS/ADIOS2IOHandler.hpp

Lines changed: 7 additions & 3 deletions
@@ -212,7 +212,8 @@ class ADIOS2IOHandlerImpl
      * @brief The ADIOS2 access type to chose for Engines opened
      * within this instance.
      */
-    adios2::Mode adios2AccessMode(std::string const &fullPath);
+    adios2::Mode
+    adios2AccessMode(std::string const &fullPath, adios_defs::OpenFileAs);

     FlushTarget m_flushTarget = FlushTarget::Disk;

@@ -403,10 +404,13 @@ class ADIOS2IOHandlerImpl
      */
     GroupOrDataset groupOrDataset(Writable *);

-    enum class IfFileNotOpen : bool
+    enum class IfFileNotOpen : char
     {
         OpenImplicitly,
-        ThrowError
+        CreateImplicitly,
+        ThrowError,
+        ReopenFileThatWeCreated,
+        ReopenFileFoundOnDisk = OpenImplicitly,
     };

     detail::ADIOS2File &
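
The `IfFileNotOpen` change involves two language details. The underlying type moves from `bool` to `char`, presumably because `bool` cannot represent more than two distinct enumerator values. `ReopenFileFoundOnDisk` is declared as an alias with the same value as `OpenImplicitly`. The standalone sketch below copies the enum from the diff and checks both properties at compile time. The `describe()` helper and its strings are hypothetical and only illustrate that a `switch` needs one `case` for the two aliased names. Requires C++17.

```cpp
#include <string>
#include <type_traits>

// Copied from the diff above.
enum class IfFileNotOpen : char
{
    OpenImplicitly,
    CreateImplicitly,
    ThrowError,
    ReopenFileThatWeCreated,
    ReopenFileFoundOnDisk = OpenImplicitly,
};

// The underlying type is char, so values 0..3 are all representable.
static_assert(
    std::is_same_v<std::underlying_type_t<IfFileNotOpen>, char>);
// The alias compares equal to the enumerator it is defined from.
static_assert(
    IfFileNotOpen::ReopenFileFoundOnDisk == IfFileNotOpen::OpenImplicitly);
static_assert(
    static_cast<int>(IfFileNotOpen::ReopenFileThatWeCreated) == 3);

// Hypothetical helper: the labels are illustrative, not from openPMD.
std::string describe(IfFileNotOpen policy)
{
    switch (policy)
    {
    case IfFileNotOpen::OpenImplicitly: // also covers ReopenFileFoundOnDisk
        return "open existing file";
    case IfFileNotOpen::CreateImplicitly:
        return "create file";
    case IfFileNotOpen::ThrowError:
        return "throw";
    case IfFileNotOpen::ReopenFileThatWeCreated:
        return "reopen file created by us";
    }
    return "unknown";
}
```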

include/openPMD/IO/Access.hpp

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,8 @@ enum class Access
     APPEND //!< write new iterations to an existing series without reading
 }; // Access

+std::ostream &operator<<(std::ostream &o, Access const &a);
+
 namespace access
 {
 inline bool readOnly(Access access)
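
The header only declares the stream operator. Its definition lives in the new `src/IO/Access.cpp`, which is not shown here. As a sketch of what such an operator typically looks like, the block below defines one for a hypothetical `ToyAccess` enum. The enumerator list and the printed strings are assumptions for illustration and may differ from what openPMD actually prints.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for openPMD::Access.
enum class ToyAccess
{
    READ_RANDOM_ACCESS,
    READ_LINEAR,
    READ_WRITE,
    CREATE,
    APPEND
};

// Print the enumerator name; same signature shape as the declared operator.
std::ostream &operator<<(std::ostream &o, ToyAccess const &a)
{
    switch (a)
    {
    case ToyAccess::READ_RANDOM_ACCESS:
        return o << "READ_RANDOM_ACCESS";
    case ToyAccess::READ_LINEAR:
        return o << "READ_LINEAR";
    case ToyAccess::READ_WRITE:
        return o << "READ_WRITE";
    case ToyAccess::CREATE:
        return o << "CREATE";
    case ToyAccess::APPEND:
        return o << "APPEND";
    }
    return o << "UNKNOWN";
}

// Convenience wrapper, e.g. for building error messages.
std::string toString(ToyAccess a)
{
    std::ostringstream s;
    s << a;
    return s.str();
}
```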

include/openPMD/IO/IOTask.hpp

Lines changed: 14 additions & 0 deletions
@@ -187,7 +187,21 @@ struct OPENPMDAPI_EXPORT Parameter<Operation::OPEN_FILE>
             new Parameter<Operation::OPEN_FILE>(std::move(*this)));
     }

+    // Needed for reopening files in file-based Iteration encoding when using
+    // R/W-mode in ADIOS2. Files can only be opened for reading XOR writing,
+    // so R/W mode in file-based encoding can only operate at the granularity
+    // of files in ADIOS2. The frontend needs to tell us if we should reopen
+    // a file for continued reading (WasFoundOnDisk) or for continued writing
+    // (WasCreatedByUs).
+    enum class Reopen
+    {
+        WasCreatedByUs,
+        WasFoundOnDisk,
+        NoReopen
+    };
+
     std::string name = "";
+    Reopen reopen = Reopen::NoReopen;
     using ParsePreference = internal::ParsePreference;
     std::shared_ptr<ParsePreference> out_parsePreference =
         std::make_shared<ParsePreference>(ParsePreference::UpFront);
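
The comment on `Reopen` says the frontend must tell the backend whether a file is reopened for continued reading or continued writing. One way a frontend could derive that flag is to remember how each file was first touched. The sketch below shows that bookkeeping. `ToyFileTracker` and its members are hypothetical and do not reflect how openPMD actually tracks files; only the `Reopen` enum is taken from the diff.

```cpp
#include <set>
#include <string>

// Copied from the diff above.
enum class Reopen
{
    WasCreatedByUs,
    WasFoundOnDisk,
    NoReopen
};

// Hypothetical frontend-side bookkeeping of file origins.
struct ToyFileTracker
{
    std::set<std::string> createdByUs;
    std::set<std::string> foundOnDisk;

    void markCreated(std::string const &name)
    {
        createdByUs.insert(name);
    }

    void markFound(std::string const &name)
    {
        foundOnDisk.insert(name);
    }

    // Flag to pass along when the file is opened the next time.
    Reopen reopenFlagFor(std::string const &name) const
    {
        if (createdByUs.count(name) > 0)
        {
            return Reopen::WasCreatedByUs; // continue writing
        }
        if (foundOnDisk.count(name) > 0)
        {
            return Reopen::WasFoundOnDisk; // continue reading
        }
        return Reopen::NoReopen; // first time this file is opened
    }
};
```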

include/openPMD/Iteration.hpp

Lines changed: 31 additions & 27 deletions
@@ -48,10 +48,8 @@ namespace internal
         Open, //!< Iteration has not been closed
         ClosedInFrontend, /*!< Iteration has been closed, but task has not yet
                                been propagated to the backend */
-        ClosedInBackend, /*!< Iteration has been closed and task has been
+        Closed, /*!< Iteration has been closed and task has been
                               propagated to the backend */
-        ClosedTemporarily /*!< Iteration has been closed internally and may
-                               be reopened later */
     };

     struct DeferredParseAccess
@@ -71,11 +69,6 @@ namespace internal
          * (Group- and variable-based parsing shares the same code logic.)
          */
         bool fileBased = false;
-        /**
-         * If fileBased == true, the file name (without file path) of the file
-         * containing this iteration.
-         */
-        std::string filename;
         bool beginStep = false;
     };

@@ -92,6 +85,14 @@ namespace internal
          * overwritten.
          */
         CloseStatus m_closed = CloseStatus::Open;
+        /*
+         * While parsing a file-based Series, each file is opened, read, then
+         * closed again. Explicitly `Iteration::open()`ing a file should only be
+         * necessary after having explicitly closed it (or in
+         * defer_iteration_parsing mode). So, the parsing procedures will set
+         * this flag as true when closing an Iteration.
+         */
+        bool allow_reopening_implicitly = false;

         /**
          * Whether a step is currently active for this iteration.
@@ -107,14 +108,6 @@ namespace internal
          * Otherwise empty.
          */
         std::optional<DeferredParseAccess> m_deferredParseAccess{};
-
-        /**
-         * Upon reading a file, set this field to the used file name.
-         * In inconsistent iteration paddings, we must remember the name of the
-         * file since it cannot be reconstructed from the filename pattern
-         * alone.
-         */
-        std::optional<std::string> m_overrideFilebasedFilename{};
     };
 } // namespace internal
 /** @brief Logical compilation of data from one snapshot (e.g. a single
@@ -128,16 +121,18 @@ class Iteration : public Attributable
     template <typename T, typename T_key, typename T_container>
     friend class Container;
     friend class Series;
-    friend class WriteIterations;
-    friend class SeriesIterator;
     friend class internal::AttributableData;
     template <typename T>
     friend T &internal::makeOwning(T &self, Series);
     friend class Writable;
+    friend class StatefulIterator;
+    friend class StatefulSnapshotsContainer;

 public:
     Iteration(Iteration const &) = default;
+    Iteration(Iteration &&) = default;
     Iteration &operator=(Iteration const &) = default;
+    Iteration &operator=(Iteration &&) = default;

     using IterationIndex_t = uint64_t;

@@ -220,12 +215,19 @@ class Iteration : public Attributable

     /**
      * @brief Has the iteration been closed?
-     *        A closed iteration may not (yet) be reopened.
      *
      * @return Whether the iteration has been closed.
      */
     bool closed() const;

+    /**
+     * @brief Has the iteration been parsed yet?
+              If not, it will contain no structure yet.
+     *
+     * @return Whether the iteration has been parsed.
+     */
+    bool parsed() const;
+
     /**
      * @brief Has the iteration been closed by the writer?
      *        Background: Upon calling Iteration::close(), the openPMD API
@@ -299,6 +301,7 @@ class Iteration : public Attributable
      */
     void reread(std::string const &path);
     void readFileBased(
+        IterationIndex_t,
         std::string const &filePath,
         std::string const &groupPath,
         bool beginStep);
@@ -314,7 +317,7 @@ class Iteration : public Attributable
      */
     struct BeginStepStatus
     {
-        using AvailableIterations_t = std::optional<std::deque<uint64_t> >;
+        using AvailableIterations_t = std::vector<uint64_t>;

         AdvanceStatus stepStatus{};
         /*
@@ -356,11 +359,8 @@ class Iteration : public Attributable
      * Useful in group-based iteration encoding where the Iteration will only
      * be known after opening the step.
      */
-    static BeginStepStatus beginStep(
-        std::optional<Iteration> thisObject,
-        Series &series,
-        bool reread,
-        std::set<IterationIndex_t> const &ignoreIterations = {});
+    static BeginStepStatus
+    beginStep(std::optional<Iteration> thisObject, Series &series, bool reread);

     /**
      * @brief End an IO step on the IO file (or file-like object)
@@ -434,13 +434,17 @@ inline T Iteration::dt() const
  */
 class IndexedIteration : public Iteration
 {
-    friend class SeriesIterator;
-    friend class WriteIterations;
+    friend class StatefulIterator;
+    friend class LegacyIteratorAdaptor;

 public:
     using index_t = Iteration::IterationIndex_t;
     index_t const iterationIndex;

+    inline IndexedIteration(std::pair<index_t const, Iteration> pair)
+        : Iteration(std::move(pair.second)), iterationIndex(pair.first)
+    {}
+
 private:
     template <typename Iteration_t>
     IndexedIteration(Iteration_t &&it, index_t index)
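
The new public `IndexedIteration` constructor takes a `std::pair<index_t const, Iteration>` by value, moves the iteration out of it, and keeps the key as `iterationIndex`. This presumably lets legacy loops of the form `for (IndexedIteration it : ...)` keep working on top of pair-valued iterators. The library-free sketch below reproduces the same constructor shape on toy types. `ToyIteration`, `ToyIndexedIteration`, and `firstEntry` are hypothetical names. Requires C++17.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for openPMD::Iteration.
struct ToyIteration
{
    std::string payload;
};

// Same constructor shape as IndexedIteration in the diff above.
class ToyIndexedIteration : public ToyIteration
{
public:
    using index_t = std::uint64_t;
    index_t const iterationIndex;

    // The pair is taken by value so that `second` can be moved from.
    ToyIndexedIteration(std::pair<index_t const, ToyIteration> pair)
        : ToyIteration(std::move(pair.second)), iterationIndex(pair.first)
    {}
};

// Convert the first map entry; the map must not be empty.
ToyIndexedIteration
firstEntry(std::map<std::uint64_t, ToyIteration> const &snapshots)
{
    return ToyIndexedIteration(*snapshots.begin());
}
```

Because the constructor is not `explicit`, a map entry also converts implicitly, e.g. `ToyIndexedIteration it = *snapshots.begin();`.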

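The `CloseStatus` hunk in `Iteration.hpp` merges `ClosedInBackend` and `ClosedTemporarily` into a single `Closed` state, and the commit message mentions reopening after closing. A toy state machine for the three enumerators visible in that hunk might look like the sketch below. The transition rules and the names `ToyIteration`, `close()`, `flush()`, and `reopen()` are assumptions made for illustration. The real enum may contain further enumerators outside the shown hunk, and the real transitions are implemented elsewhere in the commit.

```cpp
// Only the enumerators visible in the hunk; the real enum may have more.
enum class CloseStatus
{
    Open,
    ClosedInFrontend,
    Closed
};

// Hypothetical model of an Iteration's close/reopen life cycle.
struct ToyIteration
{
    CloseStatus status = CloseStatus::Open;

    // User closes the iteration; the backend has not been told yet.
    void close()
    {
        if (status == CloseStatus::Open)
        {
            status = CloseStatus::ClosedInFrontend;
        }
    }

    // A flush propagates the pending close to the backend.
    void flush()
    {
        if (status == CloseStatus::ClosedInFrontend)
        {
            status = CloseStatus::Closed;
        }
    }

    // Assumed: a closed iteration may be opened again.
    void reopen()
    {
        status = CloseStatus::Open;
    }

    bool closed() const
    {
        return status != CloseStatus::Open;
    }
};
```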