New IO capabilities via FieldAPI IO #121
Conversation
|
The HDF5 functionality has now been added to field_api main. Could this also be merged into develop_ecmwf, as that's the branch we use in cloudsc? (@awnawab) |
Yes please! I need it for further developments. |
Sure, I'll rebase ecmwf-develop over main. |
|
F-API develop-ecmwf now rebased over main. |
|
@piotrows Can you update the field_api branch in the bundle.yml and resolve the conflicts? Then this should be ready for a look I believe? |
d027734 to
224903b
Compare
Done, let's have a look together. I've made this development a bit more optional. |
mlange05
left a comment
There was a problem hiding this comment.
First of all, apologies for the delayed review comments, and second of all many thanks for taking this on. I think this is already very useful and I've run a basic example locally and things do indeed work as I'd expect them to. So, many thanks for this and much appreciated.
Next up, I've left a few comments mostly about structure and where to call what. In short, I think it would be better if we would add the file-write capabilities as utilities of the field-based state and flux objects, and call these outside of the driver layer, triggered by env variables to allow run-time switching instead of compile-time switching.
I've also noted two things:
- Nothing in the tests actually tests the correct execution of the output file, so silent failure would not be caught. Maybe a simple check for the existence of the dumped file(s) could be added to one of the ctests?
- Executing the same run twice only works with a data set of the same size, however changing the working set size on a re-run fails. I'm assuming we want to replace existing output files on the re-run instead of writing into the existing file?
| TYPE(CLOUDSC_STATE_TYPE) ,INTENT(IN) :: TENDENCY_TMP | ||
| TYPE(CLOUDSC_STATE_TYPE) ,INTENT(IN) :: TENDENCY_LOC | ||
|
|
||
| REAL(KIND=JPRB), POINTER :: PAUX_PT(:,:,:) |
There was a problem hiding this comment.
I don't think the pointer declarations are needed here, or am I missing something?
There was a problem hiding this comment.
They are needed for the FA IO API so the shape of the variable becomes explicit.
There was a problem hiding this comment.
Are they truly needed there though? From my reading of https://github.com/ecmwf-ifs/field_api/blob/main/src/io/field_RANKSUFF_hdf5_module.fypp#L50 it uses SELF%PTR throughout, so there's no need to pass an externally declared pointer in. I also think this makes it look much more cumbersome than it needs to be, as the field object should have all the information locally to just write its content.
There was a problem hiding this comment.
I'm guessing this is required to have shape information embedded into the subroutine interface that allows Fortran to pick the correct variant for the field?
If the pointer was removed, something like
CALL WRITE_HDF5_PERRANK_DATA(PAUX%F_PT, PAUX_PVERVEL, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
would then have to have either a dimension keyword
CALL WRITE_HDF5_PERRANK_DATA(PAUX%F_PT, 3, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
or specify the actual implementation in the routine name, hypothetically paraphrasing:
CALL WRITE_HDF5_PERRANK_DATA_3RB(PAUX%F_PT, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
I might be misinterpreting this, but if the above is the case, then pointers with a certain dimension are an unintuitive way of specifying this information. @awnawab might have a suggestion here?
| write(0,1003) NUMPROC,NUMOMP,NGPTOTG,NPROMA,NGPBLKS | ||
| end if | ||
|
|
||
| #ifdef FIELD_API_IO |
There was a problem hiding this comment.
To me, the option to write the input state to file would be a run-time option, and not a compile-time option. How about we rename TEST_FIELD_HDF5 to WRITE_STATE_HDF5 and make it a property of the CLOUDSC_FIELD_STATE instead of invoking it from the kernel? This way we could toggle this via an environment flag in src/cloudsc_fortran/dwarf_cloudsc.F90 right after loading the input state from the classic input files?
Similar we could then write the output by implementing the same boilerplate on CLOUDSC_FLUX_TYPE instead of adding the I/O to the kernel driver.
There was a problem hiding this comment.
How about we rename
TEST_FIELD_HDF5toWRITE_STATE_HDF5
Could be done but what we are doing at the moment is testing (write+read, not a write alone).
There was a problem hiding this comment.
make it a property of the
CLOUDSC_FIELD_STATEinstead of invoking it from the kernel?
Well, this is an option, but then it looses the virtue of a top-level, clear example that teaches the FieldAPI IO. So it really depends if we want expand CLOUDSC capabilities or we would rather prefer a FA IO tutorial, or perhaps both.
There was a problem hiding this comment.
Well, this is an option, but then it looses the virtue of a top-level, clear example that teaches the FieldAPI IO. So it really depends if we want expand CLOUDSC capabilities or we would rather prefer a FA IO tutorial, or perhaps both.
I disagree on this one - having a pure procedural test routine does not make a good "tutorial". Instead, the derived types CLOUDSC_FIELD_STATE and CLOUDSC_FLUX_STATE are intended as examples of group types used in the EC-physics context. Adding independent read/write routines to those types is the main goal here, so why not illustrate this here in exactly this form?
There was a problem hiding this comment.
How about splitting the two into a write and read routine and calling them in this order from here?
That would work as a demonstration of how to use either mode individually.
For a full-on switch to HDF5 as inputs, we would need to replicate the inputs in a third format commited to the repository. I think that is a good idea but could also be tackled in a follow-on PR maybe?
The input reading is still very much set-up in a Serialbox first, otherwise use the legacy hdf5 read method. We could first remove that and full-on switch to HDF5, then add the field_api hdf5 reader.
| help : Enable Field API IO (requires MPI) [ON|OFF] | ||
| cmake : > | ||
| FIELD_API_ENABLE_IO={{value}} | ||
| FIELD_API_ENABLE_FIAT={{value}} |
There was a problem hiding this comment.
Should we not always enable FIAT now that it is part of the bundle? Also, could we sensibly always enabled Field-API-IO, or do we need that toggle for installs without hdf5?
There was a problem hiding this comment.
@reuterbal was strongly opposing permanent dependence on FIAT.
Since the availability of HDF5 for non-standard compilers is not obvious, perhaps we could stick to an optional HDF5...
There was a problem hiding this comment.
Ok, that makes sense. In that case please leave this as is.
| ${COMMON_LIB_LIBS} | ||
| $<${HAVE_FIELD_API}:fiat> | ||
| $<${HAVE_FIELD_API}:field_api_${prec}> | ||
| # $<$<NOT:$<BOOL:${HAVE_FIELD_API}>>:parkind_${prec}> |
Actually, there is a check. If you look closely, there is first a store to the output file and then an immediate read. If the store is corrupt, the read shall be also corrupt and the usual norms will fail. |
This is a design question ... the current solution prevents the user from overwriting his previously stored data and was chosen deliberately. We can extend the API to support both ways, or we can change the default behavior. |
|
@mlange05 @reuterbal
|
|
| TYPE(CLOUDSC_STATE_TYPE) ,INTENT(IN) :: TENDENCY_TMP | ||
| TYPE(CLOUDSC_STATE_TYPE) ,INTENT(IN) :: TENDENCY_LOC | ||
|
|
||
| REAL(KIND=JPRB), POINTER :: PAUX_PT(:,:,:) |
There was a problem hiding this comment.
I'm guessing this is required to have shape information embedded into the subroutine interface that allows Fortran to pick the correct variant for the field?
If the pointer was removed, something like
CALL WRITE_HDF5_PERRANK_DATA(PAUX%F_PT, PAUX_PVERVEL, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
would then have to have either a dimension keyword
CALL WRITE_HDF5_PERRANK_DATA(PAUX%F_PT, 3, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
or specify the actual implementation in the routine name, hypothetically paraphrasing:
CALL WRITE_HDF5_PERRANK_DATA_3RB(PAUX%F_PT, "cloudsc_data", "PT", HDFEXISTS=.TRUE.)
I might be misinterpreting this, but if the above is the case, then pointers with a certain dimension are an unintuitive way of specifying this information. @awnawab might have a suggestion here?
| set(CLOUDSC_FIAT_SOURCES | ||
| module/parkind1.F90 | ||
| module/ec_pmon_mod.F90 | ||
| module/oml_mod.F90 | ||
| module/abor1.F90 | ||
| ) |
There was a problem hiding this comment.
My suggestion would be to roll back this change and restore these files in the list above.
| set(CLOUDSC_FIAT_SOURCES | |
| module/parkind1.F90 | |
| module/ec_pmon_mod.F90 | |
| module/oml_mod.F90 | |
| module/abor1.F90 | |
| ) |
| SOURCES | ||
| ${CLOUDSC_COMMON_SOURCES} | ||
| ${COMMON_LIB_SOURCES} | ||
| $<$<NOT:$<BOOL:${HAVE_FIAT}>>:${CLOUDSC_FIAT_SOURCES}> |
There was a problem hiding this comment.
This can then also be removed
| $<$<NOT:$<BOOL:${HAVE_FIAT}>>:${CLOUDSC_FIAT_SOURCES}> |
| write(0,1003) NUMPROC,NUMOMP,NGPTOTG,NPROMA,NGPBLKS | ||
| end if | ||
|
|
||
| #ifdef FIELD_API_IO |
There was a problem hiding this comment.
How about splitting the two into a write and read routine and calling them in this order from here?
That would work as a demonstration of how to use either mode individually.
For a full-on switch to HDF5 as inputs, we would need to replicate the inputs in a third format commited to the repository. I think that is a good idea but could also be tackled in a follow-on PR maybe?
The input reading is still very much set-up in a Serialbox first, otherwise use the legacy hdf5 read method. We could first remove that and full-on switch to HDF5, then add the field_api hdf5 reader.
|
F-API develop-ecmwf rebased over main and now also contains ecmwf-ifs/field_api#108. |
| FIELD_API_ENABLE_IO={{value}} | ||
| FIELD_API_ENABLE_FIAT={{value}} | ||
| ENABLE_FIELD_API_ENABLE_IO={{value}} | ||
| ENABLE_MPI=ON |
There was a problem hiding this comment.
Is there really a need to forcefully enable MPI?
There was a problem hiding this comment.
Is there really a need to forcefully enable MPI?
The way it is coded in FA at the moment, yes. I am working on it on a FA side.
There was a problem hiding this comment.
Is there really a need to forcefully enable MPI?
@reuterbal @awnawab I've made MPI/MPL fully optional, what do you think about the current state?
There was a problem hiding this comment.
That's great, many thanks! Looks very nice on the CLOUDSC side now, but I haven't look at Field-API. I'll let @awnawab judge the F-API angle ;-)
There was a problem hiding this comment.
I think we can simplify it a little in F-API. My comments that follow are based on what I saw in napz-IO-mpi-independent-option, please disregard if you had plans to change that.
Right now the FIELD_API IO feature requires fiat and for fiat to be built with MPI. For the moment, let's keep these two assumptions, and if in the future they become limitations we can address them then.
The only MPI related information we need in the IO functionality is the MPI communicator. The current rank and total number of ranks can be queried from the communicator. So I would propose we simply make the MPI communicator an optional argument to any IO routine that needs it. If this argument is not present, then we can assume serial IO. We can print a message either way to NULOUT so the user isn't surprised.
This allows us to remove all the IO related options: IO_SERIAL, IO_MPI, IO_MPL. FIAT links to MPI publicly, so linking to FIAT means we can import the MPI runtime library too in F-API. Since that is currently a requirement for the IO feature, we don't even need to ifdef guard it.
Apologies if I've missed something, and we can also chat about it offline if you'd like 😄
There was a problem hiding this comment.
I think you misunderstood me. Having fiat as a dependency when we're building against field api is perfectly valid. I just don't like having fiat always as a dependency on non-fapi targets.
There was a problem hiding this comment.
It just means we increment the rank by 1 by default,
If it is not set in stone in MPL, then I would see it as unsafe to encode such an implicit assumption into non-MPL code.
There was a problem hiding this comment.
Having 4 IO related options seems like overkill
I think there are 3: SERIAL, MPI, MPL. There need to be at least two specified at the compile time (serial and parallel), the third one merely allows to be independent of FIAT.
There was a problem hiding this comment.
An
IOoption coupled with anMPIoption should be enough., and allows us to be independent of FIAT.
Again, from a domain scientist point of view, there should be an option of storing the data in an organized manner that translates in an unique way to a computational mesh, so the h5 file can be e.g. viewed in an external viewer or auto-processed in Python for postprocessing. At least until we move on to Atlas.
There was a problem hiding this comment.
I think you misunderstood me. Having fiat as a dependency when we're building against field api is perfectly valid. I just don't like having fiat always as a dependency on non-fapi targets.
My main point was to provide an MPI-independent setup, actually. You had a good point that this should be relaxed and indeed I see that for the sake of testing new compilers it is very handy to don't rely on the accompanying MPI.
| target_compile_options( ${TARGET}-${prec}-lib | ||
| PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:SHELL:${CLOUDSC_CUDA_OPT_FLAGS} ${CLOUDSC_CUDA_FLAGS}> | ||
| ) | ||
| target_link_libraries(${TARGET}-${prec}-lib PRIVATE $<${HAVE_MPI}:MPI::MPI_C>) |
There was a problem hiding this comment.
This should be redundant to the addition above (if that is even needed)
| DESCRIPTION "Support for task-level parallelism using MPI" | ||
| DEFAULT OFF | ||
| REQUIRED_PACKAGES "MPI COMPONENTS Fortran" ) | ||
| REQUIRED_PACKAGES "MPI COMPONENTS Fortran C CXX" ) |
There was a problem hiding this comment.
Still curious what makes this necessary?
There was a problem hiding this comment.
As stated earlier, dependence of hdf5.h on mpi.h. I am not sure if CXX will be needed here, but it is hard to exclude at the moment.
| if( HAVE_FIELD_API_ENABLE_IO_MPL ) | ||
| list(APPEND CLOUDSC_DEFINITIONS FIELD_API_IO_MPL) | ||
| elseif( HAVE_FIELD_API_ENABLE_IO_MPI ) | ||
| find_package(MPI REQUIRED ) |
There was a problem hiding this comment.
I think it makes more sense to check here that the MPI feature is enabled. The bundle implicitly enforces this but a CMake-build without bundle would allow a halfway-house otherwise
| find_package(MPI REQUIRED ) | |
| if( NOT HAVE_MPI ) | |
| ecbuild_error("Cannot enable FIELD_API_ENABLE_IO_MPI when feature MPI is disabled.") | |
| endif() |
There was a problem hiding this comment.
Actually, at the moment it is activated in FieldAPI itself:
elseif(HAVE_IO_MPI)
ecbuild_enable_mpi( COMPONENTS FORTRAN REQUIRED)
so this section of cloudsc can be deleted indeed.
However, if we link to parallel version of HDF5, it is probably necessary anyway to activate MPI, even without --with-mpi in the bundle.
| Field API is used in newer versions of the IFS. Optionally, this version also tests | ||
| Field API IO. To test IO write/read feature, add --with-field-api-io at the build stage. | ||
| Note that this enables MPI by default. The IO feature is embedded in the | ||
| cloudsc-fortran Field API test. |
There was a problem hiding this comment.
This probably needs a small update later.
Co-authored-by: Balthasar Reuter <6384870+reuterbal@users.noreply.github.com>
…he latest ecbuild/eckit/atlas
…PI IO combinations
This PR adds HDF5 IO feature to a FieldAPI variant of CLOUDSC. The changes proposed are: