Commit 152fa87

vepadulano authored and dpiparo committed
[df] Add more release notes for 6.40
(cherry picked from commit fe7fa7a)
1 parent 623ecbe commit 152fa87

1 file changed

Lines changed: 4 additions & 0 deletions

README/ReleaseNotes/v640/index.md

@@ -341,6 +341,10 @@ Given the risk of silently incorrect physics results, and the absence of known w
- The change of default compression settings used by Snapshot for the TTree output data format introduced in 6.38 (was 101 before 6.38, became 505 in 6.38) is reverted. That choice was based on the evidence available at the time, which indicated that ZSTD outperformed ZLIB in all cases for the available datasets. New evidence shows that this is not always the case, notably for TTree branches made of collections where many (up to all) of the collections are empty. The investigation is described at https://github.com/vepadulano/ttree-lossless-compression-studies. The new default compression settings for Snapshot are `kUndefined` for the compression algorithm and `0` for the compression level. When Snapshot detects `kUndefined` in the options, it changes the compression settings to the new defaults of 101 (for TTree) and 505 (for RNTuple).
- Signatures of the HistoND and HistoNSparseD operations have been changed. Previously, the list of input column names was allowed to contain an extra column for event weights. This was done to align the logic with the THnBase::Fill method, but the signature was inconsistent with all other Histo* operations, which have a separate function argument for the column to get the weights from. Thus, HistoND and HistoNSparseD both now have a separate function argument for the weights. The previous signature is still supported, but deprecated: a warning will be raised if the user passes the column name of the weights as an extra element of the list of input column names. In a future version of ROOT this functionality will be removed. From now on, creating a (sparse) N-dim histogram with weights should be done by calling `HistoN[Sparse]D(histoModel, inputColumns, weightColumn)`.
- The string expressions passed to `Vary` calls can now be shortened. If the string begins with '{' and ends with '}' (ignoring whitespace, tab and newline characters), RDataFrame will automatically inject the return type in the generated lambda expression before declaring it to the interpreter. For example, this allows writing an expression such as `{{px * 0.9, px * 1.1}, {py * 0.9, py * 1.1}}` instead of `ROOT::RVec<ROOT::RVec<ROOT::RVec<float>>>{{px * 0.9, px * 1.1}, {py * 0.9, py * 1.1}}`.
+ - Support for the `Report` action was added to distributed RDataFrame.
+ - The memory ownership model and the information sharing mechanism have been reworked for computation graphs with nodes storing code to be just-in-time compiled. In practice, this reduces the amount of code that is interpreted before the start of a computation graph, which results in faster setup times, lower memory usage and safer memory management overall. For a simple yet non-trivial example with 2 computation graphs, each having 1000 nodes with code to be just-in-time compiled and sharing the same function signature, previous versions of RDataFrame would produce 4020 lines of code to be interpreted, whereas the current version produces 35 lines.
+ - A new method called `GetDatasetTopLevelFieldNames` was added to retrieve the list of names of available columns that correspond to top-level fields on disk. This makes sense only when the data source supports hierarchical dataset schemas, such as in the case of TTree or RNTuple. For example, if the schema contains a user class with a data member, only the name of the top-level field containing the user class object is reported, not the name of the data member sub-field.
+ - Previously, RDataFrame would prevent the user from calling Snapshot if one or more of the selected columns corresponded to a type without an available I/O dictionary. If that type is known to the interpreter, it can be written to disk both by TTree and RNTuple (with different levels of support). If interpreter information is available, RDataFrame now removes that safeguard and delegates responsibility for writing an object of an interpreted class type to disk to either TTree or RNTuple.
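
As a rough illustration of two rules described in the notes above, the sketch below models in plain Python (not ROOT code) how Snapshot could resolve `kUndefined` to the per-format defaults, and how a brace-delimited `Vary` expression could be detected and prefixed with a return type. The names `K_UNDEFINED`, `resolve_compression`, `expand_vary_expression` and the `return_type` parameter are hypothetical; in RDataFrame the return type is injected automatically rather than supplied by the user. The `100 * algorithm + level` encoding is consistent with the 101 and 505 values quoted above (ZLIB level 1, ZSTD level 5).

```python
# Hypothetical stand-in for the "undefined" compression algorithm marker.
K_UNDEFINED = "kUndefined"

# Defaults quoted in the release notes: 101 for TTree, 505 for RNTuple.
DEFAULT_SETTINGS = {"TTree": 101, "RNTuple": 505}


def resolve_compression(algorithm, level, output_format):
    """Return the combined compression setting for a Snapshot-like call.

    If the algorithm is left undefined, fall back to the default of the
    output format; otherwise combine algorithm and level explicitly.
    """
    if algorithm == K_UNDEFINED:
        return DEFAULT_SETTINGS[output_format]
    # Combined setting: 100 * algorithm + level (e.g. 5 and 5 give 505).
    return 100 * algorithm + level


def expand_vary_expression(expr, return_type):
    """Prefix a brace-delimited expression with an explicit return type.

    Assumption: this only mimics the textual effect described in the notes;
    how RDataFrame injects the type into the generated lambda may differ.
    """
    stripped = expr.strip()  # drops surrounding spaces, tabs and newlines
    if stripped.startswith("{") and stripped.endswith("}"):
        return f"{return_type}{stripped}"
    # Anything else is passed through unchanged.
    return expr


print(resolve_compression(K_UNDEFINED, 0, "TTree"))
print(resolve_compression(K_UNDEFINED, 0, "RNTuple"))
print(expand_vary_expression(" {px * 0.9, px * 1.1}\n", "ROOT::RVec<float>"))
```

An explicit choice such as `resolve_compression(5, 5, "TTree")` bypasses the defaults, which mirrors the idea that only an undefined algorithm triggers the per-format fallback.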

## Histograms
