Commit fe9ad24

ariostas and ianna authored
docs: update to reflect that RNTuple is the default writing format (#1624)
Updated docs to reflect RNTuple being the default writing format

Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>
1 parent 30f70b0 commit fe9ad24

2 files changed

Lines changed: 41 additions & 23 deletions

docs-sphinx/basic.rst

Lines changed: 16 additions & 5 deletions
@@ -924,6 +924,8 @@ Reading RNTuples
 
 TTree has been the default format to store large datasets in ROOT files for decades. However, it has slowly become outdated and is not optimized for modern systems. This is where the RNTuple format comes in. It is a modern serialization format that is designed with modern systems in mind and is planned to replace TTree in the coming years. `Version 1.0.0.0 <https://cds.cern.ch/record/2923186>`__ is out and will be supported "forever".
 
+Starting in Uproot v5.7.0, RNTuple is the default format for writing. When you use the dict-like syntax to write data to a file, Uproot will create an RNTuple instead of a TTree.
+
 RNTuples are deliberately simpler than TTrees by design. For the first time, there’s an official specification, making it much easier for third-party I/O tools like Uproot to support it. Uproot already supports reading the full RNTuple specification, meaning that you can read any RNTuple you find in the wild. It also already supports writing a large part of the specification, and intends to support as much as it makes sense for data analysis.
 
 To ease the transition into RNTuples, we are designing the interface to closely match the existing TTree interface. Many of the functionality explained in the previous subsections works in the same way. However, there the terminology is slightly different (e.g. "branch" becomes "field") and arguments may vary slightly, accordingly.
@@ -1145,8 +1147,10 @@ Writing TTrees to a file
 TTrees are a special type of object, just as TDirectories are special: data can be cumulatively added to them.
 
 :doc:`uproot.writing.writable.WritableTree` objects can be created using the :ref:`uproot.writing.writable.WritableDirectory.mktree` method that Uproot provides for TDirectories.
-Previously, they could be created by assigning TTree-like data to a name in a directory (e.g., ``file["tree"] = {"branch": np.arange(1000)}``). However, this syntax was deprecated,
-as Uproot will switch to writing RNTuples with syntax, since the HEP community is moving towards RNTuples.
+
+.. note::
+
+    Starting in v5.7.0, Uproot uses RNTuples as the default format for writing data when using the dict-like assignment syntax (e.g., ``file["my_data"] = {"my_array": np.arange(1000)}``). If you specifically want to write a TTree, you should use the :ref:`uproot.writing.writable.WritableDirectory.mktree` method.
 
 .. code-block:: python
 
@@ -1327,14 +1331,21 @@ Writing RNTuples
 
 Just like with reading, writing RNTuples is similar to writing TTree objects. Since RNTuples are much simpler, we aim to be able to write almost any RNTuple that you might want.
 
-Here is an example of writing an RNTuple. Since TTree is still the default format for the near future, writing an RNTuple is a bit more verbose.
+RNTuples are the default format for writing data starting in Uproot v5.7.0. You can write an RNTuple by using a dict-like syntax:
 
 .. code-block:: python
 
     >>> file = uproot.recreate("example.root")
     >>> data = {"my_int": [1,2], "my_vector": [[1,2], [3,4,5]]}
-    >>> rntuple = file.mkrntuple("my_rntuple", data)
-    >>> rntuple.extend(data) # Can be extended, just like TTrees
+    >>> file["my_rntuple"] = data
+    >>> file["my_rntuple"].extend(data) # Can be extended, just like TTrees
+
+You can also use the :ref:`uproot.writing.writable.WritableDirectory.mkrntuple` method for more explicit RNTuple creation, and to have the ability to initialize an empty RNTuple from a type specification dictionary or an Awkward form.
+
+.. code-block:: python
+
+    >>> file.mkrntuple("ntuple", {"x": "f4", "y": "var * int64"})
+    <WritableNTuple '/ntuple' at 0x000131ff30e0>
 
 Using your own interpretation
 --------------------------------

src/uproot/writing/writable.py

Lines changed: 25 additions & 18 deletions
@@ -521,9 +521,11 @@ class WritableDirectory(MutableMapping):
     Subdirectories created this way will never be empty; to make an empty directory,
     use :ref:`uproot.writing.writable.WritableDirectory.mkdir`.
 
-    Similarly, non-empty TTrees can be created by assignment (see :doc:`uproot.writing.writable.WritableTree`
-    for recognized TTree-like data), but empty TTrees require the
-    :ref:`uproot.writing.writable.WritableDirectory.mktree` method.
+    Similarly, non-empty RNTuples can be created by assignment starting in Uproot
+    v5.7.0 (see :doc:`uproot.writing.writable.WritableNTuple` for recognized
+    RNTuple-like data), but empty RNTuples require the
+    :ref:`uproot.writing.writable.WritableDirectory.mkrntuple` method.
+    Writing a TTree requires the :ref:`uproot.writing.writable.WritableDirectory.mktree` method.
     """
 
     def __init__(self, path, file, cascading):
@@ -1285,15 +1287,9 @@ def mktree(
 
         Creates an empty TTree in this directory.
 
-        Note that TTrees can be created by assigning TTree-like data to a directory
-        (see :doc:`uproot.writing.writable.WritableTree` for recognized TTree-like types):
-
-        .. code-block:: python
-
-            my_directory["tree"] = {"branch1": np.array(...), "branch2": ak.Array(...)}
-
-        but TTrees created this way will never be empty. Use this method
-        to make an empty TTree or to control its parameters.
+        Note that starting in v5.7.0, Uproot uses RNTuples as the default format for writing
+        data when using the dict-like assignment syntax. Writing a TTree requires using this
+        method.
         """
         if self._file.sink.closed:
             raise ValueError("cannot create a TTree in a closed file")
@@ -1388,6 +1384,16 @@ def mkrntuple(
             description (str): Description for the new RNTuple.
 
         Creates an empty RNTuple in this directory.
+
+        Note that starting in v5.7.0, non-empty RNTuples can be created by
+        assigning RNTuple-like data to a directory:
+
+        .. code-block:: python
+
+            my_directory["ntuple"] = {"field1": np.array(...), "field2": ak.Array(...)}
+
+        but RNTuples created this way will never be empty. Use this method
+        to make an empty RNTuple or to control its parameters.
         """
         if self._file.sink.closed:
             raise ValueError("cannot create a RNTuple in a closed file")
@@ -2012,10 +2018,11 @@ class WritableNTuple:
 
     Represents a writable ``RNTuple`` from a ROOT file.
 
-    Assigning TTree-like data to a directory creates the TTree object with all of
-    its metadata and fills it with the contents of the arrays in one step. To separate
-    the process of creating the TTree metadata from filling the first TBasket, use the
-    :doc:`uproot.writing.writable.WritableDirectory.mktree` method:
+    Assigning data to a directory creates an RNTuple object by default starting in Uproot v5.7.0.
+    This creates the RNTuple object with all of its metadata and fills it with
+    the contents of the arrays in one step. To separate the process of creating the
+    RNTuple metadata from filling the first cluster, use the
+    :doc:`uproot.writing.writable.WritableDirectory.mkrntuple` method:
 
     .. code-block:: python
 
@@ -2039,8 +2046,8 @@ class WritableNTuple:
     and slow to read (especially for Uproot, but also for ROOT).
 
     For instance, if you want to write a million events and have enough memory
-    available to do that 100 thousand events at a time (total of 10 TBaskets),
-    then do so. Filling the RNTuple a hundred events at a time (total of 10000 TBaskets)
+    available to do that 100 thousand events at a time (total of 10 clusters),
+    then do so. Filling the RNTuple a hundred events at a time (total of 10000 clusters)
     would be considerably slower for writing and reading, and the file would be much
     larger than it could otherwise be, even with compression.
     """

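The cluster counts quoted in the `WritableNTuple` docstring (10 versus 10000) follow from simple arithmetic, which the sketch below reproduces. It assumes that each fill (one `extend` call) writes exactly one cluster, as the docstring implies. The helper name `count_clusters` is hypothetical and is not part of Uproot.

```python
def count_clusters(total_events: int, events_per_fill: int) -> int:
    """Return how many fills are needed to write total_events.

    Assumption: every fill produces exactly one cluster, so the
    result is also the number of clusters in the RNTuple.
    """
    if total_events < 0:
        raise ValueError("total_events must be non-negative")
    if events_per_fill <= 0:
        raise ValueError("events_per_fill must be positive")
    # Integer ceiling division: a partial final fill still costs one cluster.
    return (total_events + events_per_fill - 1) // events_per_fill


# The two scenarios from the docstring: one million events in total.
large_fills = count_clusters(1_000_000, 100_000)
small_fills = count_clusters(1_000_000, 100)
print(large_fills, small_fills)
```

Under that assumption, filling a hundred events at a time yields a thousand times as many clusters for the same data, which is the overhead the docstring warns about.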