File: bindings/pyroot/pythonizations/python/ROOT/_pythonization/dataloader.md
Lines changed: 10 additions & 36 deletions
@@ -38,13 +38,9 @@ import ROOT
 # Open a ROOT file and create an RDataFrame
 rdf = ROOT.RDataFrame("events", "file.root")
 
-# Define a Python callback to compute a new variable
-def invariant_mass(E: float, p: float) -> float:
-    return math.sqrt(E**2 - p**2)
-
 # Apply selections and compute derived features
 rdf = rdf.Filter("nMuons >= 2") \
-         .Define("inv_mass", invariant_mass, ["E", "p"])
+         .Define("inv_mass", "sqrt(E*E - p*p)")
 ~~~
 
 Then pass your `RDataFrame` to `RDataLoader`:
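A note on the `inv_mass` change in the hunk above: the new string expression encodes the same arithmetic as the removed callback, m = sqrt(E² − p²) in natural units. The following plain-Python sketch can be used to spot-check values; the name `invariant_mass_reference` is hypothetical and is not part of the patch.

```python
import math


def invariant_mass_reference(E: float, p: float) -> float:
    """Reference value for the expression "sqrt(E*E - p*p)" (natural units, c = 1).

    Hypothetical helper for spot checks only; not part of the patch.
    """
    # Same arithmetic as the string passed to Define in the hunk above.
    return math.sqrt(E * E - p * p)


print(invariant_mass_reference(5.0, 3.0))  # 4.0
```

In this Python version, `math.sqrt` raises `ValueError` for a negative argument, so unphysical inputs with E < p fail loudly.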
@@ -138,7 +134,7 @@ dl = RDataLoader(
 # events with fewer than 10 jets are zero-padded
 ~~~
 
-\warning Every RVec column in `columns` must appear in `max_vec_sizes`.
+\warning Every vector column in `columns` must appear in `max_vec_sizes`.
 
 ## Iterating Batches
 
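The context comment in the hunk above says that events with fewer than 10 jets are zero-padded. A minimal plain-Python sketch of that idea follows. The helper `pad_to_size` is hypothetical, and truncating vectors longer than the maximum is an assumption of this sketch, not confirmed loader behaviour.

```python
from typing import List, Sequence


def pad_to_size(values: Sequence[float], max_size: int, fill: float = 0.0) -> List[float]:
    """Return exactly max_size entries, padding short inputs with `fill`.

    Assumption: inputs longer than max_size are truncated. The source only
    states that shorter vectors are zero-padded.
    """
    kept = list(values[:max_size])
    return kept + [fill] * (max_size - len(kept))


jet_pt = [52.1, 40.3, 31.7]          # an event with three jets
padded = pad_to_size(jet_pt, 10)     # fixed length of 10, as in the comment above
print(len(padded), padded[:4])       # 10 [52.1, 40.3, 31.7, 0.0]
```

A fixed length per column is what allows variable-length vectors to be stacked into a rectangular batch, which is presumably why each vector column needs an entry in `max_vec_sizes`.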
@@ -212,6 +208,14 @@ train, val = train_val.train_test_split(test_size=0.176)
 
 ## Advanced Features
 
+### Eager loading
+
+By default the loader reads data lazily, one chunk of data at a time. For small datasets that fit in memory and will be iterated many times, eager loading pays a one-time cost at construction and then serves batches every epoch from memory:
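The `RDataLoader` option that enables eager loading is not visible in this excerpt. Independent of that API, the trade-off described above can be sketched in plain Python; `read_chunks`, `LazyLoader` and `EagerLoader` are hypothetical stand-ins, not ROOT classes.

```python
from typing import Callable, Iterator, List

Chunk = List[int]


def read_chunks(n_chunks: int, chunk_size: int, log: List[int]) -> Iterator[Chunk]:
    """Hypothetical slow source: records every chunk it reads in `log`."""
    for i in range(n_chunks):
        log.append(i)  # one entry per simulated disk read
        yield list(range(i * chunk_size, (i + 1) * chunk_size))


class LazyLoader:
    """Re-reads the source on every epoch, one chunk at a time."""

    def __init__(self, source: Callable[[], Iterator[Chunk]]) -> None:
        self._source = source

    def __iter__(self) -> Iterator[Chunk]:
        return self._source()


class EagerLoader:
    """Reads everything once at construction; later epochs use memory only."""

    def __init__(self, source: Callable[[], Iterator[Chunk]]) -> None:
        self._chunks = list(source())

    def __iter__(self) -> Iterator[Chunk]:
        return iter(self._chunks)


lazy_reads: List[int] = []
eager_reads: List[int] = []
lazy = LazyLoader(lambda: read_chunks(3, 4, lazy_reads))
eager = EagerLoader(lambda: read_chunks(3, 4, eager_reads))

for epoch in range(5):
    for chunk in lazy:
        pass
    for chunk in eager:
        pass

print(len(lazy_reads), len(eager_reads))  # 15 3
```

Over five epochs the lazy variant reads each of the three chunks five times, while the eager variant reads them once and keeps all of them in memory.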
Correct class imbalance by oversampling the minority or undersampling the majority. You can do this by passing two RDataFrames:
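How `RDataLoader` combines the two RDataFrames is not shown in this excerpt. As a library-independent sketch of the two strategies named above, assuming simple random resampling (the function names are hypothetical):

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def oversample_minority(majority: Sequence[T], minority: Sequence[T], seed: int = 0) -> List[T]:
    """Repeat minority items (drawn with replacement) up to the majority size."""
    rng = random.Random(seed)
    n_extra = max(len(majority) - len(minority), 0)
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return list(minority) + extra


def undersample_majority(majority: Sequence[T], minority: Sequence[T], seed: int = 0) -> List[T]:
    """Keep a random subset of the majority (without replacement) of minority size."""
    rng = random.Random(seed)
    k = min(len(majority), len(minority))
    return rng.sample(list(majority), k)


background = list(range(1000))   # majority class
signal = list(range(50))         # minority class

print(len(oversample_minority(background, signal)))    # 1000
print(len(undersample_majority(background, signal)))   # 50
```

Oversampling keeps every majority event at the cost of repeated minority events; undersampling avoids repeats but discards majority data.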
@@ -244,33 +248,3 @@ dl = RDataLoader(rdf,
 for X, y, w in dl.as_torch():
     loss = (loss_fn(model(X), y) * w).mean()
 ~~~
-
-### Eager loading
-
-By default the loader reads data lazily, one chunk of data at a time. For small datasets that fit in memory and will be iterated many times, eager loading pays a one-time cost at construction and then serves every epoch from memory:
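A note on the weighted loss in the context lines of the hunk above: `(loss_fn(model(X), y) * w).mean()` applies per-event weights only if `loss_fn` returns one loss value per sample (in PyTorch, typically a loss constructed with `reduction="none"`). If it already returns a scalar, the expression reduces to that scalar times the mean weight. The arithmetic, sketched in plain Python with a hypothetical helper name:

```python
from typing import Sequence


def weighted_mean_loss(per_sample_loss: Sequence[float], weights: Sequence[float]) -> float:
    """Mean over samples of loss_i * w_i, mirroring (loss * w).mean()."""
    if len(per_sample_loss) != len(weights):
        raise ValueError("per_sample_loss and weights must have the same length")
    total = sum(l * w for l, w in zip(per_sample_loss, weights))
    # Divides by the number of samples, not by the sum of weights.
    return total / len(per_sample_loss)


losses = [0.5, 2.0, 1.0, 0.5]
weights = [1.0, 0.5, 2.0, 1.0]
print(weighted_mean_loss(losses, weights))  # 1.0
```

Because the mean divides by the batch size and not by the sum of the weights, the overall scale of `w` changes the effective learning rate.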