# We wrap the resulting polars dataframe in a `skrub` DataOp to benefit
# from the built-in `skrub.TableReport` display in the notebook. Using the
# `skrub` DataOps will also be useful for other reasons: all
- # operations in this notebook chain operations chained together in a directed
+ # operations in this notebook are chained together in a directed
# acyclic graph that is automatically tracked by `skrub`. This allows us to
# extract the resulting pipeline and apply it to new data later on, exactly
# like a trained scikit-learn pipeline. The main difference is that we do so

#
# In the example below, we define that the training data should be at most 2 years
# worth of data and the test data should be 24 weeks long. We also define a gap of
- # 1 week between the training.
+ # 1 week between the training and the testing sets.
#
# Let's check those statistics by iterating over the different folds provided by the
# splitter.
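
The splitter settings described in this hunk can be sketched as follows. This is only an illustration under stated assumptions: it supposes that the splitter is scikit-learn's `TimeSeriesSplit`, that the data has one row per hour, and that three folds are used. None of these details are visible in the diff, and the synthetic four-year index is made up for the example.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Assumption: one row per hour; the notebook's real sampling rate may differ.
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY

splitter = TimeSeriesSplit(
    n_splits=3,                             # hypothetical number of folds
    max_train_size=2 * 365 * HOURS_PER_DAY,  # at most 2 years of training data
    test_size=24 * HOURS_PER_WEEK,           # 24 weeks of test data
    gap=1 * HOURS_PER_WEEK,                  # 1 week between train and test
)

# Synthetic stand-in for the real dataset: 4 years of hourly rows.
n_hours = 4 * 365 * HOURS_PER_DAY
X = np.arange(n_hours).reshape(-1, 1)

# Iterate over the folds and collect the statistics of each one.
fold_stats = []
for fold, (train_idx, test_idx) in enumerate(splitter.split(X)):
    stats = {
        "train_hours": len(train_idx),
        "test_hours": len(test_idx),
        # Rows strictly between the last training row and the first test row.
        "gap_hours": int(test_idx[0] - train_idx[-1] - 1),
    }
    fold_stats.append(stats)
    print(
        f"fold {fold}: "
        f"train={stats['train_hours'] / HOURS_PER_WEEK:.1f} weeks, "
        f"test={stats['test_hours'] / HOURS_PER_WEEK:.1f} weeks, "
        f"gap={stats['gap_hours'] / HOURS_PER_WEEK:.1f} weeks"
    )
```

Expressing every duration in rows keeps the three parameters consistent with each other; with a different sampling rate, only the two constants at the top would need to change.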

# A true model is navigating between the diagonal and the oracle model. The area between
# the diagonal and the Lorenz curve of a model is called the Gini index.
#
- # For our model, we observe that each oracle model is not far from the diagonal. It
+ # For our use case, we observe that each oracle model is not far from the diagonal. It
# means that the observed values do not contain a couple of large values with high
# variability. Therefore, it informs us that the complexity of our problem at hand is
# not too high. Looking at the Lorenz curve of each model, we observe that it is quite
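
The Lorenz curve and the area discussed in this hunk can be computed with a few lines of NumPy. The sketch below is not the notebook's own implementation: the helper names `lorenz_curve` and `area_to_diagonal` and the toy values are hypothetical. It ranks samples by increasing prediction and accumulates the share of observed values, so the oracle curve is obtained by using the observed values as predictions. Note that some conventions report twice this area as the Gini index.

```python
import numpy as np


def lorenz_curve(observed, predicted):
    """Cumulative share of `observed`, with samples ranked by increasing `predicted`."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    order = np.argsort(predicted, kind="stable")
    cum_observed = np.cumsum(observed[order])
    # Prepend the origin so that the curve starts at (0, 0) and ends at (1, 1).
    cum_share = np.concatenate([[0.0], cum_observed / cum_observed[-1]])
    cum_population = np.linspace(0.0, 1.0, len(cum_share))
    return cum_population, cum_share


def area_to_diagonal(cum_population, cum_share):
    """Area between the diagonal and the Lorenz curve (trapezoidal rule)."""
    distance = cum_population - cum_share
    widths = np.diff(cum_population)
    return float(np.sum((distance[1:] + distance[:-1]) / 2 * widths))


# Toy data, made up for illustration only.
observed = np.array([3.0, 1.0, 4.0, 2.0])
model_predictions = np.array([2.5, 2.0, 3.0, 1.0])

oracle_area = area_to_diagonal(*lorenz_curve(observed, observed))
model_area = area_to_diagonal(*lorenz_curve(observed, model_predictions))
print(f"oracle area: {oracle_area:.3f}, model area: {model_area:.3f}")
```

A small oracle area, as in the situation described in the comment, means the observed values are spread fairly evenly, which leaves little room between the diagonal and the best achievable curve.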