Commit 18a19cf

Sync with Inria repo
2 parents 7b7c428 + 9c0f241 commit 18a19cf

16 files changed

Lines changed: 32 additions & 37 deletions

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ repos:
 exclude: notebooks
 exclude_types: [svg]
 - repo: https://github.com/psf/black
-rev: 23.1.0
+rev: 25.11.0
 hooks:
 - id: black
 - repo: https://github.com/astral-sh/ruff-pre-commit

environment-dev.yml

Lines changed: 1 addition & 1 deletion
@@ -14,4 +14,4 @@ dependencies:
 - packaging
 - pip
 - pip:
-- jupyter-book >= 0.11
+- jupyter-book < 2.0

jupyter-book/_config.yml

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ html:
 <div>
 <div class="mooc_add">
 <a href="https://www.fun-mooc.fr/en/courses/machine-learning-python-scikit-learn">Join the full MOOC experience</a>
-<a href="https://certification.probabl.ai/">Get officially certified!</a>
+<a href="https://probabl.ai/certification?utm_source=inria&utm_medium=mooc&utm_campaign=2026_inria_mooc_referrals">Get officially certified!</a>
 </div>
 Brought to you under a <a href="https://github.com/INRIA/scikit-learn-mooc/blob/main/LICENSE">CC-BY License</a> by
 <a href="https://learninglab.inria.fr">Inria Learning Lab</a>,

notebooks/cross_validation_validation_curve.ipynb

Lines changed: 12 additions & 12 deletions
@@ -259,18 +259,18 @@
 "errors made during the data collection process (besides not measuring the\n",
 "unobserved input feature).\n",
 "\n",
-"One extreme case could happen if there where samples in the dataset with\n",
-"exactly the same input feature values but different values for the target\n",
-"variable. That is very unlikely in real life settings, but could the case if\n",
-"all features are categorical or if the numerical features were discretized\n",
-"or rounded up naively. In our example, we can imagine two houses having\n",
-"the exact same features in our dataset, but having different prices because\n",
-"of the (unmeasured) seller's rush.\n",
-"\n",
-"Apart from these extreme case, it's hard to know for sure what should qualify\n",
-"or not as noise and which kind of \"noise\" as introduced above is dominating.\n",
-"But in practice, the best ways to make our predictive models robust to noise\n",
-"are to avoid overfitting models by:\n",
+"One extreme case could happen if there where samples in the dataset with exactly\n",
+"the same input feature values but different values for the target variable. That\n",
+"is very unlikely in real life settings, but could be the case if all features\n",
+"are categorical or if the numerical features were discretized or rounded up\n",
+"naively. In our example, we can imagine two houses having the exact same\n",
+"features in our dataset, but having different prices because of the (unmeasured)\n",
+"seller's rush.\n",
+"\n",
+"Apart from this extreme case, it's hard to know for sure what should qualify or\n",
+"not as noise and which kind of \"noise\" as introduced above is dominating. But in\n",
+"practice, the best way to make our predictive models robust to noise is to\n",
+"avoid overfitting models by:\n",
 "\n",
 "- selecting models that are simple enough or with tuned hyper-parameters as\n",
 " explained in this module;\n",

notebooks/linear_models_ex_01.ipynb

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@
 "penguins = pd.read_csv(\"../datasets/penguins_regression.csv\")\n",
 "feature_name = \"Flipper Length (mm)\"\n",
 "target_name = \"Body Mass (g)\"\n",
-"data, target = penguins[[feature_name]], penguins[target_name]"
+"data, target = penguins[[feature_name]], penguins[[target_name]]"
 ]
 },
 {

notebooks/linear_models_sol_01.ipynb

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@
 "penguins = pd.read_csv(\"../datasets/penguins_regression.csv\")\n",
 "feature_name = \"Flipper Length (mm)\"\n",
 "target_name = \"Body Mass (g)\"\n",
-"data, target = penguins[[feature_name]], penguins[target_name]"
+"data, target = penguins[[feature_name]], penguins[[target_name]]"
 ]
 },
 {
@@ -153,7 +153,7 @@
 "def goodness_fit_measure(true_values, predictions):\n",
 " # we compute the error between the true values and the predictions of our\n",
 " # model\n",
-" errors = np.ravel(true_values) - np.ravel(predictions)\n",
+" errors = true_values - predictions\n",
 " # We have several possible strategies to reduce all errors to a single value.\n",
 " # Computing the mean error (sum divided by the number of element) might seem\n",
 " # like a good solution. However, we have negative errors that will misleadingly\n",
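
Note on the two hunks above: selecting the target with double brackets yields a one-column DataFrame of shape (n, 1) instead of a Series of shape (n,), which is presumably why the `np.ravel` calls could be dropped. The sketch below illustrates the shape difference on a made-up three-row frame. The data values and the `predictions_2d` stand-in array are assumptions for illustration only; the sketch assumes the model returns predictions with the same (n, 1) shape as the target.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the penguins dataset (values are invented for illustration).
penguins = pd.DataFrame(
    {
        "Flipper Length (mm)": [181.0, 186.0, 195.0],
        "Body Mass (g)": [3750.0, 3800.0, 3250.0],
    }
)
target_name = "Body Mass (g)"

target_1d = penguins[target_name]    # Series, shape (3,)
target_2d = penguins[[target_name]]  # DataFrame, shape (3, 1)

# Hypothetical predictions with shape (3, 1), standing in for model output.
predictions_2d = np.array([[3700.0], [3900.0], [3300.0]])

# Shapes match, so the subtraction is element-wise and no ravel is needed.
errors = target_2d - predictions_2d

# Mixing a 1D target with 2D predictions broadcasts to a (3, 3) matrix,
# which is the kind of silent mistake flattening both inputs guards against.
mismatched = target_1d.to_numpy() - predictions_2d

print(errors.shape, mismatched.shape)
```

Whichever convention is used, the target and the predictions need the same dimensionality before they are subtracted.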

python_scripts/cross_validation_validation_curve.py

Lines changed: 8 additions & 8 deletions
@@ -202,16 +202,16 @@
 #
 # One extreme case could happen if there where samples in the dataset with
 # exactly the same input feature values but different values for the target
-# variable. That is very unlikely in real life settings, but could the case if
-# all features are categorical or if the numerical features were discretized
-# or rounded up naively. In our example, we can imagine two houses having
-# the exact same features in our dataset, but having different prices because
-# of the (unmeasured) seller's rush.
+# variable. That is very unlikely in real life settings, but could be the case
+# if all features are categorical or if the numerical features were discretized
+# or rounded up naively. In our example, we can imagine two houses having the
+# exact same features in our dataset, but having different prices because of the
+# (unmeasured) seller's rush.
 #
-# Apart from these extreme case, it's hard to know for sure what should qualify
+# Apart from this extreme case, it's hard to know for sure what should qualify
 # or not as noise and which kind of "noise" as introduced above is dominating.
-# But in practice, the best ways to make our predictive models robust to noise
-# are to avoid overfitting models by:
+# But in practice, the best way to make our predictive models robust to noise
+# is to avoid overfitting models by:
 #
 # - selecting models that are simple enough or with tuned hyper-parameters as
 # explained in this module;
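
Note on the reworded passage above: the "extreme case" it describes (samples with identical feature values but different targets) can be checked for directly. The following is a minimal sketch on an invented housing table; the column names `rooms`, `area` and `price` and all values are hypothetical and not taken from the course datasets.

```python
import pandas as pd

# Invented data: the first two houses share all features but differ in price.
houses = pd.DataFrame(
    {
        "rooms": [3, 3, 4],
        "area": [80, 80, 120],
        "price": [200, 230, 310],
    }
)
feature_cols = ["rooms", "area"]

# Count the distinct target values observed for each feature combination.
n_targets = houses.groupby(feature_cols)["price"].nunique()

# Combinations with more than one distinct target cannot be fit perfectly
# by any deterministic function of these features.
conflicting = n_targets[n_targets > 1]
print(conflicting)
```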

python_scripts/datasets_ames_housing.py

Lines changed: 0 additions & 1 deletion
@@ -169,7 +169,6 @@
 from sklearn.impute import SimpleImputer
 from sklearn.pipeline import make_pipeline
 
-
 numerical_features = [
 "LotFrontage",
 "LotArea",

python_scripts/ensemble_bagging.py

Lines changed: 0 additions & 1 deletion
@@ -356,7 +356,6 @@ def bootstrap_sample(data, target, seed=0):
 from sklearn.preprocessing import MinMaxScaler
 from sklearn.pipeline import make_pipeline
 
-
 polynomial_regressor = make_pipeline(
 MinMaxScaler(),
 PolynomialFeatures(degree=4, include_bias=False),

python_scripts/feature_selection_introduction.py

Lines changed: 0 additions & 1 deletion
@@ -57,7 +57,6 @@
 from sklearn.feature_selection import f_classif
 from sklearn.pipeline import make_pipeline
 
-
 model_with_selection = make_pipeline(
 SelectKBest(score_func=f_classif, k=2),
 RandomForestClassifier(n_jobs=2),