Commit a3ea532

fixing jupyterlite and updating readme
1 parent d582dd1 commit a3ea532

14 files changed

Lines changed: 15468 additions & 116 deletions

README.md

Lines changed: 18 additions & 19 deletions
@@ -4,34 +4,33 @@ This is the website for the
 on the [skrub package](https://skrub-data.org/stable/): it contains all the material
 used for the course, including the datasets and exercises used during the session.
 
-## Beta warning
-If you are reading this, then you will be attending the **Beta version** of this
-course. As a **Beta version**, this is not the final version of the course and
-it will be tweaked according to the feedback provided after the session.
-
-Both the presentation and the content of the book are liable to be changed based
-on feedback.
-
 ## Structure of the course
 The course covers the main features of skrub, from data exploration to pipeline
-construction, with the notable exclusion of the Data Ops.
+construction. While skrub DataOps are a major feature of the package, they are
+also expansive enough to deserve their own course, and as such only a short introduction
+will be given here.
 
 Each chapter includes a section that describes how a specific feature may assist
-in building a machine learning pipeline, along with practical code examples.
+in building a machine learning pipeline, along with practical code examples, and
+a quiz at the end.
 
-Some chapters include exercises for participants to work with the explained features.
-These exercises are made available in `content/exercises`, as well as at the end
-of the respective lesson in `content/notebooks`.
-
-The content of the book is split in sections, and each section includes a "final
-quiz" that covers the subjects covered up to that point.
+The course is split in sections, which group relevant material together. Each
+section is wrapped up by an exercise that covers what has been explained in the
+section.
+These exercises are made available in `content/exercises` as `py` files, and
+in `content/notebooks` as Jupyter notebooks.
 
 # Prepration and setup
-First of all, clone the [GitHub repo](https://github.com/skrub-data/skrub-tutorials/tree/main)
-of this book to have access to the exercises. In a future version, Jupyterlite
-will be made available.
+
+## Using Jupyterlite
+The easiest way to work on the exercises is simply by using Jupyterlite: this
+will create a notebook interface directly from the browser that allows to run the
+exercises without needing to create a local environment.
 
 ## Setting up a local environment
+If you still want to work locally (for example, if you want to use your own IDE),
+you can still do so by cloning the [GitHub repo](https://github.com/skrub-data/skrub-tutorials/tree/main)
+of this book to have access to the exercises.
 
 ### Finding the material
 Following any of the following commands should let you open a Jupyter lab or

content/exercises/01_ex_explore_clean.py

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@
 # Now use the skrub `TableReport` and answer the following questions:
 
 # %%
+%pip install skrub
 from skrub import TableReport
 
 TableReport(data)

content/exercises/02_ex_selectors.py

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,8 @@
 # # Exercise: using selectors together with `ApplyToCols`
 # Consider this example dataframe:
 
+# %%
+%pip install skrub
 # %%
 import pandas as pd
 
@@ -26,6 +28,7 @@
 # `"str_id"`.
 
 # %%
+%pip install skrub
 import skrub.selectors as s
 from sklearn.preprocessing import StandardScaler, OneHotEncoder
 from skrub import ApplyToCols

content/exercises/03_ex_feat_eng.py

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@
 # **Hint**: use the format `"%d %B %Y"` for the datetime.
 #
 
+# %%
+%pip install skrub
+
 # %%
 import pandas as pd
 
content/exercises/04_ex_table_vec.py

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,8 @@
 # Replicate the behavior of a `TableVectorizer` using `ApplyToCols`, the skrub
 # selectors, and the given transformers.
 
+# %%
+%pip install skrub
 # %%
 from skrub import Cleaner, ApplyToCols, StringEncoder, DatetimeEncoder
 from sklearn.preprocessing import OneHotEncoder
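
A note on the `%pip install skrub` lines added in the exercise files: `%pip` is an IPython magic, so it runs in Jupyter or Jupyterlite but is a syntax error if an exercise file is executed with plain `python`. A minimal sketch of a plain-Python guard for local use follows. `ensure_package` is a hypothetical helper, not part of skrub or of this commit, and the default installer assumes `pip` is available for the current interpreter.

```python
import importlib.util
import subprocess
import sys


def ensure_package(name, installer=None):
    """Install `name` only if it cannot be imported.

    Returns True if the installer was called, False otherwise.
    `installer` is a hypothetical hook taking the package name; by default
    it shells out to pip for the running interpreter (local use only).
    """
    if importlib.util.find_spec(name) is not None:
        return False  # already importable, nothing to do

    if installer is None:
        def installer(pkg):
            subprocess.check_call([sys.executable, "-m", "pip", "install", pkg])

    installer(name)
    return True


# "json" ships with Python, so no install is attempted here.
already_there = ensure_package("json")
```

In an exercise file this would be called as `ensure_package("skrub")`. With a guard like this, a repeated install cell (as in the two hunks of `02_ex_selectors.py`) becomes a no-op once the package is present. The `subprocess` fallback is only meant for a local environment; in Jupyterlite the `%pip` magic remains the natural choice.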
