Commit f47f268

small updates
1 parent f2c0ec0 commit f47f268

6 files changed

Lines changed: 56 additions & 58 deletions

content/exercises/01_ex_explore_clean.ipynb

Lines changed: 15 additions & 16 deletions
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "c8b7476a",
+"id": "e049e39d",
 "metadata": {},
 "source": [
 "# Exercise: exploring a new table\n",
@@ -15,7 +15,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "10dc3dd9",
+"id": "5a34e0a7",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -26,7 +26,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "b95a5029",
+"id": "f01d2a92",
 "metadata": {},
 "source": [
 "Now use the skrub `TableReport` and answer the following questions:"
@@ -35,7 +35,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "a205456e",
+"id": "279facf2",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -47,7 +47,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "23c471c3",
+"id": "b41135ea",
 "metadata": {},
 "source": [
 "## Questions\n",
@@ -60,7 +60,6 @@
 "- Which columns have an imbalanced distribution?\n",
 "- Which columns are strongly correlated with each other?\n",
 "\n",
-"```{.python}\n",
 "# PLACEHOLDER\n",
 "#\n",
 "#\n",
@@ -71,8 +70,8 @@
 "#\n",
 "#\n",
 "#\n",
-"```\n",
 "\n",
+"# %% [markdown]\n",
 "## Answers\n",
 "- What's the size of the dataframe? (columns and rows)\n",
 " - 9228 rows × 8 columns\n",
@@ -98,7 +97,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "f20bde70",
+"id": "cd729648",
 "metadata": {},
 "source": [
 "# Exercise: clean a dataframe using the `Cleaner`\n",
@@ -108,7 +107,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "1a512d31",
+"id": "5bf185b2",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -119,7 +118,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "2d8454f4",
+"id": "29c88f0c",
 "metadata": {},
 "source": [
 "Use the `TableReport` to answer the following questions:\n",
@@ -132,7 +131,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "50244f15",
+"id": "c3a0807e",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -143,7 +142,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "03dcbdcb",
+"id": "15005ad9",
 "metadata": {},
 "source": [
 "Then, use the `Cleaner` to sanitize the data so that:\n",
@@ -156,7 +155,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "e78ad1a3",
+"id": "2add235b",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -176,7 +175,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "f7370994",
+"id": "f1f97ec9",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -199,7 +198,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "627265cd",
+"id": "72a1d694",
 "metadata": {},
 "source": [
 "We can inspect which columns were dropped and what transformations were applied:"
@@ -208,7 +207,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "eb157043",
+"id": "a94cbf09",
 "metadata": {},
 "outputs": [],
 "source": [

content/exercises/01_ex_explore_clean.py

Lines changed: 2 additions & 3 deletions
@@ -30,7 +30,6 @@
 # - Which columns have an imbalanced distribution?
 # - Which columns are strongly correlated with each other?
 #
-# ```{.python}
 # # PLACEHOLDER
 # #
 # #
@@ -41,8 +40,8 @@
 # #
 # #
 # #
-# ```
-#
+#
+# # %% [markdown]
 # ## Answers
 # - What's the size of the dataframe? (columns and rows)
 # - 9228 rows × 8 columns

content/exercises/02_ex_selectors.ipynb

Lines changed: 13 additions & 13 deletions
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "d986e59e",
+"id": "64cbdacf",
 "metadata": {},
 "source": [
 "# Exercise: using selectors together with `ApplyToCols`\n",
@@ -12,7 +12,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "5053fabc",
+"id": "97d637b0",
 "metadata": {
 "lines_to_next_cell": 0
 },
@@ -24,7 +24,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "ba2624c3",
+"id": "f2ea076c",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -46,7 +46,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "31811f22",
+"id": "66804095",
 "metadata": {},
 "source": [
 "Using the skrub selectors and `ApplyToCols`:\n",
@@ -59,7 +59,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "fadcb2e8",
+"id": "1fb49ecd",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -84,7 +84,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "6be168d7",
+"id": "dc056e23",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -103,7 +103,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "df81b2a5",
+"id": "6ee9b01e",
 "metadata": {},
 "source": [
 "Given the same dataframe and using selectors, drop only string columns that contain\n",
@@ -113,7 +113,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "c72984a5",
+"id": "671dbeba",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -132,7 +132,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "22bf8031",
+"id": "d1075488",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -143,7 +143,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "3a7f1af8",
+"id": "86430e36",
 "metadata": {},
 "source": [
 "Now write a custom function that selects columns where all values are lower than\n",
@@ -153,7 +153,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "ed9e22db",
+"id": "5e3efdf7",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -172,7 +172,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "84bdfd4b",
+"id": "0f96a4e3",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -189,7 +189,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "e9219bd9",
+"id": "7674715a",
 "metadata": {},
 "outputs": [],
 "source": []

content/exercises/03_ex_feat_eng.ipynb

Lines changed: 11 additions & 11 deletions
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "f76fb6d0",
+"id": "f55abcb2",
 "metadata": {},
 "source": [
 "# Exercise\n",
@@ -19,7 +19,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "7fb3b7ef",
+"id": "a326c03a",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -29,7 +29,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "021e2515",
+"id": "08c44fa2",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -53,7 +53,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "29f082fe",
+"id": "839700b0",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -77,7 +77,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "b9ede08c",
+"id": "65918722",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -100,7 +100,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "fb275e7c",
+"id": "4c8b4794",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -120,7 +120,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "290081f0",
+"id": "bf0c61a4",
 "metadata": {},
 "source": [
 "Modify the script so that the `DatetimeEncoder` adds periodic encoding with sine\n",
@@ -130,7 +130,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "0c4b22e4",
+"id": "dbb9934b",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -153,7 +153,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "28e1bb62",
+"id": "94e6e6be",
 "metadata": {},
 "source": [
 "Now modify the script above to add spline features (`periodic_encoding=\"spline\"`).\n"
@@ -162,7 +162,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "df44d5e7",
+"id": "bbed03e7",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -188,7 +188,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "3d6d4cae",
+"id": "4925a0b9",
 "metadata": {},
 "outputs": [],
 "source": []
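
Note on the `03_ex_feat_eng.ipynb` exercise text above: it asks for sine-based periodic encoding from the `DatetimeEncoder`. For readers of this diff, here is a minimal standard-library sketch of what such an encoding computes in general. It is not skrub's implementation; the helper names `circular_encode` and `circular_distance` are hypothetical and exist only for illustration.

```python
import math


def circular_encode(value, period):
    """Map a periodic value (e.g. hour of day) to a (sin, cos) pair.

    Hypothetical helper for illustration; not part of skrub.
    """
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circular_distance(a, b, period):
    """Euclidean distance between two values after circular encoding."""
    sin_a, cos_a = circular_encode(a, period)
    sin_b, cos_b = circular_encode(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# Hours 23 and 0 are adjacent on the clock but 23 apart as raw integers.
# After encoding, they end up closer together than hours 12 and 0.
near = circular_distance(23, 0, period=24)
far = circular_distance(12, 0, period=24)
print(near < far)
```

The point of the sketch is that the two encoded features place the end of a cycle next to its start, which a single integer feature cannot express.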
