5 files changed +62 -55 lines changed
Original file line number Diff line number Diff line change
22 "cells" : [
33 {
44 "cell_type" : " markdown" ,
5- "id" : " e049e39d " ,
5+ "id" : " b5dcbf2f " ,
66 "metadata" : {},
77 "source" : [
88 " # Exercise: exploring a new table\n " ,
1515 {
1616 "cell_type" : " code" ,
1717 "execution_count" : null ,
18- "id" : " 5a34e0a7 " ,
18+ "id" : " a39de6a3 " ,
1919 "metadata" : {},
2020 "outputs" : [],
2121 "source" : [
2626 },
2727 {
2828 "cell_type" : " markdown" ,
29- "id" : " f01d2a92 " ,
29+ "id" : " 6bbb0310 " ,
3030 "metadata" : {},
3131 "source" : [
3232 " Now use the skrub `TableReport` and answer the following questions:"
3535 {
3636 "cell_type" : " code" ,
3737 "execution_count" : null ,
38- "id" : " 279facf2 " ,
38+ "id" : " 04295734 " ,
3939 "metadata" : {},
4040 "outputs" : [],
4141 "source" : [
4747 },
4848 {
4949 "cell_type" : " markdown" ,
50- "id" : " b41135ea" ,
51- "metadata" : {},
50+ "id" : " 61f7edb6" ,
51+ "metadata" : {
52+ "lines_to_next_cell" : 0
53+ },
5254 "source" : [
5355 " ## Questions\n " ,
5456 " - What's the size of the dataframe? (columns and rows)\n " ,
6971 " #\n " ,
7072 " #\n " ,
7173 " #\n " ,
72- " #\n " ,
73- " \n " ,
74- " # %% [markdown]\n " ,
74+ " #\n "
75+ ]
76+ },
77+ {
78+ "cell_type" : " markdown" ,
79+ "id" : " 7ba6e8d2" ,
80+ "metadata" : {},
81+ "source" : [
7582 " ## Answers\n " ,
7683 " - What's the size of the dataframe? (columns and rows)\n " ,
7784 " - 9228 rows × 8 columns\n " ,
97104 },
98105 {
99106 "cell_type" : " markdown" ,
100- "id" : " cd729648 " ,
107+ "id" : " b5d2a481 " ,
101108 "metadata" : {},
102109 "source" : [
103110 " # Exercise: clean a dataframe using the `Cleaner`\n " ,
107114 {
108115 "cell_type" : " code" ,
109116 "execution_count" : null ,
110- "id" : " 5bf185b2 " ,
117+ "id" : " 6e6b0d18 " ,
111118 "metadata" : {},
112119 "outputs" : [],
113120 "source" : [
118125 },
119126 {
120127 "cell_type" : " markdown" ,
121- "id" : " 29c88f0c " ,
128+ "id" : " cf0b8964 " ,
122129 "metadata" : {},
123130 "source" : [
124131 " Use the `TableReport` to answer the following questions:\n " ,
131138 {
132139 "cell_type" : " code" ,
133140 "execution_count" : null ,
134- "id" : " c3a0807e " ,
141+ "id" : " e8d0ec23 " ,
135142 "metadata" : {},
136143 "outputs" : [],
137144 "source" : [
142149 },
143150 {
144151 "cell_type" : " markdown" ,
145- "id" : " 15005ad9 " ,
152+ "id" : " c8f51c8b " ,
146153 "metadata" : {},
147154 "source" : [
148155 " Then, use the `Cleaner` to sanitize the data so that:\n " ,
155162 {
156163 "cell_type" : " code" ,
157164 "execution_count" : null ,
158- "id" : " 2add235b " ,
165+ "id" : " 4fb2a1ba " ,
159166 "metadata" : {},
160167 "outputs" : [],
161168 "source" : [
175182 {
176183 "cell_type" : " code" ,
177184 "execution_count" : null ,
178- "id" : " f1f97ec9 " ,
185+ "id" : " 0c3c78dc " ,
179186 "metadata" : {},
180187 "outputs" : [],
181188 "source" : [
198205 },
199206 {
200207 "cell_type" : " markdown" ,
201- "id" : " 72a1d694 " ,
208+ "id" : " 8a1c1797 " ,
202209 "metadata" : {},
203210 "source" : [
204211 " We can inspect which columns were dropped and what transformations were applied:"
207214 {
208215 "cell_type" : " code" ,
209216 "execution_count" : null ,
210- "id" : " a94cbf09 " ,
217+ "id" : " d8d13e9a " ,
211218 "metadata" : {},
212219 "outputs" : [],
213220 "source" : [
Original file line number Diff line number Diff line change
4141# #
4242# #
4343#
44- # # %% [markdown]
44+ # %% [markdown]
4545# ## Answers
4646# - What's the size of the dataframe? (columns and rows)
4747# - 9228 rows × 8 columns
Original file line number Diff line number Diff line change
22 "cells" : [
33 {
44 "cell_type" : " markdown" ,
5- "id" : " 64cbdacf " ,
5+ "id" : " 0a1656ab " ,
66 "metadata" : {},
77 "source" : [
88 " # Exercise: using selectors together with `ApplyToCols`\n " ,
1212 {
1313 "cell_type" : " code" ,
1414 "execution_count" : null ,
15- "id" : " 97d637b0 " ,
15+ "id" : " 94bc1caa " ,
1616 "metadata" : {
1717 "lines_to_next_cell" : 0
1818 },
2424 {
2525 "cell_type" : " code" ,
2626 "execution_count" : null ,
27- "id" : " f2ea076c " ,
27+ "id" : " 743b063a " ,
2828 "metadata" : {},
2929 "outputs" : [],
3030 "source" : [
4646 },
4747 {
4848 "cell_type" : " markdown" ,
49- "id" : " 66804095 " ,
49+ "id" : " 4da47e99 " ,
5050 "metadata" : {},
5151 "source" : [
5252 " Using the skrub selectors and `ApplyToCols`:\n " ,
5959 {
6060 "cell_type" : " code" ,
6161 "execution_count" : null ,
62- "id" : " 1fb49ecd " ,
62+ "id" : " 190aa32e " ,
6363 "metadata" : {},
6464 "outputs" : [],
6565 "source" : [
8484 {
8585 "cell_type" : " code" ,
8686 "execution_count" : null ,
87- "id" : " dc056e23 " ,
87+ "id" : " b9c6b4ee " ,
8888 "metadata" : {},
8989 "outputs" : [],
9090 "source" : [
103103 },
104104 {
105105 "cell_type" : " markdown" ,
106- "id" : " 6ee9b01e " ,
106+ "id" : " 6e321108 " ,
107107 "metadata" : {},
108108 "source" : [
109109 " Given the same dataframe and using selectors, drop only string columns that contain\n " ,
113113 {
114114 "cell_type" : " code" ,
115115 "execution_count" : null ,
116- "id" : " 671dbeba " ,
116+ "id" : " 81750237 " ,
117117 "metadata" : {},
118118 "outputs" : [],
119119 "source" : [
132132 {
133133 "cell_type" : " code" ,
134134 "execution_count" : null ,
135- "id" : " d1075488 " ,
135+ "id" : " f0721941 " ,
136136 "metadata" : {},
137137 "outputs" : [],
138138 "source" : [
143143 },
144144 {
145145 "cell_type" : " markdown" ,
146- "id" : " 86430e36 " ,
146+ "id" : " d3f434c8 " ,
147147 "metadata" : {},
148148 "source" : [
149149 " Now write a custom function that selects columns where all values are lower than\n " ,
153153 {
154154 "cell_type" : " code" ,
155155 "execution_count" : null ,
156- "id" : " 5e3efdf7 " ,
156+ "id" : " de74b276 " ,
157157 "metadata" : {},
158158 "outputs" : [],
159159 "source" : [
172172 {
173173 "cell_type" : " code" ,
174174 "execution_count" : null ,
175- "id" : " 0f96a4e3 " ,
175+ "id" : " 333aaa5f " ,
176176 "metadata" : {},
177177 "outputs" : [],
178178 "source" : [
189189 {
190190 "cell_type" : " code" ,
191191 "execution_count" : null ,
192- "id" : " 7674715a " ,
192+ "id" : " d5751f7a " ,
193193 "metadata" : {},
194194 "outputs" : [],
195195 "source" : []
Original file line number Diff line number Diff line change
22 "cells" : [
33 {
44 "cell_type" : " markdown" ,
5- "id" : " f55abcb2 " ,
5+ "id" : " 564b4538 " ,
66 "metadata" : {},
77 "source" : [
88 " # Exercise\n " ,
1919 {
2020 "cell_type" : " code" ,
2121 "execution_count" : null ,
22- "id" : " a326c03a " ,
22+ "id" : " 0bf5f727 " ,
2323 "metadata" : {},
2424 "outputs" : [],
2525 "source" : [
2929 {
3030 "cell_type" : " code" ,
3131 "execution_count" : null ,
32- "id" : " 08c44fa2 " ,
32+ "id" : " 0fa39f1b " ,
3333 "metadata" : {},
3434 "outputs" : [],
3535 "source" : [
5353 {
5454 "cell_type" : " code" ,
5555 "execution_count" : null ,
56- "id" : " 839700b0 " ,
56+ "id" : " ee0af0a3 " ,
5757 "metadata" : {},
5858 "outputs" : [],
5959 "source" : [
7777 {
7878 "cell_type" : " code" ,
7979 "execution_count" : null ,
80- "id" : " 65918722 " ,
80+ "id" : " 17c31bf7 " ,
8181 "metadata" : {},
8282 "outputs" : [],
8383 "source" : [
100100 {
101101 "cell_type" : " code" ,
102102 "execution_count" : null ,
103- "id" : " 4c8b4794 " ,
103+ "id" : " e27be910 " ,
104104 "metadata" : {},
105105 "outputs" : [],
106106 "source" : [
120120 },
121121 {
122122 "cell_type" : " markdown" ,
123- "id" : " bf0c61a4 " ,
123+ "id" : " 145332dd " ,
124124 "metadata" : {},
125125 "source" : [
126126 " Modify the script so that the `DatetimeEncoder` adds periodic encoding with sine\n " ,
130130 {
131131 "cell_type" : " code" ,
132132 "execution_count" : null ,
133- "id" : " dbb9934b " ,
133+ "id" : " 9340e84c " ,
134134 "metadata" : {},
135135 "outputs" : [],
136136 "source" : [
153153 },
154154 {
155155 "cell_type" : " markdown" ,
156- "id" : " 94e6e6be " ,
156+ "id" : " 3ee9e75a " ,
157157 "metadata" : {},
158158 "source" : [
159159 " Now modify the script above to add spline features (`periodic_encoding=\" spline\" `).\n "
162162 {
163163 "cell_type" : " code" ,
164164 "execution_count" : null ,
165- "id" : " bbed03e7 " ,
165+ "id" : " 584444a9 " ,
166166 "metadata" : {},
167167 "outputs" : [],
168168 "source" : [
188188 {
189189 "cell_type" : " code" ,
190190 "execution_count" : null ,
191- "id" : " 4925a0b9 " ,
191+ "id" : " 36cfe30a " ,
192192 "metadata" : {},
193193 "outputs" : [],
194194 "source" : []