1-introduction.md
Lines changed: 8 additions & 4 deletions
@@ -133,7 +133,7 @@ In most neural networks, neurons are aggregated into layers. Signals travel from
 The image below shows an example of a neural network with three layers: each circle is a neuron, each line is an edge, and the arrows indicate the direction data moves in.

 ![
-Image credit: Glosser.ca, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons,
+Image credit: Glosser.ca, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons,
@@ -302,19 +302,21 @@ This is a good time for switching instructor and/or a break.
 ### Keras for neural networks

 Keras is a machine learning framework with ease of use as one of its main features.
-It is part of the tensorflow python package and can be imported using `from tensorflow import keras`.
+It is a standalone python package that supports multiple deep learning frameworks as backends, and it can be imported using `import keras`.
+Here, we will use Keras with the PyTorch backend.

 Keras includes functions, classes and definitions to define deep learning models, cost functions and optimizers (optimizers are used to train a model).

 Before we move on to the next section of the workflow we need to make sure we have Keras imported.
 We do this as follows:
 ```python
-from tensorflow import keras
+import keras
 ```

 For this episode it is useful if everyone gets the same results from their training.
 Keras uses a random number generator at certain points during its execution.
-Therefore we will need to set two random seeds, one for numpy and one for tensorflow:
+Therefore, we will need to set two random seeds: one for NumPy and one for PyTorch.
+We can use a built-in Keras function to achieve this in one line of code:
 ```python
 keras.utils.set_random_seed(2)
 ```
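The hunk above moves the lesson to standalone Keras with the PyTorch backend. As a minimal sketch of what that involves: Keras 3 reads the `KERAS_BACKEND` environment variable when `keras` is first imported, so it has to be set beforehand. The `try`/`except` guard is an assumption added here so that the snippet also runs where Keras or PyTorch is missing; it is not part of the lesson text.

```python
import os

# Must happen before the first `import keras`: the backend is fixed at import time.
os.environ["KERAS_BACKEND"] = "torch"

try:
    import keras

    # Seeds Python's `random`, NumPy and the active backend in one call.
    keras.utils.set_random_seed(2)
    print("Keras backend:", keras.backend.backend())
except ImportError:
    # Assumption: degrade gracefully when Keras or PyTorch is not installed.
    print("Keras (or its PyTorch backend) is not installed here")
```

Setting the variable inside the notebook, rather than in the shell, keeps the choice of backend visible next to the import it affects.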
@@ -348,7 +350,7 @@ and outputs a layer needs and therefore how many edges need to be created.
 This means we need to inform Keras how big our input is going to be. We do this by instantiating a `keras.Input` class and telling it how big our input is, that is, the number of columns it contains.

 ```python
-inputs = keras.Input(shape=(X_train.shape[1],))
+inputs = keras.Input(shape=(x_train.shape[1],))
 ```

 We store a reference to this input class in a variable so we can pass it to the creation of
@@ -369,7 +371,7 @@ for inputs that are 0 and below and the identity function (returning the same va
 for inputs above 0.
 This is a commonly used activation function in deep neural networks that is proven to work well.

-Next we see an extra set of parentheses with inputs in them. This means that after creating an
+Next we see an extra set of parentheses with `inputs` in them. This means that after creating an
 instance of the Dense layer we call it as if it was a function.
 This tells the Dense layer to connect the layer passed as a parameter, in this case the inputs.
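The two ideas in the hunk above, the ReLU activation and calling a layer instance as if it were a function, can be illustrated in plain Python. This is only an analogy: `TinyDense` and its weights are hypothetical and not part of Keras, and a real Keras layer called on `keras.Input` connects symbolic tensors rather than computing numbers directly.

```python
def relu(x):
    """Return 0 for inputs that are 0 and below, and x itself for inputs above 0."""
    return x if x > 0 else 0.0


class TinyDense:
    """Hypothetical stand-in for a dense layer: one weight row and one bias per neuron."""

    def __init__(self, weights, biases, activation):
        self.weights = weights
        self.biases = biases
        self.activation = activation

    def __call__(self, inputs):
        # Defining __call__ is what lets an instance be used like a function.
        outputs = []
        for row, bias in zip(self.weights, self.biases):
            total = sum(w * x for w, x in zip(row, inputs)) + bias
            outputs.append(self.activation(total))
        return outputs


layer = TinyDense(weights=[[1.0, -1.0], [-2.0, 0.5]], biases=[0.0, 0.0], activation=relu)
hidden = layer([3.0, 1.0])  # create the instance first, then call it on the inputs
print(hidden)  # [2.0, 0.0]
```

The second neuron's weighted sum is negative (-5.5), so ReLU clips it to 0, while the first neuron's positive sum passes through unchanged.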
 Because we chose the one-hot encoding, we use three neurons for the output layer.

-The `softmax` activation ensures that the three output neurons produce values in the range
+The [`softmax`](https://keras.io/api/layers/activations/#softmax-function) activation ensures that the three output neurons produce values in the range
 (0, 1) and they sum to 1.
 We can interpret this as a kind of 'probability' that the sample belongs to a certain
 species.
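To make the softmax property concrete, here is a small standard-library sketch, not the Keras implementation. The three raw scores are made-up values standing in for the output neurons of one sample.

```python
import math


def softmax(scores):
    """Map raw scores to positive values that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow in exp().
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the three output neurons for one sample.
probabilities = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probabilities])  # [0.659, 0.242, 0.099]
```

The largest score gets the largest share, and every value stays strictly between 0 and 1, which is what allows the outputs to be read as a kind of probability per species.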
@@ -403,10 +405,10 @@ Keras distinguishes between two types of weights, namely:

 - trainable parameters: these are weights of the neurons that are modified when we train the model in order to minimize our loss function (we will learn about loss functions shortly!).

-- non-trainable parameters: these are weights of the neurons that are not changed when we train the model. These could be for many reasons - using a pre-trained model, choice of a particular filter for a convolutional neural network, and statistical weights for batch normalization are some examples.
+- non-trainable parameters: these are weights of the neurons that are not changed when we train the model. These could be for many reasons - using a pre-trained model, choice of a particular filter for a convolutional neural network, and statistical weights for batch normalization are some examples.

 If these reasons are not clear right away, don't worry! In later episodes of this course, we will touch upon a couple of these concepts.
-:::
+:::


 ::: instructor
@@ -483,9 +485,9 @@ Model: "functional"
 Non-trainable params: 0 (0.00 B)

 ```
-The model has 83 trainable parameters. Each of the 10 neurons in the `dense` hidden layer is connected to each of
-the 4 inputs in the input layer resulting in 40 weights that can be trained. The 10 neurons in the hidden layer are also
-connected to each of the 3 outputs in the `dense_1` output layer, resulting in a further 30 weights that can be trained.
+The model has 83 trainable parameters. Each of the 10 neurons in the `dense` hidden layer is connected to each of
+the 4 inputs in the input layer resulting in 40 weights that can be trained. The 10 neurons in the hidden layer are also
+connected to each of the 3 outputs in the `dense_1` output layer, resulting in a further 30 weights that can be trained.
 By default `Dense` layers in Keras also contain 1 bias term for each neuron, resulting in a further 10 bias values for the
 hidden layer and 3 bias terms for the output layer. `40+30+10+3=83` trainable parameters.

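The parameter arithmetic in this hunk can be checked with a few lines of Python. `dense_param_count` is a hypothetical helper written for this illustration, not a Keras function; it only encodes the rule stated above (one weight per edge, plus one bias per neuron).

```python
def dense_param_count(n_inputs, n_neurons, use_bias=True):
    """Count the trainable parameters of a fully connected layer."""
    weights = n_inputs * n_neurons         # one weight per input-to-neuron edge
    biases = n_neurons if use_bias else 0  # one bias term per neuron
    return weights + biases


hidden = dense_param_count(n_inputs=4, n_neurons=10)  # 40 weights + 10 biases
output = dense_param_count(n_inputs=10, n_neurons=3)  # 30 weights + 3 biases
total = hidden + output
print(hidden, output, total)  # 50 33 83
```

The total matches the 83 trainable parameters reported in the model summary.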
@@ -524,7 +526,7 @@ So in total 8 extra parameters.
 ```python
 model = keras.Sequential(
     [
-        keras.Input(shape=(X_train.shape[1],)),
+        keras.Input(shape=(x_train.shape[1],)),
         keras.layers.Dense(10, activation="relu"),
         keras.layers.Dense(3, activation="softmax"),
     ]
@@ -571,13 +573,13 @@ This is a measure for how close the distribution of the three neural network out
 It is lower if the distributions are more similar.

 For more information on the available loss functions in Keras you can check the
Note for MacOS users: there is a package `tensorflow-metal` which accelerates the training of machine learning models with TensorFlow on a recent Mac with a Silicon chip (M1/M2/M3).
-However, the installation is currently broken in the most recent version (as of January 2025), see the [developer forum](https://developer.apple.com/forums/thread/772147).
Note: Tensorflow makes Keras available as a module too.
-
-An [optional challenge in episode 2](episodes/2-keras.md) requires installation of Graphviz
-and instructions for doing that can be found
-[by following this link](https://graphviz.org/download/).
+An [optional challenge in episode 2](episodes/2-keras.md) requires installation of Graphviz. Instructions for doing that can be found [by following this link](https://graphviz.org/download/).

 ## Starting Jupyter Lab

@@ -108,7 +100,7 @@ Jupyter Lab is compatible with Firefox, Chrome, Safari and Chromium-based browse
 Note that Internet Explorer and Edge are *not* supported.
 See the [Jupyter Lab documentation](https://jupyterlab.readthedocs.io/en/latest/getting_started/accessibility.html#compatibility-with-browsers-and-assistive-technology) for an up-to-date list of supported browsers.

-To start Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows),
+To start Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows),
 make sure that you activated the virtual environment you created for this course,
 and type the command:

@@ -121,31 +113,38 @@ To check whether all packages installed correctly, start a jupyter notebook in j
 explained above. Run the following lines of code:
 ```python
 import sklearn
-print('sklearn version: ', sklearn.__version__)
+print(f'Sklearn version: {sklearn.__version__}')

 import seaborn
-print('seaborn version: ', seaborn.__version__)
+print(f'Seaborn version: {seaborn.__version__}')

 import pandas
-print('pandas version: ', pandas.__version__)
+print(f'Pandas version: {pandas.__version__}')
+
+import torch
+print(f'PyTorch version: {torch.__version__}')
+
+# Note: Before importing Keras, we have to instruct it to use PyTorch as the backend.
This should output the versions of all required packages without giving errors.
 Most versions will work fine with this lesson, but:
-- For Keras and Tensorflow, the minimum version is 2.12.0
+- For Keras, the minimum version is 2.12.0
 - For sklearn, the minimum version is 1.2.2

 ## Fallback option: cloud environment
 If a local installation does not work for you, it is also possible to run this lesson in [Binder Hub](https://mybinder.org/v2/gh/carpentries-lab/deep-learning-intro/scaffolds). This should give you an environment with all the required software and data to run this lesson. Nothing you save there will be stored, so please copy any files you want to keep. Note that if you are the first person to launch this in the last few days, it can take several minutes to start up. The second person who loads it should find it loads in under a minute. Instructors who intend to use this option should start it themselves shortly before the workshop begins.

-Alternatively you can use [Google colab](https://colab.research.google.com/). If you open a jupyter notebook here, the required packages are already pre-installed. Note that google colab uses jupyter notebook instead of Jupyter Lab.
+Alternatively you can use [Google Colab](https://colab.research.google.com/). If you open a jupyter notebook here, the required packages are already pre-installed. Note that Google Colab uses jupyter notebook instead of Jupyter Lab.

 ## Downloading the required datasets

-Download the [weather dataset prediction csv][weatherdata] and [Dollar street dataset (4 files in total)][dollar-street]
+Download the [Weather dataset prediction csv][weatherdata] and [Dollar street dataset (4 files in total)][dollar-street]