|
15 | 15 | "id": "1", |
16 | 16 | "metadata": {}, |
17 | 17 | "source": [ |
18 | | - "[TOC]" |
| 18 | + "## Table of Contents\n", |
| 19 | + "- [References](#References)\n", |
| 20 | + "- [Inspecting the data](#Inspecting-the-data)\n", |
| 21 | + "- [Bigram language model](#Bigram-language-model)\n", |
| 22 | + " - [Evaluating the quality of the model](#Evaluating-the-quality-of-the-model)\n", |
| 23 | + "- [A neural network approach](#A-neural-network-approach)\n", |
| 24 | + " - [The training set](#The-training-set)\n", |
| 25 | + " - [Feeding the network](#Feeding-the-network)\n", |
| 26 | + " - [Regaining a normal distribution](#Regaining-a-normal-distribution)\n", |
| 27 | + " - [Recap: How the Neural Network Processes Input Characters](#Recap:-How-the-Neural-Network-Processes-Input-Characters)\n", |
| 28 | + " - [Optimization](#Optimization)\n", |
| 29 | + " - [Putting it all together](#Putting-it-all-together)\n", |
| 30 | + " - [Preparing data](#Preparing-data)\n", |
| 31 | + " - [Initializing the neural network](#Initializing-the-neural-network)\n", |
| 32 | + " - [Training the neural network](#Training-the-neural-network)\n", |
| 33 | + " - [Comparison with a Bigram frequency model](#Comparison-with-a-Bigram-frequency-model)\n", |
| 34 | + " - [Smoothing applied to a neural network](#Smoothing-applied-to-a-neural-network)\n", |
| 35 | + " - [Sampling from our trained model](#Sampling-from-our-trained-model)\n", |
| 36 | + " - [Conclusion](#Conclusion)\n", |
| 37 | + "- [Exercises](#Exercises)\n", |
| 38 | + " - [1. Build a Trigram model](#1.-Build-a-Trigram-model)\n", |
| 39 | + " - [2. Split the dataset](#2.-Split-the-dataset)\n", |
| 40 | + " - [Bigram model baseline](#Bigram-model-baseline)\n", |
| 41 | + " - [Compare the Bigram and Trigram model](#Compare-the-Bigram-and-Trigram-model)\n", |
| 42 | + " - [3. Change the loss function](#3.-Change-the-loss-function)" |
19 | 43 | ] |
20 | 44 | }, |
21 | 45 | { |
|
49 | 73 | "- Transformer: [Vaswani et al. 2017](https://arxiv.org/abs/1706.03762)\n", |
50 | 74 | "\n", |
51 | 75 | "A few more related resources (hands-on tutorials, articles, videos, etc.):\n", |
| 76 | + "- Book \"[Build a Large Language Model (From Scratch)](http://mng.bz/orYv)\" by Sebastian Raschka (the companion [GitHub repository](https://github.com/rasbt/LLMs-from-scratch))\n", |
| 77 | + "- [Andrej Karpathy's \"Neural Net: From Zero to Hero\"](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) (*This was the **main** inspiration for this and the subsequent notebooks*)\n", |
52 | 78 | "- A [tutorial](https://docs.fast.ai/tutorial.text.html) on *transfer learning* by fastai\n", |
53 | 79 | "- [Hugging Face's FineWeb dataset](https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1)\n", |
54 | | - "- [Transformer LLM 3D visualizer](https://bbycroft.net/llm)\n", |
55 | | - "- [Andrej Karpathy's \"Neural Net: From Zero to Hero\"](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) (*This was the **main** inspiration for this and the subsequent notebooks*)\n", |
56 | | - "- Book \"[Build a Large Language Model (From Scratch)](http://mng.bz/orYv)\" by Sebastian Raschka (the companion [GitHub repository](https://github.com/rasbt/LLMs-from-scratch))" |
| 80 | + "- [Transformer LLM 3D visualizer](https://bbycroft.net/llm)" |
57 | 81 | ] |
58 | 82 | }, |
59 | 83 | { |
|