Commit 1ee4971

Add TOC

1 parent a41f399 commit 1ee4971

File tree

1 file changed: +28 -4 lines changed


32_language_modeling_1.ipynb

Lines changed: 28 additions & 4 deletions
@@ -15,7 +15,31 @@
 "id": "1",
 "metadata": {},
 "source": [
-"[TOC]"
+"## Table of Contents\n",
+"- [References](#References)\n",
+"- [Inspecting the data](#Inspecting-the-data)\n",
+"- [Bigram language model](#Bigram-language-model)\n",
+" - [Evaluating the quality of the model](#Evaluating-the-quality-of-the-model)\n",
+"- [A neural network approach](#A-neural-network-approach)\n",
+" - [The training set](#The-training-set)\n",
+" - [Feeding the network](#Feeding-the-network)\n",
+" - [Regaining a normal distribution](#Regaining-a-normal-distribution)\n",
+" - [Recap: How the Neural Network Processes Input Characters](#Recap:-How-the-Neural-Network-Processes-Input-Characters)\n",
+" - [Optimization](#Optimization)\n",
+" - [Putting it all together](#Putting-it-all-together)\n",
+" - [Preparing data](#Preparing-data)\n",
+" - [Initializing the neural network](#Initializing-the-neural-network)\n",
+" - [Training the neural network](#Training-the-neural-network)\n",
+" - [Comparison with a Bigram frequency model](#Comparison-with-a-Bigram-frequency-model)\n",
+" - [Smoothing applied to a neural network](#Smoothing-applied-to-a-neural-network)\n",
+" - [Sampling from our trained model](#Sampling-from-our-trained-model)\n",
+" - [Conclusion](#Conclusion)\n",
+"- [Exercises](#Exercises)\n",
+" - [1. Build a Trigram model](#1.-Build-a-Trigram-model)\n",
+" - [2. Split the dataset](#2.-Split-the-dataset)\n",
+" - [Bigram model baseline](#Bigram-model-baseline)\n",
+" - [Compare the Bigram and Trigram model](#Compare-the-Bigram-and-Trigram-model)\n",
+" - [3. Change the loss function](#3.-Change-the-loss-function)"
 ]
 },
 {
@@ -49,11 +73,11 @@
 "- Transformer: [Vaswani et al. 2017](https://arxiv.org/abs/1706.03762)\n",
 "\n",
 "A few more related resource (hands-on, tutorial, articles, videos, etc.):\n",
+"- Book \"[Build a Large Language Model (From Scratch)](http://mng.bz/orYv)\" by Sebastian Raschka (the companion [GitHub repository](https://github.com/rasbt/LLMs-from-scratch))\n",
+"- [Andrej Karpathy's \"Neural Net: From Zero to Hero\"](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) (*This was the **main** inspiration for this and the subsequent notebooks*)\n",
 "- A [tutorial](https://docs.fast.ai/tutorial.text.html) on *transfer learning* by fastai\n",
 "- [Hugging Face's FineWeb dataset](https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1)\n",
-"- [Transformer LLM 3D visualizer](https://bbycroft.net/llm)\n",
-"- [Andrej Karpathy's \"Neural Net: From Zero to Hero\"](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) (*This was the **main** inspiration for this and the subsequent notebooks*)\n",
-"- Book \"[Build a Large Language Model (From Scratch)](http://mng.bz/orYv)\" by Sebastian Raschka (the companion [GitHub repository](https://github.com/rasbt/LLMs-from-scratch))"
+"- [Transformer LLM 3D visualizer](https://bbycroft.net/llm)"
 ]
 },
 {
