
Commit 08db020

apartsin and claude committed
Add 48 illustrations across 8 chapters, fix low-value code fragments, add code caption detection
Illustrations: Generate and embed 48 new illustrations in chapters 8, 23-26, 34-35, and 36-38 (Part XI). Chapters that had 0 illustrations now have 5-6 each. Part XI chapters gain 9 additional non-opener illustrations.

Code quality: Merge or replace 5 low-value code fragments (36.2.1, 36.3.1, 22.6.1, 7.4.4, 13.2.6) that were pure definitions without AI library usage. Update the code pedagogy agent with a "Value Gate" rule requiring every code fragment to demonstrate an AI/ML technique. Update the illustrator agent with Gemini batch API instructions.

Code captions: Add 658 placeholder captions to uncaptioned code blocks across 189 section files. Create detection scripts for low-value code and missing captions (scripts/detect/).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 881b96b commit 08db020

266 files changed

Lines changed: 1846 additions & 482 deletions


agents/08-code-pedagogy.html

Lines changed: 25 additions & 2 deletions
@@ -43,13 +43,36 @@ <h3 id="3-pedagogical-effectiveness">3. Pedagogical Effectiveness</h3>
 <li>Variable names should be descriptive and match the prose terminology</li>
 <li>Code should be minimal: remove everything that does not serve the teaching goal</li>
 </ul>
-<h3 id="4-progressive-complexity">4. Progressive Complexity</h3>
+<h3 id="4-value-gate">4. Value Gate: Every Code Fragment Must Earn Its Place</h3>
+<p><strong>A code fragment that does not solve a specific task, demonstrate a library function, or produce a meaningful output should not exist.</strong> Defining a dataclass, configuration object, or data structure is not a code example; it is a schema. If the code only declares something without <em>doing</em> anything with it, either:</p>
+<ul>
+<li><strong>Merge it</strong> into the next code fragment that actually uses it, so the reader sees definition and usage together.</li>
+<li><strong>Replace it</strong> with a code fragment that demonstrates the concept in action (input, processing, output).</li>
+<li><strong>Remove it</strong> entirely if the prose or a callout box conveys the same information more efficiently.</li>
+</ul>
+<p>Ask: "If I ran this code, would I learn something from the output?" If the answer is "it just prints a summary of what I typed in," the fragment fails the value gate.</p>
+<p><strong>Signs of a low-value code fragment:</strong></p>
+<ul>
+<li>The entire block is a class/dataclass/enum definition with no usage</li>
+<li>The only "output" is printing what was hardcoded in the input</li>
+<li>The code restates in Python what the prose already explains in English</li>
+<li>No library, algorithm, or technique is being demonstrated</li>
+<li>Removing the code fragment would not reduce the reader's understanding</li>
+</ul>
+<p><strong>This book is about LLMs and AI.</strong> Every code fragment must involve AI/ML libraries, LLM API calls, prompt engineering, evaluation, embeddings, tokenization, fine-tuning, agent logic, or similar AI-relevant functionality. Code that is purely organizational (form-filling dataclasses, business logic scorecards, project management checklists, configuration dicts with no library calls) does not belong as a standalone code fragment. Either:</p>
+<ul>
+<li><strong>Merge it</strong> into a fragment that actually calls an AI library or API, so the definition and AI usage appear together.</li>
+<li><strong>Replace it with prose</strong>: a table, callout box, or bullet list often conveys configuration and schema better than a Python dict literal.</li>
+<li><strong>Delete it</strong> if the surrounding text already covers the same information.</li>
+</ul>
+<p><strong>Concrete test:</strong> "Does this code fragment import or call anything from an AI/ML library, or demonstrate a technique specific to AI systems?" If no, it fails the gate.</p>
+<h3 id="5-progressive-complexity">5. Progressive Complexity</h3>
 <ul>
 <li>First code example in a section: simple, 5 to 10 lines</li>
 <li>Later examples: build on earlier ones, add one new element at a time</li>
 <li>Final example: brings it together, realistic but not overwhelming</li>
 </ul>
-<h3 id="5-reproducibility">5. Reproducibility</h3>
+<h3 id="6-reproducibility">6. Reproducibility</h3>
 <ul>
 <li>Pin library versions in requirements or comments</li>
 <li>Use deterministic seeds for random operations</li>
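The Value Gate's "concrete test" lends itself to automation, which is presumably what the scripts/detect/ detection scripts mentioned in the commit message do. A minimal sketch of such a check; the AI_LIBS list and both function names are illustrative assumptions, not code from this commit:

```python
# Hypothetical value-gate checker: a fragment passes only if it imports
# an AI/ML library AND does something beyond bare definitions.
import ast

# Illustrative allow-list; the commit's actual list is not shown.
AI_LIBS = {"torch", "transformers", "openai", "anthropic", "sklearn",
           "gensim", "sentencepiece", "tokenizers", "numpy"}

def imports_ai_library(source: str) -> bool:
    """Return True if the fragment imports any known AI/ML library."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(a.name.split(".")[0] in AI_LIBS for a in node.names):
                return True
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in AI_LIBS:
                return True
    return False

def passes_value_gate(source: str) -> bool:
    """Fail fragments that are definitions or imports only, with no usage."""
    tree = ast.parse(source)
    has_usage = any(not isinstance(n, (ast.ClassDef, ast.FunctionDef,
                                       ast.Import, ast.ImportFrom))
                    for n in tree.body)
    return imports_ai_library(source) and has_usage

print(passes_value_gate("import torch\nx = torch.zeros(3)"))  # True
print(passes_value_gate("class Config:\n    lr = 0.1"))       # False
```

Parsing with `ast` rather than regex keeps the check honest: a library name mentioned in a comment or string does not count as an import.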

agents/31-illustrator.html

Lines changed: 23 additions & 1 deletion
@@ -84,13 +84,35 @@ <h3 id="step-2-prompt">Step 2: Craft the Gemini Prompt</h3>
 <h3 id="step-3-generate">Step 3: Generate the Image</h3>
 <p><strong>FIRST</strong>, create the images directory (this MUST run before any generation):</p>
 <pre><code class="language-bash">mkdir -p "{module-folder}/images"</code></pre>
-<p><strong>THEN</strong> run the generation script:</p>
+
+<h4>Single Image (one-off generation)</h4>
 <pre><code class="language-bash">python "C:/Users/apart/.claude/skills/gemini-imagegen/scripts/generate_image.py" \
     --prompt "[your crafted prompt]" \
     --output "{module-folder}/images/{descriptive-filename}.png" \
     --aspect-ratio 4:3 \
     --image-size 1K</code></pre>
 
+<h4>Batch Mode (PREFERRED for 2+ images, 50% cost discount)</h4>
+<p>When generating multiple illustrations (e.g., for a full chapter pass or book-wide sweep), <strong>always use Gemini batch API mode</strong>. This provides a 50% cost reduction and processes all images in parallel.</p>
+<ol>
+<li>Create a JSON file with all illustration prompts:
+<pre><code class="language-json">[
+  {
+    "name": "descriptive-id",
+    "prompt": "Simple, cartoon-like educational illustration...",
+    "output": "relative/path/to/images/filename.png"
+  }
+]</code></pre>
+</li>
+<li>Run the batch generation script:
+<pre><code class="language-bash">python "C:/Users/apart/.claude/skills/book-skills/scripts/generate_icons_gemini.py" \
+    --engine gemini \
+    --batch \
+    --input prompts.json</code></pre>
+</li>
+</ol>
+<p><strong>IMPORTANT:</strong> Output paths in the JSON are resolved relative to the current working directory. Run the script from the book root (<code>E:/Projects/LLMCourse</code>) so paths like <code>part-X/module-Y/images/file.png</code> resolve correctly.</p>
+
 <h3 id="step-4-embed">Step 4: Embed in HTML</h3>
 <p>Insert a <code>&lt;figure&gt;</code> block at the identified location:</p>
 <pre><code class="language-html">&lt;figure class="illustration"&gt;
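The batch workflow above expects a hand-written prompts.json. A small helper along these lines could assemble it; `build_prompts_file` and the example entry are hypothetical, and only the JSON shape (name/prompt/output) comes from the agent instructions in this diff:

```python
# Hypothetical helper: build the prompts.json consumed by batch mode
# from (name, prompt) pairs. Output paths follow the relative-path
# convention noted in the agent instructions.
import json

def build_prompts_file(entries, images_dir, path="prompts.json"):
    """Write a batch-mode prompt file and return the parsed structure."""
    batch = [{"name": name,
              "prompt": prompt,
              "output": f"{images_dir}/{name}.png"}
             for name, prompt in entries]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(batch, f, indent=2)
    return batch

batch = build_prompts_file(
    [("attention-heads", "Simple, cartoon-like educational illustration...")],
    "part-2/module-05/images")
print(batch[0]["output"])  # part-2/module-05/images/attention-heads.png
```

Generating the file programmatically keeps names and output paths consistent, which matters because batch mode resolves those paths against the current working directory.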

part-1-foundations/module-00-ml-pytorch-foundations/section-0.1.html

Lines changed: 2 additions & 0 deletions
@@ -308,6 +308,7 @@ <h3>Variants of Gradient Descent</h3>
 Step 3: w = 1.5816, loss = 2.0113
 Step 4: w = 2.0218, loss = 0.9572
 Step 5: w = 2.3116, loss = 0.4739</div>
+<div class="code-caption"><strong>Code Fragment 0.1.9:</strong> Simulating mini-batch SGD on a simple quadratic loss</div>
 <div class="code-caption"><strong>Code Fragment 0.1.2:</strong> Simulating mini-batch SGD on a simple quadratic loss.</div>
 
 <p>Even with noisy gradients, the parameter <code>w</code> steadily moves toward the true minimum at 3.0. Each step is imprecise, but the overall trajectory converges. This is the core principle that scales all the way up to training models with billions of parameters.</p>
@@ -558,6 +559,7 @@ <h3>K-Fold Cross-Validation</h3>
 Fold 5: MSE = 0.2336
 
 Mean MSE: 0.2553 (+/- 0.0383)</div>
+<div class="code-caption"><strong>Code Fragment 0.1.8:</strong> K-Fold cross-validation from scratch</div>
 
 <div class="callout tip">
 <div class="callout-title">Production Alternative</div>

part-1-foundations/module-00-ml-pytorch-foundations/section-0.3.html

Lines changed: 1 addition & 0 deletions
@@ -132,6 +132,7 @@ <h3>1.2 Indexing, Slicing, and Reshaping</h3>
 Flat: tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
 Shape before unsqueeze: torch.Size([3])
 Shape after unsqueeze(0): torch.Size([1, 3])</div>
+<div class="code-caption"><strong>Code Fragment 0.3.42:</strong> PyTorch implementation</div>
 <div class="code-caption"><strong>Code Fragment 0.3.2:</strong> Reshaping, slicing, and fancy indexing on tensors. These operations return views when possible, avoiding unnecessary copies.</div>
 
 <h3>1.3 Broadcasting</h3>

part-1-foundations/module-00-ml-pytorch-foundations/section-0.4.html

Lines changed: 5 additions & 0 deletions
@@ -382,6 +382,7 @@ <h2>4. Policy Gradients: Learning by Trial and Feedback <span class="level-badge
 Episode 600 | Last reward: +1.0 | Action 2 rate: 96%
 Episode 800 | Last reward: +1.0 | Action 2 rate: 99%
 Episode 1000 | Last reward: +1.0 | Action 2 rate: 100%</div>
+<div class="code-caption"><strong>Code Fragment 0.4.8:</strong> Define SimpleEnv; implement reset, step</div>
 <div class="code-caption"><strong>Code Fragment 0.4.2:</strong> REINFORCE policy gradient sketch in PyTorch. The PolicyNetwork maps states to action probabilities, mirroring how an LLM transformer maps token sequences to next-token distributions. The reinforce_episode function collects a trajectory, computes discounted returns, and nudges the policy toward actions that led to higher rewards.</div>
 
 <div class="callout note">
@@ -757,6 +758,7 @@ <h3>Step 1: Build an AI milestone timeline</h3>
 plt.savefig("ml_timeline.png", dpi=150, bbox_inches="tight")
 plt.show()
 print("Timeline saved to ml_timeline.png")</code></pre>
+<div class="code-caption"><strong>Code Fragment 0.4.7:</strong> TODO: add description</div>
 </div>
 
 <div class="lab-step">
@@ -787,6 +789,7 @@ <h3>Step 2: Implement a GridWorld environment</h3>
 env = GridWorld()
 print(f"Start: {env.reset()}, Goal: {env.goal}")</code></pre>
 <div class="code-output">Start: (0, 0), Goal: (3, 3)</div>
+<div class="code-caption"><strong>Code Fragment 0.4.6:</strong> TODO: add description</div>
 </div>
 
 <div class="lab-step">
@@ -828,6 +831,7 @@ <h3>Step 3: Train a Q-learning agent from scratch</h3>
 Row 1: ['right', 'right', 'down', 'down']
 Row 2: ['right', 'right', 'right', 'down']
 Row 3: ['right', 'right', 'right', 'right']</div>
+<div class="code-caption"><strong>Code Fragment 0.4.5:</strong> states (4x4) x actions (4)</div>
 </div>
 
 <div class="lab-step">
@@ -847,6 +851,7 @@ <h3>Step 4: Visualize the Q-values as a heatmap</h3>
 plt.tight_layout()
 plt.savefig("gridworld_values.png", dpi=150)
 plt.show()</code></pre>
+<div class="code-caption"><strong>Code Fragment 0.4.4:</strong> Value function: max Q at each state</div>
 </div>
 </div>

part-1-foundations/module-01-foundations-nlp-text-representation/section-1.2.html

Lines changed: 2 additions & 0 deletions
@@ -429,6 +429,7 @@ <h2>Bag-of-Words (BoW) <span class="level-badge intermediate" title="Intermediat
 [[1 0 0 0 1 1 1 2]
 [0 0 1 1 0 1 1 2]
 [1 1 1 0 0 0 0 2]]</div>
+<div class="code-caption"><strong>Code Fragment 1.2.13:</strong> Implementation example</div>
 <div class="code-caption"><strong>Code Fragment 1.2.3:</strong> This snippet demonstrates this approach. Study the implementation details to understand how each component contributes to the overall computation. Tracing through each step builds the intuition needed when debugging or extending similar systems.</div>
 
 <div class="callout practical-example">
@@ -697,6 +698,7 @@ <h2>One-Hot Encoding and Its Limitations <span class="level-badge intermediate"
 
 Distance cat to dog: 1.41
 Distance cat to democracy: 1.41</div>
+<div class="code-caption"><strong>Code Fragment 1.2.12:</strong> One-hot encoding: see the problem for yourself</div>
 <div class="code-caption"><strong>Code Fragment 1.2.6:</strong> One-hot encoding: see the problem for yourself.</div>
 
 <p>

part-1-foundations/module-01-foundations-nlp-text-representation/section-1.3.html

Lines changed: 3 additions & 0 deletions
@@ -352,6 +352,7 @@ <h2>Training Word2Vec from Scratch <span class="level-badge advanced" title="Adv
 Most similar to 'cat': [('dog', 0.91), ('sat', 0.78), ('mat', 0.72)]
 
 Embedding matrix shape: (24, 50)</div>
+<div class="code-caption"><strong>Code Fragment 1.3.16:</strong> NumPy computation</div>
 <div class="code-caption"><strong>Code Fragment 1.3.1:</strong> Sample corpus (in practice, use millions of sentences).</div>
 
 <h2>Measuring Similarity: Cosine Similarity <span class="level-badge intermediate" title="Intermediate">INTERMEDIATE</span></h2>
@@ -421,6 +422,7 @@ <h2>Measuring Similarity: Cosine Similarity <span class="level-badge intermediat
 0.1342
 0.6510
 0.7703</div>
+<div class="code-caption"><strong>Code Fragment 1.3.15:</strong> Measuring cosine similarity between word vectors</div>
 
 <div class="callout library-shortcut">
 <div class="callout-title">Library Shortcut</div>
@@ -738,6 +740,7 @@ <h2>Visualizing Embeddings <span class="level-badge intermediate" title="Interme
 plt.annotate(word, (vectors_2d[i, 0]+0.5, vectors_2d[i, 1]+0.5), fontsize=12)
 plt.title("Word Embeddings Projected to 2D with t-SNE")
 plt.show()</code></pre>
+<div class="code-caption"><strong>Code Fragment 1.3.14:</strong> Implementation example</div>
 
 <pre><code class="language-python">
 # NumPy computation

part-1-foundations/module-01-foundations-nlp-text-representation/section-1.4.html

Lines changed: 5 additions & 0 deletions
@@ -325,6 +325,7 @@ <h2>Contextual Embeddings in Code <span class="level-badge intermediate" title="
 <div class="code-output">Cosine distance between 'bank' in different contexts: 0.349
 Distance bank(river) to shore: 0.218
 Distance bank(river) to bank(money): 0.349</div>
+<div class="code-caption"><strong>Code Fragment 1.4.7:</strong> Demonstrating contextual embeddings: same word, different vectors</div>
 
 <div class="callout library-shortcut">
 <div class="callout-title">Library Shortcut</div>
@@ -653,6 +654,7 @@ <h3>Step 1: Train Word2Vec on a sample corpus</h3>
 for word in ["king", "queen", "man"]:
     neighbors = model.wv.most_similar(word, topn=3)
     print(f"{word}: {[(w, f'{s:.2f}') for w, s in neighbors]}")</code></pre>
+<div class="code-caption"><strong>Code Fragment 1.4.6:</strong> Sample corpus (in practice, use a larger dataset)</div>
 </div>
 
 <div class="lab-step">
@@ -678,6 +680,7 @@ <h3>Step 2: Visualize embeddings with t-SNE</h3>
 plt.tight_layout()
 plt.savefig("word2vec_tsne.png", dpi=150)
 plt.show()</code></pre>
+<div class="code-caption"><strong>Code Fragment 1.4.5:</strong> TODO: add description</div>
 </div>
 
 <div class="lab-step">
@@ -732,6 +735,7 @@ <h3>Step 3: Compare static vs. contextual embeddings</h3>
 'We sat on the river bank' vs 'Fish swim near the bank of the stream'
 [2] vs [3]: 0.762
 'The bank approved the loan' vs 'Fish swim near the bank of the stream'</div>
+<div class="code-caption"><strong>Code Fragment 1.4.4:</strong> TODO: add description</div>
 <details>
 <summary>Expected pattern</summary>
 <p>Financial sentences (0 and 2) should have higher cosine similarity with each other than with the river sentences (1 and 3). This confirms that BERT produces different representations for "bank" depending on context, unlike Word2Vec.</p>
@@ -765,6 +769,7 @@ <h3>Step 4: Visualize contextual differences</h3>
 plt.tight_layout()
 plt.savefig("contextual_bank.png", dpi=150)
 plt.show()</code></pre>
+<div class="code-caption"><strong>Code Fragment 1.4.3:</strong> TODO: add description</div>
 </div>
 </div>

part-1-foundations/module-02-tokenization-subword-models/section-2.1.html

Lines changed: 4 additions & 0 deletions
@@ -262,6 +262,7 @@ <h3>Seeing the Tradeoff in Numbers</h3>
 Subword tokens (GPT-4): 11 tokens
 Decoded: ['Token', 'ization', ' determines', ' the', ' model', "'s", ' vocabulary', ' and', ' sequence', ' length', '.']
 </div>
+<div class="code-caption"><strong>Code Fragment 2.1.14:</strong> Comparing tokenization granularities</div>
 <div class="code-caption"><strong>Code Fragment 2.1.1:</strong> Comparing tokenization granularities.</div>
 
 <p>
@@ -318,6 +319,7 @@ <h3>The Token Tax on Different Languages</h3>
 Japanese : 14 tokens, 1 words, ratio = 14.0 tokens/word
 Hindi    : 28 tokens, 7 words, ratio = 4.0 tokens/word
 </div>
+<div class="code-caption"><strong>Code Fragment 2.1.13:</strong> Demonstrating the "token tax" across languages</div>
 <div class="code-caption"><strong>Code Fragment 2.1.2:</strong> Demonstrating the "token tax" across languages.</div>
 
 <div class="callout warning">
@@ -487,6 +489,7 @@ <h3>Artifact 1: Inconsistent Splitting</h3>
 'tokenization' => ['token', 'ization']
 ' tokenization' => [' token', 'ization']
 </div>
+<div class="code-caption"><strong>Code Fragment 2.1.12:</strong> Demonstrating context-sensitive tokenization</div>
 <div class="code-caption"><strong>Code Fragment 2.1.3:</strong> Demonstrating context-sensitive tokenization.</div>
 
 <p>
@@ -527,6 +530,7 @@ <h3>Artifact 2: Arithmetic Failures</h3>
 381 => ['38', '1']
 380 => ['380']
 </div>
+<div class="code-caption"><strong>Code Fragment 2.1.11:</strong> See how numbers tokenize differently</div>
 <div class="code-caption"><strong>Code Fragment 2.1.4:</strong> See how numbers tokenize differently.</div>
 
 <p>

part-1-foundations/module-02-tokenization-subword-models/section-2.2.html

Lines changed: 6 additions & 0 deletions
@@ -146,6 +146,7 @@ <h3>The BPE Merge Algorithm</h3>
 
     <span class="algo-line-keyword">return</span> vocab, merges
 </code></pre>
+<div class="code-caption"><strong>Code Fragment 2.2.12:</strong> TODO: add description</div>
 </div>
 
 <!-- DIAGRAM 1: BPE merge tree -->
@@ -272,6 +273,7 @@ <h3>Lab: Implementing BPE from Scratch</h3>
 newest&lt;/w&gt; (freq=1)
 wi d est&lt;/w&gt; (freq=1)
 </div>
+<div class="code-caption"><strong>Code Fragment 2.2.11:</strong> Minimal BPE implementation from scratch</div>
 
 <div class="callout tip">
 <div class="callout-title">Production Alternative</div>
@@ -350,6 +352,7 @@ <h3>Encoding New Text with Learned Merges</h3>
 'newer' -> ['new', 'e', 'r', '&lt;/w&gt;']
 'slowly' -> ['s', 'lo', 'w', 'l', 'y', '&lt;/w&gt;']
 </div>
+<div class="code-caption"><strong>Code Fragment 2.2.10:</strong> Encode a new word using the learned merge table</div>
 <div class="code-caption"><strong>Code Fragment 2.2.2:</strong> Encode a new word using the learned merge table.</div>
 
 <p>
@@ -433,6 +436,7 @@ <h3>WordPiece at Inference: MaxMatch</h3>
 'players' -> ['play', '##er', '##s']
 'helpless' -> ['help', '##less']
 </div>
+<div class="code-caption"><strong>Code Fragment 2.2.9:</strong> Simulated WordPiece MaxMatch tokenization</div>
 
 <div class="callout library-shortcut">
 <div class="callout-title">Library Shortcut</div>
@@ -743,6 +747,7 @@ <h3>Comparison of the Three Methods</h3>
 print("Decoded:", sp.decode(ids))
 os.unlink(tmp.name)
 </code></pre>
+<div class="code-caption"><strong>Code Fragment 2.2.8:</strong> pip install sentencepiece</div>
 </div>
 
 <div class="callout practical-example">
@@ -767,6 +772,7 @@ <h3>Comparison of the Three Methods</h3>
 print("Tokens:", output.tokens)
 print("IDs:", output.ids)
 </code></pre>
+<div class="code-caption"><strong>Code Fragment 2.2.7:</strong> pip install tokenizers</div>
 </div>
 
 <!-- ============================== BYTE-LEVEL BPE ============================== -->
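The placeholder captions added throughout these files imply a scanner that finds code blocks with no following caption div. A rough, regex-based sketch of such a detector, assuming the book's <pre>/<div class="code-caption"> markup; the commit's actual scripts/detect/ code is not shown in this diff:

```python
# Hypothetical missing-caption detector: flag each </pre> that is not
# immediately followed (modulo whitespace) by a code-caption div.
import re

PRE_RE = re.compile(r'</pre>\s*(<div class="code-caption")?', re.S)

def find_uncaptioned(html: str):
    """Return 0-based indices of </pre> closings lacking a caption div."""
    missing = []
    for i, m in enumerate(PRE_RE.finditer(html)):
        if m.group(1) is None:
            missing.append(i)
    return missing

html = ('<pre><code>x = 1</code></pre>\n'
        '<div class="code-caption">captioned</div>\n'
        '<pre><code>y = 2</code></pre>\n<p>text</p>')
print(find_uncaptioned(html))  # [1]
```

A real pass would also need to tolerate intervening code-output divs, as the diffs above show captions placed after output blocks, not always directly after the closing pre tag.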
