Commit 9fffb96

committed
deploy: 0ca7093
1 parent 3ff4df4 commit 9fffb96

3 files changed

Lines changed: 20 additions & 20 deletions

_sources/_sources/projects/Project_02_Wake_Behind_Cylinder.ipynb

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 "metadata": {},
 "source": [
 "<div>\n",
-"<img src=\"https://raw.githubusercontent.com/illinois-mlp/MachineLearningForPhysics/main/img/Project_Cylinder_Wake.jpg\" width=800>\n",
+"<img src=\"https://raw.githubusercontent.com/illinois-mlp/MachineLearningForPhysics/main/img/Project_Cylinder_Wake.jpg\" width=1000>\n",
 "</div>"
 ]
 },

_sources/lectures/Attention.html

Lines changed: 18 additions & 18 deletions
@@ -1119,8 +1119,8 @@ <h3><span style="color:LightGreen">Sequence Padding and Attention Masking</span>
 <section id="span-style-color-orange-computing-the-reweighted-padded-attention-mask-span">
 <h2><span style="color:Orange">Computing the Reweighted Padded Attention Mask</span><a class="headerlink" href="#span-style-color-orange-computing-the-reweighted-padded-attention-mask-span" title="Permalink to this heading">#</a></h2>
 <p>Lets create some numbers so we can get a better idea of how this works. Let the tokens be <span class="math notranslate nohighlight">\(X = [10, 2, \text{&lt;pad&gt;}]\)</span>, so the third token is a padding token. Lets then also pretend, we pass this to our model, and when we go to compute our attention <span class="math notranslate nohighlight">\(QK^T\)</span>. The raw output before the Softmax is below:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-3f9a884c-a2d1-4790-9ccc-22eb777faba2">
-<span class="eqno">(1)<a class="headerlink" href="#equation-3f9a884c-a2d1-4790-9ccc-22eb777faba2" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-1458f32b-d442-494e-b855-8a83597bfa2b">
+<span class="eqno">(1)<a class="headerlink" href="#equation-1458f32b-d442-494e-b855-8a83597bfa2b" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \begin{bmatrix}
 7 &amp; -8 &amp; 6 \\
 -3 &amp; 2 &amp; 4 \\
@@ -1133,8 +1133,8 @@ <h2><span style="color:Orange">Computing the Reweighted Padded Attention Mask</s
 \text{Softmax}(\vec{x}) = \frac{e^{x_i}}{\sum_{j=1}^N{e^{x_j}}}
 \]</div>
 <p>If we ignore padding and everything right now, we can compute softmax for row of the matrix above:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-63d1a546-6cca-4be4-98c1-750360b7e9cd">
-<span class="eqno">(2)<a class="headerlink" href="#equation-63d1a546-6cca-4be4-98c1-750360b7e9cd" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-04dc3062-b225-4c13-a734-7707da519306">
+<span class="eqno">(2)<a class="headerlink" href="#equation-04dc3062-b225-4c13-a734-7707da519306" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \text{Softmax}
 \begin{bmatrix}
 7 &amp; -8 &amp; 6 \\
@@ -1153,17 +1153,17 @@ <h2><span style="color:Orange">Computing the Reweighted Padded Attention Mask</s
 \end{bmatrix}
 \end{equation}\]</div>
 <p>But what we need is to mask out all the tokens in this matrix related to padding. Just like we did in <a class="reference external" href="https://github.com/priyammaz/HAL-DL-From-Scratch/tree/main/PyTorch%20for%20NLP/GPT">GPT</a>, we will fill in the indexes of the that we want to mask with <span class="math notranslate nohighlight">\(-\infty\)</span>. If only the last token was a padding token in our sequence, then the attention before the softmax should be written as:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-62962437-0153-4c8a-b58c-c6195acc9c0d">
-<span class="eqno">(3)<a class="headerlink" href="#equation-62962437-0153-4c8a-b58c-c6195acc9c0d" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-416a944c-c56b-44d7-b702-1271bd6c6dbf">
+<span class="eqno">(3)<a class="headerlink" href="#equation-416a944c-c56b-44d7-b702-1271bd6c6dbf" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \begin{bmatrix}
 7 &amp; -8 &amp; -\infty \\
 -3 &amp; 2 &amp; -\infty \\
 1 &amp; 6 &amp; -\infty \\
 \end{bmatrix}
 \end{equation}\]</div>
 <p>Taking the softmax of the rows of this matrix then gives:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-bfcfffc9-068d-40aa-8324-1b4354a71176">
-<span class="eqno">(4)<a class="headerlink" href="#equation-bfcfffc9-068d-40aa-8324-1b4354a71176" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-e61b8123-fc93-4e7c-8a95-e5ae2912a354">
+<span class="eqno">(4)<a class="headerlink" href="#equation-e61b8123-fc93-4e7c-8a95-e5ae2912a354" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \text{Softmax}
 \begin{bmatrix}
 7 &amp; -8 &amp; -\infty \\
@@ -1205,8 +1205,8 @@ <h3><span style="color:LightGreen">Repeating to Match Attention Matrix Shape</sp
 <p><code class="docutils literal notranslate"><span class="pre">attn.shape</span></code> - (Batch x seq_len x seq_len)</p>
 <p><code class="docutils literal notranslate"><span class="pre">mask.shape</span></code> - (Batch x seq_len)</p>
 <p>It is clear that our mask is missing a dimension, and we need to repeat it. Lets take sequence_1 for instance that has a mask of [True, True, True, False]. Because the sequence length here is 4, lets repeat this row 4 times:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-fb5058d7-5f44-4108-8ecf-bd4891fce586">
-<span class="eqno">(5)<a class="headerlink" href="#equation-fb5058d7-5f44-4108-8ecf-bd4891fce586" title="Permalink to this equation">#</a></span>\[\begin{bmatrix}
+<div class="amsmath math notranslate nohighlight" id="equation-82bd5929-c99b-4a71-aff0-6b735a56c041">
+<span class="eqno">(5)<a class="headerlink" href="#equation-82bd5929-c99b-4a71-aff0-6b735a56c041" title="Permalink to this equation">#</a></span>\[\begin{bmatrix}
 \textrm{True} &amp; \textrm{True} &amp; \textrm{True} &amp; \textrm{False} \\
 \textrm{True} &amp; \textrm{True} &amp; \textrm{True} &amp; \textrm{False} \\
 \textrm{True} &amp; \textrm{True} &amp; \textrm{True} &amp; \textrm{False} \\
@@ -1466,8 +1466,8 @@ <h3><span style="color:LightGreen">Enforcing Causality</span><a class="headerlin
 <section id="span-style-color-lightgreen-computing-the-reweighted-causal-attention-mask-span">
 <h3><span style="color:LightGreen">Computing the Reweighted Causal Attention Mask</span><a class="headerlink" href="#span-style-color-lightgreen-computing-the-reweighted-causal-attention-mask-span" title="Permalink to this heading">#</a></h3>
 <p>Lets pretend the raw outputs of <span class="math notranslate nohighlight">\(QK^T\)</span>, before the softmax, is below:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-0b415d81-7bbb-4d27-a1f3-f24f92d4c167">
-<span class="eqno">(6)<a class="headerlink" href="#equation-0b415d81-7bbb-4d27-a1f3-f24f92d4c167" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-91439069-0acb-44c5-a9e0-fd7727dfd0a2">
+<span class="eqno">(6)<a class="headerlink" href="#equation-91439069-0acb-44c5-a9e0-fd7727dfd0a2" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \begin{bmatrix}
 7 &amp; -8 &amp; 6 \\
 -3 &amp; 2 &amp; 4 \\
@@ -1478,8 +1478,8 @@ <h3><span style="color:LightGreen">Computing the Reweighted Causal Attention Mas
 <div class="math notranslate nohighlight">
 \[\text{Softmax}(\vec{x}) = \frac{e^{x_i}}{\sum_{j=1}^N{e^{x_j}}}\]</div>
 <p>Then, we can compute softmax for row of the matrix above:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-97377beb-98f5-43d9-a2e7-723db80e3767">
-<span class="eqno">(7)<a class="headerlink" href="#equation-97377beb-98f5-43d9-a2e7-723db80e3767" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-b14ca7a2-8ccf-4483-b582-46827047f7cd">
+<span class="eqno">(7)<a class="headerlink" href="#equation-b14ca7a2-8ccf-4483-b582-46827047f7cd" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \text{Softmax}
 \begin{bmatrix}
 7 &amp; -8 &amp; 6 \\
@@ -1518,17 +1518,17 @@ <h3><span style="color:LightGreen">Computing the Reweighted Causal Attention Mas
 \text{Softmax}(x_2) = [\frac{e^{-3}}{e^{-3}+e^{2}+0}, \frac{e^{2}}{e^{-3}+e^{2}+0}, \frac{0}{e^{-3}+e^{2}+0}] = [\frac{e^{-3}}{e^{-3}+e^{2}+0}, \frac{e^{2}}{e^{-3}+e^{2}+0}, \frac{0}{e^{-3}+e^{2}+0}] = [0.0067, 0.9933, 0.0000]
 \]</div>
 <p>So we have exactly what we want! The attention weight of the last value is set to 0, so when we are on the second vector <span class="math notranslate nohighlight">\(x_2\)</span>, we cannot look forward to the future value vectors <span class="math notranslate nohighlight">\(v_3\)</span>, and the remaining parts add up to 1 so its still a probability vector! To do this correctly for the entire matrix, we can just substitute in the top triangle of <span class="math notranslate nohighlight">\(QK^T\)</span> with <span class="math notranslate nohighlight">\(-\infty\)</span>. This would look like:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-a974de21-62ff-4380-a7df-e472c0b3221d">
-<span class="eqno">(8)<a class="headerlink" href="#equation-a974de21-62ff-4380-a7df-e472c0b3221d" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-8dd4d9f2-0417-4b76-a90a-a48bb6d688a5">
+<span class="eqno">(8)<a class="headerlink" href="#equation-8dd4d9f2-0417-4b76-a90a-a48bb6d688a5" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \begin{bmatrix}
 7 &amp; -\infty &amp; -\infty \\
 -3 &amp; 2 &amp; -\infty \\
 1 &amp; 6 &amp; -2 \\
 \end{bmatrix}
 \end{equation}\]</div>
 <p>Taking the softmax of the rows of this matrix then gives:</p>
-<div class="amsmath math notranslate nohighlight" id="equation-17d685b6-4194-4b0f-a9a7-741290375345">
-<span class="eqno">(9)<a class="headerlink" href="#equation-17d685b6-4194-4b0f-a9a7-741290375345" title="Permalink to this equation">#</a></span>\[\begin{equation}
+<div class="amsmath math notranslate nohighlight" id="equation-c8bec7f8-10a9-4e30-bc4a-682e98f92055">
+<span class="eqno">(9)<a class="headerlink" href="#equation-c8bec7f8-10a9-4e30-bc4a-682e98f92055" title="Permalink to this equation">#</a></span>\[\begin{equation}
 \text{Softmax}
 \begin{bmatrix}
 7 &amp; -\infty &amp; -\infty \\
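
Note on the Attention.html hunks: the masking arithmetic in the context lines can be checked numerically. The following is a minimal sketch in plain Python, not code from the lecture itself. It assumes the 3x3 score matrix whose rows appear in the masked matrices of the diff, and the helper names `softmax` and `masked_softmax` are hypothetical. Masked positions are set to -inf before the softmax, so they receive weight 0 and the remaining weights in each row still sum to 1.

```python
import math

NEG_INF = float("-inf")


def softmax(row):
    """Softmax of one row; -inf entries contribute exp(-inf) == 0.0."""
    m = max(row)  # subtract the row max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, keep):
    """Set scores[i][j] to -inf where keep[i][j] is False, then softmax each row."""
    masked = [
        [s if k else NEG_INF for s, k in zip(score_row, keep_row)]
        for score_row, keep_row in zip(scores, keep)
    ]
    return [softmax(row) for row in masked]


# Raw QK^T scores, taken from the matrices shown in the diff.
scores = [
    [7.0, -8.0, 6.0],
    [-3.0, 2.0, 4.0],
    [1.0, 6.0, -2.0],
]
n = len(scores)

# Padding mask: the third token is <pad>, so every row ignores column 2.
# The per-sequence mask [True, True, False] is repeated once per row.
padding_keep = [[True, True, False] for _ in range(n)]

# Causal mask: row i may only attend to columns j <= i (lower triangle).
causal_keep = [[j <= i for j in range(n)] for i in range(n)]

padded = masked_softmax(scores, padding_keep)
causal = masked_softmax(scores, causal_keep)

# Second row under the causal mask, matching the worked Softmax(x_2) line.
print([round(p, 4) for p in causal[1]])  # [0.0067, 0.9933, 0.0]
```

Subtracting the row maximum before exponentiating does not change the result and avoids overflow for large scores. A row that is entirely masked would need separate handling, which this sketch does not attempt.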

_sources/projects/Project_02_Wake_Behind_Cylinder.html

Lines changed: 1 addition & 1 deletion
@@ -553,7 +553,7 @@ <h2> Contents </h2>
 <section class="tex2jax_ignore mathjax_ignore" id="modeling-fluid-flow-past-a-cylinder">
 <h1>Modeling Fluid Flow Past a Cylinder<a class="headerlink" href="#modeling-fluid-flow-past-a-cylinder" title="Permalink to this heading">#</a></h1>
 <div>
-<img src="https://raw.githubusercontent.com/illinois-mlp/MachineLearningForPhysics/main/img/Project_Cylinder_Wake.jpg" width=800>
+<img src="https://raw.githubusercontent.com/illinois-mlp/MachineLearningForPhysics/main/img/Project_Cylinder_Wake.jpg" width=1000>
 </div><section id="span-style-color-orange-overview-span">
 <h2><span style="color:Orange">Overview</span><a class="headerlink" href="#span-style-color-orange-overview-span" title="Permalink to this heading">#</a></h2>
 <p>Fluid mechanics are vital processes in the modern world, and modeling such dynamics has become increasingly important in many engineering and physics challenges. However, the complexities are also becoming increasingly complex: disciplines like multiphase flow (pollutant or disease dispersion), hypersonics (spacecraft atmospheric rentry), and fluid-surface interaction (bio-inspired motion) are all very important yet very computational expensive to numerically or experimentally model.</p>

0 commit comments