AustinTSchaffer
diff --git a/‎OMSCS/Courses/GA/04.3 - Graphs - MST.md‎
Lines changed: 36 additions & 0 deletions b/‎OMSCS/Courses/GA/04.3 - Graphs - MST.md‎
diff --git a/‎OMSCS/Courses/GA/05.1.1 - Max Flow - FF.md‎
Lines changed: 291 additions & 0 deletions b/‎OMSCS/Courses/GA/05.1.1 - Max Flow - FF.md‎
@@ -144,3 +144,39 @@ When we do $T'=(T \cup e^*) - e'$, how do we prove that $T'$ is a tree?
 - This is an MST algorithm akin to Dijkstra's algorithm.
 - We can use the cut property to prove correctness of Prim's algorithm.
 
+First, we create a `cost[]` array, mapping each vertex to a specific cost. For all vertices, this cost is initialized to infinity. We then pick an arbitrary starting vertex $v_0$ and set its cost to 0. We now have $S=\{v_0\}$ and $\overline{S}$, with no edges in $X$.
+
+We also create a `prev[]` array, mapping each vertex to its parent vertex within the MST. This array is initialized to `nil` for every vertex.
+
+We now create a priority queue (PQ) containing all vertices in $G$, keyed by each vertex's cost. This puts $v_0$ at the front of the PQ.
+
+Now, we iterate until the PQ is empty. We remove the lowest-cost vertex $v$ from the PQ, which moves $v$ from $\overline{S}$ into $S$. Then, for each undirected edge incident to $v$ ($e=(v,z) \in E$) whose other endpoint $z$ is still in the PQ, we check whether the cost of $z$ (`cost[z]`) is higher than the weight of $e$. If it is, then $e$ is a cheaper way to attach $z$ to the tree, so we update the data structures that track the state of the process.
+
+1. We update the cost of $z$ to the weight of $e$ (`cost[z] := w(e)`)
+2. We set the previous node of $z$ to $v$ (`prev[z] := v`)
+3. We update the priority value of $z$ within the PQ to $z$'s new cost
+
+Because `prev[]` stores a single parent per vertex, the selected edges form a tree as long as `prev[z]` is only ever updated while $z$ is still in the PQ (i.e. $z \in \overline{S}$). Under that restriction, every vertex's parent was removed from the PQ before the vertex itself, so following `prev[]` pointers always leads back to $v_0$ and can never form a cycle, and `prev[v_0]` stays `nil`. There is a potential wrinkle if the pseudocode in DPV is read literally, without that check: a vertex that is already in $S$ can have its `cost` and `prev` overwritten by a later, lighter edge. For $v_0$, whose cost is 0, this needs a negative edge, and it could be addressed by initializing the cost of $v_0$ to $-\infty$ instead of 0. The membership check covers every vertex.
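
The steps above can be sketched in Python as follows. This is a minimal sketch under a few assumptions that are not in the notes: the graph is a dict mapping each vertex to a list of `(neighbor, weight)` pairs with every undirected edge listed in both directions, the names `prim` and `in_tree` are made up for illustration, and because Python's `heapq` has no decrease-key operation, the sketch pushes a fresh entry and skips stale ones (lazy deletion) instead of updating a priority in place.

```python
import heapq
import math


def prim(graph, start):
    """Return (prev, cost) for an MST of the component containing start."""
    cost = {v: math.inf for v in graph}
    prev = {v: None for v in graph}
    cost[start] = 0
    in_tree = set()          # the set S of vertices already in the tree
    pq = [(0, start)]        # entries are (cost, vertex)

    while pq:
        _, v = heapq.heappop(pq)
        if v in in_tree:
            continue         # stale entry left behind by lazy deletion
        in_tree.add(v)
        for z, w in graph[v]:
            # Only vertices still outside S may be updated.
            if z not in in_tree and w < cost[z]:
                cost[z] = w
                prev[z] = v
                heapq.heappush(pq, (w, z))  # stands in for decrease-key
    return prev, cost


graph = {
    "v0": [("a", 5), ("b", 10)],
    "a": [("v0", 5), ("b", 1)],
    "b": [("v0", 10), ("a", 1)],
}
prev, cost = prim(graph, "v0")
print(prev)  # parent pointers describing the tree
```

In this small graph the `z not in in_tree` check matters: without it, processing $b$ would overwrite `prev["a"]` with `"b"` and the parent pointers would form a cycle.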
+
+## Union-by-Rank Algorithm
+Kruskal's Algorithm is based on the unioning of disjoint sets.
+
+This algorithm uses a data structure that starts off as $n$ disjoint trees, each containing a single vertex $v \in V$. The data structure is denoted $\pi$ and maps each vertex $v \in V$ to its parent vertex, all initialized to `nil`. A separate array $rank$ maps each vertex $v \in V$ to the height of the subtree rooted at $v$.
+
+When we select a minimum weight edge $e=(v,u)$ from $G$ to potentially add to $X$, we conditionally union two trees in $\pi$. We traverse from $u$ to the root of its tree and from $v$ to the root of its tree, which gives us the representative of $u$ ($r_u$) and the representative of $v$ ($r_v$). $u$ and $v$ are not guaranteed to be in different trees, so we check whether $r_u=r_v$. If they're equal, we can't add $e$ to $X$, as that would create a cycle in $X$. If the roots differ, we make one the parent of the other based on rank ($rank(r_u)$ vs $rank(r_v)$): the deeper tree absorbs the shallower tree, and if the ranks are equal, either root can become the parent and its rank increases by 1. This keeps the trees shallow, which matters because the traversal to the root is the primary cost of this algorithm.
+
+The end result is a tree-structured set which contains every $v \in V$. By using this tree structure, we don't have to run a hash algorithm for each vertex in V, and we don't have to consider the worst-case big-$O$ of hash sets.
+
+**Path compression** is a process which makes the trees shallower over time. Whenever we look up a vertex, we follow parent pointers until we reach the root of its tree. For each vertex along that path, we then set the vertex's parent to that root. Effectively, this caches the current representative of a particular vertex $v$, so that subsequent lookups of $v$ require fewer hops to determine $v$'s representative. If the root of $v$'s tree is later absorbed into a different tree, the next lookup of $v$ requires one additional hop to find $v$'s new representative; path compression then updates $v$'s parent to $v$'s current parent's parent, which is the new root.
+
+Union by rank ensures that every tree has height at most $\log n$, so lookup operations and tree-union operations are both $O(\log n)$. Path compression lowers the amortized cost of lookups further.
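
As a text companion to the description above, here is a minimal Python sketch of the disjoint-set structure and of Kruskal's algorithm on top of it. The class and function names (`DisjointSet`, `kruskal`) and the `(weight, u, v)` edge format are assumptions made for illustration. The sketch also lets each root point to itself rather than to `nil`, which is a common convention and not necessarily the one in the screenshots below.

```python
class DisjointSet:
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}  # a root is its own parent
        self.rank = {v: 0 for v in vertices}

    def find(self, x):
        """Return the representative of x, compressing the path on the way."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:           # path compression
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        """Merge the trees of x and y. Return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                     # rx is now the deeper root
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True


def kruskal(vertices, edges):
    """edges: iterable of (weight, u, v). Return the MST as (u, v, weight)."""
    ds = DisjointSet(vertices)
    mst = []
    for w, u, v in sorted(edges):               # lightest edge first
        if ds.union(u, v):                      # skip edges that close a cycle
            mst.append((u, v, w))
    return mst


mst = kruskal(
    ["a", "b", "c", "d"],
    [(3, "a", "c"), (1, "a", "b"), (4, "c", "d"), (2, "b", "c")],
)
print(mst)
```

The edge $(a,c)$ is rejected because $a$ and $c$ already share a representative by the time it is considered.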
+
+### Kruskal's Algorithm Pseudocode
+![[Pasted image 20260308110516.png]]
+
+### MakeSet and Find Implementation Pseudocode
+![[Pasted image 20260308110545.png]]
+
+### Union Implementation Pseudocode
+![[Pasted image 20260308110557.png]]
+
+### Find Implementation w/ Path Compression Pseudocode
+![[Pasted image 20260308110614.png]]
@@ -0,0 +1,291 @@
+---
+tags:
+  - OMSCS
+  - Algorithms
+---
+# 05.1 - Max Flow - Ford-Fulkerson
+Outline for unit
+- Ford-Fulkerson algorithm
+- max-flow = min-cut
+	- implies correctness of FF and EK algorithms
+- image segmentation
+	- application of MaxFlow/MinCut
+- max-flow with demands
+- Edmonds-Karp algorithm
+
+## Problem Setup
+**Setting:**
+- sending supply from vertex $s$ to vertex $t$
+- have directed network (possibly a DAG, not necessarily)
+- edges have "capacities," instead of weights
+- maximize amount sent from $s$ through $G$ to $t$
+
+**Problem Formulation:**
+- Flow network:
+	- directed graph $G=(V,E)$
+	- designated $s,t \in V$
+	- for each $e \in E$, capacity $c_e > 0$
+- Maximize flow $s$ to $t$
+- $f_e=$ flow along edge $e$
+- **Input:** Flow network
+- **Goal:**
+	- find flows $f_e$ for $e \in E$
+	- **Capacity Constraint:** for all $e \in E, 0 \le f_e \le c_e$
+	- **Conservation of Flow:** for all $v \in V - \{s, t\}$
+		- flow-in to $v$ = flow-out of $v$
+		- $\sum_{(w, v) \in E}f_{(w,v)}=\sum_{(v, z) \in E} f_{(v,z)}$
+	- **Maximize:** find a valid flow of maximum size
+		- $size(f)=$ total flow sent
+		- = flow-out of $s$
+		- = flow-in to $t$
+
+## Example
+![[Pasted image 20260309221138.png]]
+- Source vs Sink
+	- Max flow out of $s$: $6+4+5=15$
+	- Max flow in to $t$: $7+5=12$
+	- $min(12,15)=12$
+	- Can we use all 12 capacity units?
+- vertex $d$
+	- Max flow in: 10
+	- Max flow out: 8
+	- min = 8
+- vertex $f$
+	- Max flow in: 8
+	- ...
+
+| Vertex | Max Flow In | Max Flow Out | Max Flow Through |
+| ------ | ----------- | ------------ | ---------------- |
+| s      | $\infty$    | 6+4+5=15     | 15               |
+| a      | 6           | 10           | 6                |
+| b      | 4           | 1            | 1                |
+| c      | 5+3=8       | 8            | 8                |
+| d      | 8+2=10      | 5+3=8        | 8                |
+| e      | 2+1+3=6     | 2            | 2                |
+| f      | 8           | 7+3=10       | 8                |
+| t      | 5+7=12      | $\infty$     | 12               |
+- 6 through $(s,a)$, 6 through $(a,d)$.
+	- $d$ has 6 flow.
+- 1 through $(s,b)$, 1 through $(b,e)$, 1 through $(e,d)$.
+	- $d$ has 7 flow.
+- 5 through $(d,t)$
+	- $t$ has 5 flow.
+	- $d$ has 2 remaining.
+- 5 through $(s,c)$, 5 through $(c,f)$, 5 through $(f,t)$
+	- $t$ has 10 flow.
+	- $(c,f)$ has 3 remaining out capacity
+	- $(f,t)$ has 2 remaining out capacity
+- 2 through $(d,c)$, 2 through $(c,f)$, 2 through $(f,t)$
+	- $d$ is empty
+	- $t$ has 12 flow
+
+**Total flow: 12** ($size(f)=12$)
+
+**Flow on each edge:**
+- $sa=6, sb=1, sc=5$
+- $ad=6, ae=0$
+- $be=1$
+- $cf=7$
+- $dc=2, dt=5$
+- $ed=1$
+- $fe=0, ft=7$
+
+We know that we've achieved the maximum flow because both edges into $t$ are saturated ($f_{dt}=5$ and $f_{ft}=7$, matching their capacities). If we subtract the flow from the edges incoming to $t$, then $t$ has no remaining incoming capacity, and is therefore now "cut off" from the rest of the network. The upcoming algorithms will codify that behavior.
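
The flow found by hand can be checked against the two constraints from the problem setup. The sketch below is illustrative only: the figure is an image, so the capacities are reconstructed from the in/out sums in the table above and should be treated as an assumption, and `flow_size` is a made-up helper name.

```python
# Capacities reconstructed from the table (assumption, the figure is an image).
capacity = {
    ("s", "a"): 6, ("s", "b"): 4, ("s", "c"): 5,
    ("a", "d"): 8, ("a", "e"): 2,
    ("b", "e"): 1,
    ("c", "f"): 8,
    ("d", "c"): 3, ("d", "t"): 5,
    ("e", "d"): 2,
    ("f", "e"): 3, ("f", "t"): 7,
}
# Flow values from the "flow on each edge" list above.
flow = {
    ("s", "a"): 6, ("s", "b"): 1, ("s", "c"): 5,
    ("a", "d"): 6, ("a", "e"): 0,
    ("b", "e"): 1,
    ("c", "f"): 7,
    ("d", "c"): 2, ("d", "t"): 5,
    ("e", "d"): 1,
    ("f", "e"): 0, ("f", "t"): 7,
}


def flow_size(capacity, flow, s, t):
    """Return size(f) if the flow is valid, otherwise raise ValueError."""
    # Capacity constraint: 0 <= f_e <= c_e for every edge.
    for e, c in capacity.items():
        if not 0 <= flow[e] <= c:
            raise ValueError(f"capacity violated on {e}")
    # Conservation of flow at every vertex other than s and t.
    vertices = {v for e in capacity for v in e}
    for v in vertices - {s, t}:
        flow_in = sum(f for (_, w), f in flow.items() if w == v)
        flow_out = sum(f for (u, _), f in flow.items() if u == v)
        if flow_in != flow_out:
            raise ValueError(f"conservation violated at {v}")
    # size(f) is the net flow leaving s.
    out_s = sum(f for (u, _), f in flow.items() if u == s)
    in_s = sum(f for (_, w), f in flow.items() if w == s)
    return out_s - in_s


print(flow_size(capacity, flow, "s", "t"))
```

This only confirms that the flow is valid and has size 12. The argument that 12 is the maximum still rests on the saturated edges into $t$.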
+
+## Cycles are OK
+In fact, the previous example has a cycle.
+
+- $c \rightarrow f \rightarrow e \rightarrow d \rightarrow c$
+- A max flow doesn't typically send flow all the way around a cycle.
+- If we send a unit of flow out of $d$, around the cycle, and then out of $d$ through $(d,t)$, we could have sent it straight through $(d,t)$ in the first place, so the trip around the cycle gains nothing.
+- Part of the cycle may still be used. Sometimes flow has to cross the network along some of the cycle's edges to reach the max flow. The example above sends flow on $(c,f)$, $(e,d)$, and $(d,c)$ but none on $(f,e)$.
+
+## Anti-Parallel Edges
+
+In the graph below, there are anti-parallel edges between a and b.
+
+```mermaid
+graph LR
+
+s --7--> a
+s --9--> b
+a --5--> t
+b --4--> t
+a --2--> b
+b --3--> a
+```
+
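+We want to remove anti-parallel edges because they complicate the residual network introduced later in these notes: the backward edge for $(a,b)$ would run parallel to the original edge $(b,a)$. We can remove them by splitting one edge of the pair with a new vertex, giving both new edges the capacity of the edge they replace. We'll call this new vertex $f$ in the diagram.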
+
+```mermaid
+graph LR
+
+f
+s
+a
+b
+t
+
+s --7--> a
+s --9--> b
+a --5--> t
+b --4--> t
+b --3--> f
+f --3--> a
+a --2--> b
+```
+
+Any flow on $G'$ converts directly back to a flow on $G$: by conservation of flow at $f$, the flow on $(b,f)$ equals the flow on $(f,a)$, and that value is the flow on the original edge $(b,a)$. The max flow has the same size in both graphs.
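
The transformation is mechanical enough to sketch in a few lines of Python. The function name `remove_antiparallel`, the dict-of-edges representation, and the `mid_u_v` naming for the new vertex are all assumptions for illustration. Which edge of each pair gets split is arbitrary, and here it is the one whose tail compares larger, which happens to pick $(b,a)$ as in the diagram.

```python
def remove_antiparallel(capacity):
    """Split one edge of every anti-parallel pair with a new middle vertex.

    capacity maps (u, v) to the capacity of the directed edge u -> v.
    """
    result = {}
    for (u, v), c in capacity.items():
        if (v, u) in capacity and u > v:
            mid = f"mid_{u}_{v}"       # hypothetical name for the new vertex
            result[(u, mid)] = c       # both halves keep the same capacity
            result[(mid, v)] = c
        else:
            result[(u, v)] = c
    return result


g = {
    ("s", "a"): 7, ("s", "b"): 9,
    ("a", "t"): 5, ("b", "t"): 4,
    ("a", "b"): 2, ("b", "a"): 3,
}
g_prime = remove_antiparallel(g)
print(g_prime)
```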
+
+## Toy Example
+```mermaid
+graph LR
+
+s --10--> a
+a --10--> b
+s --7--> b
+a --7--> t
+b --10--> t
+```
+
+## Algorithm Idea
+1. Start with $f_e=0$ for all $e \in E$
+2. We'll keep track of an "available capacity" for each edge. This is just $c_e - f_e$.
+3. Find an st-path ($P$) **with available capacity** (a path from s to t). This can be done with BFS or DFS.
+4. Let $c(P)=\min_{e \in P}(c_e - f_e)$ be the smallest available capacity of any edge in $P$. This is the maximum amount of flow that can be sent along that path, considering the available capacity of all $e \in P$.
+5. Augment $f$ by $c(P)$ along $P$, then repeat from step 3 until no st-path with available capacity remains
+
+**Available Capacity**
+```mermaid
+graph LR
+
+s --10--> a
+a --10--> b
+s --7--> b
+a --7--> t
+b --10--> t
+```
+
+**Flow**
+```mermaid
+graph LR
+
+s --0--> a
+a --0--> b
+s --0--> b
+a --0--> t
+b --0--> t
+```
+
+**P Selection**
+```mermaid
+graph LR
+
+s ==10==> a
+a ==10==> b
+s --7--> b
+a --7--> t
+b ==10==> t
+```
+
+**Update Flow**
+```mermaid
+graph LR
+
+s --10--> a
+a --10--> b
+s --0--> b
+a --0--> t
+b --10--> t
+```
+
+**Available Capacity**
+```mermaid
+graph LR
+
+s --0--> a
+a --0--> b
+s --7--> b
+a --7--> t
+b --0--> t
+```
+
+We've achieved a flow of 10, but now we can't reach t from s.
+
+## Backward Edges
+The algorithm selected edge $(a,b)$ and used all of its available capacity. If we now look at edge $(s,b)$, there's nowhere to send its available capacity of 7, because $(b,t)$ is already full. However, we can consider **backward edges**. Now that we've sent 10 units of flow across $(a,b)$, the reverse edge $(b,a)$ has an available capacity of 10. Using it cancels flow already sent across $(a,b)$, effectively sending it backwards through the pipe to where it came from.
+
+```mermaid
+graph LR
+
+s --0--> a
+a --0--> b
+s --7--> b
+b ==10==> a
+a --7--> t
+b --0--> t
+```
+
+In fact we can do this for all edges. The bold edges are the backwards edges.
+
+```mermaid
+graph LR
+
+s --0--> a
+a --0--> b
+s --7--> b
+a --7--> t
+b --0--> t
+t ==10==> b
+a ==10==> s
+b ==10==> a
+t ==0==> a
+b ==0==> s
+```
+
+We refer to this network as the **Residual Network**, renaming it from the **Available Capacity** network.
+
+## Residual Network
+Definition of residual network: $G^f=(V,E^f)$
+
+For flow network $G=(V,E)$ with $c_e$ for $e \in E$ and flow $f_e$ for $e \in E$
+- If $vw \in E$ and $f_{vw} < c_{vw}$, then add $vw$ to $G^f$ with capacity $c_{vw}-f_{vw}$
+- If $vw \in E$ and $f_{vw} > 0$, then add $wv$ to $G^f$ with capacity $f_{vw}$
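
The two rules translate almost line for line into code. Below is a small sketch, assuming edges are stored as a dict from `(v, w)` to a number and that $G$ has no anti-parallel edges (so a backward edge never collides with an original edge). The function name `residual_network` is illustrative.

```python
def residual_network(capacity, flow):
    """Return the edges of G^f as a dict from (v, w) to residual capacity."""
    residual = {}
    for (v, w), c in capacity.items():
        f = flow.get((v, w), 0)
        if f < c:
            residual[(v, w)] = c - f   # forward edge: leftover capacity
        if f > 0:
            residual[(w, v)] = f       # backward edge: flow that can be undone
    return residual


# Toy example after sending 10 units along s -> a -> b -> t.
capacity = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
            ("a", "t"): 7, ("b", "t"): 10}
flow = {("s", "a"): 10, ("a", "b"): 10, ("b", "t"): 10}
print(residual_network(capacity, flow))
```

The result matches the residual diagram above, except that edges with zero capacity are omitted instead of being drawn with a 0 label.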
+
+## Ford-Fulkerson Algorithm
+1. Set $f_e=0$ for all $e \in E$
+2. Build the residual network $G^f$ for current flow $f$
+3. Check for an st-path $P$ in $G^f$. If no such path exists, output $f$
+4. Given $P$, let $c(P)=$ min capacity along $P$ in $G^f$
+5. Augment $f$ by $c(P)$ units along $P$: increase $f_e$ by $c(P)$ for forward edges of $P$, and decrease it by $c(P)$ for backward edges
+6. Repeat from step 2
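
One way to turn these steps into runnable code is sketched below. It is a sketch under assumptions, not the course's reference implementation: the residual capacities are kept in a single dict that is updated in place instead of rebuilding $G^f$ each round, the st-path is found with a simple stack-based search, integer capacities and no anti-parallel edges are assumed, and the name `ford_fulkerson` is illustrative.

```python
def ford_fulkerson(capacity, s, t):
    """Return (size of max flow, flow per original edge)."""
    # Residual capacities: forward edges start at c_e, backward edges at 0.
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, []).append(v)

    total = 0
    while True:
        # Search for an st-path that only uses edges with residual capacity.
        parent = {s: None}
        stack = [s]
        while stack and t not in parent:
            u = stack.pop()
            for v in adj.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            break                      # no st-path left: the flow is maximum

        # Walk back from t to recover the path P and its bottleneck c(P).
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)

        # Augment: use up capacity along P and open up the reverse edges.
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck

    flow = {e: c - residual[e] for e, c in capacity.items()}
    return total, flow


toy = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
       ("a", "t"): 7, ("b", "t"): 10}
total, flow = ford_fulkerson(toy, "s", "t")
print(total)  # 17
```

On the toy example this reaches 17, which is the total capacity leaving $s$, so no larger flow is possible.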
+
+### Running Time
+- **Correctness:** follows from max-flow = min-cut theorem
+- **Major Assumption:**
+	- Assume all capacities are integers
+	- When we augment flows, we augment them by an integer amount.
+	- Flow increases by $\ge 1$ unit per round
+- Let $C=$ size of max flow, then $\le C$ rounds.
+
+- **The time needed for one round of the algorithm.**
+	- Build residual network...
+		- Must be built once at the beginning: $O(n+m)$
+		- On each iteration, it only changes based on the changes made along st-path $P$
+		- Path is of length at most $n-1$ edges.
+		- On each iteration: $O(n)$
+	- Finding st-path...
+		- DFS or BFS: naively $O(n+m)$
+		- Assume $m \ge n-1$ (true whenever $G$ is connected), so $O(n+m)=O(m)$
+		- $O(m)$
+	- Augment $f$...
+		- $O(n)$
+	- Each iteration is dominated by $O(m)$
+
+**Overall:** $O(mC)$
+
+**Complications**
+- Run time depends on the output
+	- $C$ is the value of the max flow, so $O(mC)$ is pseudo-polynomial
+	- Edmonds-Karp: $O(m^2n)$, which does not depend on $C$ and so may be preferable when capacities are large.
+	- Orlin 2013: $O(mn)$
+- Assumes integer capacities.
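
To make the dependence on $C$ concrete, the sketch below simulates an unlucky choice of augmenting paths on a small diamond-shaped network that is not from the notes: edges $(s,a)$, $(s,b)$, $(a,t)$, $(b,t)$ with capacity `big` and a middle edge $(a,b)$ with capacity 1. The path order is hard-coded to alternate between $s \to a \to b \to t$ and $s \to b \to a \to t$, which is a hypothetical worst case and not what any particular search would necessarily pick.

```python
def count_bad_rounds(big):
    """Count augmentations when every path crosses the capacity-1 edge."""
    residual = {
        ("s", "a"): big, ("s", "b"): big,
        ("a", "b"): 1, ("b", "a"): 0,
        ("a", "t"): big, ("b", "t"): big,
    }
    paths = [
        [("s", "a"), ("a", "b"), ("b", "t")],   # uses (a,b) forward
        [("s", "b"), ("b", "a"), ("a", "t")],   # uses (a,b) backward
    ]
    rounds = 0
    total = 0
    while True:
        path = paths[rounds % 2]                # adversarial alternation
        bottleneck = min(residual[e] for e in path)
        if bottleneck == 0:
            break                               # s and t are disconnected
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] = residual.get((v, u), 0) + bottleneck
        rounds += 1
        total += bottleneck
    return rounds, total


print(count_bad_rounds(50))  # (100, 100)
```

Every round pushes only 1 unit, so the number of rounds equals the max flow $2 \cdot big$, even though two augmentations along $s \to a \to t$ and $s \to b \to t$ would have been enough.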