Commit 7b4ddda

Refresh of OMSCS notes
1 parent 8d3db18 commit 7b4ddda

28 files changed

Lines changed: 607 additions & 6 deletions

OMSCS/Courses/GA/04.3 - Graphs - MST.md

Lines changed: 36 additions & 0 deletions
@@ -144,3 +144,39 @@ When we do $T'=(T \cup e^*) - e'$, how do we prove that $T'$ is a tree?
144144
- This is an MST algorithm akin to Dijkstra's algorithm.
145145
- We can use the cut property to prove correctness of Prim's algorithm.
146146

147+
First, we create a `cost[]` array, mapping each vertex to a specific cost. For all vertices, this cost is initialized to infinity. We then pick an arbitrary starting vertex $v_0$ and set its cost to 0. We now have $S=\{v_0\}$ and $\overline{S}$, with no edges in $X$.
148+
149+
We also create a `prev[]` array, mapping each vertex to its parent vertex within the MST. This array is initialized to `nil` for every vertex.
150+
151+
We now create a priority queue (PQ) containing all vertices in $G$, keyed by each vertex's cost. This will put $v_0$ at the front of the PQ.
152+
153+
Now, we iterate until the PQ is empty. We take the lowest-cost vertex $v$ from the PQ. Then, for each undirected edge containing $v$ ($e=(v,z) \in E$) whose other endpoint $z$ is still in the PQ, we check whether the cost of $z$ (`cost[z]`) is higher than the weight of $e$. If it is, then we have a better cost estimate for $z$, so we update the data structures we're using to track the state of the process:
154+
155+
1. We update the cost of $z$ to the weight of $e$ ($cost[z] := w(e)$)
156+
2. We set the previous node of $z$ to $v$ ($prev[z]=v$)
157+
3. We update the priority value of $z$ within the PQ to $z$'s new cost
158+
159+
The `prev[]` pointers always form a tree rooted at $v_0$, provided we only update vertices that are still in the PQ. Each vertex's `prev` value is final once the vertex is removed from the PQ, and it points to a vertex that was removed earlier, so following `prev` pointers can never loop back, and $prev[v_0]$ stays `nil`. Without that restriction, a cheap edge examined later could overwrite `prev[z]` for a vertex $z$ already in $S$ and create a cycle. The pseudocode in DPV appears to leave this check implicit. Initializing the cost of $v_0$ to $-\infty$ instead of 0 protects $v_0$ against negative edges, but it does not protect the other vertices in $S$, so the membership check is the more general fix.
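A minimal Python sketch of the procedure above, under a few assumptions that are not in the notes: the graph is a `dict` mapping each vertex to a list of `(neighbor, weight)` pairs, and the function name `prim` is hypothetical. Python's `heapq` has no decrease-key operation, so this sketch pushes a fresh entry instead and skips stale ones; the `in_tree` set plays the role of the "still in the PQ" check.

```python
import heapq
from math import inf

def prim(graph, start):
    """Return (prev, cost) describing an MST of a connected undirected graph.

    graph: dict mapping vertex -> list of (neighbor, weight) pairs,
           with every undirected edge listed in both directions.
    """
    cost = {v: inf for v in graph}
    prev = {v: None for v in graph}
    cost[start] = 0
    in_tree = set()        # vertices already removed from the queue (the set S)
    pq = [(0, start)]      # entries are (cost, vertex); stale entries are skipped

    while pq:
        _, v = heapq.heappop(pq)
        if v in in_tree:
            continue       # stale entry left over from an earlier, higher cost
        in_tree.add(v)
        for z, w in graph[v]:
            # Only vertices outside S may be updated; this keeps prev[] a tree.
            if z not in in_tree and w < cost[z]:
                cost[z] = w
                prev[z] = v
                heapq.heappush(pq, (w, z))   # stands in for decrease-key
    return prev, cost

# Triangle where the membership check matters: a-b (5), a-c (4), b-c (1).
graph = {
    "a": [("b", 5), ("c", 4)],
    "b": [("a", 5), ("c", 1)],
    "c": [("a", 4), ("b", 1)],
}
prev, cost = prim(graph, "a")
mst_weight = sum(cost.values())   # cost[v] is the weight of edge (prev[v], v)
```

On this triangle, dropping the `z not in in_tree` test would let the edge $(b,c)$ overwrite `prev["c"]` after $c$ had already joined the tree, leaving $b$ and $c$ pointing at each other.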
160+
161+
## Union-by-Rank Algorithm
162+
Kruskal's Algorithm is based on the unioning of disjoint sets.
163+
164+
This algorithm uses a data structure which starts off as $n$ disjoint trees, each with a vertex $v \in V$ as the only vertex in the tree. This data structure is symbolized as $\pi$, mapping each vertex $v \in V$ to a parent vertex, all initialized to `nil`. We also have a separate array $rank$, which maps each vertex $v \in V$ to the height of the subtree rooted at $v$.

When we select a minimum weight edge $e=(v,u)$ from $G$ to potentially add to $X$, we programmatically accomplish this by conditionally performing the union of two trees in $\pi$. We first traverse from $u$ to the root of its tree and from $v$ to the root of its tree. This gives us the representative of $u$ ($r_u$) and the representative of $v$ ($r_v$). $u$ and $v$ are not guaranteed to be from different trees, so we check whether $r_u=r_v$. If they're equal, then we can't add $e$ to $X$, as that would create a cycle in $X$. If $u$ and $v$ have different roots, we make one root the parent of the other, based on whichever is the root of the deeper tree ($rank(r_u)$ vs $rank(r_v)$): the deeper tree absorbs the shallower tree, and if the ranks are equal, either root can become the parent and its rank increases by 1. This keeps the trees shallow, which matters because walking up to the root is the primary cost of each operation.
165+
166+
The end result is a tree-structured set which contains every $v \in V$. By using this tree structure, we don't have to run a hash algorithm for each vertex in V, and we don't have to consider the worst-case big-$O$ of hash sets.
167+
168+
**Path compression** is a process which makes the trees shallower over time. Whenever we look up a vertex, we follow parent pointers until we come to the root of the tree. For each vertex along that path, we update the vertex's parent to the root of the tree which contains the vertex. Effectively, this caches the current representative of a particular vertex $v$, so that subsequent lookups of $v$ require fewer hops to determine $v$'s representative. If the root of $v$'s tree is later absorbed into a different tree, then the next lookup of $v$ requires an additional hop to find $v$'s new representative; that same lookup then updates $v$'s parent to point directly at the new root.
169+
170+
Union by rank ensures that every tree has height at most $\log n$, so both lookup operations and tree-union operations are $O(\log n)$. Path compression lowers the amortized cost of lookups further.
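The union-by-rank structure and its use in Kruskal's algorithm can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of the pseudocode images below: the names `DisjointSet` and `kruskal` are hypothetical, edges are `(weight, u, v)` tuples, and each root points to itself rather than to `nil`, which is a common alternative convention.

```python
class DisjointSet:
    """Union-find with union by rank and path compression."""

    def __init__(self, vertices):
        # Each vertex starts as the root of its own one-vertex tree.
        # Assumption: a root is its own parent (instead of nil).
        self.parent = {v: v for v in vertices}
        self.rank = {v: 0 for v in vertices}

    def find(self, v):
        # First pass: walk up to the root.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every vertex on the path directly at the root.
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, u, v):
        """Merge the trees of u and v; return False if they already match."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru          # make ru the root of the deeper tree
        self.parent[rv] = ru         # deeper tree absorbs the shallower one
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1       # equal ranks: merged tree is one level deeper
        return True


def kruskal(vertices, edges):
    """edges: iterable of (weight, u, v). Returns the list of MST edges."""
    sets = DisjointSet(vertices)
    mst = []
    for w, u, v in sorted(edges):    # consider edges in order of weight
        if sets.union(u, v):         # skip edges that would close a cycle
            mst.append((w, u, v))
    return mst


edges = [(5, "a", "b"), (4, "a", "c"), (1, "b", "c"), (7, "c", "d")]
mst = kruskal("abcd", edges)
total = sum(w for w, _, _ in mst)
```

Here the edge $(a,b)$ with weight 5 is rejected because $a$ and $b$ already share a representative by the time it is considered.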
171+
172+
### Kruskal's Algorithm Pseudocode
173+
![[Pasted image 20260308110516.png]]
174+
175+
### MakeSet and Find Implementation Pseudocode
176+
![[Pasted image 20260308110545.png]]
177+
178+
### Union Implementation Pseudocode
179+
![[Pasted image 20260308110557.png]]
180+
181+
### Find Implementation w/ Path Compression Pseudocode
182+
![[Pasted image 20260308110614.png]]
Lines changed: 291 additions & 0 deletions
@@ -0,0 +1,291 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- Algorithms
5+
---
6+
# 05.1 - Max Flow - Ford-Fulkerson
7+
Outline for unit
8+
- ford-fulkerson algorithm
9+
- max-flow = min-cut
10+
- implies correctness of FF and EK algorithms
11+
- image segmentation
12+
- application of MaxFlow/MinCut
13+
- max-flow with demands
14+
- edmonds-karp algorithm
15+
16+
## Problem Setup
17+
**Setting:**
18+
- sending supply from vertex $s$ to vertex $t$
19+
- have directed network (possibly a DAG, not necessarily)
20+
- edges have "capacities," instead of weights
21+
- maximize amount sent from $s$ through $G$ to $t$
22+
23+
**Problem Formulation:**
24+
- Flow network:
25+
- directed graph $G=(V,E)$
26+
- designated $s,t \in V$
27+
- for each $e \in E$, capacity $c_e > 0$
28+
- Maximize flow $s$ to $t$
29+
- $f_e=$ flow along edge $e$
30+
- **Input:** Flow network
31+
- **Goal:**
32+
- find flows $f_e$ for $e \in E$
33+
- **Capacity Constraint:** for all $e \in E, 0 \le f_e \le c_e$
34+
- **Conservation of Flow:** for all $v \in V - \{ s, t \}$
35+
- flow-in to $v$ = flow-out of $v$
36+
- $\sum_{(w, v) \in E}f_{(w,v)}=\sum_{(v, z) \in E} f_{(v,z)}$
37+
- **Maximize:** find a valid flow of maximum size
38+
- $size(f)=$ total flow sent
39+
- = flow-out of $s$
40+
- = flow-in to $t$
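The two constraints and the definition of $size(f)$ can be checked mechanically. The Python sketch below is illustrative only: the function name `flow_size` and the dict-of-edges representation are assumptions, and the five-edge network is a small example chosen for the check.

```python
def flow_size(capacity, flow, s, t):
    """Validate `flow` against both constraints and return size(f).

    capacity, flow: dicts mapping directed edges (v, w) to numbers.
    Raises ValueError if a constraint is violated.
    """
    # Capacity constraint: 0 <= f_e <= c_e for every edge.
    for e, c in capacity.items():
        if not 0 <= flow.get(e, 0) <= c:
            raise ValueError(f"capacity constraint violated on {e}")

    # Conservation of flow at every vertex except s and t.
    vertices = {v for e in capacity for v in e}
    for v in vertices - {s, t}:
        flow_in = sum(f for (_, w), f in flow.items() if w == v)
        flow_out = sum(f for (u, _), f in flow.items() if u == v)
        if flow_in != flow_out:
            raise ValueError(f"conservation violated at {v}")

    # size(f) = flow out of s (this example has no edges into s).
    return sum(f for (u, _), f in flow.items() if u == s)


capacity = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
            ("a", "t"): 7, ("b", "t"): 10}
flow = {("s", "a"): 10, ("a", "b"): 3, ("s", "b"): 7,
        ("a", "t"): 7, ("b", "t"): 10}
size = flow_size(capacity, flow, "s", "t")
```

For this flow, the total leaving $s$ (10 + 7) equals the total entering $t$ (7 + 10), as the definition requires.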
41+
42+
## Example
43+
![[Pasted image 20260309221138.png]]
44+
- Source vs Sink
45+
- Max flow out of $s$: $6+4+5=15$
46+
- Max flow in to $t$: $7+5=12$
47+
- $min(12,15)=12$
48+
- Can we use all 12 capacity units?
49+
- vertex $d$
50+
- Max flow in: 10
51+
- Max flow out: 8
52+
- min = 8
53+
- vertex $f$
54+
- Max flow in: 2
55+
- ...
56+
57+
| Vertex | Max Flow In | Max Flow Out | Max Flow Through |
58+
| ------ | ----------- | ------------ | ---------------- |
59+
| s | $\infty$ | 6+4+5=15 | 15 |
60+
| a | 6 | 10 | 6 |
61+
| b | 4 | 1 | 1 |
62+
| c | 5+3=8 | 8 | 8 |
63+
| d | 8+2=10 | 5+3=8 | 8 |
64+
| e | 2+1+3=6 | 2 | 2 |
65+
| f | 8 | 7+3=10 | 8 |
66+
| t | 5+7=12 | $\infty$ | 12 |
67+
- 6 through $(s,a)$, 6 through $(a,d)$.
68+
- $d$ has 6 flow.
69+
- 1 through $(s,b)$, 1 through $(b,e)$, 1 through $(e,d)$.
70+
- $d$ has 7 flow.
71+
- 5 through $(d,t)$
72+
- $t$ has 5 flow.
73+
- $d$ has 2 remaining.
74+
- 5 through $(s,c)$, 5 through $(c,f)$, 5 through $(f,t)$
75+
- $t$ has 10 flow.
76+
- $(c,f)$ has 3 remaining out capacity
77+
- $(f,t)$ has 2 remaining out capacity
78+
- 2 through $(d,c)$, 2 through $(c,f)$, 2 through $(f,t)$
79+
- $d$ is empty
80+
- $t$ has 12 flow
81+
82+
**Total flow: 12** ($size(f)=12$)
83+
**Edges used:**
84+
- $sa=6, sb=1, sc=5$
85+
- $ad=6, ae=0$
86+
- $be=1$
87+
- $cf=7$
88+
- $dc=2, dt=5$
89+
- $ed=1$
90+
- $fe=0, ft=7$
91+
92+
We know that we've achieved the maximum flow because both edges into $t$ are saturated: $(d,t)$ carries 5 and $(f,t)$ carries 7, which uses all 12 units of capacity entering $t$. With no remaining incoming capacity, $t$ is "cut off" from the rest of the network, so no flow can be larger than 12. The upcoming algorithms will codify that behavior.
93+
94+
## Cycles are OK
95+
In fact, the previous example has a cycle.
96+
97+
- $c \rightarrow f \rightarrow e \rightarrow d \rightarrow c$
98+
- The flow doesn't typically use the cycle.
99+
- If we send a unit of flow out of $d$, around the cycle, then back out of $d$ through $(d,t)$, we could have just sent it straight out through $(d,t)$ in the first place.
100+
- Part of the cycle may be utilized: the flow above sends 2 units along $(d,c)$. Sometimes you need to pass flow across the network in order to achieve the max flow.
101+
102+
## Anti-Parallel Edges
103+
104+
In the graph below, there are anti-parallel edges between a and b.
105+
106+
```mermaid
107+
graph LR
108+
109+
s --7--> a
110+
s --9--> b
111+
a --5--> t
112+
b --4--> t
113+
a --2--> b
114+
b --3--> a
115+
```
116+
117+
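We want to remove anti-parallel edges because they complicate the residual network used later: the backward edge for $(a,b)$ would coincide with the forward edge $(b,a)$. We can remove them by interrupting one edge of the pair with a new vertex, giving both new edges the original edge's capacity. We'll call this new vertex $f$ in the diagram.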
118+
119+
```mermaid
120+
graph LR
121+
122+
f
123+
s
124+
a
125+
b
126+
t
127+
128+
s --7--> a
129+
s --9--> b
130+
a --5--> t
131+
b --4--> t
132+
b --3--> f
133+
f --3--> a
134+
a --2--> b
135+
```
136+
137+
This transformation is simple and preserves the max flow: any flow on $(b,a)$ in $G$ corresponds to the same flow on $(b,f)$ and $(f,a)$ in $G'$, so we can easily convert the max flow on $G'$ back to $G$.
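A short Python sketch of this transformation, assuming the network is a dict from directed edges to capacities and that vertex names are strings. The function name `remove_antiparallel` and the generated vertex name (`"ba_mid"` here, standing in for the $f$ in the diagram) are hypothetical.

```python
def remove_antiparallel(capacity):
    """Return a copy of `capacity` with no anti-parallel edge pairs.

    capacity: dict mapping directed edges (v, w) to capacities.
    For each pair (v, w), (w, v), the edge whose tail has the larger
    name is rerouted through a new vertex with the same capacity.
    """
    result = dict(capacity)
    for (v, w), c in capacity.items():
        if (w, v) in capacity and v > w:      # handle each pair exactly once
            mid = f"{v}{w}_mid"               # hypothetical name for the new vertex
            del result[(v, w)]
            result[(v, mid)] = c
            result[(mid, w)] = c
    return result


g = {("s", "a"): 7, ("s", "b"): 9, ("a", "t"): 5,
     ("b", "t"): 4, ("a", "b"): 2, ("b", "a"): 3}
g2 = remove_antiparallel(g)
```

The result keeps $(a,b)$ with capacity 2 and replaces $(b,a)$ with two edges of capacity 3 through the new vertex, matching the second diagram.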
138+
139+
## Toy Example
140+
```mermaid
141+
graph LR
142+
143+
s --10--> a
144+
a --10--> b
145+
s --7--> b
146+
a --7--> t
147+
b --10--> t
148+
```
149+
150+
## Algorithm Idea
151+
1. Start with $f_e=0$ for all $e \in E$
152+
2. We'll keep track of an "available capacity" for each edge. This is just $c_e - f_e$.
153+
3. Find an st-path ($P$) **with available capacity** (a path from s to t). This can be done with BFS or DFS.
154+
4. Let $c(P)=\min_{e \in P}(c_e - f_e)$ be the smallest available capacity of any edge in $P$. This is the maximum amount of flow that can be sent along that path, considering the available capacity of all $e \in P$.
155+
5. Augment $f$ by $c(P)$ along $P$, then repeat from step 3 until no such path remains
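Steps 4 and 5 can be sketched in a few lines of Python. The function name `augment` and the dict-of-edges representation are assumptions made for illustration; the network and the chosen path $s \rightarrow a \rightarrow b \rightarrow t$ are the ones from the toy example.

```python
def augment(capacity, flow, path):
    """Push the bottleneck amount along `path` (a list of edges); return c(P)."""
    # c(P): the smallest available capacity of any edge on the path.
    bottleneck = min(capacity[e] - flow[e] for e in path)
    for e in path:
        flow[e] += bottleneck
    return bottleneck


capacity = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
            ("a", "t"): 7, ("b", "t"): 10}
flow = {e: 0 for e in capacity}                      # step 1: f_e = 0

sent = augment(capacity, flow, [("s", "a"), ("a", "b"), ("b", "t")])
available = {e: capacity[e] - flow[e] for e in capacity}
```

After this single augmentation, `available` has 0 on $(s,a)$, $(a,b)$, and $(b,t)$ and 7 on the other two edges, so this simple version of the algorithm finds no further st-path.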
156+
157+
**Available Capacity**
158+
```mermaid
159+
graph LR
160+
161+
s --10--> a
162+
a --10--> b
163+
s --7--> b
164+
a --7--> t
165+
b --10--> t
166+
```
167+
168+
**Flow**
169+
```mermaid
170+
graph LR
171+
172+
s --0--> a
173+
a --0--> b
174+
s --0--> b
175+
a --0--> t
176+
b --0--> t
177+
```
178+
179+
**P Selection**
180+
```mermaid
181+
graph LR
182+
183+
s ==10==> a
184+
a ==10==> b
185+
s --7--> b
186+
a --7--> t
187+
b ==10==> t
188+
```
189+
190+
**Update Flow**
191+
```mermaid
192+
graph LR
193+
194+
s --10--> a
195+
a --10--> b
196+
s --0--> b
197+
a --0--> t
198+
b --10--> t
199+
```
200+
201+
**Available Capacity**
202+
```mermaid
203+
graph LR
204+
205+
s --0--> a
206+
a --0--> b
207+
s --7--> b
208+
a --7--> t
209+
b --0--> t
210+
```
211+
212+
We've achieved a flow of 10, but now we can't reach t from s.
213+
214+
## Backward Edges
215+
The algorithm selected edge $(a,b)$ and used all of its available capacity. If we now evaluate edge $(s,b)$, there's nowhere to send its available capacity of 7. However, we can consider backward edges. Now that we've sent 10 units of flow across $(a,b)$, we give the reverse edge $(b,a)$ an available capacity of 10. Sending flow along $(b,a)$ effectively pushes flow backwards through the pipe: it cancels flow previously sent along $(a,b)$ and frees it to take a different route.
216+
217+
```mermaid
218+
graph LR
219+
220+
s --0--> a
221+
a --0--> b
222+
s --7--> b
223+
b ==10==> a
224+
a --7--> t
225+
b --0--> t
226+
```
227+
228+
In fact we can do this for all edges. The bold edges are the backwards edges.
229+
230+
```mermaid
231+
graph LR
232+
233+
s --0--> a
234+
a --0--> b
235+
s --7--> b
236+
a --7--> t
237+
b --0--> t
238+
t ==10==> b
239+
a ==10==> s
240+
b ==10==> a
241+
t ==0==> a
242+
b ==0==> s
243+
```
244+
245+
We refer to this network as the **Residual Network**, renaming it from the **Available Capacity** network.
246+
247+
## Residual Network
248+
Definition of residual network: $G^f=(V,E^f)$
249+
250+
For flow network $G=(V,E)$ with $c_e$ for $e \in E$ and flow $f_e$ for $e \in E$
251+
- If $vw \in E$ and $f_{vw} < c_{vw}$, then add $vw$ to $G^f$ with capacity $c_{vw}-f_{vw}$
252+
- If $vw \in E$ and $f_{vw} > 0$, then add $wv$ to $G^f$ with capacity $f_{vw}$
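The two rules translate directly into Python. This is a sketch under assumptions: the function name `residual_network` is hypothetical, networks are dicts from directed edges to numbers, and the input has no anti-parallel edges (otherwise a backward edge would collide with a forward edge in the dict).

```python
def residual_network(capacity, flow):
    """Build the residual edge capacities for flow `flow` on network `capacity`.

    Assumes no anti-parallel edges in `capacity`.
    """
    residual = {}
    for (v, w), c in capacity.items():
        f = flow.get((v, w), 0)
        if f < c:
            residual[(v, w)] = c - f    # forward edge: remaining capacity
        if f > 0:
            residual[(w, v)] = f        # backward edge: flow that can be undone
    return residual


capacity = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
            ("a", "t"): 7, ("b", "t"): 10}
# Flow after sending 10 units along s -> a -> b -> t.
flow = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 0,
        ("a", "t"): 0, ("b", "t"): 10}
residual = residual_network(capacity, flow)
```

Edges with zero residual capacity are simply left out, so the result contains the three backward edges of capacity 10 plus the untouched forward edges $(s,b)$ and $(a,t)$.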
253+
254+
## Ford-Fulkerson Algorithm
255+
1. Set $f_e=0$ for all $e \in E$
256+
2. Build the residual network $G^f$ for current flow $f$
257+
3. Check for an st-path $P$ in $G^f$. If there is no such path, then output $f$
258+
4. Given $P$, let $c(P)=$ min capacity along $P$ in $G^f$
259+
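5. Augment $f$ by $c(P)$ units along $P$: add $c(P)$ to $f_e$ for each forward edge on $P$, and subtract $c(P)$ from the flow of the corresponding original edge for each backward edge on $P$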
260+
6. Repeat from step 2 until no such st-path
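The steps above can be sketched in Python as follows. This is one possible rendering, not a definitive implementation: the function name `ford_fulkerson` and the dict representation are assumptions, the path search is a plain DFS, and the input is assumed to have integer capacities and no anti-parallel edges. For simplicity the residual network is rebuilt from scratch on every round.

```python
def ford_fulkerson(capacity, s, t):
    """Return (size, flow) of a max flow from s to t."""
    flow = {e: 0 for e in capacity}                  # step 1

    def residual():                                  # step 2
        g = {}
        for (v, w), c in capacity.items():
            f = flow[(v, w)]
            if f < c:
                g.setdefault(v, {})[w] = c - f       # forward edge
            if f > 0:
                g.setdefault(w, {})[v] = f           # backward edge
        return g

    def find_path(g):                                # step 3
        # Iterative DFS from s; returns the st-path as a list of edges, or None.
        parent = {s: None}
        stack = [s]
        while stack:
            v = stack.pop()
            if v == t:
                break
            for w in g.get(v, {}):
                if w not in parent:
                    parent[w] = v
                    stack.append(w)
        if t not in parent:
            return None
        path, w = [], t
        while parent[w] is not None:
            path.append((parent[w], w))
            w = parent[w]
        return path[::-1]

    while True:
        g = residual()
        path = find_path(g)
        if path is None:
            break                                    # no st-path: f is maximum
        bottleneck = min(g[v][w] for v, w in path)   # step 4
        for v, w in path:                            # step 5
            if (v, w) in capacity:
                flow[(v, w)] += bottleneck           # forward edge: send more
            else:
                flow[(w, v)] -= bottleneck           # backward edge: cancel flow

    out_s = sum(f for (v, _), f in flow.items() if v == s)
    in_s = sum(f for (_, w), f in flow.items() if w == s)
    return out_s - in_s, flow


capacity = {("s", "a"): 10, ("a", "b"): 10, ("s", "b"): 7,
            ("a", "t"): 7, ("b", "t"): 10}
size, flow = ford_fulkerson(capacity, "s", "t")
```

On the toy example this reaches a flow of size 17, which saturates both edges out of $s$ and both edges into $t$.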
261+
262+
### Running Time
263+
- **Correctness:** follows from max-flow = min-cut theorem
264+
- **Major Assumption:**
265+
- Assume all capacities are integers
266+
- When we augment flows, we augment them by an integer amount.
267+
- Flow increases by $\ge 1$ unit per round
268+
- Let $C=$ size of max flow, then $\le C$ rounds.
269+
270+
- **The time needed for one round of the algorithm.**
271+
- Build residual network...
272+
- Must be built once at the beginning: $O(n+m)$
273+
- On each iteration, it only changes based on the changes made along st-path $P$
274+
- Path is of length at most $n-1$ edges.
275+
- On each iteration: $O(n)$
276+
- Finding st-path...
277+
- DFS or BFS: naively $O(n+m)$
278+
- Let's assume that $m \ge n-1$. Therefore we're bounded by order $m$
279+
- $O(m)$
280+
- Augment $f$...
281+
- $O(n)$
282+
- Each iteration is dominated by $O(m)$
283+
284+
**Overall:** $O(mC)$
285+
286+
**Complications**
287+
- Run time depends on output
288+
- $O(mC)$ is pseudo-polynomial
289+
- Edmonds-Karp: $O(m^2n)$, which does not depend on $C$ and may be desirable in some cases.
290+
- Orlin 2013: $O(mn)$
291+
- Assumes integer capacities.
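To see why the dependence on $C$ matters, the Python sketch below simulates an unlucky sequence of augmenting paths on a well-known bad-case network: four edges of capacity `cap` plus a middle edge $(a,b)$ of capacity 1. The function name `count_rounds`, the parameter `cap`, and the fixed alternation of paths are assumptions made for illustration; a real run only behaves this way if the path search happens to pick these paths.

```python
def count_rounds(cap):
    """Augment along alternating paths through the capacity-1 edge (a, b)."""
    capacity = {("s", "a"): cap, ("s", "b"): cap, ("a", "b"): 1,
                ("a", "t"): cap, ("b", "t"): cap}
    flow = {e: 0 for e in capacity}

    def residual(v, w):
        # Forward edge: capacity left. Backward edge: flow that can be cancelled.
        if (v, w) in capacity:
            return capacity[(v, w)] - flow[(v, w)]
        return flow[(w, v)]

    paths = [
        [("s", "a"), ("a", "b"), ("b", "t")],   # uses (a, b) forward
        [("s", "b"), ("b", "a"), ("a", "t")],   # uses (a, b) backward
    ]
    rounds = 0
    while True:
        path = paths[rounds % 2]
        bottleneck = min(residual(v, w) for v, w in path)
        if bottleneck == 0:
            break
        for v, w in path:
            if (v, w) in capacity:
                flow[(v, w)] += bottleneck
            else:
                flow[(w, v)] -= bottleneck
        rounds += 1
    size = flow[("s", "a")] + flow[("s", "b")]
    return rounds, size


rounds, size = count_rounds(1000)
```

Every round is limited to 1 unit by the middle edge, so the number of rounds equals the size of the max flow ($2 \cdot cap$), even though two augmentations along $s \rightarrow a \rightarrow t$ and $s \rightarrow b \rightarrow t$ would have sufficed.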
