-
Notifications
You must be signed in to change notification settings - Fork 37
Expand file tree
/
Copy pathUnion_Find_Tutorial.txt
More file actions
546 lines (433 loc) · 22.5 KB
/
Union_Find_Tutorial.txt
File metadata and controls
546 lines (433 loc) · 22.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
Union Find
==========
- Introduction:
===============
Union Find, also known as Disjoint Set Union (DSU), is a data structure that keeps track of a partition of a set
into disjoint subsets (meaning no set overlaps with another). It provides two primary operations: find, which determines
which subset a particular element is in, and union, which merges two subsets into a single subset. This pattern is
particularly useful for problems where we need to find whether 2 elements belong to the same group or need to solve
connectivity-related problems in a graph or tree.
* Pros and Cons of DSU:
-----------------------
1. Pros:
Efficiency: Provides near constant time operations for union and find operations, making it highly efficient.
Simplicity: Once set up, it’s straightforward to use for solving problems related to disjoint sets.
2. Cons:
Space Overhead: Requires additional space to store the parent and rank of each element.
Initial Setup: Requires initial setup to create and initialize the data structure.
* Why Choose Union-Find Over BFS/DFS?
-------------------------------------
You can solve similar problems using the Breadth-First Search (BFS) and Depth-First Search (DFS) algorithms,
but the Union Find algorithm provides the optimal solution over them. For problems related to connectivity checks and
component identification, Union-Find is often more efficient than BFS/DFS.
* Time Complexity:
------------------
1. BFS/DFS: For BFS and DFS, the time complexity is O (V + E), where V is the number of vertices and E is the
number of edges. This can be quite slow for large graphs.
2. Union-Find: With path compression and union by rank, the amortized time complexity for union and find operations
can be approximated as O(α(n)), where α(n) is the inverse Ackermann function. This function grows very slowly,
making Union-Find operations almost constant time in practice.
+ Applications:
1. Network Connectivity: To determine if there is a path between nodes in a network.
2. MST Algorithms: Such as Kruskal's algorithm for finding Minimum Spanning Trees.
3. Image Processing: For detecting connected components in images.
4. Equivalence Relations: Managing equivalence relations and partitioning sets based on those relations.
* Key Concepts and Components:
------------------------------
+ Key Concepts:
1. Disjoint Sets:
A collection of sets where no item can be in more than one set.
Used to partition elements into non-overlapping subsets.
2. Representative (or Root):
Each set is identified by a representative or root element.
All elements in the set point directly or indirectly to this root.
3. Connected Components:
Subsets of elements that are interconnected.
+ Components
1. Parent Array:
An array where each element points to its parent.
If an element is its own parent, it is a root.
Java Implementation:
int[] parent = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = i; // Initially, each element is its own parent
}
2. Rank Array (or Size Array):
An array used to keep track of the tree's depth or size.
Helps optimize the union operation by ensuring the smaller tree is merged under the root of the larger tree.
Java Implementation:
int[] rank = new int[n];
for (int i = 0; i < n; i++) {
rank[i] = 0; // Initially, the rank of each tree is 0
}
- Union-Find data structure Operations:
=======================================
Here's an overview of the operations, including the less common deletion operation:
1. Making New Sets:
-------------------
- Initializes each element as its own set.
- Each element is its own parent initially, and the rank of each element is 0.
public class UnionFind {
private int[] parent;
private int[] rank;
// Constructor to initialize parent and rank arrays
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 0;
}
}
}
2. Finding Set Representatives:
-------------------------------
- Determines the representative (or root) of the set containing a particular element.
- Uses path compression to flatten the structure, making future operations faster.
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
3. Merging Two Sets:
--------------------
- Merges two sets into a single set.
- Uses union by rank to attach the smaller tree under the root of the larger tree, keeping the tree flat and efficient.
public void union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
4. Deletion (Less Common):
--------------------------
- The standard Union-Find data structure does not support efficient deletion. Deleting an element from a set would
require significant changes to maintain the properties of the data structure. However, in some applications, a
workaround is to mark elements as inactive or use a different data structure to handle deletions.
- One common approach is to use a flag array to mark elements as "deleted" without actually removing them from the data structure.
private boolean[] active;
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
active = new boolean[size];
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 0;
active[i] = true; // All elements are initially active
}
}
public void delete(int x) {
active[x] = false; // Mark the element as inactive
}
public boolean isActive(int x) {
return active[x];
}
5. Full Implementation in Java:
-------------------------------
Here's a complete implementation of the Union-Find data structure, including the basic operations and a
method to mark elements as deleted:
public class UnionFind {
private int[] parent;
private int[] rank;
private boolean[] active;
// Constructor to initialize parent, rank, and active arrays
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
active = new boolean[size];
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 0;
active[i] = true; // All elements are initially active
}
}
// Find operation with path compression
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]);
}
return parent[x];
}
// Union operation with union by rank
public void union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
// Mark an element as inactive (deleted)
public void delete(int x) {
active[x] = false; // Mark the element as inactive
}
// Check if an element is active
public boolean isActive(int x) {
return active[x];
}
public static void main(String[] args) {
int n = 10; // Number of elements
UnionFind uf = new UnionFind(n);
// Perform Union operations
uf.union(1, 2);
uf.union(2, 3);
uf.union(4, 5);
// Perform Find operations
System.out.println(uf.find(1)); // Output will be the representative of the set containing 1
System.out.println(uf.find(4)); // Output will be the representative of the set containing 4
// Delete an element
uf.delete(2);
// Check if elements are active
System.out.println(uf.isActive(2)); // Output will be false
System.out.println(uf.isActive(3)); // Output will be true
}
}
+ Complexity
- Find and Union operations are nearly constant time, O(α(n)), where α is the Inverse Ackermann function, which
grows very slowly.
- Deletion is not supported efficiently in the basic Union-Find structure; the workaround using an active array
has O(1) complexity for marking an element as inactive.
- Union-Find data structure Optimization:
=========================================
The Union-Find data structure, also known as Disjoint Set Union (DSU), is used to efficiently manage and merge disjoint sets.
Two common optimizations improve the efficiency of union-find operations: Path Compression and Union by Rank/Union by Size.
These optimizations help achieve nearly constant time complexity for union and find operations.
1. Path Compression:
--------------------
Path Compression is an optimization technique used during the find operation. The main idea is to make the tree flat, so future
find operations are faster.
* Implementation:
-----------------
When performing a find operation, traverse all the way up to the root and make each node on the path point directly to the root.
This flattens the structure, leading to more efficient future operations.
class UnionFind {
private int[] parent;
private int[] rank; // For Union by Rank or Size
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 1; // For Union by Size, initialize rank as 1
}
}
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
public void unionByRank(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
public void unionBySize(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
rank[rootX] += rank[rootY]; // Update the size
} else {
parent[rootX] = rootY;
rank[rootY] += rank[rootX]; // Update the size
}
}
}
public static void main(String[] args) {
UnionFind uf = new UnionFind(10);
uf.unionByRank(1, 2);
uf.unionByRank(2, 3);
uf.unionByRank(4, 5);
uf.unionByRank(6, 7);
uf.unionByRank(5, 6);
uf.unionByRank(3, 7);
System.out.println(uf.find(1)); // Output: 1
System.out.println(uf.find(3)); // Output: 1
System.out.println(uf.find(5)); // Output: 4
System.out.println(uf.find(7)); // Output: 4
}
}
2. Union by Rank
-----------------
Union by Rank is an optimization that attaches the shorter tree under the root of the taller tree. This keeps the tree as flat as possible.
* Implementation:
-----------------
Each node has a rank (or height). When merging two sets, the root of the smaller tree is made a child of the root of the larger tree.
If the ranks are equal, one root is arbitrarily chosen, and its rank is increased by one.
public void unionByRank(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
3. Union by Size:
-----------------
Union by Size is similar to Union by Rank but uses the size of the trees (number of elements) instead of their height.
The smaller tree is always added to the larger tree, ensuring that the resultant tree remains balanced.
* Implementation:
Each node keeps track of the size of the tree it represents. When merging two sets, the root of the smaller tree is
made a child of the root of the larger tree, and the size of the larger tree is updated.
public void unionBySize(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
rank[rootX] += rank[rootY]; // Update the size
} else {
parent[rootX] = rootY;
rank[rootY] += rank[rootX]; // Update the size
}
}
}
4. Combining Path Compression with Union by Rank/Size:
------------------------------------------------------
Combining these optimizations results in an extremely efficient union-find structure with nearly constant time
complexity for union and find operations.
*** Example Usage:
The `main` method in the `UnionFind` class demonstrates the usage of union by rank and union by size with path compression.
public static void main(String[] args) {
UnionFind uf = new UnionFind(10);
uf.unionByRank(1, 2);
uf.unionByRank(2, 3);
uf.unionByRank(4, 5);
uf.unionByRank(6, 7);
uf.unionByRank(5, 6);
uf.unionByRank(3, 7);
System.out.println(uf.find(1)); // Output: 1
System.out.println(uf.find(3)); // Output: 1
System.out.println(uf.find(5)); // Output: 4
System.out.println(uf.find(7)); // Output: 4
}
This implementation demonstrates how to optimize union-find operations using path compression and union by rank or size,
resulting in efficient and performant data structure operations.
- Example Union-Find data structure: LeetCode Problem 547. Number of Provinces
==============================================================================
To solve this problem, we can use the Union-Find data structure to efficiently group the connected cities and count
the number of disjoint sets (provinces). Here is a step-by-step Java solution (ChatGPT coded the solution 🤖):
* Steps:
---------
1. Initialize Union-Find Structure: Create a Union-Find data structure to manage city connections.
2. Process Connections: For each pair of cities that are directly connected, perform a union operation.
3. Count Provinces: After processing all connections, count the number of unique roots in the Union-Find structure,
which corresponds to the number of provinces.
* Java Implementation:
----------------------
public class Solution {
// Union-Find data structure
class UnionFind {
private int[] parent;
private int[] rank;
// Constructor to initialize parent and rank arrays
public UnionFind(int size) {
parent = new int[size];
rank = new int[size];
for (int i = 0; i < size; i++) {
parent[i] = i;
rank[i] = 0;
}
}
// Find operation with path compression
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]);
}
return parent[x];
}
// Union operation with union by rank
public void union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
// Get the number of unique roots (provinces)
public int getCount() {
int count = 0;
for (int i = 0; i < parent.length; i++) {
if (parent[i] == i) {
count++;
}
}
return count;
}
}
public int findCircleNum(int[][] isConnected) {
int n = isConnected.length;
UnionFind uf = new UnionFind(n);
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (isConnected[i][j] == 1) {
uf.union(i, j);
}
}
}
return uf.getCount();
}
public static void main(String[] args) {
Solution solution = new Solution();
int[][] isConnected = {
{1, 1, 0},
{1, 1, 0},
{0, 0, 1}
};
int numberOfProvinces = solution.findCircleNum(isConnected);
System.out.println("Number of Provinces: " + numberOfProvinces); // Output should be 2
}
}
* Explanation:
1. Union-Find Class:
- The `UnionFind` class manages the parent and rank arrays, provides the `find` and `union` methods to manage
connections, and a `getCount` method to count the number of unique sets (provinces).
2. findCircleNum Method:
- The `findCircleNum` method initializes the Union-Find structure.
- It processes the `isConnected` matrix to union connected cities.
- Finally, it returns the count of unique roots in the Union-Find structure, representing the number of provinces.
3. Main Method:
- The main method provides a sample `isConnected` matrix and prints the number of provinces.
4. Complexity:
- Time Complexity: O(n^2 * α(n)), where n is the number of cities, and α(n) is the Inverse Ackermann function.
The double loop iterates over all pairs of cities.
- Space Complexity: O(n), for storing the parent and rank arrays.