docs/gl/architecture.md
Lines changed: 1 addition & 1 deletion
@@ -232,7 +232,7 @@ By keeping all vertex and edge data in adjacent memory blocks, these models prov
232
232
>
233
233
> While the flat adjacency list model is highly efficient for graph storage and traversal, it is highly inefficient to construct element-by-element. The most efficient approach for utilizing flat list graphs is to construct your graph using the standard list model first, and then convert it into the flat list model using the generic [**gl::to**](../cpp-gl/group__GL-Core.md#function-to) conversion function. This exact methodology is utilized internally by the [**graph topology generators**](topologies.md) defined within the library.
234
234
235
-
### Operation Complexity
235
+
### Operation Complexities
236
236
237
237
Depending on the chosen representation model, the computational complexity of standard graph operations will differ. The table below outlines these complexities.
Because hypergraphs are generalizations of graphs, their topological data cannot be stored as simple arrays of pairs. Instead, CPP-GL relies on **Incidence Models**. HGL categorizes its memory representations via `ImplTag` and strictly controls orientation via `LayoutTag`.
203
+
Because hypergraphs are generalizations of graphs, their topological data cannot be stored the same way as standard graphs (Adjacency Models). Instead, CPP-HGL relies on **Incidence Models** to capture the higher-order connections between vertices and hyperedges. The HGL module categorizes its memory representations via the `ImplTag` and strictly controls their memory orientation with the `LayoutTag`.
204
204
205
-
### Incidence Models (Standard & Flat)
205
+
### Fundamental Representations
206
206
207
-
- **Incidence Lists** ([`list_t`](../cpp-gl/structhgl_1_1impl_1_1list__t.md) / [`flat_list_t`](../cpp-gl/structhgl_1_1impl_1_1flat__list__t.md)): Maps elements to dynamic jagged lists. Highly space-efficient for sparse hypergraphs, as memory is only allocated for existing incidence relations.
208
-
- **Incidence Matrices** ([`matrix_t`](../cpp-gl/structhgl_1_1impl_1_1matrix__t.md) / [`flat_matrix_t`](../cpp-gl/structhgl_1_1impl_1_1flat__matrix__t.md)): Allocates a full $|V| \times |E|$ 2D grid. Memory intensive ($O(|V| \times |E|)$), but allows instant $O(1)$ verification if a given vertex belongs to a given hyperedge.
207
+
At their core, hypergraph data structures differ in how they map vertices to the hyperedges they belong to, or vice versa. The primary incidence models in the HGL module are based on the following architectures:
209
208
210
-
### Layout Tags (Orientation)
209
+
- **Incidence Lists**: This model stores only the active incidence relationships. It works by mapping an element (e.g., a vertex) to a dynamic list of its associated elements (e.g., the hyperedges it belongs to). This approach is highly space-efficient for sparse hypergraphs and allows rapid iteration over local incidence sets.
210
+
- **Incidence Matrices**: This model allocates a full $|V| \times |E|$ 2D grid, where each cell represents a potential connection between a specific vertex and a specific hyperedge. While it consumes significantly more memory ($O(|V| \times |E|)$), it provides instant $O(1)$ incidence verification, making it ideal for dense, heavily interconnected hypergraphs.
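The trade-off between the two architectures can be sketched with plain standard containers. The types below (`IncidenceListSketch`, `IncidenceMatrixSketch`) and their member names are hypothetical illustrations written for this comparison; they are not the HGL classes and make no claim about how the library implements them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration type; NOT the HGL incidence list.
struct IncidenceListSketch {
    // vertex_to_edges[v] holds the ids of the hyperedges that contain v.
    std::vector<std::vector<std::size_t>> vertex_to_edges;

    explicit IncidenceListSketch(std::size_t n_vertices) : vertex_to_edges(n_vertices) {}

    void bind(std::size_t v, std::size_t e) {
        assert(v < vertex_to_edges.size());
        vertex_to_edges[v].push_back(e);
    }

    // Linear search through one vertex's list: O(deg(v)) in this sketch.
    bool are_incident(std::size_t v, std::size_t e) const {
        assert(v < vertex_to_edges.size());
        const auto& edges = vertex_to_edges[v];
        return std::find(edges.begin(), edges.end(), e) != edges.end();
    }

    // Only existing incidences occupy memory.
    std::size_t stored_cells() const {
        std::size_t total = 0;
        for (const auto& edges : vertex_to_edges) total += edges.size();
        return total;
    }
};

// Hypothetical illustration type; NOT the HGL incidence matrix.
struct IncidenceMatrixSketch {
    std::size_t n_edges;
    std::vector<std::vector<bool>> cells;  // cells[v][e]

    IncidenceMatrixSketch(std::size_t n_vertices, std::size_t n_edges_)
        : n_edges(n_edges_), cells(n_vertices, std::vector<bool>(n_edges_, false)) {}

    void bind(std::size_t v, std::size_t e) {
        assert(v < cells.size() && e < n_edges);
        cells[v][e] = true;
    }

    // Direct cell lookup: O(1).
    bool are_incident(std::size_t v, std::size_t e) const {
        assert(v < cells.size() && e < n_edges);
        return cells[v][e];
    }

    // Every potential incidence occupies a cell: |V| * |E|.
    std::size_t stored_cells() const { return cells.size() * n_edges; }
};
```

With four vertices, two hyperedges and three incidences, the list sketch stores three entries while the matrix sketch stores eight cells, which mirrors the sparse-versus-dense guidance above.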
211
211
212
-
The `LayoutTag` dictates the *primary indexing dimension* of the incidence structure, massively impacting query speeds and memory footprints:
212
+
### Representation Layouts (Orientation)
213
213
214
-
- [**bidirectional_t**](../cpp-gl/structhgl_1_1impl_1_1bidirectional__t.md): Maintains *two* internal mappings (Vertex-to-Hyperedges AND Hyperedge-to-Vertices). Offers optimal $O(1)$ access for both vertex degrees and hyperedge sizes, at the cost of doubled memory consumption. *(Compatible only with Incidence Lists)*.
215
-
- [**vertex_major_t**](../cpp-gl/structhgl_1_1impl_1_1vertex__major__t.md): The primary index is the Vertex. Querying the hyperedges connected to a vertex is instantaneous, but finding which vertices belong to a hyperedge requires an expensive full-graph scan.
216
-
- [**hyperedge_major_t**](../cpp-gl/structhgl_1_1impl_1_1hyperedge__major__t.md): The primary index is the Hyperedge. Querying the vertices within a hyperedge is instantaneous, but finding a vertex's degree requires a full-graph scan.
214
+
The `LayoutTag` dictates the *primary indexing dimension* of the incidence structure. Because hypergraphs map two distinctly different sets ($V$ and $E$), changing the primary index massively impacts query speeds and memory footprints:
215
+
216
+
- [**bidirectional_t**](../cpp-gl/structhgl_1_1impl_1_1bidirectional__t.md): Maintains *two* internal mappings simultaneously (Vertex-to-Hyperedges AND Hyperedge-to-Vertices). Offers optimal $O(1)$ access for both vertex degrees and hyperedge sizes, and fast iteration in both directions, at the cost of doubled memory consumption. **(Compatible only with Incidence Lists)**.
217
+
- [**hyperedge_major_t**](../cpp-gl/structhgl_1_1impl_1_1hyperedge__major__t.md): The primary index is the Hyperedge. Querying the vertices within a specific hyperedge is instantaneous, but finding which hyperedges a vertex belongs to may require an expensive full-structure scan.
218
+
- [**vertex_major_t**](../cpp-gl/structhgl_1_1impl_1_1vertex__major__t.md): The primary index is the Vertex. Querying the hyperedges connected to a specific vertex is instantaneous, but finding which vertices belong to a specific hyperedge may require an expensive full-structure scan.
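To make the effect of the primary index concrete, here is a minimal sketch of a vertex-major list next to a bidirectional one. `VertexMajorSketch`, `BidirectionalSketch`, and the `Id` alias are hypothetical names chosen for illustration, not the library's layout tags or containers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Id = std::size_t;  // hypothetical id type for this sketch

// Vertex-major: only the vertex -> hyperedges mapping is stored.
struct VertexMajorSketch {
    std::vector<std::vector<Id>> v2e;

    explicit VertexMajorSketch(Id n_vertices) : v2e(n_vertices) {}

    void bind(Id v, Id e) {
        assert(v < v2e.size());
        v2e[v].push_back(e);
    }

    // The primary index answers this directly.
    Id degree(Id v) const {
        assert(v < v2e.size());
        return v2e[v].size();
    }

    // No hyperedge index exists, so every vertex's list is scanned.
    std::vector<Id> incident_vertices(Id e) const {
        std::vector<Id> out;
        for (Id v = 0; v < v2e.size(); ++v)
            if (std::find(v2e[v].begin(), v2e[v].end(), e) != v2e[v].end())
                out.push_back(v);
        return out;
    }
};

// Bidirectional: both mappings are stored, so each incidence is kept twice.
struct BidirectionalSketch {
    std::vector<std::vector<Id>> v2e;
    std::vector<std::vector<Id>> e2v;

    BidirectionalSketch(Id n_vertices, Id n_edges) : v2e(n_vertices), e2v(n_edges) {}

    void bind(Id v, Id e) {
        assert(v < v2e.size() && e < e2v.size());
        v2e[v].push_back(e);
        e2v[e].push_back(v);
    }

    Id degree(Id v) const {
        assert(v < v2e.size());
        return v2e[v].size();
    }

    Id hyperedge_size(Id e) const {
        assert(e < e2v.size());
        return e2v[e].size();
    }

    // Answered from the second mapping without scanning.
    const std::vector<Id>& incident_vertices(Id e) const {
        assert(e < e2v.size());
        return e2v[e];
    }
};
```

Both sketches return the same vertex sets; the difference is that the vertex-major version pays a scan over all vertex lists for hyperedge queries, while the bidirectional version pays with a second copy of every incidence.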
217
219
218
220
> [!NOTE] Matrices and Asymmetry
219
221
>
220
-
> Incidence matrices fundamentally represent an asymmetric $|V| \times |E|$ mathematical grid. Therefore, matrix representations strictly require an asymmetric layout (`vertex_major_t` or `hyperedge_major_t`), mapping rows to the major element and columns to the minor element.
222
+
> Incidence matrices fundamentally represent an asymmetric $|V| \times |E|$ grid. Therefore, matrix representations strictly require an asymmetric layout (`vertex_major_t` or `hyperedge_major_t`), mapping rows to the major element and columns to the minor element.
223
+
224
+
### Standard Models
225
+
226
+
Standard models are heap-allocated, nested structures (e.g., `std::vector<std::vector<T>>`) that prioritize flexibility and dynamic structural modification. Because the inner containers can grow independently, they handle topological changes gracefully.
227
+
228
+
- [**list_t**](../cpp-gl/structhgl_1_1impl_1_1list__t.md): A standard Incidence List model implemented using traditional nested containers.
229
+
- [**matrix_t**](../cpp-gl/structhgl_1_1impl_1_1matrix__t.md): A standard Incidence Matrix model implemented using traditional nested containers.
230
+
231
+
<div align="center" markdown="1">
232
+
233
+
{: width="700" }
234
+
{: width="700" }
235
+
236
+
</div>
237
+
238
+
> [!NOTE] BF-Directed Representations
239
+
>
240
+
> The diagrams above illustrate standard representations for an **undirected** hypergraph. For **BF-directed** hypergraphs, the underlying architecture adapts to capture directionality:
241
+
> - **Incidence List:** The structure maintains separate inner containers to explicitly distinguish between tail-bound and head-bound incidence relationships.
242
+
> - **Incidence Matrix:** The cells of the matrix are no longer simple boolean indicators. Instead, they store specific directional states (conceptually mapped as `-1`, `0`, and `1`) to indicate whether a vertex is in the tail of the hyperedge, not connected, or in the head of the hyperedge, respectively.
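The tri-state matrix cell described in the note can be sketched as a small enumeration. The names `Incidence`, `BfMatrixSketch`, and `vertices_with` are assumptions made for this example; the source only states that the states are conceptually mapped to `-1`, `0`, and `1`, not how the library encodes them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Conceptual tri-state cell following the -1 / 0 / 1 convention from the note.
enum class Incidence : std::int8_t { tail = -1, none = 0, head = 1 };

// Hypothetical hyperedge-major matrix: one row per hyperedge, one column per vertex.
struct BfMatrixSketch {
    std::size_t n_vertices;
    std::vector<std::vector<Incidence>> rows;

    BfMatrixSketch(std::size_t n_edges, std::size_t n_vertices_)
        : n_vertices(n_vertices_),
          rows(n_edges, std::vector<Incidence>(n_vertices_, Incidence::none)) {}

    void set(std::size_t e, std::size_t v, Incidence state) {
        assert(e < rows.size() && v < n_vertices);
        rows[e][v] = state;
    }

    // Scans one row and collects the vertices whose cell equals `wanted`.
    std::vector<std::size_t> vertices_with(std::size_t e, Incidence wanted) const {
        assert(e < rows.size());
        std::vector<std::size_t> out;
        for (std::size_t v = 0; v < n_vertices; ++v)
            if (rows[e][v] == wanted) out.push_back(v);
        return out;
    }
};
```

Tail and head queries both scan the same row and differ only in the state they filter on, so in this sketch each costs the same as an undirected row scan.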
243
+
244
+
### Flat Models
245
+
246
+
To maximize cache locality, the flat representations map the logical 2D structures into contiguous 1D memory blocks. By keeping all incidence data tightly packed, these models provide the absolute maximum traversal speed. However, this cache-friendliness comes at a structural cost: modifying an inner segment often requires shifting the entire remainder of the flat container in memory.
247
+
248
+
- [**flat_list_t**](../cpp-gl/structhgl_1_1impl_1_1flat__list__t.md): A flattened Incidence List model implemented using the generic [**gl::flat_jagged_vector**](../cpp-gl/classgl_1_1flat__jagged__vector.md) data structure.
249
+
- [**flat_matrix_t**](../cpp-gl/structhgl_1_1impl_1_1flat__matrix__t.md): A flattened Incidence Matrix model implemented using the generic [**gl::flat_matrix**](../cpp-gl/classgl_1_1flat__matrix.md) data structure.
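The structural cost of a flat jagged layout can be illustrated with one data array plus an offsets array. `FlatJaggedSketch` below is a hypothetical stand-in written for this explanation; it is not `gl::flat_jagged_vector`, whose interface is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat jagged container: segment i is data[offsets[i], offsets[i + 1]).
struct FlatJaggedSketch {
    std::vector<std::size_t> offsets{0};
    std::vector<std::size_t> data;

    std::size_t segments() const { return offsets.size() - 1; }

    std::size_t segment_size(std::size_t i) const {
        assert(i < segments());
        return offsets[i + 1] - offsets[i];
    }

    // Appending a whole segment at the end is cheap: nothing has to move.
    void push_segment(const std::vector<std::size_t>& items) {
        data.insert(data.end(), items.begin(), items.end());
        offsets.push_back(data.size());
    }

    // Appends `value` to segment i and returns how many elements were shifted.
    std::size_t append_to_segment(std::size_t i, std::size_t value) {
        assert(i < segments());
        const std::size_t pos = offsets[i + 1];
        const std::size_t shifted = data.size() - pos;
        data.insert(data.begin() + static_cast<std::ptrdiff_t>(pos), value);
        // Every later segment now starts one slot further right.
        for (std::size_t k = i + 1; k < offsets.size(); ++k) ++offsets[k];
        return shifted;
    }
};
```

Growing the first of three segments moves every element stored after it, which is the shifting cost the paragraph above refers to.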
Because performance is heavily dictated by the combination of the Representation Model (List vs. Matrix) and the Layout Tag (Bidirectional vs. Major), the complexities are grouped accordingly.
256
+
</div>
225
257
226
-
*Note: In the tables below, $|V|$ is vertex count, $|E|$ is hyperedge count, $deg(v)$ is vertex degree, and $|e|$ is hyperedge size (number of incident vertices).*
258
+
> [!NOTE] BF-Directed Representations
259
+
>
260
+
> Similarly to the standard models, flat architectures seamlessly adapt to directionality. A BF-directed *Flat Incidence List* utilizes separated flat storage for tail and head incidence, preserving contiguous memory reads for specific directional traversals. Meanwhile, a *Flat Incidence Matrix* packs the tri-state directional indicators directly into its contiguous 1D grid layout.
261
+
262
+
> [!NOTE] Flat List Model Performance
263
+
>
264
+
> While the flat incidence list model is highly efficient for hypergraph storage and traversal, it is also highly inefficient to construct sequentially (element-by-element and incidence-by-incidence). The most efficient approach for utilizing flat list hypergraphs is to construct your hypergraph using the standard `list_t` model first, and then convert it into the `flat_list_t` model using the generic [**hgl::to**](../cpp-gl/group__HGL-Core.md#function-to) conversion function.
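The build-then-convert pattern can be sketched without the library: fill a nested structure incrementally, then copy it into contiguous storage in a single pass. `Nested`, `FlatListSketch`, and `flatten` are hypothetical names for this illustration only; in CPP-GL the conversion itself is performed by `hgl::to`, whose internals are not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Nested = std::vector<std::vector<std::size_t>>;  // hypothetical standard-list stand-in

// Hypothetical flat list: segment i is data[offsets[i], offsets[i + 1]).
struct FlatListSketch {
    std::vector<std::size_t> offsets;
    std::vector<std::size_t> data;

    std::vector<std::size_t> segment(std::size_t i) const {
        assert(i + 1 < offsets.size());
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(offsets[i]);
        const auto last = data.begin() + static_cast<std::ptrdiff_t>(offsets[i + 1]);
        return std::vector<std::size_t>(first, last);
    }
};

// One pass to size the buffers, one pass to copy: no element is ever shifted.
FlatListSketch flatten(const Nested& nested) {
    std::size_t total = 0;
    for (const auto& seg : nested) total += seg.size();

    FlatListSketch flat;
    flat.offsets.reserve(nested.size() + 1);
    flat.data.reserve(total);
    flat.offsets.push_back(0);
    for (const auto& seg : nested) {
        flat.data.insert(flat.data.end(), seg.begin(), seg.end());
        flat.offsets.push_back(flat.data.size());
    }
    return flat;
}
```

Incidences can be added to the nested structure in any order at amortized constant cost each, and the final copy is linear in the total number of incidences.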
Because hypergraph performance is heavily dictated by the combination of the Representation Model (List vs. Matrix), the underlying Memory Strategy (Standard vs. Flat), and the Layout Tag (Bidirectional vs. Major), the complexities are grouped accordingly.
*Note: In the tables below, $|V|$ is the total vertex count, $|E|$ is the total hyperedge count, $deg(v)$ is the degree of a specific vertex, and $|e|$ is the size of a specific hyperedge. $I$ represents the total number of incidences across the entire hypergraph ($I = \sum deg(v) = \sum |e|$).*
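As a quick check of the identity for $I$, consider a small hypothetical hypergraph (chosen for illustration, not taken from the library) with $V = \{v_0, v_1, v_2, v_3\}$, $e_0 = \{v_0, v_1, v_2\}$ and $e_1 = \{v_2, v_3\}$. Only $v_2$ belongs to both hyperedges, so:

$$
\sum_{v \in V} deg(v) = 1 + 1 + 2 + 1 = 5, \qquad \sum_{e \in E} |e| = 3 + 2 = 5, \qquad I = 5
$$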
|**Iterate Incident Hyperedges** of $v$ | $O(\vert E \vert)$ (Scan Row) | $O(\vert E \vert)$ (Scan Column) |
245
-
|**Iterate Incident Vertices** of $e$ | $O(\vert V \vert)$ (Scan Column) | $O(\vert V \vert)$ (Scan Row) |
246
-
|**Get Degree** of $v$ | $O(\vert E \vert)$ | $O(\vert E \vert)$ |
247
-
|**Get Size** of $e$ | $O(\vert V \vert)$ | $O(\vert V \vert)$ |
248
-
|**Memory Footprint (Topology)**| $O(\vert V \vert \times \vert E \vert)$ | $O(\vert V \vert \times \vert E \vert)$ |
272
+
> [!NOTE] Directionality and Complexity
273
+
> For **BF-directed** hypergraphs, operations like `in_degree`, `out_degree`, `tail_vertices`, and `head_vertices` follow the exact same complexity class as their undirected counterparts (`degree`, `incident_vertices`). They merely operate on isolated sub-containers (in list models) or specific tri-state indicators (in matrix models), keeping the Big-O structurally identical.
274
+
275
+
#### 1. Incidence Lists
276
+
277
+
**Topological Queries:**
278
+
279
+
The complexity of topological queries is identical for standard (`list_t`) and flat (`flat_list_t`) Incidence List models.
*(Note: Adding/removing incidences in flat lists is always $O(I)$ because modifying any internal jagged segment requires shifting the remainder of the massive contiguous 1D data array).*
312
+
313
+
#### 2. Incidence Matrices
314
+
315
+
**Topological Queries:**
316
+
317
+
The complexity of topological queries is identical for standard (`matrix_t`) and flat (`flat_matrix_t`) Incidence Matrix models.
|**Add Vertex**| $O(\vert E \vert)$ amortized if `vertex_major_t`<br>$O(\vert V \vert \times \vert E \vert)$ if `hyperedge_major_t`| $O(\vert V \vert \times \vert E \vert)$ |
332
+
|**Add Hyperedge**| $O(\vert V \vert \times \vert E \vert)$ if `vertex_major_t`<br>$O(\vert V \vert)$ amortized if `hyperedge_major_t`| $O(\vert V \vert \times \vert E \vert)$ |
333
+
|**Remove Vertex / Hyperedge**| $O(\vert V \vert \times \vert E \vert)$ (Shift rows/cols) | $O(\vert V \vert \times \vert E \vert)$ |
> While the theoretical Big-O complexities appear identical across matrix layouts, **real-world performance is heavily bound by cache locality.** Matrices map one dimension to contiguous rows and the other to strided columns.
339
+
>
340
+
> - **Query Direction:** Choose the layout that aligns your most frequent traversal with contiguous memory reads. If your algorithm primarily iterates over the vertices within specific hyperedges, `hyperedge_major_t` will drastically outperform `vertex_major_t` by avoiding cache misses.
341
+
> - **Dimension Imbalance:** If your hypergraph is highly asymmetric (e.g., $|V| \gg |E|$), scanning the larger dimension across strided memory can degrade performance. Align the major layout with the dimension you need to scan most efficiently.
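The contiguous-versus-strided distinction follows directly from how a 2D grid is mapped onto a 1D array. The sketch below assumes a row-major mapping with hyperedges as rows; `FlatMatrixSketch` and its members are hypothetical and do not reflect the actual `gl::flat_matrix` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major flat matrix: rows are stored one after another.
struct FlatMatrixSketch {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<unsigned char> data;  // n_rows * n_cols cells

    FlatMatrixSketch(std::size_t rows, std::size_t cols)
        : n_rows(rows), n_cols(cols), data(rows * cols, 0) {}

    // Cell (r, c) lives at r * n_cols + c in the 1D array.
    std::size_t index(std::size_t r, std::size_t c) const {
        assert(r < n_rows && c < n_cols);
        return r * n_cols + c;
    }

    void set(std::size_t r, std::size_t c) { data[index(r, c)] = 1; }

    // Row scan: consecutive indices, i.e. contiguous memory reads.
    std::size_t count_row(std::size_t r) const {
        std::size_t count = 0;
        for (std::size_t c = 0; c < n_cols; ++c) count += data[index(r, c)];
        return count;
    }

    // Column scan: each step jumps n_cols cells ahead (strided reads).
    std::size_t count_col(std::size_t c) const {
        std::size_t count = 0;
        for (std::size_t r = 0; r < n_rows; ++r) count += data[index(r, c)];
        return count;
    }
};
```

With hyperedges as rows, `count_row` corresponds to a hyperedge size and `count_col` to a vertex degree. Both loops are linear in their dimension, but only the row scan touches adjacent memory.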
249
342
250
343
### Choosing the Representation & Layout
251
344
252
-
Selecting the correct combination ensures your application hits peak performance:
345
+
Selecting the optimal combination of Representation Model and Layout Tag requires balancing memory constraints, topology density, and your specific traversal patterns to achieve peak performance:
253
346
254
-
- Use **`bidirectional_t`** Lists (the default) for general-purpose hypergraphs where you frequently query in both directions (e.g., "what nodes are in this hyperedge?" AND "what hyperedges is this node in?").
255
-
- Use **`hyperedge_major_t`** Lists when dealing with massive datasets where memory is tight, and your algorithms are strictly edge-centric (e.g., simulating isolated hyperedge reactions).
256
-
- Use **`matrix_t`** when your hypergraph is extremely dense, hyperedge sizes are close to $|V|$, and absolute instant incidence verification `are_incident(v, e)` is the algorithmic bottleneck.
257
-
```
347
+
- **`bidirectional_t` Lists (Default):** The most versatile choice for general-purpose hypergraphs. Use this when your algorithms require fast, symmetric traversals (e.g., frequently oscillating between querying "which vertices belong to this hyperedge?" and "which hyperedges contain this vertex?"). It trades a doubled memory footprint for optimal $O(1)$ degree and size lookups, ensuring fluid bidirectional navigation.
348
+
- **Single-Major Lists (`hyperedge_major_t` / `vertex_major_t`):** Ideal for massive, sparse datasets operating under strict memory limits. Use these layouts when your algorithm's traversal logic is heavily asymmetrical. For instance, if an algorithm strictly simulates isolated chemical reactions and rarely needs to compute a vertex's degree, `hyperedge_major_t` eliminates the redundant backward-mapping overhead.
349
+
- **Matrices (`matrix_t` / `flat_matrix_t`):** Reserved for exceptionally dense hypergraphs where hyperedge sizes consistently approach $|V|$. Choose a matrix representation only when absolute, instant $O(1)$ incidence verification (`are_incident(v, e)`) is the primary algorithmic bottleneck and the $O(|V| \times |E|)$ memory consumption is an acceptable trade-off.