Commit 27cd0fc

feat: Expand data category with 4 advanced dataflow skills
Added comprehensive skills for timely-dataflow and differential-dataflow systems:

- timely-dataflow.md: Timely dataflow framework, progress tracking, operators
  * Core concepts: dataflow graphs, timestamps, frontiers
  * Patterns: custom operators, stateful aggregation, iterative computation
  * Rust examples with timely crate
- differential-dataflow.md: Differential computation, incremental updates
  * Core concepts: collections, arrangements, differential operators
  * Patterns: incremental joins, group-by reduce, connected components
  * Efficient arrangements for shared state
- dataflow-coordination.md: Coordination patterns for distributed systems
  * Core concepts: barriers, epochs, watermarks, snapshots
  * Patterns: Chandy-Lamport snapshots, backpressure, causal consistency
  * Multi-language examples (Rust, Go, Python)
- streaming-aggregations.md: Windowing and time-series aggregation
  * Core concepts: tumbling/sliding/session windows, watermarks
  * Patterns: late data handling, multi-resolution aggregation, top-K
  * Time semantics and watermark strategies

Updated INDEX.md from 5 to 9 skills and enhanced discover-data gateway with workflow combinations for real-time analytics and incremental computation systems.
1 parent c45da06 commit 27cd0fc

6 files changed

Lines changed: 2754 additions & 8 deletions

skills/data/INDEX.md

Lines changed: 42 additions & 2 deletions
@@ -2,7 +2,7 @@
 
 ## Category Overview
 
-**Total Skills**: 5
+**Total Skills**: 9
 **Category**: data
 
 ## Skills in This Category
@@ -27,6 +27,26 @@ cat skills/data/data-validation.md
 
 ---
 
+### dataflow-coordination.md
+**Description**: Coordination patterns for distributed dataflow systems including barriers, epochs, and distributed snapshots
+
+**Load this skill**:
+```bash
+cat skills/data/dataflow-coordination.md
+```
+
+---
+
+### differential-dataflow.md
+**Description**: Differential computation for incremental updates, maintaining indexed collections and efficient joins
+
+**Load this skill**:
+```bash
+cat skills/data/differential-dataflow.md
+```
+
+---
+
 ### etl-patterns.md
 **Description**: Designing data extraction from multiple sources (databases, APIs, files)
 
@@ -57,6 +77,26 @@ cat skills/data/stream-processing.md
 
 ---
 
+### streaming-aggregations.md
+**Description**: Windowing, sessionization, time-series aggregation, and late data handling for streaming systems
+
+**Load this skill**:
+```bash
+cat skills/data/streaming-aggregations.md
+```
+
+---
+
+### timely-dataflow.md
+**Description**: Timely dataflow framework for low-latency, high-throughput streaming computation with progress tracking
+
+**Load this skill**:
+```bash
+cat skills/data/timely-dataflow.md
+```
+
+---
+
 ## Loading All Skills
 
 ```bash
@@ -67,7 +107,7 @@ ls skills/data/*.md
 cat skills/data/batch-processing.md
 cat skills/data/data-validation.md
 cat skills/data/etl-patterns.md
-# ... and 2 more
+# ... and 6 more
 ```
 
 ## Related Categories
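
The count bump in this diff (5 to 9 total skills, with four `###` entries added) is an invariant that can be checked mechanically. Below is a minimal sketch in Python, assuming the index keeps a `**Total Skills**: N` line and one `### name.md` heading per skill, as the hunks above suggest. The function name `check_index` and the sample text are hypothetical and not part of the repository.

```python
import re


def check_index(text: str) -> tuple[int, int]:
    """Return (declared, actual) skill counts for an INDEX.md-style document.

    Assumes one '**Total Skills**: N' line and one '### <name>.md' heading
    per skill; both patterns are inferred from the diff, not from a spec.
    """
    match = re.search(r"^\*\*Total Skills\*\*:\s*(\d+)\s*$", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no '**Total Skills**: N' line found")
    declared = int(match.group(1))
    # Count skill sections by their level-3 headings ending in .md.
    actual = len(re.findall(r"^### \S+\.md\s*$", text, flags=re.MULTILINE))
    return declared, actual


# Hypothetical two-skill index used only to exercise the function.
sample = """## Category Overview

**Total Skills**: 2
**Category**: data

## Skills in This Category

### etl-patterns.md
**Description**: ...

### timely-dataflow.md
**Description**: ...
"""

declared, actual = check_index(sample)
print(f"declared={declared} actual={actual} consistent={declared == actual}")
```

Run against the real file, a mismatch between the two numbers would indicate that a skill section was added or removed without updating the overview line.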
