You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: AGENTS.md
+18-14Lines changed: 18 additions & 14 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -6,33 +6,36 @@ See [ARCHITECTURE.md](ARCHITECTURE.md) for a full architecture reference includi
6
6
7
7
## Purpose & Architecture
8
8
9
-
Galactic is the SRv6 data plane for multi-cloud VPC networking. It consists of a DaemonSet agent (`internal/agent/`) that manages kernel SRv6 routes and VRFs per node, and a CNI plugin (`internal/cni/`) that wires containers into VPC networks. VPC and VPCAttachment CRD management lives in a separate operator project; Galactic receives pre-populated identifiers through the CNI config and acts on them. BGP is used as the control plane for distributing SRv6 routes between agents.
9
+
Galactic is the SRv6 data plane for multi-cloud VPC networking. It consists of a controller-runtime reconciler (`cmd/galactic-router/`) that watches Cosmos BGP CRDs and drives an embedded GoBGP server per node, and a CNI plugin (`internal/cni/`) that wires containers into VPC networks. VPC and VPCAttachment CRD management lives in a separate operator project; Galactic receives pre-populated identifiers through the CNI config and acts on them.
10
10
11
-
**Data flow:** CNI invoked with pre-populated VPC/VPCAttachment identifiers → gRPC registers endpoint with agent → agent manages SRv6 ingress routes locally → BGP distributes SRv6 routes between agents.
11
+
**Data flow:** CNI invoked with pre-populated VPC/VPCAttachment identifiers → CNI creates kernel SRv6 state (VRF, veth, ingress route) and writes a `BGPAdvertisement` CRD → `galactic-router` reconciles the CRD → GoBGP advertises the EVPN path → BGP distributes routes between nodes.
12
12
13
13
**Non-obvious decisions:**
14
14
- VPC identifiers are 48-bit hex; VPCAttachment identifiers are 16-bit hex. These are embedded into IPv6 SRv6 endpoint addresses for deterministic route lookups. Both are supplied by an external operator via the CNI config.
15
15
- Identifiers are also Base62-encoded for interface naming (VRF: `vrfX-Y`, veth host side: `galX-Y`) to keep kernel interface name length within limits.
16
-
-`galactic-cni` is a pure CNI plugin; `main()` calls `cni.RunPlugin()` directly with no CLI layer. `galactic-agent` uses flag parsing for its configuration flags.
16
+
-`galactic-cni` is a pure CNI plugin; `main()` calls `cni.RunPlugin()` directly with no CLI layer. `galactic-router` uses environment variables (`NODE_NAME`, `ROUTER_ROLE`) for its configuration.
17
17
- The Kubernetes operator, VPC/VPCAttachment CRDs, and webhook code have been removed from this repository. They live in a separate companion operator project.
18
+
- GoBGP starts lazily on the first `BGPRouter` reconcile (`listenPort=-1`, outbound-only). ASN or RouterID changes trigger a full `Reconfigure`.
19
+
- Liveness and readiness probes use the gRPC health protocol on port 5000. There is no HTTP health endpoint.
task run-agent# run agent (requires root / CAP_NET_ADMIN)
38
+
task run-router # run galactic-router (requires root / CAP_NET_ADMIN)
36
39
```
37
40
38
41
**Before every PR:**`task lint test`.
@@ -47,13 +50,14 @@ Summary:
47
50
48
51
## Deployments
49
52
50
-
-**`deploy/galactic-agent/`** — Kustomize manifests for the agent DaemonSet, RBAC, and ServiceAccount. Apply with `kubectl apply -k deploy/galactic-agent/`.
51
-
-**`deploy/containerlab/`** — ContainerLab topology (`gvpc.clab.yaml`) for three Kind clusters (iad, sjc, infra) wired over an IPv6 SRv6 transit mesh. FRR runs as a hostNetwork DaemonSet on each worker for eBGP underlay; GoBGP handles L3VPN type-5 routes over iBGP to the infra route reflector. See `deploy/containerlab/README.md` and `deploy/containerlab/Taskfile.yaml` for bring-up commands.
53
+
-**`deploy/galactic-router/`** — Kustomize manifests for the router DaemonSet, RBAC, and ServiceAccount. Apply with `kubectl apply -k deploy/galactic-router/`.
54
+
-**`deploy/containerlab/`** — ContainerLab topology (`gvpc.clab.yaml`) for three Kind clusters (iad, sjc, infra) wired over an IPv6 SRv6 transit mesh. FRR runs as a hostNetwork DaemonSet on each worker for eBGP underlay; `galactic-router` (tenant role) handles EVPN path distribution over iBGP. See `deploy/containerlab/README.md` and `deploy/containerlab/Taskfile.yaml` for bring-up commands.
52
55
53
56
## New Developer Entry Points
54
57
55
58
1. Run `task build` to verify toolchain; run `task test` to confirm unit tests pass.
56
-
2. Read `internal/cni/cni.go` (cmdAdd/cmdDel) to understand the container attach path.
57
-
3. Read `internal/plumbing/intf/intf.go` to understand SRv6 endpoint encoding, interface naming, and base62↔hex conversion.
58
-
4. Read `internal/plumbing/srv6/srv6.go` to understand kernel SRv6 ingress route management.
59
-
5. Explore `internal/plumbing/` for shared kernel and network primitives (VRF, sysctl, interface naming, SRv6).
59
+
2. Read `internal/cni/cni.go` (cmdAdd/cmdDel) to understand the container attach path and how `BGPAdvertisement` CRDs are created.
60
+
3. Read `internal/reconcile/reconcile.go` to understand how Cosmos CRDs are translated into a `DesiredRouter`.
61
+
4. Read `internal/runtime/gobgp/runtime.go` to understand how `DesiredRouter` is applied to GoBGP.
62
+
5. Read `internal/plumbing/intf/intf.go` to understand SRv6 endpoint encoding, interface naming, and base62↔hex conversion.
63
+
6. Explore `internal/plumbing/` for shared kernel and network primitives (VRF, sysctl, interface naming, SRv6).
|`internal/plumbing/srv6`| both | SRv6 ingress route add/del (END.DT46) |
@@ -92,12 +102,16 @@ See [docs/agent-startup.md](docs/agent-startup.md) for the agent startup sequenc
92
102
93
103
-**Identifiers in the SID.** VPC (48-bit) and VPCAttachment (16-bit) identifiers are packed into the low 64 bits of the SRv6 SID, making forwarding state fully self-describing without a lookup table.
94
104
-**Base62 interface names.** Kernel interface names are Base62-encoded to stay within the 15-character limit (`vrfX-Y`, `galX-Y`). The hex form is used for BGP and SRv6; base62 for kernel interfaces.
95
-
-**GoBGP embedded, not sidecar.** GoBGP runs in-process so the agent owns its lifecycle and can gate readiness on BGP availability. Peer and policy config is applied by the cosmos operator via `BGPProvider` / `BGPInstance` / `BGPPeer` CRDs. The provider advertises `L2VPN/EVPN` (AFI=25, SAFI=70) as its sole address family capability.
96
-
-**CNI binary auto-detects mode.** The `galactic-cni` binary runs as both the CNI plugin (when `CNI_COMMAND` is set) and a CLI tool. This avoids shipping two separate binaries on the node.
105
+
-**GoBGP embedded, lazy-started.** GoBGP runs in-process and starts only when the first `BGPRouter` is reconciled (`listenPort=-1`, outbound-only). ASN or RouterID changes trigger a full `Reconfigure` (fresh `BgpServer` — `StopBgp` is not called because it permanently terminates the v4 Serve loop).
106
+
-**CRD-driven config, no sidecar gRPC.**`galactic-router` watches cosmos BGP CRDs directly via controller-runtime. The CNI writes a `BGPAdvertisement` CRD; the router reconciler picks it up. No in-node gRPC calls.
107
+
-**Hash-based no-op suppression.** SHA-256 over the sorted `DesiredRouter` prevents redundant GoBGP Apply calls on every CRD event touch.
108
+
-**RuntimeFactory pattern.**`ROUTER_ROLE=tenant` selects GoBGP; `ROUTER_ROLE=fabric` selects FRR (Phase 2 stub). The binary is selected at startup; no controller changes are needed for Phase 2.
109
+
-**gRPC health on :5000.** Liveness and readiness probes use the gRPC health protocol (`google.golang.org/grpc/health`) on port 5000. No HTTP health endpoint.
97
110
98
111
---
99
112
100
113
## Known Constraints
101
114
102
-
-**GoBGP RIB is ephemeral.** All BGP state is in-process memory. On restart, sessions and paths must be re-established. The cosmos operator is responsible for re-applying config.
103
-
-**No kernel-path unit tests.**`internal/cni`, `internal/plumbing/srv6`, and `internal/plumbing/vrf` require `CAP_NET_ADMIN` and a real kernel. `internal/plumbing/intf` is fully unit-testable (pure functions only). Coverage comes from the e2e suite (`task ci:e2etest`), which only runs on `main` and release tags.
115
+
-**GoBGP RIB is ephemeral.** All BGP state is in-process memory. On restart, sessions and paths must be re-established from CRD state; controller-runtime's reconcile loop handles this automatically.
116
+
-**EVPN Type 5 deferred.**`BGPAdvertisement` does not carry a Route Distinguisher field in the current cosmos API. `galactic-router` returns `ErrMissingRouteDistinguisher` for l2vpn/evpn advertisements and sets `Accepted=False` on the CRD.
117
+
-**No kernel-path unit tests.**`internal/cni`, `internal/plumbing/srv6`, and `internal/plumbing/vrf` require `CAP_NET_ADMIN` and a real kernel. `internal/plumbing/intf` is fully unit-testable (pure functions only). Coverage comes from the e2e suite (`task test:e2e`).
-`internal/plumbing/` — low-level kernel and network primitives shared between router and CNI (`intf`, `srv6`, `sysctl`, `vrf`)
15
+
-`internal/controller/` — controller-runtime reconcilers (BGPRouter, BGPPeer, BGPAdvertisement, BGPPolicy, Secret, Node); also contains field index registration (`indexer.go`) and CRD status helpers (`status.go`)
Place new code in `internal/` unless it must be imported by an external caller. Prefer creating a focused sub-package over adding to an existing large one.
0 commit comments