| 1 | +--- |
| 2 | +title: VXLAN Policy Agent Enforcement Cycle |
| 3 | +expires_at: never |
| 4 | +tags: [cf-networking-release, silk-release, asg] |
| 5 | +--- |
| 6 | + |
| 7 | +<!-- vim-markdown-toc GFM --> |
| 8 | + |
| 9 | +* [VXLAN Policy Agent Enforcement Cycle](#vxlan-policy-agent-enforcement-cycle) |
| 10 | + * [Overview](#overview) |
| 11 | + * [Naming Conventions](#naming-conventions) |
| 12 | + * [replaceChainRules Decision Tree](#replacechainrules-decision-tree) |
| 13 | + * [State Evaluation and Recovery](#state-evaluation-and-recovery) |
| 14 | + * [Normal Update Path Detail](#normal-update-path-detail) |
| 15 | + * [Happy Path iptables Examples](#happy-path-iptables-examples) |
| 20 | + |
| 21 | +<!-- vim-markdown-toc --> |
| 22 | + |
| 23 | +# VXLAN Policy Agent Enforcement Cycle |
| 24 | + |
| 25 | +## Overview |
| 26 | + |
| 27 | +The `vxlan-policy-agent` periodically polls the policy server for Application Security Group (ASG) rules and enforces them on the host using `iptables`. To ensure that network traffic is not interrupted or dropped during rule updates, the agent uses a "candidate and rename" strategy. Instead of modifying the active chain directly, it builds a new "candidate" chain, inserts a jump rule to it, and then swaps it into place by deleting the old chain and renaming the new one. |
| 28 | + |
| 29 | +This document explains the decision tree in the `replaceChainRules` function, which performs this swap and recovers from the partial state left behind when the agent crashes or an `iptables` call fails mid-update. |
| 30 | + |
| 31 | +## Naming Conventions |
| 32 | + |
| 33 | +- **Original / Active Chain (`asg-XYZ`)**: The chain currently serving traffic for a container. `XYZ` is derived from the container handle. |
| 34 | +- **Candidate Chain (`casg-XYZ`)**: A temporary chain built during an update to hold the new rules. |
| 35 | +- **Jump Rule**: A rule in the parent chain (e.g., `netout-XYZ`) that directs traffic into the ASG chain (e.g., `-A netout-XYZ -j asg-XYZ`). The presence of a jump rule determines whether a chain is actively evaluating traffic. |
| 36 | + |
| 37 | +## replaceChainRules Decision Tree |
| 38 | + |
| 39 | +The `replaceChainRules` function begins by checking the parent chain for jump rules pointing to either the original chain or the candidate chain. Based on what it finds, it determines the current state and how to proceed. |
| 40 | + |
| 41 | +```mermaid |
| 42 | +flowchart TD |
| 43 | + Start["replaceChainRules entry"] --> CheckJumps["Check jump rules in parent chain"] |
| 44 | + |
| 45 | + CheckJumps --> ParentCheck{"Parent chain\nexists?"} |
| 46 | + ParentCheck -->|"No"| Skip["Skip container\n(Container Creation Race)"] |
| 47 | + Skip --> Abort["Abort (Retry next sync)"] |
| 48 | + |
| 49 | + ParentCheck -->|"Yes"| EvaluateState["Evaluate jump rule state"] |
| 50 | + |
| 51 | + EvaluateState --> Neither{"Neither jump\nexists?"} |
| 52 | + EvaluateState --> OnlyOrig{"Only original\njump exists?"} |
| 53 | + EvaluateState --> OnlyCandidate{"Only candidate\njump exists?"} |
| 54 | + EvaluateState --> BothExist{"Both jumps\nexist?"} |
| 55 | + |
| 56 | + Neither -->|"First time setup"| EnforceDirect["Enforce directly on asg-XYZ\n[active: asg-XYZ]"] |
| 57 | + EnforceDirect --> Done["Done"] |
| 58 | + |
| 59 | + OnlyOrig -->|"Normal path"| NormalUpdate["Normal Update Path\n(see below)"] |
| 60 | + NormalUpdate --> Done |
| 61 | + |
| 62 | + OnlyCandidate -->|"Failed to rename"| CheckOrphan["Check if orphaned\nasg-XYZ chain exists"] |
| 63 | + CheckOrphan -->|"Yes"| FlushDelete["Flush and delete\norphaned asg-XYZ\n[active: casg-XYZ]"] |
| 64 | + CheckOrphan -->|"No"| RenameRecovery1["Rename casg-XYZ to asg-XYZ\n[active: asg-XYZ]"] |
| 65 | + FlushDelete --> RenameRecovery1 |
| 66 | + RenameRecovery1 --> NormalUpdate |
| 67 | + |
| 68 | + BothExist -->|"Failed to delete old"| DeleteOrig["Delete asg-XYZ chain\nand jump rule\n[active: casg-XYZ]"] |
| 69 | + DeleteOrig --> RenameRecovery2["Rename casg-XYZ to asg-XYZ\n[active: asg-XYZ]"] |
| 70 | + RenameRecovery2 --> NormalUpdate |
| 71 | +``` |
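The state check at the top of the decision tree can be sketched as a small classifier. This is a minimal illustration, not the agent's actual code: the names `JumpState`, `jump_targets`, and `classify` are hypothetical, the parent chain is modeled as a list of `iptables -S` style rule strings, and `None` stands for a parent chain that does not exist yet.

```python
from enum import Enum
from typing import List, Optional


class JumpState(Enum):
    NEITHER = "first-time setup"
    ONLY_ORIGINAL = "normal update"
    ONLY_CANDIDATE = "recover: rename failed"
    BOTH = "recover: delete of old chain failed"


def jump_targets(parent_rules: List[str]) -> List[str]:
    """Collect the token that follows -j in each rule spec."""
    targets = []
    for rule in parent_rules:
        tokens = rule.split()
        if "-j" in tokens[:-1]:  # require a token after -j
            targets.append(tokens[tokens.index("-j") + 1])
    return targets


def classify(parent_rules: Optional[List[str]], suffix: str) -> Optional[JumpState]:
    """Return the jump state, or None when the parent chain is missing."""
    if parent_rules is None:
        return None  # container creation race: skip, retry on the next sync
    targets = jump_targets(parent_rules)
    has_original = f"asg-{suffix}" in targets
    has_candidate = f"casg-{suffix}" in targets
    if has_original and has_candidate:
        return JumpState.BOTH
    if has_candidate:
        return JumpState.ONLY_CANDIDATE
    if has_original:
        return JumpState.ONLY_ORIGINAL
    return JumpState.NEITHER
```

For example, `classify(["-A netout-XYZ -j casg-XYZ", "-A netout-XYZ -j asg-XYZ"], "XYZ")` yields `JumpState.BOTH`, which corresponds to the "Failed to delete old" branch of the diagram.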
| 72 | + |
| 73 | +## State Evaluation and Recovery |
| 74 | + |
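| 75 | +The following table lists the states that `replaceChainRules` can encounter: the four jump-rule combinations, the missing-parent-chain case, and the interrupted-update failure modes that lead to them. For each, it gives the recovery action taken and why that action is safe for the running application's traffic. |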
| 76 | + |
| 77 | +| State / Failure Mode | Jump Rules Found | Recovery Action | Active Chain (Latest-known-good) | Why it is Safe | |
| 78 | +| :--- | :--- | :--- | :--- | :--- | |
| 79 | +| **First-time Setup** | Neither | Enforce directly on `asg-XYZ` (create chain, append rules, insert jump). | `asg-XYZ` | No existing traffic or rules to disrupt. | |
| 80 | +| **Normal Update** | Only `asg-XYZ` | Create `casg-XYZ`, insert its jump at position 1, append the new rules, delete the `asg-XYZ` jump & chain, rename `casg-XYZ` -> `asg-XYZ`. | `asg-XYZ` -> `casg-XYZ` -> `asg-XYZ` | The original jump rule stays in the parent chain until the candidate is fully populated. Packets that match no rule in `casg-XYZ` return to the parent chain and continue to the `asg-XYZ` jump, so traffic allowed by the old rules keeps flowing until the old chain is deleted. | |
| 81 | +| **Container Creation Race (Parent chain not ready)** | N/A (Check fails) | Skip container enforcement. Retry on next sync cycle. | None | The container's network interface (and parent chain) is still being created by the CNI plugin. No application traffic can escape the container until the parent chain is wired up, so skipping enforcement temporarily does not leak traffic. | |
| 82 | +| **Interrupted Update (Failed to create candidate)** | Only `asg-XYZ` | Normal update path retries. | `asg-XYZ` | The original chain and jump rule were never modified. Traffic continues to flow through the old rules uninterrupted. | |
| 83 | +| **Interrupted Update (Failed to insert candidate jump)** | Only `asg-XYZ` | Normal update path retries. Agent attempts to delete the orphaned candidate chain during the failure. | `asg-XYZ` | The parent chain was never successfully modified to point to the candidate. Traffic continues through the original chain. | |
| 84 | +| **Interrupted Update (Failed to append rules)** | Only `asg-XYZ` (if immediate cleanup succeeds) or Both (if cleanup fails) | Normal update or "Both exist" recovery. | `asg-XYZ` (if cleanup succeeds) or `casg-XYZ` (if cleanup fails) | The agent immediately attempts to delete the candidate chain and jump rule to revert traffic to the original chain. If this cleanup fails, it falls into the "Both exist" recovery state on the next sync. | |
| 85 | +| **Interrupted Update (Failed to rename)** | Only `casg-XYZ` | Flush/delete orphaned `asg-XYZ` chain, rename `casg-XYZ` -> `asg-XYZ`, run normal update. | `casg-XYZ` | The candidate chain contains the fully built ruleset from the previous run. Renaming it restores the standard naming convention without dropping traffic. The orphaned `asg-XYZ` chain is cleared first because a rename onto an existing chain name fails; left in place, it would make `RenameChain` fail again on each sync (indefinite loop bug). | |
| 86 | +| **Interrupted Update (Failed to delete old)** | Both | Delete `asg-XYZ` chain & jump, rename `casg-XYZ` -> `asg-XYZ`, run normal update. | `casg-XYZ` | `casg-XYZ` was inserted at position 1, so it is already actively evaluating traffic with the newer rules. Deleting the old chain safely cleans up unused rules and prevents traffic from falling back to old rules (traffic regression bug). | |
| 87 | + |
| 88 | +## Normal Update Path Detail |
| 89 | + |
| 90 | +When the agent executes a normal update (starting from "Only original exists" or after recovering from a partial failure), it follows these steps: |
| 91 | + |
| 92 | +```mermaid |
| 93 | +flowchart TD |
| 94 | + Start["Normal Update Path"] --> CreateCand["Create casg-XYZ chain"] |
| 95 | + CreateCand -->|"Error"| FailCreate["Abort\n(Failed to create candidate)"] |
| 96 | + |
| 97 | + CreateCand -->|"Success"| InsertJump["Insert jump rule to casg-XYZ\nat position 1 in parent"] |
| 98 | + InsertJump -->|"Error"| FailInsert["Delete casg-XYZ & Abort\n(Failed to insert candidate jump)"] |
| 99 | + |
| 100 | + InsertJump -->|"Success\n[active: casg-XYZ]"| AppendRules["Append new rules to casg-XYZ"] |
| 101 | + AppendRules -->|"Error"| FailAppend["Delete casg-XYZ & jump rule, then Abort\n(Failed to append rules)"] |
| 102 | + |
| 103 | + AppendRules -->|"Success"| DeleteOld["Delete old asg-XYZ jump rule,\nthen the asg-XYZ chain"] |
| 104 | + DeleteOld -->|"Error"| FailDelete["Abort\n(Failed to delete old)"] |
| 105 | + |
| 106 | + DeleteOld -->|"Success"| RenameCand["Rename casg-XYZ to asg-XYZ"] |
| 107 | + RenameCand -->|"Error"| FailRename["Abort\n(Failed to rename)"] |
| 108 | + |
| 109 | + RenameCand -->|"Success\n[active: asg-XYZ]"| Cleanup["Cleanup extra parent jump rules"] |
| 110 | + Cleanup --> Done["Done"] |
| 111 | +``` |
| 112 | + |
| 113 | +## Happy Path iptables Examples |
| 114 | + |
| 115 | +The following examples show the state of the `iptables` rules for a container (handle `abc123def456`) at each step of its lifecycle, from initial creation by the CNI plugin to a successful ASG update by the `vxlan-policy-agent`. |
| 116 | + |
| 117 | +### 1. CNI netrules create |
| 118 | +The CNI plugin creates the parent chain (`netout-abc123def456`) and adds default rules to allow established connections and reject everything else. The ASG chain does not exist yet. |
| 119 | + |
| 120 | +```iptables |
| 121 | +-N netout-abc123def456 |
| 122 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 123 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 124 | +``` |
| 125 | + |
| 126 | +### 2. CNI force asg (Initial enforcement) |
| 127 | +The CNI plugin calls the `vxlan-policy-agent` to force an immediate ASG sync. The agent creates the `asg-abc123def456` chain, populates it with the initial rules, and inserts a jump rule at position 1 in the parent chain. |
| 128 | + |
| 129 | +```iptables |
| 130 | +-N asg-abc123def456 |
| 131 | +-A asg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 80 -j ACCEPT |
| 132 | +-N netout-abc123def456 |
| 133 | +-A netout-abc123def456 -j asg-abc123def456 |
| 134 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 135 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 136 | +``` |
| 137 | + |
| 138 | +### 3. vxlan-policy-agent updating (new chain created) |
| 139 | +During a periodic sync, the agent detects a rule change. It creates a new candidate chain (`casg-abc123def456`). The original chain is still active. |
| 140 | + |
| 141 | +```iptables |
| 142 | +-N asg-abc123def456 |
| 143 | +-A asg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 80 -j ACCEPT |
| 144 | +-N casg-abc123def456 |
| 145 | +-N netout-abc123def456 |
| 146 | +-A netout-abc123def456 -j asg-abc123def456 |
| 147 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 148 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 149 | +``` |
| 150 | + |
| 151 | +### 4. vxlan-policy-agent updating (new jump inserted & rules appended) |
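| 152 | +The agent inserts a jump rule to the candidate chain at position 1 in the parent chain, and appends the new rules to the candidate chain. Traffic is now evaluated against the candidate chain first; packets that match nothing there return to the parent chain and continue to the original chain's jump. |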
| 153 | + |
| 154 | +```iptables |
| 155 | +-N asg-abc123def456 |
| 156 | +-A asg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 80 -j ACCEPT |
| 157 | +-N casg-abc123def456 |
| 158 | +-A casg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 443 -j ACCEPT |
| 159 | +-N netout-abc123def456 |
| 160 | +-A netout-abc123def456 -j casg-abc123def456 |
| 161 | +-A netout-abc123def456 -j asg-abc123def456 |
| 162 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 163 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 164 | +``` |
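The listing above can be checked with a toy first-match evaluator. This is a deliberately simplified model, not how the kernel or the agent represents rules: each rule is a hypothetical `(dport, target)` pair that matches only on TCP destination port (`None` matches anything), and the conntrack rule is left out. It relies on one standard `iptables` behavior: a packet that reaches the end of a user-defined chain returns to the calling chain.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[Optional[int], str]  # (tcp dport to match, or None for any; target)


def evaluate(chains: Dict[str, List[Rule]], chain: str, dport: int) -> Optional[str]:
    """Walk `chain` first-match style; return a verdict, or None on fall-through."""
    for match_port, target in chains[chain]:
        if match_port is not None and match_port != dport:
            continue
        if target in ("ACCEPT", "REJECT"):
            return target
        verdict = evaluate(chains, target, dport)  # jump into a user-defined chain
        if verdict is not None:
            return verdict
        # nothing matched in the jumped-to chain: return here and keep going
    return None


PARENT = "netout-abc123def456"
ORIG = "asg-abc123def456"
CAND = "casg-abc123def456"

# State after step 4: candidate jump at position 1, original jump still present.
step4 = {
    PARENT: [(None, CAND), (None, ORIG), (None, "REJECT")],
    ORIG: [(80, "ACCEPT")],
    CAND: [(443, "ACCEPT")],
}

for port in (443, 80, 22):
    print(port, evaluate(step4, PARENT, port))
```

Under this model, port 443 is accepted by the candidate chain, port 80 falls through the candidate and is still accepted by the original chain, and port 22 reaches the final `REJECT`. The old rule for port 80 stops applying only once the original jump is removed in the next step.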
| 165 | + |
| 166 | +### 5. vxlan-policy-agent updating (remove old jump rule) |
| 167 | +The agent deletes the jump rule pointing to the original `asg-abc123def456` chain. |
| 168 | + |
| 169 | +```iptables |
| 170 | +-N asg-abc123def456 |
| 171 | +-A asg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 80 -j ACCEPT |
| 172 | +-N casg-abc123def456 |
| 173 | +-A casg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 443 -j ACCEPT |
| 174 | +-N netout-abc123def456 |
| 175 | +-A netout-abc123def456 -j casg-abc123def456 |
| 176 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 177 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 178 | +``` |
| 179 | + |
| 180 | +### 6. vxlan-policy-agent updating (remove old chain) |
| 181 | +The agent flushes and deletes the original `asg-abc123def456` chain. |
| 182 | + |
| 183 | +```iptables |
| 184 | +-N casg-abc123def456 |
| 185 | +-A casg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 443 -j ACCEPT |
| 186 | +-N netout-abc123def456 |
| 187 | +-A netout-abc123def456 -j casg-abc123def456 |
| 188 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 189 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 190 | +``` |
| 191 | + |
| 192 | +### 7. vxlan-policy-agent updating (rename new chain to old chain) |
| 193 | +The agent renames `casg-abc123def456` to `asg-abc123def456`. The jump rule in the parent chain is automatically updated by `iptables` to reflect the new name. The update is complete. |
| 194 | + |
| 195 | +```iptables |
| 196 | +-N asg-abc123def456 |
| 197 | +-A asg-abc123def456 -d 10.0.0.0/8 -p tcp -m tcp --dport 443 -j ACCEPT |
| 198 | +-N netout-abc123def456 |
| 199 | +-A netout-abc123def456 -j asg-abc123def456 |
| 200 | +-A netout-abc123def456 -m state --state RELATED,ESTABLISHED -j ACCEPT |
| 201 | +-A netout-abc123def456 -j REJECT --reject-with icmp-port-unreachable |
| 202 | +``` |