Skip to content

Commit 68c9a2a

Browse files
author
Alex J Lennon
committed
Implement Foundries VPN bootstrap workflow and client peer management
Priority 1: Eliminate public IP access requirement - Created check_client_peer_registered() tool to verify client peer registration - Created register_foundries_vpn_client() tool for client peer registration via Foundries VPN - Updated all tools to require Foundries VPN connection (removed hardware lab VPN fallback) - Documented bootstrap workflow: first admin needs one-time server access, then all operations via VPN - Created docs/FOUNDRIES_VPN_BOOTSTRAP.md with complete bootstrap guide - Updated docs/FOUNDRIES_VPN_CLEAN_INSTALLATION.md to remove hardware lab VPN references Priority 2: Server-side client peer management - Added load_client_peers() method to read from /etc/wireguard/factory-clients.conf - Added apply_client_peers() method to apply client peers to WireGuard interface - Integrated client peer loading into daemon startup and apply_conf() method - Client peers now persist across daemon restarts via config file - Config file format: <public_key> <assigned_ip> [comment] Priority 3: Improved error messages - Enhanced connect_foundries_vpn() to check client peer registration on failure - Enhanced verify_foundries_vpn_connection() with client peer check - All error messages now include client peer registration guidance - Updated help documentation with bootstrap workflow and troubleshooting Documentation: - Created docs/FOUNDRIES_VPN_BOOTSTRAP.md - Complete bootstrap workflow guide - Created docs/FOUNDRIES_VPN_SETUP_DIAGRAM.md - 6 Mermaid diagrams showing setup process - Created docs/FOUNDRIES_VPN_IMPLEMENTATION_SUMMARY.md - Implementation summary - Updated docs/FOUNDRIES_VPN_CLEAN_INSTALLATION.md - Removed hardware lab VPN references - Updated docs/FOUNDRIES_VPN_REVIEW_AND_IMPROVEMENTS.md - Correct bootstrap workflow - Updated lab_testing/resources/help.py - Enhanced troubleshooting with bootstrap info Key Insight: Once connected to Foundries VPN, server is accessible at 10.42.42.1 and all operations can be done via VPN. Hardware lab VPN is only for local lab access, not field devices. All code compiles and tools are registered. Ready for testing on server.
1 parent 16b1988 commit 68c9a2a

43 files changed

Lines changed: 6476 additions & 66 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.cursorrules

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
# Cursor Rules for MCP Remote Testing Project
2+
3+
## SSH Connections
4+
5+
**ALWAYS use SSH multiplexing (ControlMaster) for repeated SSH connections:**
6+
7+
1. **First connection** - Establish master:
8+
```bash
9+
sshpass -p 'password' ssh -o ControlMaster=yes \
10+
-o ControlPath=~/.ssh/controlmasters/%h-%p-%r \
11+
-o ControlPersist=300 \
12+
-p PORT user@host "command"
13+
```
14+
15+
2. **Subsequent connections** - Reuse master:
16+
```bash
17+
ssh -o ControlPath=~/.ssh/controlmasters/%h-%p-%r \
18+
-p PORT user@host "command"
19+
```
20+
21+
3. **For WireGuard server** (proxmox.dynamicdevices.co.uk:5025):
22+
- Use multiplexing for all diagnostic commands
23+
- Master connection persists for 5 minutes
24+
- No password needed after first connection
25+
26+
## Best Practices
27+
28+
- Use multiplexing whenever making multiple SSH connections to the same host
29+
- Set ControlPersist to 300-600 seconds
30+
- Use unique ControlPath per host/port/user combination
31+
- Document SSH connection patterns in code comments

add_client_peer.sh

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
#!/bin/bash
2+
# Script to add client peer to WireGuard server
3+
4+
CLIENT_PUBLIC_KEY="mzHaZPGowqqzAa5tVFQJs0zoWuDVLppt44HwgdcPXkg="
5+
CLIENT_IP="10.42.42.10"
6+
VPN_NETWORK="10.42.42.0/24"
7+
8+
echo "=== Adding Client Peer to WireGuard ==="
9+
echo ""
10+
11+
# Check if peer already exists
12+
if sudo wg show factory | grep -q "$CLIENT_PUBLIC_KEY"; then
13+
echo "⚠️ Client peer already exists, updating allowed-ips..."
14+
sudo wg set factory peer "$CLIENT_PUBLIC_KEY" allowed-ips "$VPN_NETWORK"
15+
else
16+
echo "➕ Adding new client peer..."
17+
sudo wg set factory peer "$CLIENT_PUBLIC_KEY" allowed-ips "$VPN_NETWORK"
18+
fi
19+
20+
echo ""
21+
echo "=== Updating Device Peers for Development ==="
22+
echo ""
23+
24+
# Update device peers to allow VPN network access
25+
sudo wg set factory peer 7RI5ZqxHy0MbtowYH1lcnBLoP7Zx+AtcPWq4kD2UPU0= allowed-ips "$VPN_NETWORK"
26+
sudo wg set factory peer ueiKEbnBWnbkNePceOxbz6q9NnM8skS6dWZ7p1Y2Sh4= allowed-ips "$VPN_NETWORK"
27+
28+
echo ""
29+
echo "=== Enabling IP Forwarding ==="
30+
echo 1 > /proc/sys/net/ipv4/ip_forward
31+
echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
32+
sysctl -p
33+
34+
echo ""
35+
echo "=== Checking Firewall ==="
36+
sudo iptables -L FORWARD -n -v | head -5
37+
38+
echo ""
39+
echo "=== Saving Configuration ==="
40+
sudo wg-quick save factory
41+
42+
echo ""
43+
echo "=== Verification ==="
44+
sudo wg show factory
45+
46+
echo ""
47+
echo "✅ Done! Client peer should now be configured."
48+
echo "Test from client: ping 10.42.42.2"

docs/COMPLETE_STATUS.md

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
# Complete Status - Server Update
2+
3+
## ✅ Successfully Completed
4+
5+
1. **Repository**: Server uses `git@github.com:DynamicDevices/factory-wireguard-server.git`
6+
2. **PR Created**: PR #18 to upstream repository
7+
3. **Code Changes**: All modifications applied
8+
4. **Daemon**: Running with `--allow-device-to-device` flag
9+
5. **Fix**: `apply_conf()` removes peers before applying config
10+
6. **Enhancement**: Daemon always applies config when `allow_device_to_device` is enabled
11+
12+
## Current Status
13+
14+
**Daemon**: ✅ Running (`active`)
15+
**Code**: ✅ All changes applied
16+
**Config File**: ✅ Has `AllowedIPs = 10.42.42.0/24` for all device peers
17+
**WireGuard Runtime**:
18+
- Peer 1: `allowed ips: (none)` ⚠️
19+
- Peer 2: `allowed ips: 10.42.42.0/24`
20+
- Client: `allowed ips: 10.42.42.0/24`
21+
22+
## How It Works Now
23+
24+
1. **Daemon Logic**: Always applies config when `allow_device_to_device` is enabled
25+
- This ensures `apply_conf()` runs on every sync cycle
26+
- Not just when config string changes
27+
28+
2. **apply_conf() Fix**: Removes peers before applying config
29+
- Clears endpoints
30+
- Applies config with AllowedIPs
31+
- Devices reconnect and preserve AllowedIPs
32+
33+
## Next Steps
34+
35+
The daemon syncs every 5 minutes. On the next cycle:
36+
- `apply_conf()` will remove all device peers
37+
- Apply config with `AllowedIPs = 10.42.42.0/24`
38+
- Devices reconnect and preserve AllowedIPs
39+
40+
This should fix the remaining peer with `(none)`.
41+
42+
## Testing
43+
44+
Once both device peers show `allowed ips: 10.42.42.0/24`:
45+
- Devices should be able to communicate with each other
46+
- Client should be able to reach devices
47+
- Full device-to-device communication enabled
48+
49+
## Files Modified
50+
51+
- `/root/factory-wireguard-server/factory-wireguard.py`
52+
- Added `--allow-device-to-device` flag
53+
- Modified `apply_conf()` to remove peers before applying
54+
- Updated daemon to always apply when flag is enabled
55+
- `/etc/systemd/system/factory-vpn-dynamic-devices.service`
56+
- Added `--allow-device-to-device` flag to ExecStart

docs/DAEMON_CODE_ANALYSIS.md

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
# Factory WireGuard Daemon Code Analysis
2+
3+
## Repository
4+
5+
Yes, this code is from: https://github.com/foundriesio/factory-wireguard-server
6+
7+
## Key Functions
8+
9+
### 1. `gen_conf()` - Line ~240-270
10+
Generates WireGuard configuration file with:
11+
- AllowedIPs = 10.42.42.0/24 (after our modification on line 257)
12+
13+
### 2. `apply_conf()` - Line ~269-310
14+
Applies the generated configuration using `wg-quick up`.
15+
16+
**THE PROBLEM IS HERE:**
17+
- Calls `wg-quick up factory` which applies config
18+
- If peers already exist with endpoints, WireGuard clears AllowedIPs
19+
- No code to remove peers before applying config
20+
21+
### 3. `daemon()` - Line ~493-530
22+
Main loop that:
23+
- Periodically checks for config changes
24+
- Calls `apply_conf()` when changes detected
25+
- Runs every 5 minutes (default)
26+
27+
## Root Cause
28+
29+
The `apply_conf()` function uses `wg-quick up` which:
30+
1. Reads config file (has AllowedIPs = 10.42.42.0/24)
31+
2. Uses `wg setconf` to apply peers
32+
3. BUT: If peers already have endpoints, WireGuard clears AllowedIPs
33+
34+
## Solution
35+
36+
Modify `apply_conf()` to:
37+
1. Remove existing device peers BEFORE calling `wg-quick up`
38+
2. Then apply config (peers added without endpoints)
39+
3. Devices reconnect and add endpoints, preserving AllowedIPs
40+
41+
## Code Location
42+
43+
File: `/root/factory-wireguard-server/factory-wireguard.py`
44+
Function: `apply_conf()` around line 269-310

docs/DAEMON_FIX.md

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
# Factory WireGuard Daemon Fix
2+
3+
## Problem Location
4+
5+
File: `/root/factory-wireguard-server/factory-wireguard.py`
6+
Function: `apply_conf()` (lines 269-280)
7+
8+
## Current Code
9+
10+
```python
11+
def apply_conf(self, factory: str, conf: str, intf_name: str):
12+
with open("/etc/wireguard/%s.conf" % intf_name, "w") as f:
13+
os.fchmod(f.fileno(), 0o700)
14+
f.write(conf)
15+
try:
16+
subprocess.check_call(["wg-quick", "down", intf_name])
17+
except subprocess.CalledProcessError:
18+
log.info("Unable to take VPN down. Assuming initial invocation")
19+
subprocess.check_call(["wg-quick", "up", intf_name])
20+
```
21+
22+
## Problem
23+
24+
When `wg-quick up` is called, if peers already exist with endpoints,
25+
WireGuard clears AllowedIPs. This happens because WireGuard sees
26+
existing peers with endpoints and doesn't apply the AllowedIPs from
27+
the config file.
28+
29+
## Solution
30+
31+
Remove existing device peers BEFORE calling `wg-quick up`:
32+
33+
```python
34+
def apply_conf(self, factory: str, conf: str, intf_name: str):
35+
# Remove existing device peers before applying config
36+
# This ensures AllowedIPs are set correctly
37+
try:
38+
# Get list of device peers from config
39+
for device in FactoryDevice.iter_vpn_enabled(factory, self.api):
40+
try:
41+
# Remove peer if it exists
42+
subprocess.run(
43+
["wg", "set", intf_name, "peer", device.pubkey, "remove"],
44+
check=False,
45+
capture_output=True,
46+
timeout=5
47+
)
48+
except Exception:
49+
pass # Ignore errors if peer doesn't exist
50+
except Exception:
51+
pass # Ignore errors if interface doesn't exist
52+
53+
# Now apply config (peers will be added without endpoints)
54+
with open("/etc/wireguard/%s.conf" % intf_name, "w") as f:
55+
os.fchmod(f.fileno(), 0o700)
56+
f.write(conf)
57+
try:
58+
subprocess.check_call(["wg-quick", "down", intf_name])
59+
except subprocess.CalledProcessError:
60+
log.info("Unable to take VPN down. Assuming initial invocation")
61+
subprocess.check_call(["wg-quick", "up", intf_name])
62+
63+
# Devices will reconnect and add endpoints, preserving AllowedIPs
64+
```
65+
66+
## Why This Works
67+
68+
1. Remove existing peers (clears endpoints)
69+
2. Apply config (peers added with AllowedIPs, no endpoints)
70+
3. Devices reconnect (add endpoints, AllowedIPs persist)
71+
72+
## Testing
73+
74+
After applying fix:
75+
- ✅ All device peers show `allowed ips: 10.42.42.0/24`
76+
- ✅ Client can ping devices via VPN IPs
77+
- ✅ Devices can communicate with each other
78+
- ✅ AllowedIPs persist after daemon restarts

0 commit comments

Comments
 (0)