Commit 19f0d18

authored
Merge pull request #3204 from OctopusDeploy/argo-cd-gateway-troubleshooting-consolidation
Improve Argo CD gateway troubleshooting, add install prerequisites
2 parents 266f1d7 + 119d83f commit 19f0d18

2 files changed

Lines changed: 164 additions & 62 deletions

File tree

src/pages/docs/argo-cd/instances/index.md

Lines changed: 20 additions & 56 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
22
layout: src/layouts/Default.astro
33
pubDate: 2025-09-15
4-
modDate: 2025-12-08
4+
modDate: 2026-06-11
55
navSection: Argo CD Instances
66
navTitle: Overview
77
title: Overview
@@ -29,6 +29,24 @@ The [Kubernetes agent](/docs/kubernetes/targets/kubernetes-agent) and the [Kuber
2929

3030
:::
3131

32+
## Prerequisites
33+
34+
The gateway makes three outgoing connections. Before installing, make sure all of them are reachable from inside your cluster:
35+
36+
| Destination | Protocol | Port |
37+
| --- | --- | --- |
38+
| Octopus Server REST API | HTTPS | `443` |
39+
| Octopus Server gRPC endpoint | gRPC (HTTP/2) | `8443` by default |
40+
| Argo CD API server | gRPC (in-cluster) | Argo CD service port |
41+
42+
:::div{.warning}
43+
If your Octopus Server sits behind a load balancer, proxy, or firewall, make sure the gRPC port (`8443` by default) is forwarded to Octopus Server and the proxy supports HTTP/2. Forwarding only HTTPS (`443`) is a common cause of installation failure, where the gateway registers successfully but never connects. See [Troubleshooting](/docs/argo-cd/troubleshooting#failed-to-connect-to-octopus) for details.
44+
:::
45+
46+
:::div{.hint}
47+
The gateway holds long-lived gRPC streams and sends a keep-alive every 30 seconds by default. If a load balancer between the cluster and Octopus Server closes idle connections, set its idle timeout to comfortably exceed the keep-alive interval (`gateway.octopus.keepAlive.intervalSeconds`).
48+
:::
49+
3250
## Installing the Octopus Argo CD Gateway
3351

3452
The gateway is installed using [Helm](https://helm.sh) via the [octopusdeploy/octopus-argocd-gateway-chart](https://hub.docker.com/r/octopusdeploy/octopus-argocd-gateway-chart) chart.
@@ -206,61 +224,7 @@ The Octopus Argo CD gateway Helm chart follows [Semantic Versioning](https://sem
206224

207225
## Troubleshooting
208226

209-
### Argo CD TLS Errors
210-
211-
If your gateway is unable to connect to your Argo CD instance due to TLS errors it is likely due to the certificate that Argo CD is serving traffic with.
212-
213-
#### Self Signed Certificate
214-
215-
If you are getting an error that looks like this:
216-
217-
```text
tls: failed to verify certificate: x509: certificate signed by unknown authority
```
220-
221-
It is most likely due to Argo CD using a self-signed certificate, if it is intended that your certificate is self-signed you can disable certificate verification by doing the following:
222-
223-
Using Helm for existing installation:
224-
225-
```bash
helm upgrade --atomic \
  --version "1.0.0" \
  --namespace "{{GATEWAY_NAMESPACE}}" \
  --reset-then-reuse-values \
  --set gateway.argocd.insecure="true" \
  --set gateway.argocd.plaintext="false" \
  {{EXISTING_HELM_RELEASE_NAME}} \
  oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
235-
236-
:::div{.warning}
237-
By setting `gateway.argocd.insecure="true"`, TLS certificate verification will no longer be performed between the gateway and the Argo CD instance. Make sure this configuration is necessary to avoid potential security issues.
238-
:::
239-
240-
#### No Certificate
241-
242-
If you are running your Argo CD instance without a certificate due to terminating SSL at a load balancer level the gateway will likely fail to connect with the following error:
243-
244-
```text
transport: authentication handshake failed: EOF
```
247-
248-
This is because the gateway is configured by default to require encrypted traffic, if it is intended that you don't have a certificate you can disable encryption between the gateway and Argo CD by doing the following:
249-
250-
```bash
helm upgrade --atomic \
  --version "1.0.0" \
  --namespace "{{GATEWAY_NAMESPACE}}" \
  --reset-then-reuse-values \
  --set gateway.argocd.insecure="false" \
  --set gateway.argocd.plaintext="true" \
  {{EXISTING_HELM_RELEASE_NAME}} \
  oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
260-
261-
:::div{.warning}
262-
By setting `gateway.argocd.plaintext="true"`, all traffic between the gateway and Argo CD will be unencrypted. Make sure this configuration is necessary to avoid potential security issues.
263-
:::
227+
If your gateway is unable to connect to your Argo CD instance or Octopus Server (e.g. due to TLS errors), see [Troubleshooting Argo CD in Octopus](/docs/argo-cd/troubleshooting) for common issues and resolutions.
264228

265229
## Deleting an Octopus Argo CD Gateway
266230

src/pages/docs/argo-cd/troubleshooting.md

Lines changed: 144 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
---
22
layout: src/layouts/Default.astro
33
pubDate: 2025-09-15
4-
modDate: 2025-09-15
4+
modDate: 2026-06-11
55
title: Troubleshooting Argo CD in Octopus
66
navTitle: Troubleshooting
77
description: How to resolve configuration issues
@@ -38,14 +38,45 @@ Resolution:
3838

3939
### Argo CD Gateway install fails initial health check
4040

41+
#### Failed to connect to Octopus
42+
43+
Behavior:
44+
45+
- Install Argo CD Gateway dialog states:
46+
- "Gateway registered with Octopus" was successful
47+
- "Failed to connect to Octopus" and "Failed to connect to Argo CD" both show as failed
48+
- The gateway pod is in a `CrashLoopBackOff` state
49+
- In a Kubernetes viewer (e.g. K9s), the gateway pod logs state "*Gateway failed to connect to Octopus*"
50+
- If installed with `helm install --atomic`, the install fails and rolls back, removing the gateway from the cluster. The registered gateway still appears under **Infrastructure ➜ Argo CD Instances** but will never become healthy
51+
52+
Cause:
53+
54+
- The gateway cannot establish a gRPC connection to Octopus Server. Both "Failed to connect" rows in the dialog are caused by this single problem, not two separate ones
55+
- Registration uses the REST API URL (`registration.octopus.serverApiUrl`), while the running gateway connects to a separate gRPC endpoint (`gateway.octopus.serverGrpcUrl`, port `8443` by default), so a successful registration does not mean the gRPC endpoint is reachable
56+
57+
Resolution:
58+
59+
- Confirm port `8443` is open and routed through to Octopus Server. A load balancer, proxy, or firewall that only forwards HTTPS (`443`) is a common cause. Probe it from inside the cluster:
60+
61+
```bash
kubectl run port-check --image=busybox --restart=Never --rm -it -- \
  sh -c 'nc -z -w 5 your-octopus-url 8443 && echo REACHABLE || echo UNREACHABLE'
```
65+
66+
- Confirm `gateway.octopus.serverGrpcUrl` points at your Octopus Server's gRPC endpoint, including the port (not the web URL)
67+
- If the gateway logs a certificate thumbprint mismatch, confirm `gateway.octopus.serverThumbprint` matches your Octopus Server's certificate thumbprint
68+
- Inspect the gateway pod logs for connection details: `kubectl logs deploy/octopus-argocd-gateway -n <namespace>`
69+
- If the install was rolled back (e.g. `helm install --atomic` failed and cleaned up the cluster), delete the orphaned Argo CD Gateway in Octopus, resolve the connection issue, and re-run the installation
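The port probe above needs the host and port the gateway is actually configured with, which may differ from `8443`. The following is a minimal sketch, assuming `gateway.octopus.serverGrpcUrl` has the form `scheme://host:port`. The `split_endpoint` helper and the `grpc://octopus.example.com:8443` value are hypothetical and not taken from the chart.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split an endpoint such as "grpc://octopus.example.com:8443"
# into "host port". The scheme and hostname are placeholders (assumptions).
split_endpoint() {
  local url="${1:-}"
  local hostport="${url#*://}"   # drop a leading "scheme://" if present
  hostport="${hostport%%/*}"     # drop any trailing path
  local host="${hostport%:*}"    # text before the last colon
  local port="${hostport##*:}"   # text after the last colon
  if [ "$host" = "$hostport" ] || [ -z "$port" ]; then
    echo "no port found in '$url'" >&2
    return 1
  fi
  printf '%s %s\n' "$host" "$port"
}

# Example (not run automatically): probe the configured endpoint from inside the cluster.
# read -r host port < <(split_endpoint "grpc://octopus.example.com:8443")
# kubectl run port-check --image=busybox --restart=Never --rm -it -- \
#   sh -c "nc -z -w 5 $host $port && echo REACHABLE || echo UNREACHABLE"
```

To see the value currently in use, `helm get values {{EXISTING_HELM_RELEASE_NAME}} --namespace "{{GATEWAY_NAMESPACE}}" --all` prints the computed values for the release. The helper does not handle IPv6 literals.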
70+
71+
#### Failed to connect to Argo CD
72+
4173
Behavior:
4274

4375
- Install Argo CD Gateway dialog states:
44-
- "established a connection" was successful
45-
- Health check failed
46-
- The Gateway pod is in a CrashLoopBackoff
76+
- "Gateway registered with Octopus" was successful
77+
- "Failed to connect to Argo CD" shows as failed
4778
- In a Kubernetes viewer (e.g. K9s), the gateway pod logs state "*error validating connection to Argo CD*"
48-
- In Octopus, the healthcheck task log contains: "The Argo CD Gateway has not established a gRPC connection to Octopus Server"
79+
- In Octopus, the "Gateway connectivity" tab of the newly added Argo CD instance shows an "Argo CD Connectivity Issues" warning
4980

5081
Cause:
5182

@@ -54,10 +85,117 @@ Cause:
5485
Resolution:
5586

5687
- Confirm the URL specified for `gateway.argocd.serverGrpcUrl` matches the expected gRPC endpoint of your Argo CD instance (`<servicename>.<namespace>.svc.cluster.local`)
57-
- If your Argo CD instance is using a self-signed certificate ensure `gateway.argocd.insecure` is set to `true`
88+
- If your Argo CD instance is using a self-signed certificate ensure `gateway.argocd.insecure` is set to `true` (see [TLS errors](#argo-cd-gateway-cannot-connect-to-argo-cd-due-to-tls-errors) below)
5889
- If your Argo CD instance is running in "insecure" mode, ensure `gateway.argocd.plaintext` is set to `true` (false otherwise)
5990
- In Octopus, delete the registered Argo CD Gateway, follow all required helm deletion commands, and reinstall
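The `<servicename>.<namespace>.svc.cluster.local` pattern in the first resolution step can be assembled as in the sketch below. The `service_dns` helper is hypothetical. The names `argocd-server` and `argocd` are assumptions based on a default Argo CD install, and `cluster.local` is assumed to be your cluster domain. Substitute the values from your own cluster.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the in-cluster DNS name of a Kubernetes Service.
# Assumes the cluster domain is "cluster.local".
service_dns() {
  local service="${1:-}" namespace="${2:-}"
  if [ -z "$service" ] || [ -z "$namespace" ]; then
    echo "usage: service_dns <servicename> <namespace>" >&2
    return 1
  fi
  printf '%s.%s.svc.cluster.local\n' "$service" "$namespace"
}

# Example (not run automatically): list the services in the Argo CD namespace,
# then build the host name to compare against gateway.argocd.serverGrpcUrl.
# kubectl get svc --namespace argocd
# service_dns argocd-server argocd
```

The result is only the host part. The port still has to match the Argo CD service port.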
6091

92+
### Argo CD Gateway cannot connect to Argo CD due to TLS errors
93+
94+
If your gateway cannot connect to your Argo CD instance because of TLS errors, the cause is most likely the certificate that Argo CD serves traffic with.
95+
96+
#### Self Signed Certificate
97+
98+
Behavior:
99+
100+
- The gateway is unable to connect to your Argo CD instance
101+
- The gateway pod logs contain:
102+
103+
```text
tls: failed to verify certificate: x509: certificate signed by unknown authority
```
106+
107+
Cause:
108+
109+
- Argo CD is using a self-signed certificate
110+
111+
Resolution:
112+
113+
- Configure the gateway to trust your certificate, as described in [Trusting Certificates](/docs/argo-cd/instances#trusting-certificates)
114+
- Alternatively, if the self-signed certificate is intentional, you can disable certificate verification by doing the following:
115+
116+
Using Helm for an existing installation:
117+
118+
```bash
helm upgrade --atomic \
  --version "1.0.0" \
  --namespace "{{GATEWAY_NAMESPACE}}" \
  --reset-then-reuse-values \
  --set gateway.argocd.insecure="true" \
  --set gateway.argocd.plaintext="false" \
  {{EXISTING_HELM_RELEASE_NAME}} \
  oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
128+
129+
:::div{.warning}
130+
By setting `gateway.argocd.insecure="true"`, TLS certificate verification will no longer be performed between the gateway and the Argo CD instance. Confirm this configuration is genuinely required before applying it, because it weakens security.
131+
:::
132+
133+
#### No Certificate
134+
135+
Behavior:
136+
137+
- The gateway fails to connect to your Argo CD instance
138+
- The gateway pod logs contain:
139+
140+
```text
transport: authentication handshake failed: EOF
```
143+
144+
Cause:
145+
146+
- Your Argo CD instance is running without a certificate (e.g. SSL is terminated at a load balancer), while the gateway is configured by default to require encrypted traffic
147+
148+
Resolution:
149+
150+
- If running Argo CD without a certificate is intentional, you can disable encryption between the gateway and Argo CD by doing the following:
151+
152+
```bash
helm upgrade --atomic \
  --version "1.0.0" \
  --namespace "{{GATEWAY_NAMESPACE}}" \
  --reset-then-reuse-values \
  --set gateway.argocd.insecure="false" \
  --set gateway.argocd.plaintext="true" \
  {{EXISTING_HELM_RELEASE_NAME}} \
  oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
162+
163+
:::div{.warning}
164+
By setting `gateway.argocd.plaintext="true"`, all traffic between the gateway and Argo CD will be unencrypted. Confirm this configuration is genuinely required before applying it, because it weakens security.
165+
:::
166+
167+
## Gateway Connectivity
168+
169+
### Gateway connection drops at regular intervals (load balancer idle timeout)
170+
171+
Behavior:
172+
173+
- The gateway installs and connects successfully, but loses its connection to Octopus Server after every quiet period of the same length (e.g. 60 seconds without activity)
174+
- Deployments with Argo CD steps fail intermittently with gRPC connection errors, and succeed when retried
175+
- The "Gateway connectivity" tab of the Argo CD instance intermittently shows "Unavailable", depending on when the last health check ran
176+
- The gateway pod logs show stream errors followed by an immediate reconnection
177+
- If the load balancer drops connections silently instead of closing them, the logs show failing keep alives (`keep alive check failed - cancelling subscribers` with `DeadlineExceeded` errors) and the gateway pod restart count climbs at a regular cadence
178+
179+
Cause:
180+
181+
- A load balancer or proxy between the gateway and Octopus Server closes connections it considers idle
182+
- The gateway sends a keep alive to Octopus Server every 30 seconds by default to hold the connection open. If the load balancer's idle timeout is shorter than the keep alive interval (or keep alives are disabled), the connection is terminated before the next keep alive is sent
183+
184+
Resolution:
185+
186+
- Increase the idle timeout on your load balancer so it comfortably exceeds the keep alive interval (`gateway.octopus.keepAlive.intervalSeconds`, default 30 seconds)
187+
- Alternatively, reduce the keep alive interval below the load balancer's idle timeout:
188+
189+
```bash
helm upgrade --atomic \
  --version "1.0.0" \
  --namespace "{{GATEWAY_NAMESPACE}}" \
  --reset-then-reuse-values \
  --set gateway.octopus.keepAlive.intervalSeconds="15" \
  {{EXISTING_HELM_RELEASE_NAME}} \
  oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
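Whether an idle timeout "comfortably exceeds" the keep-alive interval can be checked with simple arithmetic. The sketch below assumes a margin of twice the interval, which is an illustrative choice and not a documented requirement. The `check_idle_timeout` helper is hypothetical. The default interval of 30 seconds comes from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare a load balancer idle timeout (seconds) with the
# gateway keep-alive interval (seconds, default 30).
# Assumption: "comfortably exceeds" is taken to mean at least 2x the interval.
check_idle_timeout() {
  local idle="$1" interval="${2:-30}"
  if [ "$idle" -ge $(( interval * 2 )) ]; then
    echo "OK: idle timeout ${idle}s covers keep-alive interval ${interval}s"
  else
    echo "TOO SHORT: raise idle timeout to $(( interval * 2 ))s or lower the interval to $(( idle / 2 ))s"
    return 1
  fi
}
```

For example, `check_idle_timeout 20 30` fails and suggests either a 60 second idle timeout or a 10 second interval, which would then be applied through `gateway.octopus.keepAlive.intervalSeconds`.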
198+
61199
## Application/Project mapping
62200

63201
### No applications are listed on the **Argo CD Instance ➜ Applications** page
