Checklist:
Describe the bug
The ArgoCD application controller enters an infinite sync failure loop when managed namespaces are deleted without first removing the argocd.argoproj.io/managed-by label. Cluster cache synchronization fails every 10 minutes with 403 Forbidden errors when the controller attempts to list resources in the deleted namespaces, and recovery requires a manual controller restart.
This occurs because the GitOps Engine's cluster cache sync process (pkg/cache/cluster.go) iterates through the configured namespace list but has no mechanism to detect when those namespaces have been deleted externally. The sync fails at the Kubernetes API level when trying to list resources in non-existent namespaces.
To Reproduce
- Deploy ArgoCD with namespaced configuration managing specific namespaces
- Create a test namespace and add the managed-by label:
kubectl create namespace test-namespace
kubectl label namespace test-namespace argocd.argoproj.io/managed-by=argocd
- Deploy an application to the namespace (optional, but makes issue more visible)
- Delete the namespace WITHOUT removing the label first:
kubectl delete namespace test-namespace
- Wait for the next cluster cache sync cycle (default: 10 minutes)
- Observe application controller logs showing sync failures
- Note that applications fail to sync and enter the Unknown state, and ArgoCD becomes unresponsive
Expected behavior
ArgoCD should gracefully handle deleted namespaces by:
- Detecting when managed namespaces no longer exist
- Automatically removing deleted namespaces from its configuration
- Continuing normal operation with remaining valid namespaces
- Self-healing without requiring manual intervention
Screenshots
Version
All ArgoCD versions are affected but I specifically encountered the error on
Logs
time="2025-01-23T09:15:04Z" level=error msg="error synchronizing cache state : failed to sync cluster https://kubernetes.default.svc:443: failed to load initial state of resource apps.Deployment: deployments.apps is forbidden: User \"system:serviceaccount:argocd:argocd-application-controller\" cannot list resource \"deployments\" in API group \"apps\" in the namespace \"test-namespace\"" application=example-app
time="2025-01-23T09:25:04Z" level=error msg="error synchronizing cache state : failed to sync cluster https://kubernetes.default.svc:443: failed to load initial state of resource core.Pod: pods is forbidden: User \"system:serviceaccount:argocd:argocd-application-controller\" cannot list resource \"pods\" in API group \"\" in the namespace \"test-namespace\"" application=example-app