Component(s)
target allocator
What happened?
Description
The current implementation of the OpenTelemetry Operator and its CRD does not support collectorSelectors for the TargetAllocator CR.
This is mandatory: without a selector that restricts collectors to workloads with a certain label, any workload in the cluster can be assigned as a collector, depending on the RBAC permissions of the serviceAccount.
The OTELCOL_NAMESPACE environment variable can limit the selected collectors, but in the example I still end up with non-OTC pods being assigned targets, and therefore with missing scrape targets.
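To illustrate what a collector selector would change, here is a minimal Python sketch of equality-based label matching. It is not the operator's or the target allocator's code. The pod names `example`, `example2` and `reproducer-targetallocator-75596bd594-mp4np` and the `app: httpd` label come from this report; the pod `otc-collector-0`, the label values on the allocator and collector pods, and the helper names are assumptions made up for the example.

```python
from typing import Dict, List

# Pod name -> labels. Only the "app: httpd" labels are taken from this report;
# the other labels and the "otc-collector-0" pod are hypothetical.
PODS: Dict[str, Dict[str, str]] = {
    "example": {"app": "httpd"},
    "example2": {"app": "httpd"},
    "reproducer-targetallocator-75596bd594-mp4np": {
        "app.kubernetes.io/component": "opentelemetry-targetallocator",
    },
    "otc-collector-0": {
        "app.kubernetes.io/component": "opentelemetry-collector",
    },
}


def matches(labels: Dict[str, str], match_labels: Dict[str, str]) -> bool:
    """Return True if every key/value pair of the selector is present on the pod."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def select_collectors(pods: Dict[str, Dict[str, str]],
                      match_labels: Dict[str, str]) -> List[str]:
    """Return the names of all pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(labels, match_labels)]


# An empty selector matches every pod, which resembles the behaviour reported here.
everything = select_collectors(PODS, {})

# A selector on a collector-specific label keeps only the real collector.
only_collectors = select_collectors(
    PODS, {"app.kubernetes.io/component": "opentelemetry-collector"}
)

print(everything)
print(only_collectors)
```

With an empty selector all four pods qualify as collectors; with the label selector only the hypothetical collector pod remains.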
Steps to Reproduce
cat <<'EOF' | oc -n reproducer create -f-
apiVersion: opentelemetry.io/v1alpha1
kind: TargetAllocator
metadata:
  name: reproducer
spec:
  allocationStrategy: consistent-hashing
  collectorNotReadyGracePeriod: 30s
  filterStrategy: relabel-config
  ipFamilyPolicy: SingleStack
  managementState: managed
  observability:
    metrics:
      enableMetrics: false
  prometheusCR:
    enabled: false
    scrapeInterval: 30s
  replicas: 1
  scrapeConfigs:
    - fallback_scrape_protocol: PrometheusText1.0.0
      job_name: reproducer
      scheme: http
      static_configs:
        - targets:
            - host01:9090
            - host02:9090
            - host03:9090
            - host04:9090
            - host05:9090
            - host06:9090
            - host07:9090
            - host08:9090
            - host09:9090
            - host10:9090
            - host11:9090
            - host12:9090
            - host13:9090
            - host14:9090
            - host15:9090
            - host16:9090
            - host17:9090
            - host18:9090
            - host19:9090
            - host20:9090
EOF
- use port-forwarding to see which collectors have been assigned
oc -n reproducer port-forward service/reproducer-targetallocator 8080:80 &
cat <<'EOF' | python
import re, requests
pattern = r'.*href="/debug/collector.+?>([^<]+)</a>'
rsp = requests.get("http://localhost:8080").text
for match in re.finditer(pattern, rsp):
    print(match.groups()[0])
EOF
- this outputs the targetAllocator itself and/or multiple other pods in that namespace
- add a simple httpd pod to verify that the targetAllocator selects it for target dispatching
cat <<'EOF' | oc -n reproducer create -f-
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: httpd
  namespace: targetallocatorreproducer
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: httpd
      image: 'image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest'
      ports:
        - containerPort: 8080
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
EOF
- re-check the targetAllocator collector list
oc -n reproducer port-forward service/reproducer-targetallocator 8080:80 &
cat <<'EOF' | python
import re, requests
pattern = r'.*href="/debug/collector.+?>([^<]+)</a>'
rsp = requests.get("http://localhost:8080").text
for match in re.finditer(pattern, rsp):
    print(match.groups()[0])
EOF
- for example, after adding two httpd pods, this now outputs
example
example2
reproducer-targetallocator-75596bd594-mp4np
- the UI of the targetAllocator also shows the assigned job and target counts
Collector                                    Job Count  Target Count
example                                      1          6
example2                                     1          10
reproducer-targetallocator-75596bd594-mp4np  1          4
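The target counts above sum to 20 (6 + 10 + 4), so all static targets are spread across the three pods that were picked up as collectors. The following Python sketch shows how a generic consistent-hashing ring distributes targets over whatever names it is given as collectors. It is only an illustration of the technique: the hash function, the number of virtual nodes and the function names are assumptions, not the target allocator's actual implementation, so the resulting counts will not necessarily match the table.

```python
import bisect
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def build_ring(collectors: List[str], vnodes: int = 100) -> List[Tuple[int, str]]:
    """Place `vnodes` virtual nodes per collector on the ring, sorted by position."""
    return sorted((_hash(f"{c}#{i}"), c) for c in collectors for i in range(vnodes))


def assign(targets: List[str], collectors: List[str],
           vnodes: int = 100) -> Dict[str, str]:
    """Assign each target to the first virtual node clockwise from its hash."""
    ring = build_ring(collectors, vnodes)
    points = [position for position, _ in ring]
    assignment = {}
    for target in targets:
        # Wrap around to the start of the ring when the hash is past the last node.
        index = bisect.bisect_right(points, _hash(target)) % len(ring)
        assignment[target] = ring[index][1]
    return assignment


# The 20 static targets and the three pod names from this report.
TARGETS = [f"host{i:02d}:9090" for i in range(1, 21)]
COLLECTORS = ["example", "example2", "reproducer-targetallocator-75596bd594-mp4np"]

assignment = assign(TARGETS, COLLECTORS)
print(Counter(assignment.values()))
```

Every name passed in as a collector receives a share of the targets, which is why targets handed to pods that are not real collectors are never scraped.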
Expected Result
Actual Result
Kubernetes Version
1.32
Operator version
1.48.0
Collector version
1.48.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
Log output
Additional context
OpenShift, originally:
https://redhat.atlassian.net/browse/TRACING-5977