OpenTelemetry Operator TargetAllocator CR does not support collectorSelector #5016

@frzifus

Component(s)

target allocator

What happened?

Description

The current implementation of the OpenTelemetry Operator and its CRD does not support a collectorSelector for the TargetAllocator CR.

A selector is mandatory: without one, any workload in the cluster with a certain label will be assigned as a collector, depending on the RBAC permissions of the serviceAccount.

The OTELCOL_NAMESPACE environment variable can limit the selected collectors, but in the example below I still end up with non-OTC pods being assigned targets, so those scrape targets are missing.
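
To make the request concrete, here is a minimal sketch of the filtering a collectorSelector would presumably perform, assuming it behaves like a Kubernetes `matchLabels` selector. The pod name `otel-collector-0` and all label keys and values below are hypothetical and chosen only for illustration; this is not the operator's code.

```python
def matches(selector, labels):
    """Return True if every key/value pair in selector is present in labels.

    This mirrors matchLabels semantics: an empty selector matches every pod.
    """
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods and labels, for illustration only.
pods = {
    "example": {"app": "httpd"},
    "reproducer-targetallocator-75596bd594-mp4np": {"component": "targetallocator"},
    "otel-collector-0": {"component": "collector"},
}

# Hypothetical selector that should match collector pods only.
collector_selector = {"component": "collector"}

selected = [name for name, labels in pods.items() if matches(collector_selector, labels)]
print(selected)
```

With an empty selector every pod in the dictionary would match, which is roughly the situation this issue describes.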

Steps to Reproduce

  • create a namespace
oc create namespace reproducer
  • create a TargetAllocator
cat <<'EOF' | oc -n reproducer create -f-
apiVersion: opentelemetry.io/v1alpha1
kind: TargetAllocator
metadata:
  name: reproducer
spec:
  allocationStrategy: consistent-hashing
  collectorNotReadyGracePeriod: 30s
  filterStrategy: relabel-config
  ipFamilyPolicy: SingleStack
  managementState: managed
  observability:
    metrics:
      enableMetrics: false
  prometheusCR:
    enabled: false
    scrapeInterval: 30s
  replicas: 1
  scrapeConfigs:
    - fallback_scrape_protocol: PrometheusText1.0.0
      job_name: reproducer
      scheme: http
      static_configs:
        - targets:
            - host01:9090
            - host02:9090
            - host03:9090
            - host04:9090
            - host05:9090
            - host06:9090
            - host07:9090
            - host08:9090
            - host09:9090
            - host10:9090
            - host11:9090
            - host12:9090
            - host13:9090
            - host14:9090
            - host15:9090
            - host16:9090
            - host17:9090
            - host18:9090
            - host19:9090
            - host20:9090
EOF
  • use port-forwarding to see which collectors have been assigned 
oc -n reproducer port-forward service/reproducer-targetallocator 8080:80 & 
cat <<'EOF' | python 
import re, requests
pattern = r'.*href="/debug/collector.+?>([^<]+)</a>'
rsp = requests.get("http://localhost:8080").text
for match in re.finditer(pattern, rsp):
    print(match.groups()[0])

EOF
  • this outputs the targetAllocator pod itself and/or multiple other pods in that namespace

  • add a simple httpd pod to verify that the targetAllocator selects it for target dispatching

cat <<'EOF' | oc -n reproducer create -f-
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: httpd
  namespace: reproducer
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: httpd
      image: 'image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest'
      ports:
        - containerPort: 8080
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
EOF
  • re-check the targetAllocator collector list (skip the port-forward line if the one from the earlier step is still running)
oc -n reproducer port-forward service/reproducer-targetallocator 8080:80 & 
cat <<'EOF' | python 
import re, requests
pattern = r'.*href="/debug/collector.+?>([^<]+)</a>'
rsp = requests.get("http://localhost:8080").text
for match in re.finditer(pattern, rsp):
    print(match.groups()[0])

EOF
  • for example, after adding two httpd pods this now outputs
example
example2
reproducer-targetallocator-75596bd594-mp4np
  • the targetAllocator UI also shows the assigned job and target counts
Collector                                    Job Count  Target Count
example                                      1          6
example2                                     1          10
reproducer-targetallocator-75596bd594-mp4np  1          4
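
The counts above can be tallied to show how many of the 20 static targets end up on pods that will never scrape them. This is a small sketch using only the numbers from the table; the `real_collectors` set and the function name are hypothetical, and the set is empty here because the reproducer namespace contains no OpenTelemetry Collector pods.

```python
# Target counts copied from the targetAllocator UI output above.
assignments = {
    "example": 6,
    "example2": 10,
    "reproducer-targetallocator-75596bd594-mp4np": 4,
}

# Hypothetical: names of pods that really are OpenTelemetry Collectors.
# None exist in the reproducer namespace, so the set is empty.
real_collectors = set()


def misassigned_targets(assignments, real_collectors):
    """Return (targets assigned to non-collector pods, total targets)."""
    total = sum(assignments.values())
    lost = sum(count for pod, count in assignments.items() if pod not in real_collectors)
    return lost, total


lost, total = misassigned_targets(assignments, real_collectors)
print(f"{lost} of {total} targets are assigned to pods that are not collectors")
# prints "20 of 20 targets are assigned to pods that are not collectors"
```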

Expected Result

Actual Result

Kubernetes Version

1.32

Operator version

1.48.0

Collector version

1.48.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

Log output

Additional context

OpenShift, originally:
https://redhat.atlassian.net/browse/TRACING-5977
