OBSDA-1383: Make Splunk output sourcetype configurable in CLF #3251

Open

marpears wants to merge 8 commits into openshift:master from marpears:splunk-sourcetype

Conversation

@marpears

@marpears marpears commented Apr 13, 2026

Description

This PR makes the Splunk output source type configurable in the CLF via a new sourceType field, so that we can support users who have defined custom source types in Splunk.

The sourceType field can be used only when payloadKey is defined, and accepts a templated value so that the source type can be retrieved from a pod label. If sourceType is not defined, the current logic is preserved and the source type defaults to _json. If payloadKey is used without sourceType, the source type will be either _json or generic_single_line, depending on the structure of the final event payload.

CLF configuration example:

  outputs:
    - name: splunk-app-team-1
      splunk:
        authentication:
          token:
            key: hecToken
            secretName: splunk-app-team-1
        sourceType: '{.kubernetes.labels.splunk/sourcetype||"generic_single_line"}'
        url: 'https://splunk.customer.com:8088'
      type: splunk
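
As an illustration of how the templated value above might be realized, a Vector remap step can resolve the pod label with a static fallback, and the Splunk HEC sink can reference the result via a template. This is a hedged sketch, not the PR's actual generated config; the component names, token handling, and exact VRL are assumptions:

```toml
# Sketch only: names and VRL are illustrative, not the operator's real output.
[transforms.splunk_app_team_1_meta]
type = "remap"
inputs = ["pipeline"]
source = '''
# "??" yields the right-hand value when the left side errors or is null,
# so a missing pod label falls back to the static default.
.sourcetype = to_string(.kubernetes.labels."splunk/sourcetype") ?? "generic_single_line"
'''

[sinks.splunk_app_team_1]
type = "splunk_hec_logs"
inputs = ["splunk_app_team_1_meta"]
endpoint = "https://splunk.customer.com:8088"
default_token = "${HEC_TOKEN}"
# Sink-level template syntax reads the field set by the remap above.
sourcetype = "{{ sourcetype }}"
```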

/cc @Clee2691 @cahartma
/assign @jcantrill


@jcantrill
Contributor

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 13, 2026
Contributor

@jcantrill jcantrill left a comment


There is no functional test to support this addition. Also note that we do not backport features to earlier releases.

Comment thread api/observability/v1/output_types.go Outdated
Contributor

@jcantrill jcantrill left a comment


Please add e2e validation tests and at least one functional test

Comment thread api/observability/v1/output_types.go
Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc Outdated
Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc
Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc
Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc Outdated
@marpears marpears changed the title Make Splunk output sourcetype configurable in CLF WIP - Make Splunk output sourcetype configurable in CLF Apr 17, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 17, 2026
@jcantrill
Contributor

/ok-to-test

@openshift-ci openshift-ci Bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Apr 20, 2026
@jcantrill jcantrill changed the title WIP - Make Splunk output sourcetype configurable in CLF OBSDA-1383: Make Splunk output sourcetype configurable in CLF Apr 20, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 20, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 20, 2026
@openshift-ci-robot

openshift-ci-robot commented Apr 20, 2026

@marpears: This pull request references OBSDA-1383 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the feature to target the "4.8.0" version, but no target version was set.

Details

In response to this:

Description

This PR allows for configuration of the Splunk output source type in the CLF using a new sourceType field. This is so that we can support users who have defined custom source types in Splunk.

If sourceType is not defined, the current behavior is preserved: _json is the default, and when payloadKey is set it can be overridden based on the type of the final event payload.

CLF configuration example:

 outputs:
   - name: splunk-app-team-1
     splunk:
       authentication:
         token:
           key: hecToken
           secretName: splunk-app-team-1
       sourceType: 'my:custom:sourcetype'
       url: 'https://splunk.customer.com:8088'
     type: splunk

/cc @Clee2691 @cahartma
/assign @jcantrill

/cherrypick release-6.4
/cherrypick release-6.5


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 27, 2026
@marpears
Author

/retest

Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc Outdated
Comment thread internal/generator/vector/output/splunk/splunk.go
Comment thread test/e2e/collection/apivalidations/api_validations_test.go
@jcantrill
Contributor

@marpears You will need to rebase and force push.

@marpears marpears force-pushed the splunk-sourcetype branch 2 times, most recently from 1fc8d94 to b04ac06 on April 29, 2026 at 11:44
@vparfonov
Contributor

/retest

@jcantrill
Contributor

/label tide/squash-merge-method

@openshift-ci
Contributor

openshift-ci Bot commented Apr 29, 2026

@jcantrill: The label(s) /label tide/squash-merge-method cannot be applied. These labels are supported: acknowledge-critical-fixes-only, platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, ux-approved, no-qe, rebase/manual, cluster-config-api-changed, run-integration-tests, verified, ready-for-human-review, approved, backport-risk-assessed, bugzilla/valid-bug, cherry-pick-approved, jira/skip-dependent-bug-check, jira/valid-bug, ok-to-test, stability-fix-approved, staff-eng-approved. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

Details

In response to this:

/label tide/squash-merge-method

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jcantrill
Contributor

/label tide/merge-method-squash

@openshift-ci openshift-ci Bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Apr 29, 2026
Comment thread bundle/manifests/cluster-logging.clusterserviceversion.yaml Outdated
@marpears marpears force-pushed the splunk-sourcetype branch from b04ac06 to c55ba44 on April 30, 2026 at 08:29
@jcantrill
Contributor

/approve

@openshift-ci
Contributor

openshift-ci Bot commented Apr 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill, marpears

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2026
@marpears marpears force-pushed the splunk-sourcetype branch from c55ba44 to 48c7394 on May 5, 2026 at 13:51
@coderabbitai

coderabbitai Bot commented May 5, 2026

Warning

Rate limit exceeded

@marpears has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 7 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f3bd07bb-6fd6-430b-9428-bd03a7d00d8e

📥 Commits

Reviewing files that changed from the base of the PR and between 48c7394 and 4e7fcdf.

📒 Files selected for processing (6)
  • api/observability/v1/output_types.go
  • bundle/manifests/cluster-logging.clusterserviceversion.yaml
  • bundle/manifests/observability.openshift.io_clusterlogforwarders.yaml
  • config/crd/bases/observability.openshift.io_clusterlogforwarders.yaml
  • config/manifests/bases/cluster-logging.clusterserviceversion.yaml
  • docs/features/logforwarding/outputs/splunk-forwarding.adoc

Walkthrough

This PR adds support for an optional sourceType field in Splunk log forwarding outputs. The feature allows users to specify or derive the Splunk source type, with a cross-field validation rule requiring payloadKey to be set whenever sourceType is used. Vector generator logic conditionally applies different remap templates based on whether sourceType is static or templated.
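
The cross-field rule described above can be sketched as a kubebuilder CEL validation on the output struct. A minimal sketch, assuming field names and message wording; the PR's exact marker text may differ:

```go
package v1

// Splunk shows only the fields relevant to the sketched rule.
// The CEL expression rejects specs that set sourceType without payloadKey.
// +kubebuilder:validation:XValidation:rule="!has(self.sourceType) || has(self.payloadKey)",message="sourceType can only be set when payloadKey is defined"
type Splunk struct {
	// PayloadKey nominates a single field as the HEC event payload.
	PayloadKey string `json:"payloadKey,omitempty"`
	// SourceType sets the Splunk sourcetype, either statically or via a template.
	SourceType string `json:"sourceType,omitempty"`
}
```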

Changes

Splunk SourceType Feature

- Data Shape & Validation (api/observability/v1/output_types.go, config/crd/bases/observability.openshift.io_clusterlogforwarders.yaml, bundle/manifests/observability.openshift.io_clusterlogforwarders.yaml): the Splunk struct adds an optional SourceType field with a +kubebuilder:validation:XValidation rule enforcing that sourceType requires payloadKey; CRD schemas are updated with matching field definitions and validation rules.
- Manifest Descriptors (config/manifests/bases/cluster-logging.clusterserviceversion.yaml, bundle/manifests/cluster-logging.clusterserviceversion.yaml): CRD spec descriptors added for outputs[].splunk.sourceType with display metadata, constraints, and documentation describing template/static usage patterns and default behavior.
- Vector Generator Logic (internal/generator/vector/output/splunk/splunk.go): adds a payloadKeysourceTypeTmpl VRL template; New() conditionally selects between payloadKeysourceTypeTmpl (when SourceType is set) and payloadKeyTmpl (when unset), and defaults sourcetype to "_json" when PayloadKey is absent.
- Vector Transform Templates (internal/generator/vector/output/splunk/splunk_sink_payloadkey.toml, internal/generator/vector/output/splunk/splunk_sink_with_payloadkey_and_sourcetype.toml, internal/generator/vector/output/splunk/splunk_sink_with_payloadkey_and_static_sourcetype.toml): removes the hardcoded sourcetype default from the payloadkey template; adds two new templates (splunk_sink_with_payloadkey_and_*) defining remap transforms for timestamp parsing, metadata derivation (source/sourcetype from labels/types), and event restructuring around payloadKey.
- Documentation & Examples (docs/features/logforwarding/outputs/splunk-forwarding.adoc): expands examples, the property list, and the default settings table to cover sourceType configuration, interaction with payloadKey, dynamic label derivation with fallbacks, and source type nomination rules based on payload type.
- E2E & Functional Tests (internal/generator/vector/output/splunk/splunk_test.go, test/e2e/collection/apivalidations/api_validations_test.go, test/e2e/collection/apivalidations/splunk-*.yaml, test/functional/outputs/splunk/forward_to_splunk_metadata_test.go): adds test entries for payloadKey with static/dynamic sourceType; creates five new e2e validation fixtures (payloadKey alone, payloadKey+static sourceType, payloadKey+templated sourceType, sourceType alone, templated sourceType alone); updates the API validation test to verify pass/fail behavior; adds a functional test validating the sourcetype field in Splunk events.

Sequence Diagram

sequenceDiagram
    participant User as User/Admin
    participant APIServer as Kubernetes API Server
    participant Validator as Field Validator
    participant Generator as Vector Generator
    participant Templates as Transform Templates
    participant Vector as Vector Agent
    participant Splunk as Splunk HEC

    User->>APIServer: Create ClusterLogForwarder with sourceType + payloadKey
    APIServer->>Validator: Validate XValidation rule
    Validator-->>APIServer: ✓ sourceType allowed (payloadKey exists)
    APIServer-->>User: Resource created

    Generator->>Generator: Check PayloadKey & SourceType config
    alt Has PayloadKey + SourceType
        Generator->>Templates: Select payloadKeysourceTypeTmpl
    else Has PayloadKey only
        Generator->>Templates: Select payloadKeyTmpl
    else No PayloadKey
        Generator->>Templates: Use default ("_json")
    end
    Templates->>Vector: Deploy transforms & sink config

    Vector->>Vector: Parse timestamp
    Vector->>Vector: Extract source from log_type/log_source
    Vector->>Vector: Derive sourcetype (label or config)
    Vector->>Vector: Restructure event around payloadKey
    Vector->>Splunk: Send JSON with templated source/sourcetype
    Splunk-->>Vector: ✓ Event indexed

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

- Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
- Microshift Test Compatibility ⚠️ Warning: New e2e tests use the observability.openshift.io/v1 ClusterLogForwarder API, which is unavailable on MicroShift, and lack a [Skipped:MicroShift] label or [apigroup:...] tag. Resolution: add an [apigroup:observability.openshift.io] tag to test names or a [Skipped:MicroShift] label to prevent these tests from running on MicroShift.
✅ Passed checks (10 passed)
- Title check ✅ Passed: The title clearly summarizes the main change (making the Splunk output sourcetype configurable in CLF), which aligns with the extensive changes across API definitions, generators, tests, and documentation files.
- Linked Issues check ✅ Passed: Check skipped because no linked issues were found for this pull request.
- Out of Scope Changes check ✅ Passed: Check skipped because no linked issues were found for this pull request.
- Stable And Deterministic Test Names ✅ Passed: All test names are stable and deterministic; no dynamic content found in Ginkgo test titles. Test names use static, descriptive strings that remain consistent across runs.
- Test Structure And Quality ✅ Passed: Test code meets quality requirements: single responsibility, proper setup/cleanup via BeforeEach/AfterEach, timeouts with Eventually, assertion messages included, and follows codebase patterns.
- Single Node Openshift (Sno) Test Compatibility ✅ Passed: New tests do not assume multi-node clusters; all tests validate API constraints or single-pod log forwarding behavior, fully compatible with Single Node OpenShift.
- Topology-Aware Scheduling Compatibility ✅ Passed: Adds a sourceType field to the Splunk config; no scheduling constraints, affinity rules, topology assumptions, or operator deployment changes introduced.
- Ote Binary Stdout Contract ✅ Passed: The PR does not violate the OTE Binary Stdout Contract; all test code is properly isolated within Ginkgo test blocks, with no process-level stdout writes detected.
- Ipv6 And Disconnected Network Test Compatibility ✅ Passed: New Ginkgo tests for Splunk sourceType are IPv6-compatible: no hardcoded IPv4 addresses, no external connectivity, and all in-cluster operations use pod loopback and service DNS.
- Description check ✅ Passed: The PR description comprehensively covers the feature intent, rationale, configuration example, and backward compatibility, with proper section structure and linked JIRA issues.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/observability/v1/output_types.go`:
- Around line 1359-1362: The kubebuilder validation pattern on the SourceType
field (the struct tag on SourceType string `json:"sourceType,omitempty"`) omits
':' in the static character class causing valid colon-separated sourcetypes to
be rejected; update the +kubebuilder:validation:Pattern annotation to include
':' inside the static class (e.g. add : to `[a-zA-Z0-9-_.\/]`) so values like
"my:custom:sourcetype" validate, and also correct the doc comment/example that
shows `"log4j"` (remove the stray quotes) so examples match the intended valid
values.

In `@bundle/manifests/observability.openshift.io_clusterlogforwarders.yaml`:
- Line 3912: The example value currently shows a quoted string `"log4j"` which
violates the declared regex expecting an unquoted literal; update the example
list item to use an unquoted entry (log4j) so it matches the validation pattern
and remove the surrounding quotes from the value shown in the manifest example
for the ClusterLogForwarder configuration.
- Line 3919: The sourceType validation pattern is too restrictive and disallows
colons (:) causing valid Splunk sourcetypes like "my:custom:sourcetype" to be
rejected; update the regex used for the sourceType field (the pattern shown) to
include ":" in the allowed character class (i.e., add ":" to the character class
that currently contains a-zA-Z0-9-_.\/) so static values with colons pass API
validation while preserving existing escaping/grouping logic.

In `@config/crd/bases/observability.openshift.io_clusterlogforwarders.yaml`:
- Around line 3910-3913: The example value for the sourceType field contains
quotes which violate the field's validation regex; update the example in the
ClusterLogForwarder CRD (the `sourceType` example lines shown as `2. "log4j"`)
to remove the quotes so it matches the pattern (e.g., change `2. "log4j"` to `2.
log4j`) and ensure any other static-literal examples in that `sourceType`
example block follow the same unquoted format.

In `@docs/features/logforwarding/outputs/splunk-forwarding.adoc`:
- Around line 57-60: Update the splunk-forwarding docs to clarify that
sourceType defaults to `_json` only when payloadKey is not set and otherwise
resolves to `generic_single_line` (adjust text near `sourceType` and the
conditional description around `payloadKey`), standardize the naming to use
`splunk/sourcetype` everywhere (replace `splunk_sourcetype` occurrences), and
fix the grammar in the `compression` line to read “available are: `none`,
`gzip`.” Ensure references to `payloadKey`, `sourceType`, and `compression` are
consistent across the other referenced sections (lines 76–78, 97–98, 245–249).

In `@test/e2e/collection/apivalidations/splunk-sourcetype.yaml`:
- Around line 22-23: The fixture in splunk-sourcetype.yaml sets sourceType
(sourceType: log4j) but omits the required cross-field payloadKey, causing CRD
validation to fail; update the manifest to include a payloadKey entry paired
with the existing sourceType (i.e., add a payloadKey field alongside sourceType)
so it satisfies the Splunk cross-field validation rule.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 03c4129b-d332-454e-b2ff-da63fd0e1a1f

📥 Commits

Reviewing files that changed from the base of the PR and between ff06c7a and 48c7394.

📒 Files selected for processing (18)
  • api/observability/v1/output_types.go
  • bundle/manifests/cluster-logging.clusterserviceversion.yaml
  • bundle/manifests/observability.openshift.io_clusterlogforwarders.yaml
  • config/crd/bases/observability.openshift.io_clusterlogforwarders.yaml
  • config/manifests/bases/cluster-logging.clusterserviceversion.yaml
  • docs/features/logforwarding/outputs/splunk-forwarding.adoc
  • internal/generator/vector/output/splunk/splunk.go
  • internal/generator/vector/output/splunk/splunk_sink_payloadkey.toml
  • internal/generator/vector/output/splunk/splunk_sink_with_payloadkey_and_sourcetype.toml
  • internal/generator/vector/output/splunk/splunk_sink_with_payloadkey_and_static_sourcetype.toml
  • internal/generator/vector/output/splunk/splunk_test.go
  • test/e2e/collection/apivalidations/api_validations_test.go
  • test/e2e/collection/apivalidations/splunk-payloadkey-and-sourcetype.yaml
  • test/e2e/collection/apivalidations/splunk-payloadkey-and-templated-sourcetype.yaml
  • test/e2e/collection/apivalidations/splunk-payloadkey.yaml
  • test/e2e/collection/apivalidations/splunk-sourcetype.yaml
  • test/e2e/collection/apivalidations/splunk-templated-sourcetype.yaml
  • test/functional/outputs/splunk/forward_to_splunk_metadata_test.go
💤 Files with no reviewable changes (1)
  • internal/generator/vector/output/splunk/splunk_sink_payloadkey.toml

Comment on lines +1359 to +1362
// +kubebuilder:validation:Optional
// +kubebuilder:validation:Pattern:=`^(([a-zA-Z0-9-_.\/])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$`
// +operator-sdk:csv:customresourcedefinitions:type=spec,displayName="SourceType",xDescriptors={"urn:alm:descriptor:com.tectonic.ui:text"}
SourceType string `json:"sourceType,omitempty"`

@coderabbitai coderabbitai Bot May 5, 2026


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

SourceType pattern validation rejects colon — blocks the primary Splunk naming convention.

The pattern's static character class [a-zA-Z0-9-_.\/] does not include : (colon). The old Splunk style uses underscore separators (e.g., access_combined) while the new style uses colon separators (e.g., ibm:ldap:audit), and the tradition is to use a single colon to denote the hierarchical levels from least specific to most specific — the software product is listed first, then the specific component of the product. Real-world examples like cisco:esa:textmail, zeek:conn:json, and the PR description's own example my:custom:sourcetype would all be rejected by the current pattern.

A user wanting a static colon-separated sourcetype has no clean workaround: my:custom:sourcetype fails validation, and the template syntax cannot encode a bare static string with colons either.

🐛 Proposed fix — add : to the static character class
-	// +kubebuilder:validation:Pattern:=`^(([a-zA-Z0-9-_.\/])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$`
+	// +kubebuilder:validation:Pattern:=`^(([a-zA-Z0-9-_.:\/])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$`

Also note that example 2 in the doc comment ("log4j" with literal double-quotes) would not match the pattern's static portion either. The intended value is simply log4j (no quotes), which does pass. The comment is misleading and should be corrected.
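
The rejection is easy to verify directly. A minimal sketch using Python's re module (for these inputs the behavior matches Go's RE2 engine, which Kubernetes uses to evaluate CRD patterns); the second pattern is the proposed fix with ':' added to the static class:

```python
import re

# Template portion of the pattern, shared by both variants (copied from the PR).
template_part = (r'(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+'
                 r'((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*')

# Current pattern: static class omits ':'.
current = re.compile(r'^(([a-zA-Z0-9-_.\/])*' + template_part + r')*$')
# Proposed fix: ':' added to the static class.
proposed = re.compile(r'^(([a-zA-Z0-9-_.:\/])*' + template_part + r')*$')

assert current.fullmatch('log4j')                     # plain static value passes
assert not current.fullmatch('my:custom:sourcetype')  # colon rejected today
assert proposed.fullmatch('my:custom:sourcetype')     # accepted with the fix
# Templated values pass under both patterns:
assert current.fullmatch('{.kubernetes.labels.splunk_sourcetype||"generic_single_line"}')
```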


✅ Addressed in commit 17bcba0

Contributor


@vparfonov @Clee2691 This pattern matcher is restricted by VRL. Do we have any idea whether VRL would allow us to add a colon for this type? Given our template utilities, it may complicate the other fields, and we may need to think about how we would accept this suggestion.



Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread bundle/manifests/observability.openshift.io_clusterlogforwarders.yaml Outdated
4. {.foo||.bar||"missing"}

5. foo.{.bar.baz||.qux.quux.corge||.grault||"nil"}-waldo.fred{.plugh||"none"}
pattern: ^(([a-zA-Z0-9-_.\/])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

sourceType regex is too restrictive for common Splunk sourcetypes

Line 3919 blocks static values containing :, so values like my:custom:sourcetype (the core use case) will be rejected by API validation.

Suggested fix
-                          pattern: ^(([a-zA-Z0-9-_.\/])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$
+                          pattern: ^(([a-zA-Z0-9\-_.\/:])*(\{(\.[a-zA-Z0-9_]+|\."[^"]+")+((\|\|)(\.[a-zA-Z0-9_]+|\.?"[^"]+")+)*\|\|"[^"]*"\})*)*$

Comment thread docs/features/logforwarding/outputs/splunk-forwarding.adoc Outdated
Comment thread test/e2e/collection/apivalidations/splunk-sourcetype.yaml
@marpears
Author

marpears commented May 6, 2026

/retest

@marpears
Author

marpears commented May 6, 2026

Hi @jcantrill, I think it would be worth extending this feature with a special option of sourceType: none which does not set sourcetype in the Splunk sink config in Vector. This is to handle a case where a user may wish for a source type to be enforced at the Splunk side.

According to the Vector docs, it does not mandate sourcetype. Splunk defaults the sourcetype to httpevent when it is not set, so using sourceType: none would come with a warning that the user must configure this suitably on the Splunk side.
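
If pursued, the option might look like this in the CLF (purely hypothetical; no such value is implemented in this PR):

```yaml
splunk:
  authentication:
    token:
      key: hecToken
      secretName: splunk-app-team-1
  # Hypothetical: suppress the sourcetype field in the generated Vector sink,
  # letting Splunk apply its server-side default (httpevent).
  sourceType: none
  url: 'https://splunk.customer.com:8088'
```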

Do you agree?

@openshift-ci
Contributor

openshift-ci Bot commented May 6, 2026

@marpears: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/functional-target 4e7fcdf link true /test functional-target

Full PR test history. Your PR dashboard.

Details


@jcantrill
Contributor

> Hi @jcantrill, I think it would be worth extending this feature with a special option of sourceType: none which does not set sourcetype in the Splunk sink config in Vector. This is to handle a case where a user may wish for a source type to be enforced at the Splunk side.

What does it mean to enforce it on the receiver side? What is the receiver behavior? Does it still accept the logs forwarded by the collector?

@marpears
Author

marpears commented May 7, 2026

> Hi @jcantrill, I think it would be worth extending this feature with a special option of sourceType: none which does not set sourcetype in the Splunk sink config in Vector. This is to handle a case where a user may wish for a source type to be enforced at the Splunk side.

> What does it mean to enforce it on the receiver side? What is the receiver behavior? Does it still accept the logs forwarded by the collector?

Hi @jcantrill, I've done a test which confirmed that when the collector doesn't define a source type, Splunk accepts the log event and uses a default source type of httpevent.

But on reflection, I think the complexities and use case of a special option to not set the source type in the collector need some further thought. I do not want to detract from the core objective of this PR, which was to allow for a user-defined source type in the collector, so I'll park this idea.

@jcantrill
Contributor

> Hi @jcantrill, I think it would be worth extending this feature with a special option of sourceType: none which does not set sourcetype in the Splunk sink config in Vector. This is to handle a case where a user may wish for a source type to be enforced at the Splunk side.

> But on reflection, I think the complexities and use case of a special option to not set the source type in the collector need some further thought. I do not want to detract from the core objective of this PR, which was to allow for a user-defined source type in the collector, so I'll park this idea.

We discussed this in a team meeting, and I also don't like the idea of a "magic" word. My suggested alternative was to test an empty string. What does that mean to Splunk? I am assuming that Vector will send an empty string if configured, which is different from configuring nothing and getting 'httpevent'.

@marpears
Author

marpears commented May 8, 2026

> Hi @jcantrill, I think it would be worth extending this feature with a special option of sourceType: none which does not set sourcetype in the Splunk sink config in Vector. This is to handle a case where a user may wish for a source type to be enforced at the Splunk side.

> But on reflection, I think the complexities and use case of a special option to not set the source type in the collector need some further thought. I do not want to detract from the core objective of this PR, which was to allow for a user-defined source type in the collector, so I'll park this idea.

> We discussed this in a team meeting, and I also don't like the idea of a "magic" word. My suggested alternative was to test an empty string. What does that mean to Splunk? I am assuming that Vector will send an empty string if configured, which is different from configuring nothing and getting 'httpevent'.

I've given that scenario a try: with the source type set to an empty string in the collector, Splunk used an empty string as the source type.

I did another test to confirm the source type needs to be completely omitted from the sink output config in the collector for Splunk to assign a source type. My test was done with a simple unstructured log message, and Splunk assigned the source type of httpevent. I may repeat that using a log4j style log message to see if Splunk's automatic source type matching assigns it to its pretrained log4j source type rather than using httpevent.


Labels

- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
- jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
- ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
- tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants