add agent certificates to binding cache for syslog-drain validation me… #1301

Draft
corporatemax wants to merge 1 commit into cloudfoundry:develop from corporatemax:enable-syslog-drain-error-messages

Conversation

corporatemax commented Jan 20, 2026

Please take a moment to review the questions before submitting the PR

🚫 We only accept PRs to develop branch. If this is an exception, please specify why 🚫

WHAT is this change about?

This PR adds the loggregator agent TLS certificates (CA, cert, key) to the syslog-binding-cache job. These are required by the forwarder-agent to emit syslog drain configuration error messages as app logs via the AppLogEmitter, as introduced in loggregator-agent-release#633. Without this change, the forwarder-agent in the binding-cache cannot establish the TLS connection needed to surface drain errors to application developers.

What customer problem is being addressed? Use customer persona to define the problem e.g. Alana is unable to...

Understanding why this change is being made is fantastically helpful. Please do tell...

Please provide any contextual information.

cloudfoundry/loggregator-agent-release#579
cloudfoundry/loggregator-agent-release#633

Has a cf-deployment including this change passed cf-acceptance-tests?

  • YES
  • NO

Does this PR introduce a breaking change? Please take a moment to read through the examples before answering the question.

  • YES - please choose the category from below. Feel free to provide additional details.
  • NO

Types of breaking changes:

  1. causes app or operator downtime
  2. increases VM footprint of cf-deployment - e.g. new jobs, new add ons, increases # of instances etc.
  3. modifies, deletes or moves the name of a job or instance group in the main manifest
  4. modifies the name or deletes a property of a job or instance group in the main manifest
  5. changes the name of credentials in the main manifest
  6. requires out-of-band manual intervention on the part of the operator
  7. modifies the ops-file path, changes the type, changes the values or removes ops-files from the following folders
    • ./operations/ or ./operations/experimental
    • ./addons
    • ./backup-and-restore/

If you're promoting an experimental Ops-file (or removing one), please follow the Ops-file workflows.

Ops-file changes in the following folders are considered NON-BREAKING CHANGES:
./community, ./example-vars-files, ./test

How should this change be described in cf-deployment release notes?

Something brief that conveys the change and is written with the persona (Alana, Cody...) in mind. See previous release notes for examples.

Syslog drain configuration errors are now surfaced directly to application developers as app logs, helping diagnose issues when logs are not arriving at the configured syslog drain target.

Does this PR introduce a new BOSH release into the base cf-deployment.yml manifest or any ops-files?

  • YES - please specify
  • NO

Does this PR make a change to an experimental or GA'd feature/component?

  • experimental feature/component
  • GA'd feature/component

Please provide Acceptance Criteria for this change?

AC 1: Unreachable drain target shows error in app logs
Given I have an application deployed to Cloud Foundry
And I have created a user-provided syslog drain service bound to my application with an unreachable host (e.g. syslog://unreachable-host:9999)
When I run cf logs
Then I see an error message in the log stream indicating the drain target could not be reached
And the error message includes the drain destination

AC 2: Invalid drain URL shows error in app logs
Given I have an application deployed to Cloud Foundry
And I have created a user-provided syslog drain service bound to my application with an invalid URL (e.g. malformed scheme or missing port)
When I run cf logs
Then I see an error message in the log stream indicating the drain binding is misconfigured
And the error message identifies the nature of the misconfiguration
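To make "nature of the misconfiguration" concrete, here is a minimal Python sketch of the kind of check AC 2 describes. It is not the forwarder-agent's actual validation logic. The function name, the messages, and the rule that only `syslog` and `syslog-tls` URLs need an explicit port are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Schemes accepted for drains in this sketch (an assumption, not read from the agent).
ALLOWED_SCHEMES = {"syslog", "syslog-tls", "https"}


def describe_drain_url_problem(url):
    """Return a short description of what is wrong with a drain URL, or None."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return "invalid host or port"
    if parts.scheme not in ALLOWED_SCHEMES:
        return f"unsupported scheme {parts.scheme!r}"
    if not parts.hostname:
        return "missing host"
    # Assumption: https drains may rely on the default port, syslog drains may not.
    if port is None and parts.scheme != "https":
        return "missing port"
    return None
```

With these rules, `syslog://unreachable-host:9999` passes the check (it is well formed, merely unreachable), while a URL such as `syslog://host` is reported as missing a port.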

AC 3: TLS connection failure shows error in app logs
Given I have an application deployed to Cloud Foundry
And I have created a syslog-tls drain service bound to my application where the target has an invalid or expired certificate
When I run cf logs
Then I see an error message in the log stream indicating a TLS handshake failure
And the error message references the drain endpoint

AC 4: Connection timeout shows error in app logs
Given I have an application deployed to Cloud Foundry
And I have created a syslog drain service bound to my application where the target accepts connections but does not respond within the write timeout
When the forwarder-agent attempts to write logs to the drain
Then I see an error message in cf logs indicating a write timeout
And the error is attributed to the correct drain binding
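AC 1, AC 3 and AC 4 each expect a different error category, always with the drain destination in the message. The Python sketch below shows one way such a mapping could look. The function name and message wording are hypothetical. The real agent is a separate codebase and may classify failures differently.

```python
import socket
import ssl


def describe_drain_failure(exc, destination):
    """Map a connection exception to a log message that names the drain."""
    # Order matters: ssl.SSLError, socket.timeout and socket.gaierror
    # are all OSError subclasses, so the generic case comes last.
    if isinstance(exc, ssl.SSLError):
        reason = "TLS handshake failure"
    elif isinstance(exc, socket.timeout):
        reason = "write timeout"
    elif isinstance(exc, (socket.gaierror, ConnectionRefusedError)):
        reason = "drain target could not be reached"
    elif isinstance(exc, OSError):
        reason = "connection error"
    else:
        raise exc  # not a network failure; do not mask it
    return f"syslog drain error for {destination}: {reason}"
```

Putting the destination in every message is what lets a developer with several drains tell which binding is failing.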

AC 5: Healthy drain does not emit error messages
Given I have an application deployed to Cloud Foundry
And I have created a correctly configured syslog drain service bound to my application with a reachable and healthy target
When I run cf logs
Then I do not see any syslog drain error messages in the application log stream
And logs are successfully delivered to the drain target

AC 6: Error messages are scoped to the bound application
Given I have two applications: app-A with a misconfigured drain and app-B with a healthy drain
When I run cf logs app-A
Then I see drain error messages for app-A only
When I run cf logs app-B
Then I do not see any drain error messages
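The scoping in AC 5 and AC 6 can be expressed as a small check: errors are grouped by the app that owns the failing binding, and healthy bindings contribute nothing. The Python sketch below assumes a hypothetical `(app_name, drain_url, error)` tuple shape. It does not reflect how the binding cache or forwarder-agent actually represent bindings.

```python
from collections import defaultdict


def errors_by_app(bindings):
    """Group drain error messages by the app that owns each failing binding.

    `bindings` is an iterable of (app_name, drain_url, error) tuples, where
    `error` is None for a healthy drain. The tuple shape is hypothetical.
    """
    grouped = defaultdict(list)
    for app_name, drain_url, error in bindings:
        if error is not None:
            grouped[app_name].append(f"{drain_url}: {error}")
    return dict(grouped)


bindings = [
    ("app-A", "syslog://unreachable-host:9999", "drain target could not be reached"),
    ("app-B", "syslog://healthy-host:514", None),
]
scoped = errors_by_app(bindings)
```

Here `scoped` has an entry for `app-A` only, which mirrors the expectation that `cf logs app-B` shows no drain errors.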

What is the level of urgency for publishing this change?

  • Urgent - unblocks current or future work
  • Slightly Less than Urgent

Tag your pair, your PM, and/or team!

@chombium
@jorbaum
