fix: log sanitized error details when cluster secret unmarshal fails (#27148)#27159
Conversation
When SecretToCluster fails, the error was logged without any diagnostic details, making it difficult for operators to debug why a cluster secret could not be parsed. A prior attempt to log the raw error was rejected because json.Unmarshal errors can contain fragments of the data being parsed, which in this case is sensitive cluster configuration. This change introduces sanitizeUnmarshalError which extracts only safe diagnostic information from the error: byte offset for syntax errors, and field name for type mismatch errors. This gives operators enough context to diagnose the issue without risk of leaking secret data into application logs. All five call sites that previously discarded the error details now include the sanitized message. Closes argoproj#27148 Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
❌ Preview Environment undeployed from BunnyshellAvailable commands (reply to this comment):
|
There was a problem hiding this comment.
Pull request overview
Adds safe diagnostic logging for cluster secret JSON unmarshal failures so operators can troubleshoot parsing issues without risking leakage of sensitive secret contents.
Changes:
- Introduces
sanitizeUnmarshalError(err)to extract safe details fromencoding/jsonunmarshal errors (syntax offset / type error field / generic fallback). - Updates cluster secret unmarshal error logs in
ListClusters,WatchClusters, andUpdateClusterto include sanitized details. - Adds unit tests for
sanitizeUnmarshalErrorand a test ensuringListClustersskips invalid cluster secrets while returning valid ones.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| util/db/cluster.go | Adds sanitization helper and uses it in cluster secret unmarshal error logs. |
| util/db/cluster_test.go | Adds tests for sanitization behavior and for skipping invalid cluster secrets in ListClusters. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| t.Run("json type error", func(t *testing.T) { | ||
| err := &json.UnmarshalTypeError{Field: "tlsClientConfig.insecure"} | ||
| msg := sanitizeUnmarshalError(err) | ||
| assert.Equal(t, `cannot unmarshal into field "tlsClientConfig.insecure"`, msg) | ||
| }) |
There was a problem hiding this comment.
The expected string in this assertion is a raw string literal (backticks), so the " sequences are not unescaped. This currently asserts for backslashes that won’t be present in sanitizeUnmarshalError output (which uses %q and produces plain double quotes). Use a normal string literal with escaped quotes, or keep the raw string but remove the backslashes so the expected value matches the actual message.
There was a problem hiding this comment.
Good catch on the string literal concern. Switched from a raw string literal to a regular double-quoted string with escaped quotes so the intent is unambiguous and consistent with the other test cases in this block.
Changed the test expectation from a raw string literal with escaped quotes to a regular string literal, so the assertion matches the actual output from sanitizeUnmarshalError which uses fmt.Sprintf %q. Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #27159 +/- ##
==========================================
- Coverage 63.36% 63.36% -0.01%
==========================================
Files 415 415
Lines 56551 56561 +10
==========================================
+ Hits 35836 35842 +6
- Misses 17336 17344 +8
+ Partials 3379 3375 -4 ☔ View full report in Codecov by Sentry. |
Checklist:
Summary
When
SecretToClusterfails, the error was logged without any diagnostic details, making it difficult for operators to debug why a cluster secret could not be parsed.A prior attempt (#27151) to fix this by logging the raw error was correctly rejected because
json.Unmarshalerrors can contain fragments of the data being parsed, which in this case is sensitive cluster configuration (TLS certs, tokens, credentials).This PR introduces
sanitizeUnmarshalError, which useserrors.Asto extract only safe diagnostic information:*json.SyntaxError: logs the byte offset (e.g.,json syntax error at byte offset 42)*json.UnmarshalTypeError: logs the field name (e.g.,cannot unmarshal into field "tlsClientConfig.insecure")invalid cluster secret data)This gives operators enough context to diagnose the issue (e.g., "the JSON is malformed at byte 42" or "the field type doesn't match") without any risk of leaking secret data into application logs.
All five call sites in
cluster.gothat previously discarded the error have been updated.Example log output
Before:
After:
Validation
sanitizeUnmarshalErrorcovering all three error classification paths (syntax error, type error, unknown) plus wrapped errors.ListClustersverifying that invalid secrets are skipped while valid ones are still returned.util/dbtest suite passes.Closes #27148