fix(detectors/sonarcloud): reduce false positives by requiring "sonarcloud" prefix #5020
…cloud" prefix

The SonarCloud detector used PrefixRegex with the keyword "sonar", which matched generic "sonar" references (e.g., "sonarsource", "sonarqube-scan-action") in dependabot PR diffs and changelogs. This caused false positive detections on GitHub commit SHAs that happened to appear within 40 characters of any "sonar" substring.

This commit narrows the prefix to "sonarcloud" so the detector only triggers when the literal service name appears near the candidate secret.

Fixes trufflesecurity#5000

Signed-off-by: Wahaj Ahmed <wahajahmed010@gmail.com>
Wahaj Ahmed seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 093b39f.
  // Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
- keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"sonar"}) + `(?:^|[^@])\b([0-9a-z]{40})\b`)
+ keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"sonarcloud"}) + `(?:^|[^@])\b([0-9a-z]{40})\b`)
Keywords() not updated to match narrowed PrefixRegex
Medium Severity
PrefixRegex was narrowed from "sonar" to "sonarcloud", but Keywords() still returns "sonar". Every other detector in the codebase (e.g., abstract, abuseipdb, abyssale) aligns these two values. This mismatch means the Aho-Corasick pre-filter still triggers this detector for any chunk containing "sonar" (including the sonarqube URLs the PR aims to ignore), only for the regex to never match — wasting processing. Additionally, the new test case for the sonarqube URL only passes the FindDetectorMatches pre-filter check because of this inconsistency; fixing Keywords() to "sonarcloud" later would break that test.
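The mismatch described in this comment can be illustrated with a small sketch. It makes two assumptions that are not taken from the codebase: the keyword pre-filter is modeled as a plain case-insensitive substring check (a simplified stand-in for the Aho-Corasick matcher), and the narrowed pattern is written out by hand on the assumption that PrefixRegex expands to a case-insensitive keyword followed by a lazy gap of up to 40 characters. The names `preFilter` and `wastedScan` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keyPat approximates the narrowed detector pattern. The prefix expansion
// (case-insensitive keyword plus a lazy gap of up to 40 characters) is an
// assumption about what detectors.PrefixRegex produces.
var keyPat = regexp.MustCompile(
	`(?i:sonarcloud)(?:.|[\n\r]){0,40}?(?:^|[^@])\b([0-9a-z]{40})\b`)

// preFilter is a simplified, hypothetical stand-in for the keyword
// pre-filter: it reports whether any keyword occurs in the chunk.
func preFilter(chunk string, keywords []string) bool {
	lower := strings.ToLower(chunk)
	for _, kw := range keywords {
		if strings.Contains(lower, strings.ToLower(kw)) {
			return true
		}
	}
	return false
}

// wastedScan reports whether a chunk is handed to the detector by the
// pre-filter even though the detector's regex does not match it.
func wastedScan(chunk string, keywords []string) bool {
	return preFilter(chunk, keywords) && !keyPat.MatchString(chunk)
}

func main() {
	chunk := "https://github.com/sonarsource/sonarqube-scan-action/commit/" +
		"e050aa9e699112ca0664dd2a5c694ddab05dc555"

	// Keywords() as it stands after this PR, then aligned with the regex.
	fmt.Println("keywords=[sonar]:     ", wastedScan(chunk, []string{"sonar"}))
	fmt.Println("keywords=[sonarcloud]:", wastedScan(chunk, []string{"sonarcloud"}))
}
```

With the keyword left at "sonar", the sonarqube URL still reaches the detector and the regex then rejects it; with the keyword aligned to "sonarcloud", the chunk is filtered out before the regex runs.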


Background
The SonarCloud detector uses PrefixRegex([]string{"sonar"}) to scan for 40-character hex tokens near the keyword "sonar". This causes false positives when processing GitHub dependabot PRs that reference sonarsource/sonarqube-scan-action commit URLs - the commit SHAs (e.g., e050aa9e699112ca0664dd2a5c694ddab05dc555) happen to be 40-char hex strings within 40 characters of "sonar".
Change
Narrowed the prefix keyword from "sonar" to "sonarcloud" so the regex only triggers on explicit references to the SonarCloud service name, not generic "sonar" substrings in repository names, project names, or URLs.
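To make the effect of the change concrete, here is a minimal sketch comparing the old and new patterns on three inputs. `prefixRegex` is a local, hypothetical stand-in for detectors.PrefixRegex, written on the assumption that the helper emits a case-insensitive keyword followed by a lazy gap of up to 40 characters; it is not the library's implementation. The `SONARCLOUD_TOKEN` and `sonar_token` values are made-up placeholders, not real tokens.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// prefixRegex is a hypothetical stand-in for detectors.PrefixRegex. It
// assumes a case-insensitive keyword alternation followed by a lazy gap
// of up to 40 characters.
func prefixRegex(keywords []string) string {
	return `(?i:` + strings.Join(keywords, "|") + `)(?:.|[\n\r]){0,40}?`
}

// tokenPat is the unchanged tail of the detector pattern shown in the diff.
const tokenPat = `(?:^|[^@])\b([0-9a-z]{40})\b`

var (
	oldPat = regexp.MustCompile(prefixRegex([]string{"sonar"}) + tokenPat)
	newPat = regexp.MustCompile(prefixRegex([]string{"sonarcloud"}) + tokenPat)
)

// candidate returns the captured 40-character candidate, or "" if the
// pattern does not match.
func candidate(pat *regexp.Regexp, s string) string {
	if m := pat.FindStringSubmatch(s); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	inputs := []string{
		// Dependabot-style commit URL: the false positive from the issue.
		"https://github.com/sonarsource/sonarqube-scan-action/commit/e050aa9e699112ca0664dd2a5c694ddab05dc555",
		// Placeholder token labeled with the explicit service name.
		"SONARCLOUD_TOKEN=0123456789abcdef0123456789abcdef01234567",
		// Placeholder token labeled only with generic "sonar" wording.
		"sonar_token=0123456789abcdef0123456789abcdef01234567",
	}
	for _, in := range inputs {
		fmt.Printf("old=%t new=%t %s\n",
			candidate(oldPat, in) != "", candidate(newPat, in) != "", in)
	}
}
```

Under these assumptions the old pattern flags all three inputs, while the new pattern flags only the `SONARCLOUD_TOKEN` line. The third input shows the trade-off: a token labeled only with generic "sonar" wording is no longer picked up.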
Test
Added a test case verifying that sonarsource/sonarqube-scan-action commit URLs do not produce false positive results.
AI assistance disclosure
This contribution was written with the assistance of an AI agent to help identify the root cause, implement the fix, and produce this PR description. All code changes have been reviewed for correctness.
Closes #5000
Note
Low Risk
Small detector-regex tweak with a targeted test; may miss tokens only labeled with generic “sonar” wording, but SonarCloud-specific contexts remain covered.
Overview
The SonarCloud detector’s token regex now requires the sonarcloud prefix (via PrefixRegex) instead of sonar, so 40-character hex strings are only considered when they appear near an explicit SonarCloud reference, not generic “sonar” text in URLs or repo names (e.g. dependabot sonarqube-scan-action commit SHAs).
A pattern test was added to assert that a sonarsource/sonarqube-scan-action GitHub commit URL does not yield a secret match.