Merged
2,438 changes: 2,438 additions & 0 deletions .2ms.yml

Large diffs are not rendered by default.

2 changes: 0 additions & 2 deletions .github/workflows/pr-validation.yml
@@ -2,8 +2,6 @@ name: PR Validation

on:
pull_request:
branches:
- master
merge_group:

jobs:
2 changes: 0 additions & 2 deletions .github/workflows/security.yml
@@ -5,8 +5,6 @@ on:
branches:
- master
pull_request:
branches:
- master
merge_group:
schedule:
- cron: "0 0 * * *"
2 changes: 0 additions & 2 deletions .github/workflows/trivy-vulnerability-scan.yaml
@@ -3,8 +3,6 @@ on:
push:
workflow_dispatch:
pull_request:
branches:
- master
schedule:
- cron: '5 6 * * *' # Runs every day at 06:05 UTC

2 changes: 0 additions & 2 deletions .github/workflows/validate-readme.yml
@@ -2,8 +2,6 @@ name: Validate README

on:
pull_request:
branches:
- master
merge_group:

jobs:
3 changes: 3 additions & 0 deletions .golangci.yml
@@ -61,6 +61,9 @@ linters:
- opinionated
- performance
- style
settings:
hugeParam:
sizeThreshold: 124
gocyclo:
min-complexity: 15
govet:
86 changes: 76 additions & 10 deletions README.md
@@ -57,7 +57,7 @@ Scan recent Git history instead:
- Unified scanning for local directories, Git history, Slack, Discord, Confluence Cloud, and Paligo — each exposed as a dedicated subcommand.
- Hundreds of tuned detection rules curated by Checkmarx on top of gitleaks, enriched with CVSS-based scoring in every finding.
- Optional live secret validation (`--validate`) to confirm whether discovered credentials are still active.
- Flexible filtering and noise reduction: `--rule`, `--ignore-rule`, `--add-special-rule`, `--ignore-result`, `--regex`, `--allowed-values`, and `--max-target-megabytes`.
- Flexible filtering and noise reduction: `--rule`, `--ignore-rule`, `--add-special-rule`, `--ignore-result`, `--regex`, `--allowed-values`, `--max-target-megabytes`, `--max-findings`, `--max-rule-matches-per-fragment`, and `--max-secret-size`.
- Rich reporting for developers and pipelines with JSON, YAML, and SARIF outputs, multiple `--report-path` destinations, and CI-aware exit handling via `--ignore-on-exit`.
- Automation ready: configuration files, `2MS_*` environment variables, Docker images, and GitHub Actions templates.
- Extensible plugin architecture — contributions for new data sources are welcome.
@@ -244,15 +244,18 @@ Global flags work with every subcommand. Combine them with configuration files a

### Global Flags

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--config` | string | | Path to a YAML or JSON configuration file. |
| `--log-level` | string | `info` | Logging level: `trace`, `debug`, `info`, `warn`, `error`, `fatal`, or `none`. |
| `--stdout-format` | string | `yaml` | `yaml`, `json`, or `sarif` output on stdout. |
| `--report-path` | string slice | | Write findings to one or more files; format is inferred from the extension. |
| `--ignore-on-exit` | enum | `none` | Control exit codes: `all`, `results`, `errors`, or `none`. |
| `--max-target-megabytes` | int | `0` | Skip files larger than the threshold (0 disables the check). |
| `--validate` | bool | `false` | Enrich results by verifying secrets when supported. |
| Flag | Type | Default | Description |
|-----------------------------------|--------------|---------|-----------------------------------------------------------------------------------------------------------------|
| `--config` | string | | Path to a YAML or JSON configuration file. |
| `--log-level` | string | `info` | Logging level: `trace`, `debug`, `info`, `warn`, `error`, `fatal`, or `none`. |
| `--stdout-format` | string | `yaml` | `yaml`, `json`, or `sarif` output on stdout. |
| `--report-path` | string slice | | Write findings to one or more files; format is inferred from the extension. |
| `--ignore-on-exit` | enum | `none` | Control exit codes: `all`, `results`, `errors`, or `none`. |
| `--max-target-megabytes` | int | `0` | Skip files larger than the threshold (0 disables the check). |
| `--max-findings` | uint64 | `0` | Caps the total number of results. The scan stops early once the limit is reached. Omit or set to 0 to disable. |
| `--max-rule-matches-per-fragment` | uint64 | `0` | Caps the number of results per rule per fragment (e.g., file, chunked file, page). Omit or set to 0 to disable. |
| `--max-secret-size` | uint64 | `0` | Secrets larger than this size (in bytes) are ignored. Omit or set to 0 to disable this check. |
| `--validate` | bool | `false` | Enrich results by verifying secrets when supported. |
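
As an illustration of how the two per-fragment limits could interact, here is a minimal Go sketch. It is not the 2ms engine code: the `match` type, the `applyLimits` helper, and the order in which the checks run are all assumptions made for this example. Only the "0 disables the check" behavior comes from the flag descriptions above.

```go
package main

import "fmt"

// match is a hypothetical record of one rule hit inside a single fragment.
type match struct {
	RuleID string
	Secret string
}

// applyLimits drops secrets longer than maxSecretSize bytes and keeps at most
// maxPerRule matches for each rule. A limit of 0 disables that check.
// Assumption: the size check runs before the per-rule cap.
func applyLimits(matches []match, maxSecretSize, maxPerRule uint64) []match {
	kept := make([]match, 0, len(matches))
	perRule := make(map[string]uint64)
	for _, m := range matches {
		if maxSecretSize > 0 && uint64(len(m.Secret)) > maxSecretSize {
			continue // secret too large, ignored
		}
		if maxPerRule > 0 && perRule[m.RuleID] >= maxPerRule {
			continue // this rule already reached its cap for the fragment
		}
		perRule[m.RuleID]++
		kept = append(kept, m)
	}
	return kept
}

func main() {
	fragment := []match{
		{"rule-a", "short1"},
		{"rule-a", "short2"},
		{"rule-a", "short3"},
		{"rule-b", "a-much-longer-secret-value"},
	}
	// Keep secrets of at most 10 bytes and at most 2 matches per rule.
	for _, m := range applyLimits(fragment, 10, 2) {
		fmt.Println(m.RuleID, m.Secret)
	}
}
```
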

### Configuration Files & Environment Variables

@@ -327,6 +330,69 @@ jobs:

Use `--ignore-on-exit results` to keep pipelines green when only findings (not errors) are present, or leave it at the default `none` to fail on detected secrets.

## Custom Rules File

2ms supports user-defined rules, supplied in a custom rules file via the `--custom-rules-path` flag. The file must be YAML or JSON, and its extension (`.yaml`, `.yml`, or `.json`) determines how it is parsed.

Custom rules can be:

- **Overrides** - if a rule in the file has the same `ruleId` as a 2ms default rule, it replaces (overrides) that default rule in the scan.
  - Note: An overriding rule takes every field exactly as defined in the file. Include every field you want set; any field you omit will be nil/empty.

- **New rules** - if a rule's `ruleId` does not match any default rule, it is appended to the list of rules used in the scan.
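
The override-or-append behavior described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: `Rule` is a trimmed-down stand-in for `ruledefine.Rule`, `mergeRules` is a hypothetical helper, and the documented fallbacks for `ruleName` and `category` on overrides are deliberately not modeled.

```go
package main

import "fmt"

// Rule is a hypothetical, trimmed-down stand-in for ruledefine.Rule.
type Rule struct {
	RuleID   string
	Regex    string
	Keywords []string
}

// mergeRules returns the default rules with the custom rules applied:
// a custom rule whose RuleID matches a default replaces it wholesale,
// and any other custom rule is appended to the end.
func mergeRules(defaults, custom []*Rule) []*Rule {
	merged := make([]*Rule, len(defaults))
	copy(merged, defaults)

	// Index rules by ID so overrides are found in constant time.
	index := make(map[string]int, len(merged))
	for i, r := range merged {
		index[r.RuleID] = i
	}

	for _, c := range custom {
		if i, ok := index[c.RuleID]; ok {
			merged[i] = c // override: no field-by-field merge
			continue
		}
		index[c.RuleID] = len(merged)
		merged = append(merged, c)
	}
	return merged
}

func main() {
	defaults := []*Rule{
		{RuleID: "id-1", Regex: "a+", Keywords: []string{"alpha"}},
		{RuleID: "id-2", Regex: "b+", Keywords: []string{"beta"}},
	}
	custom := []*Rule{
		{RuleID: "id-2", Regex: "c+"}, // override; Keywords omitted, so it ends up empty
		{RuleID: "id-3", Regex: "d+"}, // new rule, appended
	}
	for _, r := range mergeRules(defaults, custom) {
		fmt.Printf("%s regex=%q keywords=%v\n", r.RuleID, r.Regex, r.Keywords)
	}
}
```

In this sketch the overriding `id-2` rule loses the default's keywords, which is why the note above asks you to restate every field you want to keep.
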

Custom rules work with the `--rule` and `--ignore-rule` flags: rules can be selected or ignored by `ruleId`, `ruleName`, or tag.

Whether it is an override or a new rule, a custom rule has the following required fields:
- `ruleId` - unique identifier of the rule
- `ruleName` - human-readable name of the rule
- `regex` - regex pattern used to identify the secret

Other fields are optional and are shown in the example below of a file with one custom rule.

**YAML Example:**
```yaml
- ruleId: 01ab7659-d25a-4a1c-9f98-dee9d0cf2e70 # REQUIRED: unique id, must match default rule id to override that default rule. Rule ids can be used as values in --rule and --ignore-rule flags
ruleName: Custom-Api-Key # should be human-readable name. If left empty for new rule, ruleName will take the value of ruleId. If left empty for override, default rule name will be considered. Rule names can be used as values in --rule and --ignore-rule flags
description: Custom rule
regex: (?i)\b\w*secret\w*\b\s*:?=\s*["']?([A-Za-z0-9/_+=-]{8,150})["']? # REQUIRED: golang regular expression used to find secrets. For regexes, if enclosed in "", make sure to escape backslashes (\\, \\b, etc.). If capture group is present in regex, it's used to find the secret, otherwise whole regex is used. Which group is considered the secret can be defined with secretGroup
keywords: # Keywords are used for pre-regex check filtering. Rules that contain keywords will perform a quick string compare check to make sure the keyword(s) are in the content being scanned.
- access
- api
entropy: 3.5 # minimum shannon entropy, which measures how random a string is. The more unique characters a string has, the higher the entropy. The value of entropy will tend to become log2(unique chars), so long as all unique are equally present in the string ('abcd' string has entropy of log2(4)=2, but so does 'aabbccdd'). To test entropy values, use https://textcompare.io/shannon-entropy-calculator. Default rules that use entropy have values between 2.0 and 4.5, though these minimums can sometimes be 1-2 lower than the entropy of a true positive. Leave entropy empty to consider matches regardless of entropy
secretGroup: 1 # defines which capture group of regex match is considered the secret. Is also used as the group that will have its entropy checked if `entropy` is set. Can be left empty, in which case the first capture group to match will be considered the secret
path: "(?i)\\.(?:tf|hcl)$" # regex to limit the rule to specific file paths, for example, only .tf and .hcl files. For regexes, if enclosed in "", make sure to escape backslashes (\\, \\b, etc.)
severity: High # severity, can only be one of [Critical, High, Medium, Low, Info]
tags: # identifiers for the rule, tags can be used as values of --rule and --ignore-rule flags
- api-key
category: General # category of the rule, should be a string of type ruledefine.RuleCategory. Can be omitted in custom rule, but if omitted and ruleId matches a default rule, the category will take the value of the category of that defaultRule. Impacts cvss score
scoreRuleType: 4 # can go from 1 to 4, 4 being most severe. If omitted in rule it will take the value of 1. Impacts cvss score
disableValidation: false # if true, disables validity check for this rule, regardless of --validate flag
deprecated: false # if true, the rule will not be used in the scan, regardless of --rule flag
allowLists: # allowed values to ignore if matched
- description: Allowlist for Custom Rule
matchCondition: OR # Can be AND or OR. determines whether all criteria in the allowList must match. Defaults to OR if not specified
regexTarget: match # Can be match or line. Determines whether the regexes in allowList are tested against the rule.Regex match or the full line being scanned. Defaults to "match" if not specified
regexes: # allowed regex patterns
- (?i)(?:access(?:ibility|or)|access[_.-]?id|random[_.-]?access|api[_.-]?(?:id|name|version)|rapid|capital|[a-z0-9-]*?api[a-z0-9-]*?:jar:|author|X-MS-Exchange-Organization-Auth|Authentication-Results|(?:credentials?[_.-]?id|withCredentials)|(?:25[0-5]|2[0-4]\d|1?\d?\d)(?:\.(?:25[0-5]|2[0-4]\d|1?\d?\d)){3}|(?:bucket|foreign|hot|idx|natural|primary|pub(?:lic)?|schema|sequence)[_.-]?key|(?:turkey)|key[_.-]?(?:alias|board|code|frame|id|length|mesh|name|pair|press(?:ed)?|ring|selector|signature|size|stone|storetype|word|up|down|left|right)|KeyVault(?:[A-Za-z]*?(?:Administrator|Reader|Contributor|Owner|Operator|User|Officer))\s*[:=]\s*['"]?[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}['"]?|key[_.-]?vault[_.-]?(?:id|name)|keyVaultToStoreSecrets|key(?:store|tab)[_.-]?(?:file|path)|issuerkeyhash|(?-i:[DdMm]onkey|[DM]ONKEY)|keying|(?:secret)[_.-]?(?:length|name|size)|UserSecretsId|(?:csrf)[_.-]?token|(?:io\.jsonwebtoken[ \t]?:[ \t]?[\w-]+)|(?:api|credentials|token)[_.-]?(?:endpoint|ur[il])|public[_.-]?token|(?:key|token)[_.-]?file|(?-i:(?:[A-Z_]+=\n[A-Z_]+=|[a-z_]+=\n[a-z_]+=)(?:\n|\z))|(?-i:(?:[A-Z.]+=\n[A-Z.]+=|[a-z.]+=\n[a-z.]+=)(?:\n|\z)))
stopWords: # stop words that if found in the secret, will discard the finding. Stop words are searched on the secret, which can be either the full regex match or the capture group if any is defined in the rule regex
- 000000,
- 6fe4476ee5a1832882e326b506d14126
paths: # paths that can be ignored for this allowList
- \.bb$
- \.bbappend$
- \.bbclass$
- \.inc$
- matchCondition: AND
regexTarget: line
regexes:
- LICENSE[^=]*=\s*"[^"]+
- LIC_FILES_CHKSUM[^=]*=\s*"[^"]+
- SRC[^=]*=\s*"[a-zA-Z0-9]+
```
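
To make the `entropy` comment in the example concrete, the following Go sketch computes Shannon entropy in bits per character. It is an illustration only: `shannonEntropy` is a name chosen here, and counting Unicode code points (rather than bytes) is an assumption that may differ from what the scanner does internally.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per character.
// Assumption: characters are counted as Unicode code points.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, n := range counts {
		p := float64(n) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	// "abcd" and "aabbccdd" both have four equally frequent characters,
	// so both evaluate to log2(4) = 2 bits, as the comment above notes.
	for _, s := range []string{"abcd", "aabbccdd"} {
		fmt.Printf("%-10q %.2f\n", s, shannonEntropy(s))
	}
}
```

A helper like this can be used to check whether a known true positive clears the `entropy` threshold you plan to set for a rule.
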

## Contributing

2ms is built around a plugin system so new targets and enhancements are easy to add. Check out [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, coding guidelines, and how to propose new rules or plugins.
10 changes: 5 additions & 5 deletions benches/process_items_test.go
@@ -12,11 +12,11 @@ import (
"sync"
"testing"

"github.com/checkmarx/2ms/v4/engine"
"github.com/checkmarx/2ms/v4/internal/workerpool"
"github.com/checkmarx/2ms/v4/lib/reporting"
"github.com/checkmarx/2ms/v4/lib/secrets"
"github.com/checkmarx/2ms/v4/plugins"
"github.com/checkmarx/2ms/v5/engine"
"github.com/checkmarx/2ms/v5/internal/workerpool"
"github.com/checkmarx/2ms/v5/lib/reporting"
"github.com/checkmarx/2ms/v5/lib/secrets"
"github.com/checkmarx/2ms/v5/plugins"
"github.com/rs/zerolog"
)

59 changes: 56 additions & 3 deletions cmd/config.go
@@ -1,20 +1,25 @@
package cmd

import (
"encoding/json"
"fmt"
"os"
"path/filepath"
"regexp"
"strings"

"github.com/checkmarx/2ms/v4/lib/utils"
"github.com/checkmarx/2ms/v5/engine/rules/ruledefine"
"github.com/checkmarx/2ms/v5/lib/utils"
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"
"github.com/spf13/cobra"
"gopkg.in/yaml.v3"
)

var (
errInvalidOutputFormat = fmt.Errorf("invalid output format")
errInvalidReportExtension = fmt.Errorf("invalid report extension")
errInvalidOutputFormat = fmt.Errorf("invalid output format")
errInvalidReportExtension = fmt.Errorf("invalid report extension")
errInvalidCustomRulesExtension = fmt.Errorf("unknown file extension, expected JSON or YAML")
)

func processFlags(rootCmd *cobra.Command) error {
@@ -37,6 +42,14 @@ func processFlags(rootCmd *cobra.Command) error {
engineConfigVar.CustomRegexPatterns = customRegexRuleVar
}

if customRulesPathVar != "" {
rules, err := loadRulesFile(customRulesPathVar)
if err != nil {
return fmt.Errorf("failed to load custom rules file: %w", err)
}
engineConfigVar.CustomRules = rules
}

setupLogging()

return nil
@@ -119,6 +132,46 @@ func setupFlags(rootCmd *cobra.Command) {
IntVar(&engineConfigVar.MaxTargetMegabytes, maxTargetMegabytesFlagName, 0,
"files larger than this will be skipped.\nOmit or set to 0 to disable this check.")

rootCmd.PersistentFlags().
Uint64Var(&engineConfigVar.MaxFindings, maxFindingsFlagName, 0,
"caps the total number of results. Scan stops early if limit is reached.\nOmit or set to 0 to disable this check.")

rootCmd.PersistentFlags().
Uint64Var(&engineConfigVar.MaxRuleMatchesPerFragment, maxRuleMatchesPerFragmentFlagName, 0,
"caps the number of results per rule per fragment (e.g., file, chunked file, page).\nOmit or set to 0 to disable this check.")

rootCmd.PersistentFlags().
Uint64Var(&engineConfigVar.MaxSecretSize, maxSecretSizeFlagName, 0,
"secrets larger than this size (in bytes) will be ignored.\nOmit or set to 0 to disable this check.")

rootCmd.PersistentFlags().
BoolVar(&validateVar, validate, false, "trigger additional validation to check if discovered secrets are valid or invalid")

rootCmd.PersistentFlags().
StringVar(&customRulesPathVar, customRulesFileFlagName, "", "Path to a custom rules file (JSON or YAML)."+
" Rules should be a list of ruledefine.Rule objects. --rule, --ignore-rule still apply to custom rules")
}

func loadRulesFile(path string) ([]*ruledefine.Rule, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, err
}

ext := filepath.Ext(path)
var customRules []*ruledefine.Rule

switch ext {
case ".json":
err = json.Unmarshal(data, &customRules)
case ".yaml", ".yml":
err = yaml.Unmarshal(data, &customRules)
default:
return nil, errInvalidCustomRulesExtension
}
if err != nil {
return nil, err
}

return customRules, nil
}
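
// Review note: loadRulesFile switches on the result of filepath.Ext as-is, so a path such as rules.YAML or rules.Json falls through to the default branch and is rejected. If case-insensitive extensions are wanted, the dispatch could be sketched as below. This is a standalone illustration, not part of the PR: rulesFormat and errUnknownExtension are hypothetical names, and the function only classifies the path instead of unmarshalling it.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errUnknownExtension is a hypothetical sentinel for this sketch.
var errUnknownExtension = errors.New("unknown file extension, expected JSON or YAML")

// rulesFormat is a hypothetical helper that maps a path to "json" or "yaml"
// based on its extension, ignoring letter case.
func rulesFormat(path string) (string, error) {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".json":
		return "json", nil
	case ".yaml", ".yml":
		return "yaml", nil
	default:
		return "", errUnknownExtension
	}
}

func main() {
	for _, p := range []string{"rules.json", "rules.YML", "rules.toml", "rules"} {
		format, err := rulesFormat(p)
		fmt.Printf("%-12s format=%q err=%v\n", p, format, err)
	}
}
```
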
105 changes: 102 additions & 3 deletions cmd/config_test.go
@@ -1,10 +1,12 @@
package cmd

import (
"fmt"
"os"
"path/filepath"
"testing"

"github.com/checkmarx/2ms/v5/engine/rules/ruledefine"
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"
"github.com/spf13/cobra"
@@ -223,14 +225,14 @@ max-target-megabytes: 10`
})
}

func TestConfigGile(t *testing.T) {
func TestConfigFile(t *testing.T) {
t.Run("ValidConfigFile", func(t *testing.T) {
tempDir := t.TempDir()
configFile := filepath.Join(tempDir, ".2ms.yml")

configContent := `log-level: debug
report-path:
- test-report.json
report-path:
- test-report.json
stdout-format: json
max-target-megabytes: 100`

@@ -272,3 +274,100 @@ stdout-format: json`
assert.Error(t, err)
})
}

func TestCustomRulesFlag(t *testing.T) {
expectedRules := []*ruledefine.Rule{
{
RuleID: "db18ccf1-4fbf-49f6-aec1-939a2e5464c0",
RuleName: "mock-rule",
Description: "Match passwords",
Regex: "[A-Za-z0-9]{32}",
Keywords: []string{"password", "pwd"},
Entropy: 3.5,
Path: "secrets/passwords.txt",
SecretGroup: 1,
Severity: "High",
OldSeverity: "Critical",
AllowLists: []*ruledefine.AllowList{
{
Description: "Ignore test files",
MatchCondition: "OR",
Paths: []string{"test/.*"},
RegexTarget: "match",
Regexes: []string{"test-password", "dummy-secret"},
StopWords: []string{"example", "sample"},
},
},
Tags: []string{"security", "credentials"},
Category: "General",
ScoreRuleType: 2,
DisableValidation: true,
Deprecated: true,
},
{
RuleID: "b47a1995-6572-41bb-b01d-d215b43ab089",
RuleName: "mock-rule2",
Description: "Match API keys",
Regex: "[A-Za-z0-9]{40}",
Keywords: []string{"api", "key"},
Entropy: 4.0,
Path: "config/api_keys.yaml",
SecretGroup: 0,
Severity: "Medium",
OldSeverity: "High",
AllowLists: []*ruledefine.AllowList{},
Tags: []string{"api", "custom"},
DisableValidation: false,
Deprecated: false,
},
}

tests := []struct {
name string
customRulesFile string
expectedRules []*ruledefine.Rule
expectErrors []error
}{
{
name: "Valid json custom rules file",
customRulesFile: "testData/customRulesValid.json",
expectedRules: expectedRules,
expectErrors: nil,
},
{
name: "Valid yaml custom rules file",
customRulesFile: "testData/customRulesValid.yaml",
expectedRules: expectedRules,
expectErrors: nil,
},
{
name: "Invalid custom rules file",
customRulesFile: "testData/customRulesInvalidFormat.toml",
expectedRules: nil,
expectErrors: []error{errInvalidCustomRulesExtension},
},
{
name: "Invalid rule type",
customRulesFile: "testData/customRulesInvalidRuleType.json",
expectedRules: nil,
expectErrors: []error{fmt.Errorf("cannot unmarshal number -2 into Go struct field Rule.scoreRuleType of type uint8")},
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
customRegexRuleVar = []string{}
engineConfigVar.CustomRules = nil

rootCmd := &cobra.Command{Use: "test"}
rootCmd.PersistentFlags().StringVar(&configFilePath, configFileFlag, "", "")
rootCmd.PersistentFlags().StringVar(&customRulesPathVar, customRulesFileFlagName, tt.customRulesFile, "")

err := processFlags(rootCmd)
for _, expectErr := range tt.expectErrors {
assert.ErrorContains(t, err, expectErr.Error())
}
assert.Equal(t, tt.expectedRules, engineConfigVar.CustomRules)
})
}
}
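
// Review note: the test above compares error text with assert.ErrorContains. Because processFlags wraps the loader error with %w, the sentinel case could also be checked by identity with errors.Is, which does not depend on message wording. The standalone sketch below shows that wrapping pattern; load, process, and isInvalidExtension are hypothetical stand-ins, not functions from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidExtension plays the role of errInvalidCustomRulesExtension.
var errInvalidExtension = errors.New("unknown file extension, expected JSON or YAML")

// load is a hypothetical stand-in for loadRulesFile that always fails
// with the sentinel error.
func load(path string) error {
	return errInvalidExtension
}

// process wraps the loader error with %w, as processFlags does,
// so the sentinel stays reachable through the error chain.
func process(path string) error {
	if err := load(path); err != nil {
		return fmt.Errorf("failed to load custom rules file: %w", err)
	}
	return nil
}

// isInvalidExtension reports whether err wraps the sentinel.
func isInvalidExtension(err error) bool {
	return errors.Is(err, errInvalidExtension)
}

func main() {
	err := process("rules.toml")
	fmt.Println(err)
	fmt.Println(isInvalidExtension(err))
}
```
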