fix(query): changed all dockerfile queries for case insensitive support of dockerfile commands #8006
Open
cx-andre-pereira wants to merge 102 commits into Checkmarx:master from
…s named 'dockerfile'
…es and all files with prefix 'dockerfile.' as well as all files with the '.dockerfile' extension type in a case insensitive matter (improvement on first commit)
…sion, added support for all ubi8/debian files in case of valid dockerfile structure, added support for lower case dockerfile commands - most queries will have issues with this but relevant text files are properly detected as a 'dockerfile' as intended
…ormats for consistency
…ility to lower cyclomatic complexity
…rors, fixed 'gitignore' files exclusion, docker parser will handle said case like before but with explicit 'gitignore' extension rather than 'possibleDockerfile' like before
…sion so that it 1- gets detected regardless of syntax inside 2- gets detected withouth checking syntax inside through the code optimizing detection speed for said files
…wice and minor simplificaton of query arguments
…test 105, improved uni tests to include new case insensitive samples
… dockerfiles as '.dockerfile'
…ailing whitespace
… unnecessary 'gitignore' case in analyzer's workers
…have to be explicitly set as unwanted to allign with '.gitignore' behaviour
…s://github.com/cx-andre-pereira/kics into AST-140477--Improvement-to-dockerfile-scanning
…ct commands in payload
…ed from searchKey
…140477--Improvement-to-dockerfile-scanning
… without extension or within docker folder, adjusted readPossibleDockerfile to an improved regex based logic, adjusted file.Dockerfile to test for whitespaces before a from statement
…t identification - E2E test should break
… dockerfile folder, updated E2E results
…rfile_queries_fix_for_case_insensitivity
…://github.com/cx-andre-pereira/kics into Dockerfile_queries_fix_for_case_insensitivity
…umentation to reflect specific dockerfile searchKey building
…xclusion based regex
…rfile_queries_fix_for_case_insensitivity
…ine FROM statements are compatible, fixed whitespace support for ARG/comments in dockerfiles and fix new E2E results
Reason for Proposed Changes
The changes from #7995 (included in this PR) let scanned Dockerfiles be recognized when they use valid case-insensitive command syntax. However, none of the existing Dockerfile queries were prepared for this: all of them contain hardcoded uppercase references to Docker commands.
Additionally, testing revealed that the existing query logic, payload generation, and engine line-searching logic did not support multiple `FROM` statements associated with the same image within a single document. In such cases, only the first `FROM` statement would produce results; the engine was unable to flag any subsequent occurrences.
Proposed Changes
New auxiliary functions
Two new helper functions were added to the common library `dockerfile.rego`:
- `get_original_from_command`: extracts the `FROM` command with its original casing, along with the `LineHint` value for that command.
- `add_line_hint`: appends the extracted `LineHint` to the string using `^` as a separator.
The `LineHint` value enables the engine to distinguish between multiple `FROM` statements referencing the same image.
Query updates
… `searchKey` value. On the engine side, preserving the original casing of `FROM` enables more precise searching through the source file. The `LineHint` value is consumed by `docker_detect` and stripped by the `vulnerability_builder` before results are emitted.
Parser changes
… `FROM` statement generates a distinct object. Previously, all `FROM` commands referencing the same image were merged into a single object.
Test coverage
A new positive sample and a new negative sample were added to every test folder. These mirror the existing `positive/positive1` and `negative/negative1` tests but use all-lowercase commands.
A new E2E test (107) was added. It checks that the payload and results of a scan on a multistage Dockerfile sample with duplicate `FROM` statements are parsed and flagged as expected.
A new test based on `OriginalData4` was added to the docker_detect_test file. It also checks that a multistage sample with duplicate `FROM` statements is flagged properly.
`TestDockerSearchKeyLineHintRemoval` was added to the vulnerability_builder_test file. As the name implies, it checks that the `LineHint` appended to the `searchKey` value of a sample Dockerfile query is removed properly.
Updated Documentation/Script
… `searchKey` value attribution.
The `validate_search_line.py` CI script was updated to account for the fact that Dockerfile queries do not currently support `searchLine` values. The script now excludes all queries under the `dockerfile` queries folder.
I submit this contribution under the Apache-2.0 license.
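For reviewers who want a quick mental model of the `LineHint` flow described in the proposed changes, here is a minimal, language-neutral sketch in Python. It is not the code in this PR: the real helpers are written in Rego (`dockerfile.rego`) and the engine side is Go. Only the name `add_line_hint` and the `^` separator come from the description; `parse_from_statements`, `split_line_hint`, the regex, and the `COMMAND={{image}}` key shape are assumptions made purely for illustration.

```python
import re

# Assumption: a FROM line is optional whitespace, the keyword in any casing,
# then the image reference. The real parser handles far more than this.
FROM_RE = re.compile(r"^\s*(from)\s+(\S+)", re.IGNORECASE)


def parse_from_statements(text):
    """Return one entry per FROM line, keeping original casing and line number."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = FROM_RE.match(line)
        if match:
            entries.append(
                {"command": match.group(1), "image": match.group(2), "line": line_no}
            )
    return entries


def add_line_hint(search_key, line_hint):
    """Append the line hint to the key, using '^' as the separator."""
    return f"{search_key}^{line_hint}"


def split_line_hint(search_key):
    """Hypothetical inverse: split a trailing '^<digits>' hint off the key."""
    key, sep, hint = search_key.rpartition("^")
    if sep and hint.isdigit():
        return key, int(hint)
    return search_key, None  # no hint present, key is returned unchanged


SAMPLE = """from alpine:3.19 as build
run echo build
FROM alpine:3.19
CMD ["sh"]
"""

entries = parse_from_statements(SAMPLE)
# Assumed key shape; both stages reference the same image, so only the
# original casing and the hint tell the two statements apart.
keys = [
    add_line_hint(e["command"] + "={{" + e["image"] + "}}", e["line"])
    for e in entries
]
for key in keys:
    print(key, "->", split_line_hint(key))
```

In this sketch the two `FROM` lines yield two separate entries, and the hint is what lets a consumer resolve each key to its own line before the hint is dropped from the reported key.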