ReversingLabs SpectraAssure rl-json parser for DefectDojo#12579
Conversation
|
This pull request contains a potential resource exhaustion risk in the ReversingLabs SpectraAssure JSON parsing method, where large JSON inputs could consume excessive memory during parsing, though the risk is currently classified as non-blocking.
Resource Exhaustion Risk in
|
| Vulnerability | Resource Exhaustion Risk |
|---|---|
| Description | Large JSON input could consume excessive memory during parsing. The json.load() method reads the entire file into memory, which could lead to a potential denial of service if an extremely large file is processed. |
All finding details can be found in the DryRun Security Dashboard.
|
after bot complain added 'title: ...' to the doc file |
|
fixed the readme after the checker complained |
|
my local hugo extended test now shows no errors |
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| WHAT = "ReversingLabs Spectra Assure" |
There was a problem hiding this comment.
Can you give this a clearer name?
There was a problem hiding this comment.
I expect you mean on the left side of the '=', yes will fix that
| cin.score = score | ||
| cin.score_severity = self._score_to_severity(score=score) | ||
|
|
||
| # TODO: tags |
There was a problem hiding this comment.
Is this still a TODO?
There was a problem hiding this comment.
sorry my mistake, tags have been added, i will remove the todo line
| if len(tail) == 0: | ||
| tail = f"{dep_name}@{dep_version}" | ||
|
|
||
| cin.unique_id_from_tool = f"{cin.component_file_sha256}:{cve}:{tail}" |
There was a problem hiding this comment.
the unique_id_from_tool value should be something that is in the report, not a computed/constructed value. Can you look at this?
There was a problem hiding this comment.
This one is more tricky.
In order for unique-id from tool to be valuable for detecting duplicates across future uploads we have to distinguish between what we call components (unpacked-files), vulnerabilities (cve) and component-dependencies in our report.
We detect vulnerabilities (cve) directly on components (unpacked-files) and we detect vulnerabilities (cve) on dependencies of components.
Components (unpacked files) are files in our report and so have a unique sha256. Dependencies may occur on multiple files(components) in the scan (as a result of unpacking, zip files, msi installers and such).
Dependencies usually have a package_url, or if they dont they have a version and name or product string.
So to uniquely identify one vulnerability (cve) on one item (component or dependency) we need to combine items.
- for vulnerabilities only on components (without dependencies) we need the sha256 and the cve to uniquely point to the file and the cve.
- for vulnerabilities on dependencies of compnents we need the sha256 of the components the package-url of the dependency and the cve of the vulneramility.
Once we have that we can fully use deduplication on future scans and imports into DefectDojo.
The report file internally uses 'uuid' to condense the report items but they are only unique inside this one report-file so have no meaning for deduplication. As such we do not have one single uniqe id in our report that uniquely identifies one DefectDojo-Finding.
A good example is the provided multi language installer in 'unittests/scans/reversinglabs_spectraassure/HxDSetup_2.5.0.exe-report.rl.json'
as the installer supports 18 different languages we see 18 embedded files. each file has dependencies and many depdendencies have vulnerabilities we report.
I hope this helps to explain why we have to construct a unique id.
i would guess the alternative would be not to privide a unique-id.
There was a problem hiding this comment.
i would guess the alternative would be not to privide a unique-id.
I suspect this may be the case. The unique_id_from_tool is intended to track the same vulnerability fro report to report. Ideally, this is handled by the vendor, but in cases where it is not, DefectDojo will essentially construct something similar you are doing, but not at the parser level. Instead, those fields will be captured in the deduplication settings through the hash code fields. It is essentially the same thing you are doing, just with DefectDojo fields after the findings are created by the importer/reimporter.
There was a problem hiding this comment.
uniq_id from tool removed
Maffooch
left a comment
There was a problem hiding this comment.
The structure of this parser is quite a bit different than other parsers that are found in DefectDojo. This may make it bit more difficult to maintain going forward
The trivy parser could be a good example to mimic
|
Will have a look at the trivy parser but i already see that it will not match our data, we use a significant amount or references to other items in our reports. I see that most data from other reports produce a relativly flat list of results, however we produce compacted dicts with cross references. The actual parsing happens in: I can naturally move that code into parser.py if that would suit you better. In general our report requires some significant study to see how the network of references work before doing any maintenance on the parser as it is easy to overlook things or make the wrong assuptions. |
Wow, you weren't kidding! 2400 lines for a single vulnerability is quite a bit of data. This confirms my fear that if, for reason, you're not around to maintain this parser if something changes in the report format producee by Reversing Labs, making changes will be a heavy task. I suppose the request for simplification is more important than I realized. Rather than asking you to reengineer the parsing logic, I think we can get away with documenting the parser. How does it sound to heavily comment the code in both files, and then write a short README in the same directory as the |
|
I'd like to request to make the variable names more clear as well. I had trouble following the code with variables like r, rr, cin. |
|
@valentijnscholten: will do @Maffooch: good idea, will do |
|
both done, short names replaced with more meaningful ones, the actual parser in |
|
not sure if this is related to my changes |
Feature: ReversingLabs SpectraAssure rl-json parser for DefectDojo
Documentation and unittests provded according to the plugin documentation
This adds a new parser for the rl-json format produced by the SpectraAssure cli or portal.