When I scan the text "GNU Library General Public Licence", I get "detected_license_expression": null, even though this is clearly an LGPL license reference. Only license_clues are reported for this text.
About 2.8k files contain this license sentence, see: https://github.com/search?q=%22GNU+Library+General+Public+Licence%22&type=code
Here is the result from scancode-toolkit:
{
"path": "check.txt",
"type": "file",
"detected_license_expression": null,
"detected_license_expression_spdx": null,
"license_detections": [],
"license_clues": [
{
"license_expression": "lgpl-2.0 OR mulle-kybernetik",
"license_expression_spdx": "LGPL-2.0-only OR LicenseRef-scancode-mulle-kybernetik",
"from_file": "check.txt",
"start_line": 1,
"end_line": 1,
"matcher": "3-seq",
"score": 10.64,
"matched_length": 5,
"match_coverage": 10.64,
"rule_relevance": 100,
"rule_identifier": "lgpl-2.0_or_mulle-kybernetik4.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/lgpl-2.0_or_mulle-kybernetik4.RULE",
"matched_text": "GNU Library General Public Licence",
"matched_text_diagnostics": "GNU Library General Public Licence"
}
],
"percentage_of_license_text": 100.0,
"scan_errors": []
}
Problem
The problem is the word "licence". The text uses the spelling "licence", but according to the SPDX license list matching guidelines, the words "licence" and "license" are equivalent.
Let's take a new text, "GNU Library General Public License", where "licence" is replaced with "license", and scan it.
Here is the result:
{
"path": "check.txt",
"type": "file",
"detected_license_expression": "lgpl-2.0-plus",
"detected_license_expression_spdx": "LGPL-2.0-or-later",
"license_detections": [
{
"license_expression": "lgpl-2.0-plus",
"license_expression_spdx": "LGPL-2.0-or-later",
"matches": [
{
"license_expression": "lgpl-2.0-plus",
"license_expression_spdx": "LGPL-2.0-or-later",
"from_file": "check.txt",
"start_line": 1,
"end_line": 1,
"matcher": "1-hash",
"score": 100.0,
"matched_length": 5,
"match_coverage": 100.0,
"rule_relevance": 100,
"rule_identifier": "lgpl-2.0-plus_186.RULE",
"rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/lgpl-2.0-plus_186.RULE",
"matched_text": "GNU Library General Public License",
"matched_text_diagnostics": "GNU Library General Public License"
}
],
"identifier": "lgpl_2_0_plus-07b4098a-cf1f-a108-69d6-3021dbfb3959"
}
],
"license_clues": [],
"percentage_of_license_text": 100.0,
"scan_errors": []
}
With the "license" spelling, the correct license is detected.
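To illustrate the kind of normalization the SPDX matching guidelines imply, here is a minimal sketch that maps "licence" to "license" before matching. This is not scancode-toolkit code: the names EQUIVALENT_WORDS and normalize_spelling are hypothetical, and the mapping holds only the one word pair discussed in this issue.

```python
import re

# Hypothetical mapping for illustration: only the word pair from this issue.
EQUIVALENT_WORDS = {"licence": "license"}

# Match whole words only, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, EQUIVALENT_WORDS)) + r")\b",
    re.IGNORECASE,
)


def _replace(match):
    """Return the canonical spelling, keeping the casing style of the original word."""
    word = match.group(0)
    canonical = EQUIVALENT_WORDS[word.lower()]
    if word.isupper():
        return canonical.upper()
    if word[0].isupper():
        return canonical.capitalize()
    return canonical


def normalize_spelling(text):
    """Replace each equivalent spelling in text with its canonical form."""
    return _PATTERN.sub(_replace, text)


if __name__ == "__main__":
    print(normalize_spelling("GNU Library General Public Licence"))
```

With a step like this applied to the query text (or with an added rule that covers the "Licence" spelling), both texts above would reach the matcher in the same form.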