Allow multiple matches for SimplePackageConfigurationProvider#11618
Allow multiple matches for SimplePackageConfigurationProvider#11618sschuberth wants to merge 2 commits into
SimplePackageConfigurationProvider#11618Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #11618 +/- ##
=========================================
Coverage 58.39% 58.39%
+ Complexity 1761 1759 -2
=========================================
Files 355 355
Lines 13204 13204
Branches 1307 1307
=========================================
Hits 7710 7710
Misses 5007 5007
Partials 487 487
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
29b39b0 to
28b9a90
Compare
As announced in the commit message of 00a8487 and rationalized in [1], remove the restriction of allowing at most one matching package configuration. Resolves #6972. [1]: #6972 (comment) Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
28b9a90 to
0784796
Compare
| # (listed first) takes precedence. Configurations must be unique by ID and provenance within a single provider. | ||
| # Ensure that different providers do not provide conflicting configurations for the same package, see | ||
| # https://github.com/oss-review-toolkit/ort/issues/6972 for details. | ||
| # configurations for the same package ID and provenance, the configuration from the higher-priority provider (listed |
There was a problem hiding this comment.
I've read [1] and 00a8487, but couldn't find a clear rationale for this commit.
Previously, the use of SimplePackageConfigurationProvider as part of 'ort-config' CI ensured that there are not mutliple package configuration for an (id, provenance).
I do not see any scenario in which it does make sense to support that. So, I believe this check should be kept per provider.
Can you also help me understand why we do not do the following?
- Keep the invariant per provider
- Allow multiple package configuration for same
(id, provenance)only from different providers
Note: If multiple configurations apply, the excludes and license finding curation simply could be merged and all applied, or? And if the source provider is information which is necessary to keep, then probably the Compound...Provider might need to go, as it would loose that information?
There was a problem hiding this comment.
ensured that there are not mutliple package configuration for an (id, provenance).
But why is a problem to have multiple matches there, but it's not a problem when using multiple providers?
Can you also help me understand why we do not do the following?
In my opinion is adds code and complexity while at the same time being inconsistent. Either multiple matches should be allow or not, no matter whether they originate from the same provider or not. Otherwise, this results in a weird user experience. For example, the same new package configuration would be accepted or not, depending on which provider you add it to (if all providers are part of a composite provider).
In the end this seems to be a real issue, as both @MarcelBochtler and myself were running into this "artificial" limitation as described in the issue.
There was a problem hiding this comment.
The only reason I can think of to disallow multiple matches for a single SimplePackageConfigurationProvider is that the order of returned package configuration might be unclear, e.g. when iterating over files in a filesystem. But I believe this can be handled and documented by the implementation, e.g. by sorting files.
There was a problem hiding this comment.
But why is a problem to have multiple matches there, but it's not a problem when using multiple providers?
How about the other way round: Why is it it beneficial to have two package configurations for a specific (id, provenance), e.g. in a single provider, e.g. ort.yml or in ort-config? (in particular also from the user point of view)
In my opinion is adds code and complexity
I'm not sure which code complexity you mean. If you mean the function which checks uniqueness, I believe this code is an ok price for ensuring to have all curations / path excludes for a id, provenance in a single place.
In the end this seems to be a real issue, as both @MarcelBochtler and myself were running into this "artificial" limitation as described in the #6972.
To my understanding, the issue from @MarcelBochtler was to have a pair of conflicting configuration
spread in ort.yml and ort-config. I agree that this should not be flagged, but I believe simply separate providers for these should be used and the check kept. Would this solve the problem and work for you?
There was a problem hiding this comment.
Why is it it beneficial to have two package configurations for a specific (id, provenance), e.g. in a single provider
For example, users may want to separate path_excludes (which are more technical) from license_finding_curations (which are more legally relevant) for the same (id, provenance) by using separate files.
There was a problem hiding this comment.
For example, users may want to separate
path_excludes(which are more technical) fromlicense_finding_curations(which are more legally relevant) for the same (id, provenance) by using separate files.
These are in the same file already separated by the dedicated properties.
However, one could also separate these using a entirely separate dirs, which would then be separate providers.
There was a problem hiding this comment.
However, one could also separate these using a entirely separate dirs, which would then be separate providers.
Sure, there are work-arounds. But the question remains: Why does a user have to use work-arounds? Maybe now you'll willing to answer my question without a counter-question and explain what's the benefit of keeping the current inconsistency and limitation of SimplePackageConfigurationProvider.
There was a problem hiding this comment.
Sure, there are work-arounds.
I wouldn't see this necessarily as a work around, but as a different way of supporting it.
Maybe now you'll willing to answer my question without a counter-question and explain what's the benefit of keeping the current inconsistency and limitation of SimplePackageConfigurationProvider.
In my initial comment I highlighted that I did not get the rationale from the commit message. And it was not too easy to extract from this discussion that the rationale is: separating legal from non-legal or rather
license finding curations from path excludes in separate files.
I anticipate many complications from this, just to name a few:
- Allowing arbitrary number of files per
(id, provenance)adds complexity, and makes it harder to write /
use atomation helper to work with the files. - It becomes more errorprone, because when looking
at a particular package configuration file, you cannot assume anymore that it is the complete configuration. - The relation
Project : ort.ymlis quite similar toPackage: package-configuration.yml. Theort.yml
also containslicense_finding_curationsnext topath_excludes. The two concepts are analog and imho
should remain consistent. ort-configCI checks will be weakened by this change.
For example, users may want to separate path_excludes (which are more technical) from license_finding_curations (which are more legally relevant) for the same (id, provenance) by using separate files.
Maybe now you'll willing to answer with the entire rationale if there is further arguments besides this example?
I'm wonder if simply disagreeing and making the provider configurable to tolerate uniqueness (not by default) could shortcut this to some compromise solution?
Please have a look at the individual commit messages for the details.