contrib: require explicit agreement for including external code #23201
ngxson wants to merge 1 commit into
|
Does this include code files that are used in the vendor directory? |
|
I would suggest that they credit the original author for their work too, even if they had their blessing to use it, so that we know where it came from. |
My opinion is that, instead of requiring explicit permission in all cases, we should require disclosure and then decide on a case-by-case basis how to go from there.
I would suggest that they credit the original author for their work too, even if they had their blessing to use it, so that we know where it came from.
I agree that external code sources should be documented.
|
@inforithmics we include the full license from vendors in the distributed package for this purpose. @JohannesGaessler can you point me to a real example where always requiring explicit acknowledgement can be a problem? Many recent slop PRs are taken straight from llama.cpp forks, like turboquant and dsv4, where the original author has no intent to push it upstream. This kind of contribution farming is not something we should tolerate IMO; it's straight-up stealing code without consent. Some examples:
|
|
@taronaeo to be clear, we do try our best to give attribution to original authors. Asking for explicit approval can also prevent unfortunate situations where code is merged upstream, but the author then requires a specific form of attribution that we cannot offer. |
|
The problem we have is people creating spam PRs. I'm thinking there could be legitimate use for taking code from external sources where the original author is not available/responding for whatever reason. Although we could also handle that on a case-by-case basis. Maybe we were also not thinking of the same thing. The text in the PR says "obtain explicit acknowledgement from the original author" but what do we actually mean by that? I read it as "explicit permission". |
Yes, I do mean "explicit permission" or in other words, "written proof"
IMO it is a bit risky that way, for the reason I mentioned in my previous comment. I do think an explicit consensus is much safer in the context of llama.cpp as a well-known open-source project. I would assume the case where we legitimately copy code from another repo is pretty rare in llama.cpp, for 2 reasons:
Also note that the cases below are not counted as "including code from an external source":
|
|
Any code licensed under a license compatible with inclusion in llama.cpp should be allowed to be included in a PR if it serves the project. Requiring explicit permission from the author of the code in question is neither necessary nor useful as a stipulation if that author has released the code under a license that allows inclusion. Allowing the original author to upstream their own code published elsewhere is a common courtesy that should be observed, but there is absolutely nothing wrong with anyone adding code that the original author is not interested in upstreaming, or with borrowing code from an unrelated project. |
|
@IMbackK In theory, MIT or any compatible license allows doing exactly that. But in reality, not everyone is happy with their code being copied without their acknowledgement. Example: imagine that I'm working on a big feature and I want to optimize it further, then create the PR upstream later on. HOWEVER, if someone takes my code and pushes it upstream in this bad state (without my acknowledgement), that would be pretty much unwanted, even though everything is permitted by the licenses. And indeed, just as a reminder, there were some messy consequences of not having explicit agreement from the original author about how to give them attribution (I won't go into details here as most maintainers already know). So I think my point still stands: having explicit acknowledgement from the original author is still much better / much safer. If you have an example of how this can hurt the development of the project, I'm happy to discuss further. |
MIT does require acknowledgement of license and author through the copyright notice for copies of "substantial portions" of code. This is sensible anyway: if you use someone's code (as allowed by the license), at the very least credit them. I don't think we need a specific rule beyond that; the current rules already allow closing these kinds of spam PRs without any problem. It can be handled case by case. |
|
@0cc4m can you cite the exact text in the license that explicitly or implicitly implies the terms you mentioned? |
|
I'm not a lawyer, but the MIT license states
Maybe if including MIT-licensed code in another MIT-licensed project the second condition is already satisfied by the main license, but the copyright notice would still be required and can maybe be in a comment above the code that was imported. I might be wrong, but it sounds to me like that covers our cases well enough. I don't think we would really consider importing code from other projects most of the time anyways. |
|
@0cc4m I think what you are referring to has nothing to do with the point of "the author must acknowledge how their code is being used". Grammatically speaking, the term you mentioned is in the passive form (use of the phrase
It means the subject who uses or redistributes the software must acknowledge and include the license, not the other way around. But still, the license is the "in theory" part. After all, I believe my arguments are pretty solid as they are backed by real examples that can be verified. I still strongly believe that a change in the guideline is needed. However, on second thought, I agree my proposed wording is a bit strict. @JohannesGaessler's proposal sounds a bit better, so I'll rethink and adapt to his version instead. |
Yes, I am talking about what anyone submitting code to us that they have not authored must do. Whether we accept it or not is a different question. If someone does not follow this, it can be declined immediately. If someone does follow it, it can be considered/discussed and that may include asking the original author if necessary. I just don't think we should codify requiring direct confirmation from the original author if it is licensed in a permissive way, but I also don't mean we should just accept code from other projects. I agree with deciding case-by-case. |
|
How about something like this: "If at all possible, coordinate the upstreaming of code from forks of this repository with the original author(s). If you are not doing this, explain why. Always disclose when code is being upstreamed." I think our issue largely has to do with forks of llama.cpp specifically, not with random MIT-licensed projects. |
Overview
In simple words: ask the author before copying their code
Requirements