Commit 707f41c

2 parents bef12b9 + 579b322 commit 707f41c

25 files changed

Lines changed: 623 additions & 9 deletions

website/blog/authors.yml

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,16 @@ pombredanne:
2828
linkedin: philippeombredanne
2929
github: pombredanne
3030

31+
team:
32+
name: AboutCode team
33+
title: Open source for open source
34+
url: https://github.com/aboutcode-org
35+
image_url: /img/nexB_icon.svg
36+
page: true
37+
socials:
38+
linkedin: https://www.linkedin.com/company/nexb
39+
github: https://github.com/aboutcode-org
40+
3141
tg1999:
3242
name: Tushar Goel
3343
title: Software Engineer
Lines changed: 93 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,93 @@
1+
---
2+
slug: copyleft-licensed-software-java-app
3+
title: Using Copyleft-licensed software components in a Java application
4+
authors: [team]
5+
tags: [copyleft, java, license compliance]
6+
hide_table_of_contents: false
7+
---
8+
9+
Key considerations while using Copyleft-licensed software components in a Java application.
10+
11+
![java_copyleft_license](java_copyleft_license.png)
12+
13+
This document explains some key considerations for the use of Copyleft-licensed software components in a Java application in two contexts:
14+
15+
- Execution of the Java code in a shared JVM.
16+
- Combining class files in a shared executable JAR – and by extension in a Combined JAR (e.g. uber-jar or fat jar).
17+
18+
For this document, “JAR” refers specifically to an executable Java library that is a collection of `.class` files packaged into a file with the `.jar` extension; it does not refer to the use of a `.jar` file as an archive file only (such as for packaging source files for a Java library).
19+
20+
The purpose of this document is to present a “conservative” interpretation of what linking or interaction may mean in the Java context. It is not based on any particular product or application, and we are not aware of any specific license compliance enforcement actions in this area.
21+
22+
## “Strong” Copyleft-licensed Components
23+
24+
The execution of any software component licensed under the GPL (or another “strong” Copyleft license such as AGPL, SleepyCat, etc.) in a JVM effectively links that component with all other software components in that JVM process. Under this conservative interpretation, those other components become subject to GPL license obligations, including redistribution of source code.
25+
26+
The net impact of this interaction inside a JVM is that you should not deploy any GPL-licensed code in a commercial Java-based product unless that GPL-licensed code is executed in a separate JVM. This use case is possible, but quite rare in practice.
27+
28+
In such rare cases, the GPL-licensed component should be used as-is and un-modified.
29+
30+
If a modification is absolutely required, the purpose of the modification must not be to expose some privileged way to communicate with this library from proprietary code such as exposing a socket interface or other API for the sole benefit of avoiding a direct call to the Copyleft-licensed library.
31+
32+
Such modifications would be considered as essentially similar to running the Copyleft-licensed library in the same JVM process and making direct calls so that the Copyleft obligation would still apply.
33+
34+
## “Limited” Copyleft-licensed Components
35+
36+
Any code included within a JAR can be considered to be statically linked with any other code in that JAR, even though, strictly speaking, there is no concept of “static linking” in Java technology.
37+
38+
The primary logic here is that a JAR is an executable program and all of the files inside it interact within that context.
39+
40+
Clearly there are many programming-level differences between:
41+
42+
1. the process of compiling and linking C/C++ source files into an executable program and
43+
2. the process of converting `.java` or other source files (such as Scala) into `.class` files and packaging them into a JAR.
44+
45+
But there are more similarities than differences. The net impact of this interaction inside a JAR is that you should not deploy any Copyleft-licensed code in a JAR in combination with any proprietary code.
46+
47+
The impact of software interaction of `.class` files within a JAR varies according to the specific subtype of limited Copyleft license. There are three primary subtypes to consider:
48+
49+
1. LGPL
50+
2. GPL with Classpath Exception
51+
3. “Public” or file-based licenses (CDDL, EPL, MPL)
52+
53+
## 1) LGPL
54+
55+
The LGPL version 2 and version 3 licenses are quite different, but in both cases there are specific terms and conditions related to software interaction and these provide the strongest case that combining `.class` files in an executable `.jar` is a form of static linking.
56+
57+
## 2) GPL with Classpath Exception
58+
59+
This license permits static linking of “independent modules”, but it may be hard to argue that `.class` files combined into a single JAR are independent.
60+
61+
## 3) “Public” or file-based licenses (CDDL, EPL, MPL)
62+
63+
The Copyleft impact of these licenses is primarily limited to the file level, so this is the best case to argue that you can combine class files into one JAR without Copyleft impact.
64+
65+
For a component licensed under any of the Limited Copyleft licenses, you do have the option to dynamically link separate libraries (JARs) within a JVM. This is different from GPL-licensed code, as described above, because you can dynamically link libraries under a Limited Copyleft license inside a JVM without a Copyleft impact on other libraries.
66+
67+
The recommended best practice is to deploy any Java library under a Limited Copyleft license as a separate “dynamic” library, as provisioned from the original OSS project. This is the best way to avoid Copyleft impact.
68+
69+
## Combined JARs: uber-jars, mega-jars and fat-jars
70+
71+
Java code is typically packaged and redistributed as pre-compiled `.class` files assembled in one or more JAR libraries. Open source Java libraries are commonly downloaded at build time from a repository such as Maven (either a private or the Maven Central public repository).
72+
73+
The process of creating a Combined JAR is to combine the `.class` files from all of the third-party dependency JARs together with proprietary-licensed `.class` files in a single JAR. This larger Combined JAR mixes open source (and possibly Copyleft-licensed code) and proprietary code in a single JAR.
74+
75+
Creating larger Combined JARs is typically automated as part of a product build. Maven-based build plugins and tools include Maven Shade, one-jar, fat jar and others.
76+
77+
In most cases, this is an addition to the build that is easily reversed to revert to a multi-jar deployment approach. The technical purpose of building a Combined JAR may be to:
78+
79+
- Simplify the deployment or configuration of some larger Java applications by reducing the number of `.jar` libraries to be deployed.
80+
- Simplify runtime configuration. In particular the Java class paths do not need to be configured to reference the dependencies since they are all contained in a single executable library.
81+
- Accelerate initial loading of the application in the JVM where startup time is critical for the application. This acceleration is likely to be minimal.
82+
83+
In addition to the Copyleft interaction issues outlined above, some other disadvantages of using Combined JARs are:
84+
85+
- In the process of creating a Combined JAR, some common files with the same name and path (such as NOTICE, LICENSE) may be overwritten in a Combined JAR. Only one copy of each such file will exist in the Combined JAR. The terms of most open source licenses do not permit you to remove license or notice files.
86+
- The repackaging of un-modified JARs in a Combined JAR could be considered to be a modification. Most Copyleft licenses require you to track and document changes so this repackaging may require additional documentation work for the product team.
87+
- Tracing the package-version of an individual third-party component included in a Combined JAR may be difficult, which in turn may make it difficult to comply with Copyleft license conditions that require an offer to redistribute package-version-specific source code.
88+
- When updating software, the entire Combined JAR will need to be rebuilt even if most individual third-party packages are unchanged. In particular, if a single third-party component JAR needs to be updated for a vulnerability fix, bug fix or new feature, then the whole Combined JAR needs to be redistributed to customers.
89+
- If several larger Combined JARs are created in a product, the resulting size of the executables may be larger, as the contents of every shared third-party JAR will be duplicated in each Combined JAR instead of being shared across modules. Thus, a Combined JAR can impede the possibility and flexibility of Java library reuse.
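
The first disadvantage above (same-path files such as NOTICE or LICENSE overwriting each other) can be checked before any merge happens. The following is a minimal sketch, not part of any build tool: it relies only on the fact that a JAR is a zip archive, and the JAR names, entry paths and helper functions (`find_colliding_entries`, `make_jar`) are hypothetical and made up for illustration.

```python
import io
import zipfile
from collections import defaultdict


def find_colliding_entries(jars):
    """Return the entry paths that appear in more than one JAR.

    `jars` maps a JAR name to a path or file-like object readable by zipfile.
    The result maps each colliding path to the list of JARs that contain it.
    """
    owners = defaultdict(list)
    for jar_name, jar_file in jars.items():
        with zipfile.ZipFile(jar_file) as archive:
            for entry in archive.namelist():
                if not entry.endswith("/"):  # skip directory entries
                    owners[entry].append(jar_name)
    return {path: names for path, names in owners.items() if len(names) > 1}


def make_jar(entries):
    """Build a small in-memory JAR (a zip archive) for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, text in entries.items():
            archive.writestr(path, text)
    buffer.seek(0)
    return buffer


# Hypothetical dependency JARs; names and contents are invented.
demo_jars = {
    "lib-a-1.0.jar": make_jar({"META-INF/LICENSE": "License A", "a/A.class": ""}),
    "lib-b-2.3.jar": make_jar({"META-INF/LICENSE": "License B", "b/B.class": ""}),
}

collisions = find_colliding_entries(demo_jars)
for path, names in sorted(collisions.items()):
    print(f"{path} appears in: {', '.join(names)}")
```

Any path reported here is a file where only one copy would survive a naive merge, so it is a candidate for manual review before a Combined JAR is distributed.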
90+
91+
In general, Combined JARs are best suited for deployment of Java applications in an internal IT system or SaaS-only use case, where some of their benefits are measurable and there are fewer issues related to license compliance and Copyleft-licensed component interaction.
92+
93+
When used in a commercial product that is distributed in any way, the issues attached to Combined JARs usually outweigh any technical benefits that they may offer.
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
---
2+
slug: scancode-licensedb
3+
title: ScanCode LicenseDB -- 2,000+ licenses curated in a public database
4+
authors: [team]
5+
tags: [license compliance, license detection]
6+
hide_table_of_contents: false
7+
---
8+
9+
The ScanCode LicenseDB is all about identifying a wide variety of licenses that are actually found in software.
10+
11+
![scancode-db-blog](scancode-db-blog.png)
12+
13+
New software licenses appear constantly (like mushrooms popping out of the ground after a heavy rain) and old nearly-forgotten ones are rediscovered when someone [scans a codebase](https://www.nexb.com/scancode/) that incorporates legacy code (like finding rare medieval manuscripts in the back shelves of a library). The [ScanCode LicenseDB](https://scancode-licensedb.aboutcode.org/) precisely identifies and organizes licenses and their metadata so that multiple members of the software community can understand exactly which licenses are being referenced in project documentation.
14+
15+
If you have seen a license notice, passed it on to your legal team for scrutiny, and completed that review, you probably do not want to repeat that process over and over again.
16+
17+
With over 2,000 licenses, ScanCode LicenseDB is arguably the largest free list of curated software licenses available on the internet, and an essential license reference resource for license compliance and SBOMs. ScanCode LicenseDB is available as a website, a JSON or YAML API, and a Git repository, making it easy to reuse and integrate in tools that need a database of reference software licenses.
18+
19+
Here are some key points about the ScanCode LicenseDB:
20+
21+
- Is a list of 2,092 licenses recognized by scancode-toolkit as of 2023-03-13
22+
- Identifies each license by the license key defined in scancode-toolkit
23+
- Provides an SPDX Identifier (with link) for every license and exception on the SPDX License List, and a “LicenseRef” identifier for every license and exception not on the SPDX License List.
24+
- Provides license texts in plain text formats.
25+
- Provides license texts and metadata in YAML and JSON.
26+
- Freely accessible via [API](https://scancode-licensedb.aboutcode.org/help.html#api)
27+
- Data licensed under CC-BY-4.0
28+
- Community supported on [GitHub](https://github.com/nexB/scancode-licensedb).
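
As a rough illustration of reusing the data from the JSON API, here is a minimal sketch. It assumes an index is served at `https://scancode-licensedb.aboutcode.org/index.json` and that each entry carries `license_key`, `category` and `spdx_license_key` fields; treat the URL and the field names as assumptions to verify against the API help page linked above. The sample entries, including `example-vendor-eula`, are invented.

```python
import json
import urllib.request
from collections import Counter

# Assumed location of the JSON index; verify against the LicenseDB API help page.
INDEX_URL = "https://scancode-licensedb.aboutcode.org/index.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the license index (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def count_by_category(licenses):
    """Count index entries per license category."""
    return Counter(entry.get("category", "Unknown") for entry in licenses)


def non_spdx_keys(licenses):
    """List license keys whose SPDX-style identifier is a LicenseRef."""
    return sorted(
        entry["license_key"]
        for entry in licenses
        if str(entry.get("spdx_license_key", "")).startswith("LicenseRef-")
    )


# Invented sample entries shaped like the assumed index format.
sample = [
    {"license_key": "mit", "category": "Permissive", "spdx_license_key": "MIT"},
    {"license_key": "gpl-2.0", "category": "Copyleft", "spdx_license_key": "GPL-2.0-only"},
    {
        "license_key": "example-vendor-eula",
        "category": "Proprietary Free",
        "spdx_license_key": "LicenseRef-scancode-example-vendor-eula",
    },
]

category_counts = count_by_category(sample)
local_only = non_spdx_keys(sample)
print(category_counts)
print(local_only)
```

To run the same functions on live data, replace `sample` with the result of `fetch_index()`.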
29+
30+
And below are some frequently asked questions about the ScanCode LicenseDB.
31+
32+
**Q: What are the inclusion criteria for a license to be in the ScanCode LicenseDB?**
33+
34+
A: The only requirements are a text and a usage in existing code. The ScanCode LicenseDB includes multiple categories of licenses, not just open source: permissive, copyleft, commercial, proprietary free, source-available, etc. More information on license categories is available here: https://scancode-licensedb.aboutcode.org/help.html#license-categories
35+
36+
**Q: Does the ScanCode LicenseDB compete with other license lists, such as the SPDX license list?**
37+
38+
A: No. The ScanCode LicenseDB is intended to **supplement** other license lists. When new licenses are discovered by scancode-toolkit or the software community, they are added to the list with references to other lists whenever possible.
39+
40+
**Q: What is the process for adding or correcting licenses in the ScanCode LicenseDB?**
41+
42+
A: License curation is primarily a task of the active participants in [AboutCode.org](https://www.aboutcode.org/), but any member of the software community is welcome to log and respond to issues at https://github.com/nexB/scancode-licensedb/issues. See https://scancode-licensedb.aboutcode.org/help.html#support for more details.
43+
44+
**Q: Is a license in the ScanCode LicenseDB “approved” or “recommended for use”?**
45+
46+
A: The ScanCode LicenseDB is all about identifying the wide variety of licenses that are actually found in software. There is no attempt to approve or disapprove of license terms, and there is no attempt to correct poorly written licenses. The only license interpretation provided is a license category, which represents the best judgment of the license curators.
47+
48+
**Q: How are licenses discovered (detected) by scancode-toolkit?**
49+
50+
A: For license detection, ScanCode uses a (large) number of license texts and license detection ‘rules’ that are compiled in a search index. When scanning, the text of the target file is extracted and used to query the license search index and find license matches.
51+
52+
For copyright detection, ScanCode uses a grammar that defines the most common and less common forms of copyright statements. When scanning, the target file text is extracted and ‘parsed’ with this grammar to extract copyright statements.
53+
54+
More detailed information is available at https://scancode-toolkit.readthedocs.io/en/stable/explanation/scancode-license-detection.html#scancode-license-detection.
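
The index-and-query idea described in this answer can be illustrated with a toy sketch. This is not ScanCode's algorithm or API: it swaps in a plain inverted index over word tokens and scores each rule by the fraction of its tokens found in the scanned text. The rule keys, the short phrases standing in for rule texts, and the `0.8` threshold are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rules):
    """Build an inverted index (token -> rule keys) from rule texts."""
    index = defaultdict(set)
    rule_tokens = {}
    for key, text in rules.items():
        tokens = set(tokenize(text))
        rule_tokens[key] = tokens
        for token in tokens:
            index[token].add(key)
    return index, rule_tokens


def match(text, index, rule_tokens, threshold=0.8):
    """Return (rule key, score) pairs whose token coverage meets the threshold."""
    hits = defaultdict(set)
    for token in set(tokenize(text)):
        for key in index.get(token, ()):
            hits[key].add(token)
    scores = {key: len(found) / len(rule_tokens[key]) for key, found in hits.items()}
    matches = [(key, score) for key, score in scores.items() if score >= threshold]
    return sorted(matches, key=lambda item: (-item[1], item[0]))


# Hypothetical rules: short phrases standing in for full license texts.
rules = {
    "example-permissive": "Permission is hereby granted, free of charge, "
                          "to any person obtaining a copy",
    "example-copyleft": "You may copy and distribute verbatim copies of the "
                        "source code as you receive it",
}
index, rule_tokens = build_index(rules)

scanned = "# Permission is hereby granted, free of charge, to any person obtaining a copy of this software"
matches = match(scanned, index, rule_tokens)
print(matches)
```

A real detector has to handle token order, partial and overlapping matches, and a very large rule set; the sketch only shows why compiling the rules into an index first makes each scan a lookup rather than a comparison against every license text.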
55+
56+
**Q: How can I get help or contribute to ScanCode LicenseDB?**
57+
58+
A: You can chat with the AboutCode community on [Gitter](https://app.gitter.im/#/room/#aboutcode-org_discuss:gitter.im), or report issues or ask questions at https://github.com/nexB/scancode-licensedb/issues.
Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
---
2+
slug: non-vulnerable-dependency-resolution
3+
title: Non-Vulnerable Dependency Resolution
4+
authors: [tg1999]
5+
tags: [dependencies, vulnerabilities]
6+
hide_table_of_contents: false
7+
---
8+
9+
Dependencies may come with vulnerabilities that can be exploited by attackers.
10+
11+
![non-vulnerable-dependency](non-vulnerable-dependency.png)
12+
13+
Dependency resolution is the process of identifying and installing the required software packages to ensure that the software being developed runs smoothly. However, these dependencies may come with vulnerabilities that can be exploited by attackers.
14+
15+
Until now, dependency resolution and vulnerability checking have been treated as separate domains:
16+
17+
- Package management tools resolve the version expressions of a package's dependencies to concrete versions in order to install the selected versions.
18+
19+
- Security tools check whether resolved package versions are affected by known vulnerabilities (even when integrated in a package management tool).
20+
21+
As a result, the typical approach to get a non-vulnerable dependency tree is:
22+
23+
1. Resolve a dependency tree and install the resolved package versions.
24+
25+
2. For each resolved dependent package version, translate the identifiers and look in a vulnerability or bug database to determine if a version is affected by a vulnerability and which package version fixes this vulnerability, if any.
26+
27+
3. Update the vulnerable versions with fixing versions.
28+
29+
4. Repeat from step 1 until you have exhausted all possibilities. Stop on conflicts if a resolution is not possible when considering functional requirements and vulnerability fixing versions.
30+
31+
That approach is complex, tedious and time-consuming. It also suggests non-vulnerable versions for each dependency separately, without considering the functional dependency requirements. This is a waste of time and effort, as the non-vulnerable suggestion may not satisfy the functional constraints. Stated otherwise, the result may be a non-vulnerable package tree where packages do not work together and do not satisfy functional requirements, i.e., potentially non-functional software.
32+
33+
[![maven-find-transitive-dependencies](maven-dependency-tree.png)](https://www.tutorialworks.com/maven-find-transitive-dependencies/)
34+
35+
Here at nexB, we propose a new method and process to resolve software package vulnerable version ranges and dependency version constraints at the same time. This enables developers to obtain a resolved software package version tree matching the blended constraints of functional and vulnerability requirements in order to provide non-vulnerable and up-to-date software code.
36+
37+
The process would go through these typical steps:
38+
39+
1. Given an input software package, collect its direct functional dependency requirements from its manifests and/or lockfiles. Optionally, parse these requirements to normalize them as [Package-URLs](https://github.com/package-url) and [Version Range Specs](https://github.com/nexB/univers).
40+
41+
2. Fetch the known package versions set from the ecosystem package registry.
42+
43+
3. Collect known affected package version ranges and fixed ranges from a vulnerability database or service using the identifiers from step 1.
44+
45+
4. Combine the version ranges of each dependency from steps 1 and 3 into a single new version range for each dependent package.
46+
47+
5. Feed the combined ranges from step 4 as input to the dependency resolver. Obtain resolved dependencies that satisfy both constraints. The resolver may further request additional versions and ranges using the processes from steps 1 through 4 when new dependent packages are collected during the resolution process.
48+
49+
6. Obtain and output the results of the combined resolution of step 5. Report conflicts and optionally suggest conflict resolutions.
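
The core of steps 2 through 5 can be sketched for a single dependency as follows. This is a simplified illustration under stated assumptions, not the method from the publication and not the API of the univers library: versions are plain dotted integers, every range is half-open (`lower <= version < upper`), and the names `VersionRange` and `select_version` as well as all version data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def parse_version(text):
    """Parse a dotted numeric version such as '1.4.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class VersionRange:
    """Half-open range: lower <= version < upper (None means unbounded)."""
    lower: Optional[str] = None
    upper: Optional[str] = None

    def contains(self, version):
        parsed = parse_version(version)
        if self.lower is not None and parsed < parse_version(self.lower):
            return False
        if self.upper is not None and parsed >= parse_version(self.upper):
            return False
        return True


def select_version(known_versions, required, affected_ranges):
    """Pick the highest known version inside the functional range and
    outside every vulnerable range; return None on conflict."""
    candidates = [
        version
        for version in known_versions
        if required.contains(version)
        and not any(affected.contains(version) for affected in affected_ranges)
    ]
    if not candidates:
        return None
    return max(candidates, key=parse_version)


# Invented data for one dependency.
known = ["1.0.0", "1.2.0", "1.4.0", "1.4.1", "2.0.0"]   # step 2: registry versions
required = VersionRange("1.0.0", "2.0.0")                # step 1: functional range
affected = [VersionRange("1.2.0", "1.4.1")]              # step 3: vulnerable range

resolved = select_version(known, required, affected)
conflict = select_version(known, required, [VersionRange(None, "2.0.0")])
print(resolved, conflict)
```

A full resolver would apply the same combined filter to every package it encounters while walking the transitive dependency tree, and a `None` result corresponds to the conflict reporting in step 6.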
50+
51+
With this new process, we get a resolved package dependency tree with versions that satisfy both functional and vulnerability or bug constraints in a single resolution pass.
52+
53+
It’s worth noting that non-vulnerable dependency resolution is an ongoing process. Developers should regularly monitor their software packages for any newly discovered vulnerabilities and update their packages accordingly. This is particularly important when new vulnerabilities are discovered in commonly used packages, as these can have a significant impact on a wide range of software applications.
54+
55+
In conclusion, non-vulnerable dependency resolution is an essential practice that should be adopted by all developers. By selecting software packages that are free from known vulnerabilities, developers can significantly reduce the risk of security breaches in their software applications. Additionally, regularly monitoring and updating packages, as well as ensuring that packages are obtained from trusted sources, can further enhance the security of software development.
56+
57+
To understand this topic in more detail, read this defensive publication on [Non-Vulnerable Dependency Resolution](https://www.tdcommons.org/dpubs_series/5224/) from the Technical Disclosure Commons.