Commit 267e18a

Merge branch 'main' into jupinzer/promote_container_detectors_default
2 parents 329dc3d + 63dafd9 commit 267e18a

20 files changed

Lines changed: 1170 additions & 32 deletions

README.md

Lines changed: 19 additions & 6 deletions
@@ -33,21 +33,34 @@ Component Detection can also be used as a library to detect dependencies in your
 
 ## Features
 
-Component Detection supports detecting libraries from the following ecosystem:
+Component Detection supports detecting libraries from the following ecosystems:
 
 | Ecosystem | Scanning | Graph Creation |
 | -------------------------------------------------------------------------------- | ----------------------------------------------- | -------------- |
-| CocoaPods |||
-| [Go](docs/detectors/go.md) |||
+| [CocoaPods](docs/detectors/cocoapods.md) |||
+| [Conan](docs/detectors/conan.md) |||
+| [Conda (Python)](docs/detectors/conda.md) |||
+| [Docker Compose](docs/detectors/dockercompose.md) |||
+| [Dockerfile](docs/detectors/dockerfile.md) |||
+| [DotNet SDK](docs/detectors/dotnet.md) |||
+| [Go](docs/detectors/go.md) || ✔ (with Go 1.11+) |
 | [Gradle (lockfiles only)](docs/detectors/gradle.md) |||
-| [Linux (Debian, Alpine, Rhel, Centos, Fedora, Ubuntu)](docs/detectors//linux.md) | ✔ (via [syft](https://github.com/anchore/syft)) ||
+| [Helm](docs/detectors/helm.md) |||
+| [Ivy](docs/detectors/ivy.md) |||
+| [Linux (Debian, Alpine, Rhel, Centos, Fedora, Ubuntu)](docs/detectors/linux.md) | ✔ (via [syft](https://github.com/anchore/syft)) ||
 | [Maven](docs/detectors/maven.md) |||
 | [NPM (including Yarn, Pnpm)](docs/detectors/npm.md) |||
 | [NuGet (including Paket)](docs/detectors/nuget.md) |||
 | [Pip (Python)](docs/detectors/pip.md) |||
 | [Poetry (Python, lockfiles only)](docs/detectors/poetry.md) |||
-| Ruby |||
-| Rust |||
+| [Ruby](docs/detectors/ruby.md) |||
+| [Rust (Cargo)](docs/detectors/rust.md) |||
+| [SPDX SBOM](docs/detectors/spdx.md) |||
+| [Swift](docs/detectors/swift.md) |||
+| [Uv (Python)](docs/detectors/uv.md) |||
+| [Vcpkg](docs/detectors/vcpkg.md) |||
+
+See the [detectors directory](docs/detectors/README.md) for the current status (Stable, Experimental, or DefaultOff) of each individual detector.
 
 For a complete feature overview refer to [feature-overview.md](docs/feature-overview.md)
 

docs/detectors/README.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@
 | NuGetComponentDetector | Stable |
 | NuGetPackagesConfigDetector | Stable |
 | NuGetProjectModelProjectCentricComponentDetector | Stable |
-| MSBuildBinaryLogComponentDetector | DefaultOff |
+| MSBuildBinaryLogComponentDetector | Experimental |
 
 - [Pip](pip.md)
 

docs/detectors/nuget.md

Lines changed: 3 additions & 1 deletion
@@ -38,7 +38,9 @@ The `NuGetPackagesConfig` detector raises NuGet components referenced by project
 
 ## MSBuildBinaryLog
 
-The `MSBuildBinaryLog` detector is a **DefaultOff** detector intended to eventually replace both the `NuGetProjectCentric` and `DotNet` detectors. It combines MSBuild binary log (binlog) information with `project.assets.json` to provide enhanced component detection with project-level classifications.
+The `MSBuildBinaryLog` detector is an **Experimental** detector intended to eventually replace both the `NuGetProjectCentric` and `DotNet` detectors. It combines MSBuild binary log (binlog) information with `project.assets.json` to provide enhanced component detection with project-level classifications.
+
+As an experimental detector, it runs automatically whenever a scan is performed, but its results are not reported as part of the normal scan output. Instead, the results are compared against the existing `NuGetProjectCentric` and `DotNet` detectors and recorded as telemetry so maintainers can evaluate parity before promoting the detector to default.
 
 It looks for `project.assets.json` files and separately discovers `*.binlog` files. The binlog provides build-time context that isn't available from `project.assets.json` alone.
 

docs/feature-overview.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 | NPM | <ul><li>package.json</li><li>package-lock.json</li><li>npm-shrinkwrap.json</li><li>lerna.json</li></ul> | - | ✔ (dev-dependencies in package.json, dev flag in package-lock.json) ||
 | Yarn (v1, v2) | <ul><li>package.json</li><li>yarn.lock</li></ul> | - | ✔ (dev-dependencies in package.json) ||
 | Pnpm | <ul><li>shrinkwrap.yaml</li><li>pnpm-lock.yaml</li></ul> | - | ✔ (packages/{package}/dev flag) ||
-| NuGet | <ul><li>project.assets.json</li><li>*.nupkg</li><li>*.nuspec</li><li>packages.config</li><li>nuget.config</li><li>*.binlog (DefaultOff)</li></ul> | - | - | ✔ (required project.assets.json) |
+| NuGet | <ul><li>project.assets.json</li><li>*.nupkg</li><li>*.nuspec</li><li>packages.config</li><li>nuget.config</li><li>*.binlog (Experimental)</li></ul> | - | - | ✔ (required project.assets.json) |
 | Pip (Python) | <ul><li>setup.py</li><li>requirements.txt</li><li>*setup=distutils.core.run_setup({setup.py}); setup.install_requires*</li><li>dist package METADATA file</li></ul> | <ul><li>Python 2 or Python 3</li><li>Internet connection</li></ul> |||
 | Poetry (Python) | <ul><li>poetry.lock</li><ul> | - |||
 | Ruby | <ul><li>gemfile.lock</li></ul> | - |||

src/Microsoft.ComponentDetection.Common/DockerReference/DockerReferenceUtility.cs

Lines changed: 96 additions & 5 deletions
@@ -27,7 +27,10 @@
 namespace Microsoft.ComponentDetection.Common;
 
 using System;
+using System.Buffers;
+using System.Collections.Generic;
 using System.Diagnostics.CodeAnalysis;
+using System.Text.RegularExpressions;
 using Microsoft.ComponentDetection.Contracts;
 using Microsoft.Extensions.Logging;
 

@@ -39,18 +42,55 @@ public static class DockerReferenceUtility
     private const string LEGACYDEFAULTDOMAIN = "index.docker.io";
     private const string OFFICIALREPOSITORYNAME = "library";
 
+    // Delimiters that only appear in an image reference as part of an unresolved templating
+    // token: '$', '{' and '}' cover shell / Helm / Go-template placeholders (e.g. ${VAR},
+    // {{ .Values.tag }}). These are recognized templating syntaxes expected in un-rendered manifests,
+    // so TryParseImageReference skips them (logging a warning) rather than treating them as invalid.
+    // A token wrapped in matching '#' or '!' (handled by DelimiterWrappedTokenRegex) is treated the same way.
+    // When no templating token is present, stray invalid characters (e.g. a single '#' or '!') are reported
+    // via GetInvalidReferenceCharacters.
+    private static readonly char[] TemplateDelimiters = ['$', '{', '}'];
+
+    // Matches token-replacement placeholders that wrap an identifier in double underscores,
+    // e.g. __IMAGE_TAG__ or __MCR_ENDPOINT__. Without this they parse as an uppercase repository
+    // name and surface as a noisy parse failure instead of being skipped as a templated value.
+    private static readonly Regex DoubleUnderscoreTokenRegex = new(@"__\w+__");
+
+    // Matches token-replacement placeholders wrapped in a matching '#' or '!', e.g. #imageTag#,
+    // #cs_containerRegistryLoginServerUrl#, or !imageTag!. A string surrounded by the same '#' or
+    // '!' delimiter is almost always an unsubstituted template variable (Azure DevOps token
+    // replacement and similar), so it is skipped (and may be logged as a warning) instead of
+    // surfacing as a misleading docker-reference parse failure. The backreference requires the closing delimiter to match
+    // the opening one, so a mismatched stray '#' or '!' is left to GetInvalidReferenceCharacters.
+    private static readonly Regex DelimiterWrappedTokenRegex = new(@"([#!])[^#!]+\1");
+
+    // Every character permitted anywhere in a docker reference per the grammar at the top of this
+    // file: alphanumerics, the separators '.', '_' and '-', the path separator '/', the tag/port
+    // and digest separators ':' and '@', and the digest-algorithm separator '+'. Anything else
+    // (e.g. '#', '!') comes from unsubstituted template tokens and is reported as invalid.
+    private static readonly SearchValues<char> ValidReferenceChars = SearchValues.Create(
+        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-/:@+");
+
     /// <summary>
-    /// Returns true if the reference contains unresolved variable placeholders (e.g., ${VAR}, {{ .Values.tag }}).
-    /// Such references should be skipped before calling <see cref="ParseFamiliarName"/> or <see cref="ParseQualifiedName"/>.
+    /// Returns true if the reference contains unresolved variable or templating placeholders,
+    /// e.g. <c>${VAR}</c>, <c>{{ .Values.tag }}</c>, <c>__IMAGE_TAG__</c>, <c>#imageTag#</c>, or
+    /// <c>!imageTag!</c>.
+    /// Such references are not real, resolvable images, so they should be skipped before calling
+    /// <see cref="ParseFamiliarName"/> or <see cref="ParseQualifiedName"/> and treated as
+    /// unresolved values rather than reported as parse failures.
     /// </summary>
     /// <param name="reference">The image reference string to check.</param>
     /// <returns><c>true</c> if the reference contains variable placeholder characters; otherwise <c>false</c>.</returns>
     public static bool HasUnresolvedVariables(string reference) =>
-        reference.IndexOfAny(['$', '{', '}']) >= 0;
+        reference.IndexOfAny(TemplateDelimiters) >= 0 ||
+        DoubleUnderscoreTokenRegex.IsMatch(reference) ||
+        DelimiterWrappedTokenRegex.IsMatch(reference);
 
     /// <summary>
     /// Attempts to parse an image reference string into a <see cref="DockerReference"/>.
-    /// Returns <c>null</c> if the reference contains unresolved variables or cannot be parsed.
+    /// Returns <c>null</c> if the reference contains unresolved variables, contains characters that
+    /// are not valid in a docker reference, or otherwise cannot be parsed. A warning is logged in
+    /// every skip/failure case so that references which are not scanned remain visible in logs.
     /// </summary>
     /// <param name="imageReference">The image reference string to parse.</param>
     /// <param name="logger">Optional logger for recording parse failures.</param>
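The placeholder check added in this hunk can be tried outside the repository. The following is a minimal sketch, assuming a console project that allows top-level statements. `PlaceholderSketch` is a hypothetical stand-in, not the repository's `DockerReferenceUtility`; it copies only the three conditions and the two regex patterns shown in the diff.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample references; the comment on each line is the value the check returns.
string[] samples =
{
    "nginx:1.25",              // False: nothing that looks like a placeholder
    "${REGISTRY}/app:latest",  // True: contains '$', '{' and '}'
    "app:__IMAGE_TAG__",       // True: identifier wrapped in double underscores
    "app:#imageTag#",          // True: token wrapped in matching '#'
    "app:!imageTag!",          // True: token wrapped in matching '!'
    "app#1",                   // False: a lone '#' has no closing partner
    "app:#tag!",               // False: opening and closing delimiters differ
};

foreach (var sample in samples)
{
    Console.WriteLine($"{sample} -> {PlaceholderSketch.HasUnresolvedVariables(sample)}");
}

// Hypothetical stand-in that mirrors the logic shown in the diff.
static class PlaceholderSketch
{
    // Shell / Helm / Go-template delimiters.
    private static readonly char[] TemplateDelimiters = new[] { '$', '{', '}' };

    // __NAME__ style tokens.
    private static readonly Regex DoubleUnderscoreTokenRegex = new(@"__\w+__");

    // #name# or !name! tokens; \1 forces the closing delimiter to equal the opening one.
    private static readonly Regex DelimiterWrappedTokenRegex = new(@"([#!])[^#!]+\1");

    public static bool HasUnresolvedVariables(string reference) =>
        reference.IndexOfAny(TemplateDelimiters) >= 0 ||
        DoubleUnderscoreTokenRegex.IsMatch(reference) ||
        DelimiterWrappedTokenRegex.IsMatch(reference);
}
```

Both patterns are unanchored, so they match anywhere in the string. One consequence worth checking during review: a name such as `a__b__c` also matches `__\w+__`, even though the reference grammar appears to permit a double underscore as a separator inside a repository path component.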
@@ -59,6 +99,19 @@ public static bool HasUnresolvedVariables(string reference) =>
     {
         if (HasUnresolvedVariables(imageReference))
         {
+            logger?.LogWarning(
+                "Skipping image reference '{ImageReference}' because it contains one or more unresolved template tokens or variable placeholders.",
+                imageReference);
+            return null;
+        }
+
+        var invalidCharacters = GetInvalidReferenceCharacters(imageReference);
+        if (invalidCharacters.Length > 0)
+        {
+            logger?.LogWarning(
+                "Skipping image reference '{ImageReference}' because it contains character(s) that are not valid in a docker reference: {InvalidCharacters}",
+                imageReference,
+                invalidCharacters);
             return null;
         }
 

@@ -76,7 +129,7 @@ public static bool HasUnresolvedVariables(string reference) =>
     /// <summary>
     /// Parses an image reference and registers it with the recorder if valid.
     /// Skips references with unresolved variables or that cannot be parsed,
-    /// logging a warning for parse failures so that remaining entries continue to be processed.
+    /// logging a warning in each skipped case so that remaining entries continue to be processed.
     /// </summary>
     /// <param name="imageReference">The image reference string to parse.</param>
     /// <param name="recorder">The component recorder to register the image with.</param>
@@ -228,6 +281,44 @@ public static DockerReference ParseAll(string name)
         return ParseFamiliarName(name);
     }
 
+    /// <summary>
+    /// Returns the distinct characters in <paramref name="reference"/> that are not valid in any
+    /// part of a docker reference (domain, repository, tag, or digest) as a comma-separated string,
+    /// or an empty string when every character is valid. Characters such as <c>#</c> and <c>!</c>
+    /// commonly appear in unsubstituted template tokens and otherwise surface as misleading
+    /// "must be lowercase" or "invalid reference format" parse errors.
+    /// </summary>
+    /// <param name="reference">The image reference string to inspect.</param>
+    /// <returns>A comma-separated list of invalid characters, or an empty string if there are none.</returns>
+    private static string GetInvalidReferenceCharacters(string reference)
+    {
+        // Vectorized happy-path check: the overwhelmingly common case is an all-valid reference,
+        // for which this returns without allocating. Only gather the offending characters when
+        // at least one is present.
+        var span = reference.AsSpan();
+        if (!span.ContainsAnyExcept(ValidReferenceChars))
+        {
+            return string.Empty;
+        }
+
+        SortedSet<char> invalid = [];
+        foreach (var c in span)
+        {
+            if (!ValidReferenceChars.Contains(c))
+            {
+                invalid.Add(c);
+            }
+        }
+
+        var invalidStrings = new List<string>(invalid.Count);
+        foreach (var c in invalid)
+        {
+            invalidStrings.Add($"'{c}'");
+        }
+
+        return string.Join(", ", invalidStrings);
+    }
+
     private static DockerReference CreateDockerReference(Reference options)
     {
         return DockerReference.CreateDockerReference(options.Repository, options.Domain, options.Digest, options.Tag);
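To see what the new invalid-character report looks like in practice, here is a minimal sketch, assuming .NET 8 or later (for `SearchValues<char>` and `ContainsAnyExcept`) and a console project with top-level statements. `InvalidCharSketch` and `Describe` are hypothetical names, not part of the repository, and the sketch uses LINQ's `Select` in place of the explicit `List<string>` loop from the diff.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Linq;

// An all-valid reference produces an empty string.
Console.WriteLine($"[{InvalidCharSketch.Describe("nginx:1.25")}]");       // prints "[]"

// Stray characters are reported once each, sorted by character code.
Console.WriteLine($"[{InvalidCharSketch.Describe("repo/app:tag#1!#")}]"); // prints "['!', '#']"

// Hypothetical stand-in that mirrors GetInvalidReferenceCharacters from the diff.
static class InvalidCharSketch
{
    // Same allow-list as in the diff: alphanumerics plus . _ - / : @ +
    private static readonly SearchValues<char> ValidReferenceChars = SearchValues.Create(
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-/:@+");

    public static string Describe(string reference)
    {
        var span = reference.AsSpan();

        // Fast path: every character is in the allow-list, nothing to collect.
        if (!span.ContainsAnyExcept(ValidReferenceChars))
        {
            return string.Empty;
        }

        // SortedSet removes duplicates and yields a stable, sorted order.
        var invalid = new SortedSet<char>();
        foreach (var c in span)
        {
            if (!ValidReferenceChars.Contains(c))
            {
                invalid.Add(c);
            }
        }

        return string.Join(", ", invalid.Select(c => $"'{c}'"));
    }
}
```

In `TryParseImageReference` this check runs only after `HasUnresolvedVariables` has returned `false`, so a wrapped token such as `#imageTag#` is logged as a placeholder and never reaches the invalid-character report.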
