16 changes: 12 additions & 4 deletions CHANGELOG.md
@@ -7,17 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Deterministic Artifactory boundary probe: at install time, `_resolve_artifactory_boundary` HEAD-probes the candidate archive URLs and rebuilds the dependency reference at the proxy-verified split. Covers both explicit-FQDN (`host/artifactory/key/owner/repo/...`) and bare-shorthand-under-proxy deps. Distinguishes "missing repo" from auth (401/403) errors; uses `allow_redirects=False` so the bearer token can't leak cross-host. Mirrors the native GitLab probing pattern but without a separate metadata API. (#1472)
- `ref:` on git-source dependencies now accepts semver ranges (`^1.2.0`, `~1.4`, `>=2.0 <3`, `1.5.x`). `apm install` runs `git ls-remote`, picks the highest tag matching the range, and pins the resolved tag, commit SHA, version, and original constraint in `apm.lock.yaml`. Subsequent installs replay the lockfile without network; use `apm install --update` to re-resolve against current remote tags. Two tag patterns are tried in order (`v{version}`, `{name}--v{version}`) with a bare `{version}` fallback. (closes #1488)
- `apm deps why <package>` explains why a transitive dependency is installed by walking the lockfile's `resolved_by` chain back to the user's direct declaration in `apm.yml`. Supports `--global` for user-scope lockfiles and `--json` for scriptable output (JSON to stdout, all logs to stderr; analogue of `npm why` / `yarn why`). Exits `0` on success, `1` when the package isn't installed or the query is ambiguous, `2` when no lockfile exists. (#1490)

### Changed

- **BREAKING:** `apm install` now exits `1` whenever the diagnostic summary reports `Installation failed with N error(s)`. Previously the command exited `0` even after reporting errors, so CI could not detect failure via exit code. `--force` continues to bypass only the security scan's critical-finding block; it does **not** suppress general install errors (matches `npm` / `pip` / `cargo`). Callers that asserted `exit_code == 0` while errors were reported must update.
- Parse-time Artifactory boundary detection no longer uses directory-marker heuristics (`skills/`, `prompts/`, `agents/`, `collections/`, `instructions/`). The install-time resolver is authoritative for the (owner, repo, virtual_path) split. Parse-time defaults differ by mode:
  - **Explicit FQDN** (`host/artifactory/key/owner/repo/...`): `parse_artifactory_path` stays intentionally shallow -- `owner` = first segment after the prefix, `repo` = next segment, remainder = `virtual_path`.
  - **Bare shorthand under `PROXY_REGISTRY_ONLY`**: `_bare_shorthand_repo_segment_count` defaults to all-as-repo with a structural file-extension rule on the last segment (a path ending in `.prompt.md`/`.instructions.md`/`.chatmode.md`/`.agent.md` is by shape a virtual file; everything before it is the repo).

  Both modes converge at install time: `_resolve_artifactory_boundary` HEAD-probes candidate splits and rebuilds the dependency reference at the proxy-verified split. The `_VIRTUAL_PATH_ROOT_SEGMENTS`, `_ARTIFACTORY_VIRTUAL_MARKERS`, and `_ARTIFACTORY_VIRTUAL_FILE_EXTENSIONS` constants are removed. (#1472)

### Fixed

- `apm install` against a registry proxy now works for GitLab nested-group repos (3+ path segments, e.g. `group/subgroup/project`). Previously the proxy resolver guessed `owner/repo` from the first two path segments and treated the rest as an in-repo virtual sub-path, so the downloader requested the wrong archive URL and the install failed with HTTP 404. The new install-time probe HEAD-walks candidate splits against the proxy and locks in the first one whose archive responds, so nested shorthand (`apm install <host>/artifactory/<key>/<group>/<subgroup>/<project>` or the bare-shorthand form under `PROXY_REGISTRY_URL` + `PROXY_REGISTRY_ONLY=1`) just works. The probe distinguishes auth (401/403) from missing-repo (4xx) so a misconfigured token surfaces as an auth problem instead of a "missing repo", and runs with `allow_redirects=False` so the bearer token cannot follow a redirect off the proxy host. When the proxy is unreachable, the `//` notation can mark the repo/virtual boundary explicitly as an escape hatch. (#1472)
- URL-form Artifactory deps no longer round-trip with the `artifactory/<key>` prefix folded into `repo_url`. The duplicated prefix caused the downloader to construct double-prefixed archive URLs (`/artifactory/key/artifactory/key/owner/repo/...`) and 404. `_validate_url_repo_path` now strips the Artifactory VCS prefix before returning the bare `owner/repo` slug; the prefix is still recovered separately via `_extract_artifactory_prefix`. (#1472)
- `apm install --update` now re-resolves direct git-source semver dependencies. Previously, when the dependency's install path already existed on disk, the BFS resolver short-circuited and `--update` was a silent no-op for git-semver refs; the lockfile kept the previously-resolved tag.
- `policy.dependencies.require_pinned_constraint: true` no longer misclassifies the npm- and cargo-style explicit-equality form `=1.2.3` as `BARE_BRANCH`. Both `1.2.3` and `=1.2.3` are now recognized as pinned constraints; the pip-style `==1.2.3` form is still rejected (not part of node-semver). Follow-up to #1494 / #1505.

127 changes: 127 additions & 0 deletions docs/src/content/docs/enterprise/registry-proxy.md
@@ -111,6 +111,129 @@ rewrites `apm.lock.yaml`.
| `apm marketplace` (`marketplace.json` fetch) | Yes; falls back to GitHub Contents API unless `PROXY_REGISTRY_ONLY=1` |
| Policy file fetch (`apm-policy.yml`) | No -- uses the GitHub API directly |

### Nested-group repos (GitLab subgroups behind the proxy)

GitHub uses a fixed `owner/repo` shape, but GitLab projects can sit at any
subgroup depth (e.g. `group/subgroup/project`). When
`PROXY_REGISTRY_ONLY=1` is set, APM treats path segments past the second
as part of the repo slug; the real boundary between repo path and
in-repo virtual sub-path is then settled at install time by the same
deterministic boundary probe used for explicit FQDN deps (see
[Explicit Artifactory FQDN](#explicit-artifactory-fqdn-deterministic-boundary-probe)
below):

```yaml
# apm.yml -- 3-segment GitLab project behind a registry proxy
dependencies:
  apm:
    - group/subgroup/project#main # resolves to the full nested path
    - group/sub-a/sub-b/project#v1.2.0 # arbitrary depth supported
```

Virtual sub-paths under nested-group repos work via the probe: parse
defaults to all-as-repo, then the install-time resolver HEAD-probes
candidate splits against the proxy and rebuilds the dependency
reference at the first split whose archive responds 2xx-3xx:

```yaml
dependencies:
  apm:
    # The probe walks shallow-first and lands on the real boundary --
    # ``group/subgroup/project`` is the repo, ``skills/<name>`` is the
    # virtual sub-path -- no marker-segment heuristic involved.
    - group/subgroup/project/skills/<name>
    # Files ending in ``.prompt.md`` / ``.instructions.md`` /
    # ``.chatmode.md`` / ``.agent.md`` are structurally a virtual file
    # at parse time; the probe still confirms which of the directories
    # above the file belong to the repo path.
    - group/subgroup/project/<name>.prompt.md
```
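The parse-time default for bare shorthand can be pictured with a small sketch. This is an illustration only, not APM's code: the helper name `split_bare_shorthand` and the constant `VIRTUAL_FILE_SUFFIXES` are hypothetical, and ref suffixes such as `#main` are assumed to have been stripped already.

```python
# Hypothetical illustration of the parse-time default; not APM's implementation.
VIRTUAL_FILE_SUFFIXES = (
    ".prompt.md",
    ".instructions.md",
    ".chatmode.md",
    ".agent.md",
)


def split_bare_shorthand(shorthand: str) -> tuple[str, str]:
    """Return (repo_path, virtual_path) before any install-time probing.

    Everything is treated as repo path unless the last segment has one of
    the structural file extensions, in which case that segment alone is the
    virtual file.
    """
    segments = [s for s in shorthand.split("/") if s]
    if len(segments) > 1 and segments[-1].endswith(VIRTUAL_FILE_SUFFIXES):
        return "/".join(segments[:-1]), segments[-1]
    return "/".join(segments), ""
```

In this sketch a directory-style path such as `group/subgroup/project/skills/<name>` stays all-as-repo at parse time; only the install-time probe can move the boundary.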

Probe authentication matches the URL being probed: bare-shorthand deps
(Mode 2) use the proxy's own bearer token from `PROXY_REGISTRY_TOKEN`,
while explicit-FQDN deps (Mode 1) use the per-host auth resolver -- in
both cases the audience matches the probed URL, never the upstream Git
host.

#### Trade-off: lockfile env-dependence

The fold-into-repo behavior is gated on `PROXY_REGISTRY_ONLY` to keep the
legacy two-segment shape for direct installs. Consequence: the same
shorthand parses differently with vs. without the env set. For a team
that always runs through the proxy, this is invisible. For a mixed CI
fleet, expect lockfile drift if some agents have the env and others
don't -- pin the env in the same place you pin Python and APM versions.

#### Configuring the upstream remote (GitLab)

When the proxy fronts a private GitLab instance, the proxy itself must
authenticate upstream -- the client (APM) does not need a token if the
proxy is configured to accept anonymous reads on its API.

In the Artifactory UI, for the remote pointing at GitLab:

| Field | Value |
|---|---|
| URL | `https://<gitlab-host>` (no path prefix) |
| Repository Path Prefix | *blank* (any value gets prepended to every upstream request) |
| Username | empty *or* the GitLab username |
| Password / Token | the raw GitLab PAT value -- no `PRIVATE-TOKEN:` prefix |
| Token Authentication | enable when the password is a GitLab PAT |
| VCS Provider | `GitLab` |

The PAT must carry **`read_repository`** scope -- `read_api` alone does
not permit `/-/archive/` downloads. Verify directly against GitLab
before saving on the remote:

```bash
curl -sI -H "PRIVATE-TOKEN: $PAT" \
"https://<gitlab-host>/<group>/<project>/-/archive/<ref>/<basename>-<ref>.zip" \
| head -3
# Want: HTTP/1.1 200 OK + Content-Type: application/zip
```

#### Default branch gotcha

APM defaults to `main` when no ref is provided. GitLab projects whose
default branch is still `master` will return HTTP 404 for every archive
URL APM tries. Pin the ref in `apm.yml` (`<repo>#master`) when the
project hasn't been renamed.

#### Explicit Artifactory FQDN: deterministic boundary probe

When a dep is written with the full proxy URL --
`<host>/artifactory/<key>/<owner>/<repo>[/<more>]` -- parse time applies
an intentionally shallow split: `owner` is the first segment after the
`artifactory/<key>` prefix, `repo` is the next segment, and the remainder
is the virtual path. The real boundary is settled at install time by an
authoritative resolver that mirrors APM's native GitLab probing pattern,
without a separate metadata API:

1. Enumerate every plausible `(owner, repo, virtual_path)` split
shallow-first.
2. `HEAD` each candidate's archive URL on the proxy (no follow on
redirects, so the bearer token can't leak cross-host).
3. The first candidate that responds 2xx-3xx wins; the dependency
reference is rebuilt at that boundary and persisted to `apm.yml` as
a structured `git:` + `path:` entry.

If every candidate is rejected the resolver raises -- there is no
silent fallback to the parse-time guess:

| Result | Behaviour |
|---|---|
| Single candidate (e.g. `host/artifactory/key/owner/repo`) | Parse-time ref returned unchanged; no HEAD probe issued. |
| All candidates `4xx` (excluding 401/403) | `ValueError: ... did not resolve to a reachable repository archive` |
| All candidates `401`/`403` | `ValueError: ... authentication problem, not a missing repo` -- check the token's read scope. |
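The steps and outcomes above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the real `_resolve_artifactory_boundary`: the function names are hypothetical, "shallow-first" is assumed to mean shortest repo path first, and the `head` callable stands in for a `HEAD` request issued with redirects disabled. Mixed or non-4xx failures are not covered by the table; the sketch treats them as unreachable.

```python
from typing import Callable, Iterator

Split = tuple[str, str, str]  # (owner, repo, virtual_path)


def candidate_splits(owner: str, rest: list[str]) -> Iterator[Split]:
    """Yield every (owner, repo, virtual_path) split, shortest repo first."""
    for i in range(1, len(rest) + 1):
        yield owner, "/".join(rest[:i]), "/".join(rest[i:])


def resolve_boundary(
    owner: str,
    rest: list[str],
    head: Callable[[str, str], int],  # hypothetical: returns an HTTP status
) -> Split:
    candidates = list(candidate_splits(owner, rest))
    if len(candidates) == 1:
        # Single candidate: nothing to disambiguate, so no probe is issued.
        return candidates[0]

    statuses = []
    for cand in candidates:
        status = head(cand[0], cand[1])
        if 200 <= status < 400:
            return cand  # first reachable archive wins
        statuses.append(status)

    if all(s in (401, 403) for s in statuses):
        raise ValueError("authentication problem, not a missing repo")
    # Assumption: any other combination of failures counts as unreachable.
    raise ValueError("did not resolve to a reachable repository archive")
```

Injecting `head` keeps the sketch testable without a network; a real probe would build the archive URL for each candidate and send the request with `allow_redirects=False`.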

To opt out of probing -- e.g. when the proxy is offline at install time
or when you want a deterministic byte-for-byte string -- use the
explicit `//` boundary marker, which short-circuits the resolver to a
single candidate:

```text
<host>/artifactory/<key>/<owner>/<deep>/<slug>//<virtual/path>
```
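As a rough illustration of why the marker removes the need for probing, the split can be read straight off the string. The helper below is hypothetical (not an APM function) and assumes a scheme-less reference, so the only `//` present is the boundary marker.

```python
from typing import Optional


def split_explicit_boundary(ref: str) -> Optional[tuple[str, str]]:
    """Return (repo_part, virtual_path) when ``ref`` has a ``//`` marker.

    Returns None when there is no marker, in which case the caller would
    fall back to probing. Assumes ``ref`` carries no ``https://`` scheme.
    """
    repo_part, marker, virtual_path = ref.partition("//")
    if not marker:
        return None
    return repo_part.rstrip("/"), virtual_path.strip("/")
```

With the marker present there is exactly one candidate, which corresponds to the "no HEAD probe issued" row in the table above.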

When a surface is not proxy-routed and `PROXY_REGISTRY_ONLY=1` is set, APM
aborts rather than silently fetching directly.

@@ -157,6 +280,10 @@ and `apm cache clean`.
| `git clone` hangs through the proxy | `HTTPS_PROXY` not set in the env that runs `git` | Export it in the shell that invokes `apm install`; CI secrets often miss this |
| `DeprecationWarning: ARTIFACTORY_BASE_URL is deprecated` | Legacy env names | Rename to `PROXY_REGISTRY_*` |
| Plaintext-token warning on proxy startup | Token sent over `http://` | Use `https://`, or set `PROXY_REGISTRY_ALLOW_HTTP=1` if the link is internal-only |
| `Invalid zip archive` with a body that starts `<!DOCTYPE html>` and is ~17KB | Upstream returned a sign-in page; proxy cached the HTML | Configure upstream credentials on the registry remote, purge the cache, then refetch |
| 3-segment dep (`group/sub/project`) fails with HTTP 404 from the proxy | APM treated `project` as a virtual sub-path | Set `PROXY_REGISTRY_ONLY=1`; see [Nested-group repos](#nested-group-repos-gitlab-subgroups-behind-the-proxy) |
| HTTP 404 on every ref of an existing GitLab project | Default branch is `master`, APM defaults to `main` | Pin the ref: `<repo>#master` in `apm.yml` |
| Upstream URL in `X-Artifactory-Origin-Remote-Path` has a duplicated group name | The remote's "Repository Path Prefix" is prepending a segment that's also in the request | Clear the prefix field on the remote |

For fully disconnected CI (no proxy reach at all), build a bundle on a
connected host with `apm pack` and restore offline. See
2 changes: 1 addition & 1 deletion docs/src/content/docs/reference/manifest-schema.md
@@ -358,7 +358,7 @@ dependencies:

#### 4.1.2. Object Form

REQUIRED when the shorthand is ambiguous (e.g. nested-group repos with virtual paths).
REQUIRED when the shorthand is ambiguous (e.g. direct nested-group repos with virtual paths). NOT required for nested-group deps that route through a registry proxy (explicit `host/artifactory/<key>/...` FQDN, or bare shorthand under `PROXY_REGISTRY_URL` + `PROXY_REGISTRY_ONLY=1`): the install-time boundary probe HEAD-walks candidate splits against the proxy and locks in the first one whose archive responds. See [Registry proxy guide](../../enterprise/registry-proxy/#nested-group-repos-gitlab-subgroups-behind-the-proxy).

| Field | Type | Required | Pattern / Constraint | Description |
|---|---|---|---|---|
2 changes: 1 addition & 1 deletion packages/apm-guide/.apm/skills/apm-usage/dependencies.md
@@ -133,7 +133,7 @@ Virtual packages reference a subset of a repository.

Classification is by extension only. A path like `owner/repo/collections/security` (no extension) is a Subdirectory; the actual shape -- APM package (incl. dep-only `apm.yml` with no `.apm/`), skill bundle, or plugin -- is resolved at fetch time by probing for `apm.yml`.

**Gitea and Gogs (self-hosted or vendor-hosted):** virtual packages resolve via the host's `/{owner}/{repo}/raw/{ref}/{path}` URL first, then fall back to the Contents API (v1 native, v3 Gogs-compat). GitLab nested-group repos (`group/subgroup/repo`) require the object form (`git: <full-url>`, `path: <virtual>`) -- shorthand is ambiguous on >2-segment paths.
**Gitea and Gogs (self-hosted or vendor-hosted):** virtual packages resolve via the host's `/{owner}/{repo}/raw/{ref}/{path}` URL first, then fall back to the Contents API (v1 native, v3 Gogs-compat). Direct GitLab nested-group repos (`group/subgroup/repo`) require the object form (`git: <full-url>`, `path: <virtual>`) -- shorthand is ambiguous on >2-segment paths. **Exception:** when the dep routes through a registry proxy (explicit `host/artifactory/<key>/...` FQDN, or bare shorthand under `PROXY_REGISTRY_URL` + `PROXY_REGISTRY_ONLY=1`), the install-time boundary probe HEAD-walks the candidate splits against the proxy and locks in the first one whose archive responds, so nested-group shorthand works without the object form (#1472).

> **Removed (#1094):** the legacy `.collection.yml` / `.collection.yaml` virtual-package form is no longer supported. Convert any `.collection.yml` to an `apm.yml` with a `dependencies:` section, then reference the resulting subdirectory as a regular subdirectory virtual package.

7 changes: 5 additions & 2 deletions src/apm_cli/commands/install.py
@@ -12,6 +12,7 @@

import click

from apm_cli.install.artifactory_resolver import _resolve_artifactory_boundary
from apm_cli.install.errors import (
AuthenticationError,
DirectDependencyError,
@@ -461,13 +462,15 @@ def warning_handler(msg):

# Canonicalize input
try:
dep_ref, direct_gitlab_virtual_resolved = resolve_parsed_dependency_reference(
dep_ref, direct_virtual_resolved = resolve_parsed_dependency_reference(
package,
marketplace_dep_ref,
dependency_reference_cls=DependencyReference,
try_resolve_gitlab_direct_shorthand=_try_resolve_gitlab_direct_shorthand,
resolve_artifactory_boundary=_resolve_artifactory_boundary,
auth_resolver=auth_resolver,
verbose=bool(logger and logger.verbose),
logger=logger,
)
canonical = dep_ref.to_canonical()
identity = dep_ref.get_identity()
@@ -483,7 +486,7 @@ def warning_handler(msg):
_seen.add(_s)
_normalized.append(_s)
dep_ref.skill_subset = _normalized
if marketplace_dep_ref is not None or direct_gitlab_virtual_resolved:
if marketplace_dep_ref is not None or direct_virtual_resolved:
_apm_yml_entries[canonical] = dependency_reference_to_yaml_entry(dep_ref)
except ValueError as e:
reason = str(e)
11 changes: 8 additions & 3 deletions src/apm_cli/deps/artifactory_orchestrator.py
@@ -175,12 +175,15 @@ def _resolve_host_prefix(
@staticmethod
def _split_owner_repo(dep_ref: DependencyReference) -> tuple[str, str]:
repo_parts = dep_ref.repo_url.split("/")
if len(repo_parts) < 2 or not repo_parts[0] or not repo_parts[1]:
if len(repo_parts) < 2 or not all(repo_parts):
raise ValueError(
f"Invalid Artifactory repo reference '{dep_ref.repo_url}': "
"expected 'owner/repo' format"
)
return repo_parts[0], repo_parts[1]
# Owner is the top-level namespace; the remainder of the path is the
# project slug. For GitLab projects behind an Artifactory VCS proxy
# the slug can include subgroups (e.g. ``group/subgroup/project``).
return repo_parts[0], "/".join(repo_parts[1:])

@staticmethod
def _progress(progress_obj, progress_task_id, *, completed: int, total: int = 100) -> None:
@@ -257,7 +260,9 @@ def download_subdirectory(
subdir_path = dep_ref.virtual_path
repo_parts = dep_ref.repo_url.split("/")
owner = repo_parts[0]
repo = repo_parts[1] if len(repo_parts) > 1 else repo_parts[0]
# Preserve subgroup nesting (GitLab via proxy) by folding everything
# past the owner into the repo slug.
repo = "/".join(repo_parts[1:]) if len(repo_parts) > 1 else repo_parts[0]
host, prefix, scheme = proxy_info

self._progress(progress_obj, progress_task_id, completed=10)