changelog: source bundle entries from the CDN with a per-product registry #3470
cotti wants to merge 1 commit
Source the individual changelog entries that make up a bundle from the
public CDN by default, scoped to the bundle's product(s), instead of the
local folder. Because the bundle filter is content-based (an entry's
products/prs/issues live inside the YAML, not its name) and CloudFront has
no ListObjects, a per-product entry index ({product}/changelog/registry.json)
is published on upload so the client can enumerate then fetch+filter.
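The enumerate, fetch, and filter flow described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PR's C# implementation: the registry shape (an `entries` list of file names), the entry URL layout, and the `source_entries`, `fetch_text`, and `parse_entry` names are all assumptions made for the example. Only the `{product}/changelog/registry.json` path comes from the commit message.

```python
import json
from typing import Callable


def registry_url(base: str, product: str) -> str:
    # Per-product entry index path named in the commit message.
    return f"{base.rstrip('/')}/{product}/changelog/registry.json"


def source_entries(
    base: str,
    product: str,
    wanted_products: set,
    fetch_text: Callable[[str], str],
    parse_entry: Callable[[str], dict],
) -> list:
    # 1. Enumerate: the CDN cannot list objects, so read the published index.
    #    ASSUMPTION: the index is {"entries": ["<file name>", ...]}.
    index = json.loads(fetch_text(registry_url(base, product)))

    selected = []
    for name in index["entries"]:
        # 2. Fetch each listed entry.
        #    ASSUMPTION: entries sit next to the index under {product}/changelog/.
        url = f"{base.rstrip('/')}/{product}/changelog/{name}"
        entry = parse_entry(fetch_text(url))

        # 3. Filter on content: membership is decided by fields inside the
        #    entry (here only `products`), never by the file name.
        if wanted_products & set(entry.get("products", [])):
            selected.append(entry)
    return selected
```

Passing `fetch_text` and `parse_entry` in as callables keeps the sketch independent of any HTTP or YAML library; a real client would also filter on `prs` and `issues`.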
- Add bundle.use_local_changelogs opt-out, plus automatic local fallback
when no concrete product can scope the per-product CDN fetch.
- Extend RegistryBuilder/RegistryKey to write and pass through the entry
index (the scrubber recognizes {product}/changelog/registry.json).
- Add CdnChangelogEntryFetcher and centralize CDN base resolution in
ChangelogCdn (shared with the changelog directive's cdn: mode).
- Emit cdn_url from `changelog bundle --plan` so CI can poll for the
scrubbed bundle ({base}/{product}/bundle/{file}).
- Harden entry sourcing: a registry-listed entry that has not yet
propagated to the CDN is retried with short backoff and cache-busting;
a persistent miss fails the bundle instead of silently shipping an
incomplete release.
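The hardening in the last item (retry with short backoff and cache-busting, then fail on a persistent miss) could look roughly like the sketch below. It is illustrative Python rather than the PR's code: `fetch_with_retry`, `EntryNotPropagated`, the `cb` query parameter, the attempt count, and the delays are all hypothetical, and `fetch` is assumed to return `None` when the CDN does not have the object yet.

```python
import time
from typing import Callable, Optional


class EntryNotPropagated(Exception):
    """A registry-listed entry never appeared on the CDN."""


def fetch_with_retry(
    url: str,
    fetch: Callable[[str], Optional[str]],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    for attempt in range(attempts):
        # Cache-busting: vary the query string per attempt so a cached miss
        # at the edge is not simply replayed. The `cb` name is hypothetical.
        separator = "&" if "?" in url else "?"
        body = fetch(f"{url}{separator}cb={time.time_ns()}-{attempt}")
        if body is not None:
            return body
        if attempt < attempts - 1:
            # Short exponential backoff: 0.5s, 1s, 2s with the defaults.
            sleep(base_delay * 2 ** attempt)
    # Persistent miss: fail loudly so the caller aborts the bundle instead of
    # shipping it without this entry.
    raise EntryNotPropagated(f"{url} still missing after {attempts} attempts")
```

Raising instead of skipping the entry is the point of the design: a bundle that silently drops a listed entry is worse than a failed run that can be retried.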
dotnet format, AOT publish (0 trim/AOT warnings), and the affected unit
tests all pass; cli-schema.json regenerated for the new --plan output.
Co-authored-by: Cursor <cursoragent@cursor.com>
Mpdreamz left a comment
A few code-quality notes, then a broader design question worth discussing before this merges.
Code issues
HttpClient is never disposed in CdnChangelogEntryFetcher.
The class allocates an HttpClient in a field initializer but doesn't implement IDisposable, so the connection pool leaks in any reuse scenario (tests, future service mode). Should implement IDisposable and dispose the client, or accept an IHttpClientFactory.
Fetch() blocks the calling thread for all HTTP I/O.
FetchRegistry and FetchText use the synchronous HttpClient.Send() overload. Everything else in the service layer is async; this blocks a thread-pool thread for the full CDN round-trip per product.
Multi-paragraph XML doc comments on private methods.
FetchCdnEntries, ResolveCdnBundleUrl, and ResolvePrimaryProduct all have verbose <summary> blocks. Project style is one short line max (or none) on private methods.
PlanBundleAsync duplicates the CDN-producibility check.
The plan path and the bundle path compute "is a product resolvable?" with different expressions at different pipeline stages (pre- vs post-ApplyConfigDefaults). Currently correct, but there is no comment explaining the difference, so they are likely to diverge as the code evolves.
Design question — should this be declarative in docset.yml?
The current approach fetches changelog entries imperatively inside changelog bundle, driven by a use_local_changelogs escape hatch in changelog.yml. That works, but it means:
- The fetch happens deep in the bundle command where concurrency is ad-hoc (synchronous loop over entries per product).
- The build cannot validate or prefetch CDN requirements up-front — it discovers the dependency at execution time.
- The changelog directive has nowhere to point the user on a fetch error; it can only say "fetch failed" rather than "declare this in your config".
Compare how crosslinks work: repos are declared in docset.yml under cross_links, docs-builder knows about the external dependency at startup, can fail fast if the index is unreachable, and can fetch all link indexes concurrently before any page is rendered.
A similar pattern for changelog sourcing might look like:
# docset.yml
release_notes:
  - repository: elastic/elasticsearch
  - repository: elastic/kibana
With that declaration:
- docs-builder knows at startup which CDN products to pull, can fetch all registries and entries concurrently, and can fail fast before any directive tries to render.
- The `changelog` directive just references an already-loaded in-memory set, with no per-directive HTTP.
- If a directive references a product not declared in `docset.yml`, it emits a clear error pointing the user at the config key rather than a raw CDN failure at render time.
- `changelog bundle` reads the same declaration to know which products to source from the CDN, keeping the config in one place.
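To make the proposal concrete, here is a rough sketch of the startup behaviour those points describe: fetch every declared registry concurrently, fail fast, then serve directives from memory. It is hypothetical Python, not docs-builder code. `prefetch_registries`, `entries_for`, and the `fetch_registry` callable are invented for illustration, and how a `repository` declaration maps to a CDN product is left open.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def prefetch_registries(
    declared: Sequence[str],
    fetch_registry: Callable[[str], Awaitable[list]],
) -> dict:
    # All declared products are fetched concurrently. With gather's defaults,
    # the first failure propagates to the caller, so the build can stop
    # before any page is rendered.
    results = await asyncio.gather(*(fetch_registry(p) for p in declared))
    return dict(zip(declared, results))


def entries_for(loaded: dict, product: str) -> list:
    # Directives read from the in-memory set; no HTTP happens here.
    try:
        return loaded[product]
    except KeyError:
        # Point the user at the config key rather than at a CDN error.
        raise LookupError(
            f"product '{product}' is not declared under release_notes in docset.yml"
        ) from None
```

The split matters more than the details: network I/O is confined to one startup step, and a missing declaration becomes a configuration error instead of a render-time fetch failure.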
The use_local_changelogs flag would still make sense as an escape hatch, but the primary sourcing decision would be declared rather than implied by the bundle command's product filter.
Is this the direction you want to go, or is there a reason to keep it embedded in changelog.yml / the bundle command?
Note
Stacked on `changelog_directive_s3`. Base this against that branch and merge it first; the diff shown here is only this change.
Summary
Source the individual changelog entries that make up a bundle from the public CDN by default (scoped to the bundle's product), instead of the local `docs/changelog/` folder. This makes bundles reflect what was actually published to S3 (that is, with private references scrubbed) and decouples bundle creation from a checkout that happens to hold every entry.
The bundle filter is content-based: whether an entry belongs in a bundle depends on the `products`/`prs`/`issues` fields inside the YAML, not on the file name. CloudFront has no `ListObjects`, so we publish a small per-product index on upload and the client enumerates → fetches → filters.
What changed
- `upload` now writes `{product}/changelog/registry.json` alongside the existing bundle registry (`RegistryBuilder`/`RegistryKey`); the scrubber key allow-list recognizes it.
- `CdnChangelogEntryFetcher` downloads the entry index and each entry; CDN base resolution is centralized in `ChangelogCdn` (shared with the changelog directive's `cdn:` mode).
- `bundle.use_local_changelogs: true` forces the local folder, and we fall back to local automatically when no concrete product can scope the per-product CDN fetch.
- `--plan` `cdn_url`: `changelog bundle --plan` now emits `cdn_url` (`{base}/{product}/bundle/{file}`) so CI can poll for the scrubbed bundle.
- `docs/cli-schema.json` regenerated.
Verification
- `dotnet format --verify-no-changes`: clean
- `dotnet publish -c Release` (docs-builder): 0 trim/AOT warnings
- `Elastic.Documentation.Configuration.Tests` (409/409) and `Elastic.Changelog.Tests` (732/732): pass
- `cli-schema.json` matches `-- __schema` output
Test plan
- `upload` publishes `{product}/changelog/registry.json` and individual entries
- `bundle` (default) fetches scrubbed entries from the CDN and applies the content filter
- `bundle` with `use_local_changelogs: true` uses the local folder
- `bundle --plan` surfaces a correct `cdn_url`
Made with Cursor