* feat(web,download): absorb #1048 media + --stdout into web read
Distill the useful pieces of the abandoned PR #1048 (`web md`) into the
existing shared pipeline instead of introducing a parallel command:
- Turndown rules for <video> / <audio> / <iframe>. Video and audio are
emitted as inline HTML so renderers that support it keep playback,
and iframes degrade to markdown links (title + src) so embedded
content (YouTube, CodePen, …) stays reachable. `iframe` moves out of
STRIPPED_TAGS since it's now handled explicitly.
- `stdout` option on ArticleDownloadOptions: writes the full markdown
to process.stdout, skips image download + mkdir + file write, and
reports saved='-'. Remote image URLs stay intact so piped output is
self-contained.
- `web read --stdout` wires the above through.
- Lazy-load src rewrite: the extractor now promotes data-src /
data-original / data-lazy-src / data-srcset onto `src` before the
HTML is frozen, so the markdown body and the image-download list
reference the same URL (previously a page with placeholder.gif +
data-src produced broken image links in the output).
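The lazy-load rewrite in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: `promoteLazySrc`, `firstSrcsetUrl`, the plain attribute-map input, the attribute priority order, and the choice to take the first `data-srcset` candidate are all assumptions.

```javascript
// Hypothetical helper: names, priority order, and srcset handling are assumptions.
const LAZY_ATTRS = ['data-src', 'data-original', 'data-lazy-src', 'data-srcset'];

// "a.jpg 1x, b.jpg 2x" -> "a.jpg" (naive split; ignores URLs that contain commas)
function firstSrcsetUrl(srcset) {
  return srcset.split(',')[0].trim().split(/\s+/)[0];
}

// Takes a plain map of <img> attributes and returns a copy whose `src`
// points at the real image when a lazy-load attribute is present.
function promoteLazySrc(attrs) {
  for (const name of LAZY_ATTRS) {
    const value = (attrs[name] || '').trim();
    if (!value) continue;
    const url = name === 'data-srcset' ? firstSrcsetUrl(value) : value;
    if (url) return { ...attrs, src: url };
  }
  return attrs; // nothing to promote; keep the original src
}
```

Running this before the HTML is frozen means the markdown body and the image-download list both see the promoted URL rather than the placeholder.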
Nothing from #1048 that overlapped with the already-merged #1143
hardening was kept: no new Readability wiring, no duplicate Turndown
config, no new command.
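As an illustration of the iframe handling described above, a Turndown rule that degrades an iframe to a markdown link might look like the sketch below. The rule object follows Turndown's documented `filter` / `replacement` shape, but `iframeRule`, the fallback to `src` when no title is present, and the bracket escaping are assumptions rather than the code in this commit.

```javascript
// Hypothetical rule; the name, the title fallback, and the escaping are assumptions.
const iframeRule = {
  filter: 'iframe',
  replacement(_content, node) {
    const src = (node.getAttribute('src') || '').trim();
    if (!src) return ''; // nothing reachable to link to
    const title = (node.getAttribute('title') || '').trim() || src;
    // Escape square brackets so the title cannot break the link syntax.
    const label = title.replace(/([\[\]])/g, '\\$1');
    return `\n\n[${label}](${src})\n\n`;
  },
};

// With Turndown installed, the rule would be registered like this:
//   new TurndownService().addRule('iframe', iframeRule);

// Minimal stand-in for a DOM node, so the rule can be tried without a DOM.
const fakeNode = (attrs) => ({ getAttribute: (name) => attrs[name] ?? null });
```

Calling `iframeRule.replacement('', fakeNode({ title: 'Demo', src: 'https://example.com/embed' }))` yields a `[Demo](https://example.com/embed)` link padded by blank lines. The rule only fires once `iframe` is no longer in STRIPPED_TAGS, which is why the commit moves it out.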
* fix(web): keep stdout streaming output clean
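One way to read "clean" here is that only the markdown reaches stdout while any status text goes to stderr. The sketch below shows that pattern under that assumption; the commit message does not say how the fix was made, and `writeToStdout` and its `io` parameter are hypothetical names. Only the `saved: '-'` result comes from the description above.

```javascript
// Hypothetical sketch: function name, `io` parameter, and the stderr routing are assumptions.
function writeToStdout(markdown, io = { out: process.stdout, err: process.stderr }) {
  // Diagnostics go to stderr so they never mix with the piped document.
  io.err.write(`wrote ${markdown.length} characters of markdown\n`);
  // The document itself is the only thing written to stdout.
  io.out.write(markdown.endsWith('\n') ? markdown : markdown + '\n');
  // No file was written, so the saved path is reported as '-'.
  return { saved: '-' };
}
```

With this split, piping the command's output into another tool receives only the markdown, and status text still shows up in the terminal.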
* fix(tests): update iframe e2e assertion and drop relative src import
- article-extract e2e fixture test: iframe now converts to a markdown
link instead of being stripped, so assert the YouTube embed link
survives rather than asserting its absence.
- clis/web/read.test.js: replace vi.importActual('../../src/registry.js')
with a direct __test__.command export from read.js; the relative
import into src/ tripped the package-exports adapter guardrail.