Fix Full Article extraction for Naver Blog #5228

Open
box-j wants to merge 1 commit into Ranchero-Software:main from box-j:claude/enhance-article-fetching-h8Gbj

box-j commented Apr 4, 2026

Naver Blog's desktop site (blog.naver.com) is a JavaScript-heavy SPA that Mercury/Feedbin's extractor cannot parse, so extraction returns no content. The mobile site (m.blog.naver.com) renders as static HTML and extracts successfully.

Add extractionURL(for:) to ArticleExtractor. It rewrites blog.naver.com URLs to m.blog.naver.com before they are sent to the extractor service, and strips the ?fromRss=true&trackingCode=rss tracking parameters that Naver appends to RSS feed links.
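
For illustration, here is a minimal sketch of what such a URL transform could look like. It is not the code from this PR: it is written as a free function rather than a member of ArticleExtractor, and it assumes that the mobile site serves a post at the same path as the desktop site and that fromRss and trackingCode are the only parameters to drop.

```swift
import Foundation

// Hypothetical sketch; the PR's actual extractionURL(for:) may differ.
// Assumption: m.blog.naver.com accepts the same path as blog.naver.com.
func extractionURL(for url: URL) -> URL {
    guard var components = URLComponents(url: url, resolvingAgainstBaseURL: false),
          let host = components.host?.lowercased(),
          host == "blog.naver.com" else {
        // Any other host (including m.blog.naver.com) is returned untouched.
        return url
    }

    // Point the extractor at the statically rendered mobile site.
    components.host = "m.blog.naver.com"

    // Drop the tracking parameters that Naver appends to RSS feed links,
    // keeping any other query items in their original order.
    let trackingNames: Set<String> = ["fromRss", "trackingCode"]
    if let items = components.queryItems {
        let kept = items.filter { !trackingNames.contains($0.name) }
        components.queryItems = kept.isEmpty ? nil : kept
    }

    // Fall back to the original URL if the components cannot be reassembled.
    return components.url ?? url
}

// Example with a made-up post URL.
let feedLink = URL(string: "https://blog.naver.com/someuser/223000000000?fromRss=true&trackingCode=rss")!
print(extractionURL(for: feedLink).absoluteString)
```

Filtering by parameter name rather than matching the literal ?fromRss=true&trackingCode=rss suffix keeps the sketch working if the parameters arrive in a different order or alongside other query items.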

https://claude.ai/code/session_01DWJjZT2o7Cir7tmGYR3Z2U
box-j closed this Apr 4, 2026
box-j deleted the claude/enhance-article-fetching-h8Gbj branch April 4, 2026 16:28
box-j restored the claude/enhance-article-fetching-h8Gbj branch April 4, 2026 16:28
box-j reopened this Apr 4, 2026
box-j force-pushed the claude/enhance-article-fetching-h8Gbj branch from 8e026e8 to c31774b April 4, 2026 16:36