feat(route/baidu): add BAIDU_COOKIE support and extract shared … #21663
FlanChanXwO wants to merge 13 commits into DIYgod:master
…a routes and implement user post retrieval
… handling in forum, post, search, and user routes
… formatting
- Refactor parseBaiduCookies function to use chained array methods for better readability
- Simplify cookie parsing by trimming and filtering empty strings before mapping
- Remove redundant comments in common.ts and post.tsx files
- Improve code formatting by removing excessive blank lines
- Maintain same functionality while enhancing code maintainability
…utilities
Add common.ts for cookie parsing, page retrieval, security check and URL normalization. Refactor forum, post, search and user routes to use shared utilities. Preserve rich text content in post replies. Support direct reply links. Fix cookie value parsing for cookies containing '=' character.
Route/baidu
- Remove trailing whitespace from empty lines
- Ensure consistent line endings in utility functions
# Conflicts:
# lib/routes/baidu/tieba/utils.ts
Successfully generated as following:
- http://localhost:1200/baidu/tieba/forum/good/孙笑川 - Failed ❌
- http://localhost:1200/baidu/tieba/post/10620724314 - Failed ❌
- http://localhost:1200/baidu/tieba/user/斗鱼游戏君 - Failed ❌
- http://localhost:1200/baidu/tieba/search/反原神 - Failed ❌
…ontent retrieval (#2)
Pull request overview
Adds BAIDU_COOKIE-driven Puppeteer fetching for Baidu Tieba routes to mitigate recent anti-bot/JS-rendering changes (closes #19642), and factors shared logic into a common module.
Changes:
- Migrate Tieba forum/post/search/user routes from direct HTTP fetching to Puppeteer-based page rendering with cookie injection.
- Introduce shared helpers (`getTiebaPageContent`, cookie parsing, URL normalization, security-verification detection) and thread/time parsing utilities.
- Extend global config to support `BAIDU_COOKIE` and expose it via `config.baidu.cookie`.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| lib/routes/baidu/tieba/common.ts | New shared Puppeteer fetching + cookie parsing + verification detection + URL normalization. |
| lib/routes/baidu/tieba/utils.ts | New utilities for relative time parsing and thread list parsing across page variants. |
| lib/routes/baidu/tieba/forum.tsx | Switch forum routes to Puppeteer + shared parsing utilities; add BAIDU_COOKIE requirement. |
| lib/routes/baidu/tieba/post.tsx | Switch post route to Puppeteer and new selectors; add BAIDU_COOKIE requirement. |
| lib/routes/baidu/tieba/search.tsx | Switch search route to Puppeteer and new DOM selectors; normalize links; add BAIDU_COOKIE requirement. |
| lib/routes/baidu/tieba/user.tsx | Replace old user route implementation with Puppeteer-based version; add BAIDU_COOKIE requirement. |
| lib/routes/baidu/tieba/user.ts | Remove legacy got-based implementation. |
| lib/config.ts | Add baidu.cookie config plumbing and BAIDU_COOKIE env key. |
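For readers unfamiliar with the relative timestamps Tieba pages show, the kind of parsing attributed to utils.ts above might look roughly like the sketch below. This is not the PR's code: the function name `parseRelativeTimeSketch`, its signature, and the exact set of accepted formats are assumptions made for illustration only.

```typescript
// Hypothetical helper, not the PR's parseRelativeTime(); accepted formats are assumed.
function parseRelativeTimeSketch(text: string, now: Date = new Date()): Date | null {
    const s = text.trim();

    // "刚刚" means "just now"
    if (s === '刚刚') {
        return new Date(now.getTime());
    }

    // "<N>秒前", "<N>分钟前", "<N>小时前", "<N>天前": N seconds/minutes/hours/days ago
    const unitMs: Record<string, number> = {
        '秒': 1000,
        '分钟': 60 * 1000,
        '小时': 60 * 60 * 1000,
        '天': 24 * 60 * 60 * 1000,
    };
    const ago = s.match(/^(\d+)\s*(秒|分钟|小时|天)前$/);
    if (ago) {
        return new Date(now.getTime() - Number(ago[1]) * unitMs[ago[2]]);
    }

    // "昨天" or "昨天 HH:mm": yesterday, optionally with a clock time
    const yesterday = s.match(/^昨天(?:\s*(\d{1,2}):(\d{2}))?$/);
    if (yesterday) {
        const d = new Date(now.getTime() - 24 * 60 * 60 * 1000);
        if (yesterday[1] !== undefined) {
            d.setHours(Number(yesterday[1]), Number(yesterday[2]), 0, 0);
        }
        return d;
    }

    // Unknown format: let the caller decide how to fall back
    return null;
}
```

Returning `null` for unrecognized input, rather than guessing, leaves the fallback (for example, omitting `pubDate`) to the calling route.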
Thank you for the detailed review! All feedback has been incorporated.
searchParams: params,
});
const pageUrl = `https://tieba.baidu.com/f?kw=${encodeURIComponent(kw)}&pn=0${cid === '0' ? '' : `&cid=${cid}`}${ctx.req.path.includes('good') ? '&tab=good' : ''}${sortParam}`;
const data = await getTiebaPageContent(pageUrl, `tieba:forum:${kw}:${cid}:${sortBy}`, { waitForSelector: '.thread-card-wrapper', timeout: 3000 });
Since browser rendering and a cookie are now required, it would be better to use the API at /c/f/frs/page_pc instead of parsing the HTML.
Involved Issue
Close #19642
Example for the Proposed Route(s)
New RSS Route Checklist
- Puppeteer
Note
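Overview
This PR adds `BAIDU_COOKIE` support to four Baidu Tieba routes (`/baidu/tieba/forum`, `/baidu/tieba/post`, `/baidu/tieba/search`, `/baidu/tieba/user`) in response to Baidu's upgraded anti-scraping measures.
Main changes
Migration from HTTP requests to Puppeteer
- Page content was previously fetched by sending HTTP requests directly with `got`/`ofetch`
New shared module `common.ts`
- `parseBaiduCookies()`: correctly parses cookie values that contain the `=` character
- `getTiebaPageContent()`: unified page-fetching logic, including cookie injection and the security-verification check
- `normalizeUrl()`: converts relative links to absolute URLs
- `checkSecurityVerification()`: detects Baidu's security-verification page and throws a user-friendly error
Route adaptations
- `.thread-card` selector
Improved time parsing
- `parseRelativeTime()` supports more formats (刚刚 "just now", 昨天 "yesterday", X天前 "X days ago", etc.)
Required configuration
The following four routes now require `BAIDU_COOKIE` to be configured:
- `/baidu/tieba/forum/:kw` (example: `/baidu/tieba/forum/女图`)
- `/baidu/tieba/post/:id` (example: `/baidu/tieba/post/686961453`)
- `/baidu/tieba/search/:qw` (example: `/baidu/tieba/search/neuro`)
- `/baidu/tieba/user/:uid` (example: `/baidu/tieba/user/斗鱼游戏君`)
Configuration
Set the following in an environment variable or in the `.env` file:
BAIDU_COOKIE=BDUSS=your_bduss_value;other_cookie=value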
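
To illustrate why values containing `=` need care, here is a minimal sketch of parsing a `BAIDU_COOKIE`-style string by splitting each pair on the first `=` only. It is not the PR's `parseBaiduCookies()`: the name `parseCookieString` and the `CookiePair` shape are hypothetical, chosen only to make the example self-contained.

```typescript
// Hypothetical shape for illustration; the PR's own types may differ.
interface CookiePair {
    name: string;
    value: string;
}

// Hypothetical stand-in for the PR's parseBaiduCookies().
function parseCookieString(raw: string): CookiePair[] {
    return raw
        .split(';')
        .map((part) => part.trim())
        .filter((part) => part.length > 0)
        .map((part) => {
            // Split on the FIRST '=' only, so '=' inside the value is preserved
            const idx = part.indexOf('=');
            if (idx === -1) {
                // No '=' present: keep the token as a name with an empty value
                return { name: part, value: '' };
            }
            return {
                name: part.slice(0, idx).trim(),
                value: part.slice(idx + 1).trim(),
            };
        })
        .filter((pair) => pair.name.length > 0);
}

// Example input in the same format as the BAIDU_COOKIE setting above
const examplePairs = parseCookieString('BDUSS=your_bduss_value;other_cookie=value');
console.log(examplePairs);
```

A naive `part.split('=')` would break a value such as `abc==` into several pieces, which is the failure mode the `indexOf`-based split avoids.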