Commit 0a85e73

feat(convention): listing↔detail id pairing rule + CI gate (#1297)

* feat(convention): listing↔detail id pairing rule + CI gate

Adds a hard convention: when a site exposes both a listing-class command (search / hot / top / recent / ...) and a detail-class command (read / paper / article / view / ...), every listing row MUST surface an id-shaped column whose value round-trips into the detail command. Without that, an agent has no way to follow up on a listing row except re-searching by title or scraping URLs out of band, both of which break the agent-native contract.

What's in this PR
- docs/conventions/listing-detail-id-pairing.md: full rule, examples table, why-it-matters, what counts as id-shaped, exemption taxonomy, how to add an id column to a listing.
- scripts/check-listing-id-pairing.mjs: validator that reads cli-manifest.json, classifies each entry as listing / detail / other, and fails when a listing on a site that also has a read-detail command is missing an id-shaped column. Exemption allowlist records WHY each pair is exempt so future maintainers know what to verify.
- npm run check:listing-id-pairing: strict-mode wrapper.
- CI: new step in build job runs the validator after the manifest freshness check on Linux.
- docs/developer/ts-adapter.md: cross-link from the adapter authoring guide.
- docs/.vitepress/config.mts: sidebar entries for the new conventions section.

Fixes brought to zero violations
- 1688/search: add offer_id (already extracted, just surfaced)
- bluesky/user: add uri (AT URI round-trips into bluesky/thread)
- tieba/search: add id + url (thread_id already extracted)
- tieba/hot: add url (rows are topics, not threads; url is the best-effort round-trip handle, doc'd as such)

Exemptions (intentional, doc'd in EXEMPT map with rationale)
- nowcoder/hot, bluesky/trending, twitter/trending: listing rows are topic strings, not posts.
- lesswrong/user, reddit/user: rows are profile-attribute key/value pairs, addressed by the username arg.
- discord-app/search: desktop UI session, message ids not extractable.
- notion/search: Strategy.UI Quick Find, page ids not exposed in DOM.

Validator output after this PR: 32 sites scanned, 75 listings checked, 7 exempted, 0 violations.

* fix(convention): tighten listing id gate

* fix(convention): close url-derived id loophole

1 parent 29b4869 commit 0a85e73
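
The validator itself (scripts/check-listing-id-pairing.mjs) is not shown on this page. As a rough sketch of the check the commit message describes, the following classifies manifest entries and reports listings that lack an id-shaped column. Everything here is an assumption made for illustration: the entry shape (`site`, `name`, `columns`), the name sets, the `ID_COLUMN` pattern, and the `checkPairing` function are hypothetical, and the real script's classification rules and definition of "id-shaped" may differ.

```javascript
// Hypothetical sketch only: names and rules below are illustrative,
// not the internals of scripts/check-listing-id-pairing.mjs.
const LISTING_NAMES = new Set(['search', 'hot', 'top', 'recent']);
const DETAIL_NAMES = new Set(['read', 'paper', 'article', 'view']);

// Assumed notion of "id-shaped": id, uri, tid, or anything ending in _id.
const ID_COLUMN = /^(id|uri|tid|[a-z]+_id)$/;

// Classify a manifest entry as listing / detail / other by command name.
function classify(entry) {
  if (LISTING_NAMES.has(entry.name)) return 'listing';
  if (DETAIL_NAMES.has(entry.name)) return 'detail';
  return 'other';
}

// Return "site/name" keys for listings that sit next to a detail command
// on the same site but expose no id-shaped column and are not exempt.
function checkPairing(manifest, exempt = {}) {
  const sitesWithDetail = new Set(
    manifest.filter((e) => classify(e) === 'detail').map((e) => e.site),
  );
  const violations = [];
  for (const entry of manifest) {
    if (classify(entry) !== 'listing') continue;
    if (!sitesWithDetail.has(entry.site)) continue; // nothing to round-trip into
    const key = `${entry.site}/${entry.name}`;
    if (Object.prototype.hasOwnProperty.call(exempt, key)) continue;
    const hasId = (entry.columns ?? []).some((c) => ID_COLUMN.test(c));
    if (!hasId) violations.push(key);
  }
  return violations;
}

// Made-up manifest: "example" has a detail command, "other" does not.
const manifest = [
  { site: 'example', name: 'search', columns: ['rank', 'title'] },
  { site: 'example', name: 'hot', columns: ['rank', 'id', 'title'] },
  { site: 'example', name: 'read', columns: ['title', 'body'] },
  { site: 'other', name: 'hot', columns: ['rank', 'title'] },
];
const violations = checkPairing(manifest);
console.log(violations);
```

Keeping the exemption map keyed by `site/name` with a free-text reason as the value mirrors the commit's stated goal of recording why each pair is exempt; a CI wrapper would exit non-zero when the returned list is non-empty.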

20 files changed

Lines changed: 444 additions & 14 deletions

.github/workflows/ci.yml

Lines changed: 8 additions & 0 deletions
@@ -50,6 +50,14 @@ jobs:
     exit 1
   fi
 
+# Guard: listing-class commands must surface an id-shaped column so their
+# rows round-trip into the site's detail-class command. Without this, an
+# agent has to re-search by title or scrape URLs to follow up on a row.
+# See docs/conventions/listing-detail-id-pairing.md.
+- name: Check listing↔detail id pairing
+  if: runner.os == 'Linux'
+  run: npm run check:listing-id-pairing
+
 # ── Unit tests (vitest shard) ──
 # PR: ubuntu + Node 22 only (fast feedback, 2 jobs).
 # Push to main/dev: full matrix for cross-platform/cross-version coverage (12 jobs).

cli-manifest.json

Lines changed: 14 additions & 2 deletions
@@ -120,6 +120,7 @@
 ],
 "columns": [
   "rank",
+  "offer_id",
   "title",
   "price_text",
   "moq_text",
@@ -2990,6 +2991,7 @@
 ],
 "columns": [
   "rank",
+  "uri",
   "text",
   "likes",
   "reposts",
@@ -8699,6 +8701,7 @@
 ],
 "columns": [
   "rank",
+  "tid",
   "title",
   "url"
 ],
@@ -8883,6 +8886,7 @@
 ],
 "columns": [
   "rank",
+  "tid",
   "title",
   "author",
   "replies",
@@ -10262,6 +10266,7 @@
   }
 ],
 "columns": [
+  "id",
   "author",
   "content",
   "likes",
@@ -10415,6 +10420,7 @@
   }
 ],
 "columns": [
+  "id",
   "author",
   "content",
   "likes",
@@ -10489,6 +10495,7 @@
   }
 ],
 "columns": [
+  "id",
   "content",
   "type",
   "likes",
@@ -16508,7 +16515,8 @@
   "rank",
   "title",
   "discussions",
-  "description"
+  "description",
+  "url"
 ],
 "type": "js",
 "modulePath": "tieba/hot.js",
@@ -16635,10 +16643,12 @@
 ],
 "columns": [
   "rank",
+  "id",
   "title",
   "forum",
   "author",
-  "time"
+  "time",
+  "url"
 ],
 "type": "js",
 "modulePath": "tieba/search.js",
@@ -18764,6 +18774,7 @@
   }
 ],
 "columns": [
+  "id",
   "author",
   "text",
   "reposts",
@@ -18915,6 +18926,7 @@
 ],
 "columns": [
   "rank",
+  "id",
   "title",
   "author",
   "time",

clis/1688/search.js

Lines changed: 1 addition & 1 deletion
@@ -294,7 +294,7 @@ cli({
       help: `Maximum number of results (default ${SEARCH_LIMIT_DEFAULT}, max ${SEARCH_LIMIT_MAX})`,
     },
   ],
-  columns: ['rank', 'title', 'price_text', 'moq_text', 'seller_name', 'location'],
+  columns: ['rank', 'offer_id', 'title', 'price_text', 'moq_text', 'seller_name', 'location'],
   func: async (page, kwargs) => {
     const query = String(kwargs.query ?? '');
     const limit = parseSearchLimit(kwargs.limit);

clis/bluesky/user.js

Lines changed: 2 additions & 1 deletion
@@ -16,14 +16,15 @@ cli({
     },
     { name: 'limit', type: 'int', default: 20, help: 'Number of posts' },
   ],
-  columns: ['rank', 'text', 'likes', 'reposts', 'replies'],
+  columns: ['rank', 'uri', 'text', 'likes', 'reposts', 'replies'],
   pipeline: [
     { fetch: {
       url: 'https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor=${{ args.handle }}&limit=${{ args.limit }}',
     } },
     { select: 'feed' },
     { map: {
       rank: '${{ index + 1 }}',
+      uri: '${{ item.post.uri }}',
       text: '${{ item.post.record.text }}',
       likes: '${{ item.post.likeCount }}',
       reposts: '${{ item.post.repostCount }}',
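
The commit message says the surfaced `uri` is an AT URI that round-trips into bluesky/thread. How bluesky/thread consumes the value is not shown on this page, so the sketch below only illustrates why an AT URI works as a stable handle: it splits into a repo (DID or handle), a collection, and a record key, and can be rebuilt without loss. `parseAtUri` and `formatAtUri` are hypothetical helper names, and the sample URI is made up.

```javascript
// Hypothetical helpers for illustration; not part of this commit.
// An AT URI for a record has the shape at://<repo>/<collection>/<rkey>.
function parseAtUri(uri) {
  const match = /^at:\/\/([^/]+)\/([^/]+)\/([^/]+)$/.exec(uri);
  if (!match) return null; // not a record-level AT URI
  const [, repo, collection, rkey] = match;
  return { repo, collection, rkey };
}

// Rebuild the URI from its parts; the inverse of parseAtUri.
function formatAtUri({ repo, collection, rkey }) {
  return `at://${repo}/${collection}/${rkey}`;
}

// Made-up example value in the shape a listing row's uri column would hold.
const uri = 'at://did:plc:example123/app.bsky.feed.post/3kexamplerkey';
const parts = parseAtUri(uri);
console.log(parts, formatAtUri(parts) === uri);
```

Because the value survives a parse and rebuild unchanged, an agent can pass the listing row's `uri` to a follow-up command verbatim instead of re-searching by post text.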

clis/hupu/hot.js

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,7 @@ cli({
   args: [
     { name: 'limit', type: 'int', default: 20, help: 'Number of hot posts' },
   ],
-  columns: ['rank', 'title', 'url'],
+  columns: ['rank', 'tid', 'title', 'url'],
   pipeline: [
     { navigate: 'https://bbs.hupu.com/' },
     { evaluate: `(async () => {
@@ -33,6 +33,7 @@ cli({
 ` },
     { map: {
       rank: '${{ index + 1 }}',
+      tid: '${{ item.tid }}',
       title: '${{ item.title }}',
       url: 'https://bbs.hupu.com/${{ item.tid }}.html',
     } },
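
Before this change a hupu/hot row carried the thread id only inside its `url` column. The sketch below shows the kind of out-of-band URL parsing a consumer would otherwise need, and that the new `tid` column makes it unnecessary. The URL template is taken from the diff above; the helper names, the sample row, and the assumption that a tid is purely numeric are illustrative and not confirmed by the source.

```javascript
// Hypothetical helpers for illustration; not part of this commit.
// Assumption: a hupu tid is a run of digits.
const HUPU_THREAD_URL = /^https:\/\/bbs\.hupu\.com\/(\d+)\.html$/;

// Same URL template as the map step in clis/hupu/hot.js.
function hupuThreadUrl(tid) {
  return `https://bbs.hupu.com/${tid}.html`;
}

// Recover the tid from a thread URL, or null if the URL has another shape.
function tidFromUrl(url) {
  const match = HUPU_THREAD_URL.exec(url);
  return match ? match[1] : null;
}

// Made-up listing row in the post-change column layout.
const row = {
  rank: 1,
  tid: '12345678',
  title: 'Example thread',
  url: hupuThreadUrl('12345678'),
};

// With the tid column present, the row's id can be used directly;
// the URL parse is shown only to confirm both agree.
console.log(row.tid, tidFromUrl(row.url) === row.tid);
```

Surfacing the id as its own column keeps consumers independent of the URL layout, which is the fragile part: a change to the site's URL scheme would break the regex but not the column.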

clis/hupu/search.js

Lines changed: 2 additions & 1 deletion
@@ -38,7 +38,7 @@ cli({
       help: 'Sort order: general/createtime/replytime/light/reply'
     }
   ],
-  columns: ['rank', 'title', 'author', 'replies', 'lights', 'forum', 'url'],
+  columns: ['rank', 'tid', 'title', 'author', 'replies', 'lights', 'forum', 'url'],
   func: async (page, kwargs) => {
     const { query, page: pageNum = 1, limit = 20, forum, sort = 'general' } = kwargs;
     const searchUrl = getHupuSearchUrl(query, pageNum, forum, sort);
@@ -48,6 +48,7 @@ cli({
     // Process results: strip HTML tags, decode HTML entities
     const processedResults = results.slice(0, Number(limit)).map((item, index) => ({
       rank: index + 1,
+      tid: String(item.id || ''),
       title: decodeHtmlEntities(stripHtml(item.title)),
       author: item.username || '未知用户',
       replies: item.replies || '0',

clis/jike/feed.js

Lines changed: 2 additions & 1 deletion
@@ -17,7 +17,7 @@ cli({
   args: [
     { name: 'limit', type: 'int', default: 20 },
   ],
-  columns: ['author', 'content', 'likes', 'comments', 'time', 'url'],
+  columns: ['id', 'author', 'content', 'likes', 'comments', 'time', 'url'],
   func: async (page, kwargs) => {
     const limit = kwargs.limit || 20;
     // 1. Navigate to the Jike home page and wait for the SPA to redirect to /following
@@ -44,6 +44,7 @@ cli({
       if (!author && !content) continue;
 
       results.push({
+        id: data.id,
         author,
         content: content.replace(/\\n/g, ' ').slice(0, 120),
         likes: data.likeCount || 0,

clis/jike/search.js

Lines changed: 2 additions & 1 deletion
@@ -18,7 +18,7 @@ cli({
     { name: 'query', type: 'string', required: true, positional: true },
     { name: 'limit', type: 'int', default: 20 },
   ],
-  columns: ['author', 'content', 'likes', 'comments', 'time', 'url'],
+  columns: ['id', 'author', 'content', 'likes', 'comments', 'time', 'url'],
   func: async (page, kwargs) => {
     const keyword = kwargs.query;
     const limit = kwargs.limit || 20;
@@ -44,6 +44,7 @@ cli({
       if (!author && !content) continue;
 
       results.push({
+        id: data.id,
         author,
         content: content.replace(/\\n/g, ' ').slice(0, 120),
         likes: data.likeCount || 0,

clis/jike/user.js

Lines changed: 2 additions & 1 deletion
@@ -16,7 +16,7 @@ cli({
     },
     { name: 'limit', type: 'int', default: 20, help: 'Number of posts' },
   ],
-  columns: ['content', 'type', 'likes', 'comments', 'time', 'url'],
+  columns: ['id', 'content', 'type', 'likes', 'comments', 'time', 'url'],
   pipeline: [
     { navigate: 'https://m.okjike.com/users/${{ args.username }}' },
     { evaluate: `(() => {
@@ -39,6 +39,7 @@ cli({
 })()
 ` },
     { map: {
+      id: '${{ item.id }}',
       content: '${{ item.content }}',
       type: '${{ item.type }}',
       likes: '${{ item.likes }}',

clis/tieba/hot.js

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ cli({
   args: [
     { name: 'limit', type: 'int', default: 20, help: 'Number of items to return' },
   ],
-  columns: ['rank', 'title', 'discussions', 'description'],
+  columns: ['rank', 'title', 'discussions', 'description', 'url'],
   func: async (page, kwargs) => {
     const limit = normalizeTiebaLimit(kwargs.limit);
     // Use the default browser settle path so we do not scrape the previous page.
