feat(drive): add +list-comments shortcut with anchor-aware filtering#1114
feat(drive): add +list-comments shortcut with anchor-aware filtering#1114kyalpha313 wants to merge 3 commits into
Conversation
Adds drive +list-comments, a new shortcut that lists docx comments with
smart defaults aligned to the Lark UI side panel:
- filter: unresolved + non-orphan anchors
- output: each comment with its reply_list (replies + reactions)
- order: by anchor position in the document body
Opt-in flags --include-orphaned, --include-resolved, --no-reactions, and
--order=created widen the result set or change ordering.
Each item is augmented with derived fields:
- anchor_state: "valid" | "structural" | "orphaned"
- anchor_position: byte offset in the normalized document text
Orphan detection strategy:
- is_whole=true -> valid (full-document comments)
- quote == "[Sticky note]" -> "structural" if any <readonly-block>
element still exists in the document, otherwise "orphaned"
- Otherwise: first-line of quote (after HTML strip + entity decode +
whitespace removal) must occur in the normalized document text
Two known limitations are documented in code and the linked issue:
1. Short anchor text appearing in multiple places cannot be uniquely
localized; the first occurrence wins. May falsely classify a
deleted-anchor comment as valid when the text still appears
elsewhere.
2. Sticky-note ordering uses the readonly-block XML element position,
which may differ from the rendered visual placement in some cases.
Closes larksuite#1111
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
📝 WalkthroughWalkthroughAdds a new ChangesDrive +list-comments Shortcut Implementation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@shortcuts/drive/drive_list_comments_test.go`:
- Around line 458-593: Both tests
(TestDriveListComments_E2E_FiltersOrphanedByDefault and
TestDriveListComments_E2E_IncludeOrphanedKeepsAll) rely on global config and
must isolate config state; add t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir())
at the start of each test (or in a shared test helper called before
cmdutil.TestFactory/mountAndRunDrive) so the tests use a temporary
LARKSUITE_CLI_CONFIG_DIR and avoid cross-test state leakage.
- Around line 86-132: Replace uses of common.TestNewRuntimeContext(...) in these
tests with the repo-standard test factory cmdutil.TestFactory(t,
&core.CliConfig{}); specifically, when building the runtime for
validateListComments calls (tests referencing newListCommentsCmd and
validateListComments), call cmdutil.TestFactory(t, &core.CliConfig{}) to obtain
the test factory/runtime instead of common.TestNewRuntimeContext, and update any
variable names as needed so validateListComments receives the factory/runtime
produced by cmdutil.TestFactory.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 177b228c-d297-4257-8756-8d8f50e92a48
📒 Files selected for processing (4)
shortcuts/drive/drive_list_comments.goshortcuts/drive/drive_list_comments_test.goshortcuts/drive/shortcuts.goshortcuts/drive/shortcuts_test.go
Add t.Setenv("LARKSUITE_CLI_CONFIG_DIR", t.TempDir()) to the two httpmock E2E
tests for drive +list-comments, matching the pattern used by
drive_inspect_test.go and AGENTS.md guidance for config-state isolation.
Addresses CodeRabbit review on larksuite#1114.
…tests
Revert the t.Setenv("LARKSUITE_CLI_CONFIG_DIR", ...) added in 64e65b6. Per the
repo learning from PR larksuite#343, tests built via cmdutil.TestFactory(t, config) use
an in-memory config closure and never touch the filesystem, so isolating
CONFIG_DIR has no effect. Only the real NewDefault() factory path needs it.
Verified: cmdutil.TestFactory sets Config: func() (*core.CliConfig, error) {
return config, nil } — no filesystem access.
| hasStructuralAnchor := strings.Contains(docXML, structuralAnchorTag) | ||
| // Approximate position of the FIRST sticky-note anchor in normalized space. | ||
| // All sticky-anchored comments share this position for sorting purposes. | ||
| structuralPos := projectRawPosToNormalized(docXML, normalized, structuralAnchorTag) |
There was a problem hiding this comment.
This computes a single structuralPos from the first <readonly-block> and then reuses that same position for every sticky-note comment. As a result, multiple structural comments do not actually sort by document position under the default order=anchor; they all tie on the same anchor position and then fall back to create_time. That does not match the PR description about approximating sticky-note order from XML position. If we cannot map individual sticky-note comments to distinct structural anchors, we should either avoid claiming anchor-order fidelity for them or make this limitation explicit in output and tests.
| return "" | ||
| } | ||
| decoded := html.UnescapeString(firstLine) | ||
| return whitespacePattern.ReplaceAllString(decoded, "") |
There was a problem hiding this comment.
Rich-text quotes containing inline markup will be misclassified here. The PR description says quote matching should strip HTML tags before comparison, but normalizeQuoteNeedle only unescapes entities and removes whitespace. If the comments API returns formatting tags inside quote (for example <b> or <i>), the needle will still contain markup and fail to match the tag-stripped document text, causing a still-valid comment to be classified as orphaned and filtered out by default. Please strip tags here as well, and add a test where the quote itself contains inline markup.
|
感谢这个PR,看了下实现,现在是根据评论的quota去匹配正文的,定位在xml内容的起始位置,有几个问题:
另外,我们下周会在评论接口返回评论所在的blockID位置,这个会更加准确,也能覆盖block评论场景,可以等下我们下周的版本。 |
|
@wittam-01 感谢 review 和详细反馈,这两点都很到位:
听到下周接口会返回 blockID,这才是真正的根治方案——可以唯一定位锚、自然覆盖 block 评论场景,连 fetch 正文都省了。我先把这个 PR 转 draft 暂停,等新接口上线后用 blockID 重写,代码也能砍掉一大半。 不用提前给我看新字段 schema,下周看到了再说。期间如果有什么我能帮忙的(比如新接口上线时优先试一下、或者拆其它周边小工具),随时 ping 我。 |
Summary
新增
drive +list-commentsshortcut,以智能默认值返回 docx 评论,对齐用户在飞书 UI 侧栏看到的评论卡片。详见 #1111 的背景与设计讨论。Changes
shortcuts/drive/drive_list_comments.go:实现 shortcut 主体(Validate / DryRun / Execute)+ 锚定状态识别 + 排序shortcuts/drive/drive_list_comments_test.go:单测覆盖 parse / validate / normalize / buildCommentItem / sort / DryRun,以及两个 httpmock E2E(默认过滤 /--include-orphaned)shortcuts/drive/shortcuts.go:注册DriveListCommentsshortcuts/drive/shortcuts_test.go:在 expected commands 表里加入+list-commentsCLI 形态
输出结构
复用
drive file.comments list原响应,每条评论多两个派生字段:{ "comment_id": "...", "quote": "...", "is_solved": false, "is_whole": false, "anchor_state": "valid" | "structural" | "orphaned", "anchor_position": 12345, "reply_list": { "replies": [ { ..., "reactions": [...] } ] } }外层附
counts: {total, valid, structural, orphaned}与file_token/wiki_token,便于上层做断言。内部流程
--doc,wiki URL 通过wiki/v2/spaces/get_node解包到 docx token。GET /open-apis/drive/v1/files/:file_token/comments,按 flag 决定是否带is_solved=false和need_reaction=true。POST /open-apis/docs_ai/v1/documents/:token/fetch拉取正文(format=xml,避免 fix: remove unsupported docs fetch text format #1109 后text不可用)。anchor_state:is_whole=true→validquote == "[Sticky note]"→ 文档存在<readonly-block>元素 →structural,否则orphaned<b>)拆开的锚文本仍能匹配——例如<p>由 <b>P2 / P3 事件聚类</b>触发:…</p>中的 quote由 P2 / P3 事件聚类触发:…。--include-*过滤后排序输出。已知限制(issue 中也已说明)
anchor.block_id,无法区分多个相同文本的锚。算法取第一个出现位置,可能:<readonly-block>元素在 XML 中的位置,可能与渲染位置不完全一致。Test Plan
go test -race -count=1 ./shortcuts/drive/...通过(含本 PR 新增的所有 unit 和 httpmock E2E 测试)go test -race -count=1 ./shortcuts/...通过(无回归)go vet ./shortcuts/drive/通过gofmt -l shortcuts/drive/drive_list_comments.go shortcuts/drive/drive_list_comments_test.go无输出go mod tidy不改动 go.mod / go.sumhttps://<host>/docx/<token>上一份共 29 条未解决评论的真实文档,UI 显示 22 条,本 shortcut 默认返回 23 条(22 valid + 1 structural),与 UI 对齐到一条短锚多处出现的已知限制($ID 的"已知错误"评论:算法误判为 valid,UI 实际隐藏)。--include-orphaned后返回全部 29 条。Closes #1111
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests