Skip to content

fix(im): validate Content-Range for resource download chunks#1178

Open
xu91102 wants to merge 1 commit into
larksuite:mainfrom
xu91102:codex/fix-im-content-range-validation
Open

fix(im): validate Content-Range for resource download chunks#1178
xu91102 wants to merge 1 commit into
larksuite:mainfrom
xu91102:codex/fix-im-content-range-validation

Conversation

@xu91102
Copy link
Copy Markdown

@xu91102 xu91102 commented May 29, 2026

Summary

Fixes an integrity gap in im +messages-resources-download range downloads where later 206 Partial Content responses could return the wrong byte range while still matching the final byte count. The downloader now parses and validates Content-Range for the initial and subsequent chunks.

Changes

  • Add a concrete Content-Range parser that validates byte start, end, and total values.
  • Validate the initial range response against the probe request and every later chunk against the requested range.
  • Keep 200 OK single-stream downloads unchanged.
  • Tighten IM resource output path normalization so Unix-rooted paths such as /tmp/out.bin are rejected on Windows too.
  • Add unit coverage for malformed ranges, mismatched chunks, and platform-neutral path expectations.

Test Plan

  • Unit tests pass: go test ./shortcuts/im -run TestParseContentRange|TestDownloadIMResourceRangeChunksValidateContentRange|TestDownloadIMResourceToPathInitialRangeMustMatchRequest|TestDownloadIMResourceToPathHTTPClientError|TestNormalizeDownloadOutputPath|TestParseContentDispositionFilename|TestResolveIMResourceDownloadPath|TestDownloadIMResourceToPathRangeDownload|TestDownloadIMResourceToPathInvalidContentRange|TestDownloadIMResourceToPathRangeChunkFailureCleansOutput|TestDownloadIMResourceToPathRangeOverflowCleansOutput|TestDownloadIMResourceToPathRangeShortChunkSizeMismatch
  • Format check passes: gofmt -l shortcuts/im/im_messages_resources_download.go shortcuts/im/helpers_test.go shortcuts/im/helpers_network_test.go produced no output
  • Diff check passes: git diff --check
  • Full package tests: go test ./shortcuts/im was run on Windows but hit existing platform issues unrelated to this change: temp media files remained locked during cleanup, and TestValidateMediaFlagPath/absolute_path_rejected uses /etc/passwd, which is not absolute on Windows.
  • Manual local verification confirms the lark-cli im +messages-resources-download flow works as expected: not run; change is covered by focused HTTP stream unit tests.

Related Issues

  • None

Summary by CodeRabbit

  • Bug Fixes

    • Improved HTTP range request validation for downloads to prevent server response mismatches.
    • Enhanced output path normalization with stricter validation to reject invalid directory paths.
    • Strengthened validation of partial content responses to ensure download integrity.
  • Tests

    • Expanded test coverage for content-range header parsing and range request validation scenarios.

Review Change Stack

@github-actions github-actions Bot added domain/im PR touches the im domain size/M Single-domain feat or fix with limited business impact labels May 29, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

Walkthrough

This PR hardens HTTP range-request validation for IM resource downloads by introducing a contentRange struct with parsing and validation, applying stricter path normalization, integrating validation into the download reader, and adding comprehensive integration tests covering initial probe and chunk-range scenarios.

Changes

Range Request Validation for IM Downloads

Layer / File(s) Summary
Content-Range parsing and validation contract
shortcuts/im/im_messages_resources_download.go, shortcuts/im/helpers_test.go
New contentRange struct with String() method; parseContentRange parses Content-Range header format; validateContentRange checks start/end/total consistency. Replaces prior parseTotalSize. TestParseContentRange validates parsing across valid formats and error conditions.
Path handling strictness
shortcuts/im/im_messages_resources_download.go, shortcuts/im/helpers_test.go
normalizeDownloadOutputPath now trims whitespace, rejects absolute paths (via filepath.IsAbs and leading / or \), cleans the result, and errors when cleaned path becomes ".". Test expectations updated to use filepath.Join for OS independence.
Range chunk reader validation
shortcuts/im/im_messages_resources_download.go
rangeChunkReader.Read validates each Content-Range response header against expected byte bounds and total using validateContentRange; invalid headers close the body and return network error.
Initial 206 response handling
shortcuts/im/im_messages_resources_download.go
downloadIMResourceToPath now fully parses and validates the initial Content-Range, requiring start == 0 and end matching the probe chunk bound, with consistent total size across responses. Validated total initializes the range reader.
Integration tests for range validation
shortcuts/im/helpers_network_test.go
TestDownloadIMResourceToPathInitialRangeMustMatchRequest verifies initial Range header and detects mismatched Content-Range. TestDownloadIMResourceRangeChunksValidateContentRange uses table-driven scenarios to check Range headers and Content-Range mismatches across chunks. Helper imResourceRangeResponse constructs test responses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes introduce a new validation contract for Content-Range parsing and apply it across three integration points (initial response, chunk reader, and download orchestration). The logic is moderately complex with multiple validation checks, and the test coverage spans unit tests for parsing and integration tests for the full flow. The changes are cohesive within the IM download module with no cross-module dependencies.

Possibly related PRs

  • larksuite/cli#283: The initial range-download implementation and Content-Range error handling that this PR extends with stricter validation of the initial probe response and per-chunk Content-Range headers.

Suggested labels

domain/im, size/M

Suggested reviewers

  • jackie3927
  • YangJunzhou-01

Poem

🐰 A rabbit downloads with range so bright,
Validates headers left and right,
Start, end, and total all align,
Chunk by chunk, the bits combine! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(im): validate Content-Range for resource download chunks' directly summarizes the main change: adding Content-Range validation for IM resource downloads.
Description check ✅ Passed The description includes all required template sections (Summary, Changes, Test Plan, Related Issues) with detailed, concrete information about the fix and verification approach.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
shortcuts/im/helpers_test.go (1)

562-569: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Cover the new Windows-rooted path branch here.

This table exercises /tmp/out.bin, but it never hits the new strings.HasPrefix(outputPath, "\\") rejection path. Adding a \tmp\out.bin case would lock in the regression this PR is trying to prevent on Windows.

Suggested test case
 	{name: "absolute path", fileKey: "file_123", outputPath: "/tmp/out.bin", wantErr: "absolute paths are not allowed"},
+	{name: "windows rooted path", fileKey: "file_123", outputPath: `\tmp\out.bin`, wantErr: "absolute paths are not allowed"},
 	{name: "parent escape", fileKey: "file_123", outputPath: "../out.bin", wantErr: "path cannot escape the current working directory"},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shortcuts/im/helpers_test.go` around lines 562 - 569, Add a test row to the
table in shortcuts/im/helpers_test.go that exercises the Windows-rooted path
branch by using outputPath set to `\tmp\out.bin` (e.g., {name: "Windows-rooted
path", fileKey: "file_123", outputPath: "\\tmp\\out.bin", wantErr: "absolute
paths are not allowed"}) so the strings.HasPrefix(outputPath, "\\") rejection
branch is covered; place it alongside the other cases that assert errors for
absolute/escaped paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@shortcuts/im/im_messages_resources_download.go`:
- Around line 244-251: The code currently only validates the Content-Range
header via validateContentRange (using r.nextOffset, end, r.totalSize) but does
not verify the actual response body length, allowing corrupted writes; update
the download handling around the validateContentRange/resp.Body logic (the block
that calls validateContentRange and the similar logic at lines ~253-254) to
compute expectedLen = end - start + 1 (from the contentRange you validated),
read exactly expectedLen bytes from resp.Body (e.g., via io.ReadFull or io.CopyN
into the destination) and fail with an error (close resp.Body and return
output.ErrNetwork(...)) if fewer or more bytes are read (EOF/mismatch), and only
advance r.nextOffset after a successful exact-length write; apply the same
exact-length check to the initial probe body path as well.

---

Outside diff comments:
In `@shortcuts/im/helpers_test.go`:
- Around line 562-569: Add a test row to the table in
shortcuts/im/helpers_test.go that exercises the Windows-rooted path branch by
using outputPath set to `\tmp\out.bin` (e.g., {name: "Windows-rooted path",
fileKey: "file_123", outputPath: "\\tmp\\out.bin", wantErr: "absolute paths are
not allowed"}) so the strings.HasPrefix(outputPath, "\\") rejection branch is
covered; place it alongside the other cases that assert errors for
absolute/escaped paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7d40b66d-6137-48a2-b13a-d82f8e397d5c

📥 Commits

Reviewing files that changed from the base of the PR and between d126ea2 and 3ce5cec.

📒 Files selected for processing (3)
  • shortcuts/im/helpers_network_test.go
  • shortcuts/im/helpers_test.go
  • shortcuts/im/im_messages_resources_download.go

Comment on lines +244 to +251
if err := validateContentRange(resp.Header.Get("Content-Range"), contentRange{
start: r.nextOffset,
end: end,
total: r.totalSize,
}); err != nil {
resp.Body.Close()
return 0, output.ErrNetwork("invalid Content-Range header on range response: %s", err)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the chunk body length, not just the header.

This still accepts a 206 body whose actual byte count differs from end-start+1. Because nextOffset advances from the header immediately, an oversized chunk followed by a later undersized chunk can still land on the expected final totalSize while writing corrupted bytes to disk. Please track the advertised length for each response and fail on EOF when the body length does not match it; the initial probe body needs the same check.

Also applies to: 253-254

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shortcuts/im/im_messages_resources_download.go` around lines 244 - 251, The
code currently only validates the Content-Range header via validateContentRange
(using r.nextOffset, end, r.totalSize) but does not verify the actual response
body length, allowing corrupted writes; update the download handling around the
validateContentRange/resp.Body logic (the block that calls validateContentRange
and the similar logic at lines ~253-254) to compute expectedLen = end - start +
1 (from the contentRange you validated), read exactly expectedLen bytes from
resp.Body (e.g., via io.ReadFull or io.CopyN into the destination) and fail with
an error (close resp.Body and return output.ErrNetwork(...)) if fewer or more
bytes are read (EOF/mismatch), and only advance r.nextOffset after a successful
exact-length write; apply the same exact-length check to the initial probe body
path as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain/im PR touches the im domain size/M Single-domain feat or fix with limited business impact

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant