
feat: implement retry mechanism for QQ Official API file uploads #7430

Open
KBVsent wants to merge 2 commits into AstrBotDevs:master from KBVsent:feat/qqofficial-upload-retry

Conversation

@KBVsent KBVsent commented Apr 9, 2026

Related Issue: #7257
The QQ Official API's file upload endpoints (/v2/users/{openid}/files and /v2/groups/{group_openid}/files) can occasionally return transient server errors (ServerError HTTP 500/504). Without retry logic, these transient failures silently drop media uploads, causing messages to send without their intended attachments. This PR wraps both upload call sites with a tenacity-based retry decorator so that recoverable errors are retried automatically before surfacing to the user.

Modifications

  • astrbot/core/platform/sources/qqofficial/qqofficial_message_event.py: Added a module-level _qqofficial_retry decorator (3 attempts, exponential back-off of 1–10 s, a warning logged before each retry) and applied it to both _do_upload() inner functions in _upload_file_for_message() and upload_media(). Once retries are exhausted, ServerError/SequenceNumberError is caught and logged as a clear error message instead of being swallowed silently.
  • This is NOT a breaking change.
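
The retry policy described above (a bounded number of attempts, exponential back-off with a cap, a warning before each sleep, and the last error re-raised) can be sketched without any third-party library. This is a hand-rolled illustration, not the PR's code and not tenacity's implementation: `TransientUploadError`, `backoff_delay`, and `upload_with_retry` are hypothetical names, and the doubling schedule is an assumption chosen for the example.

```python
import asyncio
import logging

logger = logging.getLogger("upload_retry_sketch")


class TransientUploadError(Exception):
    """Hypothetical stand-in for botpy's ServerError / SequenceNumberError."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 10.0) -> float:
    """Delay after a failed attempt: base, 2*base, 4*base, ... capped at `cap`."""
    return min(cap, base * 2 ** (attempt - 1))


async def upload_with_retry(do_upload, max_attempts: int = 3, base: float = 1.0):
    """Await `do_upload()` up to `max_attempts` times, sleeping between failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await do_upload()
        except TransientUploadError as exc:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error to the caller
            delay = backoff_delay(attempt, base)
            logger.warning(
                "upload attempt %d failed (%s); retrying in %.2fs",
                attempt, exc, delay,
            )
            await asyncio.sleep(delay)
```

Only the transient error type is caught, so any other exception (for example a parameter error) propagates on the first attempt without a retry.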

Screenshots or Test Results

I have been running this change for over a week without any media upload errors.


Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Add a retry mechanism around QQ Official API file upload requests to improve reliability of media uploads under transient server errors.

New Features:

  • Introduce a shared retry decorator for QQ Official API upload calls with bounded exponential backoff and logging.

Enhancements:

  • Wrap user and group file upload paths in the QQ Official message handler with the shared retry logic and clearer error logging on repeated failures.

@auto-assign auto-assign bot requested review from Raven95676 and Soulter April 9, 2026 08:57
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Apr 9, 2026

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • The `_qqofficial_retry` decorator comment mentions HTTP 429 but the retry condition only targets `ServerError` and `SequenceNumberError`; consider either adding the 429-related exception type or updating the comment to reflect the actual behavior.
  • Both `_do_upload` inner functions are redefined on each call; if this pattern is used elsewhere or grows, consider extracting a shared helper that takes the route/payload to avoid per-call function creation and keep retry usage consistent.
  • The log message `上传媒体文件失败,已重试3次` ("media file upload failed, retried 3 times") hardcodes the retry count; it would be more maintainable to derive this from the `stop_after_attempt` configuration or a shared constant so logs stay accurate if the retry policy changes.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/qqofficial/qqofficial_message_event.py" line_range="55-64" />
<code_context>

 _patch_qq_botpy_formdata()

+# Retry decorator for QQ Official API transient errors (HTTP 500/504, 429)
+_qqofficial_retry = retry(
+    retry=retry_if_exception_type(
+        (
+            botpy.errors.ServerError,
+            botpy.errors.SequenceNumberError,
+        )
+    ),
+    stop=stop_after_attempt(3),
+    wait=wait_exponential(multiplier=1, min=1, max=10),
+    before_sleep=before_sleep_log(logger, logging.WARNING),
+    reraise=True,
+)
+
</code_context>
<issue_to_address>
**issue (bug_risk):** The retry decorator comment mentions HTTP 429, but the retry condition only covers specific exception types, which might not include 429 responses.

The decorator comment implies we generically handle HTTP 500/504 and 429, but the predicate only retries `ServerError` and `SequenceNumberError`. If 429 is represented by a different exception type in `botpy`, those cases won’t be retried as the comment suggests. Please either (a) adjust the comment to match the actual retried exceptions, or (b) expand the predicate to include the 429-specific exception type, if one exists in `botpy`.
</issue_to_address>
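
To make the point in Comment 1 concrete: a predicate based on exception type retries only the listed types, so an HTTP 429 is retried only if the exception that represents it is in that list. The sketch below uses stand-in classes, and `RateLimitError` is purely hypothetical; whether and how `botpy` models a 429 is not established in this thread.

```python
class ServerError(Exception):
    """Stand-in for botpy.errors.ServerError (illustration only)."""


class SequenceNumberError(Exception):
    """Stand-in for botpy.errors.SequenceNumberError (illustration only)."""


class RateLimitError(Exception):
    """Hypothetical exception for an HTTP 429; botpy may model this differently."""


# The types the current predicate covers, per the decorator shown above.
RETRYABLE = (ServerError, SequenceNumberError)


def is_retryable(exc: BaseException, retryable: tuple = RETRYABLE) -> bool:
    """Return True only when `exc` is an instance of one of the listed types."""
    return isinstance(exc, retryable)
```

With the default tuple, `is_retryable(RateLimitError())` is false, which is the mismatch between comment and behavior that the review points out.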

### Comment 2
<location path="astrbot/core/platform/sources/qqofficial/qqofficial_message_event.py" line_range="566-567" />
<code_context>
                     file_info=result["file_info"],
                     ttl=result.get("ttl", 0),
                 )
+        except (botpy.errors.ServerError, botpy.errors.SequenceNumberError):
+            logger.error(f"上传媒体文件失败,已重试3次: {file_source}")
         except Exception as e:
             logger.error(f"上传请求错误: {e}")
</code_context>
<issue_to_address>
**nitpick:** The log message about the number of retries may not match the actual number of attempts configured in the retry policy.

With `stop_after_attempt(3)`, tenacity makes at most 3 total attempts (1 initial + 2 retries). The text `"已重试3次"` suggests 3 retries after the first attempt, which can be misleading when debugging. Consider updating the message to clearly distinguish total attempts vs retries, or derive the number from the retry configuration instead of hardcoding it.
</issue_to_address>
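
One way to keep the log text in step with the policy, as Comment 2 suggests, is to derive both from a single constant. This is a sketch under assumptions: `MAX_UPLOAD_ATTEMPTS` and `exhausted_message` are hypothetical names, and the same constant would also need to be passed to `stop_after_attempt(...)` for the two to stay aligned.

```python
# Hypothetical shared constant: total attempts, including the first one.
MAX_UPLOAD_ATTEMPTS = 3


def exhausted_message(file_source: str, attempts: int = MAX_UPLOAD_ATTEMPTS) -> str:
    """Build the final-failure log text, distinguishing attempts from retries."""
    retries = attempts - 1  # the first attempt is not a retry
    return (
        f"Media upload failed after {attempts} attempts "
        f"({retries} retries): {file_source}"
    )
```

Spelling out both numbers avoids the ambiguity between "3 retries" and "3 total attempts" noted in the review.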

### Comment 3
<location path="astrbot/core/platform/sources/qqofficial/qqofficial_message_event.py" line_range="478" />
<code_context>
-            result = await self.bot.api._http.request(route, json=payload)
-        else:
-            raise ValueError("Invalid upload parameters")
+        @_qqofficial_retry
+        async def _do_upload():
+            if "openid" in kwargs:
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting a shared retried HTTP helper method and using it from both upload functions to avoid nested functions and duplicated retry logic.

You can keep the retry behavior while reducing nesting and duplication by extracting a small shared helper and moving retry to that layer instead of per‑call inner functions.

### 1. Centralize the retried HTTP call

Define a single retried helper on the class and use it in both methods:

```python
# keep your existing decorator definition
_qqofficial_retry = retry(
    retry=retry_if_exception_type(
        (
            botpy.errors.ServerError,
            botpy.errors.SequenceNumberError,
        )
    ),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
    before_sleep=before_sleep_log(logger, logging.WARNING),
    reraise=True,
)


class QQOfficialMessageEvent(AstrMessageEvent):
    ...

    @_qqofficial_retry
    async def _request_with_retry(self, route: Route, payload: dict):
        return await self.bot.api._http.request(route, json=payload)
```

### 2. Flatten `upload_group_and_c2c_image`

Remove the inner function, keep routing/validation in the main function, and delegate only the HTTP request to the retried helper:

```python
    async def upload_group_and_c2c_image(
        self,
        image_base64: str,
        file_type: int,
        **kwargs,
    ) -> botpy.types.message.Media:
        payload = {
            "file_data": image_base64,
            "file_type": file_type,
            "srv_send_msg": False,
        }

        if "openid" in kwargs:
            payload["openid"] = kwargs["openid"]
            route = Route("POST", "/v2/users/{openid}/files", openid=kwargs["openid"])
        elif "group_openid" in kwargs:
            payload["group_openid"] = kwargs["group_openid"]
            route = Route(
                "POST",
                "/v2/groups/{group_openid}/files",
                group_openid=kwargs["group_openid"],
            )
        else:
            raise ValueError("Invalid upload parameters")

        result = await self._request_with_retry(route, payload)

        if not isinstance(result, dict):
            raise RuntimeError(
                f"Failed to upload image, response is not dict: {result}"
            )

        return Media(
            file_uuid=result["file_uuid"],
            file_info=result["file_info"],
            ttl=result.get("ttl", 0),
        )
```

This keeps the same retry semantics but removes the extra nested async function and makes the boundary between parameter routing and HTTP I/O clearer.

### 3. Simplify `upload_group_and_c2c_media` and centralize logging

Use the same helper and reduce duplicated exception handling around the same error types:

```python
    async def upload_group_and_c2c_media(
        self,
        file_source: str,
        file_type: int,
        srv_send_msg: bool = False,
        file_name: str | None = None,
        **kwargs,
    ) -> Media | None:
        payload: dict = {"file_type": file_type, "srv_send_msg": srv_send_msg}
        if file_name:
            payload["file_name"] = file_name

        if os.path.exists(file_source):
            async with aiofiles.open(file_source, "rb") as f:
                file_content = await f.read()
            payload["file_data"] = base64.b64encode(file_content).decode("utf-8")
        else:
            payload["file_data"] = await file_to_base64(file_source)

        if "openid" in kwargs:
            payload["openid"] = kwargs["openid"]
            route = Route("POST", "/v2/users/{openid}/files", openid=kwargs["openid"])
        elif "group_openid" in kwargs:
            payload["group_openid"] = kwargs["group_openid"]
            route = Route(
                "POST",
                "/v2/groups/{group_openid}/files",
                group_openid=kwargs["group_openid"],
            )
        else:
            return None

        try:
            result = await self._request_with_retry(route, payload)

            if result:
                if not isinstance(result, dict):
                    logger.error(f"上传文件响应格式错误: {result}")
                    return None

                return Media(
                    file_uuid=result["file_uuid"],
                    file_info=result["file_info"],
                    ttl=result.get("ttl", 0),
                )
        except (botpy.errors.ServerError, botpy.errors.SequenceNumberError):
            logger.error(f"上传媒体文件失败,已重试3次: {file_source}")
        except Exception as e:
            logger.error(f"上传请求错误: {e}")

        return None
```

Key points:

- Retry logic is applied once in `_request_with_retry`, avoiding per‑call inner functions and duplicated decorator usage.
- Routing and argument validation stay in the public methods, making it clearer what is retryable vs. parameter errors.
- Logging for final failure of retryable errors is localized in one place per method, without re‑encoding the same error list in both the decorator and inner function.
</issue_to_address>
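
A note on the `except` clause in the media-upload sketch of Comment 3: Python's `except` matches multiple exception types through a tuple, which is the form the PR's own diff uses. A `|` union between exception classes is not an equivalent spelling; it fails with a `TypeError` when the clause is evaluated. The example below demonstrates both forms with stand-in classes rather than the real `botpy` types.

```python
class ServerError(Exception):
    """Stand-in for botpy.errors.ServerError (illustration only)."""


class SequenceNumberError(Exception):
    """Stand-in for botpy.errors.SequenceNumberError (illustration only)."""


def catch_with_tuple() -> str:
    """Tuple form: either listed exception type is caught."""
    try:
        raise SequenceNumberError("seq")
    except (ServerError, SequenceNumberError) as exc:
        return f"caught {type(exc).__name__}"


def catch_with_union() -> str:
    """Union form: the except clause itself fails with TypeError."""
    try:
        try:
            raise SequenceNumberError("seq")
        except ServerError | SequenceNumberError:
            return "caught"
    except TypeError:
        return "TypeError"
```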


@dosubot dosubot bot added the area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. label Apr 9, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a retry mechanism using the tenacity library to handle transient errors (such as ServerError and SequenceNumberError) during image and media uploads for the QQ Official platform. The implementation uses an exponential backoff strategy and logs attempts before sleeping. Feedback was provided to improve consistency by adding explicit exception handling and logging to the upload_group_and_c2c_image function when all retry attempts are exhausted, similar to the implementation in the media upload function.
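
The consistency point above could look roughly like the following for the image path: log once when the retried request finally fails, then re-raise so the caller still sees the error. This is a sketch under assumptions, not the PR's code; `upload_image` and its `request_with_retry` parameter are hypothetical, and `ServerError` is a stand-in class.

```python
import asyncio
import logging

logger = logging.getLogger("qqofficial_upload_sketch")


class ServerError(Exception):
    """Stand-in for botpy.errors.ServerError (illustration only)."""


async def upload_image(request_with_retry):
    """Await an already-retried request; log and re-raise if it still fails."""
    try:
        result = await request_with_retry()
    except ServerError:
        # Reached only after the retry layer has given up.
        logger.error("Image upload failed after all retry attempts")
        raise
    if not isinstance(result, dict):
        raise RuntimeError(f"Failed to upload image, response is not dict: {result}")
    return result
```

Re-raising keeps the image path's existing contract of raising on failure, while the media path shown earlier returns `None` instead.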

