fix(slack): drop synthetic thread_ts to avoid invalid_thread_ts#212
fix(slack): drop synthetic thread_ts to avoid invalid_thread_ts#212LIU9293 wants to merge 6 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b7cfa16b1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // a task/cron run, fall back to posting at the top of the channel rather | ||
| // than letting Slack reject the call with `invalid_thread_ts`. | ||
| const threadIsSynthetic = isSyntheticOwner(threadId); | ||
| const threadField = threadIsSynthetic ? {} : { thread_ts: threadId }; |
There was a problem hiding this comment.
Preserve synthetic question routing
When a task/cron run emits an ask_user question, this drops the synthetic thread_ts and posts the Slack question as a new top-level message, but the pending question is still stored under the synthetic thread id in request-run.ts (setPendingQuestion(request.channelId, request.threadId, ...)). Slack button handling later derives the thread from the posted message (body.message?.thread_ts || body.message?.ts in commands.ts), so a click/reply on this top-level question is routed under the real Slack timestamp instead of the synthetic thread and getPendingQuestion will not find it. In that scenario the user's answer starts a fresh turn instead of resuming the blocked task/cron run; the fallback needs to seed/mirror the synthetic session or otherwise map the posted question's ts back before exposing interactive/plain questions top-level.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — flagging this as an explicit follow-up rather than fixing in this PR.
This PR is intentionally narrow: stop posting an invalid thread_ts so Slack stops rejecting cron/task questions with invalid_thread_ts (Sentry ODE-DEAMON-7). Before this change, the question was lost entirely; after it, the question is at least visible at the top of the channel.
You're right that interactive button clicks / threaded replies under the new top-level message route through body.message?.thread_ts || body.message?.ts and so won't match the synthetic-thread pendingQuestion written by request-run.ts:495. That means: with this PR, the user sees the question but pressing a button starts a fresh turn instead of resuming the blocked cron/task run — strictly better than the previous "silently swallowed" failure mode, but still incomplete.
The proper fix needs one of:
- Mirror
pendingQuestionunder both the synthetic threadId and the posted message'stsonce we know it - Or re-key the pending-question entry after
chat.postMessagereturns the realts - Or seed the cron/task session early (right after the first
sendChannelMessage) so the synthetic id is gone by the timepostSlackQuestionruns
That's a session/state-routing change that belongs in its own PR. I'd rather land this fix now to stop the Sentry bleed and tackle the routing in a focused follow-up than block on it here. Will open a tracking issue.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd11394fa4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const result = await slackApp.client.chat.postMessage({ | ||
| channel: rawChannelId, | ||
| thread_ts: threadId, | ||
| ...(threadIsSynthetic ? {} : { thread_ts: threadId }), |
There was a problem hiding this comment.
Use the channel workspace token for synthetic top-level sends
When threadIsSynthetic is true this call is now a top-level channel post, but the token was still resolved with getSlackBotToken(channelId, threadId) before this branch. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId; the existing top-level helper avoids that by consulting getWorkspaceBotTokenForChannel(channelId) first. In that context the new fallback can still fail with channel_not_found/not_in_channel even though dropping thread_ts was supposed to make the message deliverable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 6da6431. Token resolution now mirrors sendChannelMessage when threadIsSynthetic is true:
getSlackBotTokenForProcessor(processorId)
?? (threadIsSynthetic
? (getWorkspaceBotTokenForChannel(channelId) ?? getSlackBotToken(channelId))
: getSlackBotToken(channelId, threadId));
The real-thread path is unchanged so existing registry bindings keep winning for in-flight replies. Suite still green (408 pass, 1 skip).
Addresses Codex P2 review on PR #212. When threadIsSynthetic is true, sendMessage degenerates to a top-level channel post (no thread_ts), but the token was still being resolved with getSlackBotToken(channelId, threadId) first. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, the registry has no entry for the fake thread_ts and getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId — leading to a follow-on channel_not_found / not_in_channel even though dropping thread_ts was supposed to make the message deliverable. Mirror the token-resolution chain that sendChannelMessage already uses for top-level posts: processor token -> workspace-by-channel token -> getSlackBotToken for the synthetic path. The real-thread path is unchanged so registry bindings for in-flight thread replies still win. Tests: full suite still green (408 pass, 1 skip, 0 fail).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6da643122b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const result = await client.chat.postMessage({ | ||
| channel: channelId, | ||
| thread_ts: threadId, | ||
| ...threadField, |
There was a problem hiding this comment.
Resolve the channel token before posting synthetic questions
When this synthetic branch turns a task/cron ask_user question into a top-level Slack post, it still uses the token selected by createSlackAdapter.sendQuestion, which calls getSlackBotToken(channelId, threadId) with the fake task:/cron-job: id rather than doing the workspace-first channel lookup used for top-level sends. In a multi-workspace install where the scheduled channel has not yet been bound in the Slack registry, that can fall back to the first registered bot token, so these newly top-level Block Kit/plain question posts fail with channel_not_found/not_in_channel even though sendMessage was fixed; the fresh evidence is that the question path still passes the old token into this spread.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 1d4e434. The sendQuestion adapter now mirrors the token-resolution logic from sendMessage (client.ts:321): when threadId is a synthetic placeholder, resolve via getWorkspaceBotTokenForChannel(channelId) first, falling back to getSlackBotToken(channelId). The real-thread path is unchanged so existing registry bindings keep winning for in-flight replies.
const threadIsSynthetic = isSyntheticOwner(threadId);
const token = getSlackBotTokenForProcessor(processorId)
?? (threadIsSynthetic
? (getWorkspaceBotTokenForChannel(channelId) ?? getSlackBotToken(channelId))
: getSlackBotToken(channelId, threadId));
Full suite still green (408 pass, 1 skip).
Mirror the token-resolution logic from sendMessage in the sendQuestion adapter. When the thread id is a synthetic placeholder (task: / cron-job: / cron:), postSlackQuestion drops thread_ts and degenerates to a top-level channel post; in that case the registry has no entry for the fake thread_ts, and getSlackBotToken(channelId, fakeTs) can fall back to the first registered workspace token in multi-workspace installs. Use getWorkspaceBotTokenForChannel(channelId) first so the posted Block Kit / plain-text question lands with the token bound to the actual channel. Addresses Codex review comment on PR #212.
When a task or cron job's agent run emits an intermediate message,
the Slack adapter currently forwards the synthetic placeholder thread
id (`task:{id}` / `cron-job:{id}:{run}` / `cron:{id}`) directly as
`thread_ts`. Slack rejects those with `invalid_thread_ts` because
they are not real message timestamps, so the message is lost and the
delivery captures a Sentry error (ODE-DEAMON-7).
Synthetic owners are internal placeholders that exist before the
scheduler has posted a top-level message and learned the real thread
id; the scheduler later seeds the session for the real thread via
`seedCronChannelThreadSession` / `seedChannelThreadSession`. While
the agent is mid-run, the safe fallback is to post the intermediate
output at the top of the channel rather than reject it.
- sendMessage now omits thread_ts when threadId is synthetic and skips
binding the bot token to the fake thread id.
- postSlackQuestion does the same for both the plain-text and Block
Kit branches.
- New api.test.ts mocks the bolt app and asserts thread_ts is dropped
for cron-job:/task: ids and preserved for real Slack timestamps.
The previous mock.module factory only returned getApp + getSlackBotToken, so when later test files in the same Bun process imported other exports from packages/ims/slack/client (e.g. sendChannelMessage in packages/core/tasks/scheduler.ts and packages/core/test/web-routes.test.ts), Bun resolved them against the stub and threw: SyntaxError: Export named 'sendChannelMessage' not found in module '/home/runner/work/ode/ode/packages/ims/slack/client.ts' Locally test ordering hid this; CI's alphabetic ordering surfaced it as 2 failed tests / 2 unhandled errors in fast-checks. Fix: load the real ./client first and spread its exports into the stub before overriding getApp + getSlackBotToken. Downstream importers in the same process now still see every real export.
Addresses Codex P2 review on PR #212. When threadIsSynthetic is true, sendMessage degenerates to a top-level channel post (no thread_ts), but the token was still being resolved with getSlackBotToken(channelId, threadId) first. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, the registry has no entry for the fake thread_ts and getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId — leading to a follow-on channel_not_found / not_in_channel even though dropping thread_ts was supposed to make the message deliverable. Mirror the token-resolution chain that sendChannelMessage already uses for top-level posts: processor token -> workspace-by-channel token -> getSlackBotToken for the synthetic path. The real-thread path is unchanged so registry bindings for in-flight thread replies still win. Tests: full suite still green (408 pass, 1 skip, 0 fail).
Mirror the token-resolution logic from sendMessage in the sendQuestion adapter. When the thread id is a synthetic placeholder (task: / cron-job: / cron:), postSlackQuestion drops thread_ts and degenerates to a top-level channel post; in that case the registry has no entry for the fake thread_ts, and getSlackBotToken(channelId, fakeTs) can fall back to the first registered workspace token in multi-workspace installs. Use getWorkspaceBotTokenForChannel(channelId) first so the posted Block Kit / plain-text question lands with the token bound to the actual channel. Addresses Codex review comment on PR #212.
1d4e434 to
a109625
Compare
|
Rebased onto latest
Full |
|
Addressed the remaining P1 review comment in dc9842c. When a synthetic task/cron question is posted top-level and Slack returns a real message ts, request-run now mirrors the pending question state from the synthetic session onto that real Slack thread. Button/reply routing via the posted message ts can now find the pending question and resume the blocked run instead of starting a fresh turn.\n\nTargeted validation: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dc9842c207
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ?? (threadIsSynthetic | ||
| ? (getWorkspaceBotTokenForChannel(channelId) ?? getSlackBotToken(channelId)) | ||
| : getSlackBotToken(channelId, threadId)); |
There was a problem hiding this comment.
Bind synthetic question threads to the resolved Slack token
When threadIsSynthetic is true, this correctly resolves the channel's workspace token for the top-level question, but the returned Slack ts is never registered as a message/thread token. The button path later calls handleButtonSelection, which looks up the token from the clicked selection message or the question thread ts; for a scheduled task/cron question in a multi-workspace install where the channel has not already been bound by the message router, that lookup falls back to an empty/default processor, so multi-question follow-ups or error replies after the first button click can be sent with the wrong workspace token. Please store the resolved token against the postSlackQuestion result when the synthetic post creates the real thread.
Useful? React with 👍 / 👎.
When the synthetic-thread branch in createSlackAdapter.sendQuestion resolves the channel's workspace token and posts the question as a top-level channel message, the returned Slack ts is now registered against that token via slackAuthRegistry.setMessageBotToken. handleButtonSelection looks up the bot token via getMessageBotToken on the selection message ts, and getThreadBotToken on the synthetic placeholder thread id, so without this binding a multi-workspace install where the channel has not yet been seen by the message router can mis-route the follow-up through the wrong processor. Mirrors the binding that sendMessage already performs for regular outgoing posts. Addresses the Codex P2 review comment on PR #212 (ODE-DEAMON-7).
|
Addressed the Codex P2 review comment in 3dff52d. After Targeted validation:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3dff52d3f6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // multi-workspace installs can mis-route follow-ups through the wrong | ||
| // processor (Codex P2 on PR #212). | ||
| if (questionMessageTs) { | ||
| slackAuthRegistry.setMessageBotToken(channelId, questionMessageTs, token); |
There was a problem hiding this comment.
Bind synthetic question threads as thread tokens
In a multi-workspace install where a task/cron question is posted as a new top-level Slack message and the user clicks a button, this new binding still won't be found by the button path: commands.ts passes the selection reply's ts as messageTs and the question ts as threadId, while handleButtonSelection checks getMessageBotToken(channel, selectionTs) and then getThreadBotToken(channel, questionTs). Since this line stores the question ts only in the message-token map, the lookup can still fall back to the default/first workspace processor and mis-route follow-up questions or error replies; the fresh evidence is that the attempted fix writes the wrong registry map for the value later used as the thread id.
Useful? React with 👍 / 👎.
What
Stop forwarding synthetic placeholder thread ids (
task:{id}/cron-job:{id}:{run}/cron:{id}) as Slackthread_ts. When the value is synthetic, post at the top of the channel instead.Why
Sentry ODE-DEAMON-7 — "IM slack send failed: An API error occurred: invalid_thread_ts". The most recent events (2026-06-02) all have:
channel_id: C0ATGCJ0YK0thread_id: cron-job:a86fbdc5-01df-4caf-9e0c-c0c199f00379:1780441200000op: sendA task / cron run starts with a synthetic placeholder thread id; the scheduler only learns the real Slack
tsafter it posts the top-level result viasendChannelMessage. In between, any intermediate output the agent runtime emits flows throughim.sendMessage(channelId, syntheticThreadId, ...)→chat.postMessage({ thread_ts: "cron-job:..." })→ Slack rejects withinvalid_thread_ts, the message is lost, and Sentry captures a delivery failure.Sibling PR #211 already auto-disables cron jobs that hit a permanent channel error (e.g.
channel_not_found). This PR fixes the other mode of the same Sentry group: messages that would have landed in the channel except for the bogusthread_ts.Design notes
thread_ts). That keeps the message visible rather than silently swallowed.isSyntheticOwneralready exists inpackages/ims/shared/synthetic-owner.tswith full test coverage of the prefix set (task:/cron-job:/cron:); reuse it here so the matcher stays in one place.packages/ims/slack/client.ts:sendMessage(plus the bot-token binding helper, which must not bind a real token to a fake thread id) andpackages/ims/slack/api.tspostSlackQuestion(both the plain-text fallback and the Block Kit branch).thread_tsanalogues are routed through platform-specific resolvers that already special-case synthetic owners.Tests
packages/ims/slack/api.test.tsmock the bolt app and assert:cron-job:...thread id →postMessagecalled withoutthread_ts(Block Kit branch)task:...thread id → same (plain-text fallback)1717000000.000200) →thread_tspreservedbun testfull suite: 408 pass, 1 skip, 0 fail.Out of scope
channel_not_foundmode addressed by fix(cron): auto-disable cron jobs whose target channel is unreachable #211).