Discord image attachments are silently dropped on text-only model providers #1142

## Summary

When a Discord user sends an image attachment to an OpenFang agent — either with a caption or bare — the attachment is silently discarded before it reaches the model on text-only providers (e.g. the `claude_code` driver). The model then either:

- **(captioned case)** receives only the caption text and confabulates an acknowledgement of "the image" it never saw, or
- **(bare-image case)** receives nothing at all — the message is dropped before dispatch and the agent appears unresponsive.

Vision-capable providers are unaffected in principle, but the parser path also mishandles the bare-image shape, so multimodal dispatch is incomplete in practice.

## Reproduction

1. Configure an OpenFang agent on a Discord channel with a text-only provider (`claude_code` driver, or any provider without vision).
2. **Case A:** DM the agent `"look at this"` + a PNG attachment.
3. **Case B:** DM the agent a PNG attachment with no message body.

### Observed

- **Case A:** Agent responds as if to a text-only message and refers to "the image" as though it had seen it, confabulating its content from prior context or the caption alone.
- **Case B:** No response. Daemon log shows the inbound `MESSAGE_CREATE` payload but no dispatch downstream.

### Expected

- **Case A:** Model receives the caption *and* a coherent indication that an image was attached, with enough metadata (MIME type, size) to acknowledge it without confabulation.
- **Case B:** Model receives a coherent indication that a bare image was sent.
- Vision-capable providers receive the actual image bytes as a `ContentBlock::Image` for true multimodal dispatch.

## Root cause

Two defects in `crates/openfang-channels/src/discord.rs::parse_discord_message`:

1. **Bare-image drop.** An early `if content.is_empty() { return None; }` killed any message whose body was empty, regardless of attachment count. Bare-image posts never reached the bridge.
2. **Caption-wins drop.** When both text *and* attachments were present, only the text was preserved as `ChannelContent::Text`; attachments were discarded silently.

There was also no representation in `ChannelContent` for "a caption plus one or more attachments as a coherent unit," so even fixing the parser had no destination type to emit into.
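The two defects can be modelled in a few lines. This is a simplified sketch, not the actual body of `parse_discord_message`: the `Attachment` struct, the single-variant `ChannelContent`, and the function name `parse_buggy` are all illustrative assumptions, and only the `content.is_empty()` early return is taken from the description above.

```rust
/// Illustrative stand-in for the pre-fix content type (text only).
#[derive(Debug, PartialEq)]
enum ChannelContent {
    Text(String),
}

/// Hypothetical view of one Discord attachment.
#[allow(dead_code)]
struct Attachment {
    filename: String,
}

/// Models the pre-fix behaviour described above; not the real parser.
fn parse_buggy(content: &str, attachments: &[Attachment]) -> Option<ChannelContent> {
    // Defect 1: an empty body returns early, whatever the attachment count.
    if content.is_empty() {
        return None;
    }
    // Defect 2: the attachments are never inspected, so only text survives.
    let _ = attachments;
    Some(ChannelContent::Text(content.to_string()))
}
```

Under this model, Case A yields `Some(Text("look at this"))` with the PNG gone, and Case B yields `None`, which matches the missing downstream dispatch in the daemon log.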

## Proposed fix

End-to-end vertical slice across the channel and runtime layers:

- **`ChannelContent::Multipart(Vec<ChannelContent>)`** — new variant for caption + attachment(s) as sibling blocks. Nesting forbidden by doc + `debug_assert`.
- **Discord parser** — classify attachments by MIME (with a filename-extension fallback for bot-relayed payloads that omit `content_type`) under a 5 MB vision-size cap matching Anthropic's image-block limit. Vision-eligible images become `Image`; everything else becomes `File`. Emit `Multipart` whenever text + attachments coexist or multiple attachments are present.
- **Bridge** — flat-map `Multipart` in both dispatch paths: into `Vec<ContentBlock>` for multimodal-capable providers, and into a newline-joined text descriptor for text-flatten providers.
- **Telegram channel** — exhaustive-match parity for the new variant; defensive flatten on outbound.
- **`claude_code` driver** — render `Image` blocks as `[attachment: <mime> image, ~N KB — not viewable on this provider]` instead of dropping them. The model still cannot see the image, but it can acknowledge it coherently rather than confabulate.
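A minimal sketch of how the parser and text-flatten pieces above might fit together. The variant fields, the `Attachment` struct, the helper names (`resolve_mime`, `classify`, `parse`, `describe`), the extension table, the vision-eligible MIME set, and the `File` marker wording are assumptions made for illustration. Only the `Multipart` shape, the 5 MB cap, and the `Image` marker format come from this issue, and whether `<mime>` means the full MIME type or just the subtype is also a guess here.

```rust
/// Cap taken from the 5 MB figure above.
const VISION_MAX_BYTES: u64 = 5 * 1024 * 1024;

/// Illustrative stand-in for the real `ChannelContent`; field names are assumed.
#[derive(Debug, Clone, PartialEq)]
enum ChannelContent {
    Text(String),
    Image { mime: String, size: u64 },
    File { filename: String, size: u64 },
    /// Sibling blocks; must never contain another `Multipart`.
    Multipart(Vec<ChannelContent>),
}

/// Hypothetical view of one Discord attachment.
struct Attachment {
    filename: String,
    content_type: Option<String>,
    size: u64,
}

/// Use `content_type` when present, else fall back to the filename extension.
fn resolve_mime(att: &Attachment) -> Option<String> {
    if let Some(ct) = &att.content_type {
        return Some(ct.to_ascii_lowercase());
    }
    let ext = att.filename.rsplit_once('.')?.1.to_ascii_lowercase();
    let mime = match ext.as_str() {
        "png" => "image/png",
        "jpg" | "jpeg" => "image/jpeg",
        "gif" => "image/gif",
        "webp" => "image/webp",
        _ => return None,
    };
    Some(mime.to_string())
}

/// Vision-eligible images under the cap become `Image`; everything else `File`.
fn classify(att: &Attachment) -> ChannelContent {
    // Assumed eligible set; other types (HEIC, for instance) fall through to `File`.
    const VISION_MIMES: [&str; 4] = ["image/png", "image/jpeg", "image/gif", "image/webp"];
    match resolve_mime(att) {
        Some(mime) if VISION_MIMES.contains(&mime.as_str()) && att.size <= VISION_MAX_BYTES => {
            ChannelContent::Image { mime, size: att.size }
        }
        _ => ChannelContent::File { filename: att.filename.clone(), size: att.size },
    }
}

/// Emit `Multipart` for text + attachments or for several attachments.
fn parse(content: &str, attachments: &[Attachment]) -> Option<ChannelContent> {
    let mut blocks: Vec<ChannelContent> = Vec::new();
    if !content.is_empty() {
        blocks.push(ChannelContent::Text(content.to_string()));
    }
    blocks.extend(attachments.iter().map(classify));
    // Mirrors the no-nesting invariant; `classify` never produces `Multipart`.
    debug_assert!(!blocks.iter().any(|b| matches!(b, ChannelContent::Multipart(_))));
    match blocks.len() {
        0 => None,
        1 => blocks.pop(),
        _ => Some(ChannelContent::Multipart(blocks)),
    }
}

/// Text-flatten path: one descriptor per block, newline-joined for `Multipart`.
fn describe(block: &ChannelContent) -> String {
    match block {
        ChannelContent::Text(t) => t.clone(),
        ChannelContent::Image { mime, size } => format!(
            "[attachment: {} image, ~{} KB — not viewable on this provider]",
            mime,
            size / 1024
        ),
        // Hypothetical marker; the issue does not specify one for `File`.
        ChannelContent::File { filename, size } => {
            format!("[attachment: {}, ~{} KB]", filename, size / 1024)
        }
        ChannelContent::Multipart(parts) => {
            parts.iter().map(describe).collect::<Vec<_>>().join("\n")
        }
    }
}
```

In this sketch a single bare image parses to a plain `Image` rather than a one-element `Multipart`, so the wrapper appears only when there is something to group. An empty message with no attachments still returns `None`.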

## Out of scope (follow-ups)

- Vision-provider dispatch refinements beyond exposing the existing image bytes.
- Non-Discord channel parity for inbound attachment classification (Telegram inbound is unchanged here; only the outbound `Multipart` arm was added).
- Any handling of files larger than the 5 MB vision cap beyond classifying them as `File` and rendering the marker.

## Test plan

- 9 new unit tests in the Discord parser covering all `(text-empty, n-attachments)` shapes plus MIME edge cases (HEIC, oversize, missing `content_type`).
- 2 new unit tests in the `claude_code` driver covering captioned and bare-image marker rendering.
- Manual smoke test, both shapes, end-to-end (Discord → daemon log → model prompt).
