[BUG] : improve visibility and handling of deleted image_bytes on caption failure #590

@suhaniiz

Description of the Bug

In generate_captions_for_chunks, if caption_image raises an exception, the except block catches it, logs a message at debug level, strips image_bytes from the chunk, and sets a fallback text string.

Stripping the bytes prevents serialization errors, but silently discarding the image data on a transient failure (e.g., an OpenAI rate limit or brief downtime) means the application permanently loses the ability to retry captioning that image later in the pipeline.

Steps to Reproduce

Separate system/network failures (which should probably be retried or bubble up) from formatting failures.

If a failure is treated as permanent, log it with logger.error or logger.warning instead of logger.debug so administrators are aware that image data is being permanently discarded.
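One way the split described above could look is sketched below. This is an illustration only, not the project's actual code: the dict-shaped chunk with "image_bytes" and "text" keys, the helper name caption_chunk, the TRANSIENT_ERRORS tuple, the FALLBACK_TEXT value, and the "caption_status" key are all assumptions made for the example. The real set of transient exception types depends on the client library used by caption_image.

```python
import logging

logger = logging.getLogger("captioning")

# Assumption: which exception types count as transient. A real implementation
# would list the timeout / rate-limit errors raised by its captioning client.
TRANSIENT_ERRORS = (TimeoutError, ConnectionError)

# Assumption: placeholder text; the project's actual fallback string may differ.
FALLBACK_TEXT = "[image caption unavailable]"


def caption_chunk(chunk, caption_image):
    """Caption one chunk in place and return 'ok', 'retryable', or 'failed'."""
    try:
        chunk["text"] = caption_image(chunk["image_bytes"])
        return "ok"
    except TRANSIENT_ERRORS as exc:
        # Keep image_bytes so a later pipeline stage can try again.
        logger.warning(
            "Transient caption failure, keeping image_bytes for retry: %s", exc
        )
        chunk["caption_status"] = "retryable"
        return "retryable"
    except Exception as exc:
        # Treated as permanent: say so loudly before the data is dropped.
        size = len(chunk.get("image_bytes", b""))
        logger.error(
            "Permanent caption failure, discarding %d image bytes: %s", size, exc
        )
        chunk.pop("image_bytes", None)
        chunk["text"] = FALLBACK_TEXT
        return "failed"
```

With this shape, a caller can collect the chunks that come back as 'retryable' and decide whether to retry them or to strip the bytes just before serialization.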

Expected Behavior

If an image fails to caption due to a transient error (like a network timeout), the application should visibly surface a WARNING or ERROR log indicating that the image payload is being dropped, or ideally retry or re-raise the error rather than silently destroying the data behind a logger.debug message.
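The retry option could be sketched as a small wrapper with exponential backoff. This is a hypothetical helper, not something that exists in the repository: the name caption_with_retry, its parameters, and the default transient exception types are assumptions, and caption_image is passed in as a plain callable.

```python
import logging
import time

logger = logging.getLogger("captioning")


def caption_with_retry(
    caption_image,
    image_bytes,
    attempts=3,
    base_delay=1.0,
    transient=(TimeoutError, ConnectionError),  # assumption: adjust to the client
    sleep=time.sleep,  # injectable so tests do not have to wait
):
    """Call caption_image, retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return caption_image(image_bytes)
        except transient as exc:
            if attempt == attempts:
                # Out of retries: log at ERROR and let the caller decide.
                logger.error(
                    "Captioning failed after %d attempts: %s", attempts, exc
                )
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Caption attempt %d/%d failed (%s); retrying in %.1fs",
                attempt, attempts, exc, delay,
            )
            sleep(delay)
```

Exceptions outside the transient tuple propagate immediately, so formatting failures are not retried. The image bytes are never touched by the wrapper, which leaves the decision to discard them with the caller.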

Screenshots / Logs

No response

Environment

GSSoC '26

  • Yes, I am participating in GirlScript Summer of Code and would like to fix this.

Metadata

Labels

  • bug: Something isn't working
  • gssoc: GirlScript Summer of Code 2026 issue/PR
