Description of the Bug
In generate_captions_for_chunks, if caption_image raises an exception, the except block catches it, logs the failure at debug level, strips image_bytes from the chunk, and sets a fallback text string.
Stripping the bytes prevents serialization errors, but silently discarding the image data after a transient failure (e.g., OpenAI rate limits or brief downtime) means the application permanently loses the ability to retry captioning that image later in the pipeline.
Steps to Reproduce
Separate system/network failures (which should probably be retried or bubble up) from formatting failures.
If a permanent failure is assumed, log it with logger.error or logger.warning instead of logger.debug so administrators are aware that image data is being permanently discarded.
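The separation described above could look roughly like the sketch below. It is only an illustration, not the project's actual code: the helper name caption_chunk, the dict-shaped chunk, the caption_pending flag, the fallback string, and the choice of TimeoutError/ConnectionError as the "transient" set are all assumptions. The real exception types depend on the client library in use.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: which exception types count as transient. Adjust to the
# exceptions the actual captioning client raises.
TRANSIENT_ERRORS = (TimeoutError, ConnectionError)


def caption_chunk(chunk, caption_image):
    """Caption one chunk dict in place; return True if a caption was produced.

    Hypothetical helper: `chunk` is assumed to be a dict holding
    `image_bytes`, and `caption_image` a callable taking those bytes.
    """
    try:
        chunk["text"] = caption_image(chunk["image_bytes"])
        # Assumption: the bytes are no longer needed once captioned.
        chunk.pop("image_bytes", None)
        return True
    except TRANSIENT_ERRORS as exc:
        # Keep image_bytes so a later stage can retry this chunk.
        logger.warning(
            "Transient captioning failure; keeping image for retry: %s", exc
        )
        chunk["caption_pending"] = True
        return False
    except Exception as exc:
        # Treated as permanent: the payload is dropped, but loudly.
        logger.error(
            "Permanent captioning failure; discarding image payload: %s", exc
        )
        chunk.pop("image_bytes", None)
        chunk["text"] = "[image: caption unavailable]"
        return False
```

Chunks left with caption_pending set still carry raw bytes, so whatever serializes the chunks would need to retry or strip them first.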
Expected Behavior
If an image fails to caption due to a transient error (like a network timeout), the application should visibly surface a WARNING or ERROR log indicating that the image payload is being dropped, or ideally retry or re-raise the error rather than silently destroying the data behind a logger.debug call.
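One possible shape for the retry-then-raise behavior is sketched here. The function name caption_with_retry, its parameters, and the default set of transient exceptions are hypothetical; a real fix would use the exception classes raised by the captioning client.

```python
import logging
import time

logger = logging.getLogger(__name__)


def caption_with_retry(
    caption_image,
    image_bytes,
    attempts=3,
    base_delay=1.0,
    transient=(TimeoutError, ConnectionError),  # assumption: transient set
    sleep=time.sleep,
):
    """Call caption_image, retrying transient errors with exponential backoff.

    Hypothetical wrapper: re-raises the last transient error once the
    attempts are exhausted; non-transient errors propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return caption_image(image_bytes)
        except transient as exc:
            if attempt == attempts:
                logger.error(
                    "Captioning failed after %d attempts: %s", attempts, exc
                )
                raise
            # Delay doubles each attempt: base_delay, 2*base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Transient captioning error (attempt %d/%d), retrying in %.1fs: %s",
                attempt, attempts, delay, exc,
            )
            sleep(delay)
```

Because the final error is re-raised instead of swallowed, the caller decides whether to drop the image, and any drop is preceded by WARNING and ERROR entries in the log.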
Screenshots / Logs
No response
Environment
GSSoC '26