Skip to content

fix: redact inline file data in prompt logs#1407

Open
atharvasingh7007 wants to merge 1 commit intosimonw:mainfrom
atharvasingh7007:fix/redact-inline-file-data
Open

fix: redact inline file data in prompt logs#1407
atharvasingh7007 wants to merge 1 commit intosimonw:mainfrom
atharvasingh7007:fix/redact-inline-file-data

Conversation

@atharvasingh7007
Copy link
Copy Markdown

Summary

  • redact file.file_data when prompt JSON is logged, alongside the existing image_url.url and input_audio.data handling
  • add a direct redact_data() regression covering inline image, audio, and file payloads while preserving external URLs and file_id
  • add a CLI regression proving PDF attachments keep their full payload in the outgoing OpenAI request but are redacted in stored prompt_json

Root cause

PDF attachments are encoded as inline {type: file, file: {file_data: data:application/pdf;base64,...}} payloads before they hit redact_data(), but redact_data() only knew how to scrub image_url.url and input_audio.data. That left the full base64 PDF content in responses.prompt_json.

Testing

  • python -m pytest tests/test_cli_openai_models.py
  • python -m ruff check .
  • python -m mypy llm

Closes #1396

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

redact_data() misses file.file_data -- base64 PDF contents persist in prompt_json logs

1 participant