This document describes how conversation attachments work in the Conversations application, including the upload process, security measures, and how documents are processed for use with Large Language Models (LLMs).
Two attachment scopes coexist:
- Conversation attachments (
ChatConversationAttachment.conversationset): scoped to a single chat. Indexed on the first chat turn that needs them. - Project attachments (
ChatConversationAttachment.projectset): scoped to a project, shared by every conversation in that project. Indexed at upload time so they are searchable from the very first turn of every conversation in the project.
Both share the same model, same storage, same RAG backend, and the same retrieval tools - they only differ in scope, when indexing happens, and how they appear in the LLM's context (see Hybrid context delivery).
- Overview
- Supported Attachment Types
- Architecture & Flow
- Project Attachments
- Security & Validation
- Document Processing for LLMs
- Configuration
Conversations allows users to attach files to their conversations with the AI assistant. These attachments can be:
- Images (displayed directly to vision-capable LLMs)
- PDF documents (sent as document URLs to the LLM)
- Other documents (converted to text and indexed for semantic search)
The attachment system uses S3-compatible object storage (such as MinIO in development) to store files securely. The backend generates presigned URLs that allow the frontend to upload files directly to the storage, without routing the file data through the backend server.
Note about documents: The system uses a tool called MarkItDown to convert various document formats (Word, Excel, PowerPoint, text files, etc.) into Markdown text for processing by LLMs. When at least one non-image document is attached either to the current conversation or to its project, the system enables:
- a hybrid context delivery that inlines small conversation documents directly into the LLM's system instructions (
full-context) and exposes the rest via tools (tool_call_only). Project documents are listed separately underproject_documentsand are alwaystool_call_only- they do not compete for the inlining budget. See Other Document Types. - a Retrieval-Augmented Generation (RAG) search tool (
document_search_rag) to query relevant sections of any attached document. The search payload covers the conversation's collection and, when applicable, the parent project's collection in a single call. Supports an optionaldocument_idargument to target a single attachment (conversation- or project-scoped). - a summarization tool (
summarize) to provide document summaries of files attached to the current conversation, also supportingdocument_idtargeting. - a project-scoped summarization tool (
summarize_project) for files in the project library, registered only when the conversation belongs to a project.⚠️ naive implementation at the moment, needs improvement before being used in production.
The following attachment types are supported:
- Images:
image/png,image/jpeg,image/gif,image/webp. - PDF documents:
application/pdf - Other documents:
- Microsoft Word:
application/vnd.openxmlformats-officedocument.wordprocessingml.document - Microsoft Excel:
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet - Microsoft PowerPoint:
application/vnd.openxmlformats-officedocument.presentationml.presentation - Text files:
text/plain,text/markdown,text/csv
- Microsoft Word:
Warning: The current implementation for PDF expects the LLM to be able to manage them. We need to improve the handling of PDFs in case the LLM cannot process them natively.
Todo:
- Add support for more file types and improve document processing workflows.
- Allow PDF management via RAG search when the LLM cannot handle them natively.
- Allow file type restrictions based on model settings, instead of globally.
- Improve the summarization tool to provide better summaries and handle larger documents.
- Start file upload right away when the user selects a file, instead of waiting for the user to send the message.
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Frontend │ │ Backend │ │ S3 Storage │ │ Malware Det.│
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │ │
│ 1. Create attachment│ │ │
├────────────────────>│ │ │
│ │ │ │
│ 2. Return presigned │ │ │
│ URL for upload │ │ │
│<────────────────────┤ │ │
│ │ │ │
│ 3. Upload file │ │ │
│ directly to S3 │ │ │
├──────────────────────────────────────────>│ │
│ │ │ │
│ 4. Notify upload │ │ │
│ completed │ │ │
├────────────────────>│ │ │
│ │ │ │
│ │ 5. Detect MIME type │ │
│ ├────────────────────>│ │
│ │ │ │
│ │ 6. Scan for malware │ │
│ ├──────────────────────────────────────────>│
│ │ │ │
│ │ 7. Update status │ │
│ 8. Return status │<──────────────────────────────────────────┤
│<────────────────────┤ │ │
│ │ │ │
When a user selects a file to upload, the frontend sends a POST request to create an attachment record:
Endpoint: POST /api/conversations/{conversation_id}/attachments/
Request payload:
{
"file_name": "document.pdf",
"size": 1048576,
"content_type": "application/pdf"
}Backend processing (ChatConversationAttachmentViewSet.perform_create):
- Verifies the user owns the conversation
- Generates a unique UUID for the file
- Creates a storage key:
{conversation_id}/attachments/{uuid}.{extension} - Creates a database record with status
PENDING
Response:
{
"id": "uuid-of-attachment",
"key": "conversation-id/attachments/file-id.pdf",
"file_name": "document.pdf",
"size": 1048576,
"upload_state": "pending",
"policy": "https://s3.example.com/bucket/...?presigned-params"
}The policy field contains a presigned URL valid for a limited time (configured by AWS_S3_UPLOAD_POLICY_EXPIRATION).
The frontend uses the presigned URL to upload the file directly to S3 storage using a PUT request.
Technical details:
- The presigned URL includes authentication parameters
- The upload is done with
Content-Typeheader matching the file's MIME type - No backend involvement in the data transfer
After successful upload, the frontend notifies the backend:
Endpoint: POST /api/conversations/{conversation_id}/attachments/{attachment_id}/upload-ended/
Backend processing (ChatConversationAttachmentViewSet.upload_ended):
-
MIME Type Detection (
chat/views.py):mime_detector = magic.Magic(mime=True) with default_storage.open(attachment.key, "rb") as file: mimetype = mime_detector.from_buffer(file.read(2048)) size = file.size
Uses
python-magicto detect the actual MIME type from file content (first 2048 bytes). -
Update attachment status:
- Status:
PENDING→ANALYZING - Store detected MIME type and actual file size
- Status:
-
Trigger Malware Detection:
malware_detection.analyse_file( attachment.key, safe_callback="chat.malware_detection.conversation_safe_attachment_callback", unknown_callback="chat.malware_detection.unknown_attachment_callback", unsafe_callback="chat.malware_detection.conversation_unsafe_attachment_callback", conversation_id=conversation_id, )
The malware detection service (configurable via MALWARE_DETECTION_BACKEND) scans the file and calls one of three callbacks:
Safe file (conversation_safe_attachment_callback):
- Status:
ANALYZING→READY - File is ready for use
Unsafe file (conversation_unsafe_attachment_callback):
- Status:
ANALYZING→SUSPICIOUS - File is quarantined and not accessible
- Security log entry created
Unknown status (unknown_attachment_callback):
- Handles special cases (e.g., file too large to analyze)
- Status:
ANALYZING→FILE_TOO_LARGE_TO_ANALYZE
Project attachments live on ChatProject rather than on a single ChatConversation. They are uploaded through a parallel viewset (ChatProjectAttachmentViewSet, POST /api/v1.0/projects/{project_id}/attachments/) and reuse the same upload pipeline (presigned URL → upload-ended → MIME detection → malware scan). Three things differ from the conversation flow.
Conversation attachments are indexed lazily when a chat turn first builds RAG context. Project attachments cannot follow that pattern - any conversation in the project may be the first to ask about a file, and we want them searchable immediately.
The malware safe callback for projects (chat.malware_detection.project_safe_attachment_callback) does two things in sequence after marking the attachment READY:
- Fetches the attachment with
select_related("project", "uploaded_by"). - Calls
chat.agent_rag.indexing.index_project_attachment(attachment).
index_project_attachment is a one-shot, idempotent indexer: it skips images, skips markdown companion rows (conversion_from set), skips attachments already carrying a rag_document_id, parses the file via RAG_DOCUMENT_PARSER, and stores the result in the project's RAG collection. Failures are logged and swallowed - the file stays READY and downloadable but won't surface in RAG search until a future re-index.
Each project lazily creates its own RAG collection on the first indexable upload. ChatProject. collection_id starts NULL; _ensure_project_collection(project) runs the creation under
select_for_update + transaction.atomic so two concurrent first-uploads on a fresh project cannot each create a competing collection at the backend (race condition).
At search time, when a conversation belongs to a project, the project's collection is added to the search payload alongside the conversation's own collection (read_only_collection_id=[project.collection_id]). The model can therefore pull chunks from both scopes in a single tool call.
The Albert backend records a per-document id (rag_document_id) at index time. The search tool prefers it over name-based filtering (document_ids: [<int>] instead of metadata_filters: document_name) - this is collection-aware and unambiguous, whereas a name match could collide across the conversation and project collections. The Find backend currently does not return per-document ids; both id-based and name-based filters fall back to a plain query.
Parsing a non-text input (PDF, DOCX, ODT, ...) is expensive: it can call Albert's parser endpoint, run MarkItDown, or shell out to odfdo. We do not want to pay that cost on every chat turn that reads the document. The companion attachment is the parsed-content cache that solves this:
| Original row | Companion row | |
|---|---|---|
file_name |
report.pdf |
report.pdf.md |
content_type |
application/pdf |
text/markdown |
conversion_from |
NULL |
<original.key> (the marker that distinguishes a companion from a real markdown upload) |
upload_state |
READY |
READY |
rag_document_id |
populated (Albert) | NULL |
| S3 key | Conversation: shared - one blob backs both rows (companion overwrites the original on upload). Project: distinct - companion stored at <original.key>.md so the original binary stays intact for direct retrieval. |
What reads it:
- Hybrid context delivery (
build_document_context_instruction): when a conversation attachment is small enough to inline asfull-context, the builder reads the companion's markdown from S3 and embeds it in the system prompt. Without the companion, every turn would have to re-parse the original. - System-prompt listing: every turn lists every text attachment by
idandtitle. The companion gives the listing a stable markdown title (report.pdfafter stripping.md) and adocument_idthat maps to the original throughconversion_from. summarizetool: pulls the parsed markdown directly from the companion's S3 blob.- RAG
document_idresolution: searches text attachments, which means companions are filtered out (conversion_fromset) - the model targets the original UUID, and the search uses the original'srag_document_id. The companion exists only as a parsed-content cache; it is never the search target.
What writes it:
- Conversation attachments: created lazily by
_parse_input_documentson the first chat turn that needs RAG context, after the parser runs. - Project attachments: created at index time inside
index_project_attachment, so the companion is in place before the very first chat turn that asks about the file.
Cleanup is set-based: the per-attachment delete path collects every key scheduled for removal (original + companion) into a set and hands it to _bulk_delete_s3_blobs, which issues one DELETE per unique blob. For project attachments this means two DELETE calls (original + <original.key>.md); for conversation attachments where the companion shares the original's key, set semantics dedup it down to one. The companion DB row is dropped explicitly on per-attachment delete (filter on conversion_from = original.key); on conversation/project cascade delete, both rows are dropped by the same DB cascade.
Three paths drop project attachments and each one cleans up the side effects in a defined order:
| Path | Order | Notes |
|---|---|---|
Per-attachment delete (DELETE /projects/{p}/attachments/{a}/) |
RAG document → S3 blob → companion row → original row | Each step is best-effort; failures are logged but never block the user-facing delete. Companion is matched by (project_id, conversion_from = original.key). |
Conversation delete (DELETE /chats/{c}/) |
RAG collection → S3 blobs (deduped) → DB cascade | Collection drop covers all per-doc state in one call, so per-attachment delete_document is not iterated. S3 keys are collected before cascade since CASCADE drops attachment rows but never touches storage. |
Project delete (DELETE /projects/{p}/) |
Child-conversation RAG collections → collect S3 keys → bulk conversation delete → project RAG collection → S3 blobs (deduped) → DB cascade | ChatConversation.project uses on_delete=SET_NULL, so child conversations are explicitly deleted here. Bulk QuerySet.delete() bypasses ChatViewSet.perform_destroy, which is why child collections are dropped above before the bulk delete fires. The S3-key set must be assembled before the bulk conversation delete so the queries still resolve to live attachment rows. |
The trade-off accepted on every path: a transient backend hiccup may strand orphaned RAG/S3 storage rather than strand a DB row the user cannot remove. Cleanup deduplicates S3 keys into a set before calling default_storage.delete, so conversation companions (which share the original's key) cost one DELETE while project companions (distinct <original.key>.md) cost two.
For now, the system is not intended to host user-uploaded files for public download. All files are stored in private S3 buckets with presigned URLs for controlled access and only the owner of the conversation/the uploader can access them, so the risk is quite low around bad use of the attachment system.
Also, the document content is sent to the LLM and does not prevent any prompt injection attacks, which is not an issue specific to the attachment system but to the overall design of LLM-based applications and should be addressed globally. Also for the moment, the system does not have any action tools that could be used to execute malicious code based on document content.
The malware detection system is pluggable and configurable, allowing different backends to be used.
By default, a DummyBackend is provided that marks all files as safe.
When a user sends a message with attachments, the system processes them differently based on their type:
MIME types: image/png, image/jpeg, image/gif, image/webp, etc.
Processing flow:
-
URL Conversion: Local media URLs are converted to presigned S3 URLs before sending to the LLM:
# From: chat/agents/local_media_url_processors.py content.url = generate_retrieve_policy(key)
-
Sent to LLM: Images are sent as
ImageUrlobjects in the prompt:ImageUrl( url="https://s3.example.com/bucket/key?presigned-params", identifier="file-id.png", )
-
Vision models can analyze the image content directly.
-
Response processing: After the LLM responds, presigned URLs are converted back to local URLs for storage:
# Mapping: presigned_url -> /media-key/{conversation_id}/attachments/{file_id}.png
MIME type: application/pdf
Processing flow:
-
Direct URL passing: PDFs are sent as
DocumentUrlobjects :DocumentUrl( url="https://s3.example.com/bucket/key?presigned-params", identifier="file-id.pdf", )
-
LLM processing: Compatible LLMs can:
- Extract and read text from PDFs
- Understand document structure
- Answer questions about the content
-
No conversion needed: PDFs are passed directly without preprocessing.
MIME types: Word documents, Excel spreadsheets, PowerPoint, text files, Markdown, etc.
Processing flow:
-
Document parsing: Two pluggable settings drive parsing and storage:
RAG_DOCUMENT_PARSER(defaultchat.agent_rag.document_converter.parser.AlbertParser): converts the source bytes to Markdown. Routes by content type - ODT files always useodfdo, everything other than PDF/ODT goes through MarkItDown, PDFs depend on which parser is configured (see below).RAG_DOCUMENT_SEARCH_BACKEND(AlbertRagBackendorFindRagBackend): stores the converted markdown in the configured vector index and serves search queries.
PDF parsing strategies (per
RAG_DOCUMENT_PARSER):-
AlbertParser: every PDF is sent unconditionally to the Albert/v1/parse-betaendpoint. -
AdaptivePdfParser: runs apypdf-based heuristic on the upload first (seeanalyze_pdfinparser.py):- Counts pages with extractable text and the average characters per page.
- If
avg_chars_per_page > MIN_AVG_CHARS_FOR_TEXT_EXTRACTIONANDtext_coverage > MIN_TEXT_COVERAGE_FOR_TEXT_EXTRACTION, the PDF is treated as a born-digital text PDF and converted locally via MarkItDown - no external API call. - Otherwise (scanned, image-only, or low-text-density PDF), the parser falls back to the configured OCR endpoint (
OCR_HRIDprovider's/v1/ocr, default Mistral OCR), processing the document inOCR_BATCH_PAGES-sized batches withOCR_MAX_RETRIESretry attempts.
The point of the heuristic is to avoid the latency/cost of OCR when
pypdfcan already pull clean text out of the PDF.
-
Conversion artifacts: For non-text inputs (PDF, DOCX, etc.) a hidden markdown companion attachment is created (
content_type=text/markdown,conversion_from=<original.key>). Conversation attachments create the companion lazily on the first chat turn that needs RAG; project attachments create it at upload time (see Project Attachments). Conversation companions reuse the original's S3 key (the companion overwrites the original blob); project companions are stored at a distinct<original.key>.mdblob so the original binary remains available for direct retrieval. -
Hybrid context delivery: On every chat turn, each text attachment is exposed to the LLM in one of two modes (described in detail below):
full-context: small enough conversation documents are inlined directly into the system instructions so the LLM can read them without calling a tool.tool_call_only: oversized or evicted conversation documents, and every project document, stay reachable via thedocument_search_ragtool, plussummarizefor conversation files orsummarize_projectfor project-library files.
-
RAG (Retrieval-Augmented Generation):
- Converted text is indexed in a vector database (Albert collection or Find index). Conversations and projects each have their own collection.
- The LLM uses the
document_search_ragtool to query relevant chunks. When the conversation belongs to a project, both the conversation collection and the project collection are searched in a single call. The tool accepts an optionaldocument_idargument to target a single attachment from either scope.
-
Summarization tool if needed (also accepts
document_id).
When at least one text attachment exists, the assembled system instruction includes a JSON listing of the attached documents. This is the LLM's "table of contents" — it tells the model which documents are present, which are inlined as full content, and which can only be reached via tools.
Listing shape (sent on every turn):
{
"documents_order": "newest_to_oldest",
"documents": [
{
"document_id": "<UUID>",
"title": "report.pdf",
"access": "full-context",
"content": "<full markdown of the document>",
"info": "last_uploaded_document"
},
{
"document_id": "<UUID>",
"title": "big_dataset.csv",
"access": "tool_call_only",
"content": "available via tools",
"info": null
}
],
"project_documents": [
{
"document_id": "<UUID>",
"title": "team-handbook.pdf",
"access": "tool_call_only",
"content": null,
"info": "first_uploaded_document"
}
],
"note": "Documents marked 'tool_call_only' are accessible through tools like RAG search or summary. Documents marked 'full-context' can be directly manipulated by you, ... Entries listed under 'project_documents' come from the user's project library and are shared across every conversation in this project."
}Notes:
document_idis the attachment's UUID and is the value the model passes back todocument_search_rag(document_id=...),summarize(document_id=...), orsummarize_project(document_id=...)to target a specific document. The id MUST come from the matching array (documents↔summarize,project_documents↔summarize_project); cross-array ids are rejected as IDOR violations.infoflags the first and last uploaded documents to help the LLM reason about temporal context.documentsandproject_documentscarry their own independentinfoordering.accessis the per-doc decision made by the inlining policy described below.project_documentsis present only when the conversation belongs to a project that has at least one indexable file. Every entry underproject_documentsistool_call_only- project files do not compete for the inlining budget. The key is dropped from the JSON entirely (not rendered asnullor[]) when no project files exist.
The decision of which documents are inlined as full-context vs left as tool_call_only is made by chat/document_context_builder.py:build_document_context_instruction on each turn:
- Compute the
document_budgetin tokens:document_budget = max(int(model.max_token_context * DOCUMENT_CONTEXT_BUDGET_RATIO) - DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS, 0) - Iterate documents oldest-first. For each document:
- If its token count exceeds the whole budget alone → keep
tool_call_only. - Otherwise, while adding it would overflow the budget, evict the oldest currently-inlined document (FIFO): demote it to
tool_call_only, free its tokens. - Once it fits, mark it
full-contextand inline its content.
- If its token count exceeds the whole budget alone → keep
- Edge cases:
- If the model has no
max_token_contextconfigured → all documents staytool_call_only(warning logged). - If
DOCUMENT_CONTEXT_BUDGET_RATIOis0→ all documents staytool_call_only. - If reading an attachment from object storage fails → that document stays
tool_call_onlyand the failure is logged; other documents are not affected.
- If the model has no
Token estimation uses tiktoken with the cl100k_base encoding (GPT-4 tokenizer). For non-OpenAI models (Mistral, Llama, Anthropic) actual usage may run 5-15% higher; the security buffer absorbs that drift.
The assembled instruction is cached per turn keyed on:
conversation_id, user_id, model_hrid, model.max_token_context, DOCUMENT_CONTEXT_BUDGET_RATIO, DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS, and a fingerprint of (attachment.id, attachment.updated_at) for every text attachment - conversation and project text attachments both contribute to the fingerprint. Any attachment add / remove / edit (including project files), or any settings change, invalidates the cache. TTL is 30 minutes (CACHE_TIMEOUT).
Three tools accept an optional document_id argument, each with its own IDOR boundary:
| Tool | Default scope (no document_id) |
document_id resolves against |
|---|---|---|
document_search_rag(query, document_id=None) |
Conversation collection + project collection (when applicable) | Conversation text attachments + project text attachments. |
summarize(instructions=None, document_id=None) |
Every conversation text attachment | Conversation text attachments only. |
summarize_project(instructions=None, document_id=None) (registered only when the conversation belongs to a project) |
Every project text attachment | Project text attachments only. |
For all three tools, the resolution is:
- Validate the value is a UUID.
- Look it up against the tool-specific attachment set (see table). If the UUID is not in that set, the tool raises
ModelRetry("not found") - this is the IDOR boundary. The boundaries are deliberately strict per-tool:summarizerejects a project-attachment id (the LLM should callsummarize_projectinstead),summarize_projectrejects a conversation-attachment id, anddocument_search_ragis the only widening tool because the model cannot tell from the question alone which scope to prefer. - For
document_search_rag, forward the resolved attachment to the backend, preferring the per-document id (rag_document_id) recorded at index time over the file name:- Albert: sends
document_ids: [<int>]on/v1/search. This is collection-aware and unambiguous, which matters when conversation and project collections both carry a file with the same name. - Find: per-document id is not available; the backend currently ignores both
document_idanddocument_namefilters and runs a plain query (known gap, follow-up work). - When
rag_document_idis missing (e.g. older attachment, Find backend), Albert falls back to ametadata_filters: { key: "document_name", value: ..., type: "eq" }clause; the file name isreport.pdf.mdstripped toreport.pdfif the attachment is a converted markdown copy.
- Albert: sends
The two summarize* tools share the same chunk-and-merge pipeline (_summarize_text_attachments in chat/tools/document_summarize.py) - they only differ in which attachment set they fetch and which scope-specific soft-fail message they emit ("no docs in this conversation" vs. "no project files").
If the model targeted a document via document_id and the backend returned no results, document_search_rag raises ModelRetry with explicit guidance: either retry without document_id (and explicitly tell the user the search was broadened) or stop and tell the user the targeted document does not contain the requested information. This replaces an earlier silent fallback that quietly widened scope without informing the model. See chat/tools/document_search_rag.py for the exact text.
FindRagBackend does not have feature parity with AlbertRagBackend. Operators selecting RAG_DOCUMENT_SEARCH_BACKEND should account for the following gaps:
| Capability | Albert | Find |
|---|---|---|
Per-document id captured at index time (rag_document_id) |
✅ | ❌ |
Per-attachment delete (DELETE /v1/documents/{id}) |
✅ removes chunks immediately | ❌ no-op; chunks remain searchable until the parent conversation/project is deleted and the whole collection is dropped |
document_id-targeted RAG search |
✅ filtered to one document | ❌ raises FindFilterUnsupportedError instead of silently returning full-collection hits |
document_name-targeted fallback |
✅ metadata_filters clause |
❌ raises FindFilterUnsupportedError |
RAG enable-gate (_check_should_enable_rag) |
✅ fires once any attachment has rag_document_id set |
❌ never fires - Find never populates rag_document_id, so the agent runs without RAG tools registered |
User-visible consequences when running Find:
- Deleting a sensitive attachment from a project does not remove its chunks from the search index. They keep surfacing in RAG searches until the project itself is deleted.
- A
document_id-targeted search fails fast withFindFilterUnsupportedErrorrather than silently returning unrelated hits. Callers (e.g. thedocument_search_ragtool) need to catch this and surface a clear "this backend cannot scope to one document; retry without document_id or switch backend" message. - The agent does not register any RAG tools. Because the enable-gate uses
rag_document_idas the "actually indexed" signal and Find never sets it, even successful Find indexing is invisible to the gate. The model answers from training data alone unless documents arrive in the current message. Selecting Find is therefore a deliberate "no per-document features, no RAG tools" choice until per-document ids are wired in.
The Albert backend is the recommended choice for any deployment that needs per-document control or RAG tooling. Wiring per-document operations into Find is a known follow-up.
Decision logic:
- No documents anywhere: Standard conversation, no RAG tools registered.
- Images on this turn: Send as direct (presigned) URLs to the LLM.
- PDFs on this turn: Send as direct (presigned) URLs to the LLM.
- Other documents present (this turn, the conversation, or the parent project): Convert to Markdown, build the documents listing (and
project_documentslisting if applicable) for the system instruction, inline what fits the budget asfull-context, expose the rest via thedocument_search_rag,summarize, and (when in a project)summarize_projecttools (with optionaldocument_idtargeting). RAG search covers the conversation collection and the project collection together.
The RAG-tool gate (AIAgentService._check_should_enable_rag) keys off rag_document_id - the "actually indexed" signal written only after parse_and_store_document round-trips successfully through the RAG backend - not raw READY upload state. It returns true when any of:
- the user message carries an in-message document upload,
- the conversation owns at least one attachment with a non-empty
rag_document_id, - the conversation belongs to a project that owns at least one such attachment.
A READY attachment whose rag_document_id is null (e.g. parse succeeded but the backend store call failed, or the backend simply never returns ids) does not trip the gate. This is why the Find backend - which never returns a per-document id - leaves RAG disabled even when its index call succeeds, as called out in the Find section above.
| Variable | Default | Description |
|---|---|---|
ATTACHMENT_MAX_SIZE |
Configurable | Maximum file size in bytes |
ATTACHMENT_CHECK_UNSAFE_MIME_TYPES_ENABLED |
True |
Enable/disable MIME type validation |
AWS_S3_UPLOAD_POLICY_EXPIRATION |
3600 | Presigned URL expiration (seconds) |
AWS_S3_RETRIEVE_POLICY_EXPIRATION |
3600 | Presigned retrieval URL expiration (seconds) |
AWS_S3_DOMAIN_REPLACE |
None | Alternative S3 domain for presigned URLs (for development) |
MALWARE_DETECTION_BACKEND |
DummyBackend |
Malware scanning backend class |
MALWARE_DETECTION_PARAMETERS |
{} |
Backend-specific configuration |
RAG_FILES_ACCEPTED_FORMATS |
See below | List of MIME types accepted for file uploads |
RAG_DOCUMENT_PARSER |
AlbertParser |
Import path of the parser that converts uploads to Markdown (PDF -> Albert API, ODT -> odfdo, others -> MarkItDown) |
RAG_DOCUMENT_SEARCH_BACKEND |
AlbertRagBackend |
Import path of the vector-search backend used for indexing and search (Albert or Find) |
PROJECT_FILES_MAX_COUNT |
10 |
Max non-image attachments per project (excludes hidden markdown companions). Enforced at upload-time in ChatProjectAttachmentViewSet. Bounds per-turn system-prompt token cost (every entry contributes to project_documents on every conversation turn). |
PROJECT_IMAGES_MAX_COUNT |
3 |
Max image attachments per project. Enforced at upload-time. Bounds per-turn vision token cost - every project image is pinned to every turn alongside conversation-message images, and provider request-level image caps (Anthropic ~20/request) clip the trailing entries first. |
DOCUMENT_CONTEXT_BUDGET_RATIO |
0.5 |
Fraction of model.max_token_context reserved for inlined documents (0 disables full-context inlining; everything stays tool_call_only) |
DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS |
1000 |
Tokens subtracted from the inlining budget to absorb tokenizer drift on non-OpenAI models |
This environment variable controls which file types users are allowed to upload as attachments to conversations.
Configuration:
- Type: List of strings (comma-separated MIME types when using environment variable)
- Default value: Includes a comprehensive list of document and image formats:
- Microsoft Office documents (
.docx,.pptx,.xlsx,.xls) - Text files (
.txt,.csv) - PDF documents (
.pdf) - HTML files
- Markdown files (
.md) - Outlook messages (
.msg) - Images (
.jpeg,.png,.gif,.webp)
- Microsoft Office documents (
Example configuration:
# In environment variable (comma-separated)
RAG_FILES_ACCEPTED_FORMATS="application/pdf,text/plain,image/png,image/jpeg"# In Django settings (as a Python list)
RAG_FILES_ACCEPTED_FORMATS = [
"application/pdf",
"text/plain",
"image/png",
"image/jpeg",
]How it's used:
- Backend: The list is exposed via the
/api/v1.0/config/endpoint aschat_upload_accept(MIME types joined with commas) - Frontend: The configuration is used to validate files before upload in the chat interface:
- Checks exact MIME type matches
- Supports wildcard patterns (e.g.,
image/*for all image types) - Supports file extension patterns (e.g.,
.pdf)
- User experience: Files that don't match the accepted formats are rejected with a user-friendly error message
Notes:
- This setting controls frontend validation only. Backend validation should also be implemented for security.
- Future improvements may include per-model file type restrictions.
Each entry in LLM_CONFIGURATIONS accepts a max_token_context integer field declaring the model's context window size. When set, it drives the inlining budget for the documents listing (document_budget = max_token_context * DOCUMENT_CONTEXT_BUDGET_RATIO - DOCUMENT_CONTEXT_SECURITY_BUFFER_TOKENS).
If a model has no max_token_context, all of its documents are kept tool_call_only regardless of size and a warning is logged on every chat turn. Setting the field accurately matters: too low and small documents get pushed to RAG-only when they could be inlined; too high and the LLM may exceed its real window.
MinIO (Development):
# docker-compose.yml
minio:
image: minio/minio
environment:
MINIO_ROOT_USER: minioadmin
MINIO_ROOT_PASSWORD: minioadmin
command: server /data --console-address ":9001"Possible causes:
- Presigned URL has expired
- S3 storage is not accessible from the LLM provider
- CORS configuration issues
Solution: Check AWS_S3_RETRIEVE_POLICY_EXPIRATION and S3 access policies.
Possible causes:
- Document conversion failed
- Vector database indexing failed
Check logs: Look for errors in DocumentConverter and RAG backend logs.
- Installation Guide - S3 storage setup
- LLM Configuration - Model capabilities for attachments
- Architecture - System overview
- Tools - Document search and RAG tools