Your Question
Problem
When uploading a scanned PDF (image-based, no text layer) to the Knowledge Base,
the initialization fails with the following error:
Skipped empty document: xxx.pdf
No valid documents found
RAG pipeline returned failure
Steps to reproduce
- Upload a scanned PDF to the Knowledge Base
- Initialization starts but fails immediately
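A quick way to confirm that a file really has no text layer is to count the characters a text extractor returns for each page. The following is a minimal sketch, not DeepTutor's own code: the helper names `has_text_layer` and `extract_page_texts` and the `min_chars` threshold are hypothetical, and the extraction step assumes the third-party `pypdf` package is installed.

```python
from typing import Iterable, List


def has_text_layer(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Return True once the pages hold at least min_chars non-whitespace characters.

    min_chars is an arbitrary illustrative threshold, not a DeepTutor setting.
    """
    total = 0
    for text in page_texts:
        # Count only non-whitespace characters; treat None as an empty page.
        total += sum(1 for ch in (text or "") if not ch.isspace())
        if total >= min_chars:
            return True
    return False


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract the text of every page. Assumes pypdf is installed (pip install pypdf)."""
    from pypdf import PdfReader  # imported lazily so has_text_layer works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Calling `has_text_layer(extract_page_texts("xxx.pdf"))` on an image-only scan should return `False`, which would be consistent with the file being treated as an empty document.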
My setup
- DeepTutor version: v1.3.3
- OS: Windows
- LLM: DeepSeek
- Embedding: Jina
Question
I installed magic-pdf hoping it would enable OCR for scanned PDFs,
but the error persists.
Is there a way to enable MinerU/magic-pdf for scanned PDF processing
in the current v1.x architecture? If so, how should it be configured?
Thank you!
Related Module
Knowledge Base Management