[Question]: Cannot process scanned PDF in Knowledge Base - "Skipped empty document" error #431

### Do you need to ask a question?

- [x] I have searched the existing questions and discussions and this question is not already answered.
- [x] I believe this is a legitimate question, not just a bug or feature request.

### Your Question

## Problem

When I upload a scanned PDF (image-based, no text layer) to the Knowledge Base,
initialization fails with the following error:

    Skipped empty document: xxx.pdf
    No valid documents found
    RAG pipeline returned failure

## Steps to reproduce

1. Upload a scanned PDF to Knowledge Base
2. Initialization starts but fails immediately
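
To show what I think is happening, here is a minimal sketch of an "empty document" check. It assumes the loader skips a file when text extraction yields only whitespace on every page. That is my guess, not DeepTutor's actual code, and `is_effectively_empty` and `min_chars` are hypothetical names used only for illustration.

```python
from typing import Iterable, Optional


def is_effectively_empty(page_texts: Iterable[Optional[str]], min_chars: int = 1) -> bool:
    """Return True when all pages together hold fewer than `min_chars`
    non-whitespace characters (hypothetical helper, not DeepTutor code)."""
    total = 0
    for text in page_texts:
        # Some extractors return None for image-only pages; treat that as "".
        total += len("".join((text or "").split()))
    return total < min_chars


# A scanned PDF typically yields blank strings, because each page is an image.
scanned_pages = ["", "   ", "\n"]
text_pages = ["Chapter 1", "Some body text"]

print(is_effectively_empty(scanned_pages))  # True
print(is_effectively_empty(text_pages))     # False
```

The page strings can come from any text extractor, for example `page.extract_text()` on each page of a `pypdf.PdfReader`. For my scanned file I would expect every page to come back blank, which would explain the "Skipped empty document" message unless an OCR step runs first.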

## My setup

- DeepTutor version: v1.3.3
- OS: Windows
- LLM: DeepSeek
- Embedding: Jina

## Question

I installed magic-pdf hoping it would enable OCR for scanned PDFs, 
but the error persists. 

Is there a way to enable MinerU/magic-pdf for scanned PDF processing 
in the current v1.x architecture? If so, how should it be configured?

Thank you!

### Related Module

Knowledge Base Management

### Additional Context

_No response_
