Commit 4b538ad

reorder pdf parser to make the ones with metadata first
1 parent b58bcfa commit 4b538ad

1 file changed

Lines changed: 2 additions & 2 deletions

WDoc/utils/loaders.py
@@ -188,11 +188,11 @@ def load(self) -> List[Document]:
         return docs
 
 pdf_loaders = {
+    "PyMuPDF": PyMuPDFLoader, # good for metadata
+    "PdfPlumber": PDFPlumberLoader, # good for metadata
     "PDFMiner": PDFMinerLoader, # little metadata
     "PyPDFLoader": PyPDFLoader, # little metadata
     "PyPDFium2": PyPDFium2Loader, # little metadata
-    "PyMuPDF": PyMuPDFLoader, # good for metadata
-    "PdfPlumber": PDFPlumberLoader, # good for metadata
     "pdftotext": None, # optional support, see below
     "openparse": OpenparseDocumentParser, # gets page number too, finds individual elements, kinda slow but good, optional table support
     # "Unstructured_fast": partial(
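
Why the order of entries matters: Python dicts preserve insertion order (guaranteed since 3.7), so any code that iterates over pdf_loaders sees PyMuPDF and PdfPlumber first after this change. The diff does not show how the dict is consumed, so the following is only a minimal sketch under the assumption that loaders are tried in order and the first one that succeeds is kept. The loader functions and first_successful are hypothetical stand-ins, not part of WDoc or LangChain.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical stand-ins for real loader classes; each returns (text, metadata).
def rich_loader(path: str) -> Tuple[str, Dict[str, Any]]:
    return "text", {"title": "Example", "author": "A. Writer", "page": 1}

def sparse_loader(path: str) -> Tuple[str, Dict[str, Any]]:
    return "text", {"page": 1}

def broken_loader(path: str) -> Tuple[str, Dict[str, Any]]:
    raise RuntimeError("parser failed")

Loader = Optional[Callable[[str], Tuple[str, Dict[str, Any]]]]

def first_successful(loaders: Dict[str, Loader], path: str):
    """Try loaders in dict insertion order; return (name, result) of the first success."""
    for name, loader in loaders.items():
        if loader is None:  # placeholder entry, e.g. optional support wired up elsewhere
            continue
        try:
            return name, loader(path)
        except Exception:
            continue  # fall through to the next loader
    raise RuntimeError("no loader succeeded")

# Metadata-rich loader listed first, mirroring the new ordering in the diff.
metadata_first = {"rich": rich_loader, "sparse": sparse_loader, "optional": None}
# Old-style ordering: the sparse loader wins because it comes first.
metadata_last = {"sparse": sparse_loader, "rich": rich_loader, "optional": None}

name_new, (_, meta_new) = first_successful(metadata_first, "paper.pdf")
name_old, (_, meta_old) = first_successful(metadata_last, "paper.pdf")
```

Under that assumption, putting the metadata-rich parsers first means a successful parse carries title and author fields whenever those parsers work, and the sparser ones only serve as fallbacks.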
