fix: handle partition_pdf import failure gracefully in unstructured 0.18.18

partition_pdf now requires the unstructured_inference package, which may
not be installed. Make the import optional and check availability only
when actually processing PDF files. MD/TXT files don't need the partition
functions and should not fail because of missing PDF dependencies.
Co-Authored-By: unknown <>
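
The pattern the message describes can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `ParseError`, `read_file`, and the `pdf_partitioner` parameter are hypothetical names introduced here, and the file-type check by extension is a simplification. Only the `unstructured.partition.pdf.partition_pdf` import path and the error text are taken from the library and the diff.

```python
from io import BytesIO
from typing import Any, Callable, Optional

# Optional import: if the PDF dependencies are missing, keep the name as None
# instead of failing when this module is imported.
try:
    from unstructured.partition.pdf import partition_pdf as unstructured_partition_pdf
except ImportError:
    unstructured_partition_pdf = None


class ParseError(Exception):
    """Hypothetical stand-in for the parser's own error type."""


def read_file(
    filename: str,
    data: bytes,
    pdf_partitioner: Optional[Callable[..., Any]] = unstructured_partition_pdf,
) -> str:
    """Return the text of a file, checking PDF support only for PDF input."""
    name = filename.lower()

    # Plain-text formats never touch the partition functions.
    if name.endswith((".md", ".txt")):
        return data.decode("utf-8")

    if name.endswith(".pdf"):
        # Availability is checked here, at the point of use, not at import time.
        if not pdf_partitioner:
            raise ParseError(
                "PDF parsing requires the 'unstructured_inference' package. "
                "Install it with: pip install unstructured-inference"
            )
        # Pass a BytesIO object, as the original code comment recommends for PDFs.
        elements = pdf_partitioner(file=BytesIO(data))
        return "\n\n".join(str(element) for element in elements)

    raise ParseError(f"Unsupported file type: {filename}")
```

With this shape, `read_file("notes.md", b"# Title")` succeeds whether or not the PDF dependencies are installed, while a PDF input raises `ParseError` with the install hint only when the partition function is unavailable.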
-            # check whether unstructured library is actually available for better error message and to ensure proper typing (can't be None after this point)
-            raise Exception("unstructured library is not available")

         file: Any = file_handle

@@ -367,15 +357,25 @@ def _read_file_locally(

         try:
             if filetype == FileType.PDF:
-                # for PDF, read the file into a BytesIO object because some code paths in pdf parsing are doing an instance check on the file object and don't work with file-like objects
+                if not unstructured_partition_pdf:
+                    raise self._create_parse_error(
+                        remote_file,
+                        "PDF parsing requires the 'unstructured_inference' package. Install it with: pip install unstructured-inference",