- Fix some environments failing to split PDFs with the error "Can't patch loop of type <class 'uvloop.Loop'>" by removing usage of nest-asyncio
- Remove some operations under client.users that are not fully ready yet
- Provide a base UnstructuredClientError to capture every error raised by the SDK. Note that some exceptions, such as SDKError, now have more information in the message field. This will affect any users who rely on string matching in their error handling.
- Improve PDF validation error handling by introducing FileValidationError base class for better error abstraction
- Replace RequestError with PDFValidationError for invalid PDF files to provide more accurate error context
- Throw an appropriate error message when the given PDF file is invalid (corrupted or encrypted)
- Add Unstructured Platform APIs to manage source and destination connectors, workflows, and workflow runs
WARNING: This is a breaking change for clients that use a non-default server_url setting.
To set a custom URL for the client, use the server_url parameter in a given operation:
elements = client.general.partition(
    request=operations.PartitionRequest(
        partition_parameters=shared.PartitionParameters(
            files=shared.Files(
                content=doc_file,
                file_name="your_document.pdf",
            ),
            strategy=shared.Strategy.FAST,
        )
    ),
    server_url="your_server_url",
)
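
The error-handling entries above (UnstructuredClientError, FileValidationError, PDFValidationError) suggest branching on the exception type rather than on message text, since message contents can change between releases. The following is a minimal sketch of that idea only. The classes are local stand-ins that mirror the names in this changelog, the inheritance chain is an assumption, and classify is a hypothetical helper, not part of the SDK.

```python
# Local stand-ins that only mirror names from this changelog. They are NOT the
# SDK's real classes, and the hierarchy shown here is an assumption.
class UnstructuredClientError(Exception):
    """Stand-in for the SDK-wide base error."""


class SDKError(UnstructuredClientError):
    """Stand-in for an API-level error."""


class FileValidationError(UnstructuredClientError):
    """Stand-in for the file validation base class."""


class PDFValidationError(FileValidationError):
    """Stand-in for the error raised on corrupted or encrypted PDFs."""


def classify(exc: Exception) -> str:
    """Hypothetical helper: choose a handling path from the exception type."""
    # Check the most specific type first, then fall back to broader ones.
    if isinstance(exc, PDFValidationError):
        return "invalid-pdf"
    if isinstance(exc, FileValidationError):
        return "invalid-file"
    if isinstance(exc, UnstructuredClientError):
        return "sdk-error"
    return "unexpected"


try:
    raise PDFValidationError("File does not appear to be a valid PDF.")
except UnstructuredClientError as exc:
    # Catching the base class covers every stand-in error defined above.
    print(classify(exc))
```

Because the decision depends only on the type, a change to the wording of the message field does not alter which branch runs.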
- Use the configured server_url for our split page "dummy" request
- Switch to an httpx-based client instead of requests
- Switch to poetry for dependency management
- Add client-side parameter checking via Pydantic or TypedDict interfaces
- Add partition_async as a non-blocking alternative to partition
- Address some asyncio-based errors in PDF splitting logic
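
A non-blocking call such as partition_async is mainly useful when several documents are processed at once. The sketch below shows the general asyncio pattern only. fake_partition_async is a hypothetical stand-in that sleeps instead of calling the API, so its signature and return value are assumptions and do not reflect the SDK.

```python
import asyncio


async def fake_partition_async(file_name: str) -> list[str]:
    """Hypothetical stand-in for an awaitable partition call."""
    await asyncio.sleep(0.01)  # simulates waiting on the network
    return [f"element from {file_name}"]


async def partition_many(file_names: list[str]) -> list[list[str]]:
    """Start all requests together and wait for every result."""
    pending = [fake_partition_async(name) for name in file_names]
    # gather runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*pending)


print(asyncio.run(partition_many(["a.pdf", "b.pdf"])))
```

With a blocking call the waits would add up one after another; here they overlap, which is the practical benefit of an async variant.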