Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
I would like to improve the official PaddleOCR plugin so it can consume Dify uploaded file variables directly.
Currently, the PaddleOCR tools expose the file parameter as type: string, and the descriptions say users should provide either:
- a publicly accessible image/PDF URL, or
- a base64-encoded image/PDF payload.
This works, but it is not very natural in Dify workflows. In many Chatflow / Workflow scenarios, the user already has a Dify uploaded file variable from an upload input, chat attachment, or upstream file-producing node. For those users, requiring a public URL or manual base64 conversion creates an extra integration step.
I am interested in contributing this feature, but before opening a PR I would like to confirm the preferred plugin interface design with maintainers, because there is a trade-off between a cleaner UX and backward compatibility.
2. Additional context or comments
I see two possible implementation paths.
Option A: keep one file parameter, change it from string to file
This would make the PaddleOCR UI cleaner:
- file becomes type: file.
- The Python implementation still accepts both File objects and legacy strings.
- If file is a Dify File, the plugin reads file.blob, base64-encodes it, and sends it to the PaddleOCR API.
- If file is a string, the plugin keeps the current behavior and passes it through as URL/base64.
- When fileType is auto, the plugin can infer PDF/image from mime_type, extension, or filename.
This gives users a single obvious input field.
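To make the Option A runtime path concrete, here is a minimal sketch of how a single file argument could be normalized. The helper name normalize_file_input and the FakeUploadedFile stand-in are hypothetical; the only assumption about the real Dify File object is that it exposes the uploaded bytes as .blob, as described above.

```python
import base64
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeUploadedFile:
    """Hypothetical stand-in for a Dify File; only .blob is modelled here."""
    blob: bytes


def normalize_file_input(value: Any) -> str:
    """Return the string sent to the PaddleOCR API: a URL or a base64 payload."""
    if isinstance(value, str):
        # Legacy path: a URL or base64 string is passed through unchanged.
        if not value.strip():
            raise ValueError("file must not be empty")
        return value

    # Uploaded-file path: read the raw bytes and base64-encode them.
    blob = getattr(value, "blob", None)
    if isinstance(blob, (bytes, bytearray)):
        return base64.b64encode(bytes(blob)).decode("ascii")

    raise TypeError(f"unsupported file input: {type(value).__name__}")
```

Duck typing on .blob keeps the sketch independent of the plugin SDK import path, which the actual PR would replace with a proper isinstance check against the SDK's File class.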
The concern is that changing the plugin schema from string to file changes the external parameter contract. Existing workflows that bind file to a text variable or constant URL/base64 may still work at the Python layer, but the Dify UI / workflow configuration layer may treat the parameter differently after the schema change. It may also affect Agent use cases, since file-type tool parameters are handled differently from string parameters.
Option B: keep the existing file string parameter and add a new file_upload parameter
This is more backward-compatible:
- Keep file as type: string for existing URL/base64 users.
- Add file_upload as type: file for Dify uploaded files.
- Runtime behavior: prefer file_upload when provided, otherwise fall back to file.
- Document file_upload as the recommended input for new Dify workflows, and file as the legacy URL/base64 input.
This avoids breaking existing workflows and Agent-style URL/base64 usage, but the UI will show two file-related inputs, which may be more confusing for new users.
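The Option B precedence rule could look roughly like the sketch below. The function name pick_file_argument is hypothetical, and I am assuming the tool receives its parameters as a plain mapping keyed by parameter name; only the file and file_upload keys come from the proposal above.

```python
from typing import Any, Mapping


def pick_file_argument(tool_parameters: Mapping[str, Any]) -> Any:
    """Prefer file_upload when provided, otherwise fall back to the legacy file string."""
    upload = tool_parameters.get("file_upload")
    if upload is not None:
        # New path: a Dify uploaded file wins over the legacy string.
        return upload

    legacy = tool_parameters.get("file")
    if isinstance(legacy, str) and legacy.strip():
        # Legacy path: URL or base64 string.
        return legacy

    raise ValueError("provide either file_upload or file")
```

Raising when both inputs are missing keeps the error close to the tool boundary, so the user sees a clear message instead of a failure from the PaddleOCR API.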
Proposed shared behavior for either option
For the implementation itself, I would keep the change local to the PaddleOCR plugin:
- Normalize file input in a shared helper.
- Convert Dify File.blob to base64 before calling the PaddleOCR API.
- Preserve current URL/base64 string behavior.
- Infer fileType only when the user leaves it as auto.
- Add unit tests for uploaded file input, URL/base64 string input, file type inference, explicit fileType override, and all three PaddleOCR tools.
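As an illustration of the inference rule in the list above, the following sketch resolves fileType from mime_type, extension, or filename, and only when the value is auto. The function name, the "pdf" / "image" return labels, and the set of image extensions are assumptions for this example; the real plugin would map the result to whatever value the PaddleOCR API expects.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed set of image extensions; adjust to what the PaddleOCR API accepts.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".webp"}


def infer_file_type(
    file_type: str = "auto",
    mime_type: Optional[str] = None,
    extension: Optional[str] = None,
    filename: Optional[str] = None,
) -> str:
    """Return "pdf" or "image"; an explicit file_type always wins over inference."""
    if file_type != "auto":
        return file_type

    # 1. MIME type is the most reliable hint when present.
    mime = (mime_type or "").lower()
    if mime == "application/pdf":
        return "pdf"
    if mime.startswith("image/"):
        return "image"

    # 2. Fall back to the extension, then to the filename suffix.
    candidates = [extension or "", PurePosixPath(filename or "").suffix]
    for ext in candidates:
        ext = ext.lower()
        if ext and not ext.startswith("."):
            ext = "." + ext
        if ext == ".pdf":
            return "pdf"
        if ext in IMAGE_EXTENSIONS:
            return "image"

    raise ValueError("could not infer fileType; please set it explicitly")
```

A pure function like this is easy to cover with the unit tests listed above, for example the explicit override case: infer_file_type("pdf", mime_type="image/png") returns "pdf".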
My question for maintainers:
Which interface would you prefer for an official plugin?
- Option A: one clean file parameter, with the schema changed to type: file, while preserving string compatibility in Python where possible.
- Option B: keep file as string and add file_upload as a separate uploaded-file parameter for maximum compatibility.
I am happy to prepare the PR following the preferred direction.
3. Can you help us with this feature?