Enhance VisionParser with confidence assessment features. Added metho…#25

Open
hungranger wants to merge 7 commits into main from feat/add-confident-score

Conversation

hungranger (Contributor) commented Nov 10, 2025

Enhance VisionParser with confidence assessment features.

…ds to generate and parse confidence scores and reasons for extracted content. Updated logging and output structure to include confidence metrics for improved accuracy evaluation.
…onfidence instead of average. Updated output messages and variable names accordingly for clarity.
…e document quality and extraction accuracy. Clarified scoring criteria and improved the structure of the confidence reason for better guidance on evaluating extracted content.

sang-d (Contributor) commented Nov 12, 2025

@hungranger Hi Roy, the logic (and prompt) looks great. I'm still thinking about a few drawbacks:

  • The change might impact existing integrations (unlikely), but we still need to test this.
  • Since we are asking the LLM to do a more complex task, the quality of the extraction part might change. (I really don't know whether it would be better or worse, though by common sense worse seems more likely.)

=> So if we can test this carefully, we can be more confident about merging the change. (Most likely we need to run tests manually and observe the results on various sample files.)

I also want to propose 2 more approaches:

  1. Add a new parameter such as with_confidence_score: when it is set, the call goes through this new flow; otherwise it goes through the current one. So this is still 1 method, just with one more parameter.

  2. Clone vision_parser.py into a new module, for example vision_parser_with_confidence_score.py, so that developers can choose to call different methods with or without a confidence score.

Please consider and discuss a bit more.
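
For discussion, approach 1 might look roughly like the sketch below. It is not the code in this PR: `VisionParserSketch`, `ParseResult`, and the two private helpers are hypothetical names, and the stubbed confidence values stand in for whatever the LLM would return. Only the `with_confidence_score` parameter name comes from the comment above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseResult:
    """Hypothetical result type; confidence fields stay None in the current flow."""
    content: str
    confidence: Optional[float] = None
    confidence_reason: Optional[str] = None


class VisionParserSketch:
    """Hypothetical stand-in for the parser, used only to show the dispatch."""

    def parse(self, page: str, with_confidence_score: bool = False) -> ParseResult:
        # One public method; the flag decides which flow runs.
        if with_confidence_score:
            return self._parse_with_confidence(page)
        return self._parse_plain(page)

    def _parse_plain(self, page: str) -> ParseResult:
        # Placeholder for the existing extraction-only flow.
        return ParseResult(content=page.strip())

    def _parse_with_confidence(self, page: str) -> ParseResult:
        # Placeholder for the new flow; a real score and reason would
        # come from the model response, not from constants.
        base = self._parse_plain(page)
        return ParseResult(base.content, confidence=1.0, confidence_reason="stub")
```

Because the flag defaults to `False`, existing callers would keep the current behavior unless they opt in, which is the main appeal of this option over changing the default flow.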

…ion to standardize format as [xmin, ymin, xmax, ymax]. Update related extraction instructions and validation checks for consistency.
- Introduced `locate_text_in_an_image.ipynb` to showcase the `locate_text()` feature for semantic text location in images.
- Added `parse_tcb_pdf.ipynb` to demonstrate parsing capabilities of TCB bank statement PDFs using OCRParser.
- Created `test_ocr_parser.ipynb` to test the batch processing mode of OCRParser, including bounding box coordinates in the output.
- Included sample image `c1.png` and PDF `tcb_3pages.pdf` for demonstration purposes.
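
As a reading aid for the `[xmin, ymin, xmax, ymax]` convention mentioned in the commit messages above, a validation check could look something like the following. This is a minimal sketch, not the check in this PR: `is_valid_bbox` is a hypothetical helper, and the requirement that coordinates be non-negative is an assumption.

```python
from typing import Sequence


def is_valid_bbox(box: Sequence[float]) -> bool:
    """Check a box given as [xmin, ymin, xmax, ymax] (hypothetical helper)."""
    if len(box) != 4:
        return False
    xmin, ymin, xmax, ymax = box
    # Assumption: coordinates are non-negative and the box has positive area.
    return xmin >= 0 and ymin >= 0 and xmin < xmax and ymin < ymax
```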
2 participants