Enhance VisionParser with confidence assessment features. Added metho… by hungranger · Pull Request #25 · aitomatic/ai-vision-capture

hungranger · 2025-11-10T17:30:38Z

Enhance VisionParser with confidence assessment features. Added methods to generate and parse confidence scores and reasons for extracted content. Updated logging and output structure to include confidence metrics for improved accuracy evaluation.


sang-d · 2025-11-12T07:05:53Z

@hungranger Hi Roy, the logic (and prompt) looks great. I'm still thinking about a few drawbacks:

- The change might affect existing integrations (unlikely, but we still need to test).
- Since we ask the LLM to do a more complex task, the quality of the extraction part might change. I really don't know whether it would be better or worse, though common sense suggests worse is more likely.

=> So if we can test this carefully, we can merge the change with more confidence. (Most likely we need to run tests manually and observe the results on various sample files.)

I also want to propose 2 more approaches:

- Add a new parameter such as with_confidence_score: when it is set, the call goes through the new flow; otherwise it goes through the current one. This is still 1 method, just with one more parameter.
- Clone vision_parser.py into a new module, for example vision_parser_with_confidence_score.py, so that developers can choose to call different methods with or without a confidence score.

Please consider and discuss a bit more.
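
For discussion, the first approach (one method, one extra parameter) could look roughly like the sketch below. Everything in it is hypothetical: `PageResult`, `ParserSketch`, `parse_page`, and the two injected extractor callables are illustrative names, not the actual VisionParser API. The only point it makes is that the default path stays untouched unless the caller opts in.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class PageResult:
    # Hypothetical result type, not the project's actual output structure.
    content: str
    confidence: Optional[float] = None
    confidence_reason: Optional[str] = None


class ParserSketch:
    """Hypothetical stand-in for the parser; both extractors are injected."""

    def __init__(
        self,
        extract_plain: Callable[[bytes], str],
        extract_with_confidence: Callable[[bytes], Tuple[str, float, str]],
    ) -> None:
        self._extract_plain = extract_plain
        self._extract_with_confidence = extract_with_confidence

    def parse_page(self, image: bytes, with_confidence_score: bool = False) -> PageResult:
        if not with_confidence_score:
            # Default path: same prompt and output shape as before the change.
            return PageResult(content=self._extract_plain(image))
        # Opt-in path: the more complex prompt that also asks for a score and reason.
        content, score, reason = self._extract_with_confidence(image)
        return PageResult(content=content, confidence=score, confidence_reason=reason)
```

Because the flag defaults to `False`, existing callers would keep the current behaviour, and the manual comparison on sample files could be run by calling the same method twice with the flag off and on.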


- Introduced `locate_text_in_an_image.ipynb` to showcase the `locate_text()` feature for semantic text location in images.
- Added `parse_tcb_pdf.ipynb` to demonstrate parsing capabilities of TCB bank statement PDFs using OCRParser.
- Created `test_ocr_parser.ipynb` to test the batch processing mode of OCRParser, including bounding box coordinates in the output.
- Included sample image `c1.png` and PDF `tcb_3pages.pdf` for demonstration purposes.

hungranger added 3 commits November 11, 2025 00:28

- 79593bf: Enhance VisionParser with confidence assessment features. Added methods to generate and parse confidence scores and reasons for extracted content. Updated logging and output structure to include confidence metrics for improved accuracy evaluation.
- e4e5b19: Refactor confidence assessment in VisionParser to calculate minimum confidence instead of average. Updated output messages and variable names accordingly for clarity.
- 4f4a84a: Refine confidence assessment instructions in VisionParser to emphasize document quality and extraction accuracy. Clarified scoring criteria and improved the structure of the confidence reason for better guidance on evaluating extracted content.
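
The switch from average to minimum confidence in e4e5b19 can be illustrated with a small sketch. It assumes per-page scores on a 0 to 1 scale and a hypothetical helper name, `document_confidence`; neither the scale nor the name is confirmed by the PR. It only shows why the minimum is the more conservative aggregate: one poorly extracted page is no longer hidden by several good ones.

```python
from statistics import mean
from typing import Optional, Sequence


def document_confidence(page_scores: Sequence[float]) -> Optional[float]:
    """Aggregate per-page confidence by taking the weakest page.

    Hypothetical helper: returns None when there are no scores to aggregate.
    """
    if not page_scores:
        return None
    return min(page_scores)


# Assumed example scores on a 0-1 scale; the third page extracted poorly.
scores = [0.95, 0.92, 0.40]
print(document_confidence(scores))  # 0.4
print(round(mean(scores), 2))       # 0.76, the average masks the weak page
```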

hungranger added 4 commits November 15, 2025 22:45

- 336e619: Introduce OCRParser for enhanced document block extraction and processing.
- 2762cb3: Add text location capabilities to OCRParser for enhanced document analysis
- 1d5a13d: Fix bounding box coordinate order in OCRParser prompts and documentation to standardize format as [xmin, ymin, xmax, ymax]. Update related extraction instructions and validation checks for consistency.
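
A validation check for the standardized [xmin, ymin, xmax, ymax] order might look like the sketch below. The function name `is_valid_bbox`, its optional image-size parameters, and the assumption that coordinates share the image's units are all illustrative; the checks actually used in OCRParser are not shown in this PR page.

```python
from typing import Optional, Sequence


def is_valid_bbox(
    box: Sequence[float],
    image_width: Optional[float] = None,
    image_height: Optional[float] = None,
) -> bool:
    """Check a box given as [xmin, ymin, xmax, ymax] (hypothetical helper)."""
    if len(box) != 4:
        return False
    xmin, ymin, xmax, ymax = box
    # Coordinates must be non-negative and describe a non-empty rectangle.
    if xmin < 0 or ymin < 0:
        return False
    if xmin >= xmax or ymin >= ymax:
        return False
    # Optional bounds check when the image size is known.
    if image_width is not None and xmax > image_width:
        return False
    if image_height is not None and ymax > image_height:
        return False
    return True


# A box on an assumed 200 x 100 image, first in the standardized order,
# then with the x and y values swapped.
print(is_valid_bbox([10, 20, 110, 60], image_width=200, image_height=100))  # True
print(is_valid_bbox([20, 10, 60, 110], image_width=200, image_height=100))  # False
```

A check like this only catches a swapped coordinate order when the swapped box leaves the image bounds, which is one reason to fix the order in the prompt itself rather than rely on validation alone.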

hungranger wants to merge 7 commits into main from feat/add-confident-score


2 participants
