Commit df65ceb
committed
fix: filter batch error results instead of passing them as Documents
Kreuzberg's batch APIs return error results as ExtractionResult with
empty metadata and None quality_score instead of raising exceptions.
Previously these were silently passed through as valid Documents.
- Add _is_batch_error() to utils.py to detect error results
- Add _collect_batch_results() static method to filter errors with
structured warning logs (classify_error + error_code_name)
- Fix LogRecord conflict: rename 'name' kwarg to 'code_name' in
both batch and sequential error logging paths
- Add 3 unit tests for _is_batch_error detection logic
- Add 3 integration tests for corrupt file/bytestream filtering1 parent 419be44 commit df65ceb
4 files changed
Lines changed: 117 additions & 9 deletions
File tree
- integrations/kreuzberg
- src/haystack_integrations/components/converters/kreuzberg
- tests
Lines changed: 34 additions & 9 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
22 | 23 | | |
23 | 24 | | |
24 | 25 | | |
| |||
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
| 37 | + | |
36 | 38 | | |
37 | 39 | | |
38 | 40 | | |
| |||
218 | 220 | | |
219 | 221 | | |
220 | 222 | | |
221 | | - | |
222 | | - | |
223 | | - | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
224 | 227 | | |
225 | 228 | | |
226 | 229 | | |
227 | | - | |
228 | | - | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
229 | 234 | | |
230 | | - | |
231 | | - | |
232 | 235 | | |
233 | 236 | | |
234 | 237 | | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
235 | 260 | | |
236 | 261 | | |
237 | 262 | | |
| |||
501 | 526 | | |
502 | 527 | | |
503 | 528 | | |
504 | | - | |
| 529 | + | |
505 | 530 | | |
506 | 531 | | |
507 | 532 | | |
508 | | - | |
| 533 | + | |
509 | 534 | | |
510 | 535 | | |
511 | 536 | | |
| |||
Lines changed: 11 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| 12 | + | |
12 | 13 | | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
16 | 27 | | |
17 | 28 | | |
18 | 29 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| |||
693 | 694 | | |
694 | 695 | | |
695 | 696 | | |
| 697 | + | |
| 698 | + | |
| 699 | + | |
| 700 | + | |
| 701 | + | |
| 702 | + | |
| 703 | + | |
| 704 | + | |
| 705 | + | |
| 706 | + | |
| 707 | + | |
| 708 | + | |
| 709 | + | |
| 710 | + | |
| 711 | + | |
| 712 | + | |
| 713 | + | |
| 714 | + | |
| 715 | + | |
| 716 | + | |
| 717 | + | |
| 718 | + | |
| 719 | + | |
| 720 | + | |
| 721 | + | |
| 722 | + | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
| 726 | + | |
Lines changed: 41 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
514 | 514 | | |
515 | 515 | | |
516 | 516 | | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
| 526 | + | |
| 527 | + | |
| 528 | + | |
| 529 | + | |
| 530 | + | |
| 531 | + | |
| 532 | + | |
| 533 | + | |
| 534 | + | |
| 535 | + | |
| 536 | + | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
| 554 | + | |
| 555 | + | |
| 556 | + | |
| 557 | + | |
0 commit comments