Commit b814ece (1 parent 604c4a7)
fix: properly handle the case when an element's text is None (#3995)
Some elements, like `Image`, can have `None` as the value of their
`text` attribute. In that case the current chunking logic fails because
it assumes the field always has a length and can be split. The fix is
to use `element.text or ""` when checking length and to add an early
exit so that split is never called on `None`.
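
The two parts of the fix described above can be illustrated with a minimal sketch. This is not the code from the commit: the `Element` dataclass, `text_length`, and `split_text` are hypothetical names invented for illustration, and the fixed-width split is only a stand-in for the real chunking logic.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a document element (not the library's class)."""

    text: Optional[str] = None


def text_length(element: Element) -> int:
    # `element.text or ""` maps None to "", so len() never receives None.
    return len(element.text or "")


def split_text(element: Element, max_chars: int) -> Iterator[str]:
    # Early exit: there is nothing to split when text is None or empty.
    if not element.text:
        return
    text = element.text
    # Naive fixed-width split, assumed here for illustration only.
    # max_chars must be positive, otherwise range() raises ValueError.
    for start in range(0, len(text), max_chars):
        yield text[start:start + max_chars]
```

With this shape, an element such as `Element(text=None)` reports a length of 0 and yields no pieces, instead of raising on `len(None)` or on a split call against `None`.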
4 files changed
Lines changed: 18 additions & 4 deletions
File tree
- test_unstructured/chunking
- unstructured
  - chunking