Commit e8d9e16
fix(src): hoist web-ingest regexes and document unsafe mach FFI
Two small correctness improvements prompted by the Rust audit. The
audit's broader claim — that production llm.rs, kb/ingest/youtube.rs,
and parts of kb/ingest/web.rs panicked on failure — turned out to be
wrong on closer reading: those .expect()/.unwrap() sites all live
inside #[cfg(test)] modules, which is the idiomatic Rust way to fail
a test. What did need work was narrower:
kb/ingest/web.rs
Eight regex_lite::Regex::new(...).unwrap() calls in production HTML
scrubbing helpers (extract_headings, html_to_text) recompiled the
same static pattern on every invocation and would panic with the
generic "called `Result::unwrap()` on an `Err` value" message if a
future edit introduced an invalid literal. Hoists all eight to
module-level once_cell::sync::Lazy<Regex> constants (HEADING_RE,
SCRIPT_RE, STYLE_RE, COMMENT_RE, BLOCK_ELEMENT_RE, HTML_TAG_RE,
WHITESPACE_RE, NEWLINE_COLLAPSE_RE), each initialized with
.expect("<name> regex must compile"). Each pattern now compiles once,
on first use, and the panic message names which literal is broken.
once_cell is already a dependency used across the crate.
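The hoisting pattern can be sketched with the standard library alone.
This is an illustration, not the code in the commit: the commit uses
once_cell::sync::Lazy and regex_lite::Regex, which are swapped here for
std::sync::OnceLock and a hypothetical Pattern type so the block needs
no external crates. Pattern, COMPILES, and whitespace_re are assumed
names that do not appear in the change.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts stand-in "compilations" so the once-only behavior is observable.
static COMPILES: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for regex_lite::Regex. It only checks that
// parentheses balance, which is enough to model a fallible constructor.
#[derive(Debug)]
struct Pattern {
    source: &'static str,
}

impl Pattern {
    fn new(source: &'static str) -> Result<Pattern, String> {
        COMPILES.fetch_add(1, Ordering::SeqCst);
        let mut depth = 0i32;
        for ch in source.chars() {
            match ch {
                '(' => depth += 1,
                ')' => depth -= 1,
                _ => {}
            }
            if depth < 0 {
                return Err(format!("unbalanced ')' in {:?}", source));
            }
        }
        if depth == 0 {
            Ok(Pattern { source })
        } else {
            Err(format!("unclosed '(' in {:?}", source))
        }
    }
}

// Compiles on the first call and returns the cached value afterwards.
// The expect message names the pattern, so a bad literal is easy to find.
fn whitespace_re() -> &'static Pattern {
    static WHITESPACE_RE: OnceLock<Pattern> = OnceLock::new();
    WHITESPACE_RE.get_or_init(|| {
        Pattern::new(r"[ \t]+").expect("WHITESPACE_RE regex must compile")
    })
}
```

Repeated calls to whitespace_re() return the same &'static Pattern, so
a helper that previously rebuilt the pattern per call pays the
construction cost once per process.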
diagnostics.rs
The macOS-only get_process_memory_bytes() contains an unsafe block
that calls mach_task_self() and task_info(). The block was
undocumented. Adds a SAFETY comment explaining the Mach FFI
contract: task-self port validity, flavor/out-struct match, zeroed
MaybeUninit alignment/bit-pattern validity, and the assume_init
ordering on the success path.
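As a rough, portable sketch of the shape such a SAFETY comment takes:
the block below uses a hypothetical fake_task_info() and FakeTaskInfo
in place of the real task_info() call and Mach out-struct, so it runs
on any platform. None of these names come from diagnostics.rs; only the
zeroed MaybeUninit, status check, and assume_init-on-success ordering
mirror what the commit message describes.

```rust
use std::mem::{size_of, MaybeUninit};

// Hypothetical plain-data out-struct standing in for the Mach info struct.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct FakeTaskInfo {
    virtual_size: u64,
    resident_size: u64,
}

const KERN_SUCCESS: i32 = 0;

// Hypothetical stand-in for task_info(): fills `out` and returns a status.
// Callers must pass valid, writable, aligned pointers.
unsafe fn fake_task_info(out: *mut FakeTaskInfo, count: *mut u32) -> i32 {
    let needed = (size_of::<FakeTaskInfo>() / size_of::<u32>()) as u32;
    unsafe {
        if *count < needed {
            return 1; // buffer too small: `out` is left untouched
        }
        out.write(FakeTaskInfo { virtual_size: 4096, resident_size: 1024 });
        *count = needed;
    }
    KERN_SUCCESS
}

fn resident_bytes(capacity_words: u32) -> Option<u64> {
    let mut info = MaybeUninit::<FakeTaskInfo>::zeroed();
    let mut count = capacity_words;

    // SAFETY:
    // - `info.as_mut_ptr()` is aligned, writable storage for exactly the
    //   struct type the callee fills in, and `count` is a live local.
    // - FakeTaskInfo contains only integers, so the all-zero bit pattern
    //   produced by `zeroed()` is a valid value.
    let status = unsafe { fake_task_info(info.as_mut_ptr(), &mut count) };
    if status != KERN_SUCCESS {
        return None;
    }

    // SAFETY: reached only on the success path, after the callee has
    // written the struct.
    let info = unsafe { info.assume_init() };
    Some(info.resident_size)
}
```

Keeping assume_init() behind the status check is the part worth
documenting: on the error path the value is never read as initialized.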
No behavior change. cargo check --all-targets clean; cargo test --lib
passes 311/312 (one pre-existing #[ignore]'d model-download test).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 04473dc commit e8d9e16
2 files changed
Lines changed: 52 additions & 18 deletions