Commit 7e4f54c
Merge pull request #72 from AdaWorldAPI/claude/qwen-claude-reverse-eng-vHuHv
feat: index Reader LM 1.5B + BGE-M3 for OSINT pipeline

jinaai/reader-lm-1.5b (safetensors, 1 shard, 3.1 GB):
  HTML→Markdown local model. No Jina API needed.
  bgz7 → palette → O(1) HTML structure recognition.

CompendiumLabs/bge-m3-gguf (GGUF F16, ~1.2 GB):
  Multilingual embedding model. Replaces DeepNSM for non-English.
  bgz7 → palette → O(1) semantic similarity.

Together: Reader LM reads the web, BGE-M3 embeds it, AriGraph
stores it as SPO triplets, AutocompleteCache routes it at 17K tok/sec.
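
The read → embed → store flow described above can be sketched roughly as follows. This is only an illustration of the data flow, not the repository's code: `Triplet`, `TripletStore`, `run_pipeline`, and the `toy_*` stages are hypothetical stand-ins (for AriGraph, Reader LM, and BGE-M3 respectively), and the AutocompleteCache routing stage is left out because the commit message gives no detail on it.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triplet:
    """One subject-predicate-object (SPO) fact."""
    subject: str
    predicate: str
    obj: str


@dataclass
class TripletStore:
    """Hypothetical stand-in for AriGraph: indexes triplets by subject."""
    by_subject: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, t: Triplet) -> None:
        self.by_subject[t.subject].append(t)

    def about(self, subject: str) -> list:
        # .get avoids creating an empty entry for unknown subjects.
        return list(self.by_subject.get(subject, []))


def run_pipeline(html, read, embed, extract, store):
    """Read HTML into Markdown, embed it, and store extracted triplets."""
    markdown = read(html)
    vector = embed(markdown)
    for t in extract(markdown):
        store.add(t)
    return markdown, vector


# Toy stages (assumptions): the real stages would call the two models.
def toy_read(html: str) -> str:
    # Stand-in for Reader LM: just strips tags.
    return re.sub(r"<[^>]+>", "", html).strip()


def toy_embed(markdown: str) -> list:
    # Stand-in for BGE-M3: a one-dimensional "embedding" (text length).
    return [float(len(markdown))]


def toy_extract(markdown: str):
    # Treats every three-part line "S P O" as one triplet.
    for line in markdown.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3:
            yield Triplet(*parts)
```

Passing the stages in as plain callables keeps the sketch independent of how either model is actually loaded or invoked.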
https://claude.ai/code/session_01M3at4EuHVvQ8S95mSnKgtK5 files changed
Lines changed: 88 additions & 491 deletions