Commit e5aa142
Fix boric acid mapping: consolidate CHEBI:33134 → CHEBI:33118 (#574)
* Fix boric-acid row: chebi_id/culturemech_term_id CHEBI:33134 → CHEBI:33118
mappings/culturebotai_reviewed_ingredients.tsv row 141 ("Boric Acid", 168
occurrences) carried CHEBI:33134 in chebi_id and culturemech_term_id.
CHEBI:33134 is "1H-phosphole", not boric acid; the correct term is CHEBI:33118
(boric acid, cas:10043-35-3) — already present in the row's cas_rn /
kg_microbe_node_id / mim_id columns.
This source (priority=10 in scripts/consolidate_chemical_mappings.py) is what
injected the spurious "CHEBI:33134 = Boric Acid" entity — and its whole synonym
cluster — into kgmicrobe_unified_entity_mappings.sssom.tsv.gz. Regenerating the
unified SSSOM from this corrected source removes it (separate commit).
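A minimal sketch of the row-level correction described above, assuming the TSV row has been read into a dict keyed by column name. Only the `chebi_id` and `culturemech_term_id` column names come from this commit; the helper `fix_boric_acid_row` and the sample row are hypothetical illustrations, not code from the repository.

```python
# Hypothetical helper: rewrite the wrong ChEBI id in the two affected columns.
WRONG_ID = "CHEBI:33134"  # 1H-phosphole
RIGHT_ID = "CHEBI:33118"  # boric acid
ID_COLUMNS = ("chebi_id", "culturemech_term_id")


def fix_boric_acid_row(row):
    """Return a copy of the row with the wrong id replaced in ID_COLUMNS."""
    fixed = dict(row)  # copy so the caller's row is left untouched
    for col in ID_COLUMNS:
        if fixed.get(col) == WRONG_ID:
            fixed[col] = RIGHT_ID
    return fixed


# Illustrative row only; the real row 141 has more columns than shown here.
example = {"chebi_id": "CHEBI:33134", "culturemech_term_id": "CHEBI:33134"}
fixed_example = fix_boric_acid_row(example)
```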
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* unified SSSOM: remove spurious CHEBI:33134 "Boric Acid" entity (17 rows)
CHEBI:33134 is "1H-phosphole", not boric acid. The corrected source
(culturebotai_reviewed_ingredients.tsv, prior commit) no longer injects it, but
the consolidator seeds from its own previous output and unions synonyms, so the
stale entity survives a re-run — it must be removed directly. Removed 17 rows:
  - 13 object_id == CHEBI:33134 (the bad "Boric Acid" entity): the backwards
    CHEBI:33118 -> skos:exactMatch -> CHEBI:33134 row, cas:10043-35-3 ->
    CHEBI:33134, and 11 kgm.name:* -> CHEBI:33134 synonym rows.
  - 1 subject_id == CHEBI:33134 (CHEBI:33134 -> CHEBI:33118 forward row):
    dropped rather than kept as a deprecated alias. CHEBI:33134 is a real,
    distinct chemical (phosphole), so asserting that it exactMatches boric
    acid is false.
  - 3 phosphole names cross-contaminated onto boric acid CHEBI:33118:
    kgm.name:{phosphole,1h-phosphole,1h-phospholeaindene} -> CHEBI:33118.
    These are CHEBI:33134's real ChEBI synonyms, dragged in by the id
    collision; phosphole is not a boric-acid synonym.
CHEBI:33134 now appears nowhere in the unified mappings. Boric acid CHEBI:33118
retains its 8 genuine kgm.name synonyms (h3bo3, boh3, boric_acid, h3bo,
boron_trihydroxide, orthoboric_acid, trihydroxidoboron, h3bo3baker_0084) and 15
xref rows. Row count 598244 -> 598227.
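The removal rule above can be sketched as a row filter. This is an illustration under stated assumptions, not the script used for this commit: rows are assumed to be dicts with the SSSOM `subject_id` and `object_id` columns, and `is_spurious` and `drop_spurious` are hypothetical names.

```python
BAD_ID = "CHEBI:33134"    # 1H-phosphole, wrongly labelled "Boric Acid"
BORIC_ACID = "CHEBI:33118"
# The three phosphole names that leaked onto boric acid.
PHOSPHOLE_NAMES = {
    "kgm.name:phosphole",
    "kgm.name:1h-phosphole",
    "kgm.name:1h-phospholeaindene",
}


def is_spurious(row):
    """True for rows that mention CHEBI:33134 or map a phosphole name to boric acid."""
    if row["subject_id"] == BAD_ID or row["object_id"] == BAD_ID:
        return True
    return row["subject_id"] in PHOSPHOLE_NAMES and row["object_id"] == BORIC_ACID


def drop_spurious(rows):
    """Keep only the rows that are not spurious."""
    return [row for row in rows if not is_spurious(row)]


# Small illustrative sample, not the real 598244-row file.
sample = [
    {"subject_id": "CHEBI:33118", "object_id": "CHEBI:33134"},
    {"subject_id": "CHEBI:33134", "object_id": "CHEBI:33118"},
    {"subject_id": "kgm.name:phosphole", "object_id": "CHEBI:33118"},
    {"subject_id": "kgm.name:boric_acid", "object_id": "CHEBI:33118"},
    {"subject_id": "kgm.name:h3bo3", "object_id": "CHEBI:33118"},
]
kept = drop_spurious(sample)
```

Because the consolidator seeds from its own previous output, a filter like this would have to run on the unified file itself; correcting the source alone would not be enough.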
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent 2f78ecc commit e5aa142
2 files changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 138 | 138 | |
| 139 | 139 | |
| 140 | 140 | |
| 141 | | - |
| | 141 | + |
| 142 | 142 | |
| 143 | 143 | |
| 144 | 144 | |
Binary file not shown.