Commit e5aa142
realmarcin and claude authored
Fix boric acid mapping: consolidate CHEBI:33134 → CHEBI:33118 (#574)
* Fix boric-acid row: chebi_id/culturemech_term_id CHEBI:33134 → CHEBI:33118

  mappings/culturebotai_reviewed_ingredients.tsv row 141 ("Boric Acid", 168 occurrences) carried CHEBI:33134 in chebi_id and culturemech_term_id. CHEBI:33134 is "1H-phosphole", not boric acid; the correct term is CHEBI:33118 (boric acid, cas:10043-35-3), already present in the row's cas_rn / kg_microbe_node_id / mim_id columns.

  This source (priority=10 in scripts/consolidate_chemical_mappings.py) is what injected the spurious "CHEBI:33134 = Boric Acid" entity, and its whole synonym cluster, into kgmicrobe_unified_entity_mappings.sssom.tsv.gz. Regenerating the unified SSSOM from this corrected source removes it (separate commit).

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* unified SSSOM: remove spurious CHEBI:33134 "Boric Acid" entity (17 rows)

  CHEBI:33134 is "1H-phosphole", not boric acid. The corrected source (culturebotai_reviewed_ingredients.tsv, prior commit) no longer injects it, but the consolidator seeds from its own previous output and unions synonyms, so the stale entity survives a re-run; it must be removed directly.

  Removed 17 rows:

  - 13 with object_id == CHEBI:33134 (the bad "Boric Acid" entity): the backwards CHEBI:33118 -> skos:exactMatch -> CHEBI:33134 row, cas:10043-35-3 -> 33134, and 11 kgm.name:* -> 33134 synonym rows.
  - 1 with subject_id == CHEBI:33134 (the CHEBI:33134 -> CHEBI:33118 forward row): dropped rather than kept as a deprecated alias. 33134 is a real, distinct chemical (phosphole), so asserting it exactMatches boric acid is false.
  - 3 phosphole names cross-contaminated onto boric acid CHEBI:33118: kgm.name:{phosphole,1h-phosphole,1h-phospholeaindene} -> CHEBI:33118. These are 33134's real ChEBI synonyms, dragged in by the id collision; phosphole is not a boric-acid synonym.

  CHEBI:33134 now appears nowhere in the unified mappings. Boric acid CHEBI:33118 retains its 8 genuine kgm.name synonyms (h3bo3, boh3, boric_acid, h3bo, boron_trihydroxide, orthoboric_acid, trihydroxidoboron, h3bo3baker_0084) and 15 xref rows. Row count 598244 -> 598227.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
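The three removal rules listed in the commit message can be sketched as a single row filter. This is a minimal illustration, not the script used for the commit: the function name `drop_spurious_entity`, the dict-per-row representation, and the sample rows are assumptions made for the example, and only the `subject_id` and `object_id` SSSOM columns are modelled.

```python
# Hypothetical sketch: drop every mapping row tied to a spurious entity.
# Rows are plain dicts holding only subject_id / object_id; all other
# SSSOM columns are omitted for brevity.

def drop_spurious_entity(rows, bad_id, good_id, contaminating_subjects):
    """Return the rows that survive the three removal rules."""
    kept = []
    for row in rows:
        # Rule 1: anything mapped onto the bad entity.
        points_at_bad = row["object_id"] == bad_id
        # Rule 2: the bad entity asserted as a match for something else.
        comes_from_bad = row["subject_id"] == bad_id
        # Rule 3: the bad entity's own names attached to the good entity.
        contaminates_good = (
            row["object_id"] == good_id
            and row["subject_id"] in contaminating_subjects
        )
        if not (points_at_bad or comes_from_bad or contaminates_good):
            kept.append(row)
    return kept


BAD, GOOD = "CHEBI:33134", "CHEBI:33118"
PHOSPHOLE_NAMES = {
    "kgm.name:phosphole",
    "kgm.name:1h-phosphole",
    "kgm.name:1h-phospholeaindene",
}

# Illustrative rows only; the real file has 598244 rows before the fix.
sample = [
    {"subject_id": "CHEBI:33118", "object_id": "CHEBI:33134"},         # backwards row
    {"subject_id": "cas:10043-35-3", "object_id": "CHEBI:33134"},      # CAS -> bad id
    {"subject_id": "CHEBI:33134", "object_id": "CHEBI:33118"},         # forward row
    {"subject_id": "kgm.name:phosphole", "object_id": "CHEBI:33118"},  # contamination
    {"subject_id": "kgm.name:boric_acid", "object_id": "CHEBI:33118"}, # genuine synonym
]

cleaned = drop_spurious_entity(sample, BAD, GOOD, PHOSPHOLE_NAMES)
```

On the sample, only the genuine `kgm.name:boric_acid` row is kept. Rule 3 uses an explicit allow-list of contaminating names rather than a pattern, so the genuine boric-acid synonyms cannot be removed by accident.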
1 parent 2f78ecc commit e5aa142

2 files changed

Lines changed: 1 addition & 1 deletion

mappings/culturebotai_reviewed_ingredients.tsv

@@ -138,7 +138,7 @@ Methanol 173 CHEBI:17790 67-56-1 CHEBI:17790 CHEBI:17790 CHEBI:17790 MAPPED Cult
 Iron (II) sulfate heptahydrate 173 CHEBI:75836 7782-63-0 CHEBI:75836 CHEBI:75836 MAPPED CultureMech:015457; CultureMech:015458; CultureMech:015459
 Bacto Yeast Extract (Difco) 172 8013-01-2 FOODON:03315426 MAPPED CultureMech:008724; CultureMech:009578; CultureMech:008519
 L-Glutamine 172 CHEBI:18050 56-85-9 CHEBI:18050 CHEBI:18050 CHEBI:18050 MAPPED CultureMech:003664; CultureMech:003729; CultureMech:005218
-Boric Acid 168 CHEBI:33134 10043-35-3 CHEBI:33118 CHEBI:33118 CHEBI:33134 MAPPED CultureMech:015450; CultureMech:015451; CultureMech:015452
+Boric Acid 168 CHEBI:33118 10043-35-3 CHEBI:33118 CHEBI:33118 CHEBI:33118 MAPPED CultureMech:015450; CultureMech:015451; CultureMech:015452
 Bromothymol blue 168 CHEBI:86155 76-59-5 CHEBI:86155 CHEBI:86155 MAPPED CultureMech:004155; CultureMech:004261; CultureMech:005068
 Pyridoxine HCl 168 CHEBI:131531 5103-96-8 CHEBI:131531 CHEBI:131531 MAPPED CultureMech:007317; CultureMech:007318; CultureMech:007319
 Magnesium Sulfate Heptahydrate 166 CHEBI:86463 10043-67-1 CHEBI:86463 CHEBI:86463 MAPPED CultureMech:015450; CultureMech:015451; CultureMech:015452
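The changed row disagreed with itself: two of its ID columns held CHEBI:33134 while two others held CHEBI:33118. A check along the following lines could flag such rows. It is a hedged sketch only: the column names come from the commit message, but their assignment to positions in the row is inferred from the diff, and the helper names `distinct_chebi_ids` and `is_inconsistent` are hypothetical.

```python
# Hypothetical consistency check for one row of the reviewed-ingredients TSV.
# Column names are taken from the commit message; which value sits in which
# column is an assumption inferred from the diff.

CHEBI_COLUMNS = ("chebi_id", "kg_microbe_node_id", "mim_id", "culturemech_term_id")


def distinct_chebi_ids(row):
    """Return the distinct CHEBI CURIEs found in the row's ID columns."""
    return {
        row[col].strip()
        for col in CHEBI_COLUMNS
        # Skip empty cells and non-ChEBI identifiers such as FOODON terms.
        if row.get(col, "").strip().startswith("CHEBI:")
    }


def is_inconsistent(row):
    """True when the ID columns name more than one ChEBI term."""
    return len(distinct_chebi_ids(row)) > 1


old_row = {
    "chebi_id": "CHEBI:33134",
    "cas_rn": "10043-35-3",
    "kg_microbe_node_id": "CHEBI:33118",
    "mim_id": "CHEBI:33118",
    "culturemech_term_id": "CHEBI:33134",
}
# The fix sets both disagreeing columns to CHEBI:33118.
new_row = dict(old_row, chebi_id="CHEBI:33118", culturemech_term_id="CHEBI:33118")

print(is_inconsistent(old_row), is_inconsistent(new_row))
```

A check like this detects disagreement between columns only. It cannot tell which of the two IDs is correct; that still needs a lookup against ChEBI or the CAS number.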
-232 Bytes
Binary file not shown.