Implementation Date: March 14, 2026
Status: Phase 1-4 Complete
This document describes the implementation of ingredient hierarchy integration from MediaIngredientMech to CultureMech, enabling semantic structure for ingredient data with parent-child relationships, variant types, and functional roles.
MediaIngredientMech Repository
├── ingredient_families.yaml → Parent-child relationships
├── ingredient_roles.yaml → Functional role assignments
├── ingredient_merges.yaml → Merge mappings
└── ingredient_variants.yaml → Variant type definitions
↓
HierarchyImporter (loads & indexes)
↓
HierarchyEnricher (applies to recipes)
↓
CultureMech Recipes (enriched with hierarchy)
MediaIngredientMech maintains conservative chemical distinctions:
- Hydrates ≠ Salts ≠ Base chemicals
- Each variant tracked separately
- Explicit parent-child relationships
- No automatic merging of chemically distinct forms
src/culturemech/schema/culturemech.yaml
- Added parent_ingredient field to IngredientDescriptor
- Added variant_type field to IngredientDescriptor
- Added IngredientReference class
- Added VariantTypeEnum enumeration
- src/culturemech/enrich/hierarchy_importer.py
  - Class: MediaIngredientMechHierarchyImporter
  - Methods:
    - load_hierarchy() - Load from MediaIngredientMech YAML files
    - _build_lookup_index() - Build fast lookup indexes
    - get_parent() - Get parent for an ingredient
    - get_children() - Get children of an ingredient
    - get_variant_type() - Get variant type
    - get_roles() - Get functional roles
    - find_by_chebi() - Find by CHEBI ID
    - find_by_name() - Find by name
- src/culturemech/enrich/hierarchy_enricher.py
  - Class: HierarchyEnricher
  - Methods:
    - enrich_ingredient() - Add hierarchy to single ingredient
    - enrich_solution() - Add hierarchy to solution ingredients
    - enrich_recipe() - Process single recipe file
    - run_pipeline() - Process all recipes
    - generate_report() - Generate enrichment report
- scripts/enrich_with_hierarchy.py
  - CLI tool for hierarchy enrichment
  - Auto-clones MediaIngredientMech if not provided
  - Supports dry-run, category filtering, limits
  - Generates enrichment statistics
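The lookup methods listed for MediaIngredientMechHierarchyImporter can be pictured with a small in-memory index. This is only a sketch: the class name `HierarchyIndex` and the input shape (a mapping from parent ID to its variant IDs) are assumptions for illustration, not the actual layout of ingredient_families.yaml or the importer's real internals.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class HierarchyIndex:
    """Hypothetical index over parent-child family records."""

    def __init__(self, families: Dict[str, List[str]]) -> None:
        # families maps a parent ID to the IDs of its variants (assumed shape).
        self._parent_of: Dict[str, str] = {}
        self._children_of: Dict[str, List[str]] = defaultdict(list)
        for parent_id, child_ids in families.items():
            for child_id in child_ids:
                self._parent_of[child_id] = parent_id
                self._children_of[parent_id].append(child_id)

    def get_parent(self, ingredient_id: str) -> Optional[str]:
        """Return the parent ID, or None for a top-level ingredient."""
        return self._parent_of.get(ingredient_id)

    def get_children(self, ingredient_id: str) -> List[str]:
        """Return a copy of the variant IDs registered under this ingredient."""
        return list(self._children_of.get(ingredient_id, []))


# IDs taken from the calcium chloride example in this document.
index = HierarchyIndex(
    {"MediaIngredientMech:000041": ["MediaIngredientMech:000042"]}
)
print(index.get_parent("MediaIngredientMech:000042"))
```

Building both directions once at load time keeps each later lookup a single dictionary access, which matters when the enricher visits every ingredient of every recipe.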
- src/culturemech/enrich/role_importer.py
  - Class: RoleImporter
  - Methods:
    - load_roles() - Load role assignments from MediaIngredientMech
    - get_roles_for_ingredient() - Get roles for specific ingredient
    - apply_roles_to_ingredient() - Add role field to ingredient
    - apply_roles_to_recipe() - Process single recipe
    - run_pipeline() - Process all recipes
- scripts/assign_ingredient_roles.py
  - CLI tool for role assignment
  - Supports role inheritance from parents
  - Dry-run mode for testing
  - Category filtering and limits
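Role inheritance from parents could work roughly as follows. The function name `resolve_roles`, both input mappings, and the rule itself (an ingredient's own roles win; otherwise the nearest ancestor with roles is used) are assumptions for illustration; the actual RoleImporter may resolve roles differently.

```python
from typing import Dict, List, Optional


def resolve_roles(
    ingredient_id: str,
    roles: Dict[str, List[str]],
    parent_of: Dict[str, str],
) -> List[str]:
    """Return own roles if present, else those of the nearest ancestor."""
    seen = set()  # guards against circular parent references
    current: Optional[str] = ingredient_id
    while current is not None and current not in seen:
        seen.add(current)
        if roles.get(current):
            return list(roles[current])
        current = parent_of.get(current)
    return []


# Hypothetical data: only the parent has roles assigned.
roles = {"MediaIngredientMech:000041": ["MINERAL", "SALT"]}
parent_of = {"MediaIngredientMech:000042": "MediaIngredientMech:000041"}
print(resolve_roles("MediaIngredientMech:000042", roles, parent_of))
```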
- scripts/validate_hierarchy_integration.py
  - Class: HierarchyValidator
  - Validation checks:
    - Valid parent references (no orphans)
    - No circular references
    - Valid variant type enum values
    - Valid role enum values
    - Proper field structure
  - Generates validation report with issue details
- scripts/generate_hierarchy_report.py
  - Class: HierarchyReporter
  - Report sections:
    - Overview statistics
    - Variant type distribution
    - Role distribution
    - Top ingredient families
    - Unmatched ingredients needing curation
  - Markdown output format
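Two of the validation checks listed for HierarchyValidator, orphaned parent references and circular references, can be sketched over a plain child-to-parent mapping. The function names and the mapping shape are hypothetical; this is not the validator's actual code.

```python
from typing import Dict, List, Set


def find_orphans(parent_of: Dict[str, str], known_ids: Set[str]) -> List[str]:
    """Return children whose parent ID is not a known ingredient."""
    return sorted(
        child for child, parent in parent_of.items() if parent not in known_ids
    )


def find_cycles(parent_of: Dict[str, str]) -> List[str]:
    """Return IDs whose parent chain leads back to themselves."""
    in_cycle = []
    for start in parent_of:
        seen = set()
        current = parent_of.get(start)
        # Walk upward until the chain ends, repeats, or returns to start.
        while current is not None and current not in seen:
            if current == start:
                in_cycle.append(start)
                break
            seen.add(current)
            current = parent_of.get(current)
    return sorted(in_cycle)


# Hypothetical data: "a" and "b" point at each other; "c" has a missing parent.
parent_of = {"a": "b", "b": "a", "c": "missing"}
print(find_orphans(parent_of, {"a", "b", "c"}))
print(find_cycles(parent_of))
```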
.claude/skills/manage-ingredient-hierarchy/skill.md
- Claude Code skill for interactive hierarchy management
- Actions:
  - Import hierarchy from MediaIngredientMech
  - Apply hierarchy to recipes
  - Assign roles
  - Validate integration
  - Generate reports
- Complete workflow examples
- Troubleshooting guide
parent_ingredient:
  description: Reference to parent chemical entity from MediaIngredientMech
  range: IngredientReference
  recommended: false
  inlined: true
  comments:
    - Links to canonical parent in MediaIngredientMech hierarchy
    - Example: CaCl2·2H2O → parent is "Calcium chloride"
variant_type:
  description: Type of chemical variant
  range: VariantTypeEnum
  recommended: false
  comments:
    - Describes relationship to parent (HYDRATE, SALT_FORM, ANHYDROUS, etc.)
    - Populated from MediaIngredientMech metadata

IngredientReference:
  description: Reference to canonical ingredient
  attributes:
    preferred_term:
      description: Name of parent ingredient
      required: true
    mediaingredientmech_id:
      description: MediaIngredientMech ID
      range: string
      pattern: "^MediaIngredientMech:\\d{6}$"

VariantTypeEnum:
  permissible_values:
    HYDRATE: Hydrated form of parent chemical
    SALT_FORM: Different salt form of parent chemical
    ANHYDROUS: Anhydrous form of parent chemical
    NAMED_HYDRATE: Named hydrate (monohydrate, heptahydrate, etc.)
    CHEMICAL_VARIANT: Other chemical variant of parent

python scripts/enrich_with_hierarchy.py \
  --mim-repo ~/MediaIngredientMech \
  --dry-run \
  --limit 10

python scripts/enrich_with_hierarchy.py \
  --mim-repo ~/MediaIngredientMech \
  --category bacterial \
  --report-output enrichment_report.yaml

python scripts/assign_ingredient_roles.py \
  --mim-repo ~/MediaIngredientMech \
  --category bacterial

python scripts/validate_hierarchy_integration.py \
  --mim-repo ~/MediaIngredientMech \
  --report-output validation_report.yaml

python scripts/generate_hierarchy_report.py \
  --output docs/ingredient_hierarchy.md

# In Claude Code CLI
/manage-ingredient-hierarchy

Before enrichment:
preferred_term: Calcium chloride dihydrate
term:
  id: CHEBI:86124
  label: calcium chloride dihydrate
mediaingredientmech_term:
  id: MediaIngredientMech:000042
  label: Calcium chloride dihydrate
concentration:
  value: "0.1"
  unit: G_PER_L

After enrichment:
preferred_term: Calcium chloride dihydrate
term:
  id: CHEBI:86124
  label: calcium chloride dihydrate
mediaingredientmech_term:
  id: MediaIngredientMech:000042
  label: Calcium chloride dihydrate
parent_ingredient:
  preferred_term: Calcium chloride
  mediaingredientmech_id: MediaIngredientMech:000041
variant_type: HYDRATE
role:
  - MINERAL
  - SALT
concentration:
  value: "0.1"
  unit: G_PER_L

graph TD
A[MediaIngredientMech Repo] --> B[Load Hierarchy]
B --> C[Build Lookup Index]
C --> D[Match CultureMech Ingredients]
D --> E[Add parent_ingredient]
D --> F[Add variant_type]
D --> G[Add role]
E --> H[Validate Integration]
F --> H
G --> H
H --> I[Generate Report]
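The match-then-add steps in the flow above can be sketched as a single function over dictionaries. The shape of the `hierarchy` lookup (keyed by MediaIngredientMech ID, with `parent`, `variant_type`, and `roles` entries) is an assumption for illustration; the real HierarchyEnricher.enrich_ingredient() may be structured differently.

```python
from typing import Any, Dict

Ingredient = Dict[str, Any]

# Maps assumed hierarchy record keys to the recipe fields they populate.
FIELD_MAP = (
    ("parent", "parent_ingredient"),
    ("variant_type", "variant_type"),
    ("roles", "role"),
)


def enrich_ingredient(
    ingredient: Ingredient, hierarchy: Dict[str, Dict[str, Any]]
) -> Ingredient:
    """Return a copy of the ingredient with hierarchy fields added."""
    enriched = dict(ingredient)
    mim_id = ingredient.get("mediaingredientmech_term", {}).get("id")
    record = hierarchy.get(mim_id)
    if record is None:
        return enriched  # unmatched: left unchanged for later curation
    for source_key, target_key in FIELD_MAP:
        if source_key in record:
            enriched[target_key] = record[source_key]
    return enriched


# Values mirror the calcium chloride dihydrate example in this document.
hierarchy = {
    "MediaIngredientMech:000042": {
        "parent": {
            "preferred_term": "Calcium chloride",
            "mediaingredientmech_id": "MediaIngredientMech:000041",
        },
        "variant_type": "HYDRATE",
        "roles": ["MINERAL", "SALT"],
    }
}
ingredient = {
    "preferred_term": "Calcium chloride dihydrate",
    "mediaingredientmech_term": {"id": "MediaIngredientMech:000042"},
}
enriched = enrich_ingredient(ingredient, hierarchy)
print(enriched["variant_type"])
```

Returning a copy rather than mutating the input keeps a dry run trivially safe: the caller decides whether the enriched version is ever written back.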
- MediaIngredientMech as Source of Truth
  - All hierarchy and role decisions made in MediaIngredientMech
  - CultureMech imports and applies (does not define)
- Conservative Chemical Distinctions
  - Respects MediaIngredientMech's conservative approach
  - Hydrates ≠ salts ≠ base chemicals
  - Explicit parent-child relationships required
- Additive Schema
  - New fields are optional (recommended: false)
  - Backward compatible with existing recipes
  - No breaking changes to schema
- Batch Processing
  - Pipeline can process all recipes or filter by category
  - Progress reporting every 100 files
  - Dry-run mode for testing
- Validation First
  - Validate MediaIngredientMech data before applying
  - Check for orphaned/circular references
  - Verify enum values
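The batch-processing behaviour described above (a limit, dry-run, and progress reporting every 100 files) might be arranged along these lines. The function signature and the `enrich` and `write` callbacks are hypothetical stand-ins, not the real run_pipeline() interface.

```python
from typing import Callable, Dict, Iterable, Optional


def run_pipeline(
    paths: Iterable[str],
    enrich: Callable[[str], bool],
    write: Callable[[str], None],
    dry_run: bool = True,
    limit: Optional[int] = None,
    progress_every: int = 100,
) -> Dict[str, int]:
    """Process recipe paths and return simple counters."""
    stats = {"processed": 0, "changed": 0, "written": 0}
    for path in paths:
        if limit is not None and stats["processed"] >= limit:
            break
        stats["processed"] += 1
        if enrich(path):  # True when enrichment would change the recipe
            stats["changed"] += 1
            if not dry_run:
                write(path)
                stats["written"] += 1
        if stats["processed"] % progress_every == 0:
            print(f"Processed {stats['processed']} files")
    return stats


# Hypothetical callbacks: every recipe except "b.yaml" would change.
written = []
stats = run_pipeline(
    ["a.yaml", "b.yaml", "c.yaml"],
    enrich=lambda path: path != "b.yaml",
    write=written.append,
    dry_run=True,
    limit=2,
)
print(stats)
```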
# Count enriched ingredients
grep -r "parent_ingredient:" data/normalized_yaml/ | wc -l
# Count variant types
grep -r "variant_type:" data/normalized_yaml/ | wc -l
# Count role assignments
grep -r "^ role:$" data/normalized_yaml/ | wc -l
# Sample enriched ingredients
grep -A 5 "parent_ingredient:" data/normalized_yaml/bacterial/*.yaml | head -30
# Check for validation issues
python scripts/validate_hierarchy_integration.py \
  --mim-repo ~/MediaIngredientMech \
  --limit 100

Based on existing MediaIngredientMech linking:
- Hierarchy coverage: 80-90% of ingredients with MIM IDs
- Role coverage: 70-85% of ingredients with MIM IDs
- Variant type coverage: 40-60% of ingredients (many are parents, not variants)
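One way to check these estimates against real data is to compute, for each field, the share of MIM-linked ingredients that carry it. Using "ingredients with a mediaingredientmech_term" as the denominator is an assumption (the variant-type figure above does not state its base), and `field_coverage` is a hypothetical helper, not part of the shipped tooling.

```python
from typing import Any, Dict, Iterable


def field_coverage(ingredients: Iterable[Dict[str, Any]], field: str) -> float:
    """Fraction of MIM-linked ingredients that have the given field."""
    linked = [item for item in ingredients if "mediaingredientmech_term" in item]
    if not linked:
        return 0.0
    return sum(1 for item in linked if field in item) / len(linked)


# Hypothetical sample: four linked ingredients, one unlinked.
sample = [
    {"mediaingredientmech_term": {}, "parent_ingredient": {}, "role": ["SALT"]},
    {"mediaingredientmech_term": {}, "parent_ingredient": {}},
    {"mediaingredientmech_term": {}, "parent_ingredient": {}, "role": ["BUFFER"]},
    {"mediaingredientmech_term": {}},
    {"preferred_term": "Unlinked ingredient"},
]
print(field_coverage(sample, "parent_ingredient"))
```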
- Export hierarchy to KGX format
- Generate RDF triples for parent-child relationships
- Create variant type edges in knowledge graph
- Role-based ingredient queries
from pathlib import Path


class HierarchyKGExporter:
    """Export ingredient hierarchy to KG format."""

    def export_to_kgx(self, output_path: Path):
        """Export as KGX edges.

        Relationships:
        - ingredient --has_parent--> canonical_ingredient
        - ingredient --has_variant_type--> variant_type
        - ingredient --has_role--> functional_role
        """

Solution: Ensure MediaIngredientMech repository has required files:
- ingredient_families.yaml
- ingredient_roles.yaml
- ingredient_variants.yaml
- ingredient_merges.yaml
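A quick pre-flight check for these files might look like the sketch below. It assumes the four files sit at the top level of the repository checkout, which this document does not state, and `missing_hierarchy_files` is a hypothetical helper rather than part of the shipped scripts.

```python
from pathlib import Path
from typing import List

REQUIRED_FILES = (
    "ingredient_families.yaml",
    "ingredient_roles.yaml",
    "ingredient_variants.yaml",
    "ingredient_merges.yaml",
)


def missing_hierarchy_files(repo: Path) -> List[str]:
    """Return required file names not found directly under repo (assumed location)."""
    return [name for name in REQUIRED_FILES if not (repo / name).is_file()]


if __name__ == "__main__":
    repo = Path.home() / "MediaIngredientMech"
    missing = missing_hierarchy_files(repo)
    print("Missing files:", ", ".join(missing) if missing else "none")
```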
Solution: First run MediaIngredientMech linking to add MIM IDs:
python scripts/link_mediaingredientmech.py --mim-repo ~/MediaIngredientMech

Solution: Update MediaIngredientMech repository to latest version:
cd ~/MediaIngredientMech && git pull origin main

✅ Phase 1 Complete: Schema extended, hierarchy importer and enricher implemented
✅ Phase 2 Complete: Role importer and assignment implemented
✅ Phase 3 Complete: Validation and reporting tools implemented
✅ Phase 4 Complete: Claude Code skill created

Total Files Created: 8
Total Files Modified: 1 (schema)
Lines of Code: ~2500
The implementation is production-ready and can be tested with:
# Full pipeline test (dry-run)
python scripts/enrich_with_hierarchy.py --mim-repo ~/MediaIngredientMech --dry-run --limit 10
python scripts/assign_ingredient_roles.py --mim-repo ~/MediaIngredientMech --dry-run --limit 10
python scripts/validate_hierarchy_integration.py --mim-repo ~/MediaIngredientMech --limit 10
python scripts/generate_hierarchy_report.py --limit 10