# Audit all communities
just audit-network
# CI mode (exit 1 if issues)
just check-network-quality
# Generate JSON report
just audit-network-json > report.json
# Write detailed report to file
just audit-network-report audit.txt
The auditor checks for 5 types of network integrity issues:
- ID_MISMATCH - NCBITaxon IDs don't match between taxonomy and interactions
    # taxonomy section has:
    NCBITaxon:562   # Escherichia coli
    # but interaction references:
    NCBITaxon:9999  # Wrong ID!
- MISSING_SOURCE - Interaction has no source_taxon field
    ecological_interactions:
      - name: "Some interaction"
        # source_taxon: MISSING!
        target_taxon: ...
- UNKNOWN_SOURCE - Source taxon not found in taxonomy section
    ecological_interactions:
      - source_taxon:
          preferred_term: "Mystery bacterium"  # Not in taxonomy!
- UNKNOWN_TARGET - Target taxon not found in taxonomy section
    ecological_interactions:
      - target_taxon:
          preferred_term: "Unknown archaea"  # Not in taxonomy!
- DISCONNECTED - Taxon in taxonomy but not involved in any interactions
    taxonomy:
      - taxon_term:
          preferred_term: "Lonely bacterium"  # No interactions!
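To make the five issue types concrete, here is a minimal sketch of how such checks could be run against one community loaded as a plain dictionary. It is not the auditor's actual implementation: the function name `sketch_audit` is hypothetical, the key layout is inferred from the YAML fragments above, and the choice to skip references that carry no ID is an assumption.

```python
def sketch_audit(data):
    """Illustrative only: return issue dicts for one community mapping."""
    # Map preferred_term -> NCBITaxon ID for every taxon in the taxonomy section.
    known = {}
    for entry in data.get("taxonomy") or []:
        taxon = entry.get("taxon_term") or {}
        known[taxon.get("preferred_term")] = (taxon.get("term") or {}).get("id")

    issues = []
    connected = set()
    for interaction in data.get("ecological_interactions") or []:
        name = interaction.get("name")
        for role in ("source", "target"):
            ref = interaction.get(f"{role}_taxon")
            if ref is None:
                # Only a missing source is a named issue type in the list above.
                if role == "source":
                    issues.append({"type": "MISSING_SOURCE", "interaction": name})
                continue
            term = ref.get("preferred_term")
            if term not in known:
                issues.append({"type": f"UNKNOWN_{role.upper()}",
                               "interaction": name, "taxon": term})
                continue
            connected.add(term)
            actual = (ref.get("term") or {}).get("id")
            # Assumption: a reference without an ID is not treated as a mismatch.
            if actual is not None and actual != known[term]:
                issues.append({"type": "ID_MISMATCH", "interaction": name,
                               "taxon": term, "role": role,
                               "expected_id": known[term], "actual_id": actual})

    # Any taxon never seen as a source or target is disconnected.
    for term, taxon_id in known.items():
        if term not in connected:
            issues.append({"type": "DISCONNECTED", "taxon": term, "taxon_id": taxon_id})
    return issues


community = {
    "taxonomy": [
        {"taxon_term": {"preferred_term": "Leptospirillum group II",
                        "term": {"id": "NCBITaxon:1228"}}},
        {"taxon_term": {"preferred_term": "ARMAN",
                        "term": {"id": "NCBITaxon:123456"}}},
    ],
    "ecological_interactions": [
        {"name": "Iron Oxidation",
         "source_taxon": {"preferred_term": "Leptospirillum group II",
                          "term": {"id": "NCBITaxon:9999"}},
         "target_taxon": {"preferred_term": "Mystery bacterium"}},
    ],
}

for issue in sketch_audit(community):
    print(issue["type"])
```

In this toy input the source carries the wrong ID, the target is absent from the taxonomy, and ARMAN takes part in no interaction, so three different issue types are reported.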
🔍 Auditing 76 communities for network integrity issues...
────────────────────────────────────────────────────────────────────────────────
📋 Richmond_Mine_AMD_Biofilm
────────────────────────────────────────────────────────────────────────────────
ID_MISMATCH:
• [Iron Oxidation] source: Leptospirillum group II
Expected: NCBITaxon:1228, Found: NCBITaxon:9999
DISCONNECTED:
• ARMAN (NCBITaxon:123456)
• Thermoplasmatales archaeon (NCBITaxon:234567)
Total issues: 3
================================================================================
Summary: 1/76 communities have issues
Total issues found: 3
================================================================================
{
  "Richmond_Mine_AMD_Biofilm": [
    {
      "type": "ID_MISMATCH",
      "interaction": "Iron Oxidation",
      "taxon": "Leptospirillum group II",
      "role": "source",
      "expected_id": "NCBITaxon:1228",
      "actual_id": "NCBITaxon:9999"
    },
    {
      "type": "DISCONNECTED",
      "taxon": "ARMAN",
      "taxon_id": "NCBITaxon:123456"
    }
  ]
}
For simple ID mismatches, the old scripts/fix_network_integrity.py can fix them automatically:
# Dry run
python scripts/fix_network_integrity.py
# Apply fixes
python scripts/fix_network_integrity.py --apply
For DISCONNECTED, UNKNOWN_SOURCE, UNKNOWN_TARGET, and MISSING_SOURCE issues, manual curation is required:
Example: Fixing a disconnected taxon
# Before: Taxon exists but has no interactions
taxonomy:
  - taxon_term:
      preferred_term: "Ferroplasma acidarmanus"
      term:
        id: "NCBITaxon:55206"
        label: "Ferroplasma acidarmanus"
ecological_interactions: []  # Empty!
# After: Add biologically plausible interaction
ecological_interactions:
  - name: "Iron Cycling Partnership"
    interaction_type: "MUTUALISM"
    description: "F. acidarmanus reduces Fe(III) to Fe(II), which is then oxidized by Leptospirillum"
    source_taxon:
      preferred_term: "Ferroplasma acidarmanus"
      term:
        id: "NCBITaxon:55206"
        label: "Ferroplasma acidarmanus"
    target_taxon:
      preferred_term: "Leptospirillum group II"
      term:
        id: "NCBITaxon:1228"
        label: "Leptospirillum group II"
    metabolites_exchanged:
      - metabolite_term:
          id: "CHEBI:29033"
          label: "iron(2+)"
        direction: "source_to_target"
    evidence:
      - reference: "PMID:15066799"
        supports: "SUPPORT"
        evidence_source: "LITERATURE"
        snippet: "Ferroplasma acidarmanus was capable of growing by reduction of Fe(III)..."
Future versions will support LLM-assisted suggestions:
# Interactive repair with human approval
communitymech repair-network kb/communities/Richmond_Mine_AMD_Biofilm.yaml
# Generate suggestions report for batch review
communitymech repair-network-batch --report-only
# Apply pre-approved repairs
communitymech repair-network-batch --apply-from reports/approved_repairs.yaml
The .github/workflows/network-quality.yml workflow automatically:
- Runs on PR changes to kb/communities/*.yaml
- Audits network integrity
- Fails PR if issues detected
- Uploads detailed reports as artifacts
- Comments on PR with issue summary
Add to .git/hooks/pre-commit:
#!/bin/bash
just check-network-quality
# Always audit before committing
just audit-network
# Or use CI mode to fail on issues
just check-network-quality
- ID mismatches: Run automated fix script
- Disconnected taxa: Add biologically plausible interactions with evidence
- Unknown taxa: Add missing taxa to taxonomy or fix typos
When adding interactions to fix disconnected taxa, always:
- Use peer-reviewed literature (PMID preferred)
- Include metabolites with CHEBI IDs
- Include processes with GO IDs
- Extract exact snippets from abstracts
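The checklist above can be partly automated before running the validators. The following is a rough sketch, not part of the project's tooling: the helper name `curation_gaps` is hypothetical, the key names (`evidence`, `reference`, `snippet`, `metabolites_exchanged`, `metabolite_term`) are taken from the example interaction in this document, and their nesting is assumed. GO process checks are left out because the field that holds them is not shown here.

```python
import re

# CURIE patterns for the identifier styles used in this document.
PMID = re.compile(r"^PMID:\d+$")
CHEBI = re.compile(r"^CHEBI:\d+$")


def curation_gaps(interaction):
    """Illustrative only: list checklist items an interaction does not yet meet."""
    gaps = []

    evidence = interaction.get("evidence") or []
    # PMIDs are preferred, so report when no evidence item cites one.
    if not any(PMID.match(str(item.get("reference", ""))) for item in evidence):
        gaps.append("no PMID reference")
    # Every evidence item should quote a snippet from the source.
    if any(not str(item.get("snippet", "")).strip() for item in evidence):
        gaps.append("evidence item without snippet")

    metabolites = interaction.get("metabolites_exchanged") or []
    if not metabolites:
        gaps.append("no metabolites")
    for metabolite in metabolites:
        term_id = (metabolite.get("metabolite_term") or {}).get("id", "")
        if not CHEBI.match(str(term_id)):
            gaps.append(f"metabolite without CHEBI ID: {term_id!r}")
    return gaps


draft = {
    "name": "Iron Cycling Partnership",
    "metabolites_exchanged": [
        {"metabolite_term": {"id": "CHEBI:29033", "label": "iron(2+)"}},
    ],
    "evidence": [{"reference": "PMID:15066799", "snippet": ""}],
}
print(curation_gaps(draft))
```

A check like this only catches missing or malformed fields; whether the snippet really appears in the cited abstract still has to be confirmed with the reference validation step.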
# After manual fixes
just validate kb/communities/YourCommunity.yaml
just validate-references kb/communities/YourCommunity.yaml
just audit-network
from pathlib import Path
from communitymech.network.auditor import NetworkIntegrityAuditor
# Create auditor
auditor = NetworkIntegrityAuditor(communities_dir=Path("kb/communities"))
# Audit all communities
issues = auditor.audit_all()
# Audit single community
issues = auditor.audit_community(Path("kb/communities/Test.yaml"))
# Check specific issue types
for issue in issues:
    if issue["type"] == "DISCONNECTED":
        taxon = issue["taxon"]
        taxon_id = issue["taxon_id"]
        print(f"Disconnected: {taxon} ({taxon_id})")
# Export as JSON
import json
with open("audit.json", "w") as f:
    f.write(auditor.to_json())
# Get community data and taxonomy lookup (for context building)
data = auditor.get_community_data(Path("kb/communities/Test.yaml"))
taxonomy = auditor.get_taxonomy_lookup(data)
Solution: Reinstall package
uv sync --all-extras
Meaning: Network integrity issues detected
Solution:
- Check CI logs or PR comment for issue details
- Download artifact reports for full details
- Fix issues manually or with scripts
- Re-run CI
Solution:
uv run pytest tests/test_network_auditor.py -v
# Fix any failures
uv sync --all-extras # Reinstall if needed
communitymech audit-network --communities-dir /path/to/communities
# Get all disconnected taxa
just audit-network-json | jq '.[] | .[] | select(.type=="DISCONNECTED") | .taxon'
# Count issues by type
just audit-network-json | jq '[.[] | .[] | .type] | group_by(.) | map({type: .[0], count: length})'
# Find communities with ID mismatches
just audit-network-json | jq 'to_entries | map(select(.value | any(.type=="ID_MISMATCH"))) | map(.key)'
from communitymech.network.auditor import NetworkIntegrityAuditor
import sys
auditor = NetworkIntegrityAuditor()
issues = auditor.audit_all(check_only=True)
# Exits with code 1 if issues found
# Use for custom validation pipelines
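For a custom pipeline that works from the JSON report rather than the auditor object, a small helper can summarize issues and pick an exit code. This is a sketch under assumptions: the report is taken to have the shape shown in the JSON example earlier (community name mapped to a list of issue objects), and the names `summarize` and `exit_code` are hypothetical.

```python
from collections import Counter


def summarize(report):
    """Count issues by type across all communities in a report mapping."""
    return Counter(
        issue["type"] for issues in report.values() for issue in issues
    )


def exit_code(report):
    """Return 1 if any issue is present, else 0 (mirrors the CI mode convention)."""
    return 1 if summarize(report) else 0


# Inline sample mirroring the JSON report example; in practice this would be
# the result of json.load() on the generated report file.
sample = {
    "Richmond_Mine_AMD_Biofilm": [
        {"type": "ID_MISMATCH", "interaction": "Iron Oxidation",
         "taxon": "Leptospirillum group II", "role": "source",
         "expected_id": "NCBITaxon:1228", "actual_id": "NCBITaxon:9999"},
        {"type": "DISCONNECTED", "taxon": "ARMAN", "taxon_id": "NCBITaxon:123456"},
    ]
}

for issue_type, count in sorted(summarize(sample).items()):
    print(f"{issue_type}: {count}")
print("exit code:", exit_code(sample))
```

Passing the value of `exit_code(report)` to `sys.exit` gives a pipeline step the same pass/fail behavior as the CI mode, while leaving room to filter or ignore specific issue types first.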