SDC Agents User Documentation

SDC Agents is an open-source suite of nine purpose-scoped agents built on Google's Agent Development Kit (ADK). The agents transform data from SQL databases, CSV files, JSON sources, and MongoDB collections into validated, multi-format SDC4 artifacts, without requiring the user to write XML, RDF, or GQL by hand.

For installation and quick start, see the README.

Pipeline Overview

┌──────────────┐     ┌──────────────────┐     ┌──────────────┐
│ Catalog Agent│     │ Introspect Agent │     │              │
│  (5 tools)   │     │    (5 tools)     │     │              │
│              │     │                  │     │              │
│ Discovers    │     │ Examines your    │     │ Mapping Agent│
│ published    │     │ datasources      │     │  (3 tools)   │
│ SDC4 schemas │     │ (read-only)      │     │              │
└──────┬───────┘     └────────┬─────────┘     │ Suggests     │
       │                      │               │ column →     │
       ▼                      ▼               │ component    │
  .sdc-cache/           .sdc-cache/           │ mappings     │
  schemas/              introspections/       └──────┬───────┘
       │                      │                      │
       └──────────┬───────────┘                      │
                  │                                  ▼
                  │                            .sdc-cache/
                  │                            mappings/
                  │                                  │
                  └──────────┬───────────────────────┘
                             ▼
                    ┌──────────────────┐
                    │ Generator Agent  │
                    │   (3 tools)      │
                    │                  │
                    │ Produces SDC4    │
                    │ XML instances    │
                    └────────┬─────────┘
                             │
                             ▼
                       ./output/*.xml
                             │
                             ▼
                    ┌──────────────────┐
                    │ Validation Agent │
                    │   (3 tools)      │
                    │                  │
                    │ Validates & signs│
                    │ via VaaS API     │
                    └────────┬─────────┘
                             │
                             ▼
                    ./output/*.pkg.zip
                             │
                             ▼
                    ┌────────────────────┐
                    │ Distribution Agent │
                    │    (5 tools)       │
                    │                    │
                    │ Routes packages to │
                    │ your destinations  │
                    └────────────────────┘
                             │
                     ┌───────┼───────┐
                     ▼       ▼       ▼
                  Fuseki   Neo4j  Filesystem
                  GraphDB  REST API

┌──────────────────┐     ┌────────────────────┐
│ Knowledge Agent  │     │  Assembly Agent    │
│   (3 tools)      │     │    (4 tools)       │
│                  │     │                    │
│ Ingests customer │     │ Discovers catalog  │
│ context into     │────▶│ components, builds │
│ vector store     │     │ & publishes models │
└──────────────────┘     └────────────────────┘

┌────────────────────────────┐
│ Semantic Discovery Agent   │
│        (1 tool)            │
│        (ADK-only)          │
│                            │
│ Searches Vertex AI Search  │
│ for relevant SDC4 resources│
└────────────────────────────┘

Agents communicate through files on disk (the .sdc-cache/ directory and ./output/) rather than through direct calls, so every handoff is an inspectable, version-controllable artifact.
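Because handoffs are plain files, they can be inspected with ordinary tooling. The following is a minimal sketch, not part of SDC Agents itself: the helper name list_handoffs is hypothetical, and it assumes only the three handoff subdirectories shown in the diagram above.

```python
from pathlib import Path

# Handoff subdirectories named in the pipeline diagram.
HANDOFF_DIRS = ("schemas", "introspections", "mappings")


def list_handoffs(cache_root=".sdc-cache"):
    """Return {subdirectory: sorted file names} for each handoff directory that exists.

    Hypothetical helper for illustration; it only reads directory listings.
    """
    root = Path(cache_root)
    found = {}
    for name in HANDOFF_DIRS:
        directory = root / name
        if directory.is_dir():
            found[name] = sorted(p.name for p in directory.iterdir() if p.is_file())
    return found


if __name__ == "__main__":
    for subdir, files in list_handoffs().items():
        print(f"{subdir}/: {len(files)} file(s)")
```

Running the script from a project directory prints one line per handoff directory that is present; directories that no agent has written to yet are simply omitted.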

Security Model

No agent has both datasource access and network access. The Introspect Agent reads your data but has no network access. The Catalog and Validation Agents access the network but never touch your datasources. The Semantic Discovery Agent accesses GCP Vertex AI Search but likewise never touches your datasources.
Read-only datasource access. SQL queries are restricted to SELECT. CSV and JSON files are read, never modified. MongoDB access uses find() only.
Append-only audit log. Every tool call is logged to .sdc-cache/audit.jsonl with agent name, tool name, inputs, outputs, timestamp, and duration. Credentials are automatically redacted.
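Since the audit log is JSON Lines, it can be summarized with a few lines of Python. This is a sketch under stated assumptions: the key names "agent", "tool", and "duration" are guesses based on the field list above and are not confirmed by this page, and the unit of the duration field is unknown. The function summarize_audit is a hypothetical helper.

```python
import json
from collections import defaultdict


def summarize_audit(lines):
    """Aggregate call count and total duration per (agent, tool) pair.

    `lines` is any iterable of JSONL strings, e.g. an open file handle.
    Key names below are ASSUMED; adjust them to match the real log entries.
    """
    summary = defaultdict(lambda: {"calls": 0, "total_duration": 0.0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        key = (entry.get("agent"), entry.get("tool"))
        summary[key]["calls"] += 1
        summary[key]["total_duration"] += float(entry.get("duration") or 0)
    return dict(summary)
```

A typical use would be to open .sdc-cache/audit.jsonl and pass the file handle directly, for example `with open(".sdc-cache/audit.jsonl") as f: print(summarize_audit(f))`.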

Cache Directory Structure

.sdc-cache/
├── audit.jsonl              # Append-only audit log
├── schemas/
│   └── dm-{ct_id}.json      # Cached schema details (immutable)
├── ontologies/
│   ├── *.rdf                # Downloaded ontology files
│   └── *.ttl
├── introspections/          # Introspection results
├── mappings/
│   └── {name}.json          # Confirmed column-to-component mappings
├── knowledge/
│   └── {source_name}.json   # Knowledge source metadata (Knowledge Agent)
├── skeletons/
│   └── dm-{ct_id}.xml       # Downloaded XML skeleton templates
└── field_mappings/
    └── dm-{ct_id}.json      # Skeleton field → placeholder mappings

The cache root defaults to .sdc-cache but is configurable via the cache.root setting.
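The layout above maps a schema's ct_id to a predictable set of file paths. As a rough illustration, the hypothetical helper below (cache_paths is not an SDC Agents function) builds those paths from a cache root and a ct_id, following the naming patterns in the tree.

```python
from pathlib import Path


def cache_paths(ct_id, cache_root=".sdc-cache"):
    """Build the per-schema cache paths implied by the directory tree.

    Hypothetical helper; `cache_root` stands in for the cache.root setting.
    """
    root = Path(cache_root)
    return {
        "audit_log": root / "audit.jsonl",
        "schema": root / "schemas" / f"dm-{ct_id}.json",
        "skeleton": root / "skeletons" / f"dm-{ct_id}.xml",
        "field_mapping": root / "field_mappings" / f"dm-{ct_id}.json",
    }
```

Passing a different cache_root mirrors what changing cache.root would do to these locations, assuming the subdirectory names stay the same.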

Documentation Contents

Document	Description
Configuration Reference	All config fields, annotated YAML, environment variable substitution, working examples
Agent & Tool Reference	All 32 tools across 9 agents — parameters, return shapes, access scopes
MCP Integration	Serve agents as MCP servers for Claude Desktop, Cursor, and generic stdio clients
Common Workflows	Step-by-step guides: CSV to validated XML, audit troubleshooting, triplestore bootstrap
