plaid-mcp-server

An MCP (Model Context Protocol) server for working with PLAID datasets. It exposes tools that an LLM assistant (such as Claude via Cline) can call to scan raw simulation data, inspect its structure, and convert it into the PLAID format.

Overview

PLAID is a format and library for storing and sharing physics simulation datasets. Converting raw simulation outputs (HDF5, NumPy, PLY, CGNS, CSV, ...) into PLAID requires understanding the structure of the raw data and mapping its variables to PLAID features.

This MCP server automates that workflow by providing tools that:

Scan a raw data directory to detect its layout and candidate simulations
Inspect individual simulation files to identify variables and their shapes
Define a conversion configuration (split mapping, feature mapping, backend)
Execute the conversion using plaid.storage.save_to_disk

Installation

This project is managed with uv.

git clone <repo-url>
cd plaid-mcp-server
uv sync

MCP Server Configuration

Add the following to your MCP client configuration (e.g. Cline's cline_mcp_settings.json):

{
  "mcpServers": {
    "plaid-mcp-server": {
      "timeout": 60,
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/plaid-mcp-server",
        "plaid-server"
      ],
      "type": "stdio"
    }
  }
}

Available Tools

`scan_raw_dataset`

Scans a directory of raw simulation files to detect its layout.

Input:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `raw_dir` | string | Yes | Path to the root directory containing raw simulation data |

Output:

{
  "session_id": "6093e66b",
  "raw_dir": "/data/my_sims",
  "layout": "one_subdir_per_simulation",
  "dominant_extensions": [".h5"],
  "noise_files": ["README.md", "run_info.txt"],
  "candidate_simulations": [
    {"id": 0, "path": "run_001", "files": ["flow.h5", "mesh.cgns"]},
    {"id": 1, "path": "run_002", "files": ["flow.h5", "mesh.cgns"]}
  ],
  "total_candidates": 42,
  "truncated": true
}

The tool detects three layout patterns:

flat: all simulation files sit directly in the root directory (e.g. sim_001.h5, sim_002.h5, ...)
one_subdir_per_simulation: each immediate subdirectory contains one simulation (e.g. run_001/flow.h5, run_002/flow.h5, ...)
nested: deeper or mixed nesting

The output is capped at 20 candidate simulations. The full list is stored in the session for use by follow-up tools. Noise files (.txt, .md, .log, .DS_Store, etc.) are identified and excluded from candidates.

A session_id is returned and can be referenced in subsequent tool calls.
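The layout rules above can be sketched in a few lines of Python. This is a minimal illustration, not the server's implementation: `classify_layout`, `is_noise`, and the noise lists are hypothetical names and values chosen to mirror the description, and the real detection logic may differ.

```python
from pathlib import Path

# Assumed noise rules, taken from the examples above; the server's list may differ.
NOISE_SUFFIXES = {".txt", ".md", ".log"}
NOISE_NAMES = {".DS_Store"}
MAX_CANDIDATES = 20  # cap on candidates returned to the client


def is_noise(path: Path) -> bool:
    """Return True for files that should not count as simulation data."""
    return path.name in NOISE_NAMES or path.suffix.lower() in NOISE_SUFFIXES


def classify_layout(raw_dir: str) -> dict:
    """Classify a directory as flat, one_subdir_per_simulation, or nested."""
    root = Path(raw_dir)
    data_files = sorted(p for p in root.rglob("*") if p.is_file() and not is_noise(p))
    # Depth 1: file sits directly in the root. Depth 2: one level down.
    depths = {len(p.relative_to(root).parts) for p in data_files}

    if depths <= {1}:
        layout = "flat"
        groups = {p.name: [p.name] for p in data_files}
    else:
        layout = "one_subdir_per_simulation" if depths == {2} else "nested"
        groups = {}
        for p in data_files:
            parent = p.parent.relative_to(root).as_posix()
            groups.setdefault(parent, []).append(p.name)

    candidates = [
        {"id": i, "path": path, "files": files}
        for i, (path, files) in enumerate(sorted(groups.items()))
    ]
    return {
        "layout": layout,
        "candidate_simulations": candidates[:MAX_CANDIDATES],
        "total_candidates": len(candidates),
        "truncated": len(candidates) > MAX_CANDIDATES,
    }
```

In this sketch, a client can tell that more simulations exist than were returned by comparing `total_candidates` with the length of `candidate_simulations`, or by checking `truncated`.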

`init_from_disk`

Loads an existing PLAID dataset from a local directory and returns its metadata.

Input:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `local_dir` | string | Yes | Path to the local directory containing the saved dataset |
| `splits` | array of strings | No | Splits to load. If omitted, all splits are loaded. |

Output:

{
  "local_dir": "/data/my_plaid_dataset",
  "splits": ["train", "test"],
  "num_samples_per_split": {"train": 500, "test": 100},
  "backend": "hf_datasets",
  "variable_features": ["pressure", "velocity_x"],
  "constant_features": ["reynolds_number"]
}

Session Manager

The server maintains an in-memory session registry for the duration of the server process. Each call to scan_raw_dataset creates a new session that stores the scan result. This allows follow-up tools to reference the scan without re-running it, using the returned session_id.

Sessions are not persisted across server restarts.
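As a rough illustration of this behaviour, an in-memory registry could look like the sketch below. The class names `SessionManager` and `ConversionSession` come from the project structure, but the fields, methods, and the eight-character ID scheme shown here are assumptions rather than the actual code in session.py.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversionSession:
    """State kept for one scan (field names are assumed for illustration)."""
    session_id: str
    scan_result: dict
    plan: Optional[dict] = None  # could be filled in by a later planning step


class SessionManager:
    """Process-local registry: everything is lost when the server exits."""

    def __init__(self) -> None:
        self._sessions: dict = {}

    def create(self, scan_result: dict) -> ConversionSession:
        # Short hex ID, similar in shape to the "6093e66b" example (assumed scheme).
        session_id = uuid.uuid4().hex[:8]
        session = ConversionSession(session_id=session_id, scan_result=scan_result)
        self._sessions[session_id] = session
        return session

    def get(self, session_id: str) -> ConversionSession:
        try:
            return self._sessions[session_id]
        except KeyError:
            raise KeyError(f"Unknown session_id: {session_id}") from None
```

Because the registry is a plain dictionary held by the server process, a `session_id` from before a restart will no longer resolve and the scan has to be run again.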

Development

Running tests

uv run pytest

Linting and formatting

uv run ruff format .
uv run ruff check --fix .

Type checking

uvx ty check src/

Project structure

src/plaid_mcp_server/
├── server.py           # MCP server entry point, tool registration
├── session.py          # SessionManager and ConversionSession
└── tools/
    ├── inspection.py   # inspect_simulation_file tool
    ├── scanning.py     # scan_raw_dataset tool
    └── storage.py      # init_from_disk tool

tests/
└── tools/
    ├── test_inspection.py
    └── test_scanning.py

Available Tools (continued)

`inspect_simulation_file`

Opens a single raw simulation file and reports its variables, shapes, and data types.

Input:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `file_path` | string | Yes | Path to the simulation file to inspect |

Supported formats: .npy, .npz, .ply, .csv, .h5, .hdf5, .vtu, .vtp, .vtk

Output examples:

{
  "file": "/data/sim/press.npy",
  "format": "npy",
  "variables": [{"name": "press", "shape": [3682], "dtype": "float64"}],
  "summary": "NumPy array: shape=[3682], dtype=float64"
}

{
  "file": "/data/sim/tri_mesh.ply",
  "format": "ply",
  "variables": [
    {"name": "vertex", "count": 3586, "properties": [{"name": "x", "dtype": "=f8"}, ...]},
    {"name": "face", "count": 7168, "properties": [{"name": "vertex_indices", "dtype": "|O"}]}
  ],
  "summary": "PLY file: 2 element(s) — vertex(3586), face(7168)"
}

{
  "file": "/data/sim.h5",
  "format": "hdf5",
  "variables": [
    {"name": "pressure", "shape": [100], "dtype": "float32"},
    {"name": "velocity/u", "shape": [100], "dtype": "float64"}
  ],
  "summary": "HDF5 file: 2 dataset(s) — pressure, velocity/u"
}
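The `format` field in these examples appears to be derived from the file extension. A minimal sketch of such a dispatch is shown below; `detect_format` and the mapping are hypothetical. Only the labels for .npy, .ply, and .h5 are taken from the examples above, and the remaining labels are assumptions.

```python
from pathlib import Path

# Labels for .npy, .ply and .h5 follow the output examples above.
# All other labels are assumptions made for this sketch.
FORMAT_BY_EXTENSION = {
    ".npy": "npy",
    ".npz": "npz",
    ".ply": "ply",
    ".csv": "csv",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
    ".vtu": "vtk",
    ".vtp": "vtk",
    ".vtk": "vtk",
}


def detect_format(file_path: str) -> str:
    """Map a file path to a format label, rejecting unsupported extensions."""
    suffix = Path(file_path).suffix.lower()
    try:
        return FORMAT_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
```

Rejecting unknown extensions early keeps the error message specific, instead of failing later inside a format-specific reader.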

`propose_conversion_plan`

Analyses the scan result stored in a session and produces a structured PLAID conversion plan, then writes a ready-to-run Python conversion script to output_script_path.

The plan infers sample semantics (static vs. temporal), the mesh format and element type, PLAID CGNS feature path identifiers, and split definitions from sidecar files (train.txt, test.txt, ...).

Input:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `session_id` | string | Yes | Session ID from a previous `scan_raw_dataset` call. |
| `output_script_path` | string | Yes | Path where the generated conversion script is written. |
| `dataset_name` | string | No | Human-readable name used in the script header (default: `"dataset"`). |

Output:

{
  "session_id": "6093e66b",
  "sample_semantics": "static",
  "closest_example": "shapenetcar.py",
  "external_dependencies": ["plyfile", "Muscat"],
  "mesh": {"format": "ply", "element_type": "Triangle_3"},
  "features": {
    "input": ["Base_3_3/Zone/GridCoordinates/CoordinateX", "..."],
    "output": ["Base_3_3/Zone/VertexFields/press"],
    "constant": ["Base_3_3/Zone/Elements_Triangle_3/ElementConnectivity", "..."]
  },
  "splits": {"train": {"source": "train.txt", "count": 2900}, "test": {"source": "test.txt", "count": 100}},
  "edge_cases": ["PLY watertight meshes may have fewer vertices than the pressure array..."],
  "backends": ["hf_datasets", "cgns", "zarr"],
  "generated_script_path": "/path/to/convert.py"
}

The generated script is a complete, runnable Python file. Edit RAW_DATA_DIR, OUTPUT_DIR, and the field loading block as needed before running.
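To make the split inference concrete, the sketch below builds a `splits` entry in the same shape as the output above from sidecar files. It assumes each sidecar file lists one simulation identifier per line, which the README does not state; `read_splits` and the list of candidate split names are hypothetical.

```python
from pathlib import Path

# Candidate split names are an assumption; only train.txt and test.txt are named above.
CANDIDATE_SPLITS = ("train", "test", "val")


def read_splits(raw_dir: str) -> dict:
    """Build split definitions from <split>.txt sidecar files in raw_dir."""
    splits = {}
    for name in CANDIDATE_SPLITS:
        sidecar = Path(raw_dir) / f"{name}.txt"
        if not sidecar.is_file():
            continue
        # Assumed format: one simulation identifier per non-empty line.
        ids = [line.strip() for line in sidecar.read_text().splitlines() if line.strip()]
        splits[name] = {"source": sidecar.name, "count": len(ids)}
    return splits
```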

`run_conversion`

Executes the conversion script generated by propose_conversion_plan. The tool patches OUTPUT_DIR and BACKEND in the script, runs it in a subprocess, and returns the status.

Input:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `session_id` | string | Yes | Session ID with a completed conversion plan. |
| `output_dir` | string | Yes | Directory where the PLAID dataset will be written. |
| `backend` | string | No | PLAID storage backend: `hf_datasets`, `cgns`, or `zarr` (default: `hf_datasets`). |

Output:

{
  "session_id": "6093e66b",
  "output_dir": "/data/plaid/shapenetcar",
  "backend": "hf_datasets",
  "script_path": "/path/to/convert.py",
  "status": "success",
  "message": "Conversion completed successfully."
}
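The patch-then-run step can be pictured with the following sketch. It assumes the generated script assigns OUTPUT_DIR and BACKEND as top-level string constants; the helper names, the regex-based patching, and the "error" status value are illustrative assumptions, not the server's actual behaviour.

```python
import re
import subprocess
import sys
from pathlib import Path


def patch_constant(source: str, name: str, value: str) -> str:
    """Replace the first top-level `NAME = ...` assignment with a string literal."""
    pattern = re.compile(rf"^{re.escape(name)}\s*=.*$", flags=re.MULTILINE)
    # A function replacement avoids backslash escaping issues in the new value.
    patched, count = pattern.subn(lambda _m: f"{name} = {value!r}", source, count=1)
    if count == 0:
        raise ValueError(f"{name} is not assigned in the script")
    return patched


def run_conversion(script_path: str, output_dir: str, backend: str = "hf_datasets") -> dict:
    """Patch the script in place, run it in a subprocess, and report the outcome."""
    path = Path(script_path)
    source = path.read_text()
    source = patch_constant(source, "OUTPUT_DIR", output_dir)
    source = patch_constant(source, "BACKEND", backend)
    path.write_text(source)

    result = subprocess.run([sys.executable, str(path)], capture_output=True, text=True)
    ok = result.returncode == 0
    return {
        "output_dir": output_dir,
        "backend": backend,
        "script_path": str(path),
        "status": "success" if ok else "error",  # "error" is an assumed value
        "message": "Conversion completed successfully." if ok else result.stderr.strip(),
    }
```

Running the script in a subprocess isolates the server from crashes or long-running work in the conversion, at the cost of only seeing the exit code and captured output.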

Roadmap

get_conversion_status: query the progress of an ongoing conversion job
