Commit 4c71f02

feat: add pipeline validation error analysis (#75)
1 parent 8e05b65 commit 4c71f02

14 files changed

Lines changed: 102080 additions & 7 deletions

README.md

Lines changed: 1 addition & 3 deletions
@@ -122,18 +122,16 @@ Claude will ask for your permission before a tool call is executed. You can opt
 
 Unfortunately, you need to set the workspace and organization (through the API key) in the `claude_desktop_config.json`.
 There is no way to pass the API key dynamically to Claude in a secure way.
-The workspace could be passed into the tool call, it's an easy enhancement but I'd like to get feedback first.
+The workspace could be passed into the tool call, it's an easy enhancement, but I'd like to get feedback first.
 
 
 
 
 ## Further improvements ideas
 
-- remove hardcoded workspace
 - expose standard prompts via MCP e.g., for debugging, fixing pipelines, reading logs etc
 - fix the docker run command to clear cache
 - the ability to dump the conversation of improving the copilot
-- test with different models not just Claude Sonnet 3.7
 - test with different clients other than Claude Desktop app
 
 

TODO.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+How do we get from collection of tools to Agents that work well for defined use cases?
+
+## Use Case Ideas
+- create a pipeline from scratch
+- edit an existing pipeline
+- let the agent create connections in an incomplete pipeline
+- resolve validation errors
+- fix a pipeline based on logs
+- optimize the prompt in a pipeline
+- optimize retrieval in a pipeline
+
+
+## Benchmark
+- create test cases
+- let agent run on test cases
+- score:
+- pipeline validates
+- pipeline matches expected solution
+- pipeline yields search results
+- dig into common validation errors to find best cases to cover
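
Note on the Benchmark list above: the three scoring criteria (pipeline validates, matches the expected solution, yields search results) could be aggregated roughly as follows. This is only a sketch under stated assumptions; `RunResult`, `score_run`, and `summarize` are hypothetical names that do not appear in this commit, and how each flag would actually be obtained from an agent run is left open.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of one agent run on one test case (hypothetical structure)."""

    validates: bool           # did the produced pipeline pass validation?
    matches_expected: bool    # does it match the expected solution?
    search_result_count: int  # number of results returned by a test query


CRITERIA = ("validates", "matches_expected", "yields_search_results")


def score_run(result: RunResult) -> dict[str, bool]:
    """Map one run to the three pass/fail criteria from the TODO list."""
    return {
        "validates": result.validates,
        "matches_expected": result.matches_expected,
        "yields_search_results": result.search_result_count > 0,
    }


def summarize(results: list[RunResult]) -> dict[str, float]:
    """Return the pass rate per criterion across all runs (0.0 if there are no runs)."""
    if not results:
        return {name: 0.0 for name in CRITERIA}
    scores = [score_run(r) for r in results]
    return {name: sum(s[name] for s in scores) / len(scores) for name in CRITERIA}
```

Reporting a separate pass rate per criterion, rather than one combined score, keeps it visible whether an agent fails at validation or only at matching the expected solution.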

data/processed/created_updated_deployed_events.csv

Lines changed: 17362 additions & 0 deletions
Large diffs are not rendered by default.

data/raw/metabase_pipeline_created_last_3_months_exported_30-05-2025.csv

Lines changed: 4253 additions & 0 deletions
Large diffs are not rendered by default.

data/raw/metabase_pipeline_deployed_last_3_months_exported_30-05-2025.csv

Lines changed: 3132 additions & 0 deletions
Large diffs are not rendered by default.

data/raw/metabase_pipeline_updated_last_3_months_exported_30-05-2025.csv

Lines changed: 9979 additions & 0 deletions
Large diffs are not rendered by default.

data/raw/metabase_validation_errors_last_3_months_exported_28-05-2025.csv

Lines changed: 63635 additions & 0 deletions
Large diffs are not rendered by default.
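
Note on the validation-error export above: since the diff is not rendered, its columns are unknown. A first pass at "dig into common validation errors" might simply count the most frequent values in one column. The sketch below assumes a hypothetical `error_message` column and uses made-up sample rows; it relies only on the standard library so it runs on its own, although the commit adds pandas for the actual analysis.

```python
import csv
import io
from collections import Counter


def top_values(csv_text: str, column: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent non-empty values found in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts: Counter[str] = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            counts[value] += 1
    return counts.most_common(n)


# Made-up rows for illustration only; the real export's schema is an assumption.
SAMPLE = """error_message,pipeline_id
Component not found,p1
Missing connection,p2
Component not found,p3
"""

if __name__ == "__main__":
    print(top_values(SAMPLE, "error_message"))
```

The most frequent messages would then be natural candidates for the benchmark test cases.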

pyproject.toml

Lines changed: 9 additions & 0 deletions
@@ -17,6 +17,14 @@ dependencies = [
 [project.scripts]
 deepset-mcp = "deepset_mcp.main:main"
 
+[project.optional-dependencies]
+analysis = [
+"jupyterlab",
+"pandas",
+"matplotlib",
+"seaborn"
+]
+
 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"
@@ -37,6 +45,7 @@ lint = [
 types = [
 "mypy",
 "types-PyYAML",
+"pandas-stubs",
 ]
 
 [tool.pytest.ini_options]

src/deepset_mcp/benchmark/__init__.py

Whitespace-only changes.

src/deepset_mcp/benchmark/dp_validation_error_analysis/__init__.py

Whitespace-only changes.
