Commit 46806da

Merge branch 'main' into nmulepati/docs/dev-notes-push-to-huggingface-hub
2 parents 031ad32 + e4857f6 commit 46806da

16 files changed: +374 -198 lines changed
docs/code_reference/mcp.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ The `mcp` module defines configuration and execution classes for tool use via MC

 ## Configuration Classes

-[MCPProvider](#data_designer.config.mcp.MCPProvider) configures remote MCP servers via SSE transport. [LocalStdioMCPProvider](#data_designer.config.mcp.LocalStdioMCPProvider) configures local MCP servers as subprocesses via stdio transport. [ToolConfig](#data_designer.config.mcp.ToolConfig) defines which tools are available for LLM columns and how they are constrained.
+[MCPProvider](#data_designer.config.mcp.MCPProvider) configures remote MCP servers via SSE or Streamable HTTP transport. [LocalStdioMCPProvider](#data_designer.config.mcp.LocalStdioMCPProvider) configures local MCP servers as subprocesses via stdio transport. [ToolConfig](#data_designer.config.mcp.ToolConfig) defines which tools are available for LLM columns and how they are constrained.

 For user-facing guides, see:

docs/concepts/mcp/configure-mcp-cli.md

Lines changed: 5 additions & 4 deletions
@@ -49,14 +49,15 @@ data-designer config mcp
 The wizard first asks you to choose a provider type:

 1. **Remote SSE**: Connect to a pre-existing MCP server via HTTP Server-Sent Events
-2. **Local stdio subprocess**: Launch an MCP server as a subprocess
+2. **Remote Streamable HTTP**: Connect to a pre-existing MCP server via Streamable HTTP
+3. **Local stdio subprocess**: Launch an MCP server as a subprocess

-### Remote SSE Configuration
+### Remote SSE / Streamable HTTP Configuration

-When configuring a Remote SSE provider, you'll be prompted for:
+When configuring a remote provider (SSE or Streamable HTTP), you'll be prompted for:

 - **Name**: Unique identifier (e.g., `"doc-search"`)
-- **Endpoint**: SSE endpoint URL (e.g., `"http://localhost:8080/sse"`)
+- **Endpoint**: Server endpoint URL (e.g., `"http://localhost:8080/sse"` or `"https://mcp.example.com/mcp"`)
 - **API Key**: Optional API key or environment variable name

 ### Local Stdio Configuration

docs/concepts/mcp/mcp-providers.md

Lines changed: 22 additions & 7 deletions
@@ -8,36 +8,45 @@ An MCP provider defines how Data Designer connects to a tool server. Data Design

 | Provider Class | Connection Method | Use Case |
 |---------------|-------------------|----------|
-| `MCPProvider` | HTTP Server-Sent Events | Connect to a pre-existing MCP server |
+| `MCPProvider` | SSE or Streamable HTTP | Connect to a pre-existing MCP server |
 | `LocalStdioMCPProvider` | Subprocess via stdin/stdout | Launch an MCP server as a subprocess |

 When you create a `ToolConfig`, you reference providers by name, and Data Designer uses those provider settings to communicate with the appropriate MCP servers.

-## MCPProvider (Remote SSE)
+## MCPProvider (Remote)

-Use `MCPProvider` to connect to a pre-existing MCP server via Server-Sent Events:
+Use `MCPProvider` to connect to a pre-existing MCP server. Both SSE (Server-Sent Events) and Streamable HTTP transports are supported:

 ```python
 import data_designer.config as dd
 from data_designer.interface import DataDesigner

-mcp_provider = dd.MCPProvider(
+# SSE transport (default)
+sse_provider = dd.MCPProvider(
     name="remote-mcp",
     endpoint="http://localhost:8080/sse",
     api_key="MCP_API_KEY",  # Environment variable name
 )

-data_designer = DataDesigner(mcp_providers=[mcp_provider])
+# Streamable HTTP transport
+http_provider = dd.MCPProvider(
+    name="remote-tools",
+    endpoint="https://mcp.example.com/mcp",
+    api_key="MCP_API_KEY",
+    provider_type="streamable_http",
+)
+
+data_designer = DataDesigner(mcp_providers=[sse_provider, http_provider])
 ```

 ### MCPProvider Fields

 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | `name` | `str` | Yes | Unique identifier for the provider |
-| `endpoint` | `str` | Yes | SSE endpoint URL (e.g., `"http://localhost:8080/sse"`) |
+| `endpoint` | `str` | Yes | Endpoint URL for the remote MCP server |
 | `api_key` | `str` | No | API key or environment variable name |
-| `provider_type` | `str` | No | Always `"sse"` (set automatically) |
+| `provider_type` | `str` | No | Transport type: `"sse"` (default) or `"streamable_http"` |

 ## LocalStdioMCPProvider (Subprocess)

@@ -103,6 +112,12 @@ providers:
     endpoint: http://localhost:8080/sse
     api_key: ${MCP_API_KEY}

+  # Remote Streamable HTTP provider
+  - name: remote-tools
+    provider_type: streamable_http
+    endpoint: https://mcp.example.com/mcp
+    api_key: ${MCP_API_KEY}
+
   # Local stdio provider
   - name: local-tools
     provider_type: stdio

packages/data-designer-config/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ dependencies = [
     "jinja2>=3.1.6,<4",
     "numpy>=1.23.5,<3",
     "pandas>=2.3.3,<3",
-    "pillow>=12.0.0,<13",
+    "pillow>=12.1.1,<13",
     "pyarrow>=19.0.1,<20",  # Required for parquet I/O operations
     "pydantic[email]>=2.9.2,<3",
     "pygments>=2.19.2,<3",

packages/data-designer-config/src/data_designer/config/config_builder.py

Lines changed: 1 addition & 1 deletion
@@ -705,7 +705,7 @@ def __repr__(self) -> str:
         Returns:
             A formatted string showing the builder's configuration including seed dataset and column information grouped by type.
         """
-        if len(self._column_configs) == 0:
+        if len(self._column_configs) == 0 and self._seed_config is None:
             return f"{self.__class__.__name__}()"

         props_to_repr = {
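
The effect of the widened guard can be sketched with a small stand-in class. `SketchBuilder` and its constructor arguments are hypothetical; they only mirror the two attributes named in the hunk above and do not reproduce the real builder's output format.

```python
from __future__ import annotations


class SketchBuilder:
    """Hypothetical stand-in for the real builder; it only mirrors the guard in the diff."""

    def __init__(self, column_configs: list[str] | None = None, seed_config: str | None = None) -> None:
        self._column_configs = column_configs or []
        self._seed_config = seed_config

    def __repr__(self) -> str:
        # New guard: the short form is used only when there are no columns AND no seed dataset.
        if len(self._column_configs) == 0 and self._seed_config is None:
            return f"{self.__class__.__name__}()"
        # Assumed long form, for illustration only.
        return f"{self.__class__.__name__}(seed_dataset={self._seed_config!r}, columns={self._column_configs!r})"


empty_repr = repr(SketchBuilder())
seed_only_repr = repr(SketchBuilder(seed_config="hf seed"))
```

With the old single-condition guard, the seed-only builder would have taken the short branch and its seed dataset would not have appeared in the output.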

packages/data-designer-config/src/data_designer/config/mcp.py

Lines changed: 16 additions & 5 deletions
@@ -15,25 +15,36 @@ class MCPProvider(ConfigBase):
     """Configuration for a remote MCP server connection.

     MCPProvider is used to connect to pre-existing MCP servers via SSE (Server-Sent Events)
-    transport. For local subprocess-based MCP servers, use LocalStdioMCPProvider instead.
+    or Streamable HTTP transport. For local subprocess-based MCP servers, use
+    LocalStdioMCPProvider instead.

     Attributes:
         name (str): Unique name used to reference this MCP provider.
-        endpoint (str): SSE endpoint URL for connecting to the remote MCP server.
+        endpoint (str): Endpoint URL for connecting to the remote MCP server.
         api_key (str | None): Optional API key for authentication. Defaults to None.
-        provider_type (Literal["sse"]): Transport type discriminator, always "sse".
+        provider_type (Literal["sse", "streamable_http"]): Transport type discriminator.
+            Defaults to ``"sse"``.

     Examples:
-        Remote SSE transport:
+        Remote SSE transport (default):

         >>> MCPProvider(
         ...     name="remote-mcp",
         ...     endpoint="http://localhost:8080/sse",
         ...     api_key="your-api-key",
         ... )
+
+        Remote Streamable HTTP transport:
+
+        >>> MCPProvider(
+        ...     name="remote-mcp",
+        ...     endpoint="https://api.example.com/mcp",
+        ...     api_key="your-api-key",
+        ...     provider_type="streamable_http",
+        ... )
     """

-    provider_type: Literal["sse"] = "sse"
+    provider_type: Literal["sse", "streamable_http"] = "sse"
     name: str
     endpoint: str
     api_key: str | None = None
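
A minimal sketch of what the widened `Literal` field permits, using only the standard library. `SketchProvider` is a hypothetical dataclass stand-in and does not use the real `ConfigBase`; the explicit check in `__post_init__` is an assumption standing in for whatever validation the real base class performs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, get_args

ProviderType = Literal["sse", "streamable_http"]


@dataclass(frozen=True)
class SketchProvider:
    """Hypothetical stand-in for MCPProvider, built on a dataclass instead of the real base class."""

    name: str
    endpoint: str
    api_key: str | None = None
    provider_type: str = "sse"  # same default as in the diff

    def __post_init__(self) -> None:
        # Reject anything outside the two literal values accepted by the widened field.
        if self.provider_type not in get_args(ProviderType):
            raise ValueError(f"unsupported provider_type: {self.provider_type!r}")


default_provider = SketchProvider(name="remote-mcp", endpoint="http://localhost:8080/sse")
http_provider = SketchProvider(
    name="remote-tools",
    endpoint="https://mcp.example.com/mcp",
    provider_type="streamable_http",
)
```

Because `"sse"` stays the default, existing configurations that omit `provider_type` keep their current behavior.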

packages/data-designer-config/tests/config/test_config_builder.py

Lines changed: 10 additions & 0 deletions
@@ -768,6 +768,16 @@ def test_with_seed_dataset_sampling_strategy(stub_empty_builder):
     assert seed_config.sampling_strategy == SamplingStrategy.SHUFFLE


+def test_repr_includes_seed_dataset_when_no_columns(stub_empty_builder) -> None:
+    """repr should still show seed dataset when it is the only configured item."""
+    source = HuggingFaceSeedSource(path="datasets/test-repo/testing/data.csv")
+    stub_empty_builder.with_seed_dataset(source)
+
+    repr_string = repr(stub_empty_builder)
+
+    assert "seed_dataset: hf seed" in repr_string
+
+
 def test_add_model_config(stub_empty_builder):
     assert len(stub_empty_builder.model_configs) == 1
     assert stub_empty_builder.model_configs[0].alias == "stub-model"

packages/data-designer-config/tests/config/test_mcp.py

Lines changed: 21 additions & 0 deletions
@@ -14,11 +14,32 @@ def test_mcp_provider_requires_endpoint() -> None:
     provider = MCPProvider(name="sse", endpoint="http://localhost:8080")
     assert provider.endpoint == "http://localhost:8080"
     assert provider.api_key is None
+    assert provider.provider_type == "sse"

     provider_with_key = MCPProvider(name="sse-auth", endpoint="http://localhost:8080", api_key="secret")
     assert provider_with_key.api_key == "secret"


+def test_mcp_provider_streamable_http() -> None:
+    provider = MCPProvider(
+        name="streamable",
+        endpoint="https://api.example.com/mcp",
+        provider_type="streamable_http",
+    )
+    assert provider.provider_type == "streamable_http"
+    assert provider.endpoint == "https://api.example.com/mcp"
+    assert provider.api_key is None
+
+    provider_with_key = MCPProvider(
+        name="streamable-auth",
+        endpoint="https://api.example.com/mcp",
+        provider_type="streamable_http",
+        api_key="secret",
+    )
+    assert provider_with_key.api_key == "secret"
+    assert provider_with_key.provider_type == "streamable_http"
+
+
 def test_local_stdio_mcp_provider_requires_command() -> None:
     with pytest.raises(ValidationError):
         LocalStdioMCPProvider(name="missing-command")

packages/data-designer-engine/src/data_designer/engine/mcp/io.py

Lines changed: 13 additions & 3 deletions
@@ -39,8 +39,9 @@
 from mcp import ClientSession, StdioServerParameters
 from mcp.client.sse import sse_client
 from mcp.client.stdio import stdio_client
+from mcp.client.streamable_http import streamablehttp_client

-from data_designer.config.mcp import LocalStdioMCPProvider, MCPProviderT
+from data_designer.config.mcp import LocalStdioMCPProvider, MCPProvider, MCPProviderT
 from data_designer.engine.mcp.errors import MCPToolError
 from data_designer.engine.mcp.registry import MCPToolDefinition, MCPToolResult

@@ -211,11 +212,15 @@ async def create_session() -> ClientSession:
                     env=provider.env,
                 )
                 ctx = stdio_client(params)
+            elif isinstance(provider, MCPProvider) and provider.provider_type == "streamable_http":
+                headers = _build_auth_headers(provider.api_key)
+                ctx = streamablehttp_client(provider.endpoint, headers=headers)
             else:
                 headers = _build_auth_headers(provider.api_key)
                 ctx = sse_client(provider.endpoint, headers=headers)

-            read, write = await ctx.__aenter__()
+            ctx_result = await ctx.__aenter__()
+            read, write = ctx_result[0], ctx_result[1]
             new_session = ClientSession(read, write)
             await new_session.__aenter__()
             await new_session.initialize()
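
The move from tuple unpacking to indexing appears to accommodate transports whose context manager yields more than two values. To my understanding, `streamablehttp_client` in the MCP Python SDK yields a third item (a session-ID callback) alongside the read and write streams, while `sse_client` and `stdio_client` yield two. The sketch below uses hypothetical stand-in context managers rather than the real clients, so it runs without a server.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def two_item_client():
    # Stand-in for a transport that yields (read, write).
    yield ("read", "write")


@asynccontextmanager
async def three_item_client():
    # Stand-in for a transport that yields (read, write, extra).
    yield ("read", "write", lambda: "session-id")


async def open_streams(ctx):
    # Indexing works for both shapes; `read, write = ...` would raise ValueError on three items.
    async with ctx as ctx_result:
        return ctx_result[0], ctx_result[1]


two = asyncio.run(open_streams(two_item_client()))
three = asyncio.run(open_streams(three_item_client()))
```

Both calls return the same `(read, write)` pair, so the session setup that follows can stay transport-agnostic.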
@@ -399,6 +404,11 @@ def list_tools(provider: MCPProviderT, timeout_sec: float | None = None) -> tupl
     return _MCP_IO_SERVICE.list_tools(provider, timeout_sec=timeout_sec)


+def list_tool_names(provider: MCPProviderT, timeout_sec: float) -> list[str]:
+    """Return the names of all tools available on an MCP provider."""
+    return [t.name for t in _MCP_IO_SERVICE.list_tools(provider, timeout_sec=timeout_sec)]
+
+
 def call_tools(
     calls: list[tuple[MCPProviderT, str, dict[str, Any]]],
     *,
@@ -434,7 +444,7 @@ def get_session_pool_info() -> dict[str, Any]:


 def _build_auth_headers(api_key: str | None) -> dict[str, Any] | None:
-    """Build authentication headers for SSE client."""
+    """Build authentication headers for remote MCP clients."""
     if not api_key:
         return None
     return {"Authorization": f"Bearer {api_key}"}
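
A self-contained sketch of how the shared header helper and the transport selection fit together. `build_auth_headers` copies the logic shown in the hunk above; `pick_transport` is a hypothetical helper that only names the client each provider type maps to, and it selects on a string where the real code uses `isinstance` for the stdio case.

```python
from __future__ import annotations

from typing import Any


def build_auth_headers(api_key: str | None) -> dict[str, Any] | None:
    """Mirror of the helper in the diff: no key (or an empty key) means no headers."""
    if not api_key:
        return None
    return {"Authorization": f"Bearer {api_key}"}


def pick_transport(provider_type: str) -> str:
    """Hypothetical helper naming the client each provider_type maps to in the diff."""
    if provider_type == "stdio":
        return "stdio_client"
    if provider_type == "streamable_http":
        return "streamablehttp_client"
    # As in the diff's else branch, any other remote provider falls through to SSE.
    return "sse_client"


headers = build_auth_headers("secret")
transport = pick_transport("streamable_http")
```

Both remote transports receive the same bearer-token headers, which is why the docstring no longer mentions SSE specifically.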

packages/data-designer-engine/src/data_designer/engine/testing/fixtures.py

Lines changed: 11 additions & 0 deletions
@@ -81,6 +81,17 @@ def stub_sse_provider() -> MCPProvider:
     )


+@pytest.fixture
+def stub_streamable_http_provider() -> MCPProvider:
+    """Create a stub Streamable HTTP MCP provider for testing."""
+    return MCPProvider(
+        name="test-streamable-http",
+        endpoint="https://api.example.com/mcp",
+        api_key="test-key",
+        provider_type="streamable_http",
+    )
+
+
 # =============================================================================
 # Tool config fixtures
 # =============================================================================
