@@ -28,16 +55,16 @@ Use `--constraints [CODE ...]` to customize output.
 
 ### 🛠️ Functional testing
 
-MCP servers are intended to be used by LLM agents, so we can optionally test them with an LLM agent. When enabled with the `--test` flag, the interviewer uses your specified LLM to generate a test plan based on the MCP server's capabilities and then executes that plan (e.g. by calling tools), collecting statistics about observed tool behavior.
+MCP servers are intended to be used by LLM agents, so the interviewer can optionally test them with an LLM agent. When enabled with the `--test` flag, the interviewer uses your specified LLM to generate a test plan based on the MCP server's capabilities and then executes that plan (e.g. by calling tools), collecting statistics about observed tool behavior.
 
-### 🧪 LLM evaluation
+### 🤖 LLM evaluation
 
 ***Note: this is an experimental feature. All LLM generated evaluations should be manually inspected for errors.***
 
 The interviewer can also use your specified LLM to provide structured and natural language evaluations of the server's features.
 
 
-### 📋 Reports
+### 📋 Report generation
 
 The interviewer generates a Markdown report (and accompanying `.json` file with raw data) summarizing the interview results.
 
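The hunk above mentions that the Markdown report ships with a `.json` file of raw data. The diff does not show that file's schema, so the following is only a minimal sketch for getting a first look at it: it lists the top-level keys and the type of each value without assuming any particular field. The filename `mcp-interview.json` is an assumption (chosen by analogy with the `mcp-interview.md` report linked elsewhere in the README), and `summarize_report` is a hypothetical helper, not part of `mcp-interviewer`.

```python
import json
from pathlib import Path


def summarize_report(json_path):
    """Map each top-level key of a JSON report to the type name of its value.

    Hypothetical helper: makes no assumption about the report's schema.
    """
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # The root is not an object, so report the root type only.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}


if __name__ == "__main__":
    # Assumed filename; adjust to wherever the interviewer wrote its output.
    report_path = Path("mcp-interview.json")
    if report_path.exists():
        for key, kind in summarize_report(report_path).items():
            print(f"{key}: {kind}")
```

Starting from the key listing rather than hard-coded field names keeps the sketch valid even if the raw-data layout changes between versions.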
@@ -63,39 +90,62 @@ Use `--reports [CODE ...]` to customize output.
 
 </details>
 
+## Installation
+
+### As a CLI tool
+
+The easiest way to install `mcp-interviewer` is as a `uv` tool. Follow [these instructions](https://docs.astral.sh/uv/getting-started/installation/) to install uv.
+Read more about [Python usage](./README.md#python).
 
 ## Quick Start
 
 ⚠️ ***mcp-interviewer arbitrarily executes the provided MCP server command in a child process. Whenever possible, run your server in a container like in the examples below to isolate the server from your host system.***
 
-🚨 ***mcp-interviewer actually invokes the server's tools, DO NOT use mcp-interviewer with admin privileges etc***
+First, [install](./README.md#as-a-cli-tool) `mcp-interviewer` as a CLI tool.
 
 ```bash
 # Command to run npx safely inside a Docker container
Which will generate a report like [this](./mcp-interview.md).
@@ -105,32 +155,47 @@ Which will generate a report like [this](./mcp-interview.md).
 ### CLI
 
 **Key Flags:**
-- `--test`: Enable functional testing (disabled by default for faster execution)
-- `--judge`: Enable experimental LLM evaluation of tools and tests
-- `--reports [CODE ...]`: Customize which report sections to include
+
 - `--constraints [CODE ...]`: Customize which constraints to check
+- `--reports [CODE ...]`: Customize which report sections to include
+
+
+
+- `--test`: Enable functional testing. 🚨 ***This option causes mcp-interviewer to invoke the server's tools. Be careful to limit the server's access to your host system, sensitive data, etc before using these options.***
+- `--judge-tools`: Enable experimental LLM evaluation of tools
+- `--judge-test`: Enable experimental LLM evaluation of functional tests (requires `--test`)
+- `--judge`: Enable all LLM evaluation (equivalent to `--judge-tools --judge-test`)
 
 ```bash
 # Docker command to run uvx inside a container
 UVX_CONTAINER="docker run -i --rm ghcr.io/astral-sh/uv:python3.12-alpine uvx"
 
-# Basic constraint checking and server inspection (no functional testing)
@@ -141,14 +206,15 @@ The CLI provides two ways of customizing your model client:
 
 1. `openai.OpenAI` keyword arguments
 
-You can provide keyword arguments to the OpenAI client constructor via the "--client-kwargs" CLI option. For example, to connect to gpt-oss:20b running locally via Ollama:
+You can provide keyword arguments to the OpenAI client constructor via the "--client-kwargs" CLI option. For example, to connect to gpt-oss:20b running locally via Ollama for LLM features:
 
 ```bash
 mcp-interviewer \
 --client-kwargs \
 "base_url=http://localhost:11434/v1" \
 "api_key=ollama" \
 --model "gpt-oss:20b" \
+--test \
 "docker run -i --rm node:lts npx -y @modelcontextprotocol/server-everything"
 ```
 
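For anyone scripting these invocations, the flag rules touched by this change can be sketched as a small argument builder. This is an illustrative helper under stated assumptions, not part of `mcp-interviewer`: the function name and its parameters are hypothetical, and it only emits flags that appear in the hunks above (`--client-kwargs`, `--model`, `--test`, `--judge-tools`, `--judge-test`, `--judge`). It also enforces the documented rule that `--judge-test` requires `--test`.

```python
import shlex


def build_interviewer_command(
    server_command,
    model,
    client_kwargs=None,
    test=False,
    judge_tools=False,
    judge_test=False,
):
    """Assemble an argv list for the mcp-interviewer CLI (hypothetical helper)."""
    if judge_test and not test:
        # The README states that --judge-test requires --test.
        raise ValueError("--judge-test requires --test")

    argv = ["mcp-interviewer"]
    if client_kwargs:
        # Each keyword argument is passed as a separate key=value token.
        argv.append("--client-kwargs")
        argv.extend(f"{key}={value}" for key, value in client_kwargs.items())
    argv.extend(["--model", model])
    if test:
        argv.append("--test")
    if judge_tools and judge_test:
        # Documented as equivalent to --judge-tools --judge-test.
        argv.append("--judge")
    elif judge_tools:
        argv.append("--judge-tools")
    elif judge_test:
        argv.append("--judge-test")
    # The server command is passed as one quoted positional argument.
    argv.append(server_command)
    return argv


command = build_interviewer_command(
    "docker run -i --rm node:lts npx -y @modelcontextprotocol/server-everything",
    model="gpt-oss:20b",
    client_kwargs={"base_url": "http://localhost:11434/v1", "api_key": "ollama"},
    test=True,
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command)` directly; `shlex.join` is used here only to print a shell-safe form for inspection.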
@@ -166,13 +232,31 @@ The CLI provides two ways of customizing your model client:
 ```bash
 mcp-interviewer \
 --client "my_client.azure_client" \
---model "gpt-4o_2024-11-20" \
+--model "gpt-4.1_2024-11-20" \
+--test \
 "docker run -i --rm node:lts npx -y @modelcontextprotocol/server-everything"
 ```
 
 
 ### Python
 
+**Basic usage (constraint checking and server inspection only):**
+
+```python
+from mcp_interviewer import MCPInterviewer, StdioServerParameters
-MCP Interviewer was developed for research and experimental purposes. Further testing and validation are needed before considering its application in commercial or real-world scenarios. The MCP Python SDK executes arbitrary commands on the host machine, so users should run server commands in isolated containers and use external security tools to validate MCP server safety before running MCP Interviewer. Additionally, MCP Servers may have malicious or misleading tool metadata that may cause inaccurate MCP Interviewer outputs. Users should manually examine MCP Interviewer outputs for signs of malicious manipulation.
+MCP Interviewer was developed for research and experimental purposes. Further testing and validation are needed before considering its application in commercial or real-world scenarios.
+
+The MCP Python SDK executes arbitrary commands on the host machine, so users should run server commands in isolated containers and use external security tools to validate MCP server safety before running MCP Interviewer.
+
+Additionally, MCP Servers may have malicious or misleading tool metadata that may cause inaccurate MCP Interviewer outputs. Users should manually examine MCP Interviewer outputs for signs of malicious manipulation.
 
 See [TRANSPARENCY.md](./TRANSPARENCY.md) for more information.