README.md
20 additions & 9 deletions (+20 -9)
@@ -28,7 +28,7 @@ Use `--constraints [CODE ...]` to customize output.
 
 ### 🛠️ Functional testing
 
-MCP servers are intended to be used by LLM agents, so we test them with an LLM agent. Using your specified LLM, the interviewer generates a test plan based on the MCP server's capabilities and then executes that plan (e.g. by calling tools), collecting statistics about observed tool behavior.
+MCP servers are intended to be used by LLM agents, so we can optionally test them with an LLM agent. When enabled with the `--test` flag, the interviewer uses your specified LLM to generate a test plan based on the MCP server's capabilities and then executes that plan (e.g. by calling tools), collecting statistics about observed tool behavior.
 
 ### 🧪 LLM evaluation
 
@@ -68,6 +68,8 @@ Use `--reports [CODE ...]` to customize output.
 
 ⚠️ ***mcp-interviewer arbitrarily executes the provided MCP server command in a child process. Whenever possible, run your server in a container like in the examples below to isolate the server from your host system.***
 
+🚨 ***mcp-interviewer actually invokes the server's tools, DO NOT use mcp-interviewer with admin privileges etc***
+
 ```bash
 # Command to run npx safely inside a Docker container
 NPX_CONTAINER="docker run -i --rm node:lts npx"
@@ -102,21 +104,30 @@ Which will generate a report like [this](./mcp-interview.md).
 
 ### CLI
 
+**Key Flags:**
+- `--test`: Enable functional testing (disabled by default for faster execution)
+- `--judge`: Enable experimental LLM evaluation of tools and tests
+- `--reports [CODE ...]`: Customize which report sections to include
+- `--constraints [CODE ...]`: Customize which constraints to check
+
 ```bash
 # Docker command to run uvx inside a container
 UVX_CONTAINER="docker run -i --rm ghcr.io/astral-sh/uv:python3.12-alpine uvx"
src/mcp_interviewer/cli.py
28 additions & 4 deletions (+28 -4)
@@ -1,4 +1,5 @@
 import argparse
+import sys
 
 
 def cli():
@@ -72,6 +73,16 @@ def cli():
         nargs="+",
         help="Specify which constraint violations to check (all enabled by default). Can use full names (e.g., openai-tool-count, openai-name-length) or shorthand codes (e.g., OTC, ONL, ONP, OTL, OA)",
     )
+    parser.add_argument(
+        "--test",
+        action="store_true",
+        help="Enable functional testing of the server",
+    )
+    parser.add_argument(
+        "--accept-risk",
+        action="store_true",
+        help="Bypass user confirmation of functional test risk.",

+        "🚨 MCP Interviewer will make tool call requests to your MCP server. Depending on the server's capabilities this can lead to irreversible outcomes (e.g. deleting files)."
+    )
+    accept_risk = args.accept_risk
+    while not accept_risk:
+        input_str = input("Do you accept this risk? y|[n]: ").strip().lower()
+        if not input_str or input_str == "n":
+            sys.exit(1)
+        else:
+            accept_risk = input_str == "y"
+
     import importlib
 
     module, client = args.client.rsplit(".")
@@ -155,17 +182,14 @@ def cli():
 
     from .main import main
 
-    # Handle the --judge flag which enables experimental judging operations (disabled by default)
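
For readers skimming the hunks above, the two new flags and the confirmation loop can be sketched in isolation. This is a minimal illustration, not the code in `cli.py`: the names `build_parser` and `confirm_risk` are hypothetical, the injectable `ask` parameter is added only so the loop can be exercised without a terminal, and the sketch returns `False` where the diff calls `sys.exit(1)`. Whether the real prompt is shown only when `--test` is passed is not visible in the hunks shown, so the sketch does not assume it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical trimmed-down parser holding only the two flags added in this diff."""
    parser = argparse.ArgumentParser(prog="mcp-interviewer")
    parser.add_argument(
        "--test",
        action="store_true",
        help="Enable functional testing of the server",
    )
    parser.add_argument(
        "--accept-risk",
        action="store_true",
        help="Bypass user confirmation of functional test risk.",
    )
    return parser


def confirm_risk(accept_risk: bool, ask=input) -> bool:
    """Repeat the prompt until the user answers 'y', or declines with 'n' or an empty line.

    Assumption: returns False on decline instead of exiting, so the caller
    decides what to do; the diff exits with status 1 at that point.
    """
    while not accept_risk:
        answer = ask("Do you accept this risk? y|[n]: ").strip().lower()
        if not answer or answer == "n":
            return False
        # Any answer other than 'y' leaves accept_risk False and prompts again.
        accept_risk = answer == "y"
    return True
```

Passing `--accept-risk` makes `confirm_risk(args.accept_risk)` return immediately without prompting, which matches the flag's help text ("Bypass user confirmation of functional test risk.").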