@@ -38,6 +38,7 @@ It implements the Model Context Protocol specification, handling model context r
- Supports resource registration and retrieval
- Supports stdio & Streamable HTTP (including SSE) transports
- Supports notifications for list changes (tools, prompts, resources)
- Supports sampling (server-to-client LLM completion requests)

### Supported Methods

@@ -51,6 +52,7 @@ It implements the Model Context Protocol specification, handling model context r
- `resources/read` - Retrieves a specific resource by name
- `resources/templates/list` - Lists all registered resource templates and their schemas
- `completion/complete` - Returns autocompletion suggestions for prompt arguments and resource URIs
- `sampling/createMessage` - Requests LLM completion from the client (server-to-client)

### Custom Methods

@@ -103,6 +105,163 @@ end
- Raises `MCP::Server::MethodAlreadyDefinedError` if trying to override an existing method
- Supports the same exception reporting and instrumentation as standard methods

### Sampling

The Model Context Protocol allows servers to request LLM completions from clients through the `sampling/createMessage` method.
This enables servers to leverage the client's LLM capabilities without needing direct access to AI models.

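On the wire, this is an ordinary JSON-RPC request, just sent in the server→client direction. A minimal sketch of the request shape, following the field names in the MCP sampling specification (note the camelCase `maxTokens` on the wire, versus the snake_case `max_tokens:` keyword argument in this SDK's Ruby API; the `id` value is arbitrary):

```ruby
require "json"

# Sketch of the JSON-RPC request a server sends to the client
# to ask for an LLM completion.
request = {
  jsonrpc: "2.0",
  id: 1,
  method: "sampling/createMessage",
  params: {
    messages: [
      { role: "user", content: { type: "text", text: "What is the capital of France?" } }
    ],
    maxTokens: 100
  }
}

puts JSON.generate(request)
```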
**Key Concepts:**

- **Server-to-Client Request**: Unlike typical MCP methods (client→server), sampling is initiated by the server
- **Client Capability**: Clients must declare the `sampling` capability during initialization
- **Tool Support**: When using tools in sampling requests, clients must declare the `sampling.tools` capability
- **Human-in-the-Loop**: Clients can implement user approval before forwarding requests to LLMs

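Concretely, these capability checks amount to inspecting the capabilities object the client declares in its `initialize` request. A hypothetical sketch, assuming the capabilities arrive as a nested Hash where an empty Hash under a key means "capability present" (the variable names are illustrative, not SDK API):

```ruby
# Illustrative capabilities hash as a client might declare it
# during initialization.
client_capabilities = { "sampling" => { "tools" => {} } }

# Plain sampling support: the "sampling" key must be present.
supports_sampling = client_capabilities.key?("sampling")

# Tool use inside sampling additionally requires sampling.tools.
supports_sampling_tools = !client_capabilities.dig("sampling", "tools").nil?
```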
**Usage Example (Stdio transport):**

`Server#create_sampling_message` is for single-client transports (e.g., `StdioTransport`).
For multi-client transports (e.g., `StreamableHTTPTransport`), use `server_context.create_sampling_message` inside tools instead,
which routes the request to the correct client session.

```ruby
server = MCP::Server.new(name: "my_server")
transport = MCP::Server::Transports::StdioTransport.new(server)
server.transport = transport
```

The client must declare the sampling capability during initialization; this happens automatically when the client connects.

```ruby
result = server.create_sampling_message(
  messages: [
    { role: "user", content: { type: "text", text: "What is the capital of France?" } }
  ],
  max_tokens: 100,
  system_prompt: "You are a helpful assistant.",
  temperature: 0.7
)
```

The result contains the LLM response:

```ruby
{
  role: "assistant",
  content: { type: "text", text: "The capital of France is Paris." },
  model: "claude-3-sonnet-20240307",
  stopReason: "endTurn"
}
```

**Parameters:**

Required:

- `messages:` (Array) - Array of message objects with `role` and `content`
- `max_tokens:` (Integer) - Maximum tokens in the response

Optional:

- `system_prompt:` (String) - System prompt for the LLM
- `model_preferences:` (Hash) - Model selection preferences (e.g., `{ intelligencePriority: 0.8 }`)
- `include_context:` (String) - Context inclusion: `"none"`, `"thisServer"`, or `"allServers"` (soft-deprecated)
- `temperature:` (Float) - Sampling temperature
- `stop_sequences:` (Array) - Sequences that stop generation
- `metadata:` (Hash) - Additional metadata
- `tools:` (Array) - Tools available to the LLM (requires `sampling.tools` capability)
- `tool_choice:` (Hash) - Tool selection mode (e.g., `{ mode: "auto" }`)

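As an illustration of `model_preferences:`, hints and priorities can be combined. Per the MCP specification, hints are advisory model-name substrings that the client matches against its available models, and each priority is a weight between 0 and 1; the concrete values below are illustrative only:

```ruby
# Illustrative model_preferences value: prefer a Sonnet-class model,
# weighting intelligence over speed and cost.
model_preferences = {
  hints: [{ name: "claude-3-sonnet" }],
  intelligencePriority: 0.8,
  speedPriority: 0.5,
  costPriority: 0.3
}
```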
**Using Sampling in Tools (works with both Stdio and HTTP transports):**

Tools that accept a `server_context:` parameter can call `create_sampling_message` on it.
The request is automatically routed to the correct client session.
Set `server.server_context = server` so that `server_context.create_sampling_message` delegates to the server:

```ruby
class SummarizeTool < MCP::Tool
  description "Summarize text using LLM"
  input_schema(
    properties: {
      text: { type: "string" }
    },
    required: ["text"]
  )

  def self.call(text:, server_context:)
    result = server_context.create_sampling_message(
      messages: [
        { role: "user", content: { type: "text", text: "Please summarize: #{text}" } }
      ],
      max_tokens: 500
    )

    MCP::Tool::Response.new([{
      type: "text",
      text: result[:content][:text]
    }])
  end
end

server = MCP::Server.new(name: "my_server", tools: [SummarizeTool])
server.server_context = server
```

**Tool Use in Sampling:**

When tools are provided in a sampling request, the LLM can call them during generation.
The server must handle tool calls and continue the conversation with tool results:

```ruby
result = server.create_sampling_message(
  messages: [
    { role: "user", content: { type: "text", text: "What's the weather in Paris?" } }
  ],
  max_tokens: 1000,
  tools: [
    {
      name: "get_weather",
      description: "Get weather for a city",
      inputSchema: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"]
      }
    }
  ],
  tool_choice: { mode: "auto" }
)

if result[:stopReason] == "toolUse"
  tool_results = result[:content].map do |tool_use|
    weather_data = get_weather(tool_use[:input][:city])

    {
      type: "tool_result",
      toolUseId: tool_use[:id],
      content: [{ type: "text", text: weather_data.to_json }]
    }
  end

  final_result = server.create_sampling_message(
    messages: [
      { role: "user", content: { type: "text", text: "What's the weather in Paris?" } },
      { role: "assistant", content: result[:content] },
      { role: "user", content: tool_results }
    ],
    max_tokens: 1000,
    tools: [...]
  )
end
```

**Error Handling:**

- Raises `RuntimeError` if the transport is not set
- Raises `RuntimeError` if the client does not support the `sampling` capability
- Raises `RuntimeError` if `tools` are used but the client lacks the `sampling.tools` capability
- Raises `StandardError` if the client returns an error response
### Notifications

The server supports sending notifications to clients when lists of tools, prompts, or resources change. This enables real-time updates without polling.