This content originally appeared on DEV Community and was authored by Maksim Kachurin
TL;DR
MCP Elicitation provides a standardized way for servers to request additional input from users through the client during a session.
In this article, I break down what problems it solves, and walk through building an MCP server and client with elicitation support — including a real chat UI that shows confirmation prompts to the user.
What is MCP
The Model Context Protocol (MCP) is an open standard developed by Anthropic that enables LLMs to receive data from any backend or application in a single, standardized format.
Before MCP, developers of agent-based AI systems had to rely on custom tools and logic to access the APIs of third-party applications. It was a tedious, manual process that didn’t scale well — every integration had to be built and maintained by the agent developers themselves.
MCP shifts that responsibility: now application developers can expose their APIs in a unified format that most models and agent frameworks can understand out of the box.
MCP evolution: update of June 18, 2025
The protocol keeps evolving and gaining new features. At the time of writing, the latest update was released on June 18, 2025, and here’s what it introduced:
Structured tool output
Tools can now return structured data instead of plain strings. This makes results easier to parse and enables more advanced pipelines. Many agent frameworks already supported structured output for tools declared within the agent itself — now that capability is officially part of the MCP spec as well.
OAuth 2.1: Resource Server + Resource Indicators
MCP servers are now treated as full OAuth Resource Servers. Clients must use Resource Indicators (RFC 8707) to ensure access tokens aren't misused across different servers.
Resource links in tool call results
Tool responses can now include links to external resources — such as files, logs, or pages — in addition to raw data.
Elicitation
This one’s a game-changer: servers can now ask users for input during a session, via the client, with validation handled through JSON Schema. This opens up a new kind of UX for agent interactions.
What is elicitation?
Elicitation is a mechanism in MCP that lets the server pause a tool’s execution until the client provides missing data from the user. A similar concept exists in agent frameworks — often called HITL (Human-In-The-Loop) — where the agent pauses its loop, asks the user for input, and then continues with updated data.
But MCP works differently. The MCP server is isolated: it runs independently, has no knowledge of the frontend, doesn’t control the UX, and can’t talk to the user directly. Typically, the architecture has three layers: the frontend, the agent (backend), and one or more MCP servers connected to the agent.
MCP only communicates with the agent — and that’s it.
In this setup, an MCP tool might simply not have the data it needs — and no way to get it on its own.
Take a common case: you build an MCP server for table bookings. It exposes a bookTable tool. The chatbot calls it, but to complete the booking, the tool needs all the inputs — date, time, number of guests, names, special requests. Some of this might be missing, some might be wrong, or the requested slot might be unavailable.
Even without MCP, this kind of human-in-the-loop flow is rarely implemented cleanly in agent frameworks. The usual hack is to create a tool that returns no result — a signal to pause the agent loop. The frontend then renders a UI element (a form, confirm buttons, etc.) asking the user for additional data. Once the user submits, that data is attached as the tool’s result, and a new request with updated message history is sent to the agent. The agent sees the original tool call plus the result and decides what to do next.
In this flow, the tool’s input comes from the LLM, and its output comes from the user.
In practice, you often have to define two tools:
one on the frontend for collecting or confirming data (gather_booking_info_tool)
and another on the backend for processing it (process_booking_tool)
This comes with a bunch of downsides — context loss on the backend, duplicate requests, extra token usage, and more.
Now, if you’re building an MCP server, things get even messier. The tool has no shared execution context with the agent that called it. That means it can’t pause the loop, can’t directly ask for more input — it’s fully decoupled.
Elicitation fixes this. It gives MCP tools a way to request data from the client during execution — and actually wait for it.
The best part? It all happens within a single request and session.
The server sends an elicitation request — it returns a Promise — and that Promise is resolved with user-provided data once the client responds.
This functionality can also be used to keep the client informed about a tool's progress: the MCP server can send a request whenever the state of the running task changes, and the client can react to it, for example by rendering a progress bar in the interface.
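The table-booking example from earlier can be sketched as a tool handler that simply awaits the elicitation. This is a conceptual sketch, not any specific SDK's API: the `elicit` callback and the `bookTable` function are hypothetical stand-ins for whatever your framework provides.

```typescript
// Conceptual sketch of the elicitation flow inside a tool handler.
// The `elicit` callback stands in for the SDK: it sends an
// elicitation/create request and resolves once the client responds.
type ElicitResult = { action: 'accept' | 'decline' | 'cancel'; content?: Record<string, any> };
type ElicitFn = (message: string) => Promise<ElicitResult>;

async function bookTable(
  input: { date?: string; guests?: number },
  elicit: ElicitFn,
): Promise<string> {
  // Missing data? Pause execution and ask the user through the client.
  if (!input.date) {
    const answer = await elicit('What date would you like to book?');
    if (answer.action !== 'accept') {
      return 'Booking aborted: no date provided.';
    }
    input.date = String(answer.content?.date);
  }
  return `Table booked for ${input.date}.`;
}
```

When the client answers, the awaited Promise resolves and the tool continues inside the same request and session.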
Elicitation Protocol Format
When a tool realizes it’s missing data, it sends an elicitation/create request, passing a JSON Schema that describes the fields it needs.
For example:
{
"jsonrpc": "2.0",
"id": 1,
"method": "elicitation/create",
"params": {
"message": "Please provide your GitHub username",
"requestedSchema": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": ["name"]
}
}
}
The schema supports four field types:
string
{
"type": "string",
"title": "Display Name",
"description": "Description text",
"minLength": 3,
"maxLength": 50,
"format": "email" // Supported: "email", "uri", "date", "date-time"
}
number
{
"type": "number", // or "integer"
"title": "Display Name",
"description": "Description text",
"minimum": 0,
"maximum": 100
}
boolean
{
"type": "boolean",
"title": "Display Name",
"description": "Description text",
"default": false
}
enum
{
"type": "string",
"title": "Display Name",
"description": "Description text",
"enum": ["option1", "option2", "option3"],
"enumNames": ["Option 1", "Option 2", "Option 3"]
}
In response, the protocol expects to receive a result with the action (accept, decline, cancel) and the requested data:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"action": "accept",
"content": {
"name": "octocat"
}
}
}
Required action format:
- action: "accept": the client explicitly approved and submitted the data
- action: "decline": the client explicitly declined the request. The content field is typically omitted.
- action: "cancel": the client dismissed the request without making an explicit choice.
The tool handler should deal with each of these scenarios accordingly.
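A handler might branch on the three actions like this. This is a minimal sketch: the `describeElicitResult` function is illustrative, but the `ElicitResult` shape mirrors the protocol messages shown above.

```typescript
// Minimal sketch of branching on the three elicitation actions.
// Only "accept" carries user-submitted content per the protocol.
type ElicitResult =
  | { action: 'accept'; content: Record<string, unknown> }
  | { action: 'decline' }
  | { action: 'cancel' };

function describeElicitResult(result: ElicitResult): string {
  switch (result.action) {
    case 'accept':
      // The user approved and submitted data matching requestedSchema.
      return `accepted: ${JSON.stringify(result.content)}`;
    case 'decline':
      // Explicit rejection: proceed without the data; don't retry.
      return 'declined by the user';
    case 'cancel':
      // Dismissed without a decision (e.g. the dialog was closed).
      return 'cancelled without an explicit choice';
  }
}
```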
Create MCP server and MCP client with Elicitation support
I’m going to build an MCP server that runs over HTTP, along with a small chat app based on assistant-ui. Then I’ll wire elicitation requests straight into the UI.
MCP Server
Not all frameworks and SDKs support elicitation yet. We could write the code using the official modelcontextprotocol/sdk, but I'll focus on Mastra, an open-source TypeScript agent framework that makes it easier to build agents, MCP servers, clients, and more.
Mastra is built on top of the AI SDK, so it works with most tools that support it.
For the server, I’m using Bun and Hono.js — that gives us a fast HTTP setup with support for multiple MCP servers in one service, plus WebSockets, auth server, and any other logic a real app might need.
Let’s start by creating a new project for the MCP server:
mkdir mcp-server && cd mcp-server
bun init -y
bun add typescript tsx @types/node mastra@latest @mastra/core@latest zod@^3 fetch-to-node hono -D
These commands create a new project and install Mastra along with its dependencies.
Now create the necessary files:
src/index.ts
import { Hono } from 'hono';
const app = new Hono();
import { toFetchResponse, toReqRes } from 'fetch-to-node';
import mcpServer from './mcp-server';
app.all('/mcp/server', async(c) => {
const { req, res } = toReqRes(c.req.raw);
await mcpServer.startHTTP({
url: new URL(c.req.url),
httpPath: '/mcp/server',
req,
res,
options: {
sessionIdGenerator: undefined,
},
});
return await toFetchResponse(res);
});
export default {
...app,
port: 4444,
};
Here we set up a basic web server and mount our mcpServer at the /mcp/server endpoint using the modern streamable HTTP transport. (You could also use HTTP + SSE, or connect to the MCP server directly via stdio.)
The server will run on port 4444.
mcp-server.ts
import { MCPServer } from '@mastra/mcp';
import tools from './tools';
const mcpServer = new MCPServer({
version: '0.1.0',
name: 'MCP server for demonstration purposes',
description: 'This MCP server shows how to use the MCP server API to create a server that can be used in a chat application.',
tools,
});
export default mcpServer;
Here we define the MCP server itself — giving it a name, description, and the list of tools it exposes.
tools.ts
import { createTool } from '@mastra/core';
import { z } from 'zod';
import { spawn } from 'child_process';
export const runCommand = createTool({
id: 'runCommand',
description: 'Run a shell command on the server. Prompts for confirmation if the command is dangerous.',
inputSchema: z.object({
command: z.string().describe('The shell command to run, e.g., "ls"'),
args: z.array(z.string()).optional().describe('Arguments for the command'),
}),
outputSchema: z.object({
stdout: z.string(),
stderr: z.string(),
code: z.number(),
}),
async execute({ context }, options) {
const dangerousCommands = ['rm', 'mv', 'dd', 'shutdown', 'reboot', 'mkfs', 'chmod', 'chown', 'kill', 'killall', 'poweroff', 'halt'];
const cmd = context.command.trim();
const args = context.args || [];
// Ask for confirmation if the command is dangerous
if (dangerousCommands.includes(cmd)) {
const elicitation = (options as any)?.elicitation;
if (!elicitation || typeof elicitation.sendRequest !== 'function') {
return {
code: -1,
stdout: '',
stderr: 'Elicitation is not available in this context. Cannot confirm dangerous command.',
}
}
const result = await elicitation.sendRequest({
message: `Are you sure you want to run the dangerous command: '${cmd}'?`,
requestedSchema: {
type: 'object',
properties: {
confirm: {
type: 'boolean',
title: 'Confirm',
description: `Confirm running '${cmd}'?`,
},
},
required: ['confirm'],
},
});
if (result.action !== 'accept' || !result.content.confirm) {
return {
code: -1,
stdout: '',
stderr: 'Command was rejected by the user.',
}
}
}
return new Promise((resolve, reject) => {
const child = spawn(cmd, args, { shell: true });
let stdout = '';
let stderr = '';
child.stdout.on('data', (data) => { stdout += data.toString(); });
child.stderr.on('data', (data) => { stderr += data.toString(); });
child.on('close', (code) => {
resolve({ stdout, stderr, code: code ?? -1 });
});
child.on('error', (err) => {
reject(err);
});
});
},
});
export default {
runCommand,
};
For this demo, I’m creating a single tool that runs shell commands on the server (don’t do this in a real app — it’s just for demonstration).
If the command looks dangerous (like rm or kill), the tool asks the user for confirmation before running it.
Now let’s start the MCP server:
bun src/index.ts
Agent + MCP Client
To build the agent and demonstrate it with a UI, I'll use assistant-ui.
Create a new Next.js project and install assistant-ui and mastra:
npx assistant-ui@latest create mcp-app
cd mcp-app
bun add mastra -D
This command creates a new Next.js application with a chat interface.
Create a file
.env.local
OPENAI_API_KEY=sk-proj-4fpws4nXpCYXsQVK7DdaUCdLNxMa...
Replace the value with your own OpenAI API key.
Now you can run the application and check that the chat works:
Basic assistant-ui chat interface
Now let’s create a mastra Agent and connect our MCP server to it:
app/api/chat/route.ts
import { openai } from '@ai-sdk/openai';
import { frontendTools } from '@assistant-ui/react-ai-sdk';
import { Agent } from '@mastra/core';
import { MCPClient } from '@mastra/mcp';
import { createDataStreamResponse } from 'ai';
// Define the MCP server name and URL for connecting to the tool server
const MCP_SERVER_NAME = 'testing';
const MCP_SERVER_URL = 'http://localhost:4444/mcp/server';
// Initialize the MCP client with server configuration
const mcp = new MCPClient({
id: '1', // Unique client ID
servers: {
[MCP_SERVER_NAME]: {
url: new URL(MCP_SERVER_URL),
},
},
});
// Create an AI agent with a model, name, and instructions
const agent = new Agent({
model: openai('gpt-4o'), // Use OpenAI GPT-4o model
name: 'Chat Agent',
instructions: `
You are a helpful assistant that provides accurate information.
Do not ask questions before running tools, just run them.
If you see that the user rejects a tool call, then stop and do not try to find another way to perform the task.
`,
});
// API route handler for POST requests
export async function POST(req: Request) {
// Parse the incoming request body for chat messages, system prompt, and tool list
const { messages, system, tools } = await req.json();
// Create a streaming response for the chat
const response = createDataStreamResponse({
status: 200,
statusText: 'OK',
async execute(dataStream) {
// Run the agent and stream its response to the client
const agentStream = await agent.stream(messages, {
system, // Optional system prompt
toolsets: {
...(await mcp.getToolsets()), // Dynamically load toolsets from MCP server
},
clientTools: frontendTools(tools), // Add any client-side tools
});
// Merge the agent's output into the HTTP data stream
agentStream.mergeIntoDataStream(dataStream);
},
// Custom error handler for the stream
onError: (error: any) => `Custom error: ${ error.message }`,
});
// Return the streaming response to the client
return response;
}
This code handles incoming requests from the frontend: it creates an agent and an MCP client, then calls the agent.
The agent streams its response to the frontend in the AI SDK data stream format. It handles tool execution via MCP, processes the results, and can run multiple tools in parallel until it reaches a final answer.
Let’s check that our agent is working and sees the tools exposed by our MCP server:
Our runCommand tool already knows how to return an elicitation request when a potentially dangerous command (like rm) is detected.
Now it’s time to handle that elicitation request in our app.
app/api/chat/route.ts
In the execute method, add:
// Set up elicitation handler to respond to tool confirmation requests from the server
await mcp.elicitation.onRequest(MCP_SERVER_NAME, async(request) => {
// Log the server's request and the schema it expects
console.log('Server request:', request);
await new Promise(resolve => setTimeout(resolve, 10_000));
// Respond to the elicitation request
// NOTE: The action must be one of 'accept', 'decline', or 'cancel' as per the MCP protocol
// Here, we always decline (for tutorial purposes)
return {
action: 'decline',
content: {
confirm: false, // Indicate that the dangerous command is NOT confirmed
},
};
});
// Run the agent and stream its response to the client
const agentStream = await agent...
Here we subscribe to elicitation events from our MCP server and wait 10 seconds before automatically rejecting the request.
We can now test that the agent runs a safe command (e.g. creating a file), and pauses for 10 seconds when trying to run a dangerous one (like deleting it). After the timeout, it reports that the command wasn’t allowed:
That works — but now we need to actually show the confirmation UI to the user.
Unfortunately, it’s not that simple. HTTP streaming only works one way — from the server to the client.
While the agent is responding, executing tools, and waiting for a reply from the MCP server, the frontend can’t send anything back on the same connection.
Because of this limitation, most frameworks require the current thread to be terminated when a tool asks the user for confirmation. Once the user responds, a new request is sent to the agent, where the user input is included as the tool’s result.
But this doesn’t work for us.
With MCP’s elicitation mechanism, the server is still waiting for a response from the backend via a Promise. The user’s confirmation isn’t the tool result — the real result will come later from MCP.
If we close the thread, we lose the connection to MCP, and its response will be dropped.
So we need a custom workaround — something outside the usual assistant-ui or AI SDK flow.
Here’s how I’m solving it:
When an elicitation request comes in, the backend stores the request data inside the current thread.
On the frontend, I’ll define a generative UI tool that shows the tool’s status and, if needed, renders confirmation buttons.
Clicking one of those buttons will trigger a separate request to a dedicated backend endpoint.
The backend will maintain a shared storage of elicitation Promises — so when a response comes in, it resolves the right one and continues the original MCP execution.
Forward elicitation request to the frontend
Let’s get started:
First, let’s create a Generative UI component for our runCommand tool.
@/components/assistant-ui/run-command-tool.tsx
import { ToolCallContentPartComponent, useMessage } from '@assistant-ui/react';
import { useState } from 'react';
import type { JSONValue } from 'ai';
type ElicitationData = {
type: 'elicitation';
toolCallId: string;
message: string;
requestedSchema: JSONValue;
};
export const RunCommandTool: ToolCallContentPartComponent = ({
toolCallId,
status,
args,
result,
}) => {
const elicitation = useMessage(
m => (m.metadata.unstable_data as ElicitationData[])?.find(
d => d.type === 'elicitation' && d.toolCallId === toolCallId
)
);
const content = result?.structuredContent;
const [isSubmitting, setIsSubmitting] = useState(false);
const [submitResult, setSubmitResult] = useState<string | null>(null);
const isRunning = status.type === 'running';
function handleElicitationResponse(action: 'accept' | 'decline') {
setIsSubmitting(true);
setSubmitResult(null);
fetch('/api/chat/elicitation', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
toolCallId,
action,
content: {
confirm: action === 'accept',
},
}),
}).then(async(res) => {
const data = await res.json();
setSubmitResult(data.success ? 'Submitted!' : 'Failed to submit');
}).catch((err) => {
setSubmitResult(`Error: ${ err.message }`);
}).finally(() => {
setIsSubmitting(false);
});
}
return (
<div className="flex flex-col gap-3 py-3 mb-4 w-full rounded-lg border">
<div className="flex gap-2 items-center px-4">
<code className="font-mono text-sm font-semibold">
{ `${ args.command }${ (Array.isArray(args.args) && args.args.length > 0) ? ` ${ args.args.join(' ') }` : '' }` }
</code>
</div>
{/* Elicitation while running, show button to accept or decline */}
{(isRunning && elicitation != null) && (
<div className="flex flex-col gap-2 px-4 pt-2 text-xs border-t border-dashed">
<pre className="whitespace-pre-wrap">
{elicitation.message}
</pre>
<div className="flex gap-2 mt-2">
<button
className="px-3 py-1 text-gray-800 bg-gray-200 rounded border border-gray-300 hover:bg-gray-300"
onClick={ () => handleElicitationResponse('accept') }
disabled={ isSubmitting }
>
Accept
</button>
<button
className="px-3 py-1 text-gray-800 bg-gray-200 rounded border border-gray-300 hover:bg-gray-300"
onClick={ () => handleElicitationResponse('decline') }
disabled={ isSubmitting }
>
Decline
</button>
{isSubmitting && <span className="ml-2">Submitting...</span>}
{submitResult && <span className="ml-2">{submitResult}</span>}
</div>
</div>
)}
{/* Show the result of the tool call */}
{content != null && (
<div className="flex flex-col gap-2 px-4 pt-2 text-xs border-t border-dashed">
<pre className="whitespace-pre-wrap">
{typeof content === 'string'
? content
: JSON.stringify(content, null, 2)}
</pre>
</div>
)}
</div>
);
};
In the component, we just display the command and its result.
If the current thread includes an active elicitation request, we also show Accept / Decline buttons.
Clicking a button sends a separate request to our backend at /api/chat/elicitation.
Finally, register the thread so assistant-ui can use it:
@/components/assistant-ui/thread.tsx
<MessagePrimitive.Content
components={ {
...
tools: {
by_name: {
runCommand: RunCommandTool,
},
Fallback: ToolFallback,
},
} }
/>
Now let’s update our elicitation handler on the backend to pass the data through to the frontend:
@/app/api/chat/route.ts
import { addElicitation, rejectElicitation } from './elicitation/elicitationStore';
...
async execute(dataStream) {
let currentToolCallId: string | null = null;
// Set up elicitation handler to respond to tool confirmation requests from the server
await mcp.elicitation.onRequest(MCP_SERVER_NAME, async(request) => {
const toolCallId = currentToolCallId;
if (!toolCallId) {
throw new Error('No tool call ID found');
}
const { promise, resolve, reject } = Promise.withResolvers<{
action: 'accept' | 'decline' | 'cancel';
content: {
confirm: boolean;
};
}>();
// Store the resolver globally for cross-request resolution
addElicitation(toolCallId, resolve, reject);
// Cancel the elicitation request after 60 seconds
const timeout = setTimeout(() => {
rejectElicitation(toolCallId, 'Timeout: elicitation cancelled');
}, 60_000);
dataStream.writeData({
type: 'elicitation',
toolCallId: currentToolCallId,
message: request.message,
requestedSchema: typeof request.requestedSchema?.toJSON === 'function'
? request.requestedSchema.toJSON()
: request.requestedSchema,
});
return promise.catch((err) => {
return {
action: 'cancel',
content: {
error: err.message,
},
};
}).finally(() => {
clearTimeout(timeout);
});
});
// Run the agent and stream its response to the client
const agentStream = await agent.stream(messages, {
system,
toolsets: {
...(await mcp.getToolsets()),
},
// Add any client-side tools
clientTools: frontendTools(tools),
onChunk: ({ chunk }) => {
// Track the current tool call ID
if ('toolCallId' in chunk) {
currentToolCallId = chunk.toolCallId;
}
},
});
// Merge the agent's output into the HTTP data stream
agentStream.mergeIntoDataStream(dataStream);
}
I use the writeData method to send extra info that the RunCommandTool component uses to show the Accept / Reject buttons.
Now all that’s left is to add the confirmation handler on the backend:
@/app/api/chat/elicitation/route.ts
import { resolveElicitation } from './elicitationStore';
export async function POST(req: Request) {
const { toolCallId, action, content } = await req.json();
const success = resolveElicitation(toolCallId, { action, content });
return new Response(JSON.stringify({ success }), { status: 200 });
}
As well as a store for storing active elicitation requests:
@/app/api/chat/elicitation/elicitationStore.ts
type ElicitationResolver = (value: any) => void;
type ElicitationRejecter = (reason?: any) => void;
interface ElicitationEntry {
resolve: ElicitationResolver;
reject: ElicitationRejecter;
}
// Persist the map on globalThis so it survives Next.js hot reloads.
// The cast is needed because globalThis isn't typed with this property.
const elicitationMap: Map<string, ElicitationEntry> = (globalThis as any).elicitationMap || new Map();
if (!(globalThis as any).elicitationMap) {
(globalThis as any).elicitationMap = elicitationMap;
}
}
export function addElicitation(toolCallId: string, resolve: ElicitationResolver, reject: ElicitationRejecter) {
elicitationMap.set(toolCallId, { resolve, reject });
}
export function resolveElicitation(toolCallId: string, value: any) {
const entry = elicitationMap.get(toolCallId);
if (entry) {
entry.resolve(value);
elicitationMap.delete(toolCallId);
return true;
}
return false;
}
export function rejectElicitation(toolCallId: string, reason?: any) {
const entry = elicitationMap.get(toolCallId);
if (entry) {
entry.reject(reason);
elicitationMap.delete(toolCallId);
return true;
}
return false;
}
Keep in mind that a real application may run on an edge runtime where this approach isn't possible, or your backend may be spread across multiple processes or even servers. In those cases you can't just keep a Map of promises in memory; you need an external solution such as Redis with pub/sub or a separate WebSocket server. For the purposes of this article, though, the example above is enough to show the concept.
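One way to prepare for such a multi-process setup is to hide the in-memory Map behind a small broker interface, so that a Redis pub/sub implementation can be swapped in later without touching the route handlers. The sketch below shows only the in-memory variant; the interface and all names here are hypothetical, not part of any SDK.

```typescript
// Hypothetical broker interface that decouples elicitation resolution
// from a single process. A Redis-backed implementation would publish
// and subscribe on a channel keyed by toolCallId; this in-memory
// version just illustrates the contract.
type ElicitResponse = { action: 'accept' | 'decline' | 'cancel'; content?: Record<string, unknown> };

interface ElicitationBroker {
  // Wait for a user response, falling back to 'cancel' after timeoutMs.
  wait(toolCallId: string, timeoutMs: number): Promise<ElicitResponse>;
  // Deliver a user response; returns false if nothing was waiting.
  settle(toolCallId: string, response: ElicitResponse): boolean;
}

class InMemoryBroker implements ElicitationBroker {
  private pending = new Map<string, { resolve: (v: ElicitResponse) => void; timer: ReturnType<typeof setTimeout> }>();

  wait(toolCallId: string, timeoutMs: number): Promise<ElicitResponse> {
    return new Promise((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(toolCallId);
        resolve({ action: 'cancel' }); // timed out: treat as dismissed
      }, timeoutMs);
      this.pending.set(toolCallId, { resolve, timer });
    });
  }

  settle(toolCallId: string, response: ElicitResponse): boolean {
    const entry = this.pending.get(toolCallId);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(toolCallId);
    entry.resolve(response);
    return true;
  }
}
```

The chat route would call `wait()` inside the elicitation handler, and the /api/chat/elicitation endpoint would call `settle()`; only the broker implementation changes when you move to Redis.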
Done, check it out
Generative UI powered by MCP elicitation
On the first request, I’m asked to create and then immediately delete newfile.md. I click Decline, so the deletion is blocked.
On the second attempt, I click Accept, and the file is successfully deleted.
Conclusion
Elicitation fills a critical gap in the LLM agent architecture — it brings real user interaction into the loop without breaking flow or context.
With it, MCP servers can do more than just respond — they can ask. That’s a big shift.
Support across frameworks is still limited, but it’s only a matter of time. Frontend tools will catch up too — and when they do, building rich, dynamic agent flows will get way simpler.
Now you’ve got the core and you can go build something with it!