
Agent Engineering series

MCP: The USB Port for AI Agents

Noddle Deck team · 15 min read
mcp, integrations, tool use

Before USB, connecting a peripheral to a computer meant checking whether it had the right port — serial, parallel, PS/2, a dozen proprietary connectors that all did roughly the same job in mutually incompatible ways. Every device maker had to decide which ports to support, and every computer maker had to decide which ports to include, and the two lists never quite matched. USB didn't make any individual peripheral smarter. It just gave device makers and computer makers one connector to agree on, so a mouse built in one factory would work on a laptop built in another without either one knowing the other existed.

MCP (the Model Context Protocol) is that same idea applied to AI agents and the tools and data they need to reach. Before MCP, if you wanted an agent to query your database, you wrote an integration for that specific model, that specific agent framework, and that specific database. Want a second agent framework to have the same capability? Write it again. MCP standardizes the connector instead: write a server for your database once, and every MCP-compatible host (Claude Code, Claude Desktop, anything else that speaks the protocol) can use it without a line of custom integration code.

Key takeaways

  • MCP turns an N×M problem into an N+M problem: write one server per tool and one client implementation per host, instead of one integration per (host, tool) combination.
  • Three roles: host (the application, e.g. Claude Code), client (the connector inside the host, one per server), server (the program exposing tools, resources, and prompts for a specific system).
  • Two transports cover almost everything: stdio for a server running as a local subprocess, HTTP with SSE for a server running remotely as its own service.
  • An MCP server is code that runs with your permissions. Installing one is closer to installing a dependency than clicking a web link — vet the source the same way.
  • Under the hood it's JSON-RPC 2.0: initialize negotiates the protocol version once, tools/list is how the client discovers what's callable, tools/call is the only message that actually does something.
  • Pick a transport by lifecycle, not preference — stdio when the server's lifetime should be tied to the client, HTTP when multiple clients need to share one running instance. Pick a tool vs. a resource by who decides to invoke it — the model chooses a tool's arguments; the host chooses whether to hand over a resource at all.

The problem: N × M integrations

Picture three different agent hosts — Claude Code, a custom in-house agent, a third-party IDE plugin — each needing access to three different systems: your internal database, your issue tracker, and a search API. Without a shared protocol, that's nine separate integrations, each written against whatever ad-hoc interface the host happens to expose for "custom tools," each maintained separately, each breaking separately when either side changes its API.

The moment you standardize the connector, that grid collapses. Each system gets exactly one server, written once by whoever knows that system best, often the vendor. Each host gets exactly one client implementation of the protocol, written once by whoever builds the host. Three hosts and three tools stop being nine integrations and become six pieces of code. The part that compounds: adding a fourth tool doesn't touch the hosts at all, and adding a fourth host doesn't touch the tools at all.
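The arithmetic behind that collapse is small enough to check directly. A minimal sketch; the function name is made up for illustration:

```python
def integration_counts(hosts: int, tools: int) -> tuple[int, int]:
    """Return (point-to-point integrations, pieces of code with a shared protocol)."""
    point_to_point = hosts * tools   # one integration per (host, tool) pair
    shared_protocol = hosts + tools  # one client per host, one server per tool
    return point_to_point, shared_protocol


print(integration_counts(3, 3))  # (9, 6)
print(integration_counts(3, 4))  # (12, 7)
```

Going from three tools to four adds three integrations in the point-to-point world and exactly one server in the shared-protocol world.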

Figure 1

N×M integrations collapsing to N+M

[Diagram: Agent A, Agent B, and Agent C on one side; Files, Database, and Search on the other; MCP in between. 3 agents × 3 tools = 9 brittle integrations → 3 + 3 = 6 through one protocol.]
Nine point-to-point connections between three agents and three tools reduce to six once both sides agree on one protocol instead of writing to each other directly.

Three roles: host, client, server

MCP's architecture has exactly three participants, and keeping them distinct is the key to understanding everything else about the protocol.

  • Host — the application the human is actually using. Claude Code is a host. Claude Desktop is a host. A host can talk to many servers at once, each through its own client.
  • Client — the connector living inside the host, one instance per server it talks to. The client owns the protocol handshake, the message framing, and translating between the host's internal tool-calling format and MCP's wire format. Most of the time you never interact with the client directly — it's implementation detail the host manages for you.
  • Server — a program that exposes a specific system's capabilities over MCP: tools it can call, resources it can read, prompt templates it can hand back. A server knows nothing about which host is talking to it, and doesn't need to — that's the whole point of standardizing the connector.

Figure 2

MCP architecture: host, clients, and two transports

[Diagram: Host (Claude Code) containing two MCP clients. One client talks to the filesystem server, a local MCP server (transport: stdio). The other talks to the issue tracker, a remote MCP server (transport: HTTP + SSE).]
A host can hold multiple clients at once. One client talks to a server running as a local subprocess over stdio; another talks to a server running as its own service over HTTP and SSE.

Two transports cover almost everything

A server needs some way to actually exchange messages with a client, and MCP defines this loosely enough to cover both the local and remote case without forcing either into an awkward shape.

  • stdio: the client launches the server as a local subprocess and talks to it over its standard input and output streams. This is the right choice for a server that wraps something already local to your machine: your filesystem, a local database, a CLI tool you already have installed. No network is involved and there is no separate process to keep alive, because the client owns the server's lifecycle.
  • HTTP + SSE: the server runs as its own long-lived service, reachable over HTTP, with server-sent events used for the server to push messages back to the client asynchronously. This is the right choice for a server that wraps something remote: a SaaS API, a shared team resource, a service that needs to run independently of any one client's lifetime. (Spec revisions after 2024-11-05 fold this into a single-endpoint "Streamable HTTP" transport that can still stream over SSE; the lifecycle trade-off is the same.)
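On the stdio transport, each JSON-RPC message travels as one line of JSON terminated by a newline, as far as the transport section of the spec describes it. A minimal sketch of that framing, using an in-memory buffer in place of a real subprocess pipe; `write_message` and `read_messages` are illustrative names, not part of any SDK:

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps without indent never emits a raw newline; newlines inside
    # strings are escaped, so one message always stays on one line.
    line = json.dumps(message, separators=(",", ":"))
    stream.write(line.encode("utf-8") + b"\n")


def read_messages(stream):
    """Yield one parsed message per non-empty line."""
    for raw in stream:
        raw = raw.strip()
        if raw:
            yield json.loads(raw)


pipe = io.BytesIO()  # stand-in for the server's stdin
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "ping"})
write_message(pipe, {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}})
pipe.seek(0)
for message in read_messages(pipe):
    print(message["id"], message["method"])
```

A real client would write to the subprocess's stdin and read from its stdout; anything the server prints to stdout that is not a protocol message would corrupt this stream, which is why logging belongs on stderr.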

What a server actually exposes: tools, resources, prompts

Once connected, a client asks the server what it offers, and the server answers with three kinds of primitives:

  • Tools — actions the model can invoke, each with a name, a description, and a JSON schema for its arguments. This is the primitive that maps most directly onto "function calling" — a search_issues tool, a run_query tool, each one the model can decide to call mid-conversation.
  • Resources — addressable pieces of content the host can read, like files exposed by URI. Unlike a tool call, reading a resource is typically something the host application does on the user's behalf, not something the model autonomously decides to trigger.
  • Prompts — reusable prompt templates the server hands to the client, parallel in spirit to the slash commands covered earlier in this series but sourced from the server rather than a local file.

Discovery is how the client learns about all three without any hardcoded knowledge of the server ahead of time — on connecting, it asks the server to list its tools, resources, and prompts, gets back their names, descriptions, and schemas, and that's the entire contract the model needs to start calling them correctly.

A single tool call, end to end

It's worth tracing one concrete round trip, because the number of hops involved is exactly why standardizing the protocol between each of them matters — a bug or a mismatch at any hop breaks the whole call, and there are more hops here than "the model calls a function" makes it sound like.

Figure 3

A tool call over MCP

[Sequence diagram: Model (hop 1) → MCP Client (hop 2) → MCP Server (hop 3) → External API (hop 4). The request flows forward (→) and the response flows back (←).]
The model's decision to call a tool travels through the client and the server before it ever reaches the external system doing the real work, and the result travels the same path back.
  • Model → Client. The model decides to call a tool, say search_issues(query="lock timeout"). The host's client is the one that actually knows how to reach the server offering that tool.
  • Client → Server. The client sends the call over the transport — stdio or HTTP — as a standard MCP message. The server has no idea, and doesn't need to know, which host or which model originated the request.
  • Server → External API. The server does the actual work: queries the issue tracker's real API, using whatever credentials it was configured with. This hop is entirely outside MCP's concern — the protocol standardizes the connector, not what's on the other end of it.
  • Back along the same path. The result flows external API → server → client → model, arriving as an MCP tool result, and the model continues reasoning with that result now in context.
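The round trip above can be sketched with plain functions standing in for each participant. Everything here is a toy: the issue data is invented, the model's decision is hard-coded, and a real client would send the request over a transport rather than call the server directly.

```python
def external_api(query: str) -> list[str]:
    """Stand-in for the issue tracker's real API, with hard-coded data."""
    issues = ["Lock timeout on checkout", "Typo in footer", "Lock timeout in worker"]
    return [title for title in issues if query.lower() in title.lower()]


def server(request: dict) -> dict:
    """Server: turn a protocol-level tool call into a call on the real system."""
    arguments = request["params"]["arguments"]
    titles = external_api(arguments["query"])
    return {"content": [{"type": "text", "text": "\n".join(titles)}], "isError": False}


def client(tool_name: str, arguments: dict) -> dict:
    """Client: wrap the model's decision in a protocol message and deliver it."""
    request = {"method": "tools/call", "params": {"name": tool_name, "arguments": arguments}}
    return server(request)  # a real client would write this to stdio or HTTP


# Model: decides to call search_issues(query="lock timeout").
result = client("search_issues", {"query": "lock timeout"})
print(result["content"][0]["text"])
```

Note that `server` never learns which host or model made the call, and `client` never learns how the issue tracker is queried; each function only knows its immediate neighbor.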

Configuring a server in practice

Adding an MCP server to a host is usually a small config entry, not a code change — the whole benefit is that you're consuming a server someone else already built, not writing an integration yourself.

.claude/mcp.json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "$DATABASE_URL"]
    },
    "issue-tracker": {
      "url": "https://mcp.example-tracker.com/sse",
      "headers": {
        "Authorization": "Bearer $ISSUE_TRACKER_TOKEN"
      }
    }
  }
}

The postgres entry is a stdio server: the host launches it as a subprocess and talks to it over stdin/stdout, no network port involved. The issue-tracker entry is an HTTP/SSE server: the host connects to an already-running remote service. Both show up identically to the model once connected — as a set of tools it can call — which is exactly the abstraction MCP is supposed to provide.
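To make the distinction concrete, here is a small sketch that reads a config shaped like the one above and reports which transport each entry implies. The rule it applies (a "command" key means stdio, a "url" key means HTTP) is an assumption drawn from this example; real hosts may require extra fields or use different keys.

```python
import json

CONFIG_TEXT = """
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "$DATABASE_URL"]
    },
    "issue-tracker": {
      "url": "https://mcp.example-tracker.com/sse"
    }
  }
}
"""


def transport_of(entry: dict) -> str:
    """Guess the transport from the entry's shape (assumed rule, see lead-in)."""
    if "command" in entry:
        return "stdio"  # the host launches a subprocess
    if "url" in entry:
        return "http"   # the host connects to a running service
    raise ValueError("entry has neither 'command' nor 'url'")


servers = json.loads(CONFIG_TEXT)["mcpServers"]
for name, entry in servers.items():
    print(f"{name}: {transport_of(entry)}")
```

Running a check like this over a config before committing it is a cheap way to catch an entry that has neither key, which would otherwise surface only as a server that never connects.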

An MCP server is code running with your permissions

A stdio server runs as a subprocess on your machine with whatever access your user account has. A remote server, once you've handed it an API token, can do anything that token is scoped to do. Adding an entry to your MCP config is closer to installing an npm package than bookmarking a URL — only add servers from sources you'd trust with a dependency, and scope credentials to the minimum the server actually needs.

When MCP is the right tool, and when it isn't

MCP earns its complexity when a capability needs to be reused across multiple hosts, maintained independently of any one of them, or when you're consuming something someone else already built rather than integrating your own system for the first time. It's the right layer for "connect Claude Code to our internal deploy service" precisely because that connection is useful outside Claude Code too, and a team member's IDE plugin or a second agent host can reuse the exact same server.

It's the wrong layer for a capability a single agent needs once, with no expectation anyone else will reuse it. A one-off script called from a hook, or a built-in tool the host already ships (file read/write, a shell), does the same job with less infrastructure — you don't need a protocol handshake and a long-lived server process to run ls. Similarly, if what you actually need is "call this one API from a script," writing a direct HTTP call inside a hook or a command's bash embed is simpler than standing up an MCP server that only one command will ever talk to. Reach for MCP when the connector itself is the reusable asset — not every integration needs to become one.

Anti-patterns to avoid

  • Standing up a server nobody else will use. If a capability is genuinely single-host, single-use, the protocol overhead buys you nothing — a direct tool integration or a script is less code to maintain.
  • Trusting a server's tool descriptions blindly. A malicious or careless server can describe its tools misleadingly to influence what the model calls and how. Treat tool descriptions from a third-party server the same way you'd treat any untrusted input to a system prompt.
  • Over-scoping credentials. Handing a server a token with broad access "because it's easier" turns a compromised or buggy server into a much bigger blast radius than the integration actually needed.
  • Ignoring transport fit. Wrapping a purely local tool in an HTTP server adds a network hop and a process to keep alive for no benefit — stdio exists precisely so local integrations don't pay that cost.

Worked example: the JSON-RPC trace behind one tool call

"The model calls a tool" makes the mechanics sound simpler than they are. Underneath, every MCP exchange is a JSON-RPC 2.0 message over whichever transport the server uses, and a client doesn't skip straight to calling a tool — it negotiates a protocol version, asks what's available, and only then makes the call. Here's the actual message trace for the postgres server from the config above, answering one real question about pending orders.

initialize: agreeing on a protocol version once

The client sends this the moment it launches the server subprocess, before either side knows anything about the other:

→ client to server
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "claude-code", "version": "1.4.0" }
  }
}
← server to client
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": { "tools": {} },
    "serverInfo": { "name": "postgres-mcp", "version": "0.3.1" }
  }
}

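Both sides now agree on a protocol version and know what capabilities the other side supports. Here that is just tools, so the client knows not to bother asking this server for resources or prompts. The client then confirms with a notifications/initialized notification (no id, no response expected), and normal requests can begin. The initialize exchange happens exactly once per connection, not once per tool call.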
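The version negotiation in that exchange follows a simple rule, as the spec's lifecycle section describes it: the server echoes the requested version if it supports it, otherwise it offers another version it supports (ideally its latest), and the client walks away if it cannot speak what was offered. A sketch of that rule, with made-up function names:

```python
def server_choose_version(requested: str, server_supported: list[str]) -> str:
    """Echo the requested version if supported, else offer the server's latest."""
    if requested in server_supported:
        return requested
    return max(server_supported)  # date-formatted versions sort correctly as strings


def client_accepts(offered: str, client_supported: list[str]) -> bool:
    """The client proceeds only if it also speaks the version the server offered."""
    return offered in client_supported


server_versions = ["2024-11-05", "2025-03-26"]
offered = server_choose_version("2024-11-05", server_versions)
print(offered, client_accepts(offered, ["2024-11-05"]))
```

In the trace above both sides named 2024-11-05, so the server simply echoed it back.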

tools/list: discovering what's callable

With the handshake done, the client asks what the server actually offers. This is also typically cached for the life of the connection rather than re-sent before every call:

→ client to server
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {} }
← server to client
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "query",
        "description": "Run a read-only SQL query against the connected database",
        "inputSchema": {
          "type": "object",
          "properties": {
            "sql": { "type": "string" }
          },
          "required": ["sql"]
        }
      }
    ]
  }
}

That inputSchema is the entire contract the model gets for calling query correctly — one required string argument, no other fields. The host merges this into whatever format its own tool-calling layer uses internally, and from here on the model reasons about query the same way it would reason about any other tool it has access to.

tools/call: the only message that does something

The model decides it needs the count of pending orders, and the client sends the call:

→ client to server
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "sql": "select count(*) from orders where status = 'pending'"
    }
  }
}
← server to client
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      { "type": "text", "text": "count\n-----\n42" }
    ],
    "isError": false
  }
}

Everything before this point — the handshake, the discovery call — happened so that this one message could be formed correctly and interpreted correctly on both ends. The server never had to know it was Claude Code asking rather than some other host; the client never had to know anything about Postgres beyond what the schema told it. The content array lands back in the model's context as a normal tool result, and isError: false is what tells the model the query actually succeeded rather than returning a formatted error it should stop and reconsider.
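To see how little a server has to do to answer those three messages, here is a toy dispatcher. It is a sketch under loud assumptions: an in-memory SQLite database stands in for Postgres, the server name is invented, there is no transport, and nothing enforces the "read-only" promise in the tool description.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the real Postgres database
db.execute("create table orders (id integer primary key, status text)")
db.executemany("insert into orders (status) values (?)",
               [("pending",), ("pending",), ("shipped",)])

QUERY_TOOL = {
    "name": "query",
    "description": "Run a read-only SQL query against the connected database",
    "inputSchema": {"type": "object",
                    "properties": {"sql": {"type": "string"}},
                    "required": ["sql"]},
}


def error(message: dict, code: int, text: str) -> dict:
    """Build a JSON-RPC error response for the given request."""
    return {"jsonrpc": "2.0", "id": message["id"], "error": {"code": code, "message": text}}


def handle(message: dict) -> dict:
    """Dispatch one JSON-RPC request and return the matching response."""
    method, params = message["method"], message.get("params", {})
    if method == "initialize":
        result = {"protocolVersion": params["protocolVersion"],
                  "capabilities": {"tools": {}},
                  "serverInfo": {"name": "toy-sql", "version": "0.0.1"}}
    elif method == "tools/list":
        result = {"tools": [QUERY_TOOL]}
    elif method == "tools/call":
        if params.get("name") != "query":
            return error(message, -32602, "Unknown tool")
        try:
            rows = db.execute(params["arguments"]["sql"]).fetchall()
            text = "\n".join(str(row) for row in rows)
            result = {"content": [{"type": "text", "text": text}], "isError": False}
        except sqlite3.Error as exc:
            # A failed query is still a successful protocol exchange: the
            # failure is reported inside the result, flagged by isError.
            result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    else:
        return error(message, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": message["id"], "result": result}


call = {"jsonrpc": "2.0", "id": 3, "method": "tools/call",
        "params": {"name": "query",
                   "arguments": {"sql": "select count(*) from orders where status = 'pending'"}}}
print(handle(call)["result"]["content"][0]["text"])  # (2,)
```

The split between the two failure paths is the design point worth noticing: a malformed request gets a JSON-RPC error, while a tool that ran and failed gets a normal result with isError set, so the model can read the message and try something else.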

How this fails in practice

Most MCP problems aren't protocol bugs — the spec is small and well-specified. They're configuration and trust problems that show up once a server is actually wired into a real host.

The server never finishes initializing

A host shows a permanent "connecting" state for one server entry, and every tool it should offer is simply missing. Usually the stdio subprocess crashed immediately after launch — a missing binary, an npx package that couldn't be fetched because the machine has no network access at that moment, or an environment variable like $DATABASE_URL that wasn't actually expanded and got passed to the server literally as the string $DATABASE_URL. The client is waiting on an initialize response that will never arrive. Before wiring a server into a host's config, run the exact command from that config entry directly in a terminal — if it doesn't start cleanly by hand, it won't start cleanly launched by the client either.
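The unexpanded-variable case is easy to catch before launch. A minimal preflight sketch, assuming arguments reference variables as $NAME or ${NAME}; it only reports names that are not set in the environment, and says nothing about whether a particular host performs expansion at all, which is worth checking in that host's documentation:

```python
import re

VAR_PATTERN = re.compile(r"\$(\w+)|\$\{(\w+)\}")


def unresolved_vars(args: list[str], env: dict[str, str]) -> list[str]:
    """Return names referenced as $NAME or ${NAME} in args that are missing from env."""
    missing = []
    for arg in args:
        for bare, braced in VAR_PATTERN.findall(arg):
            name = bare or braced
            if name not in env and name not in missing:
                missing.append(name)
    return missing


args = ["-y", "@modelcontextprotocol/server-postgres", "$DATABASE_URL"]
print(unresolved_vars(args, env={}))                                   # ['DATABASE_URL']
print(unresolved_vars(args, env={"DATABASE_URL": "postgres://..."}))   # []
```

In a real check you would pass `os.environ` as `env`, from the same shell or launcher the host itself starts in, since that is the environment the subprocess inherits.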

Tool names collide across servers

Two connected servers each expose a tool named search, and the model calls one expecting the other — or the host silently only shows one of them, and nobody notices which. A host has to flatten every connected server's tools into one namespace the model sees, and the protocol itself doesn't reserve names or require prefixing, so two unrelated servers can pick the exact same short, obvious name with zero coordination between their authors. Check what your specific host does on a name collision before assuming a call reaches the server you expect — some prefix tool names by server automatically, some don't, and that behavior is worth confirming once rather than discovering during an incident.
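If your host does not resolve collisions for you, detecting them takes a few lines. A sketch with invented server and tool names; the double-underscore prefix is one hypothetical naming scheme, not something the protocol prescribes:

```python
from collections import defaultdict


def find_collisions(tools_by_server: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each tool name offered by more than one server to those servers."""
    owners = defaultdict(list)
    for server, tools in tools_by_server.items():
        for tool in tools:
            owners[tool].append(server)
    return {tool: servers for tool, servers in owners.items() if len(servers) > 1}


def prefixed_names(tools_by_server: dict[str, list[str]]) -> list[str]:
    """One possible fix: qualify every tool with its server name."""
    return [f"{server}__{tool}"
            for server, tools in tools_by_server.items()
            for tool in tools]


connected = {"docs": ["search", "fetch"], "issues": ["search"]}
print(find_collisions(connected))  # {'search': ['docs', 'issues']}
print(prefixed_names(connected))   # ['docs__search', 'docs__fetch', 'issues__search']
```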

The description and the schema disagree

The model keeps calling a tool with arguments that fail validation, retries with slightly different arguments, and fails again. The description field says the tool takes "a search query," but the actual inputSchema requires query and a numeric limit the prose never mentioned. The schema is what actually gets enforced; the description is only ever a hint the model uses to decide whether and how to call the tool in the first place. Treat a server's schema the way you'd treat an OpenAPI spec — the source of truth to validate example calls against before shipping — and keep the description in sync with it rather than writing the two independently.
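Validating example calls against the schema can be automated. The sketch below hand-rolls a check for required keys and primitive types only, which is a small subset of JSON Schema; a real project would more likely use a full validator library. The schema mirrors the scenario above, with the undocumented numeric limit.

```python
JSON_TYPES = {"string": str, "number": (int, float), "integer": int}


def matches_type(value, type_name: str) -> bool:
    """Check a value against a primitive JSON Schema type name."""
    if type_name == "boolean":
        return isinstance(value, bool)
    if isinstance(value, bool):
        return False  # bool is an int subclass in Python, but not a JSON number
    expected = JSON_TYPES.get(type_name)
    return expected is None or isinstance(value, expected)  # unknown types are not checked


def validate_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means this minimal check passed."""
    problems = [f"missing required argument: {name}"
                for name in schema.get("required", []) if name not in arguments]
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec and "type" in spec and not matches_type(value, spec["type"]):
            problems.append(f"{name} should be {spec['type']}")
    return problems


schema = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "limit": {"type": "number"}},
    "required": ["query", "limit"],
}
print(validate_arguments(schema, {"query": "lock timeout"}))
# ['missing required argument: limit']
print(validate_arguments(schema, {"query": "lock timeout", "limit": 10}))
# []
```

Running the example calls from a tool's documentation through a check like this is a quick way to notice that the prose and the schema have drifted apart.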

A remote server's connection drops mid-session

Tool calls that worked fine for the first twenty minutes of a session start timing out or returning stale-looking results. Long-lived HTTP/SSE connections sitting behind a corporate proxy or load balancer are a common target for idle-connection timeouts that neither the client nor the server notices until the next call fails outright. Verify the client actually reconnects and resubscribes after a drop rather than assuming the transport handles it transparently, and for anything on a flaky or heavily proxied network, consider a local stdio-based proxy in front of the remote service instead of holding the long-lived connection directly.
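What "actually reconnects" means can be sketched as a thin wrapper that opens a fresh connection after a failure and retries the request. Everything here is hypothetical: `FakeConnection` simulates a proxy having closed the first connection, and a real `connect` factory would redo the initialize handshake before returning.

```python
class FakeConnection:
    """Stand-in transport: the first connection is already dead, later ones work."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.alive = FakeConnection.opened > 1

    def send(self, request: dict) -> dict:
        if not self.alive:
            raise ConnectionError("idle connection was closed by a proxy")
        return {"id": request["id"], "result": "ok"}


class ReconnectingClient:
    """Retry a request on a fresh connection when the current one has dropped."""

    def __init__(self, connect, attempts: int = 3):
        self._connect = connect
        self._attempts = attempts
        self._connection = connect()

    def call(self, request: dict) -> dict:
        last_error = None
        for _ in range(self._attempts):
            try:
                return self._connection.send(request)
            except ConnectionError as exc:
                last_error = exc
                self._connection = self._connect()  # new connection, new handshake
        raise ConnectionError("gave up after repeated drops") from last_error


client = ReconnectingClient(FakeConnection)
print(client.call({"id": 7, "method": "tools/call"}))  # {'id': 7, 'result': 'ok'}
print(FakeConnection.opened)                           # 2
```

Blind retries are only safe for calls that are idempotent; a tool that writes data may have run before the connection dropped, so a real client should retry those with more care.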

Design decisions: transport, auth, and tool vs. resource

stdio vs. HTTP: choose by lifecycle, not preference

Pick stdio when the server's lifetime should be tied 1:1 to the client's — it starts when the client starts, dies when the client dies, and there's no orphaned process to clean up or reason about after a crash. That's the right shape for anything wrapping something already local: a filesystem, a database reachable only from your machine, a CLI tool you've already got installed. Pick HTTP when the underlying capability needs to outlive any one client — multiple hosts or teammates sharing one running instance, a service that needs to keep running whether or not anyone's editor happens to be open, or a capability that's already a network service on the other end anyway, where wrapping it in a local subprocess would just add a redundant hop. The question that decides it isn't "which is easier to set up" — it's "does this server's lifetime belong to one client, or to the system it's wrapping."

Auth and secrets: where they actually live

A stdio server typically gets its credentials the same way any local CLI tool does: an environment variable passed at launch, exactly like $DATABASE_URL in the config above. That's no more or less safe than any other environment variable on that machine, and it's scoped to whoever can read that machine's env, which is usually just you. An HTTP server needs a real auth story, because it's reachable independent of any specific caller's machine: a bearer token in a header, as in the issue-tracker entry above, or a full OAuth flow for anything more sensitive. The operational upside of centralizing auth at the server is real: rotating a compromised upstream credential (the one the server itself uses to reach the system it wraps) means updating it once, on the server, rather than pushing a new config to every client that connects to it. Scope whatever credential you hand to a server to the minimum it actually needs: a read-only database role for a query tool, not the same superuser credential your migrations run with.

Tool vs. resource: who decides to invoke it

A tool is for when the model needs to decide, mid-conversation, what to ask for and how — search_issues(query=...) only makes sense once the model has composed a specific query based on what it's currently reasoning about, and it might call it zero times, once, or five times with different arguments in the same session. A resource is for when the host, not the model, is the one deciding whether to hand something over — the same way an application decides to attach a file to a request without asking the model whether it wants that file first. Exposing a codebase's README as a resource makes sense because the host can just include it; exposing "search this codebase" as a tool makes sense because only the model, mid-task, knows what query is actually worth running. Reach for a resource when the content is fixed and the only decision is whether to include it at all; reach for a tool when the arguments themselves need to be composed by something that understands the current task.

MCP and Noddle Deck

Skills and commands in Noddle Deck's marketplace assume the tools an agent already has — file access, shell, whatever the host exposes natively. MCP is what extends that surface to systems outside the host itself: your database, your issue tracker, an internal API. A persona pack's commands and skills are a great place to point at the tools an MCP server exposes once you've connected one, the same way this series' other posts show pointing at built-in tools.

bash
noddle-deck pack install developer

Pair a pack's commands with an MCP server for your own infrastructure and you get the full picture this series has been building toward: a command you invoke on purpose, a skill the agent triggers on its own, a hook that guarantees a rule holds regardless of either, and now a standard way to reach whatever system actually does the work outside the agent's own sandbox.
