Skip to main content

Command Palette

Search for a command to run...

The MCP Mistake That Cost Us 3 Weeks of Refactoring

Updated
6 min read
The MCP Mistake That Cost Us 3 Weeks of Refactoring
K
Solution Architect at IBM with 12+ years in enterprise software, cloud, and applied AI. I write about what production agentic AI actually looks like, the parts that don't fit the marketing. Senior Member, IEEE.

Most MCP content right now falls into two camps: either "here's why MCP is revolutionary" or "here's how to build your first MCP server in 10 minutes." Both are fine if you're just starting out.

This post is for the other side of that journey. The part where you've shipped MCP to production, everything works, and then one Tuesday morning everything doesn't.

Quick context for anyone new: MCP (Model Context Protocol) is how AI agents connect to external tools and data. Think of an MCP server as a menu of capabilities - "get customer record," "search tickets," "fetch transactions" - that agents can call. The protocol is new, moving fast, and most teams are still figuring out the patterns.

We were one of those teams. Here's what we got wrong.

What We Built

Three agents. Shared data needs. One MCP server in the middle serving as a single source of truth for customer records, transaction history, and support tickets.

On paper it was textbook. One place to manage data access. No duplicated connection logic. All three agents singing from the same hymn sheet.

Worked great for two months.

What Went Wrong

A field rename. That was it.

Product wanted customerName split into firstName and lastName. Standard request, maybe 20 minutes of work on the MCP server. We shipped the change.

Within an hour, two of our three agents were throwing errors in production. The third one didn't throw anything — it just quietly started returning wrong results because it was reading the renamed field as null and falling through to a default branch.

The silent one took the longest to catch. We only found it when a customer support ticket mentioned an agent cheerfully greeting someone as "Valued Customer" instead of their actual name.

Why This Bites Harder in MCP Than in Regular APIs

If you've worked with REST APIs for a while, you're probably already thinking "yeah, that's just API versioning." And you're right. But here's why it hits different with MCP.

Traditional APIs have decades of tooling around this. You put /v1/ in the URL. You publish an OpenAPI spec. You write contract tests. You add a Deprecation header. This stuff is boring because it's been solved for twenty years.

MCP doesn't have any of that yet. You expose a tool, agents discover it, and the contract is whatever the tool's schema said at discovery time. There's no /v1/. There's no deprecation header. There's no convention for "here's the migration path, you have six months."

The protocol maintainers know this. It's on the 2026 roadmap. But "on the roadmap" doesn't help you on Tuesday at 2am.

The Three Mistakes We'd Undo

First one, and this is the mindset issue underneath everything else — we thought of the MCP server as internal infrastructure. When you own all the consumers of a system, you can change it freely. But MCP servers aren't really internal, even when you built them. Agents discover them dynamically. Once an agent reads your tool's shape, it's coupled to it. You didn't sign up for that contract, but you're in it.

Second, we versioned the server but not the tools. Those aren't the same thing. A server can host ten tools that each evolve at their own pace. If you tag the whole server as v2, you're forcing a migration on consumers of tools that didn't actually change. What we should've done was version each tool independently.

Third, no contract tests anywhere. Each agent had assumptions about how the tools would respond. None of those assumptions lived in code. They lived in whoever-wrote-that-agent's head. When the assumptions broke, we had to go find them.

What Changed After the Refactor

Versioning moved from the server to the tool. Tool names now carry a version suffix. When something changes, the new version ships alongside the old one with a clear sunset date:

{
  "tools": [
    {
      "name": "getCustomerRecord_v1",
      "deprecated": true,
      "deprecation_notice": "Use getCustomerRecord_v2. Sunset date: 2026-06-01"
    },
    {
      "name": "getCustomerRecord_v2",
      "deprecated": false,
      "description": "Returns customer record with normalized name fields"
    }
  ]
}

It looks ugly having two versions of the same tool sitting next to each other. I've made peace with that. Ugly and explicit beats clean and surprising.

Every agent now owns a thin adapter layer between itself and the MCP server. This was the biggest change. The adapter is the only place that knows the shape of the MCP response. When the shape changes, the adapter breaks loudly - you get a clear error in a clear place, not mysterious downstream weirdness in the agent's reasoning:

# agents/order_agent/adapters/customer_mcp_adapter.py

class CustomerMCPAdapter:
    """
    Translates between order_agent and customer-data MCP server.
    All assumptions about tool response shape live here.
    When this breaks, it breaks loudly.
    """

    EXPECTED_TOOL_VERSION = "getCustomerRecord_v2"

    def get_customer(self, customer_id: str) -> Customer:
        response = self.mcp_client.call_tool(
            self.EXPECTED_TOOL_VERSION,
            {"customer_id": customer_id}
        )

        self._validate_response_shape(response)

        return Customer(
            id=response["id"],
            first_name=response["firstName"],
            last_name=response["lastName"]
        )

For anyone new to this pattern — the idea is simple. Your agent doesn't call the MCP server directly. It calls the adapter. The adapter calls the MCP server. If the MCP server's response changes, you only have to fix the adapter. The agent keeps working.

The mental shift that made the biggest difference though — we stopped thinking of our MCP server as internal. We treat it like a public API now, even though the only consumers are our own agents. That one mindset change kills about 80% of the bad decisions before they get made. You don't rename fields in public APIs without a deprecation period. You don't remove tools without notice. You write the changelog before you ship the change.

If You're Earlier in the Journey

If you're still in the prototype phase with MCP, none of this hits yet. One agent, one server, everything in one repo — you can change things freely and nobody gets hurt. Enjoy that phase. It's fun.

The moment you have two agents pointing at the same MCP server, start thinking about contracts. That's the inflection point where the cost of "we'll figure it out later" goes from small to large quickly.

The protocol will get better. The tooling will catch up. But right now you have to bring your own discipline around versioning, adapters, and contract tests. MCP won't enforce any of it for you.

Build like you'll be wrong. Because at this stage of the protocol, you probably will be.


I'm a Solution Architect at IBM working on Agentic AI orchestration and MCP integrations. Writing about what actually works in production - and what doesn't.


15 views

AI Systems in Production

Part 5 of 6

This series covers what actually happens when AI systems move from demo to production - agent workflows, LLM behavior, failure modes, and the architectural decisions that make systems reliable at scale.

Up next

What Actually Breaks When You Put Azure AI Foundry Agents in Production ?

Demos lie. Your agent works perfectly when you’re showing it to stakeholders, then falls apart the first time a tool times out at 2am. I’ve been deploying Foundry agents into enterprise environments f