We built an internal knowledge base server to give our AI agents access to Conselara’s company data — capabilities, past performance, GSA rates, certifications. The idea was straightforward: expose it as an MCP server so any AI client could query it semantically.
It worked in Claude Code. It worked nowhere else.
What MCP promises
The Model Context Protocol is Anthropic’s open standard for connecting AI models to external tools and data sources. The pitch is compelling: define your server once, and any MCP-compatible client can call it. Claude Code has native MCP support. The ecosystem is growing.
We built the server using FastMCP, a Python framework that makes standing up an MCP SSE (Server-Sent Events) endpoint straightforward. The server embedded documents using sentence-transformers/all-MiniLM-L6-v2, stored vectors in Qdrant, and exposed a search tool over SSE.
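The pipeline is easier to see with the heavy pieces stubbed out. Here is a toy sketch of the search flow: a bag-of-words counter stands in for the all-MiniLM-L6-v2 embedder, and a plain in-memory list stands in for Qdrant. Only the shape of the flow matches the real server; the documents and scoring are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for all-MiniLM-L6-v2."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# In the real server these vectors live in Qdrant; here, a plain list.
DOCS = [
    "GSA schedule labor categories and rates",
    "Past performance on federal contracts",
    "ISO 9001 certification details",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def search(query: str, top_k: int = 2) -> list[str]:
    """Rank stored documents by cosine similarity to the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

The real server does the same three steps, just with a transformer model for `embed` and a Qdrant query for the ranking.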
The client support problem
When we tried to connect it to the other tools in our stack, we hit a wall:
| Client | MCP SSE | REST/OpenAPI |
|---|---|---|
| Claude Code | ✅ Native | ✅ via Bash |
| Pi agent | Needs custom SDK extension | ✅ Simple fetch |
| OpenWebUI | ❌ Not supported | ✅ Python tool |
| ChatGPT | ❌ Not supported | ✅ Custom GPT action |
| claude.ai | ✅ Remote MCP | ✅ OpenAPI connector |
MCP SSE works natively in Claude Code. For everything else, you are either writing custom protocol wiring or running mcpo — a proxy container that translates MCP to OpenAPI. We had already built the server. We did not want to add another container on top of it just to make it callable from OpenWebUI.
The fundamental issue: MCP is young and client support is inconsistent. REST has two decades of tooling behind it, and OpenAPI more than one. Every HTTP client in existence knows how to make a POST request.
The rebuild
We rewrote the server in FastAPI in an afternoon. The core logic — embedding, Qdrant queries, chunking — stayed identical. We added:
- `POST /search` — semantic search, returns ranked chunks
- `POST /ingest` — wipes and re-ingests all KB files
- `GET /stats` — point count and file list
- `GET /openapi.json` — auto-generated by FastAPI, works directly as a ChatGPT action or claude.ai connector
The one endpoint worth calling out: GET /owu-tool. OpenWebUI lets you import Python tool definitions from a URL. Rather than asking users to paste code and deal with syntax errors, we serve the tool definition directly from the server:
```python
@app.get("/owu-tool", response_class=PlainTextResponse)
def owu_tool():
    return OWU_TOOL  # Python string served as plain text
```
Point OpenWebUI at `http://<host>/owu-tool` and it imports cleanly. No copy-paste, no syntax errors.
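The contents of `OWU_TOOL` are not shown above. As a rough illustration: OpenWebUI expects an importable Python module containing a `Tools` class, whose typed methods (with docstrings) become callable tools. The endpoint URL and payload in this sketch are placeholders, not the actual tool definition.

```python
# Hypothetical OWU_TOOL string served by the /owu-tool endpoint.
# OpenWebUI imports it as a module and picks up the `Tools` class.
OWU_TOOL = '''
import requests

class Tools:
    def search_kb(self, query: str) -> str:
        """Semantic search over the company knowledge base."""
        resp = requests.post(
            "http://<host>/search",
            json={"query": query},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.text
'''
```

Because the string is generated server-side, the URL baked into it can match the host actually serving it, which is what makes the import-from-URL flow paste-free.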
Connecting each client
Claude Code — no configuration needed, just Bash:
```bash
curl -s -X POST http://<your-server>/search \
  -H "Content-Type: application/json" \
  -d '{"query": "GSA labor categories"}'
```
Pi agent — a TypeScript extension in ~/.pi/agent/extensions/ that calls the REST endpoint via fetch. Pi auto-discovers extensions on startup, no flags needed.
OpenWebUI — Workspace → Tools → Import from URL → http://<host>/owu-tool. Done.
ChatGPT / claude.ai — expose via a public URL, then point each at /openapi.json. FastAPI generates the spec automatically.
What we would do differently
One thing we have not resolved yet: the same KB source files now feed two separate Qdrant collections. Our Hermes agent uses mcp-server-qdrant with a namespaced collection. The FastAPI server manages its own separate collection. Edits to KB files need to be synced and ingested twice or the collections drift.
The fix is to point the FastAPI server at the existing Hermes collection directly, eliminating the duplicate. We documented it as a known issue and have not gotten to it yet.
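One low-tech way to get there is to make the collection a configuration value rather than code, so the FastAPI server can be pointed at whatever collection Hermes already manages. The variable names and the default collection name below are made up for illustration.

```python
import os

# Hypothetical shared settings: point the FastAPI server at the
# collection the mcp-server-qdrant instance already manages, instead
# of letting each server create its own.
QDRANT_URL = os.environ.get("QDRANT_URL", "http://localhost:6333")
QDRANT_COLLECTION = os.environ.get("QDRANT_COLLECTION", "hermes-kb")

def qdrant_settings() -> dict:
    """Single source of truth both servers could read from."""
    return {"url": QDRANT_URL, "collection": QDRANT_COLLECTION}
```

With both servers reading the same setting, one ingest updates one collection and the drift problem disappears.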
The takeaway
MCP is the right long-term bet for AI tool interoperability. The protocol is well-designed and Anthropic is investing in it seriously. But today, client support is fragmented. If you need your tool server to work across Claude Code, OpenWebUI, ChatGPT, and custom agents simultaneously, REST with an OpenAPI spec is the pragmatic choice — one server, no proxy layer, every client covered.
We will likely add MCP back on top of the FastAPI server as an adapter when the client ecosystem matures. For now, HTTP is doing the job.