We built an internal knowledge base server to give our AI agents access to Conselara’s company data — capabilities, past performance, GSA rates, certifications. The idea was straightforward: expose it as an MCP server so any AI client could query it semantically.
It worked in Claude Code. It worked nowhere else.
What MCP promises
The Model Context Protocol is Anthropic’s open standard for connecting AI models to external tools and data sources. The pitch is compelling: define your server once, and any MCP-compatible client can call it. Claude Code has native MCP support. The ecosystem is growing.
We built the server using FastMCP, a Python framework that makes standing up an MCP SSE (Server-Sent Events) endpoint straightforward. The server embedded documents using sentence-transformers/all-MiniLM-L6-v2, stored vectors in Qdrant, and exposed a search tool over SSE.
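The pipeline is easier to see with the heavy pieces stubbed out. Here is a toy sketch of the search flow: a bag-of-words counter stands in for the all-MiniLM-L6-v2 embedder, and a plain in-memory list stands in for Qdrant. Only the shape of the flow matches the real server; the documents and scoring are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for all-MiniLM-L6-v2."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# In the real server these vectors live in Qdrant; here, a plain list.
DOCS = [
    "GSA schedule labor categories and rates",
    "Past performance on federal contracts",
    "ISO 9001 certification details",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def search(query: str, top_k: int = 2) -> list[str]:
    """Rank stored documents by cosine similarity to the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

The real server does the same three steps, just with a transformer model for `embed` and a Qdrant query for the ranking.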
The client support problem
When we tried to connect it to the other tools in our stack, we hit a wall:
| Client | MCP SSE | REST/OpenAPI |
|---|---|---|
| Claude Code | ✅ Native | ✅ via Bash |
| Pi agent | Needs custom SDK extension | ✅ Simple fetch |
| OpenWebUI | ❌ Not supported | ✅ Python tool |
| ChatGPT | ❌ Not supported | ✅ Custom GPT action |
| claude.ai | ✅ Remote MCP | ✅ OpenAPI connector |
MCP SSE works natively in Claude Code. For everything else, you are either writing custom protocol wiring or running mcpo — a proxy container that translates MCP to OpenAPI. We had already built the server. We did not want to add another container on top of it just to make it callable from OpenWebUI.
The fundamental issue: MCP is young and client support is inconsistent. REST has two decades of tooling behind it, and OpenAPI more than one. Every HTTP client in existence knows how to make a POST request.
The rebuild
We rewrote the server in FastAPI in an afternoon. The core logic — embedding, Qdrant queries, chunking — stayed identical. We added:
- `POST /search` — semantic search, returns ranked chunks
- `POST /ingest` — wipes and re-ingests all KB files
- `GET /stats` — point count and file list
- `GET /openapi.json` — auto-generated by FastAPI, works directly as a ChatGPT action or claude.ai connector
The one endpoint worth calling out: GET /owu-tool. OpenWebUI lets you import Python tool definitions from a URL. Rather than asking users to paste code and deal with syntax errors, we serve the tool definition directly from the server:
```python
@app.get("/owu-tool", response_class=PlainTextResponse)
def owu_tool():
    return OWU_TOOL  # Python string served as plain text
```
Point OpenWebUI at `http://<host>/owu-tool` and it imports cleanly. No copy-paste, no syntax errors.
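The contents of `OWU_TOOL` are not shown above. As a rough illustration: OpenWebUI expects an importable Python module containing a `Tools` class, whose typed methods (with docstrings) become callable tools. The endpoint URL and payload in this sketch are placeholders, not the actual tool definition.

```python
# Hypothetical OWU_TOOL string served by the /owu-tool endpoint.
# OpenWebUI imports it as a module and picks up the `Tools` class.
OWU_TOOL = '''
import requests

class Tools:
    def search_kb(self, query: str) -> str:
        """Semantic search over the company knowledge base."""
        resp = requests.post(
            "http://<host>/search",
            json={"query": query},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.text
'''
```

Because the string is generated server-side, the URL baked into it can match the host actually serving it, which is what makes the import-from-URL flow paste-free.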
Connecting each client
Claude Code — no configuration needed, just Bash:
```bash
curl -s -X POST http://<your-server>/search \
  -H "Content-Type: application/json" \
  -d '{"query": "GSA labor categories"}'
```
Pi agent — a TypeScript extension in ~/.pi/agent/extensions/ that calls the REST endpoint via fetch. Pi auto-discovers extensions on startup, no flags needed.
OpenWebUI — Workspace → Tools → Import from URL → http://<host>/owu-tool. Done.
ChatGPT / claude.ai — expose via a public URL, then point each at /openapi.json. FastAPI generates the spec automatically.
What we would do differently
One thing we have not resolved yet: the same KB source files now feed two separate Qdrant collections. Our Hermes agent uses mcp-server-qdrant with a namespaced collection. The FastAPI server manages its own separate collection. Edits to KB files need to be synced and ingested twice or the collections drift.
The fix is to point the FastAPI server at the existing Hermes collection directly, eliminating the duplicate. We documented it as a known issue and have not gotten to it yet.
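One low-tech way to get there is to make the collection a configuration value rather than code, so the FastAPI server can be pointed at whatever collection Hermes already manages. The variable names and the default collection name below are made up for illustration.

```python
import os

# Hypothetical shared settings: point the FastAPI server at the
# collection the mcp-server-qdrant instance already manages, instead
# of letting each server create its own.
QDRANT_URL = os.environ.get("QDRANT_URL", "http://localhost:6333")
QDRANT_COLLECTION = os.environ.get("QDRANT_COLLECTION", "hermes-kb")

def qdrant_settings() -> dict:
    """Single source of truth both servers could read from."""
    return {"url": QDRANT_URL, "collection": QDRANT_COLLECTION}
```

With both servers reading the same setting, one ingest updates one collection and the drift problem disappears.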
The takeaway
MCP is the right long-term bet for AI tool interoperability. The protocol is well-designed and Anthropic is investing in it seriously. But today, client support is fragmented. If you need your tool server to work across Claude Code, OpenWebUI, ChatGPT, and custom agents simultaneously, REST with an OpenAPI spec is the pragmatic choice — one server, no proxy layer, every client covered.
We will likely add MCP back on top of the FastAPI server as an adapter when the client ecosystem matures. For now, HTTP is doing the job.