We expose our internal knowledge base over both REST and MCP using fastapi-mcp, which mounts existing FastAPI routes as MCP tools: no second server, no protocol-translation proxy. Two tools matter here: search_kb (semantic retrieval) and ask_kb (retrieval plus an LLM synthesis call that returns a cited answer).
search_kb worked flawlessly everywhere. ask_kb failed, intermittently, and the failure was maddeningly opaque. In the MCP client it surfaced as nothing more than:
Command failed with no output
No stack trace. No error payload. Just silence. And only sometimes.
The tell: one tool worked, the other didn’t
The pattern was the clue. search_kb does pure vector retrieval and returns in well under a second. ask_kb does the same retrieval and then calls an LLM to synthesize an answer, around seven seconds end to end. The fast tool always worked; the slow tool usually didn’t. “Usually,” because a short or cached answer occasionally came back quickly enough to succeed.
That smelled like a timeout. But our endpoint wasn’t timing out. Calling the REST /ask route directly returned a clean 200 in ~7 seconds. The failure only happened through MCP.
What fastapi-mcp actually does
Here’s the part that isn’t obvious: fastapi-mcp doesn’t invoke your route function in-process. When an MCP client calls a tool, fastapi-mcp makes an internal HTTP request to your own REST endpoint over localhost, using an httpx.AsyncClient.
And httpx defaults to a 5-second timeout on every phase of a request, including the read.
So the chain for ask_kb was: MCP client → fastapi-mcp → httpx (5s limit) → our /ask route → ~7s of retrieval + LLM synthesis. The synthesis blew past five seconds, httpx raised ReadTimeout, fastapi-mcp returned nothing, and the MCP client reported “Command failed with no output.” The container logs confirmed it:
httpx.ReadTimeout
File ".../fastapi_mcp/server.py", line 584, in _request
search_kb was fine because it never crossed the 5-second line.
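The mechanics are easy to reproduce without any of the real libraries. The sketch below is a stand-in, not fastapi-mcp's code: `bridge_call`, `search_route`, and `ask_route` are hypothetical names, the durations are scaled down tenfold (0.5 s plays the role of the 5 s default), and swallowing the timeout into `None` is an assumption meant to mirror the "no output" symptom.

```python
import asyncio

BRIDGE_TIMEOUT = 0.5  # stands in for the 5 s default, scaled down 10x


async def search_route() -> str:
    await asyncio.sleep(0.05)  # fast path: pure retrieval
    return "search results"


async def ask_route() -> str:
    await asyncio.sleep(0.7)  # slow path: retrieval plus synthesis (~7 s, scaled down)
    return "cited answer"


async def bridge_call(route, timeout: float = BRIDGE_TIMEOUT):
    """Hypothetical internal hop: await the route under a deadline."""
    try:
        return await asyncio.wait_for(route(), timeout=timeout)
    except asyncio.TimeoutError:
        # The caller gets no payload; the exception is only visible server-side.
        return None


async def main() -> dict:
    return {
        "search_kb": await bridge_call(search_route),
        "ask_kb": await bridge_call(ask_route),
        "ask_kb_fixed": await bridge_call(ask_route, timeout=12.0),
    }


results = asyncio.run(main())
print(results)
```

The fast route returns its payload, the slow route comes back empty under the short deadline, and the same slow route succeeds once the deadline is raised.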
The fix
Hand fastapi-mcp an httpx client with a timeout sized for LLM latency:
import httpx
from fastapi_mcp import FastApiMCP

# `app` is the existing FastAPI application that defines the /ask and search routes.
mcp_client = httpx.AsyncClient(
    base_url="http://localhost:8000",
    timeout=httpx.Timeout(120.0),  # default is 5s, far too short for LLM-backed tools
)
mcp = FastApiMCP(app, http_client=mcp_client, include_operations=["search_kb", "ask_kb"])
mcp.mount_http()
One constructor argument. The intermittent failures vanished.
Takeaways
- When a framework bridges calls internally over HTTP, it brings its own client config, and defaults are tuned for fast APIs, not model latency. It is easy to forget that fastapi-mcp’s localhost hop is even there.
- “Command failed with no output” from an MCP client often means a timeout, not a crash. The real exception lives in your server’s logs, not the client’s.
- Any MCP tool that calls an LLM needs a generous timeout across the whole chain, and there are several: the MCP client, your reverse proxy (we run nginx with proxy_read_timeout 300s), and the framework’s internal HTTP client. Miss the innermost one and the outer ones don’t matter.
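The last point amounts to taking a minimum: the chain can wait no longer than its shortest timeout. A small sketch makes that concrete. The nginx and httpx figures come from this post; the 60-second MCP client timeout is a made-up placeholder, since the real value depends on the client, and `binding_timeout` is a hypothetical helper.

```python
from typing import Dict, Tuple


def binding_timeout(layers: Dict[str, float]) -> Tuple[str, float]:
    """Return the layer with the smallest timeout; it caps the whole chain."""
    name = min(layers, key=lambda layer: layers[layer])
    return name, layers[name]


EXPECTED_LATENCY_S = 7.0  # rough end-to-end time for ask_kb

before_fix = {
    "mcp_client": 60.0,      # hypothetical placeholder value
    "nginx": 300.0,          # proxy_read_timeout 300s
    "internal_httpx": 5.0,   # httpx default
}
after_fix = {**before_fix, "internal_httpx": 120.0}

for label, layers in (("before", before_fix), ("after", after_fix)):
    layer, limit = binding_timeout(layers)
    verdict = "ok" if limit > EXPECTED_LATENCY_S else "too short"
    print(f"{label}: {layer} caps the chain at {limit:.0f}s ({verdict})")
```

Before the fix the internal client is the binding limit at 5 seconds, below the 7-second call; after it, the limit moves out to whichever outer layer is tightest.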