The MCP Tool That Timed Out at Five Seconds

We expose our internal knowledge base over both REST and MCP using fastapi-mcp, which mounts existing FastAPI routes as MCP tools: no second server, no protocol-translation proxy. Two tools matter here: search_kb (semantic retrieval) and ask_kb (retrieval plus an LLM synthesis call that returns a cited answer).

search_kb worked flawlessly everywhere. ask_kb failed, intermittently, and the failure was maddeningly opaque. In the MCP client it surfaced as nothing more than:

Command failed with no output

No stack trace. No error payload. Just silence. And only sometimes.

The tell: one tool worked, the other didn’t

The pattern was the clue. search_kb does pure vector retrieval and returns in well under a second. ask_kb does the same retrieval and then calls an LLM to synthesize an answer, around seven seconds end to end. The fast tool always worked; the slow tool usually didn’t. “Usually,” because a short or cached answer occasionally came back quickly enough to succeed.

That smelled like a timeout. But our endpoint wasn’t timing out. Calling the REST /ask route directly returned a clean 200 in ~7 seconds. The failure only happened through MCP.

What fastapi-mcp actually does

Here’s the part that isn’t obvious: fastapi-mcp doesn’t invoke your route function in-process. When an MCP client calls a tool, fastapi-mcp makes an internal HTTP request to your own REST endpoint over localhost, using an httpx.AsyncClient.

And httpx defaults to a 5-second read timeout.

So the chain for ask_kb was: MCP client → fastapi-mcp → httpx (5s limit) → our /ask route → ~7s of retrieval + LLM synthesis. The synthesis blew past five seconds, httpx raised ReadTimeout, fastapi-mcp returned nothing, and the MCP client reported “Command failed with no output.” The container logs confirmed it:

httpx.ReadTimeout
  File ".../fastapi_mcp/server.py", line 584, in _request

search_kb was fine because it never crossed the 5-second line.
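
The behaviour is easy to reproduce without any of the real pieces. The sketch below is a stdlib-only simulation, not fastapi-mcp code: fake_route, bridge_call, and SCALE are hypothetical names, and the latencies are the rough figures quoted above, scaled down so the demo finishes in well under a second.

```python
import asyncio

SCALE = 0.01  # 1 simulated second = 10 ms of real time


async def fake_route(latency_s: float, payload: str) -> str:
    """Stand-in for a REST route that takes `latency_s` simulated seconds."""
    await asyncio.sleep(latency_s * SCALE)
    return payload


async def bridge_call(latency_s: float, payload: str, read_timeout_s: float):
    """Stand-in for the internal HTTP hop: give up after `read_timeout_s`."""
    try:
        return await asyncio.wait_for(
            fake_route(latency_s, payload), timeout=read_timeout_s * SCALE
        )
    except asyncio.TimeoutError:
        # The caller gets no payload at all, which mirrors "no output".
        return None


async def simulate(read_timeout_s: float) -> dict:
    """Call a fast tool and a slow tool through the same bridge."""
    return {
        "search_kb": await bridge_call(0.5, "hits", read_timeout_s),
        "ask_kb": await bridge_call(7.0, "cited answer", read_timeout_s),
    }


with_default = asyncio.run(simulate(read_timeout_s=5.0))
with_longer_limit = asyncio.run(simulate(read_timeout_s=120.0))
print(with_default)
print(with_longer_limit)
```

With the 5-second limit only the fast tool returns a payload; with a limit longer than the slow route's latency, both do.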

The fix

Hand fastapi-mcp an httpx client with a timeout sized for LLM latency:

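import httpx
from fastapi_mcp import FastApiMCP

# `app` is the existing FastAPI application that defines the search_kb and ask_kb routes.
# This client is the one fastapi-mcp uses for its internal calls to our own REST routes.
mcp_client = httpx.AsyncClient(
    base_url="http://localhost:8000",
    timeout=httpx.Timeout(120.0),   # default is 5s, far too short for LLM-backed tools
)

mcp = FastApiMCP(
    app,
    http_client=mcp_client,
    include_operations=["search_kb", "ask_kb"],  # expose only these two routes as tools
)
mcp.mount_http()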

One explicit timeout. The intermittent failures vanished.

Takeaways

When a framework bridges calls internally over HTTP, it brings its own client config, and defaults are tuned for fast APIs, not model latency. It is easy to forget that fastapi-mcp’s localhost hop is even there.
“Command failed with no output” from an MCP client usually means a timeout, not a crash. The real exception lives in your server’s logs, not the client’s.
Any MCP tool that calls an LLM needs a generous timeout at every hop in the chain, and there are several hops: the MCP client, your reverse proxy (we run nginx with proxy_read_timeout 300s), and the framework’s internal HTTP client. Miss the innermost one and the outer ones don’t matter.
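
One way to keep that last point honest is to write the chain down and check it. This is a rough sketch with hypothetical names (hops_too_short and the dictionary keys); the nginx and httpx figures come from this post, while the MCP client value is a made-up placeholder, since that limit depends on the client you use.

```python
from typing import Dict, List


def hops_too_short(timeouts_s: Dict[str, float], expected_latency_s: float) -> List[str]:
    """Return the hops whose timeout would fire before the tool finishes."""
    return [hop for hop, limit in timeouts_s.items() if limit <= expected_latency_s]


before_fix = {
    "mcp_client": 60.0,          # placeholder value, not from this post
    "nginx_proxy_read": 300.0,   # proxy_read_timeout 300s
    "fastapi_mcp_httpx": 5.0,    # httpx default
}
after_fix = {**before_fix, "fastapi_mcp_httpx": 120.0}

print(hops_too_short(before_fix, expected_latency_s=7.0))
print(hops_too_short(after_fix, expected_latency_s=7.0))
```

In practice you would feed it a worst-case latency with some headroom rather than the typical seven seconds.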
