Troubleshooting
When something isn’t working, run doctor first. It catches the vast
majority of issues in one round-trip. Beyond that, here are the most common
failure modes and what to do about each.
“No espctl tools available” / “Failed to start MCP server”
Your client can’t even spawn the MCP server.
Check:
- Is the absolute path to espctl correct in your client config? Run ls -l /path/to/espctl to confirm.
- Does it have execute permission? chmod +x if not.
- Run espctl mcp serve in a terminal manually. What does it print to stderr? Common issues:
  - cannot find store at <path>: the store doesn’t exist or has wrong permissions.
  - dynamic linker errors: the binary was built against newer libc than your system has; rebuild from source or grab a different release.
- For Claude Desktop on macOS specifically: GUI apps don’t inherit your
shell’s env vars. List every env var explicitly in
claude_desktop_config.json rather than relying on ~/.zshrc.
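As an illustration of listing env vars explicitly, the sketch below builds a claude_desktop_config.json entry and prints it. The server key espctl, the placeholder values, and the helper name buildClaudeDesktopConfig are assumptions made for this example; the env var names and the mcp serve arguments are the ones used elsewhere on this page.

```typescript
// Shape of one entry under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical helper: refuses relative paths, since the client needs an absolute one.
function buildClaudeDesktopConfig(
  espctlPath: string,
  env: Record<string, string>,
): { mcpServers: Record<string, McpServerEntry> } {
  if (!espctlPath.startsWith("/")) {
    throw new Error(`espctl path must be absolute, got: ${espctlPath}`);
  }
  return {
    mcpServers: {
      // "espctl" is an assumed server name; any key works.
      espctl: { command: espctlPath, args: ["mcp", "serve"], env: { ...env } },
    },
  };
}

// Placeholder values: substitute your own path, URL, and secret.
const config = buildClaudeDesktopConfig("/path/to/espctl", {
  CONTROL_BASE_URL: "https://build.example.com",
  MCP_AUTH_SECRET: "<your-secret>",
});
console.log(JSON.stringify(config, null, 2));
```

Paste the printed JSON into claude_desktop_config.json and restart the client so it picks up the new environment.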
doctor reports control_plane: error
Your MCP server is running fine but can’t reach the build server.
Check:
- curl ${CONTROL_BASE_URL}/health: does it return 200 with a JSON body?
- Is CONTROL_BASE_URL actually a URL? Common mistakes: missing http:// or https:// scheme, trailing slash, or pasting an SSH alias instead of a routable hostname.
- DNS: dig or nslookup the host. If it fails, you may need to use the IP form (http://<your-server-ip>) until DNS resolves.
- Firewall: outbound port 80/443 must be reachable from your machine.
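The common CONTROL_BASE_URL mistakes listed above can be caught before any network call. The following is a minimal sketch, not part of espctl: checkControlBaseUrl is a hypothetical helper, and the "no dot in the hostname" test is only a rough heuristic for spotting an SSH alias.

```typescript
// Returns a list of likely problems with a CONTROL_BASE_URL value (empty if none found).
function checkControlBaseUrl(raw: string): string[] {
  const problems: string[] = [];
  const value = raw.trim();

  if (!/^https?:\/\//i.test(value)) {
    problems.push("missing http:// or https:// scheme");
    return problems; // without a scheme the rest cannot be parsed reliably
  }
  if (value.endsWith("/")) {
    problems.push("trailing slash");
  }

  let host = "";
  try {
    host = new URL(value).hostname;
  } catch {
    problems.push("not a parseable URL");
    return problems;
  }

  // Heuristic (assumption): a bare name with no dot is often an SSH alias.
  // IPv6 literals contain ":" and are skipped.
  if (!host.includes(".") && !host.includes(":") && host !== "localhost") {
    problems.push(`"${host}" looks like an SSH alias, not a routable hostname`);
  }
  return problems;
}

console.log(checkControlBaseUrl("build.example.com"));
console.log(checkControlBaseUrl("https://build.example.com/"));
console.log(checkControlBaseUrl("https://build.example.com"));
```

An empty result does not prove the server is reachable; it only rules out the formatting mistakes, so the curl, DNS, and firewall checks still apply.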
doctor reports control_plane: ok but builds still fail
The MCP server can reach the build server, but builds aren’t producing output.
Check:
- Is MCP_AUTH_SECRET set and correct? Builds need it; doctor only needs the build server to respond to /health. Without the secret, you’ll see “401 Unauthorized” in the response to /grant/request. If you suspect the secret was revoked, get a fresh access key from the control plane.
- Is your machine’s clock in sync with the build server? Permissions have short TTLs; if either side’s clock is off by more than ~30 seconds, every permission expires before it can be used.
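One rough way to check for clock skew is to compare your local time with the Date header of an HTTP response from the build server. The sketch below assumes the /health response carries a standard Date header, which this page does not confirm; clockSkewSeconds, skewIsRisky, and checkClock are hypothetical names.

```typescript
// How far the local clock is ahead of (+) or behind (-) the server, in whole seconds.
function clockSkewSeconds(serverDateHeader: string, localNowMs: number): number {
  const serverMs = Date.parse(serverDateHeader);
  if (Number.isNaN(serverMs)) {
    throw new Error(`unparseable Date header: ${serverDateHeader}`);
  }
  return Math.round((localNowMs - serverMs) / 1000);
}

// The ~30 second limit comes from the text above.
function skewIsRisky(skewSeconds: number, limitSeconds = 30): boolean {
  return Math.abs(skewSeconds) > limitSeconds;
}

// Assumes the server sends a Date header on /health (not confirmed by this page).
async function checkClock(controlBaseUrl: string): Promise<void> {
  const res = await fetch(`${controlBaseUrl}/health`);
  const header = res.headers.get("date");
  if (header === null) {
    console.log("no Date header in response; compare clocks another way");
    return;
  }
  const skew = clockSkewSeconds(header, Date.now());
  console.log(`clock skew: ${skew}s${skewIsRisky(skew) ? " (too large, sync your clock)" : ""}`);
}
```

The Date header has one-second resolution and network latency adds a little error, so treat the result as an estimate. A skew of a few seconds is fine; tens of seconds is the range that matters here.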
WebRTC connection establishes but immediately closes
on_open fires but the connection drops within seconds, or on_open
never fires at all.
Likely causes:
- Connection negotiation failed. No candidate pair worked. The peer connection state goes to Failed after ~5 seconds and the data channels never open. Cause: network restrictions or firewalls block all UDP and the fallback servers aren’t configured or reachable.
- Network restrictions on both sides. Direct peer-to-peer is impossible, which forces a relay through fallback servers. Make sure the build server returns at least one relay entry in ice_servers.
- Relay credentials expired. Relay credentials rotate per-session; if your client cached one from an earlier session, it’s stale. Open a fresh session.
- Browser blocked WebRTC. Some corporate browser policies disable WebRTC entirely. Check chrome://webrtc-internals/ (Chrome) for the connection candidate dump.
Fix pattern: Always implement a fast-fail in your client that watches
for RTCPeerConnection.connectionState === 'failed' in parallel with
waiting for on_open. Wrap connect() in a 3-attempt retry loop with a
2-second delay between attempts.
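The fix pattern above can be sketched as two small helpers. This is an illustration under assumptions, not the client's actual code: PeerLike is a hypothetical minimal shape mirroring the part of RTCPeerConnection used here, and waitForOpen and connectWithRetry are made-up names. Your own connect() would create the peer connection and return waitForOpen(...).

```typescript
// Hypothetical minimal shape of the peer connection, so the sketch runs outside a browser too.
interface PeerLike {
  connectionState: string;
  addEventListener(type: "connectionstatechange", listener: () => void): void;
}

// Fast-fail: settles as soon as either the channel opens or the peer reports 'failed'.
function waitForOpen(peer: PeerLike, opened: Promise<void>): Promise<void> {
  const failed = new Promise<never>((_, reject) => {
    const check = () => {
      if (peer.connectionState === "failed") {
        reject(new Error("peer connection failed"));
      }
    };
    peer.addEventListener("connectionstatechange", check);
    check(); // the state may already be 'failed' when we start watching
  });
  return Promise.race([opened, failed]);
}

// Retry loop: 3 attempts with a 2-second pause between them by default.
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  attempts = 3,
  delayMs = 2000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Racing the open signal against the failed state means a doomed attempt ends in about the time the state machine takes to give up, instead of waiting on an on_open that never fires. Build a fresh peer connection inside each connect() call so that stale relay credentials are not reused across attempts.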
Build hangs in pending for a long time
The permission was issued, but no build machine picked up the job.
Likely causes:
- No build machine is currently free to respond — the job auto-assigns shortly.
- No build machine has the requested target’s toolchain (e.g. esp32p4 may not yet be on every machine). If the job is still unassigned after a few minutes, try a more common target to distinguish “no machines at all” from “specific-target toolchain missing”.
Build fails with a compiler error
This is the easy case. Ask your AI assistant:
Run parse_build_errors on the latest build, then run the diagnose-build-error prompt against the result.
You’ll get a structured “what’s wrong, why, here’s the fix” rather than a 500-line log dump.
Send queue full / firmware download stalls
Throughput drops dramatically partway through a firmware download (only
matters for large *.bin files over a relay connection).
Cause: Production build machines cap the send queue at 128 KB. Over a relay with a 500 ms round trip, this caps throughput at ~256 KB/s (128 KB per 0.5 s round trip), not the multi-MB/s you’d see on a direct peer-to-peer connection.
Fix: This is by design (preventing memory exhaustion when the receiver can’t keep up). If your firmware is large enough that it matters, prefer a direct peer-to-peer connection over a relay. Direct connections aren’t affected as severely because the round-trip time is much lower.
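The arithmetic behind the cap can be reproduced with a few lines. This sketch assumes the simple model implied above, where at most one send queue's worth of data is in flight per round trip. The 20 ms direct round-trip time and the 4 MB firmware size are illustrative assumptions, not measured values.

```typescript
// Upper bound on throughput when at most `queueBytes` can be in flight per round trip.
function throughputCapBytesPerSec(queueBytes: number, rttMs: number): number {
  return (queueBytes * 1000) / rttMs;
}

const QUEUE_BYTES = 128 * 1024; // 128 KB send-queue cap from the text

const relayCap = throughputCapBytesPerSec(QUEUE_BYTES, 500); // 500 ms relay round trip
const directCap = throughputCapBytesPerSec(QUEUE_BYTES, 20); // 20 ms: assumed direct RTT

const firmwareBytes = 4 * 1024 * 1024; // hypothetical 4 MB firmware image

console.log(`relay cap:  ${relayCap / 1024} KB/s, ${firmwareBytes / relayCap} s for 4 MB`);
console.log(`direct cap: ${directCap / 1024} KB/s, ${firmwareBytes / directCap} s for 4 MB`);
```

With these inputs the relay cap comes out at 256 KB/s, matching the figure above, and the same queue size over a short round trip allows several MB/s. Real transfers will be somewhat slower than either bound.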
Still stuck
- Ask your AI assistant to read the install://overview resource. It returns the same env-var table from inside the MCP server, which lets you cross-reference what the server thinks its config is.
- File an issue on this project’s repository with the output of doctor attached.
See also
- doctor: health-check tool.
- Environment Variable Index: every env var in one place.