Troubleshooting

When something isn’t working, run doctor first. It catches the vast majority of issues in one round-trip. Beyond that, here are the most common failure modes and what to do about each.

“No espctl tools available” / “Failed to start MCP server”

Your client can’t even spawn the MCP server.

Check:

  • Is the absolute path to espctl correct in your client config? Run ls -l /path/to/espctl to confirm.
  • Does it have execute permission? chmod +x if not.
  • Run espctl mcp serve in a terminal manually. What does it print to stderr? Common issues:
    • cannot find store at <path>: the store doesn’t exist or has the wrong permissions.
    • dynamic linker errors: the binary was built against a newer libc than your system has; rebuild from source or grab a different release.
  • For Claude Desktop on macOS specifically: GUI apps don’t inherit your shell’s env vars. List every env var explicitly in claude_desktop_config.json rather than relying on ~/.zshrc.
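
The first two checks above can be scripted. This is a minimal sketch, not part of espctl: the helper name check_binary is hypothetical, and it only verifies that the path is absolute, points at a file, and is executable by the current user.

```python
import os


def check_binary(path: str) -> list[str]:
    """Return a list of problems with the configured binary path (empty if none found)."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isfile(path):
        problems.append("no file at this path")
    elif not os.access(path, os.X_OK):
        # Same condition that `chmod +x` fixes.
        problems.append("file is not executable")
    return problems
```

Call it with the exact string from your client config, for example check_binary("/path/to/espctl"); an empty list means the path itself is not the problem.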

doctor reports control_plane: error

Your MCP server is running fine but can’t reach the build server.

Check:

  • curl ${CONTROL_BASE_URL}/health — does it return 200 with a JSON body?
  • Is CONTROL_BASE_URL actually a URL? Common mistakes: missing http:// or https:// scheme, trailing slash, or pasting an SSH alias instead of a routable hostname.
  • DNS — dig or nslookup the host. If it fails, you may need to use the IP form (http://<your-server-ip>) until DNS resolves.
  • Firewall — outbound port 80/443 must be reachable from your machine.

doctor reports control_plane: ok but builds still fail

The MCP server can reach the build server, but builds aren’t producing output.

Check:

  • Is MCP_AUTH_SECRET set and correct? Builds need it; doctor only needs the build server to respond to /health. Without the secret, you’ll see “401 Unauthorized” in the response to /grant/request. If you suspect the secret was revoked, get a fresh access key from the control plane.
  • Is your machine’s clock in sync with the build server? Permissions have short TTLs; if either side’s clock is off by more than ~30 seconds, every permission expires before it can be used.
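
One way to estimate clock skew without shell access to the build server is to compare your local time with the Date header of any HTTP response from it. This is a sketch under the assumption that the server sends a standard Date header; the function name and the 30-second constant (taken from the rough figure above) are illustrative.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW_SECONDS = 30.0  # rough tolerance quoted above, not an exact server setting


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Local clock minus server clock, in seconds (positive: local clock is ahead)."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def skew_is_suspicious(date_header: str, local_now: datetime) -> bool:
    """True if the two clocks disagree by more than the tolerance."""
    return abs(clock_skew_seconds(date_header, local_now)) > MAX_SKEW_SECONDS
```

Pass datetime.now(timezone.utc) as local_now. The Date header has one-second resolution and includes network latency, so treat the result as an estimate; a value near the tolerance is a reason to check NTP on both sides.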

WebRTC connection establishes but immediately closes

on_open fires but the connection drops within seconds, or on_open never fires at all.

Likely causes:

  • Connection negotiation failed. No candidate pair worked. The peer connection state goes to Failed after ~5 seconds and the data channels never open. Cause: network restrictions or firewalls block all UDP and the fallback servers aren’t configured or reachable.
  • Network restrictions on both sides. Direct peer-to-peer is impossible, so all traffic has to go through the fallback relay servers. Make sure the build server returns at least one relay entry in ice_servers.
  • Relay credentials expired. Relay credentials rotate per-session; if your client cached one from an earlier session, it’s stale. Open a fresh session.
  • Browser blocked WebRTC. Some corporate browser policies disable WebRTC entirely. Check chrome://webrtc-internals/ (Chrome) for the connection candidate dump.

Fix pattern: Always implement a fast-fail in your client that watches for RTCPeerConnection.connectionState === 'failed' in parallel with waiting for on_open. Wrap connect() in a 3-attempt retry loop with a 2-second delay between attempts.
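
The fix pattern can be sketched as follows using Python's asyncio. This is not the client library's API: the two asyncio.Event objects are hypothetical stand-ins that your own callbacks would set (one when on_open fires, one when the connection state becomes failed), and the attempt callable is whatever wraps your connect() call.

```python
import asyncio
from typing import Awaitable, Callable


class ConnectionFailed(Exception):
    """Raised when an attempt reaches the failed state or times out."""


async def wait_open_or_fail(opened: asyncio.Event, failed: asyncio.Event,
                            timeout: float) -> None:
    """Return once `opened` is set; raise as soon as `failed` is set or on timeout."""
    open_task = asyncio.ensure_future(opened.wait())
    fail_task = asyncio.ensure_future(failed.wait())
    try:
        done, _ = await asyncio.wait({open_task, fail_task}, timeout=timeout,
                                     return_when=asyncio.FIRST_COMPLETED)
    finally:
        # Cancelling an already finished task is a no-op.
        open_task.cancel()
        fail_task.cancel()
    if open_task in done:
        return
    raise ConnectionFailed("failed state" if fail_task in done else "timed out")


async def connect_with_retry(attempt: Callable[[], Awaitable[None]],
                             attempts: int = 3, delay: float = 2.0) -> int:
    """Run `attempt` up to `attempts` times; return the attempt number that succeeded."""
    for n in range(1, attempts + 1):
        try:
            await attempt()
            return n
        except ConnectionFailed:
            if n == attempts:
                raise  # out of retries, surface the last failure
            await asyncio.sleep(delay)
    raise ValueError("attempts must be at least 1")
```

Each attempt should create a fresh connection and fresh events, then await wait_open_or_fail. Racing the two events is what makes the failure fast: the attempt ends as soon as the failed state is observed rather than when the timeout expires.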

Build hangs in pending for a long time

The permission was issued, but no build machine picked up the job.

Likely causes:

  • No build machine is currently free to respond. The job is assigned automatically as soon as one becomes available.
  • No build machine has the requested target’s toolchain (e.g. esp32p4 may not yet be on every machine). If the job is still unassigned after a few minutes, try a more common target to distinguish “no machines at all” from “specific-target toolchain missing”.

Build fails with a compiler error

This is the easy case. Ask your AI assistant:

Run parse_build_errors on the latest build, then run the diagnose-build-error prompt against the result.

You’ll get a structured “what’s wrong, why, here’s the fix” rather than a 500-line log dump.

Send queue full / firmware download stalls

Throughput drops dramatically partway through a firmware download (only matters for large *.bin files over a relay connection).

Cause: Production build machines cap the send queue at 128 KB. Combined with a 500 ms round-trip relay, this caps throughput at ~256 KB/s, not the multi-MB/s you’d see on a direct peer-to-peer connection.

Fix: This is by design (preventing memory exhaustion when the receiver can’t keep up). If your firmware is large enough that it matters, prefer a direct peer-to-peer connection over a relay. Direct connections aren’t affected as severely because the round-trip time is much lower.
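
The ~256 KB/s figure follows from a simple bound: if at most one send queue's worth of data can be in flight per round trip, throughput cannot exceed queue size divided by round-trip time. The sketch below just reproduces that arithmetic; the function names are illustrative, and real throughput also depends on factors this model ignores.

```python
def relay_throughput_kb_per_s(queue_kb: float, rtt_ms: float) -> float:
    """Upper bound on throughput when at most queue_kb is in flight per round trip."""
    return queue_kb / (rtt_ms / 1000.0)


def download_seconds(size_kb: float, queue_kb: float, rtt_ms: float) -> float:
    """Lower bound on the time to transfer size_kb under the same model."""
    return size_kb / relay_throughput_kb_per_s(queue_kb, rtt_ms)


# 128 KB queue over a 500 ms round trip, as described above.
print(relay_throughput_kb_per_s(128, 500))  # 256.0
# A hypothetical 4 MB (4096 KB) firmware image over the same relay.
print(download_seconds(4096, 128, 500))     # 16.0
```

Halving the round-trip time doubles the bound, which is why a direct connection with a much lower round-trip time is affected far less by the same 128 KB cap.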

Still stuck

  • Ask your AI assistant to read the install://overview resource — it returns the same env-var table from inside the MCP server, which lets you cross-reference what the server thinks its config is.
  • File an issue on this project’s repository with the output of doctor attached.
