What Claude Code Sends to the Cloud

(rastrigin.systems)

33 points | by rastriga 1 day ago

7 comments

  • d4rkp4ttern 6 hours ago
    Here's a related issue that took me a whole day to figure out: Claude Code telemetry pings were causing a total network failure when I used CC with a local LLM via llama-server.

    I wanted to use local LLMs (~30B) on my M1 Max MacBook Pro with Claude Code for a privacy-sensitive project. I spun up Qwen3-30B-A3B via llama-server and hooked it up to Claude Code, and after using it for an hour or so, found that my network connectivity was totally borked: the browser wasn't loading any web pages at all.

    Some investigation showed that Claude Code assumes it's talking to the Anthropic API and sends event logging requests (/api/event_logging/batch) to the llama-server endpoint. The local server doesn't implement that route and returns 404s, but Claude Code retries aggressively. These failed requests pile up as TCP connections in TIME_WAIT state, and on macOS this can exhaust the ephemeral port range. So my browser stopped loading pages, my CLI tools couldn't reach the internet, and the only option was to reboot my MacBook.
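    The mechanism is easy to demonstrate with plain sockets. This is an illustrative sketch (not Claude Code's actual retry loop): every retried request is a fresh TCP connection, and each one consumes a distinct ephemeral port on the client side, which then lingers in TIME_WAIT after close.

```python
import socket
import threading

# Stand-in for llama-server: accepts connections, answers 404 to the
# unimplemented /api/event_logging/batch route, then closes.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(64)
port = srv.getsockname()[1]

def serve(n):
    for _ in range(n):
        conn, _ = srv.accept()
        conn.recv(4096)
        conn.sendall(b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n")
        conn.close()

threading.Thread(target=serve, args=(20,), daemon=True).start()

# Stand-in for the aggressive retry loop: each attempt opens a fresh TCP
# connection, so each attempt burns a new ephemeral port. With enough
# retries, the closed sockets sitting in TIME_WAIT can exhaust the
# macOS ephemeral port range.
used_ports = set()
for _ in range(20):
    c = socket.create_connection(("127.0.0.1", port))
    c.sendall(b"POST /api/event_logging/batch HTTP/1.1\r\nHost: localhost\r\n\r\n")
    c.recv(4096)
    used_ports.add(c.getsockname()[1])
    c.close()

print(f"{len(used_ports)} distinct ephemeral ports for 20 failed requests")
```

    Scale 20 up to the retry volume of an hour-long session and you can see how the port range runs dry.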

    After some more digging (with Claude Code's help, of course) I found that the fix was to add this setting to my ~/.claude/settings.json:

        {
          // ... other settings ...
          "env": {
            "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
          }
          // ... other settings ...
        }
    
    I added this to my local-LLM + Claude Code / Codex CLI guide here:

    https://github.com/pchalasani/claude-code-tools/blob/main/do...

    I don't know if others faced this issue; hopefully this is helpful, or maybe there are other fixes I'm not aware of.

  • dominicm 14 hours ago
    I'm...rather confused about why the results here are surprising. The title and first paragraph suggest something unusual, like analytics or shipping off your entire codebase, but it's just sending the prompt + context.

    This is how every LLM API has worked for years; the API is a stateless token machine, and the prompts + turns are managed by the client application. If anything it's interesting how standard it is; no inside baseball, they just use the normal public API.
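    The pattern can be sketched in a few lines. This is an illustration of the stateless client-owns-the-transcript structure, not the real Anthropic client; the field names loosely mirror the Messages API shape.

```python
# The client keeps the conversation locally and replays all of it on
# every request; the server remembers nothing between calls.

def build_request(system_prompt: str, history: list, user_msg: str) -> dict:
    """Each request is self-contained: system prompt + full history + new turn."""
    return {
        "system": system_prompt,
        "messages": history + [{"role": "user", "content": user_msg}],
    }

history = []
for i, user_msg in enumerate(["fix the bug", "now add a test"], start=1):
    req = build_request("You are Claude Code...", history, user_msg)
    print(f"turn {i}: {len(req['messages'])} messages on the wire")
    # Append both sides locally so the next request replays them too.
    history = req["messages"] + [{"role": "assistant", "content": f"(reply {i})"}]
```

    The payload grows linearly with the conversation, which is exactly what a MITM capture shows.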

  • mikro 20 hours ago
    I've been waiting for someone to dig into this more deeply! Looking forward to Part 2!

    I use both Claude Code and Xcode with a local LLM (running with LM Studio) and I noticed they both have system prompts that make it work like magic.

    If anyone reading this is interested in setting up Claude Code to run offline, I followed these instructions:

    https://medium.com/@luongnv89/setting-up-claude-code-locally...

    My personal LLM preference is Qwen3-Next-80B with 4-bit quantization, about 45 GB in RAM.

    • rastriga 7 hours ago
      Thanks — glad it resonated! Part 2 should uncover a lot of the magic behind the scenes. And thanks for sharing the link. Running Claude Code against a local LLM is a really interesting direction, but I need more RAM...
  • rastriga 1 day ago
    I built a MITM proxy to inspect Claude Code’s network traffic and was surprised by how much context is sent on every request. This is Part 1 of a 4-part series focusing only on the wire format and transport layer. Happy to answer technical questions.
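    For anyone who wants to reproduce the setup, an addon along these lines will dump every proxied request. This is a minimal sketch: it assumes mitmproxy is installed, the filename is mine, and the client respects the standard proxy env vars (plus trusts mitmproxy's CA cert for TLS interception).

```python
# log_requests.py -- minimal mitmproxy addon: log method, URL, and payload
# size of each request. Run with:
#   mitmdump -s log_requests.py
# then point the client at it, e.g. HTTPS_PROXY=http://127.0.0.1:8080

def describe(req) -> str:
    """Format one request as 'METHOD url (N bytes)'."""
    body = req.content or b""
    return f"{req.method} {req.pretty_url} ({len(body)} bytes)"

def request(flow):
    # mitmproxy calls this hook once per client request.
    print(describe(flow.request))
```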
  • crbelaus 20 hours ago
    This is great! Would love to see if Opencode sends different payloads and how they differ.
    • rastriga 7 hours ago
      Thanks! I tried doing a similar comparison with Codex CLI and Cursor, but hit a wall. Codex doesn’t seem to respect standard proxy env vars, and Cursor uses gRPC. Claude Code was the only one that was straightforward to inspect. Opencode looks like a great next candidate.
  • hhh 23 hours ago
    Seems a bit obvious that all the information Claude Code sends to the LLM would be sent to Anthropic, no? Isn't that the point of using it via Azure or AWS Bedrock, for the guarantees of secrecy they provide you?
    • rastriga 22 hours ago
      Yes, it's obvious the data goes to Anthropic. What wasn't obvious to me was what exactly is included and how it's structured: system prompt size, full conversation replay, file contents, git history, tool calls. The goal was to understand how the wire level works. On Azure/Bedrock: good point! My understanding is that they route requests through their own infrastructure rather than to Anthropic directly, which does change the trust boundary, but my focus here was strictly on what the client sends, and that payload structure is the same regardless of backend.
      • bostonvaulter2 16 hours ago
        What specifically do you mean when you say git history is sent? How much of it is included in each request?
        • rastriga 8 hours ago
          It's the last 5 commits, not the full history. Here's what actually gets sent in the system prompt:

            gitStatus: This is the git status at the start of the conversation...
            Current branch: main
            Main branch: main
            Status: (clean)
          
            Recent commits:
            6578431 chore: Update security contact email (#417)
            0dc71cd chore: Open source readiness fixes (#416)
            ...
          
          Enough for Claude to understand what you've been working on without sending your entire repo history.
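          The truncation is easy to mimic. A sketch of how a client could assemble that block (the helper name and signature are mine; the format is copied from the snippet above, not from Claude Code's source):

```python
# Build a gitStatus-style context block: branch info plus only the most
# recent n commits.

def git_context(branch: str, main: str, status: str, commits: list, n: int = 5) -> str:
    lines = [
        f"Current branch: {branch}",
        f"Main branch: {main}",
        f"Status: {status}",
        "",
        "Recent commits:",
        # Truncate here: the rest of the history never leaves the machine.
        *commits[:n],
    ]
    return "\n".join(lines)
```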
      • hhh 21 hours ago
        That makes sense :) Cool analysis!
  • MuffinFlavored 20 hours ago
    I am surprised that, if they have a session ID of some sort for your chat, they make you re-send the entire message history each time. Not a single kind of cache or stateful proxy/buffering mechanism? I'm guessing that extra cost in bandwidth is cheaper than having to develop and maintain that? Seems like an obvious optimization/design tradeoff they could eventually decide to change one day.
    • rastriga 7 hours ago
      Statelessness simplifies scaling and operational complexity. They cache the system prompt, but otherwise each request is fully self-contained. It’s an obvious tradeoff, and I wouldn’t be surprised if they move toward some form of server-side state or delta encoding once the product stabilizes.
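      To make the caching concrete: in Anthropic's public Messages API, prompt caching is opt-in per content block via a cache_control marker, so the big stable system prompt can be reused server-side while the rest of the request stays self-contained. A hedged sketch of such a request body (the model id is illustrative, and the exact payload Claude Code sends may differ):

```python
# Request body shaped like the documented Messages API, with the system
# prompt marked as a cacheable prefix.
request_body = {
    "model": "claude-sonnet-example",  # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are Claude Code... (large, stable system prompt)",
            # Everything up to and including this block becomes a
            # cacheable prefix the server can reuse on later requests.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    # The conversation itself is still replayed in full each turn.
    "messages": [{"role": "user", "content": "fix the bug"}],
}
```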
    • edmundsauto 16 hours ago
      I feel it has to be something bigger; if they're just taking client-side input for the system prompt, that seems like a security issue. Doesn't this mean I could reprogram Claude at its core?
      • rastriga 7 hours ago
        You can change the system prompt Claude Code sends, which changes how the agent frames behavior, but Claude still has internal and server-side safety layers, so removing or rewriting the client system prompt won't let you magically bypass those. I think of the client system prompt more as agent configuration than as the primary safety net: it shapes behavior, but it's not the final authority. I'm covering this in Part 2, breaking down what's actually in the system prompt and how the client-side safety framing is constructed.