Ask HN: Can you tell the difference between Claude Sonnet and Opus?

Hello

I have been using Claude Code for the past 6 months. In that time, multiple revisions of each model have come out. I have seen some improvement in recent iterations, especially with regard to sycophancy.

However, I can't tell the two models' outputs apart. To me, Sonnet seems just as capable as Opus.

Have any of y'all run real-life tests? Mine seem too random to say either way.
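
One way to check whether such results are just noise is a blind paired comparison scored with a sign test. Below is a minimal sketch, assuming you run the same task through both models, hide which output came from which, and record the one you prefer. The function name `sign_test_p_value` and the win counts are hypothetical illustrations, not real measurements.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Two-sided exact sign test on paired preferences.

    wins_a / wins_b: number of tasks where model A's / model B's output
    was preferred. Ties should be dropped before calling.
    Returns the probability of a split at least this lopsided if the
    two models were actually equally good (a fair coin per task).
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no decided comparisons, no evidence either way
    k = min(wins_a, wins_b)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical tally: A preferred on 8 tasks, B on 2.
print(round(sign_test_p_value(8, 2), 3))
```

With only 10 decided tasks, even an 8-2 split gives a p-value of about 0.11, so small samples will often look random.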

4 points | by muddi900 5 hours ago

4 comments

  • eddyzh 3 hours ago
    At work I use Opus max fast. It hardly ever fails for no reason, even if I forget to give it all the right context. At home I run Sonnet, and it does not get what I meant or expected 20-35% of the time. Given the enormous difference in cost, that might still be a net benefit, depending on the value of your time (hourly rate).

    Sonnet being faster would not, on its own, be worth the failure rate for me.

    At home I just don't want to pay more than 20 bucks for incidental projects.

    And Opus max would just consume my tokens in one round.

  • nawi 5 hours ago
    You are not missing anything. For 95% of dev work, Sonnet, especially 3.5 and 3.7, basically beats Opus on value for the price. In my experience the difference boils down to this:

    1. Sonnet is the faster one. It's concise, follows instructions literally, and is significantly better at agentic tasks.

    2. Opus is the philosopher. It's better at high-level architecture, creative writing, or spotting subtle nuances in a 50-page document.

    The reason your tests feel random is that for standard coding, Sonnet is actually the superior model now. It is faster, less prone to over-engineering, and has much lower latency. If you have a massive, messy refactor where you need the model to reason through 10 files without adding bugs, Opus might still have a slight edge in coherence. For everything else, Sonnet is the meta. Stick with it and save the credits.
  • sminchev 3 hours ago
    Yes. When things get too complex, Sonnet misses some things. For example, it creates all the components but does not link them. Or it does not go deep enough into the code and misses certain usages and possible regressions. In other words, it does not proactively search for things that I forgot to tell the model about.
    • eddyzh 3 hours ago
      Exactly this.

      That may be worth the discount. Or not, if your time and attention are worth (quite) a lot.

  • aykutseker 1 hour ago
    in short tasks they look identical and most people can't tell. opus shows its edge in long agent loops and 50k+ context, when sonnet starts dropping tool calls or rerunning steps. sonnet's fine for short stuff and the price is better. on longer agentic flows opus actually earns the cost in my experience.