MiniMax M2.1: Built for Real-World Complex Tasks, Multi-Language Programming

(minimaxi.com)

150 points | by 110 9 hours ago

19 comments

gempir 0 minutes ago
Very anecdotal, but for me this model has very weak prompt adherence. I compared it briefly to Gemini Flash 3.0, and simple things like "don't use markdown tables in output" were very hard to get with M2.1.
It took me about 5 prompt iterations until it finally listened.
kachapopopow 2 hours ago
I think people should stop comparing to Sonnet and compare to Opus instead, since it's so far ahead at producing code I would actually want to use (Gemini 3 Pro tends to be lacking in generalization and wants things to use its own style rather than adapting).
Whatever benchmark opus is ahead in should be treated as a very important metric of proper generalization in models.
viraptor 8 hours ago
I've played with this a bit and it's ok. I'd place it somewhere around Sonnet 4.5 level, probably below. But with this aggressive pricing you can just run 3 copies to do the same thing, choose the one that succeeded, and still come out way ahead on cost. It's not as good at following instructions as Claude models and can get lost, but it's still "good enough".
I'm very happy with using it to just "do things". When doing in depth debugging or a massive plan is needed, I'd go with something better, but later going through the motions? It works.
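A minimal sketch of the "run 3 copies, choose the one that succeeded" idea, assuming each attempt can be checked automatically (for example by a test suite). The names `best_of_n`, `fake_attempt`, and `fake_check` are hypothetical stand-ins, not part of any MiniMax API; a real setup would replace `fake_attempt` with a model call.

```python
from typing import Callable, List, Optional


def best_of_n(attempt: Callable[[int], str],
              passed: Callable[[str], bool],
              n: int = 3) -> Optional[str]:
    """Run n independent attempts and return the first one that passes the check."""
    candidates: List[str] = [attempt(i) for i in range(n)]
    for candidate in candidates:
        if passed(candidate):
            return candidate
    return None  # every copy failed


# Hypothetical stand-in for a model call: only the third copy gets it right.
def fake_attempt(seed: int) -> str:
    if seed == 2:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Hypothetical stand-in for a test suite: run the code and check one case.
def fake_check(code: str) -> bool:
    scope: dict = {}
    exec(code, scope)
    return scope["add"](2, 3) == 5


winner = best_of_n(fake_attempt, fake_check, n=3)
```

The cost argument only holds if n attempts with the cheaper model still cost less than one attempt with the pricier one, and if a reliable check exists to tell which copy succeeded; the thread gives no prices, so none are assumed here.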
gcanyon 6 hours ago
Would it kill them to use the words "AI coding agent" somewhere prominent?
"MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks" could be an IDE, a UI framework, a performance library, or, or...
- spoaceman7777 5 hours ago
  It's not an AI coding agent. It's an LLM that can be used for whatever you'd like, including powering coding agents.
  - pdyc 4 hours ago
    That reinforces OP’s point that it isn’t clear from their wording. I initially thought it was a speech model, then I saw Python, etc., and it took me a bit more reading to understand what it actually is
  - gcanyon 5 hours ago
    HA! I almost added a disclaimer to the original message that I wasn't certain in my identification, hence the request/complaint that they didn't make it clear. But I figured the message would be more effective if I "confidently got it wrong" rather than asking, so I went with it.
    - martin-t 41 minutes ago
      Some sad irony: just like saying the wrong thing is more likely to get you a reply, using a poor title gets them more engagement.
- tw1984 5 hours ago
  Its main Chinese competitor, GLM, has made something like 50 cents USD per user over the past 6 months from its 40 million "developer users". Calling your flagship model an "AI coding agent" is like telling investors "we are doing this for fun, not for money".
Tepix 1 hour ago
The weights have now been released on Hugging Face.
https://huggingface.co/MiniMaxAI/MiniMax-M2.1
jondwillis 9 hours ago
> MiniMax has been continuously transforming itself in a more AI-native way. The core driving forces of this process are models, Agent scaffolding, and organization. Throughout the exploration process, we have gained increasingly deeper understanding of these three aspects. Today we are releasing updates to the model component, namely MiniMax M2.1, hoping to help more enterprises and individuals find more AI-native ways of working (and living) sooner.
This compresses to: “We are updating our model, MiniMax, to 2.1. Agent harnesses exist and Agents are getting more capable.”
A good model and agent harness, pointed at the task of writing this post, might suggest less verbosity and complexity. It comes off as fake and hype-chasing to me, even if your model is actually good. I disengage there.
I saw y'all give a lightning talk recently and it was similarly hype-y. Perhaps this is a translation or cultural thing.
- tw1984 8 hours ago
  So when MiniMax released a pretty capable model, you chose to ignore the model itself, focus on a single sentence they wrote in the release notes, and start bad-mouthing it.
  Is it a cultural thing?
  - pembrook 1 hour ago
    It’s called bikeshedding and yes it’s a cultural thing on HN. [1]
    Most people here are big company worker bees where they take zero risks and do very little of substance.
    In these organizations, it’s common for large groups of people to get together in “meetings” and endlessly nitpick surface-level details of unimportant things while completely missing the big picture because it’s far too complex to allow for easy opinions or smart-sounding critique.
    [1] https://en.wikipedia.org/wiki/Law_of_triviality
  - simlevesque 7 hours ago
    If I use a piece of software, I need to trust it.
    - tw1984 5 hours ago
      A model is not software; it is a bunch of weights.
      You are more than welcome to pick whatever model or software you choose to trust; that is totally fine. However, that is vastly different from bad-mouthing a model or software just because its release notes contain a single sentence you don't like.
      - LoganDark 5 hours ago
        The API is software. You don't get the weights.
        - logicprog 34 minutes ago
          The weights are open.
- zaptrem 8 hours ago
  Not sure it’s a cultural thing since most of the copy coming out of DeepSeek has been pretty straightforward.
tomcam 8 hours ago
I still can’t figure out what it does
- esafak 8 hours ago
  It's an LLM for coding.
- yinuoli 7 hours ago
  It's a neural network model, and it could generate text following a given text.
- prmph 8 hours ago
  You are not alone
esafak 7 hours ago
> It exhibits consistent and stable results in tools such as Claude Code, Droid (Factory AI), Cline, Kilo Code, Roo Code, and BlackBox, while providing reliable support for Context Management mechanisms including Skill.md, Claude.md/agent.md/cursorrule, and Slash Commands.
One of the demos shows them using Claude Code, which is interesting. And the next sections are titled 'Digital Employee' and 'End-to-End Office Automation'. Their ambitions obviously go beyond coding. A sign of things to come...
- jimmydoe 6 hours ago
  They are going to IPO on HKEX in a few weeks. Some hype is necessary; it's not too far-fetched imo, and pretty much the same as the Anthropic playbook.
  - tw1984 5 hours ago
    The Anthropic playbook does include the false claim, publicly made by its CEO, that "in six months AI would be writing 90 percent of code". He made that claim 10 months ago. Intentionally misleading investors is a criminal offence in many countries.
    MiniMax is like 100x more honest.
stpedgwdgfhgdd 1 hour ago
Internal Server Error
- 01-_- 1 hour ago
  me too
integricho 3 hours ago
Their site crashes my phone browser while scrolling. Is that the expected quality of output of their product?
- Tepix 2 hours ago
  Should a website be able to crash a browser?
- jedisct1 1 hour ago
  If a website can crash your browser, the problem is your browser...
sosodev 6 hours ago
I’ve spent a little bit of time testing MiniMax M2. It’s quite good given the small size, but it did make some odd mistakes and struggled with precise instructions.
- viraptor 2 hours ago
  This is an announcement for M2.1, not M2. It got a decent bump in agent capabilities.
jdright 9 hours ago
https://www.minimax.io/news/minimax-m21
mr_o47 8 hours ago
I won't say it's on the same level as the Claude models, but it's definitely good at coming up with frontend designs.
Invictus0 7 hours ago
How is everyone monitoring the skill/utility of all these different models? I am overwhelmed by how many there are, and by the challenge of monitoring their capability across so many different modalities.
- redman25 6 hours ago
  https://www.swebench.com
  https://swe-rebench.com
  https://livebench.ai/#/
  https://eqbench.com/#
  https://contextarena.ai/?needles=8
  https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
  https://artificialanalysis.ai/leaderboards/models
  https://gorilla.cs.berkeley.edu/leaderboard.html
  https://github.com/lechmazur/confabulations
  https://dubesor.de/benchtable
  https://help.kagi.com/kagi/ai/llm-benchmark.html
  https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
- spoaceman7777 5 hours ago
  This is the best summary, in my opinion. You can also see the individual scores on the benchmarks they use to compute their overall scores.
  It's nice and simple in the overview mode though. Breaks it down into an intelligence ranking, a coding ranking, and an agentic ranking.
  https://artificialanalysis.ai/
boredemployee 6 hours ago
Internal Server Error
p-e-w 9 hours ago
One of the cited reviews goes:
“We're excited for powerful open-source models like M2.1 […]”
Yet as far as I can tell, this model isn’t open at all. Not even open weights, never mind open source.
- NitpickLawyer 2 hours ago
  Repo made public a few minutes ago:
  https://huggingface.co/MiniMaxAI/MiniMax-M2.1
- viraptor 8 hours ago
  It's scheduled for release. They jumped the gun with the news. But as far as we know, it's still coming out, just like M2.
  - p-e-w 8 hours ago
    I don’t get it. What’s the holdup? Uploading a model to Hugging Face isn’t exactly difficult.
- bearjaws 8 hours ago
  Yeah, I don't see any way to download this; Ollama has it as cloud only.
monster_truck 8 hours ago
That they are still training models against Objective-C is all the proof you need that it will outlive Swift.
When is someone going to vibe code Objective-C 3.0? Borrowing all of the actual good things that have happened since 2.0 is closer than you'd think thanks to LLVM and friends.
- viraptor 8 hours ago
  Why would they not? Existing Objective-C apps will still need updates and various work. Models are still trained on assembly for architectures that don't meaningfully exist today as well.