MiniMax-M2.5 Released: Open-Source Model Matching Opus 4.6 at 10% of the Cost
MiniMax releases M2.5, an open-source 229B parameter model achieving 80.2% on SWE-Bench Verified with 10-20x cost reduction compared to Opus 4.6, Gemini 3 Pro, and GPT-5.
MiniMax released MiniMax-M2.5 on February 14, 2026, an open-source large language model designed for coding, agentic workflows, and office tasks. The 229-billion-parameter model achieves performance comparable to Claude Opus 4.6 while costing one-tenth to one-twentieth as much, according to the company’s official announcement on Hugging Face.
Performance Benchmarks
MiniMax-M2.5 demonstrates state-of-the-art performance across multiple benchmarks:
Coding:
- SWE-Bench Verified: 80.2%
- Multi-SWE-Bench: 51.3%
- SWE-Bench evaluation on Droid scaffolding: 79.7% (vs. 78.9% for Opus 4.6)
- SWE-Bench evaluation on OpenCode scaffolding: 76.1% (vs. 75.9% for Opus 4.6)
Search and Tool Calling:
- BrowseComp (with context management): 76.3%
- Achieves these results with approximately 20% fewer search rounds than M2.1
Other Benchmarks:
- AIME25: 86.3%
- GPQA-D: 85.2%
- SciCode: 44.4%
- IFBench: 70.0%
The model was trained on more than ten programming languages, including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby, across more than 200,000 real-world environments.
Speed Improvements
MiniMax-M2.5 completes SWE-Bench Verified evaluations 37% faster than M2.1, matching the speed of Claude Opus 4.6. The end-to-end runtime decreased from an average of 31.3 minutes (M2.1) to 22.8 minutes (M2.5), on par with Opus 4.6’s 22.9 minutes.
The model is served at two throughput levels:
- M2.5-Lightning: 100 tokens per second (roughly twice the output speed of comparable frontier models, according to MiniMax)
- M2.5: 50 tokens per second
Cost Efficiency
MiniMax positions M2.5 as “intelligence too cheap to meter.” Pricing is as follows:
M2.5-Lightning (100 TPS):
- Input: $0.30 per million tokens
- Output: $2.40 per million tokens
M2.5 (50 TPS):
- Input: $0.15 per million tokens
- Output: $1.20 per million tokens
According to MiniMax’s calculations, running M2.5-Lightning continuously for one hour at 100 tokens per second costs $1. At 50 tokens per second, the cost drops to $0.30 per hour. The company states that four M2.5 instances can run continuously for an entire year for $10,000.
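MiniMax's round numbers can be sanity-checked from the published output pricing alone. The helper below is illustrative, not an official calculator; it ignores input-token costs, which appear to account for the small gap between the output-only figures and MiniMax's quoted totals:

```python
# Sanity-check MiniMax's cost claims from published output pricing.
# Illustrative only: ignores input-token costs, which MiniMax's
# round numbers ("about $1/hour", "$10,000/year") appear to absorb.

SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365

def hourly_output_cost(tokens_per_second: float, price_per_million: float) -> float:
    """USD cost of one hour of continuous generation at a given speed."""
    tokens = tokens_per_second * SECONDS_PER_HOUR
    return tokens / 1_000_000 * price_per_million

lightning = hourly_output_cost(100, 2.40)   # 0.864 USD/hour -> "about $1"
standard = hourly_output_cost(50, 1.20)     # 0.216 USD/hour -> "about $0.30" with input
four_instances_year = 4 * standard * HOURS_PER_YEAR  # 7,568.64 USD output-only

print(f"Lightning: ${lightning:.3f}/h, M2.5: ${standard:.3f}/h, "
      f"4x M2.5 for a year: ${four_instances_year:,.0f}")
```

Output tokens alone land within range of every quoted figure, so the claims are at least internally consistent.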
Based on output pricing, M2.5 costs one-tenth to one-twentieth as much as Claude Opus 4.6, Google Gemini 3 Pro, and OpenAI GPT-5.
Real-World Deployment at MiniMax
MiniMax reports that M2.5 autonomously completes 30% of the company’s daily tasks across functions including R&D, product, sales, HR, and finance. In coding scenarios specifically, M2.5-generated code accounts for 80% of newly committed code.
The company has deployed M2.5 in its MiniMax Agent platform, where users have built over 10,000 “Experts” (reusable task templates) combining domain expertise with standardized “Office Skills” for Word, PowerPoint, and Excel tasks.
Open-Source Availability
Model weights are available on Hugging Face: https://huggingface.co/MiniMaxAI/MiniMax-M2.5
GitHub repository: https://github.com/MiniMax-AI
Recommended inference frameworks:
- SGLang
- vLLM
- Transformers
- KTransformers
- ModelScope (for users in China)
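As a rough sketch, serving the weights with vLLM or SGLang typically looks like the following. The model id comes from the Hugging Face repository; every other flag is an assumption to be checked against the repository's deployment guide and your hardware:

```shell
# Serve MiniMax-M2.5 with vLLM (OpenAI-compatible server, port 8000 by default).
# --tensor-parallel-size is an assumed value; a 229B model needs multiple GPUs.
vllm serve MiniMaxAI/MiniMax-M2.5 --tensor-parallel-size 8

# Or with SGLang (flags likewise assumed):
python -m sglang.launch_server \
  --model-path MiniMaxAI/MiniMax-M2.5 \
  --tp 8 --port 30000
```

Both servers expose an OpenAI-compatible HTTP API, so existing client code usually only needs a base-URL change.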
Inference parameters:
- Temperature: 1.0
- Top-p: 0.95
- Top-k: 40
Default system prompt:
“You are a helpful assistant. Your name is MiniMax-M2.5 and is built by MiniMax.”
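Putting the recommended parameters and default system prompt together, a chat request might look like the sketch below. The model id and the OpenAI-compatible request schema are assumptions (vLLM and SGLang both expose such an API); verify field names against the platform docs before relying on them:

```python
# Sketch of a chat request using MiniMax's published sampling defaults.
# The model id and OpenAI-compatible schema are assumptions; the system
# prompt and sampling values are quoted from the release notes.
import json

DEFAULT_SYSTEM_PROMPT = (
    "You are a helpful assistant. "
    "Your name is MiniMax-M2.5 and is built by MiniMax."
)

def build_request(user_message: str) -> dict:
    """Assemble a request body with the recommended inference parameters."""
    return {
        "model": "MiniMax-M2.5",  # assumed model id
        "messages": [
            {"role": "system", "content": DEFAULT_SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        # Recommended inference parameters from the release notes:
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 40,  # not part of the OpenAI schema; vLLM/SGLang accept it
    }

payload = build_request("Write a binary search in Go.")
print(json.dumps(payload, indent=2))
```

Note that `top_k` is a server-side extension rather than a standard OpenAI field, so clients pinned to the strict schema may need to pass it separately.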
Technical Background: Reinforcement Learning Scaling
MiniMax attributes M2.5’s improvements to large-scale reinforcement learning. The model was trained across hundreds of thousands of real-world environments derived from tasks performed at the company.
Forge Framework: The company developed an in-house agent-native RL framework called “Forge,” which decouples the training-inference engine from the agent layer, enabling optimization across multiple agent scaffolds. A tree-structured merging strategy for training samples achieved approximately 40x training speedup.
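MiniMax has not published Forge's internals, but the intuition behind tree-structured sample merging can be sketched. Agent rollouts from the same task share long prefixes (system prompt, task setup, early tool calls); storing samples as a prefix tree means each shared token is processed once rather than once per rollout. The toy trie below is purely illustrative, not Forge's actual implementation:

```python
# Illustrative only -- not Forge's actual implementation.
# Counts how many tokens a prefix-sharing trainer would process,
# versus processing every rollout independently.

def trie_token_count(rollouts: list[list[str]]) -> int:
    """Number of unique (prefix, token) edges across all rollouts."""
    root: dict = {}
    edges = 0
    for rollout in rollouts:
        node = root
        for tok in rollout:
            if tok not in node:
                node[tok] = {}
                edges += 1
            node = node[tok]
    return edges

# Three rollouts sharing a three-token prefix:
rollouts = [
    ["sys", "task", "plan", "edit", "test", "pass"],
    ["sys", "task", "plan", "edit", "revert"],
    ["sys", "task", "plan", "search", "edit"],
]
flat = sum(len(r) for r in rollouts)   # 16 tokens processed naively
merged = trie_token_count(rollouts)    # 9 unique edges in the trie
print(flat, merged)
```

With real agent traces, where dozens of rollouts can share thousands of prefix tokens, the savings compound; the claimed ~40x speedup presumably also reflects engine-level optimizations beyond this deduplication.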
Algorithm: MiniMax continued using the CISPO algorithm introduced in early 2025 to ensure stability of MoE (Mixture of Experts) models during large-scale training. A process reward mechanism was introduced to address credit assignment challenges in long-context agent rollouts.
Model Progression
Over three and a half months from late October 2025 to February 2026, MiniMax released three models:
- M2 (late October 2025)
- M2.1 (updated February 13, 2026)
- M2.5 (February 14, 2026)
According to MiniMax, the rate of progress on SWE-Bench Verified has been significantly faster than Claude, GPT, and Gemini model families over the same period.
Commercial Availability
In addition to open-source deployment, MiniMax offers M2.5 through:
- MiniMax Agent: https://agent.minimax.io/
- MiniMax API Platform: https://platform.minimax.io/
- MiniMax Coding Plan: https://platform.minimax.io/subscribe/coding-plan
Security and Trust Assessment
A security review of the MiniMax-AI GitHub organization and repositories was conducted on February 15, 2026.
Organization Verification:
- Organization ID: 194880281 (created January 14, 2025)
- Official website: https://www.minimax.io
- Official contact: model@minimax.io
- Twitter: @MiniMax_AI
- GitHub followers: 4,358
Repository Trust Indicators:
- MiniMax-M2.5: 6.09k stars, 519 forks (updated within a day of this review)
- MiniMax-M2.1: 86.7k stars, 1.27k forks
- MiniMax-M2: 450k stars, 1.48k forks
- Mini-Agent: 1.6k stars, 232 forks
These engagement levels suggest active maintenance and broad community scrutiny.
License: Modified MIT License requiring commercial users to display “MiniMax M2.5” on product interfaces. Standard open-source license with minimal additional restrictions.
Code Safety Review:
- MiniMax-M2.5 repository contains documentation and deployment guides only; no executable code
- Model weights hosted on Hugging Face (external platform)
- Mini-Agent repository reviewed: standard dependencies (pydantic, openai, anthropic, httpx)
- No malicious code patterns detected (eval, exec, import abuse)
- bash_tool.py implements shell command execution (standard for AI agent tools)
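The "no malicious patterns" check described above can be approximated with a naive pattern scan like the one below. This is a simplistic illustration of the kind of check involved, not the actual review methodology; substring matching produces false positives, and a serious audit uses AST parsing and human review:

```python
# Naive approximation of the pattern scan described above.
# Illustrative only: a real audit parses the AST and reads the code.
import re

SUSPICIOUS = [
    r"\beval\s*\(",
    r"\bexec\s*\(",
    r"__import__\s*\(",
    r"base64\.b64decode\s*\(",
]

def scan_source(source: str) -> list[str]:
    """Return the suspicious patterns found in a source string."""
    return [p for p in SUSPICIOUS if re.search(p, source)]

clean = "import httpx\nresp = httpx.get(url)\n"
shady = "exec(base64.b64decode(blob))\n"
print(scan_source(clean))  # no matches
print(scan_source(shady))  # matches exec( and base64.b64decode(
```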
Recommendations for Safe Usage:
- Avoid executing shell commands with untrusted inputs
- Use firewall/sandbox environment for initial local deployment
- Manage API keys via environment variables (not hardcoded)
- Review code before deployment in production environments
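The API-key recommendation above amounts to a few lines of code. The variable name `MINIMAX_API_KEY` is a convention assumed here, not something the repositories mandate:

```python
# Load the API key from the environment instead of hardcoding it.
# The variable name MINIMAX_API_KEY is an assumed convention; use
# whatever name your deployment or secret manager standardizes on.
import os

def get_api_key(var: str = "MINIMAX_API_KEY") -> str:
    """Fetch the API key from the environment, failing loudly if unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(
            f"{var} is not set; export it in your shell or secret manager "
            "rather than committing the key to source control."
        )
    return key
```

Failing loudly at startup is deliberate: a missing key should stop deployment, not silently fall back to an embedded credential.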
Based on this assessment, the MiniMax-AI organization and repositories appear legitimate with no evidence of malicious code or backdoors.
References
- Hugging Face Model Repository: https://huggingface.co/MiniMaxAI/MiniMax-M2.5
- GitHub Organization: https://github.com/MiniMax-AI
- X Announcement by Akshay (@akshay_pachaar): https://x.com/akshay_pachaar/status/2022574708051583120
Disclaimer: Information in this article is based on publicly available data as of February 15, 2026. Model performance, pricing, and availability are subject to change. Please refer to official sources for the latest information.