MiniMax-M2.5 Released: Open-Source Model Matching Opus 4.6 at 10% of the Cost
MiniMax releases M2.5, an open-source 229B parameter model achieving 80.2% on SWE-Bench Verified with 10-20x cost reduction compared to Opus 4.6, Gemini 3 Pro, and GPT-5.
On February 14, 2026, MiniMax released MiniMax-M2.5, an open-source large language model designed for coding, agentic workflows, and office tasks. The 229-billion-parameter model achieves performance comparable to Claude Opus 4.6 while costing one-tenth to one-twentieth as much, according to the company’s official announcement on Hugging Face.
Performance Benchmarks
MiniMax-M2.5 demonstrates state-of-the-art performance across multiple benchmarks:
Coding:
- SWE-Bench Verified: 80.2%
- Multi-SWE-Bench: 51.3%
- SWE-Bench evaluation on Droid scaffolding: 79.7% (vs. 78.9% for Opus 4.6)
- SWE-Bench evaluation on OpenCode scaffolding: 76.1% (vs. 75.9% for Opus 4.6)
Search and Tool Calling:
- BrowseComp (with context management): 76.3%
- Achieves results with approximately 20% fewer search rounds compared to M2.1
Other Benchmarks:
- AIME25: 86.3%
- GPQA-D: 85.2%
- SciCode: 44.4%
- IFBench: 70.0%
The model was trained on more than 10 programming languages, including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby, across more than 200,000 real-world environments.
Speed Improvements
MiniMax-M2.5 completes SWE-Bench Verified evaluations 37% faster than M2.1, matching the speed of Claude Opus 4.6. The end-to-end runtime decreased from an average of 31.3 minutes (M2.1) to 22.8 minutes (M2.5), on par with Opus 4.6’s 22.9 minutes.
The model is served at two throughput levels:
- M2.5-Lightning: 100 tokens per second (which MiniMax says is about twice the speed of other frontier models)
- M2.5: 50 tokens per second
Cost Efficiency
MiniMax positions M2.5 as “intelligence too cheap to meter.” Pricing is as follows:
M2.5-Lightning (100 TPS):
- Input: $0.30 per million tokens
- Output: $2.40 per million tokens
M2.5 (50 TPS):
- Input: $0.15 per million tokens
- Output: $1.20 per million tokens
According to MiniMax’s calculations, running M2.5-Lightning continuously for one hour at 100 tokens per second costs $1. At 50 tokens per second, the cost drops to $0.30 per hour. The company states that four M2.5 instances can run continuously for an entire year for $10,000.
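These figures can be checked with simple arithmetic. The sketch below uses only the output prices listed above and ignores input-token costs, so it is an approximation of MiniMax's claims rather than an exact reproduction:

```python
# Rough sanity check of the cost claims above, using only the quoted output
# prices; input-token costs are ignored, so these are approximate lower bounds.

SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365

def hourly_output_cost(tokens_per_second: float, usd_per_million_tokens: float) -> float:
    """USD cost of generating output continuously for one hour."""
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return tokens_per_hour / 1_000_000 * usd_per_million_tokens

lightning_hourly = hourly_output_cost(100, 2.40)  # ~$0.86/h, close to the quoted ~$1/h
standard_hourly = hourly_output_cost(50, 1.20)    # ~$0.22/h, close to the quoted ~$0.30/h

# Four always-on M2.5 instances for a year, counting output tokens only:
four_instances_year = 4 * standard_hourly * HOURS_PER_YEAR  # ~$7,600
# MiniMax's ~$10,000 figure is consistent once input tokens and the rounded
# $0.30/hour rate are included: 4 * 0.30 * 8,760 ≈ $10,500.

print(f"{lightning_hourly:.2f} {standard_hourly:.2f} {four_instances_year:,.0f}")
```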
Based on output pricing, M2.5 costs one-tenth to one-twentieth as much as Claude Opus 4.6, Google Gemini 3 Pro, and OpenAI GPT-5.
Real-World Deployment at MiniMax
MiniMax reports that M2.5 autonomously completes 30% of the company’s daily tasks across functions including R&D, product, sales, HR, and finance. In coding scenarios specifically, M2.5-generated code accounts for 80% of newly committed code.
The company has deployed M2.5 in its MiniMax Agent platform, where users have built over 10,000 “Experts” (reusable task templates) combining domain expertise with standardized “Office Skills” for Word, PowerPoint, and Excel tasks.
Open-Source Availability
Model weights are available on Hugging Face: https://huggingface.co/MiniMaxAI/MiniMax-M2.5
GitHub repository: https://github.com/MiniMax-AI
Recommended inference frameworks (listed alphabetically):
- KTransformers
- ModelScope (for users in China)
- SGLang
- Transformers
- vLLM
Inference parameters:
- Temperature: 1.0
- Top-p: 0.95
- Top-k: 40
Default system prompt:
“You are a helpful assistant. Your name is MiniMax-M2.5 and is built by MiniMax.”
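For readers deploying the open weights, the sketch below shows one way to apply these sampling parameters and the default system prompt with Hugging Face Transformers, one of the recommended frameworks. The loading arguments (device mapping, trust_remote_code) are assumptions that depend on the released checkpoint, and a 229B-parameter model requires multi-GPU hardware or an offloading strategy, so treat this as illustrative rather than a verified recipe:

```python
# Illustrative only: applies the published sampling parameters and default
# system prompt via Hugging Face Transformers. Adjust dtype, device_map, and
# trust_remote_code to match the actual checkpoint and your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "MiniMaxAI/MiniMax-M2.5"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", trust_remote_code=True
)

messages = [
    {"role": "system",
     "content": "You are a helpful assistant. Your name is MiniMax-M2.5 and is built by MiniMax."},
    {"role": "user", "content": "Write a function that reverses a linked list."},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended inference parameters from the model card above.
outputs = model.generate(
    inputs,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
    top_k=40,
    max_new_tokens=512,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```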
Technical Background: Reinforcement Learning Scaling
MiniMax attributes M2.5’s improvements to large-scale reinforcement learning. The model was trained across hundreds of thousands of real-world environments derived from tasks performed at the company.
Forge Framework: The company developed an in-house agent-native RL framework called “Forge,” which decouples the training-inference engine from the agent layer, enabling optimization across multiple agent scaffolds. A tree-structured merging strategy for training samples achieved approximately 40x training speedup.
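The announcement does not describe how the tree-structured merging works internally. One plausible reading, sketched below purely as an assumption and not as a description of Forge, is that agent trajectories sharing a common prefix (system prompt, early tool calls) are merged into a prefix tree so shared segments are stored and processed once:

```python
# Illustrative assumption only: merge trajectories that share a token prefix
# into a trie so each shared prefix appears once. This is NOT Forge's actual
# scheme, which MiniMax has not detailed publicly.
from collections import defaultdict


class TrieNode:
    def __init__(self):
        self.children = defaultdict(TrieNode)
        self.count = 0  # number of trajectories passing through this node


def merge_trajectories(trajectories):
    """Build a prefix tree over tokenized trajectories."""
    root = TrieNode()
    for traj in trajectories:
        node = root
        for token in traj:
            node = node.children[token]
            node.count += 1
    return root


def node_count(node):
    """Number of distinct (non-root) nodes stored after prefix sharing."""
    return sum(1 + node_count(child) for child in node.children.values())


trajs = [
    ["sys", "user", "tool_a", "ok"],
    ["sys", "user", "tool_a", "fail", "retry"],
    ["sys", "user", "tool_b", "ok"],
]
root = merge_trajectories(trajs)
print(sum(len(t) for t in trajs), node_count(root))  # 13 raw tokens vs. 8 shared nodes
```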
Algorithm: MiniMax continued using the CISPO algorithm introduced in early 2025 to ensure stability of MoE (Mixture of Experts) models during large-scale training. A process reward mechanism was introduced to address credit assignment challenges in long-context agent rollouts.
Model Progression
Over three and a half months from late October 2025 to February 2026, MiniMax released three models:
- M2 (late October 2025)
- M2.1 (updated February 13, 2026)
- M2.5 (February 14, 2026)
According to MiniMax, its rate of progress on SWE-Bench Verified has been significantly faster than that of the Claude, GPT, and Gemini model families over the same period.
Commercial Availability
In addition to open-source deployment, MiniMax offers M2.5 through:
- MiniMax Agent: https://agent.minimax.io/
- MiniMax API Platform: https://platform.minimax.io/
- MiniMax Coding Plan: https://platform.minimax.io/subscribe/coding-plan
Security and Trust Assessment
A security review of the MiniMax-AI GitHub organization and repositories was conducted on February 15, 2026.
Organization Verification:
- Organization ID: 194880281 (created January 14, 2025)
- Official website: https://www.minimax.io
- Official contact: model@minimax.io
- Twitter: @MiniMax_AI
- GitHub followers: 4,358
Repository Trust Indicators:
- MiniMax-M2.5: 6.09k stars, 519 forks (updated within a day of the review)
- MiniMax-M2.1: 86.7k stars, 1.27k forks
- MiniMax-M2: 450k stars, 1.48k forks
- Mini-Agent: 1.6k stars, 232 forks
These engagement levels suggest active maintenance and ongoing community scrutiny.
License: a modified MIT License that requires commercial users to display “MiniMax M2.5” on the product interface; otherwise a standard permissive license with minimal additional restrictions.
Code Safety Review:
- MiniMax-M2.5 repository contains documentation and deployment guides only; no executable code
- Model weights hosted on Hugging Face (external platform)
- Mini-Agent repository reviewed: standard dependencies (pydantic, openai, anthropic, httpx)
- No malicious code patterns detected (no suspicious use of eval, exec, or dynamic import manipulation)
- bash_tool.py implements shell command execution (standard for AI agent tools)
Recommendations for Safe Usage:
- Avoid executing shell commands with untrusted inputs
- Use firewall/sandbox environment for initial local deployment
- Manage API keys via environment variables rather than hardcoding them (see the sketch after this list)
- Review code before deployment in production environments
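For the API-key item in particular, a minimal sketch; the MINIMAX_API_KEY variable name is illustrative, not an official MiniMax convention:

```python
# Read the API key from the environment instead of hardcoding it in source.
# The variable name MINIMAX_API_KEY is illustrative, not an official convention.
import os

api_key = os.environ.get("MINIMAX_API_KEY")
if not api_key:
    raise RuntimeError("Set MINIMAX_API_KEY before running this script.")

# Pass the key to whichever client library you use (for example, an
# OpenAI-compatible SDK pointed at the MiniMax API platform) rather than
# committing it to version control.
```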
Based on this assessment, the MiniMax-AI organization and repositories appear legitimate with no evidence of malicious code or backdoors.
References
- Hugging Face Model Repository: https://huggingface.co/MiniMaxAI/MiniMax-M2.5
- GitHub Organization: https://github.com/MiniMax-AI
- X Announcement by Akshay (@akshay_pachaar): https://x.com/akshay_pachaar/status/2022574708051583120
Disclaimer: Information in this article is based on publicly available data as of February 15, 2026. Model performance, pricing, and availability are subject to change. Please refer to official sources for the latest information.