MiniMax-M2: Revolutionizing Agentic Tool Calling with Open Source LLMs

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

Watch out, DeepSeek and Qwen! There’s a new king of open source large language models (LLMs), especially when it comes to something enterprises are increasingly valuing: agentic tool use — that is, the ability to go off and use other software capabilities like web search or bespoke applications — without much human guidance.

That model is none other than MiniMax-M2, the latest LLM from the Chinese startup of the same name. And in a big win for enterprises globally, the model is available under a permissive, enterprise-friendly MIT License, meaning developers are free to take, deploy, retrain, and use it however they see fit, even for commercial purposes. It can be found on Hugging Face, GitHub and ModelScope, as well as through MiniMax's own API. It also supports the OpenAI and Anthropic API standards, making it easy for customers of those proprietary AI providers to switch their applications over to MiniMax's API if they want.
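Because M2 advertises OpenAI API compatibility, existing integrations should be able to reuse the familiar chat-completions request shape. The sketch below builds such a payload with only the standard library; the base URL and model identifier are assumptions for illustration, so check MiniMax's documentation for the actual values before use.

```python
import json

# Hypothetical endpoint and model name -- confirm against MiniMax's docs.
BASE_URL = "https://api.minimax.io/v1"  # assumption, not verified
MODEL = "MiniMax-M2"                    # assumption, not verified

def build_chat_request(user_prompt: str) -> dict:
    """Build an OpenAI-style /chat/completions payload.

    Since M2 supports the OpenAI API standard, the same JSON shape used
    with OpenAI's endpoint should work when POSTed to MiniMax's API.
    """
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.7,
    }

payload = build_chat_request("Summarize this changelog in three bullets.")
print(json.dumps(payload, indent=2))
```

With an OpenAI-compatible SDK, pointing the client's `base_url` at MiniMax's endpoint would let existing application code run largely unchanged.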

According to independent evaluations by Artificial Analysis, a third-party generative AI model benchmarking and research organization, M2 now ranks first among all open-weight systems worldwide on the Intelligence Index—a composite measure of reasoning, coding, and task-execution performance.

In agentic benchmarks that measure how well a model can plan, execute, and use external tools—skills that power coding assistants and autonomous agents—MiniMax’s own reported results, following the Artificial Analysis methodology, show τ²-Bench 77.2, BrowseComp 44.0, and FinSearchComp-global 65.5.

These scores place it at or near the level of top proprietary systems like GPT-5 (thinking) and Claude Sonnet 4.5, making MiniMax-M2 the highest-performing open model yet released for real-world agentic and tool-calling tasks.

What It Means For Enterprises and the AI Race

Built around an efficient Mixture-of-Experts (MoE) architecture, MiniMax-M2 delivers high-end capability for agentic and developer workflows while remaining practical for enterprise deployment.

For technical decision-makers, the release marks an important turning point for open models in business settings. MiniMax-M2 combines frontier-level reasoning with a manageable activation footprint—just 10 billion active parameters out of 230 billion total.

This design enables enterprises to operate advanced reasoning and automation workloads on fewer GPUs, achieving near-state-of-the-art results without the infrastructure demands or licensing costs associated with proprietary frontier systems.
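The sparse-activation idea behind this efficiency can be made concrete with a toy Mixture-of-Experts router. The sketch below is illustrative only, not MiniMax's actual routing code: it scores experts, keeps the top-k, and renormalizes their weights, so per-token compute scales with k rather than with the total expert count.

```python
import math

def top_k_gate(logits, k=2):
    """Toy MoE router: softmax over expert logits, keep the top-k experts.

    Only the selected experts run for a given token, so compute scales
    with k, not with the total number of experts -- the same principle
    that lets M2 activate ~10B of its 230B parameters.
    """
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: -probs[i])[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}  # renormalized expert weights

# Four experts, two selected: only experts 1 and 3 would execute.
weights = top_k_gate([0.1, 2.0, -1.0, 1.5], k=2)
```

Real routers operate on learned gating networks per layer, but the cost profile is the same: the unselected experts contribute parameters to the total count without adding inference-time compute.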

Artificial Analysis’ data show that MiniMax-M2’s strengths go beyond raw intelligence scores. The model leads or closely trails top proprietary systems such as GPT-5 (thinking) and Claude Sonnet 4.5 across benchmarks for end-to-end coding, reasoning, and agentic tool use.

Its performance in τ²-Bench, SWE-Bench, and BrowseComp indicates particular advantages for organizations that depend on AI systems capable of planning, executing, and verifying complex workflows—key functions for agentic and developer tools inside enterprise environments.

As LLM engineer Pierre-Carl Langlais aka Alexander Doria posted on X: “MiniMax [is] making a case for mastering the technology end-to-end to get actual agentic automation.”

Compact Design, Scalable Performance

MiniMax-M2’s technical architecture is a sparse Mixture-of-Experts model with 230 billion total parameters, of which roughly 10 billion are activated per inference step.

This configuration significantly reduces latency and compute requirements while maintaining broad general intelligence.

The design allows for responsive agent loops—compile–run–test or browse–retrieve–cite cycles—that execute faster and more predictably than denser models.

For enterprise technology teams, this means easier scaling, lower cloud costs, and reduced deployment friction. According to Artificial Analysis, the model can be served efficiently on as few as four NVIDIA H100 GPUs at FP8 precision, a setup well within reach for mid-size organizations or departmental AI clusters.
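A rough back-of-envelope calculation shows why four H100s suffice. Assuming one byte per weight at FP8 and 80 GB of HBM per H100 (and ignoring KV cache, activations, and framework overhead, which consume part of the remainder):

```python
# Back-of-envelope memory estimate for serving M2 at FP8 on 4x H100.
TOTAL_PARAMS = 230e9      # total parameters, per the article
BYTES_PER_PARAM = 1       # FP8 stores one byte per weight
H100_MEM_GB = 80          # HBM per H100 (SXM variant)

weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9   # weight storage in GB
cluster_gb = 4 * H100_MEM_GB                        # total HBM across 4 GPUs
headroom_gb = cluster_gb - weights_gb               # left for KV cache etc.

print(f"weights: {weights_gb:.0f} GB, cluster: {cluster_gb} GB, "
      f"headroom: {headroom_gb:.0f} GB")
```

About 230 GB of weights fit within the 320 GB of a four-GPU node, leaving roughly 90 GB for KV cache and runtime overhead; a denser 230B model at FP16 would need twice the weight memory and a larger cluster.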

Benchmark Leadership Across Agentic and Coding Workflows

MiniMax’s benchmark suite highlights strong real-world performance across developer and agent environments. The figure below, released with the model, compares MiniMax-M2 (in red) with several leading proprietary and open models, including GPT-5 (thinking), Claude Sonnet 4.5, Gemini 2.5 Pro, and DeepSeek-V3.2.

MiniMax-M2 achieves top or near-top performance in many categories:

  • SWE-bench Verified: 69.4 — close to GPT-5’s 74.9

  • ArtifactsBench: 66.8 — above Claude Sonnet 4.5 and DeepSeek-V3.2

  • τ²-Bench: 77.2 — approaching GPT-5’s 80.1

  • GAIA (text only): 75.7 — surpassing DeepSeek-V3.2

  • BrowseComp: 44.0 — notably stronger than other open models

  • FinSearchComp-global: 65.5 — best among tested open-weight systems

These results show MiniMax-M2’s capability in executing complex, tool-augmented tasks across multiple languages and environments—skills increasingly relevant for automated support, R&D, and data analysis inside enterprises.

Strong Showing in Artificial Analysis’ Intelligence Index

The model’s overall intelligence profile is confirmed in the latest Artificial Analysis Intelligence Index v3.0, which aggregates performance across ten reasoning benchmarks including MMLU-Pro, GPQA Diamond, AIME 2025, IFBench, and τ²-Bench Telecom.

MiniMax-M2 scored 61 points, ranking as the highest open-weight model globally and following closely behind GPT-5 (high) and Grok 4.

Artificial Analysis highlighted the model’s balance between technical accuracy, reasoning depth, and applied intelligence across domains. For enterprise users, this consistency indicates a reliable model foundation suitable for integration into software engineering, customer support, or knowledge automation systems.

Designed for Developers and Agentic Systems

MiniMax engineered M2 for end-to-end developer workflows, enabling multi-file code edits, automated testing, and regression repair directly within integrated development environments or CI/CD pipelines.

The model also excels in agentic planning—handling tasks that combine web search, command execution, and API calls while maintaining reasoning traceability.

These capabilities make MiniMax-M2 especially valuable for enterprises exploring autonomous developer agents, data analysis assistants, or AI-augmented operational tools.

Benchmarks such as Terminal-Bench and BrowseComp demonstrate the model’s ability to adapt to incomplete data and recover gracefully from intermediate errors, improving reliability in production settings.

Interleaved Thinking and Structured Tool Use

A distinctive aspect of MiniMax-M2 is its interleaved thinking format, which maintains visible reasoning traces between <think>…</think> tags.

This enables the model to plan and verify steps across multiple exchanges, a critical feature for agentic reasoning. MiniMax advises retaining these segments when passing conversation history to preserve the model’s logic and continuity.

The company also provides a Tool Calling Guide on Hugging Face, detailing how developers can connect external tools and APIs via structured XML-style calls.
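MiniMax's advice to retain the reasoning segments can be sketched as a small history-management helper. The message shapes below are illustrative, not taken from MiniMax's guide: the point is simply that the `<think>…</think>` trace stays inline in the assistant turn rather than being stripped before the next request.

```python
def append_assistant_turn(history, thinking, answer):
    """Append an assistant turn, keeping the <think> trace inline.

    MiniMax advises passing reasoning segments back with the conversation
    history so the model can build on its earlier plan; stripping them
    breaks continuity across agent steps.
    """
    content = f"<think>{thinking}</think>{answer}"
    history.append({"role": "assistant", "content": content})
    return history

history = [{"role": "user", "content": "Find the latest release tag."}]
append_assistant_turn(
    history,
    thinking="I should call the repo-search tool first.",
    answer="Calling repo_search now.",
)
```

On the next turn, the full `history` list (including the think trace) would be sent back to the model, preserving the plan it laid out in the previous step.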

MiniMax-M2 as a Central Reasoning Engine

MiniMax-M2 is designed to act as the central reasoning engine for complex agent frameworks, delegating tasks such as search, retrieval, and computation to external functions.

Open-Source Access and Enterprise Deployment Options

Enterprises can use MiniMax-M2 through the MiniMax Open Platform API and the MiniMax Agent interface, both of which are currently free for a limited time.

For optimal performance, MiniMax recommends serving the model with SGLang or vLLM, both of which provide day-one support for its interleaved reasoning and tool-calling format. Detailed deployment guides and parameter configurations are available in MiniMax’s documentation.

Cost Efficiency and Token Economics

MiniMax’s API pricing stands at $0.30 per million input tokens and $1.20 per million output tokens, positioning it as one of the most cost-effective options in the open-model ecosystem.


Provider | Model (doc link) | Input $/1M | Output $/1M | Notes
MiniMax  | MiniMax-M2       | $0.30      | $1.20       | Listed under “Chat Completion v2” for M2.

Notes & Caveats:

  • Prices are in USD per million tokens and are subject to change. Check the linked pages for updates and region-specific nuances.

  • Additional charges may apply for server-side tools or batch/context-cache discounts offered by vendors.

MiniMax sets itself apart with longer, more detailed reasoning traces while maintaining cost efficiency through sparse activation and optimized compute design, making it ideal for interactive agents and high-volume automation systems.
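At the listed rates, per-request costs are easy to estimate. The helper below applies the table's prices (which, per the caveats above, are subject to change) to a hypothetical agent-loop request:

```python
# Listed MiniMax-M2 API rates (USD); subject to change per the caveats above.
INPUT_PRICE = 0.30 / 1_000_000    # per input token
OUTPUT_PRICE = 1.20 / 1_000_000   # per output token

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one API call at the listed M2 rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# Example: an agent turn sending 50k tokens of context and getting 5k back
# costs about two cents.
cost = request_cost_usd(50_000, 5_000)
```

For high-volume automation, this kind of per-turn arithmetic, multiplied across thousands of agent loops per day, is what makes the input/output price split matter as much as the headline rate.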

MiniMax’s Background: A Rising Force in the Chinese AI Landscape

MiniMax has rapidly gained prominence in China’s burgeoning AI industry, backed by major players like Alibaba and Tencent. The company gained international recognition through breakthroughs in AI video generation and the development of open-weight large language models tailored for developers and enterprises.

In late 2024, MiniMax made waves with its AI video generation tool, “video-01,” showcasing its ability to create lifelike scenes in seconds. The product, later integrated into MiniMax’s Hailuo platform, highlighted China’s prowess in generative video technology.

By early 2025, MiniMax unveiled the MiniMax-01 series, including MiniMax-Text-01 and MiniMax-VL-01, featuring an unprecedented 4-million-token context window. The company’s rapid innovation continued with the release of MiniMax-M1, a model focused on long-context reasoning and reinforcement learning efficiency.

MiniMax’s commitment to open-weight models and flexible licensing has positioned it as a key player in the global AI landscape, offering customizable solutions for real-world deployment.

MiniMax-M2: Setting the Standard for Open-Weight AI Models

The launch of MiniMax-M2 underscores the growing influence of Chinese AI research groups in developing open-weight models designed for practical applications. MiniMax’s approach prioritizes controllable reasoning and real utility, providing enterprises with a transparent and auditable model for internal deployment.

With MiniMax-M2, businesses can access a state-of-the-art open model that combines strong performance benchmarks with efficient scaling, making it an ideal choice for intelligent systems that require traceable logic and seamless integration.
