Meituan Open-Sources LongCat-2.0: The 1.6T Agentic Coding Model That Topped OpenRouter — Built Without Nvidia

Meituan has unveiled LongCat-2.0, a 1.6-trillion-parameter MoE model that spent two months dominating OpenRouter under a stealth identity and was trained entirely on domestic Chinese ASICs. Released under the MIT license with a 1-million-token context window, it edges out GPT-5.5 on SWE-bench Pro (59.5 vs. 58.6) and signals a structural shift in global AI infrastructure.

Chinese food delivery giant Meituan has officially open-sourced LongCat-2.0 on GitHub, Hugging Face, and its native platform — revealing it as the model behind "Owl Alpha," an anonymous stealth system that spent two months at the top of global developer charts on OpenRouter.

The 1.6-trillion-parameter Mixture-of-Experts (MoE) model ships under the permissive MIT license with a native 1-million-token context window, targeting enterprise-scale autonomous software engineering.

The Stealth Run That Made It Famous

Before its official reveal, Owl Alpha had already racked up:

~10.1 trillion monthly tokens on OpenRouter
559 billion tokens/day on average
242% month-over-month volume growth
#1 on Hermes Agent workspace
#2 on Claude Code deployments
#3 across international OpenClaw environments

The model had effectively earned its reputation before Meituan claimed it.

Pricing: Aggressive and Cache-Friendly

LongCat-2.0's commercial API introduces a tiered pricing strategy designed to undercut closed-source alternatives:

Cache hits: completely free
Standard pay-as-you-go: $0.75 / $2.95 per million tokens (input/output)
Limited-time promo: $0.30 / $1.20 per million tokens (input/output)

At promotional rates, LongCat-2.0 is priced alongside MiniMax-M3 and well below GPT-5.5 ($5.00/$30.00) or Claude Opus 4.8 ($5.00/$25.00): its $1.20 output rate is 4% of GPT-5.5's and under 5% of Claude Opus 4.8's.
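To make the tiers concrete, here is a minimal cost estimator built from the rates quoted above. The function name, the workload numbers, and the billing rule (cached input tokens cost nothing, all other input tokens are billed at the normal input rate) are assumptions for illustration; the actual API may meter usage differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens, as quoted in the article."""
    input_per_m: float
    output_per_m: float


STANDARD = Rates(input_per_m=0.75, output_per_m=2.95)
PROMO = Rates(input_per_m=0.30, output_per_m=1.20)


def estimate_cost(rates, input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of one workload.

    Assumption: cached input tokens are billed at $0 and the remaining
    input tokens are billed at the normal input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    billable_input = input_tokens - cached_input_tokens
    return (billable_input * rates.input_per_m
            + output_tokens * rates.output_per_m) / 1_000_000


# Hypothetical agent session: 40M input tokens (30M served from cache), 2M output tokens.
promo_cost = estimate_cost(PROMO, 40_000_000, 2_000_000, cached_input_tokens=30_000_000)
standard_cost = estimate_cost(STANDARD, 40_000_000, 2_000_000, cached_input_tokens=30_000_000)
print(f"promo: ${promo_cost:.2f}, standard: ${standard_cost:.2f}")
```

For this made-up session the script prints `promo: $5.40, standard: $13.40`. Long agentic sessions tend to resend the same repository context, so the share of input served from cache can matter as much as the headline rate.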

Trained Entirely on Chinese Silicon

Perhaps the most consequential aspect of this release: LongCat-2.0 was trained on a cluster of over 50,000 domestic Chinese ASICs — with no Nvidia GPUs involved.

This is direct evidence that near-frontier, trillion-parameter models can be trained on homegrown silicon at scale. If Chinese labs can consistently iterate at this level without U.S. hardware, that poses a meaningful long-term threat to Nvidia's dominance of frontier AI training infrastructure.

The timing amplifies the significance. The U.S. government has recently pressured OpenAI to restrict access to its GPT-5.6 models, and Anthropic previously took its Claude Fable 5 / Mythos 5 models entirely offline following similar directives. Critics argue these moves have inadvertently created a vacuum — one that open-source Chinese models are filling rapidly.

Architecture: Sparse Attention at Million-Token Scale

At the core of LongCat-2.0 is an aggressive MoE sparsity design:

1.6T total parameters, with only ~48B active per token
Dynamic activation range of 33B–56B parameters depending on query complexity
Zero-Compute Experts framework to eliminate idle overhead
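With roughly 48B of 1.6T parameters active, only about 3% of the weights take part in any one token. One way a per-token activation range can arise is if some routed slots go to experts that do no work. The sketch below assumes that "zero-compute" experts simply pass the token through and own no parameters; the article does not spell out Meituan's mechanism, and every size here is a toy value, not LongCat-2.0's real layout.

```python
import random


def top_k_indices(scores, k):
    """Return the indices of the k largest scores, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def route_token(router_scores, num_real_experts, k, params_per_expert):
    """Pick k experts for one token and count the parameters actually used.

    Assumption: experts with index >= num_real_experts are zero-compute
    experts that return the token unchanged and hold no parameters.
    """
    chosen = top_k_indices(router_scores, k)
    real = [i for i in chosen if i < num_real_experts]
    return chosen, len(real) * params_per_expert


# Toy configuration (all numbers hypothetical).
NUM_REAL, NUM_ZERO, TOP_K, PARAMS_PER_EXPERT = 8, 4, 4, 1_000_000

rng = random.Random(0)  # random scores stand in for a learned router
active_counts = []
for _ in range(1000):
    scores = [rng.random() for _ in range(NUM_REAL + NUM_ZERO)]
    _, active = route_token(scores, NUM_REAL, TOP_K, PARAMS_PER_EXPERT)
    active_counts.append(active)

print("min active params:", min(active_counts))
print("max active params:", max(active_counts))
print("mean active params:", sum(active_counts) / len(active_counts))
```

Every token still selects `TOP_K` experts, but the number of parameters touched varies with how many of those slots land on real experts, which is the kind of behavior a dynamic activation range describes.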

To sustain the 1M-token context without hardware bottlenecks, Meituan developed LongCat Sparse Attention (LSA), an evolution of DeepSeek Sparse Attention, built on three orthogonal mechanisms:

Streaming-aware Indexing (SI) — converts fragmented memory access into sequential, HBM-coalesced reads
Cross-Layer Indexing (CLI) — amortizes attention indexing costs across adjacent layers using a single pass
Hierarchical Indexing (HI) — applies a coarse-to-fine two-stage scoring pass to filter and refine token candidates efficiently
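The coarse-to-fine and cross-layer ideas can be illustrated with a toy index selector. This is a sketch under stated assumptions, not Meituan's LSA kernel: blocks are summarized by their mean key, scores are plain dot products, and the function names and parameters (`hierarchical_select`, `select_for_layers`, `share_every`) are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def hierarchical_select(query, keys, block_size, top_blocks, top_tokens):
    """Two-stage (coarse-to-fine) selection of key positions for one query."""
    blocks = [range(s, min(s + block_size, len(keys)))
              for s in range(0, len(keys), block_size)]
    # Coarse pass: one score per block, using the block's mean key (assumed summary).
    block_scores = [dot(query, mean_vector([keys[i] for i in blk])) for blk in blocks]
    kept = sorted(range(len(blocks)), key=lambda b: block_scores[b], reverse=True)[:top_blocks]
    # Fine pass: exact scores only for tokens inside the surviving blocks.
    candidates = [i for b in sorted(kept) for i in blocks[b]]
    ranked = sorted(candidates, key=lambda i: dot(query, keys[i]), reverse=True)
    # Return positions in ascending order.
    return sorted(ranked[:top_tokens])


def select_for_layers(layer_queries, keys, share_every, **kwargs):
    """Compute the index once per group of `share_every` adjacent layers and reuse it."""
    selections, cached = [], None
    for layer, query in enumerate(layer_queries):
        if layer % share_every == 0:
            cached = hierarchical_select(query, keys, **kwargs)
        selections.append(cached)
    return selections


keys = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9],
        [-1, 0], [-0.9, 0], [0.5, 0.5], [0.8, 0.2]]
print(hierarchical_select([1, 0], keys, block_size=2, top_blocks=2, top_tokens=3))
print(select_for_layers([[1, 0], [0, 1], [0, 1], [1, 0]], keys,
                        share_every=2, block_size=2, top_blocks=2, top_tokens=3))
```

The first call prints `[0, 1, 7]`: the coarse pass keeps blocks 0 and 3, and the fine pass scores only their four tokens instead of all eight. In the second call, layers 1 and 3 reuse the indices computed for layers 0 and 2, which is the cost-sharing idea behind cross-layer indexing in its simplest form.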

The model also incorporates an N-gram Embedding module — adding 135 billion parameters across a 5-gram token framework — which expands the embedding space roughly 100-fold and reduces memory I/O bottlenecks during large-batch inference.
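The article does not describe how the N-gram Embedding module is wired, so the following is only one plausible reading: each position adds hashed embeddings of the 2-grams through 5-grams ending at that token to the ordinary token embedding. The table sizes, the hash function, and the summation are all assumptions chosen to keep the sketch small.

```python
import random

VOCAB_SIZE = 1000    # hypothetical
DIM = 4              # hypothetical
NUM_BUCKETS = 5000   # hypothetical size of the hashed n-gram table
MAX_N = 5            # the article mentions a 5-gram framework

rng = random.Random(0)
token_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB_SIZE)]
ngram_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_BUCKETS)]


def ngram_bucket(token_ids):
    """Map a tuple of token IDs to a table row with a polynomial hash (assumed scheme)."""
    h = len(token_ids)
    for t in token_ids:
        h = (h * 1_000_003 + t + 1) % (2**61 - 1)
    return h % NUM_BUCKETS


def embed(sequence):
    """Token embedding plus hashed embeddings of the 2..MAX_N-grams ending at each position."""
    out = []
    for pos, tok in enumerate(sequence):
        vec = list(token_table[tok])
        for n in range(2, MAX_N + 1):
            if pos + 1 >= n:  # enough left context for an n-gram
                gram = tuple(sequence[pos + 1 - n: pos + 1])
                row = ngram_table[ngram_bucket(gram)]
                vec = [v + r for v, r in zip(vec, row)]
        out.append(vec)
    return out


vectors = embed([7, 8, 9, 8])
print(len(vectors), len(vectors[0]))
```

In this reading the extra parameters live in a lookup table rather than in matrix multiplications, so the same token ID gets a different vector depending on its preceding context (token 8 appears twice above with different n-grams).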

Benchmark Performance

LongCat-2.0 is explicitly optimized for agentic, multi-step software engineering tasks rather than conversational fluency.

| Benchmark | LongCat-2.0 | Competitor |
|---|---|---|
| SWE-bench Pro | 59.5 | GPT-5.5: 58.6 |
| Terminal-Bench 2.1 | 70.8 | n/a |
| SWE-bench Multilingual | 77.3 | n/a |
| FORTE (enterprise workflows) | 73.2 | n/a |

These results position LongCat-2.0 as a credible alternative to closed-source leaders for autonomous repository manipulation, tool integration, and long-context engineering pipelines.

Post-Training: The MOPD Framework

Rather than blending human feedback into a single reward function, Meituan's post-training pipeline relies on Multi-Teacher Optimization via Mixture of Specialized Experts (MOPD). The framework separates optimization into distinct specialized expert streams — enabling more precise behavioral shaping for agentic, tool-use, and code-generation contexts without the performance dilution common in generalist RLHF approaches.

What This Means

LongCat-2.0 represents a convergence of several industry-defining trends: the maturation of Chinese domestic AI silicon, the strategic value of open-source distribution, and the unintended consequences of Western export controls.

For global developers seeking affordable, high-performance agentic coding tools, the calculus is shifting — and Meituan just made that shift very concrete.

With MIT licensing, free cache hits, and benchmark results that challenge GPT-5.5, LongCat-2.0 is not a regional story. It's a signal of where frontier AI infrastructure power is heading.