Z.ai's new 753B-parameter GLM-5.2 outperforms OpenAI's GPT-5.5 on long-horizon coding tasks at roughly one-sixth the per-token price

Z.ai’s GLM-5.2, a 753-billion-parameter open-weights model released June 16, beats OpenAI’s GPT-5.5 on multiple coding benchmarks while costing roughly one-sixth as much to operate, marking a significant milestone for open-source AI development.

On the SWE-bench Pro benchmark, which tests real-world software engineering tasks, GLM-5.2 scored 62.1, ahead of GPT-5.5’s 58.6 by 3.5 points, according to VentureBeat. On FrontierSWE, a test designed to measure long-horizon task completion, GLM-5.2 achieved 74.4 percent, surpassing GPT-5.5 at 72.6 percent and trailing Anthropic’s Claude Opus 4.8, at 75.1 percent, by 0.7 points.

The cost advantage is equally striking. GLM-5.2 API pricing runs $1.40 per million input tokens and $4.40 per million output tokens, or $5.80 for one million tokens of each. GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens, or $35.00 on the same basis. That puts the open-weights model at roughly one-sixth the price for comparable or superior coding performance.
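The one-sixth figure assumes equal input and output volume, so it shifts with the mix of tokens a workload actually uses. The Python sketch below recomputes the ratio from the list prices quoted above; the function names and the 10:1 input-heavy workload are illustrative assumptions, not figures from Z.ai or OpenAI.

```python
# Per-million-token list prices in USD, as quoted in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def price_ratio(input_tokens: int, output_tokens: int) -> float:
    """Return how many times more GPT-5.5 costs than GLM-5.2 for one workload."""
    return (workload_cost("GPT-5.5", input_tokens, output_tokens)
            / workload_cost("GLM-5.2", input_tokens, output_tokens))


if __name__ == "__main__":
    # Equal input and output volume reproduces the article's combined figures.
    print(round(price_ratio(1_000_000, 1_000_000), 2))   # 6.03
    # Hypothetical input-heavy workload (10:1), an assumption for illustration.
    print(round(price_ratio(10_000_000, 1_000_000), 2))  # 4.35
```

Because the output-price gap ($30.00 versus $4.40) is wider than the input-price gap ($5.00 versus $1.40), the advantage narrows for input-heavy workloads and widens for output-heavy ones.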

Architecture and Technical Innovation

The model has 753 billion parameters and introduces an architectural optimization called IndexShare, which reuses a single indexer across each group of four sparse attention layers. At the maximum 1-million-token context length, this change alone cuts per-token compute by a factor of 2.9, according to the VentureBeat analysis.
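The article does not describe IndexShare's internals, so the following Python sketch is only a toy illustration of the general idea: the first layer in each group of four runs an indexer that picks which positions to attend to, and the next three layers reuse that selection. The top-k indexer, the function names, and the score values are all assumptions made for illustration, and the sketch does not attempt to reproduce the reported 2.9x figure.

```python
from typing import List, Tuple

GROUP_SIZE = 4  # per the article: one indexer shared across four sparse attention layers


def top_k_indices(scores: List[float], k: int) -> List[int]:
    """Toy indexer: positions of the k highest scores, in ascending position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def run_layers(layer_scores: List[List[float]], k: int,
               share: bool) -> Tuple[List[List[int]], int]:
    """Return each layer's selected positions and the number of indexer calls."""
    selections: List[List[int]] = []
    indexer_calls = 0
    cached: List[int] = []
    for layer, scores in enumerate(layer_scores):
        # Without sharing, every layer runs its own indexer.
        # With sharing, only the first layer of each group does.
        if not share or layer % GROUP_SIZE == 0:
            cached = top_k_indices(scores, k)
            indexer_calls += 1
        selections.append(cached)
    return selections, indexer_calls


if __name__ == "__main__":
    # Hypothetical relevance scores for 8 layers over 6 positions.
    scores = [[(layer + pos * 3) % 7 for pos in range(6)] for layer in range(8)]
    _, calls_unshared = run_layers(scores, k=2, share=False)
    _, calls_shared = run_layers(scores, k=2, share=True)
    print(calls_unshared, calls_shared)
```

In this toy setup the number of indexer calls falls from one per layer to one per group. The trade-off is that layers inside a group attend to positions chosen from the first layer's scores rather than their own.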

GLM-5.2 also features an upgraded Multi-Token Prediction layer for speculative decoding that raises the accepted token length by up to 20 percent during inference. The model offers two “thinking modes”: users can toggle between “Max” effort for the hardest reasoning problems and “High” effort, which trades some performance for lower latency and token use.
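For readers unfamiliar with the metric, "accepted token length" in speculative decoding is the number of drafted tokens the main model confirms before the first disagreement. The Python sketch below shows the greedy-verification form of that count; it is a generic illustration, not Z.ai's implementation, and the token IDs are made up.

```python
from typing import List, Sequence, Tuple


def accepted_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count the leading draft tokens that match the main model's tokens."""
    count = 0
    for drafted, verified in zip(draft, target):
        if drafted != verified:
            break  # the first mismatch invalidates every later draft token
        count += 1
    return count


def mean_accepted_length(rounds: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average accepted length over a list of (draft, target) rounds."""
    return sum(accepted_length(d, t) for d, t in rounds) / len(rounds)


if __name__ == "__main__":
    # Hypothetical token IDs, chosen only to exercise the three cases.
    rounds = [
        ([5, 9, 2, 7], [5, 9, 2, 7]),  # all four drafts accepted
        ([5, 9, 2, 7], [5, 9, 3, 1]),  # mismatch at position 2, two accepted
        ([4, 4, 4, 4], [1, 4, 4, 4]),  # first draft rejected, none accepted
    ]
    print(mean_accepted_length(rounds))
```

A longer average accepted length means each verification pass of the main model yields more tokens, which is why an increase of up to 20 percent matters for inference speed.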

Open-Source Release and Market Impact

Z.ai released GLM-5.2’s weights under the permissive MIT open-source license, allowing enterprises to download the model freely from Hugging Face, customize it, and run it locally or on virtual machines, paying only for compute and electricity. This licensing approach stands in contrast to many dual-use licenses that impose restrictive governance policies.

The release arrived amid regulatory uncertainty for American proprietary models, following the Trump Administration’s export control directive that prohibited foreign nationals from using Anthropic’s Claude Fable 5 model. For enterprise decision-makers, GLM-5.2 provides a path to host frontier-level AI locally, entirely bypassing geographic fencing and commercial limitations.

Developer reception has been immediate and positive. Cline IDE noted on X that GLM-5.2 is “the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available,” scoring 81.0 on that benchmark. Kilo Code confirmed day-one integration, stating the 1-million-token context window and Max effort mode were both live at launch.

Z.ai launched the GLM Coding Plan to operationalize the model, with pricing tiers starting at $12.60 per month for the Lite plan, $50.40 for Pro, and $112.00 for Max, all billed annually. The plan offers out-of-the-box support for third-party coding harnesses including Claude Code, OpenClaw, Cline, and Kilo Code.

GLM-5.2 also led GPT-5.5 on other benchmarks: it scored 77.0 on MCP-Atlas (tool usage), compared to GPT-5.5’s 75.3, and 54.7 on Humanity’s Last Exam with tools enabled, ahead of GPT-5.5’s 52.2. The model also took first place on the crowdsourced Design Arena benchmark with an Elo score of 1360, beating Claude Fable 5.

Sources

VentureBeat — detailed benchmark comparisons, architectural details (IndexShare), pricing tables, and developer reception
Z.ai Developer Documentation — model specifications, context window capabilities, and API pricing
Cline IDE (X post) — Terminal-Bench performance confirmation and open-weights status
Kilo Code (X post) — day-one integration confirmation and feature availability