GLM-4.7 vs GLM-4.6: Z.ai's Latest AI Model Advancements Detailed

Introduction

Z.ai, an AI research lab focused on advancing artificial general intelligence for humanity's benefit, unveiled GLM-4.7 on December 22, 2025. This open-source large language model builds upon GLM-4.6, delivering notable advancements in coding, complex reasoning, tool usage, chat, creative writing, and role-play. The release positions GLM-4.7 as a new open-source state-of-the-art (SOTA) model, with enhancements demonstrated through benchmarks, video demos, and innovative thinking modes.

The announcement, shared via an X post, rapidly accumulated over 617,000 views in its first day, indicating high community interest. Immediate support from partners like Unsloth AI and Baseten further highlights its potential impact.

Benchmark Performance Comparison

GLM-4.7 was evaluated under a 128K context length across eight benchmarks emphasizing agentic behavior, reasoning, and coding. The model shows clear improvements over GLM-4.6, outperforming it in most categories and setting new open-source SOTA levels.

Here's a detailed comparison table of benchmark scores between GLM-4.6 and GLM-4.7:

Benchmark	GLM-4.6 Score	GLM-4.7 Score	Improvement
AIME 25	93.9%	95.7%	+1.8%
GPQA-Diamond	81.0%	85.7%	+4.7%
HLE	40.8%	42.8%	+2.0%
LiveCodeBench v6	82.8%	84.9%	+2.1%
SWE-bench Verified	68.0%	73.8%	+5.8%
Terminal-Bench 2.0	N/A	24.5%	N/A
τ²-Bench	N/A	87.4%	N/A
BrowseComp	N/A	52.0%	N/A

Note: Scores for Terminal-Bench 2.0, τ²-Bench, and BrowseComp were not provided for GLM-4.6 in the announcement, limiting direct comparison in those areas.

These results underscore GLM-4.7's enhanced capabilities, particularly in mathematical reasoning (AIME 25), scientific knowledge (GPQA-Diamond), and software engineering (SWE-bench).

Real-World Coding and Task Performance

In practical development scenarios, GLM-4.7 demonstrates superiority over GLM-4.6. Head-to-head evaluations show GLM-4.7 achieving a win rate of 64.6% in frontend tasks, 46.7% in backend tasks, and 58.3% in instruction following, with fewer losses overall. This performance led Z.ai to designate GLM-4.7 as the default model in the GLM Coding Plan.

Video demonstrations further illustrate these upgrades. One demo contrasts code generation for complex interactions, such as a Plants vs. Zombies-inspired game, 3D wooden room modeling, merry-go-round simulations, and interactive red ball games with scoring systems. GLM-4.7 handles these with greater precision and completeness compared to GLM-4.6.

New Thinking Modes for Multi-Turn Stability

GLM-4.7 introduces and refines thinking modes to improve stability in multi-turn tasks, an area where GLM-4.6 had limitations. Key features include:

Interleaved Thinking: Supported since GLM-4.5 and default in 4.7, it enables reasoning between tool calls and after results, supporting chained tool usage.
Preserved Thinking: A new mode that retains reasoning from previous turns, enabled by default for coding endpoints to enhance continuity and cache efficiency.
Turn-level Thinking: Allows per-turn control via API, optimizing for speed in simple tasks and depth in complex ones.

These modes, absent or less refined in GLM-4.6, are depicted in a flowchart showing integrated reasoning across user inputs, tool calls, and outputs.

Creative and Aesthetic Enhancements

Beyond technical tasks, GLM-4.7 excels in creative applications. A frontend optimization demo generates diverse website styles, including color clashing, retro futurism, minimal modern, soft pastel, dark tech, oriental zen, industrial chic, classic print, cinematic, and organic naturalism—showcasing improvements over GLM-4.6's outputs.

Another demo focuses on aesthetic upgrades for slides and posters. GLM-4.7 creates polished Paris tourism presentations and "Spring New Arrival" shoe launch posters, with superior design elements like layout and color coordination compared to GLM-4.6.

Community Response and Ecosystem Integration

The release prompted swift community engagement. Unsloth AI is preparing GGUF versions for local deployment, while Baseten integrated GLM-4.7 into their Model APIs, building on GLM-4.6's popularity. Other platforms like Novita AI also announced support, reflecting broad adoption potential.

Availability and Implications

GLM-4.7 is immediately accessible via Hugging Face weights, the GLM Coding Plan, and trial interfaces. While the announcement provides comprehensive details, real-world usage will further confirm its advantages. This update reinforces Z.ai's commitment to open-source AI innovation.

Z.ai Releases GLM-4.7: A Comprehensive Upgrade Over GLM-4.6 in AI Capabilities