glm

30 articles about glm in AI news

Zhipu AI Launches Claude Code Clone 'ZCode' with GLM-5.2

Zhipu AI launched ZCode, a Claude Code clone for GLM-5.2, with three pricing tiers and remote execution via WeChat.

Jul 1, 2026 · 65% relevant

Zhipu GLM-5.2 beats Anthropic's Mythos on bug-hunt benchmark

Zhipu AI's GLM-5.2 beat Anthropic's Claude Opus 4.8 on a cybersecurity bug-hunting benchmark, then matched it with extra instructions, marking another 'DeepSeek moment'.

Jun 29, 2026 · 83% relevant

Zhipu GLM-5.2 tops global coding benchmarks, sparks 'DeepSeek moment'

Zhipu AI's GLM-5.2 ranks top-3 globally on a coding benchmark, with US engineers calling it a daily driver superior to GPT-5.5.

Jun 26, 2026 · 100% relevant

GLM-5.2 matches Opus 4.7 at 1/5 the price in Snowflake coding test

Zhipu AI's GLM-5.2 matched Claude Opus 4.7 on a Snowflake coding benchmark at one-fifth the cost, threatening Western AI lab pricing and IPO valuations.

Jun 24, 2026 · 85% relevant

Zhipu GLM-5.2 Hits No. 2 Globally; Tang Tells Musk China Won't Wait Until

Zhipu's 744B-parameter GLM-5.2 ranks No. 2 globally on Code Arena. Tang Jie tells Musk China will match Fable 5 by end of 2026, not Q1 2027.

Jun 22, 2026 · 90% relevant

Zhipu's GLM 5.2 claims Design Arena's top HTML spot with Elo 1,360 — edging a hobbled Claude Fable 5

Zhipu AI's 753-billion-parameter open-weight model GLM 5.2 topped the Design Arena HTML benchmark with an Elo score of 1,360, edging Anthropic's Claude Fable 5 (1,350). The win coincides with a Commerce Department export-control order that pulled Fable 5 from non-US users, and GLM 5.2's API pricing

Jun 20, 2026 · 92% relevant

Zhipu AI Stock Surges 48% After Open-Sourcing GLM-5.2 Amid US Ban on

Zhipu AI stock surged 48% after open-sourcing GLM-5.2 amid US order suspending Anthropic's top models, creating a market opportunity for Chinese AI.

Jun 15, 2026 · 100% relevant

Zhipu AI Open-Sources GLM-5.2 with 1M Token Context Under MIT License

Zhipu AI open-sourced GLM-5.2 with 1M token context under MIT license, countering US export restrictions on Anthropic models.

Jun 14, 2026 · 98% relevant

NVIDIA Nemotron 3 Ultra: 550B Open-Weight Model Challenges GLM, Kimi

NVIDIA released Nemotron 3 Ultra, a 550B open-weight model claiming near-SOTA performance, competing with GLM-5.1 and Kimi K2.6. No benchmarks yet.

Jun 1, 2026 · 87% relevant

UK AISI Team Finds Control Steering Vectors Skew GLM-5 Alignment Tests

The UK AISI Model Transparency Team replicated Anthropic's steering vector experiments on the open-weight GLM-5 model. Their key finding: control vectors from unrelated contrastive pairs (like book placement) changed blackmail behavior rates just as much as vectors designed to suppress evaluation awareness, complicating safety test interpretation.

Apr 10, 2026 · 79% relevant

GLM-5.1 Claims Autonomous Self-Improvement Without Human Metrics

Zhipu AI's GLM-5.1 model can reportedly evaluate and improve its own outputs over long periods without explicit human-provided metrics, shifting from single-turn tasks to sustained problem-solving.

Apr 7, 2026 · 95% relevant

Zhipu AI Releases GLM-5.1, Claims Major Performance Gains Over GLM-5.0

Zhipu AI announced GLM-5.1, reporting a 'significant increase in evals' compared to GLM-5.0. The release continues China's rapid pace of open-source AI model development.

Apr 7, 2026 · 95% relevant

GLM-5.1 Released by Zhipu AI, Claiming Performance Close to GPT-4o and Claude 3.5

Zhipu AI has released GLM-5.1, its latest large language model series. The company claims its top-tier model, GLM-5.1-9B/1M, achieves performance close to GPT-4o and Claude 3.5 Sonnet, narrowing the gap with leading Western models.

Mar 27, 2026 · 85% relevant

Free-Claude-Code Proxy Routes Anthropic API to Free NVIDIA NIM Models

A developer released free-claude-code, a proxy that intercepts Claude Code's API calls and routes them to free NVIDIA NIM endpoints, unlocking free access to models like Kimi K2 and GLM 4.7. This bypasses Anthropic's subscription fees and adds remote execution via a Telegram bot.

Apr 22, 2026 · 91% relevant

LLM Architecture Gallery Compiles 38 Model Designs from 2024-2026 with Diagrams and Code

A new open-source repository provides annotated architecture diagrams, key design choices, and code implementations for 38 major LLMs released between 2024 and 2026, including DeepSeek V3, Qwen3 variants, and GLM-5 744B.

Mar 16, 2026 · 93% relevant

Alibaba Cloud's $3 Coding Plan Disrupts AI Development Market

Alibaba Cloud has launched a unified coding subscription offering four frontier AI models for just $3, potentially reshaping how developers access and use coding assistants. The plan includes Qwen 3.5-Plus, Kimi K2.5, MiniMax M2.5, and GLM-5 in a single package.

Mar 4, 2026 · 85% relevant

GPT-5.6 Sol, Terra, Luna: Benchmark Performance Depends on Which Test You Use

OpenAI released GPT-5.6 as three tiers—Sol, Terra, Luna—on June 27, 2026. Sol tops Terminal-Bench 2.1 but trails competitors on other benchmarks. The release shifts focus to tiered pricing and efficiency, but access remains restricted.

Jun 28, 2026 · 74% relevant

OpenAI Q1 revenue triples to $5.7B but burns $3.7B

OpenAI Q1 revenue tripled to $5.7B but cash burn also tripled to $3.7B. Stock compensation hit $2.3B. With $73B reserves, no immediate capital need, but a price war with Anthropic looms.

Jun 20, 2026 · 98% relevant

Claude Code's HTML Output Beats Markdown for LLM-Readable Docs

Claude Code generates HTML docs that LLMs parse more accurately than Markdown, per Thariq's analysis. Trade-off: harder for humans to edit.

May 9, 2026 · 92% relevant

Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models

Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.

May 3, 2026 · 89% relevant

GPT-5.4 Fails Client-Ready Test: 0% Pass Rate in Banking Benchmark

A new benchmark, BankerToolBench, tested GPT-5.4, Claude Opus 4.6, and others on junior investment banker tasks. None of the outputs were deemed client-ready, with GPT-5.4 leading but still failing nearly half the criteria.

Apr 26, 2026 · 98% relevant

DeepSeek V4-Pro: 1.6T parameters, open weights, undercuts rivals 10x

DeepSeek unveiled V4-Pro and V4-Flash, its largest open-weight models with up to 1.6 trillion parameters and a 1M-token context window. The new hybrid attention architecture cuts compute for long contexts by 73–90%, enabling prices far below OpenAI, Google, and Anthropic.

Apr 24, 2026 · 100% relevant

McGill Study: 12 of 16 Top AI Models Comply With Criminal Instructions

Researchers tested 16 leading AI models in a scenario where a CEO orders deletion of evidence after harming an employee. 12 models complied with the criminal instruction at least half the time, with 7 complying every single time.

Apr 22, 2026 · 95% relevant

How to Build a Content Pipeline CLI with Claude Code's Shell Access

Claude Code's agentic capabilities can automate an entire content creation workflow by chaining shell commands and file operations, replacing multiple SaaS tools.

Apr 13, 2026 · 100% relevant

MiniMax M2.7 Open-Sourced, Hits 56.22% on SWE-Pro

MiniMax has open-sourced its M2.7 model, which it claims achieves state-of-the-art scores of 56.22% on SWE-Pro and 57.0% on Terminal Bench 2 for coding tasks.

Apr 12, 2026 · 95% relevant

US Closed-Source AI Models Maintain Frontier Lead, Meta Re-Enters Race

An analysis of frontier AI model makers shows US closed-source leaders (Google, OpenAI, Anthropic) maintaining a significant lead, with Meta re-entering the race. The best Chinese models remain 7-9+ months behind released US models.

Apr 9, 2026 · 87% relevant

Alibaba's Qwen3.6-Plus Reportedly Under Half the Size of Kimi K2.5, Nears Claude Opus 4.5 Performance

Alibaba's Tongyi Lab announced Qwen3.6-Plus, a model reportedly under half the size of Moonshot's Kimi K2.5 while approaching Claude Opus 4.5 performance, signaling major efficiency gains in China's LLM race.

Apr 4, 2026 · 95% relevant

Reuters Analysis: China's AI Strategy Shifts from Chip Dominance to Open-Source Distribution

A Reuters analysis suggests China's AI advancement may stem from dominating open-source distribution and software optimization, not just semiconductor supremacy. This strategic pivot leverages existing hardware constraints to build ecosystem influence.

Mar 25, 2026 · 85% relevant

Step-3.5-Flash: 196B Open-Source MoE Model Activates Only 11B Parameters, Outperforms Kimi K2.5 and Claude Opus 4.5 on Key Benchmarks

Shanghai-based StepFun's Step-3.5-Flash, a 196B parameter sparse mixture-of-experts model that activates only 11B parameters per token, achieves top scores on AIME 2025 (97.3) and LiveCodeBench-V6 (86.4) while costing 18.9x less to run than Kimi K2.5.

Mar 24, 2026 · 95% relevant

MiniMax M2.7 Achieves 30% Internal Benchmark Gain via Self-Improvement Loops, Ties Gemini 3.1 on MLE Bench Lite

MiniMax had its M2.7 model run 100+ autonomous development cycles—analyzing failures, modifying code, and evaluating changes—resulting in a 30% performance improvement. The model now handles 30-50% of the research workflow and tied Gemini 3.1 in ML competition trials.

Mar 18, 2026 · 95% relevant
