gentic.news — AI News Intelligence Platform
semianalysis

30 articles about semianalysis in AI news

Anthropic Opus 4.8 Cuts Bug-Finding Cost by 5x, SemiAnalysis Finds

Anthropic's Opus 4.8 + ultracode mode cuts severe bug-finding cost to ~1/5, per preliminary SemiAnalysis experiments with wide error bars.

87% relevant

SemiAnalysis Calls Jensen Computex Keynote 'F Tier' Over No AI DC News

SemiAnalysis rated Jensen Huang's Computex keynote 'F Tier' for lacking AI datacenter news and revealed a delayed NVIDIA ARM chip with broken video output.

82% relevant

SemiAnalysis: N3 chip demand far outstrips current consensus estimates

SemiAnalysis argues N3 chip demand far exceeds consensus accelerator models, implying a structural silicon shortage not priced by markets.

89% relevant

SemiAnalysis: Perplexity Slack Bot Beats Claude in Internal Trial

SemiAnalysis found that Perplexity's Slack bot beats Claude in an internal trial. 96% of the token budget goes to Anthropic, but usage may shift.

75% relevant

Cerebras Understates On-Chip SRAM by 8x, SemiAnalysis Notes

Cerebras understates on-chip SRAM by 8x per SemiAnalysis, a rare under-specification in chip marketing.

75% relevant

SemiAnalysis: NVIDIA's Customer Data Drives Disaggregated Inference, LPU Surpasses GPU

SemiAnalysis states NVIDIA's direct customer feedback is leading the industry toward disaggregated inference architectures. In this model, specialized LPUs can outperform GPUs for specific pipeline tasks.

85% relevant

Buffett Invests in Google After SemiAnalysis TPU Deep Dive

Berkshire Hathaway invested in Google in Q3 2025, after Buffett studied TPU v5p architecture. He compared it to railroads, citing 8,960 chips and 4.8 Tbps links.

85% relevant

ERCOT datacenter requests exceed grid capacity by 5x

ERCOT datacenter requests far exceed grid underwriting capacity, per @SemiAnalysis_, revealing grid approval as a binding constraint on AI infrastructure buildout.

87% relevant

Cerebras CS4 Stays on 5nm as SRAM Scaling Flattens

Cerebras CS4 stays on 5nm due to SRAM scaling flattening, per @SemiAnalysis_. 3nm offers no density gain, so the chip prioritizes yield and cost.

85% relevant

Median Coding Agent Hits 96k Input Tokens, Rewriting Inference Economics

SemiAnalysis found that the median coding-agent request uses 96k input tokens across a sample of 432k requests, shifting the focus of inference cost from output to context.

95% relevant

Vibe-Coding Bottleneck: CPU Box Rental Gets Harder

SemiAnalysis flags that the vibe-coding wave is making cheap CPU box rentals less routine, bottlenecking developers who need quick cloud compute for AI prototyping.

75% relevant

Datacenter Developers Flee City Zoning for Unincorporated County Land

Datacenter developers are siting projects on unincorporated county land to avoid city zoning delays, redrawing the AI infrastructure map per @SemiAnalysis_.

100% relevant

NVIDIA Vera Rubin VR NVL72: Value Extraction Engine Arrives

With the Vera Rubin VR NVL72, NVIDIA shifts from value vendor to value extractor, targeting TCO. SemiAnalysis argues this overturns the prior pricing paradigm.

95% relevant

CPU Demand Flipping the AI Narrative as Datacenter Growth Shifts

A new analysis from SemiAnalysis indicates CPU demand is rising in AI datacenters, reversing a narrative of GPU-only dominance. This shift signals changing workload patterns and infrastructure priorities.

100% relevant

Huawei Hits 1.5µm Bond Pitch in Kirin 2026 Chips, Beats TSMC

Huawei's 2026 Kirin chips achieve 1.5µm hybrid bonding pitch, 16-36x denser than TSMC. Next year targets 1µm.

85% relevant

xAI Drops JAX, Builds Custom C Training Framework After <10% MFU

xAI dropped JAX for GPU training after <10% MFU, building a custom C framework with Grok Build. NVIDIA's JAX team loses its biggest customer.

91% relevant

Blackwell NVLink Breaks Confidential Compute, 61% Regression Reported

NVIDIA Blackwell confidential computing disables NVLink multicast, causing 61% regression on SGLang Qwen3.5 397B. Hopper had unencrypted NVLink, compounding the issue.

100% relevant

DeepSeek v4 Pricing Cuts 75%: $0.43/M Tokens In

DeepSeek v4 API pricing permanently cut 75% to $0.43/M input, $0.87/M output, enabled by 27% compute and 10% cache vs v3.2.

100% relevant

US 'Stop Stealing our Chips Act' Would Pay Whistleblowers 10-30% of Export Fines

Proposed US law would pay whistleblowers 10-30% of export-control fines, targeting AI chip smuggling to China through intermediaries like Malaysian resellers.

93% relevant

Google TPU 'Broadfly' Topology Scales Pod to 1,152 Chips

Google unveiled a Broadfly TPU topology at Cloud Next, scaling pods to 1,152 chips — 4.5x larger than Ironwood — with max 7 hops. This inference-first design challenges NVIDIA's NVLink on scale and latency.

94% relevant

Cerebras IPO Challenges GPU Scaling Orthodoxy

Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.

98% relevant

Cerebras' Tokenomics Bet: AWS, OpenAI Deals and Wafer-Scale Edge

Cerebras' tokenomics pricing and AWS/OpenAI partnerships challenge NVIDIA's inference dominance, offering a 5x cost reduction per token via its wafer-scale architecture.

89% relevant

AMD Gives OSS Maintainers $3.6M MI355X Cluster Access

AMD gives vLLM/SGLang maintainers $3.6M MI355X cluster access, ending NVIDIA's monopoly on OSS inference hardware access.

75% relevant

B200 PD Disaggregation Boosts Token Throughput 7x, Slashes Cost

B200 clusters with PD disaggregation over RoCEv2 Ethernet achieve 7x token throughput, cutting cost per million tokens 7x.

85% relevant

AMD ROCm Performance Jumps 75x in 14 Days Post-DeepSeek v4

AMD ROCm stack improved 75x in 14 days post-DeepSeek v4 via fused operations. Still needs 5x more to match B200 performance.

100% relevant

Micron's PSMC Fab Buy: A $3.8B Memory Bet

Micron acquires PSMC's P5 fab for $3.8B, converting it for HBM and DDR5 production. The deal cuts 12-18 months off time-to-volume, challenging Samsung's HBM lead.

75% relevant

NVIDIA Feynman GPU Power Semi Content Hits $191K, 17× Blackwell

NVIDIA Feynman GPUs require $191K in power semiconductors per system, 17× Blackwell, driven by 800V DC architecture shift.

95% relevant

ODMs Evolve from Manufacturers to AI Infrastructure Partners

ODMs shift from manufacturing to design/integration partners for AI racks, driven by GPU/ASIC complexity and liquid cooling.

75% relevant

Intel's UCIe-S Hits 48 Gb/s on 22nm, Beats 3nm EMIB

Intel demonstrated a UCIe-S die-to-die interconnect on 22nm hitting 48 Gb/s/lane over standard organic substrate, beating a 3nm EMIB design with 3× higher data rate and 2.8× higher bandwidth density. This signals a strategic shift away from EMIB for Intel's own products toward UCIe over substrate.

95% relevant

Nvidia B200 Costs $6,400 to Produce, Gross Margin Hits 82%

Epoch AI estimates Nvidia's B200 GPU costs $5,700–$7,300 to produce, with HBM memory and advanced packaging accounting for two-thirds of the cost. At a $30k–$40k sale price, chip-level gross margins reach ~82%, though rack-scale margins may be lower.

100% relevant
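
The margin arithmetic in the B200 item can be checked with a short sketch. It uses only the figures quoted in that summary ($5,700–$7,300 cost, $30k–$40k sale price, $6,400 headline cost). Pairing the headline cost with the midpoint of the price range is an assumption made here for illustration, not something the summary states, and `gross_margin` is a hypothetical helper name.

```python
def gross_margin(unit_cost: float, sale_price: float) -> float:
    """Return gross margin as a fraction of the sale price."""
    if sale_price <= 0:
        raise ValueError("sale_price must be positive")
    return (sale_price - unit_cost) / sale_price


# Figures quoted in the B200 summary above.
COST_LOW, COST_HIGH = 5_700, 7_300
PRICE_LOW, PRICE_HIGH = 30_000, 40_000

# Assumption: headline cost of $6,400 paired with the midpoint sale price.
central = gross_margin(6_400, (PRICE_LOW + PRICE_HIGH) / 2)

# Bracketing cases: highest cost at lowest price, lowest cost at highest price.
worst = gross_margin(COST_HIGH, PRICE_LOW)
best = gross_margin(COST_LOW, PRICE_HIGH)

print(f"central: {central:.1%}, range: {worst:.1%} to {best:.1%}")
```

Under these assumptions the central case lands close to the ~82% quoted, while the bracketing cases show how far the chip-level margin could move within the stated cost and price ranges.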