sambanova
14 articles about SambaNova in AI news
MiniMax M2.7 Hits 400 TPS on SambaNova Hardware
MiniMax M2.7 reaches 400 tokens per second (TPS) on SambaNova hardware, fast enough that generation latency becomes imperceptible. Model size and batch size were not disclosed.
Intel, SambaNova Blueprint Pairs GPUs for AI Prefill, RDUs for Decoding
Intel and SambaNova Systems have outlined a new inference architecture for agentic AI workloads. It splits tasks between GPUs for 'prefill' and SambaNova's Reconfigurable Dataflow Units (RDUs) for high-throughput token generation.
Inference shift opens door for AI chip startups to challenge Nvidia
The shift in AI workloads from training to inference serving creates opportunities for AI chip startups. Nvidia's $20B Groq acquihire validates disaggregated compute strategies.
Building an Agentic Enterprise Control Plane on Snowflake: A Technical Blueprint
Snowflake Intelligence and Cortex Code now enable a fully embedded agentic AI control plane. This article provides a tested, end-to-end blueprint for building a production-grade Streamlit dashboard that integrates five enterprise tables with six Cortex AI functions, all governed by existing data platform RBAC.
Horizon Launches Full-Stack AI Platform for Autonomous Driving
Horizon Robotics launched a trio of products—a new chip, an open-source OS, and a smart driving system—aiming to push cars closer to becoming autonomous AI agents. The platform integrates hardware and software for enhanced perception and decision-making.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
Skill-RAG Uses Hidden-State Probes to Trigger Retrieval Only When Needed
Researchers introduced Skill-RAG, a system that uses hidden-state probing to detect when an LLM is about to fail, triggering targeted retrieval. This improves over uniform RAG baselines on HotpotQA, Natural Questions, and TriviaQA.
Nvidia: Cost Per Token Is the Only AI Infrastructure Metric That Matters
Nvidia asserts that total cost of ownership for AI infrastructure must be measured in cost per delivered token, not raw compute metrics. This shift is critical for scaling profitable agentic AI applications.
Entropy-Guided Branching Boosts Agent Success 15% on New SLATE E-commerce
A new paper introduces SLATE, a large-scale benchmark for evaluating tool-using AI agents, and Entropy-Guided Branching (EGB), an algorithm that improves task success rates by 15% by dynamically expanding search where the model is uncertain.
Hugging Face Launches 'Kernels' Hub for GPU Code, Like GitHub for AI Hardware
Hugging Face has launched 'Kernels,' a new section on its Hub for sharing and discovering optimized GPU kernels. This treats performance-critical code as a first-class artifact, similar to AI models.
Agentic Marketing AI Sustains Performance Gains in 11-Month Case Study
An 11-month longitudinal case study compared human-led vs. autonomous agentic personalization for marketing. While human management generated the highest lift, autonomous agents successfully sustained positive performance gains, pointing to a symbiotic operational model.
Anthropic Tests Sonnet-to-Opus 'Phone a Friend' for Cost-Effective AI
Anthropic is experimenting with a system where its Claude 3.5 Sonnet model can automatically invoke the more capable Claude 3 Opus for difficult tasks. This 'phone a friend' approach aims to improve final output quality while reducing overall token consumption and cost.
Groq's LPU Inference Engine Demonstrates 500+ Token/s Performance on Llama 3.1 70B
Groq's Language Processing Unit (LPU) inference engine achieves over 500 tokens/second on Meta's Llama 3.1 70B model, demonstrating significant performance gains for large language model inference.
MatX Secures $500M War Chest to Challenge Nvidia's AI Chip Dominance
AI chip startup MatX, founded by ex-Google semiconductor engineers, has raised over $500 million to develop hardware that directly competes with Nvidia. This massive funding round signals growing investor confidence in alternatives to the current AI chip market leader.