deepseek
30 articles about DeepSeek in AI news
DeepSeek v4 Pricing Cut 75%: $0.43/M Input Tokens
DeepSeek v4 API pricing has been permanently cut by 75% to $0.43/M input tokens and $0.87/M output tokens. The cut is attributed to v4 needing 27% of the compute and 10% of the cache of v3.2.
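As a rough illustration of the arithmetic behind these figures, the sketch below backs out the pre-cut prices implied by a 75% reduction and estimates the cost of a single request at the quoted rates. The function names and the example token counts are hypothetical, and because the quoted prices are rounded, the implied earlier prices are approximate.

```python
# Quoted post-cut prices from the item above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.43
OUTPUT_PRICE_PER_M = 0.87
CUT_FRACTION = 0.75  # the quoted 75% reduction


def price_before_cut(price_after: float, cut: float = CUT_FRACTION) -> float:
    """Back out the price implied before a fractional cut (hypothetical helper)."""
    return price_after / (1.0 - cut)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted post-cut rates (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Implied earlier prices, approximate because the quoted figures are rounded.
    print(f"implied pre-cut input price:  ${price_before_cut(INPUT_PRICE_PER_M):.2f}/M")
    print(f"implied pre-cut output price: ${price_before_cut(OUTPUT_PRICE_PER_M):.2f}/M")
    # Example workload (assumed): 2M input tokens and 0.5M output tokens.
    print(f"example request cost: ${request_cost(2_000_000, 500_000):.3f}")
```

The token counts in the example are arbitrary; substitute the volumes of an actual workload to compare costs.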
Ollama Now Runs Codex Locally: DeepSeek V4, Gemma 4, Qwen 3.6 Supported
Ollama has integrated Codex support for DeepSeek V4, Gemma 4, and Qwen 3.6, enabling free local code generation and challenging OpenAI's API business model.
AMD ROCm Performance Jumps 75x in 14 Days Post-DeepSeek v4
AMD's ROCm stack improved 75x in the 14 days after the DeepSeek v4 release, via fused operations. It still needs a further 5x improvement to match B200 performance.
DeepSeek Hits $45B Valuation in First VC Round, Led by China State Fund
DeepSeek's valuation jumps from $20B to $45B in its first VC round, led by a Chinese state fund. The raise targets employee retention and chip independence via Huawei optimization.
Amazon's SageMaker Agentic Fine-Tuning Supports Llama, Qwen, DeepSeek, Nova
Amazon launched an AI agent on SageMaker that automates fine-tuning of Llama, Qwen, DeepSeek, and Nova models via plain-language instructions, abstracting API fragmentation.
DeepSeek-V4 Ported to MLX for Apple Silicon Inference
A developer has ported DeepSeek-V4 to Apple's MLX framework, allowing the large language model to run on Apple Silicon Macs. Early results show functional inference with room for optimization.
DeepSeek V4-Pro: 1.6T parameters, open weights, undercuts rivals 10x
DeepSeek unveiled V4-Pro and V4-Flash, its largest open-weight models with up to 1.6 trillion parameters and a 1M-token context window. The new hybrid attention architecture cuts compute for long contexts by 73–90%, enabling prices far below OpenAI, Google, and Anthropic.
DeepSeek Seeks $300M+ at $10B+ Valuation to Retain AI Talent
DeepSeek is raising its first external capital, targeting $300M+ at a $10B+ valuation. The round is small (≤3% equity) to set a valuation benchmark for employee stock options and combat poaching by rivals.
DeepSeek Seeks First Outside Funding at $10B Valuation
DeepSeek is in talks to raise at least $300 million in its first external funding round at a $10 billion valuation. This ends its reliance on parent hedge fund High-Flyer Capital and signals a new phase in the costly global AI race.
Stealth 100B Model Appears on OpenRouter, Possibly DeepSeek or Kimi
A new, unannounced 100-billion-parameter AI model has appeared on the OpenRouter API platform. Its origin is unknown, but observers speculate it could be a variant from DeepSeek or an update to Kimi's code model.
DeepSeek-V4 Rumored as 'Whale' Returns, Signaling Major Model Release
DeepSeek's cryptic 'whale' codename has reappeared, strongly hinting at the impending launch of DeepSeek-V4. This follows the company's pattern of using the whale symbol before major model releases.
DeepSeek V4 Begins Limited Rollout with Fast, Expert, Vision Modes
DeepSeek V4 is reportedly in limited grayscale (staged-rollout) testing with a new interface offering Fast, Expert, and Vision modes. This mirrors competitor Kimi's tiered system and suggests a move towards performance-based rate limiting.
GPT4All Hits 77K GitHub Stars, Adds DeepSeek R1 for Free Local AI
The GPT4All project has surpassed 77,000 GitHub stars as it adds support for distilled DeepSeek R1 models, enabling reasoning-capable AI to run locally on consumer CPUs with zero API costs.
AI Weekly: GPT-6 Rumors, DeepSeek V4 on Huawei, Anthropic Models, Qwen 3.6-Plus
A weekly roundup video aggregates major AI rumors and announcements, including unverified GPT-6 details, DeepSeek V4 reportedly running on Huawei hardware, and launches of Anthropic's Conway and Ultraplan and Alibaba's Qwen 3.6-Plus.
DeepSeek's HISA: Hierarchical Sparse Attention Cuts 64K Context Indexing Cost
DeepSeek researchers introduced HISA, a hierarchical sparse attention method that replaces flat token scanning. It removes a computational bottleneck at 64K context lengths without requiring any model retraining.
DeepSeek V4 to Run on Huawei Ascend 950PR Chips, Sparking 20% Price Surge
DeepSeek's anticipated V4 model will be powered by Huawei's Ascend 950PR chips, with Alibaba, ByteDance, and Tencent stockpiling hundreds of thousands of units ahead of launch. This has driven chip prices up approximately 20% in recent weeks.
DeepSeek's R1 Model Triggers Major AI Market Valuation Shifts
Chinese AI startup DeepSeek has released its new large language model, R1, causing significant market disruption. The launch reportedly wiped approximately one trillion dollars off tech giants' valuations as the model demonstrated competitive capabilities at lower cost.
DeepSeek-R1 Reportedly Hits 78.9% on OS-World, Outperforming GPT-5.4 at 1/10th Cost
A new benchmark claim suggests DeepSeek-R1 has achieved 78.9% on the OS-World agentic coding benchmark, reportedly outperforming GPT-5.4 while operating at one-tenth the cost. If verified, this would represent a significant leap in cost-performance for AI coding agents.
DeepSeek Teases 'Much Larger' Base Model Release Amid Industry Silence and Hardware Challenges
DeepSeek staff confirmed a new, larger base model is coming soon, following months of quiet after reports of failed Huawei chip training. This comes as the Chinese AI lab faces heightened expectations after its breakthrough o1-level model in January 2025.
China's DeepSeek-R1: Open-Source AI Agent Runs Locally with Web Search, Code Generation, and Built-In Computer
Chinese AI company DeepSeek has released DeepSeek-R1, a fully open-source AI agent that runs locally on personal computers with web search capabilities, code generation, and built-in computer functionality. The model represents a significant move toward accessible, self-contained AI systems outside the dominant U.S. ecosystem.
DeepSeek-R1 Scores 79.8% on SWE-Bench Verified, Matching Claude 3.5 Sonnet in Code Generation
DeepSeek's new R1 reasoning model achieved 79.8% on SWE-Bench Verified, matching Claude 3.5 Sonnet's performance. This marks significant progress in AI's ability to solve real-world coding problems.
DeepSeek V4 Emerges: China's Next AI Contender Takes Shape
DeepSeek appears poised to release its fourth-generation AI model, signaling continued advancement in China's competitive large language model landscape. The upcoming release follows the company's established pattern of rapid iteration.
DeepSeek-V2.5 R1: The Next Frontier in Open-Source AI Arrives
DeepSeek's highly anticipated next-generation model, DeepSeek-V2.5 R1, is reportedly launching this week according to credible sources. This release promises significant advancements in the competitive open-source AI landscape.
The Whale Approaches: DeepSeek v4 Looms as China's Next AI Power Play
Chinese AI firm DeepSeek is preparing to launch its v4 model, potentially narrowing the gap with Western AI leaders to just five months. This development signals China's accelerating progress in the global AI race.
The AI Race Intensifies: DeepSeek v4 and GPT-5.3 Set for Imminent Release
DeepSeek v4 is reportedly launching next week, with OpenAI's GPT-5.3 expected to follow shortly. This rapid succession of releases signals escalating competition in the AI landscape as major players race to establish dominance.
DeepSeek V4 Launch Signals China's Strategic Shift in AI Chip Independence
DeepSeek's upcoming V4 multimodal model prioritizes domestic chip partners Huawei and Cambricon over NVIDIA and AMD, marking a significant move toward Chinese AI self-sufficiency amid ongoing U.S. export restrictions.
DeepSeek's Blackwell Training Exposes Critical Gaps in US Chip Export Controls
Chinese AI startup DeepSeek reportedly trained its latest model on Nvidia's restricted Blackwell chips, challenging US export controls. The development reveals significant loopholes in semiconductor restrictions amid escalating AI competition.
AI Training Data Scandal: DeepSeek Accused of Scraping 150K Claude Conversations
DeepSeek faces allegations of scraping 150,000 private Claude conversations for training data, prompting a developer to release 155,000 personal Claude messages publicly. This incident highlights growing tensions around AI data sourcing ethics and intellectual property.
DeepSeek's Blackwell Gambit: How a Chinese AI Firm Reportedly Circumvented U.S. Chip Export Controls
Chinese AI company DeepSeek reportedly trained its upcoming model using Nvidia's restricted Blackwell chips, potentially clustered in an Inner Mongolia data center. This development highlights the escalating tech rivalry and challenges of enforcing export controls in the AI arms race.
AI Power Shift: How DeepSeek's Alleged Blackwell Chip Access Could Reshape Global AI Race
Chinese AI startup DeepSeek reportedly trained its next major model on Nvidia's banned Blackwell chips, potentially triggering a seismic shift in the AI landscape. US giants Google, OpenAI, and Anthropic are preparing for what could be a market-disrupting release next week.