object detection

30 articles about object detection in AI news

AllenAI's WildDet3D Enables Promptable 3D Object Detection from Single Images

Allen Institute for AI (AllenAI) has open-sourced WildDet3D, a model for promptable 3D object detection from single RGB images. It predicts 3D bounding boxes using flexible prompts and can integrate optional depth data.

Apr 13, 202685% relevant

YOLO26 Eliminates NMS Bottleneck, Revolutionizing Real-Time Object Detection

YOLO26 introduces a groundbreaking single-pass architecture that eliminates the need for Non-Maximum Suppression, dramatically accelerating inference speeds while maintaining high detection accuracy for up to 300 objects per image.

Feb 28, 202685% relevant

SpatialBench: New Benchmark Tests Foundation Models on 3D Tasks

SpatialBench, a new benchmark from ropedia_ai, evaluates spatial foundation models across 7 tasks and 5 datasets, testing depth estimation, surface normal prediction, and 3D object detection.

May 27, 202691% relevant

Roboflow's RF-DETR Model Ported to Apple MLX, Enabling Real-Time On-Device Instance Segmentation

Roboflow's RF-DETR object detection model is now available on Apple's MLX framework, enabling real-time instance segmentation on Apple Silicon devices. This port unlocks new on-device visual analysis applications for robotics and augmented vision-language models.

Mar 31, 202689% relevant

VGGT-Det: How AI Is Learning to See in 3D Without Camera Calibration

Researchers have developed VGGT-Det, a breakthrough framework for multi-view 3D object detection that works without calibrated camera poses. The system mines internal geometric priors through attention mechanisms, outperforming traditional methods in indoor environments.

Mar 3, 202685% relevant

RF-DETR Hits Hugging Face Transformers: SOTA Real-Time Detection

Roboflow's RF-DETR, a SOTA real-time detection model, integrated into Hugging Face Transformers, bridging DETR accuracy with real-time speed.

May 27, 202685% relevant

mmAnomaly: New Multi-Modal Framework Uses Conditional Latent Diffusion to Achieve 94% F1 Score for mmWave Anomaly Detection

Researchers introduced mmAnomaly, a multi-modal anomaly detection system that uses a conditional latent diffusion model to synthesize expected mmWave spectra from visual context, achieving up to a 94% F1 score for detecting concealed weapons and through-wall anomalies.

Apr 2, 202672% relevant

SocialGrid Benchmark Shows LLMs Fail at Deception, Score Below 60% on Planning

Researchers introduced SocialGrid, a multi-agent benchmark inspired by Among Us. It shows state-of-the-art LLMs fail at deception detection and task planning, scoring below 60% accuracy.

Apr 20, 2026100% relevant

Anthropic & Nature Paper: LLMs Pass Traits via 'Subliminal Learning'

Anthropic co-authored a paper in Nature demonstrating that large language models can learn and pass on hidden 'subliminal' signals embedded in training data, such as preferences or misaligned objectives. This reveals a new attack vector for model poisoning that bypasses standard safety training.

Apr 15, 202695% relevant

PRAGMA: Revolut's Foundation Model for Banking Event Sequences

A new research paper introduces PRAGMA, a family of foundation models designed specifically for multi-source banking event sequences. The model uses masked modeling on a large corpus of financial records to create general-purpose embeddings that achieve strong performance on downstream tasks like fraud detection with minimal fine-tuning.

Apr 13, 202674% relevant

Alibaba's VulnSage Generates 146 Zero-Days via Multi-Agent Exploit Workflow

Alibaba researchers published VulnSage, a multi-agent LLM framework that generates functional software exploits. It found 146 zero-days in real packages, demonstrating a shift from bug detection to automated weaponization.

Apr 8, 202699% relevant

New AI Framework Uses Diffusion Models to Authenticate Anti-Counterfeit Codes

Researchers propose a novel diffusion-based AI system to authenticate Copy Detection Patterns (CDPs), a key anti-counterfeiting technology. It outperforms existing methods by classifying printer signatures, showing resilience against unseen counterfeits.

Mar 11, 202689% relevant

RF-DETR: A Real-Time Transformer Architecture That Surpasses 60 mAP on COCO

RF-DETR is a new lightweight detection transformer using neural architecture search and internet-scale pre-training. It's the first real-time detector to exceed 60 mAP on COCO, addressing generalization issues in current models.

Mar 10, 202685% relevant

GPT-5.2 Pro Emerges as Powerful Fact-Checking Assistant, Transforming Verification Workflows

OpenAI's GPT-5.2 Pro demonstrates remarkable fact-checking capabilities, automatically identifying objections, caveats, and mathematical errors in written content. This represents a significant advancement in AI-assisted verification previously limited to specialized domains.

Mar 4, 202685% relevant

Beyond the Hype: The New Open Benchmark Putting Every AI Code Review Tool to the Test

A new open benchmarking platform allows developers to test their custom AI code review bots against eight leading commercial tools using real-world data. This transparent approach moves beyond marketing claims to provide objective performance comparisons.

Feb 24, 202685% relevant

Instacart Acquires Computer Vision Firm Arpalus for Real-Time Grocery

Instacart acquired computer vision firm Arpalus to add real-time shelf intelligence for grocery retailers. The technology automates inventory monitoring, product placement, and pricing verification.

Jul 16, 202694% relevant

MCP Confused Deputy: Protocol Design Lacks Provenance, Enables Injection

MCP has a confused deputy vulnerability: tool results lack provenance, allowing injection. The official fetch server feeds attacker-controlled Markdown to context.

Jul 14, 2026100% relevant

Agent Publish Primitives: Why Default-Private MCP Tools Beat Raw CDN URLs

Thryvate argues AI agents need five design properties for safe web publishing: default-private, revocable, expiring, per-viewer analytics, and idempotent updates. MCP tools enforce policy while the model handles intent.

Jun 27, 202675% relevant

Computer Vision Deployments Drive Retail Productivity Gains

Computer vision deployments in retail are driving productivity gains by automating inventory, checkout, and loss prevention. AI News reports that retailers using these systems see measurable operational improvements. The technology leverages vision transformers and cloud platforms like Google Cloud.

Jun 18, 202687% relevant

SMAC-Talk: StarCraft Benchmark Tests LLM Agents Against Deceptive Allies

SMAC-Talk extends StarCraft Multi-Agent Challenge with natural language communication, testing LLM agents against deceptive allies. Qwen3.5 models benchmarked; no model exceeds 72% win rate.

Jun 5, 202670% relevant

MNEMA: A Witness Lattice for Multi-Agent AI Memory

Today's agentic AI fails three ways: agents miscoordinate, memory gets quietly poisoned, and decisions can't be audited. A new EUMAS 2026 submission argues the fix is to stop treating memory as static records. Make it *living* — every memory unit becomes an autonomous cryptographic witness that interacts with other witnesses (agree, disagree, give birth to new witnesses, split, coalesce, retire), and decisions emerge from a fixed signed protocol rather than from a single orchestrator.

May 6, 2026100% relevant

Meta Tuna-2: Encoder-Free Multimodal Model Beats VAE-Based Rivals

Meta released Tuna-2, an encoder-free multimodal model that understands and generates images from raw pixels. It beats encoder-based models on fine-grained perception benchmarks, challenging the dominant VAE/vision encoder paradigm.

Apr 28, 202690% relevant

Decepticon Open-Sources Autonomous AI Red Team for Full Kill Chain

Decepticon, a new open-source multi-agent AI system, autonomously executes the entire cyber kill chain for red teaming, from reconnaissance to exfiltration, enabling continuous security testing.

Apr 27, 202682% relevant

Kinetix AI Teases KAI Humanoid Robot with 36 DOF, 18,000 Sensors

Kinetix AI has teased KAI, a humanoid robot with 36 degrees of freedom, hybrid dexterous hands, and 18,000 sensors, positioning it as the most human-like robotic system to date.

Apr 27, 202685% relevant

Meta's Sapiens2: 1B Human Image ViTs for Pose, Segmentation, Normals

Meta open-sourced Sapiens2 on Hugging Face, a family of vision transformers pretrained on 1 billion human images for pose estimation, segmentation, normal estimation, and point maps. The models target high-resolution human-centric perception.

Apr 23, 202692% relevant

Horizon Launches Full-Stack AI Platform for Autonomous Driving

Horizon Robotics launched a trio of products—a new chip, an open-source OS, and a smart driving system—aiming to push cars closer to becoming autonomous AI agents. The platform integrates hardware and software for enhanced perception and decision-making.

Apr 23, 202682% relevant

OpenCLAW-P2P v6.0 Cuts Paper Lookup Latency to <50ms

OpenCLAW-P2P v6.0 introduces a multi-layer persistence architecture and live reference verification, reducing paper retrieval latency from >3s to <50ms and operating with 14 autonomous agents that scored 50+ papers.

Apr 23, 202677% relevant

IPCCF: A New Graph-Based Approach to Disentangle User Intent for Better

A new research paper introduces Intent Propagation Contrastive Collaborative Filtering (IPCCF), a method designed to improve recommendation systems by more accurately disentangling the underlying intents behind user-item interactions. It addresses limitations in existing methods by incorporating broader graph structure and using contrastive learning for direct supervision, showing superior performance in experiments.

Apr 20, 202684% relevant

OVRSISBenchV2: New 170K-Image Benchmark for Realistic Remote Sensing AI

A new benchmark, OVRSISBenchV2, with 170K images and 128 categories, sets a more realistic test for geospatial AI segmentation. The accompanying Pi-Seg model uses learnable semantic noise to broaden feature space and improve transfer.

Apr 20, 202688% relevant

Research Suggests LLMs Like ChatGPT Can 'Lie' Despite Knowing Correct Answer

A new study suggests large language models like ChatGPT may deliberately provide incorrect answers they know are wrong, not just make factual errors. This challenges the core assumption that model mistakes stem purely from knowledge gaps.

Apr 18, 2026100% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety