gentic.news — AI News Intelligence Platform
Computer Vision & Multimodal

Build systems that understand images, video, and multimodal inputs. Generation, perception, 3D.

16 Open Positions

Core Skills

Diffusion Models, Vision Transformers, CLIP, Stable Diffusion, NeRF, 3D Reconstruction, OpenCV, CUDA

Active Positions (16)

Conversational Modelling Research Engineer (mid, Remote)
Tavus·Remote
Vision-Language Models (VLMs), Multimodal AI, Model Fine-Tuning, PyTorch, Diffusion Models, Audio Generation
[2026] Senior Machine Learning Engineer, Account Identity - PhD Early Career (senior)
Roblox·San Mateo, CA, United States
Vision-Language Models (VLMs), Multimodal AI, Object Detection, Adversarial ML, Embeddings, Computer Vision
AI Researcher (Multimodal Audio/Video Generation) (mid)
Tavus·San Francisco
Diffusion Models, Multimodal AI, Audio Generation, PyTorch, Distributed Training, Vision-Language Models (VLMs)
Senior Software Engineer, Geometry (senior)
Roblox·Vancouver, British Columbia, Canada
3D Reconstruction, 3D Vision
Android Engineer, ChatGPT ImageGen (mid)
OpenAI·San Francisco
Multimodal AI, Diffusion Models, On-Device ML, Vision-Language Models (VLMs)
Full Stack Software Engineer, ChatGPT ImageGen (mid)
OpenAI·San Francisco
Diffusion Models, Multimodal AI, Vision-Language Models (VLMs), Inference Optimization
Head of Computer Vision (director, Remote)
Photoroom·Paris | Remote
Computer Vision, Foundation Models, Diffusion Models, Vision Transformers, Evaluation Frameworks, Model Fine-Tuning
Senior Machine Learning Engineer - Enrichment & Content Intelligence (senior)
Spotify·New York, NY
Multimodal AI, Natural Language Processing (NLP), Embeddings, Knowledge Distillation
Machine Learning Engineer - Mapping (mid)
Waymo·Mountain View, CA, USA
Vision-Language Models (VLMs), Foundation Models, Few-Shot Learning, PyTorch, JAX, Model Fine-Tuning
Senior Machine Learning Engineer, Perception LLM/VLM (senior)
Waymo·Mountain View, CA USA; San Francisco, CA USA
Vision-Language Models (VLMs), Large Language Models (LLMs), Pre-Training, Foundation Models, Multimodal AI, Sensor Fusion
Senior/Staff ML Engineer, 3D/4D World Modeling, Simulation (senior)
Waymo·Mountain View, CA, USA
Diffusion Models, Vision-Language Models (VLMs), Foundation Models, 3D Reconstruction, Quantization, Distillation
Director, Machine Learning Engineering – Content & User Understanding (director, Remote)
Pinterest·San Francisco, CA, US; Remote, US
Vision-Language Models (VLMs), Large Language Models (LLMs), Recommendation Systems, Foundation Models, Multi-modal AI, Learning-to-Rank
Machine Learning Engineer II, Computer Vision Applied Science (mid, Remote)
Pinterest·San Francisco, CA, US; Remote, US
Vision-Language Models (VLMs), Multimodal AI, Model Fine-Tuning, Diffusion Models, Contrastive Learning, Evaluation Frameworks
Sr. Machine Learning Engineer, Applied Science (senior, Remote)
Pinterest·San Francisco, CA, US; Remote, US
Diffusion Models, Vision-Language Models (VLMs), Multimodal AI, Foundation Models, Contrastive Learning, Embeddings
Research Engineer, Information Quality (mid)
Google DeepMind·Mountain View, California, US
Vision-Language Models (VLMs), Multimodal AI, Computer Vision, Adversarial Testing, Evaluation Frameworks, Contrastive Learning
Research Engineer, Human Understanding (mid)
Google DeepMind·Los Angeles, California, US; Mountain View, California, US
Multimodal AI, Vision-Language Models (VLMs), Speech Recognition (ASR), Contrastive Learning, Adversarial ML, Self-Supervised Learning