gentic.news — AI News Intelligence Platform

Data & Synthetic Data Engineer

Build data pipelines, curation systems, and synthetic data generation for training AI models.

45 Open Positions

Core Skills

Synthetic Data Generation, Data Curation Pipelines, Spark, Airflow, dbt, Data Quality, Annotation Pipelines, BigQuery

Active Positions (45)

Forward Deployed Engineer - Data-as-a-Service·mid
Snorkel AI·New York City, NY (Hybrid); Redwood City, CA (Hybrid); San Francisco, CA (Hybrid)
Human-in-the-loop (HITL) data generation, AI data pipeline lifecycle, ML-based workflows, Data quality frameworks, ML-assisted applications
AI Data Operations Lead·senior
Figure AI·San Jose, CA
data collection pipeline, data quality assurance, user acquisition, incentive management, data dashboarding, operational infrastructure
Associate Data Scientist - User Fraud·mid
Spotify·Toronto
fraud detection, abuse mitigation, anomaly detection, account abuse prevention, artificial manipulation detection
Research, Post-Training Data·mid
Cognition (Devin)·San Francisco Bay Area
Post-training data research, Human feedback data collection, Synthetic data generation pipelines, Model-assisted labeling, Human preference modeling, Scalable oversight paradigms
AI Tutor - Vietnamese·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Tamil·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Gujarati·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Japanese·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Urdu·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Hindi·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Indonesian·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Italian·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Korean·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Marathi·mid·Remote
xAI·Remote
Grok, Multilingual Audio AI Training, Speech Recognition, Voice Interaction Training, Audio Data Annotation, Cross-cultural Audio Nuances
AI Tutor - Norwegian·mid·Remote
xAI·Remote
Grok, Multilingual Audio AI Training, Speech Recognition, Voice Interaction Training, Audio Data Annotation, Cross-cultural Audio Nuances
AI Tutor - Polish·mid·Remote
xAI·Remote
Grok, Multilingual Audio AI Training, Speech Recognition, Voice Interaction Training, Audio Data Annotation, Cross-cultural Audio Nuances
AI Tutor - Portuguese·mid·Remote
xAI·Remote
Grok, Multilingual Audio AI Training, Speech Recognition, Voice Interaction Training, Audio Data Annotation, Cross-cultural Audio Nuances
AI Tutor - Punjabi·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Spanish·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Swedish·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Tagalog·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Telugu·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Thai·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Turkish·mid·Remote
xAI·Remote
Grok, multilingual audio annotation, speech recognition training, voice interaction training, audio data curation, accent adaptation
AI Tutor - Arabic·mid·Remote
xAI·Remote
Multilingual audio annotation, Speech recognition training, Voice interaction design, Accent and dialect analysis, Audio data curation, Grok AI model training
AI Tutor - Bengali·mid·Remote
xAI·Remote
Multilingual audio annotation, Speech recognition training, Voice interaction design, Accent and dialect analysis, Audio data curation, Grok AI model training
AI Tutor - Finnish·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - French·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - German·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Hebrew·mid·Remote
xAI·Remote
Grok, Multilingual Audio Annotation, Speech Recognition Training, Voice Interaction Training, Audio Data Curation, Accent Recognition
AI Tutor - Danish·mid·Remote
xAI·Remote
multilingual audio annotation, speech recognition for diverse accents, voice interaction training, audio data curation, Grok voice capabilities, multilingual speech processing
AI Tutor - Dutch·mid·Remote
xAI·Remote
multilingual audio annotation, speech recognition for diverse accents, voice interaction training, audio data curation, Grok voice capabilities, multilingual speech processing
Engineering Manager, Financial Data Quality·manager
OpenAI·San Francisco
data quality monitoring, data validation, data lineage, system migrations, vendor dependency reduction, financial data modeling
Data Engineer II - SRC - Music·mid
Spotify·London
Spotify Rights Center (SRC), rights management platform, automated content scanning, policy management, enforcement pipelines, appeals workflows
AI Healthcare and Administration Tutor·mid·Remote
xAI·Remote
AI Healthcare and Administration, data annotations, patient care coordination, medical billing, administrative workflows, healthcare operations
Technical Solutions Specialist, Data Operations·mid
Anthropic·San Francisco, CA | New York City, NY
data deduplication, data pipeline management, cloud infrastructure for data delivery, data compliance tracking, data usage tracking, data filtering systems
Quality Manager & Trainer (x/f/m)·manager
Doctolib·Berlin, Berlin, Germany
Google Cloud Platform (GCP), BigQuery, Dataflow, Pub/Sub, Cloud Storage, Vertex AI
Data Engineer - Customer Service Platform·mid
Spotify·London
BigQuery, GCP Dataflow, GCP Pub/Sub, batch data processing, real-time data processing, data quality checks
AI Tutor - Crypto·mid·Remote
xAI·Remote
Crypto Expert, cryptocurrency, digital asset markets, quantitative crypto strategies, on-chain analysis, DeFi protocols
Data Scientist, Subscriptions·mid
Spotify·New York, NY
subscriber behavior analysis, plan change impact measurement, new proposition evaluation, portfolio data science, subscription plan evolution, add-on revenue streams
Data Engineer, Safeguards·mid
Anthropic·London, UK
safety monitoring data pipelines, abuse detection data infrastructure, enforcement workflow data systems, data warehousing for safety, model behavior analytics, misuse pattern analysis
Software Engineer, Human Data Interface·mid
Anthropic·San Francisco, CA | New York City, NY
data collection pipelines, human data interfaces, crowdworker tooling, data vendor interfaces, high-quality data collection at scale, research data needs
Senior Data Engineer - AI Focused (x/f/m)·senior
Doctolib·Paris, Paris, France
LLM, VLM, RAG, AI Medical Companion, Google Cloud Platform, BigQuery
[Annotations] Operations Associate·mid
Scale AI·Mexico City, MX
robotics models data creation, perception training data, manipulation training data, autonomous decision-making data, training data at scale, process improvement
Senior Software Engineer, Backend - Frontier Data·senior
Scale AI·San Francisco, CA; New York, NY
agent workflows, coding agents, tool-use orchestration, GUI/computer-use automation, frontier agentic data products, scalable backend systems for AI