AI Compiler & Kernel Engineer
Write CUDA kernels, ML compilers, and low-level optimizations for AI workloads.
7 Open Positions
Core Skills
CUDA Kernels, Triton, XLA, MLIR, C++, GPU Programming, Compiler Design, FlashAttention
Active Positions (7)
Systems Engineer, Kernel · mid
CoreWeave · Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA
Linux kernel engineering, kernel-level debugging, kernel crash analysis, upstreaming kernel fixes, DPU (Data Processing Unit) integration, GPU kernel optimization
ML HW-SW Co-Design Software Tech Lead Manager (TLM) · senior
Google DeepMind · Mountain View, California, US
HW-SW co-design, software stack, machine learning acceleration, technical execution, architectural alignment, codebase contribution
Performance Engineer, GPU · mid
Anthropic · San Francisco, CA | New York City, NY | Seattle, WA
GPU programming, GPU kernel development, tensor core optimizations, distributed GPU orchestration, GPU utilization optimization, inference efficiency optimization
Senior GenAI Research Engineer - Optimization and Kernels · senior
Databricks · San Francisco, California
kernel fusion, mixed precision optimization, memory layout optimization, tiling strategies, tensorization, GPU kernels for training
ML HW-SW Co-design Software Manager · manager
Google DeepMind · Mountain View, California, US
HW-SW co-design, machine learning accelerators, ML hardware optimization, GenAI infrastructure, AI hardware-software integration, ML accelerator software
TPU Kernel Engineer · mid
Anthropic · San Francisco, CA | New York City, NY | Seattle, WA
TPU kernel optimization, low-precision inference adaptation, quantization for ML accelerators, high-throughput sampling for LLMs, ML accelerator architecture, kernel design for TPUs
Staff Software Engineer - GenAI Performance and Kernel · staff
Databricks · San Francisco, California
GPU Optimization, Attention Kernel Optimization, MLP Kernel Optimization, Kernel Fusion, Mixed Precision Training, Quantization Techniques