ML Runtime and Optimization team hiring at Waymo

Our team is expanding, and we’re seeking highly skilled Machine Learning Engineers (L4-L6) to drive innovation in model and runtime optimization for Waymo. If you possess deep expertise in any of the following areas, we want to hear from you:

Model Optimization: Vision Transformer (ViT) and text decoder optimization (mixed precision, quantization, low-rank approximation, etc.).
ML Compiler Optimization: Op fusion, sharding/partitioning/data parallelism, heterogeneous offloading, and related techniques.
GPU Kernel Development: Proficiency in Triton, PTX, or similar.
CUDA Runtime Expertise: Asynchronous scheduling, event management, memory allocation, and more.
ML Framework Device Integration: Developing device runtime plugins for JAX, TensorFlow, or other frameworks, and integrating accelerator device runtimes.

We offer highly competitive, above-market base salaries and pre-IPO stock with high potential. Send me or the hiring manager a message!

Job Links: L6, L5, and L4