Talent.com
Principal Software Engineer, ML Infrastructure

Zoox, Foster City, CA
Posted: 30 days ago
Job type: Full-time

Job description

Zoox is on a mission to reimagine transportation and build, from the ground up, autonomous robotaxis that are safe, reliable, clean, and enjoyable for everyone. We are still in the early stages of deploying our robotaxis, and it's a great time to join Zoox and make a significant impact on executing this mission. The ML Infrastructure team at Zoox plays a crucial role in enabling innovations in ML and CV and making autonomous driving as seamless as possible.

The Opportunity

We are seeking a deeply technical, influential, and hands-on Principal Software Engineer to shape and build our next-generation ML Infrastructure and significantly reduce the time to develop and deploy large-scale ML and Foundation models to our robotaxi. You will lead the design and development of our Data, Compute, Model Training, and Serving Infrastructure. You will work across all AI teams within Zoox, including Perception, Prediction, Planner, Simulation, and Collision Avoidance, and have the opportunity to significantly push the boundaries of how ML is practiced within Zoox.

We build and operate the data infrastructure responsible for ingesting petabytes of sensor data and the systems used to assemble training datasets. We operate the compute infrastructure that powers Zoox’s model training, serving, and large-scale validation pipelines across tens of thousands of GPUs. We also operate the base layer of ML tools, deep learning frameworks, and inference systems used by our applied research teams for in-vehicle and off-vehicle ML use cases.

You will lead a team of strong software engineers and act as a force multiplier for our teams. You can learn more about our ML Infrastructure here and our stack behind autonomous driving here.

In this role, you will

  • Vision: Develop and execute a strategic vision for ML Infrastructure that will unlock innovation in autonomous driving and enhance our rider experience.
  • Technical acumen: Lead the design and implementation of cutting-edge infrastructure spanning all stages of the ML lifecycle, from data preparation and training to evaluation, deployment, and serving.
  • Partnership: Collaborate closely with cross-functional teams, including ML researchers, software engineers, data engineers, and hardware engineers, to define requirements and align on architectural decisions.
  • Mentorship: Enable the engineers on the team to grow their careers by providing technical guidance and mentorship.

Qualifications

  • Experience building and managing large-scale ML infrastructure that powers the development of large-scale ML models.
  • Excellent leadership skills with a demonstrated ability to lead high-performing engineering teams.
  • Strong experience with training frameworks like PyTorch, JAX, etc., leveraging GPUs efficiently for distributed model training.
  • Experience with GPU-accelerated inference using TensorRT, Ray Serve, or similar frameworks.
  • Proficient in Python and/or C++.

Bonus Qualifications

  • Experience enabling the development and deployment of large-scale Foundation models.
  • Experience working on large-scale data infrastructure and big data processing frameworks like Apache Spark.
  • Experience working in the AV domain supporting Perception, Prediction, Planner, et al.

Compensation

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary will range from $373,000 to $448,000. A sign-on bonus may be part of a compensation package. Compensation will vary based on geographic location, job-related knowledge, skills, and experience. Zoox also offers a comprehensive package of benefits including paid time off (sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
