Software Engineer, Model Inference

OpenAI • San Francisco, California, United States

Job type: Full-time

Our team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to do before. We focus on performant and efficient model inference, as well as on accelerating research progress via model inference.

About the Role

We are looking for an engineer who wants to take the world's largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production environment.

In this role, you will:

Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.

Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our deployed models.

Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.

Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.

You might thrive in this role if you:

Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.

Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.

Have at least 3 years of professional software engineering experience.

Have, or can quickly gain, familiarity with PyTorch, NVIDIA GPUs, and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand and MPI.

Have experience architecting, observing, and debugging production distributed systems.

Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.

Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.

Are self-directed and enjoy figuring out the most important problem to work on.

Have a good intuition for when off-the-shelf solutions will work, and can quickly build tools to accelerate your own workflow when they won’t.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
