Talent.com

Performance engineer • Berkeley, CA

Machine Learning Engineer - Model Performance
Inference • San Francisco, California, United States
Full-time
Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models an...

Performance Test Engineer - Senior Manager
PwC • San Francisco, CA
Full-time
Summary: At PwC, our people in software and product innovation focus on developing cutting-edge software solutions and driving product innovation to meet the evolving needs of clients. These individua...

Performance Engineer - Java / LoadRunner
Staffing • San Francisco, CA, US
Full-time
Performance Engineer - Java / LoadRunner. Job Location: San Francisco, CA or San Jose, CA. Key Skillset: 5+ years of Performance Center or LoadRunner experience. Java performance tuning and optimization ...

Senior AI Performance Engineer
Genmo • San Francisco, California, United States
Full-time
We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the bo...

AI Agent Software Engineer - Agent Performance Engineering
Assembled • San Francisco, CA, US
Full-time
Agent Performance Engineering Role. As part of the Agent Performance Engineering team, you'll be working on the core systems that make our AI agents smarter, more accurate, and more capable of handl...

Performance Engineer
Anthropic • San Francisco, CA, US
Full-time
Research Engineer, Frontier Red Team (Rsp Evaluations). San Francisco, CA | Seattle, WA. Research Scientist, Frontier Red Team (Autonomy). Remote-Friendly (Travel-Required) | San Francisco, CA | Seatt...

Software Engineer, ML Performance
OpenAI • San Francisco, CA, United States
Full-time
Our Inference team brings OpenAI’s most capable research and technology to the world t...

Senior HPC Performance Engineer
NVIDIA • Remote, CA, US
Remote
Full-time
As a member of our team in NVIDIA's NVHPC compilers & tools group, you will analyze and run High Performance Computing (HPC) applications on HPC servers and systems to gain insight into the per...

Sr Building Performance Engineer
HGA • San Francisco, CA, US
Full-time
HGA is an award-winning architectural, engineering and planning firm with a full-time opportunity for a talented, ambitious. HGA's Building Performance presence in the West Coast. We define a buildin...

Principal AI Performance Engineer
Epoch Biodesign • San Francisco, CA, United States
Full-time
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to p...

QE Lead Performance Engineer
US012 Marsh & McLennan Agency LLC • California, San Francisco
Full-time
Award-winning, inclusive, Top Workplace culture doesn’t happen overnight. It’s a result of hard work by extraordinary people. The industry’s brightest talent drive our efforts to deliver purposeful w...

Lead Performance Tester / Engineer
Diverse Lynx • San Francisco, CA, US
Full-time
Design and implement comprehensive performance testing strategies for web, mobile, and backend systems. Develop, maintain, and execute performance test scripts using LoadRunner and NeoLoad and famil...

Sr. Performance Engineer San Francisco, California
Databricks Inc. • San Francisco, CA, US
Full-time
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform so our cust...

Sr. Software Engineer - Performance
Databricks • San Francisco, California
Full-time
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform so our cust...

Principal AI Performance Engineer
Crusoe • San Francisco, CA, US
Full-time
Crusoe is building the World's Favorite AI-first Cloud infrastructure company. We're pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to p...

Performance engineer
writer.com • San Francisco, CA, US
Full-time
Writer is seeking a highly skilled and motivated Principal performance engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensu...

Performance Engineer / Tester
Omega Solutions Inc. • San Francisco, CA, US
Full-time
This is Ashok from Omega solutions. This is regarding an immediate opening for Performance Engineer. Please find the below description and let me know your interest. Designs, configures and runs perfo...

Machine Learning Engineer - Model Performance
SOLANA FOUNDATION • San Francisco, CA, US
Full-time
Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models an...

Senior Embedded Software Engineer, Performance
Square • San Francisco, CA, US
Full-time
Since we opened our doors in 2009, the world of commerce has evolved immensely, and so has Square. After enabling anyone to take payments and never miss a sale, we saw sellers stymied by disparate, ...

Senior Site Reliability Engineer, Performance
Cisco Meraki • US; San Francisco, CA, United States
Remote
Full-time
Application window is open until further notice. The Meraki cloud supports millions of customer devices from 10 data centers and numerous public cloud regions from around the world. Meraki’s customer...

Machine Learning Engineer - Model Performance

Inference • San Francisco, California, United States
Full-time

Job description

Inference.net is seeking a Machine Learning Engineer to join our team, focusing on optimizing the performance of our cutting-edge AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. You will be responsible for deploying state-of-the-art models at scale and performing optimizations to increase throughput and enable new features. This position offers the chance to collaborate closely with our engineering team and make significant contributions to open source projects, like SGLang and vLLM.

About Inference.net

We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models such as DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network.

We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco. Our investors include a16z CSX and Multicoin. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard and deeply enjoy the work that we do.

Responsibilities

  • Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models
  • Deploy and maintain large language models at scale in production environments
  • Deploy new models as they are released by frontier labs
  • Implement techniques like quantization, speculative decoding, and KV cache reuse
  • Contribute regularly to open source projects such as SGLang and vLLM
  • Deep dive into underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
  • Collaborate with the engineering team to bring new features and capabilities to our inference platform
  • Develop robust and scalable infrastructure for AI model serving
  • Create and maintain technical documentation for inference systems

Requirements

  • 3+ years of experience writing high-performance, production-quality code
  • Strong proficiency with Python and deep learning frameworks, particularly PyTorch
  • Demonstrated experience with LLM inference optimization techniques
  • Hands-on experience with SGLang and vLLM, with contributions to these projects strongly preferred
  • Familiarity with Docker and Kubernetes for containerized deployments
  • Experience with CUDA programming and GPU optimization
  • Strong understanding of distributed systems and scalability challenges
  • Proven track record of optimizing AI models for production environments

Nice to Have

  • Familiarity with TensorRT and TensorRT-LLM
  • Knowledge of vision models and multimodal AI systems
  • Experience implementing techniques like quantization and speculative decoding
  • Contributions to open source machine learning projects
  • Experience with large-scale distributed computing

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus competitive equity and benefits, including:

  • Full healthcare coverage
  • Quarterly offsites
  • Flexible PTO

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're passionate about building the next generation of high-performance systems that push the boundaries of what's possible with large language models, we want to hear from you!