Software Engineer - ML / LLM Inference

Alldus • San Francisco, CA, United States

30 days ago

Job type

Full-time

Job description

My client is searching for a talented engineer to work on ML / LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.

We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you will bridge the gap between AI / ML research and systems programming to build and enhance our next-generation LLM Inference Engine. You will play a crucial role in optimizing the performance, scalability, and efficiency of our LLM serving systems.

Key Responsibilities:

Develop and Enhance Inference Engine:

Design, implement, and optimize the next-generation LLM Inference Engine.
Integrate the latest LLM inference techniques from research to enhance latency and throughput.

Performance Optimization:

Conduct deep performance optimizations across multiple layers of the technology stack, including PyTorch, C++, and CUDA.

Analyze and improve system performance to meet the demands of various use cases.

Work closely with customers to understand specific performance requirements and optimize solutions accordingly.

Provide technical expertise and support to ensure successful deployment and operation of inference systems.

Technical Leadership:

Define the roadmap and technical vision for the inference stack.

Lead initiatives to drive innovation and maintain the competitive edge of our inference technologies.

Infrastructure Development:

Collaborate with partner teams to build and maintain scalable, multi-replica serving infrastructure.

Ensure the reliability and scalability of LLM serving systems to handle increasing workloads.

Qualifications:

Technical Skills:

Proficiency in systems programming languages such as C++.

Strong experience with machine learning frameworks, particularly PyTorch.

Expertise in GPU programming and CUDA for performance optimization.

Solid understanding of AI / ML concepts, especially related to large language models.

Experience:

Proven experience in developing and optimizing ML / LLM inference systems.

Demonstrated ability to integrate research advancements into production systems.

Experience with performance tuning and profiling across various technology stacks.

Experience with vLLM.

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Industries

Staffing and Recruiting and Software Development

Related jobs

ML Systems Engineer : Distributed LLM Training & Inference

Scale AI • San Francisco, CA, United States

Full-time

A leading AI technology company in San Francisco seeks a team member to build and optimize a machine learning framework for large language models. Candidates should have system optimization experien...

Sr. Software Engineer, ML

Relyance AI • San Francisco, CA, United States

Full-time

NLP for information extraction from legal documents, ML / NLP for information extraction from code and general ML in code analysis, as well as overall AI backend initiatives. You will partner with cro...

AI / LLM Software Engineer - onsite in San Francisco

A5 Talent Finders • San Francisco, CA, United States

Full-time

About the job AI / LLM Software Engineer - onsite in San Francisco. AI / LLM Systems Software Engineer. As a founding engineer, you'll design and build core systems from the ground up, working across t...

Software Engineer, AI / ML

Glu Mobile Inc. • San Francisco, CA, United States

Full-time

Glue is a well-funded startup working on the next generation of work communication tools. We believe that today’s work chat is noisy, unstructured, and not designed for productivity. We’re drawing fr...

Senior Software Engineer, Machine Learning

Planet Labs PBC • San Francisco, CA, United States

Full-time

We believe in using space to help life on Earth. Planet designs, builds, and operates the largest constellation of imaging satellites in history. This constellation delivers an unprecedented dataset ...

AIML - Sr. Software Engineer, ML Platform Technologies (MLPT)

Apple Inc. • San Francisco, CA, United States

Full-time

Software Engineer, ML Platform Technologies (MLPT). San Francisco Bay Area, California, United States Machine Learning and AI. Want to build the platform that enables the next generation of intellige...

GenAI Inference Engineer — Scalable LLM Serving

Databricks Inc. • San Francisco, CA, United States

Full-time

A leading AI-focused technology company in San Francisco is seeking a Software Engineer for GenAI inference. In this role, you'll design, develop, and optimize the inference engine powering the Foun...

Software Engineer, Model Inference

OpenAI • San Francisco, CA, United States

Full-time

Our Inference team brings OpenAI's most capable research and technology to the world through our products. We empower consumers, enterprise and developers alike to use and access our start-of-the-ar...

ML / AI Software Engineer - Metrics Frameworks

General Motors • San Francisco, CA, United States

Full-time

As an AI / ML Engineer on the Metrics Frameworks team, part of the Simulation, Evaluation, and Data organization, you will be an individual contributor focused on developing and optimizing infrastruc...

Distributed ML Systems Engineer- Inference

Together AI • San Francisco, CA, United States

Full-time

Together AI is seeking a Distributed ML Systems Engineer to design and build scalable machine learning systems that power our accelerated AI initiatives. This role involves developing large-scale, f...

Senior ML Engineer, Recommendations & Search (Remote)

Grow Therapy • San Francisco, CA, United States

Remote

Full-time

A growing mental healthcare technology firm is seeking a Senior / Staff ML Engineer to develop and deploy algorithms for enhancing user experience. With over 5 years of experience required, the positi...

ML Research Engineer, ML Systems

Scale AI, Inc. • San Francisco, CA, United States

Full-time

Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and opera...

AIML - Sr. Software Engineer, ML Platform Technologies (MLPT)

Apple • San Francisco, CA, United States

Full-time

Want to build the platform that enables the next generation of intelligent experiences on Apple products & services? As a software engineer on the Machine Learning Platform team, you will be respon...

LLM Inference Performance Engineer

Baseten • San Francisco, CA, United States

Full-time

A dynamic AI startup in San Francisco is seeking a Software Engineer focused on ML performance. This role involves optimizing large language models, debugging and enhancing ML solutions, and produci...

Software Engineer, ML Infrastructure

Profluent • Emeryville, CA, United States

Full-time

Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville...

Sr. Software Engineer - Applied ML

Databricks • San Francisco, CA, United States

Full-time

Senior Applied AI Engineer – ML for Systems & Infrastructure. The Applied AI team at Databricks sits at the forefront of advancing GenAI-powered products. Over the past years, we’ve launched Databric...

ML Inference Engineer - Scalable AI Systems

Together • San Francisco, CA, US

Full-time

A pioneering AI company in San Francisco seeks a Machine Learning Engineer to join their Inference Engine team. This role involves optimizing AI inference systems, developing high-performance servic...

Remote Solutions Engineer for ML Testing Platform

Getclera • San Francisco, CA, United States

Remote

Full-time

A leading AI solutions company is seeking a Solutions Engineer to customize software for top clients globally. The role requires strong Python skills, excellent communication capabilities, and a dee...