Talent.com
Software Engineer - ML/LLM Inference
Alldus • San Francisco, CA, United States
Posted 30 days ago
Job type: Full-time
Job description

My client is searching for a talented engineer to work on ML / LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.

We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you will bridge the gap between AI / ML research and systems programming to build and enhance our next-generation LLM Inference Engine. You will play a crucial role in optimizing the performance, scalability, and efficiency of our LLM serving systems.

Key Responsibilities:

Develop and Enhance Inference Engine:

  • Design, implement, and optimize the next-generation LLM Inference Engine.
  • Integrate the latest LLM inference techniques from research to enhance latency and throughput.

Performance Optimization:

  • Conduct deep performance optimizations across multiple layers of the technology stack, including PyTorch, C++, and CUDA.
  • Analyze and improve system performance to meet the demands of various use cases.
  • Work closely with customers to understand specific performance requirements and optimize solutions accordingly.
  • Provide technical expertise and support to ensure successful deployment and operation of inference systems.

Technical Leadership:

  • Define the roadmap and technical vision for the inference stack.
  • Lead initiatives to drive innovation and maintain the competitive edge of our inference technologies.

Infrastructure Development:

  • Collaborate with partner teams to build and maintain scalable, multi-replica serving infrastructure.
  • Ensure the reliability and scalability of LLM serving systems to handle increasing workloads.

Qualifications:

Technical Skills:

  • Proficiency in systems programming languages such as C++.
  • Strong experience with machine learning frameworks, particularly PyTorch.
  • Expertise in GPU programming and CUDA for performance optimization.
  • Solid understanding of AI / ML concepts, especially related to large language models.

Experience:

  • Proven experience in developing and optimizing ML / LLM inference systems.
  • Demonstrated ability to integrate research advancements into production systems.
  • Experience with performance tuning and profiling across various technology stacks.
  • Experience with vLLM.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Staffing and Recruiting and Software Development

Related jobs

ML Systems Engineer: Distributed LLM Training & Inference
Scale AI • San Francisco, CA, United States
Full-time
A leading AI technology company in San Francisco seeks a team member to build and optimize a machine learning framework for large language models. Candidates should have system optimization experien...

Sr. Software Engineer, ML
Relyance AI • San Francisco, CA, United States
Full-time
NLP for information extraction from legal documents, ML / NLP for information extraction from code and general ML in code analysis, and overall AI backend initiatives. You will partner with cross-func...

Senior MLOps Engineer - Remote-First, High-Impact ML
CompScience • San Francisco, CA, US
Remote • Full-time
A technology-driven startup in California is seeking an experienced Sr MLOps Engineer to build and maintain the infrastructure that powers their machine learning products. In this high-impact role, ...

AI / ML Engineer
Cogent Security • San Francisco, CA, United States
Full-time
Cogent Security is on a mission to stop breaches and prevent cybercrime by innovating at the frontier of generative AI systems. We are building the world’s first AI cyber taskforce, composed of AI a...

ML Inference Engineer — Scalable AI Systems
Together • San Francisco, CA, United States
Full-time
A pioneering AI company in San Francisco seeks a Machine Learning Engineer to join their Inference Engine team. This role involves optimizing AI inference systems, developing high-performance servic...

LLM Engineer
PeopleCaddie • San Francisco Bay Area, United States
Full-time
San Jose, CA (Bay Area) - 2x / wk at client site. Up to $90 per hour (W2), depending on experience. Month (with possible extension). We are seeking a highly experienced. The ideal candidate has deep tech...

AIML - Sr. Software Engineer, ML Platform Technologies (MLPT)
Apple Inc. • San Francisco, CA, United States
Full-time
Software Engineer, ML Platform Technologies (MLPT). San Francisco Bay Area, California, United States. Machine Learning and AI. Want to build the platform that enables the next generation of intellige...

GenAI Inference Engineer — Scalable LLM Serving
Databricks Inc. • San Francisco, CA, United States
Full-time
A leading AI-focused technology company in San Francisco is seeking a Software Engineer for GenAI inference. In this role, you'll design, develop, and optimize the inference engine powering the Foun...

Distributed ML Systems Engineer - Inference
Together AI • San Francisco, CA, United States
Full-time
Together AI is seeking a Distributed ML Systems Engineer to design and build scalable machine learning systems that power our accelerated AI initiatives. This role involves developing large-scale, f...

Senior ML Engineer, Recommendations & Search (Remote)
Grow Therapy • San Francisco, CA, United States
Remote • Full-time
A growing mental healthcare technology firm is seeking a Senior / Staff ML Engineer to develop and deploy algorithms for enhancing user experience. With over 5 years of experience required, the positi...

Software Engineer - ML Infrastructure
Specter • San Francisco, CA, United States
Full-time
Specter is creating a software-defined "control plane" for the physical world. We are starting with protecting American businesses by granting them ubiquitous perception over their physical assets. T...

LLM Inference Performance Engineer
Baseten • San Francisco, CA, United States
Full-time
A dynamic AI startup in San Francisco is seeking a Software Engineer focused on ML performance. This role involves optimizing large language models, debugging and enhancing ML solutions, and produci...

ML Research Engineer, ML Systems
Scale AI, Inc. • San Francisco, CA, United States
Full-time
Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and opera...

Remote Solutions Engineer for ML Testing Platform
Getclera • San Francisco, CA, United States
Remote • Full-time
A leading AI solutions company is seeking a Solutions Engineer to customize software for top clients globally. The role requires strong Python skills, excellent communication capabilities, and a dee...

Senior ML Engineer - Scalable Vision-Language Systems
TwelveLabs • San Francisco, California, United States
Full-time
A pioneering AI technology firm based in San Francisco is seeking a Machine Learning Engineer to enhance its ML systems and engineering workflows. The ideal candidate will have over 6 years of exper...

Software Engineer, AI / ML
Glu Mobile Inc. • San Francisco, CA, United States
Full-time
Glue is a well-funded startup working on the next generation of work communication tools. We believe that today’s work chat is noisy, unstructured, and not designed for productivity. We’re drawing fr...

Software Engineer - Applied ML (US / CAN)
Cohere • San Francisco, CA, United States
Full-time
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier mo...

Staff ML Infrastructure Engineer - Scale & Inference
Snap Inc. • San Francisco, CA, United States
Full-time
A leading tech company is seeking a Software Engineer for ML Infrastructure in San Francisco. This role involves designing high-performance systems for machine learning workloads, collaborating with...