Talent.com
Machine Learning Engineer, Training Infrastructure
Hedra, Inc • San Francisco, CA, United States
Posted 30 days ago
Job type: Full-time
Job description

About Hedra

Hedra is a pioneering generative media company backed by top investors at Index, A16Z, and Abstract Ventures. We're building Hedra Studio, a multimodal creation platform capable of control, emotion, and creative intelligence.

At the core of Hedra Studio is our Character-3 foundation model, the first omnimodal model in production. Character-3 jointly reasons across image, text, and audio for more intelligent video generation — it’s the next evolution of AI-driven content creation.

At Hedra, we’re a team of hard-working, passionate individuals seeking to fundamentally change content creation and build a generational company together. We value startup energy, initiative, and the ability to turn bold ideas into real products. Our team is fully in-person in SF / NY with a shared love for whiteboard problem-solving.

Overview

We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement; we value curiosity, creativity, and the drive to solve hard problems.

Responsibilities

Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.

Manage and optimize the performance of our computing clusters and cloud instances, on platforms such as AWS or Google Cloud, to support distributed training.

Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.

Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.

Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.

Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.

Values sound engineering processes, including version control and CI/CD.

Knowledge of containerization and orchestration technologies such as Docker and Kubernetes, required for deployments at scale.

Understanding of distributed training techniques and how to scale models across multi-node clusters, in line with video generation needs.

Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.

Benefits

Competitive compensation + equity

401k (no match)

Healthcare (Silver PPO Medical, Vision, Dental)

Lunch and snacks at the office
