Platform Engineer, Model Shaping

Together AI • San Francisco, CA, United States

Posted 30 days ago

Job type: Full-time

Job description

About Model Shaping

The Model Shaping team at Together AI works on products and research for tailoring open foundation models to downstream applications. We build services that allow machine learning developers to choose the best models for their tasks and further improve these models using domain-specific data. We also develop new methods for more efficient model training and evaluation, drawing inspiration from a broad spectrum of ideas across machine learning, natural language processing, and ML systems.

About the Role

As a Platform Engineer at Model Shaping, you will work on the foundational layers of Together's platform for model customization and evaluation. You will design the infrastructure and backend services that will allow us to sustainably and reliably scale the systems powering production workflows launched by our users, as well as internal research experiments.

You will operate in a cross-functional environment, collaborating with other engineers and researchers on the team to improve the infrastructure based on the needs of their projects. You will also work with other engineering teams at Together (such as Commerce, Data Engineering, and Cloud Infrastructure) to integrate the services developed by Model Shaping with the systems those teams build.

Responsibilities

Design and build Together's systems and infrastructure for model customization, including user-facing features and internal improvements
Contribute to reliability improvements for the platform, participating in an on-call rotation and improving processes for incident response
Create and improve internal tooling for deployment, continuous integration, and observability
Build a job orchestration platform spanning multiple data centers, supporting a highly heterogeneous hardware landscape
Partner with teams developing internal services, co-designing these services and incorporating them in systems built by Model Shaping

Requirements

3+ years of experience in building infrastructure or backend components of production services

Comfortable with the fundamentals of Linux environments and modern container / orchestration stacks (e.g., Docker and Kubernetes)

Strong software engineering background in Python or Go

Experienced with infrastructure automation tools (Terraform, Ansible), monitoring / observability stacks (Prometheus, Grafana), and CI / CD pipelines (GitHub Actions, ArgoCD)

Skilled at analyzing non-trivial issues in complex software systems and documenting your findings

Experience administering cloud environments (e.g., AWS / GCP / Azure), preferably in a hybrid bare-metal / cloud setup

Strong communication skills and a willingness to document systems and processes and to collaborate with peers of varying technical expertise

Stand-out experience

Developing large-scale production systems with high reliability requirements

Pipeline orchestration frameworks (e.g., Kubeflow, Argo Workflows, Flyte)

Managing GPU workloads on HPC clusters, ideally with hands-on experience in operating NVIDIA's networking stack (e.g., NCCL, Mellanox firmware, GPUDirect RDMA)

Deployment of services for AI training or inference

Maintaining or contributing to open-source projects

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, RedPajama, SWARM Parallelism, and SpecExec. We invite you to join a passionate group of researchers on our journey to build the next-generation AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is $200,000 - $290,000. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at
