Talent.com
CUDA Kernel Optimizer - ML Engineer
Mercor • San Francisco, California, United States

1) Role Overview

Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while maintaining correctness and reproducibility.

2) Key Responsibilities

Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.

Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.

Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.

Report performance metrics, analyze speedups, and propose architectural improvements.

Collaborate asynchronously with PyTorch Operator Specialists to integrate kernels into production frameworks.

Produce well-documented, reproducible benchmarks and performance write-ups.

3) Ideal Qualifications

Deep expertise in CUDA programming, GPU architecture, and memory optimization.

Proven ability to achieve quantifiable performance improvements across hardware generations.

Proficiency with mixed precision, Tensor Core usage, and low-level numerical stability considerations.

Familiarity with frameworks like PyTorch, TensorFlow, or Triton (not required but beneficial).

Strong communication skills and independent problem-solving ability.

Demonstrated open-source, research, or performance benchmarking contributions.

4) More About the Opportunity

Ideal for independent contractors who thrive in performance-critical, systems-level work.

Engagements focus on measurable, high-impact kernel optimizations and scalability studies.

Work is fully remote and asynchronous; deliverables are outcome-driven.

Access to shared benchmarking infrastructure and reproducibility tooling via Mercor support resources.

5) Compensation & Contract Terms

Typical range: $120–$250/hour, depending on scope, specialization, and results achieved. Payment is based on accepted task output rather than a flat hourly rate.

Structured as a contract-based engagement, not an employment relationship.

Compensation tied to measurable deliverables or agreed milestones.

Confidentiality, IP, and NDA terms as defined per engagement.

6) Application Process

Submit a brief overview of prior CUDA optimization experience, profiling results, or performance reports.

Include links to relevant GitHub repos, papers, or benchmarks if available.

Indicate your hourly rate, time availability, and preferred engagement length.

Selected experts may complete a small, paid pilot kernel optimization project.

7) About Mercor

Mercor connects domain experts with top AI research and technology organizations through project-based contracts.

Contractors operate independently, with full flexibility over methods, timelines, and tools.

Our mission is to help top engineers and researchers access frontier technical work without rigid employment structures.
