Talent.com
Senior Software Engineer, Profiling Services

Nvidia Corporation • Santa Clara, CA, United States
Job type: Full-time

Overview

Are you ready to innovate in GPU performance analysis for Machine Learning workloads? Join our Developer Tools Always-On Profiling (AON) team as a Senior Software Architect, where you'll be pivotal in designing, implementing, and leading our Always-On Profiling service. This role demands deep technical expertise, a proven track record of solving ambiguous challenges, and strong technical leadership skills.

Responsibilities

  • Architect and Build Scalable Systems: Drive the design and implementation of the AON profiling service's core systems. Master inter-process communication (IPC), memory management, and low-overhead architectures to handle profiling data from complex multi-node, multi-process, multi-GPU, and cluster environments.
  • Elevate Software Engineering Excellence: Promote high standards in software development, including design patterns, concurrency, parallelism, and advanced debugging for asynchronous systems. Commit to code quality and robust testing to ensure a reliable profiling service.
  • Lead, Mentor, and Innovate: Guide and mentor engineers, provide impactful code reviews, and shape technical roadmaps. Proactively identify complex technical issues within the AON project, break them down, and craft innovative solutions. Problem-solving prowess is crucial for AON's success with ML workloads.
  • Architect and Build High-Performance Platforms: Transform user needs into clear requirements and design documents. Explore diverse approaches to problems, make well-reasoned recommendations, and lead end-to-end feature development, from planning and prototyping to implementation, testing, and customer evaluation. Contribute hands-on development across user applications, drivers, performance counter libraries, and lower-level platform/hardware abstraction layers.
  • Collaborate Across Boundaries: Partner effectively with diverse internal and external teams. Exceptional communication and collaboration skills are key to integrating AON seamlessly into the broader profiling and ML ecosystem.

Qualifications

  • BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
  • 6+ years of meaningful software development experience in C, C++, and Python.
  • 6+ years in system software design, operating systems fundamentals, computer architectures, performance analysis, and delivering production-quality software.
  • Strong interpersonal, verbal, and written communication, demonstrating the ability to build cross-organizational partnerships and lead technical teams through complex challenges.
  • Profiling & Performance Tools Expert: Extensive knowledge of profiling technologies (sampling, tracing), overhead analysis, and diverse profiling data (CPU/GPU events, performance counters, API traces, event correlation). Familiarity with existing profiling ecosystems and their limitations is a plus.
  • GPU & CUDA Proficiency: In-depth knowledge of CUDA APIs, runtime, streams, kernels, and GPU architecture.
  • ML Ecosystem & Performance Analysis: Familiarity with ML frameworks such as PyTorch and JAX, and knowledge of performance analysis for AI training/inference applications.
  • Large-Scale System Development & Debugging: Experience developing and debugging across complex multi-layered software systems, including user mode and kernel drivers, with a proven ability to contribute to and extend substantial codebases (100s of millions of lines).
  • Proficiency in Designing APIs and Interfaces for Profiling Tools: Designs robust, flexible APIs and interfaces enabling seamless integration of profiling tools with various frameworks and custom code.
  • Mastery of Problem Simplification: A history of breaking down ill-defined problems in complex technical domains, designing effective solutions, and leading teams to implement them.

Ways to Stand Out

  • Pioneering Low-Overhead Profiling Systems: A track record of designing and implementing profiling systems with minimal performance impact on target workloads, especially in complex multi-process and distributed environments.
  • Deep Understanding of PyTorch Internals & CUDA Usage: A comprehensive grasp of how PyTorch uses CUDA, including tensor memory, operations, and distributed training functionalities.
  • GPU Performance Analysis & Optimization Acuity: The ability to analyze profiling data and translate it into concrete, actionable insights, particularly within CUDA and ML frameworks like PyTorch.
  • Translating Customer Needs: Skilled at redefining customer requests into actionable use cases and requirements.
  • Strong understanding of system security principles.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 10, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
