Talent.com
Senior Solution Architect, HPC and AI - NVIS

NVIDIA Corporation • Santa Clara, CA, United States
Job type: Full-time
Job description

NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology. Our history of innovation drives us to solve the world’s hardest problems.

NVIDIA is looking for a Senior HPC / AI Solutions Architect to join its NVIDIA Infrastructure Specialists Team. Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers. Join the team building many of the largest and fastest AI / HPC systems in the world! We are looking for someone who can work on a dynamic, customer-focused team that requires excellent interpersonal skills. In this role, you will interact with customers, partners, and internal teams to analyze, define, and implement large-scale AI / HPC projects. The scope of these efforts includes a combination of Networking, System Design, and Automation, and being the face to the customer!

What You’ll Be Doing:
  • Primary responsibilities will include building robust AI / HPC infrastructure for new and existing customers.
  • Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, training stability, real-time monitoring, logging, and alerting.
  • Engage in and improve the whole lifecycle of services from inception and design through deployment, operation, and refinement.
  • Focus primarily on understanding the AI workload and how it interacts with other parts of the system, such as networking, storage, deep learning frameworks, and data cleaning tools.
  • Help maintain services once they are live by measuring and monitoring progress of AI jobs and helping engineering design solutions for more robust training at scale.
  • Provide feedback to internal teams such as opening bugs, documenting workarounds, and suggesting improvements.
What We Need to See:
  • BS / MS / PhD or equivalent experience in Computer Science, Data Science, Electrical / Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience with Python / C++ / other software development.
  • Track record of medium- to large-scale AI training and understanding of key libraries used for NLP / LLM / VLA training (NeMo Framework, DeepSpeed, etc.).
  • Experience with integration and deployment of software products in production enterprise environments, and microservices software architecture.
  • You are excited to work with multiple levels and teams across organisations (Engineering, Product, Sales, and Marketing teams). Capable of working in a constantly evolving environment without losing focus. Ability to multitask in a fast-paced environment.
  • Driven with strong analytical and problem-solving skills. Strong time-management and organization skills for coordinating multiple initiatives, priorities and implementations of new technology and products into very sophisticated projects.
  • You are a self-starter with a demeanour for growth and a passion for continuous learning and sharing findings across the team.
  • Technical leadership and strong understanding of NVIDIA technologies, and success in working with customers.
  • Excellent verbal, written communication, and technical presentation skills in English.
Ways to Stand Out from The Crowd:
  • Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains. Experience running large-scale distributed DL training.
  • Understanding of HPC systems: data center design, high-speed interconnects (InfiniBand), cluster storage, and scheduling-related design and / or management experience.
  • Proven experience with one or more Tier-1 Clouds (AWS, Azure, GCP or OCI) and cloud-native architectures and software.
  • Expertise with parallel filesystems (e.g. Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects (InfiniBand, Omni-Path, and Gig-E).
  • Strong coding and debugging skills, and demonstrated expertise in one or more of the following areas: Machine Learning, Deep Learning, Slurm, Docker / Kubernetes, Singularity, MPI, MLOps, LLMOps, Ansible, Terraform, and other high-performance AI cluster solutions.
  • Technical leadership and strong understanding of NVIDIA technologies including DGX Cloud, NVIDIA AI Enterprise software, Base Command Manager, NeMo, and NVIDIA Inference Microservices. Success in working with customers using NVIDIA technologies.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you.

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and .

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


Related jobs

Senior Solution Architect – AI / GPU Cloud
GMI Cloud • Mountain View, CA, United States
Full-time
We are seeking a Senior Solution Architect to design GPU‑cloud and AI infrastructure solutions, lead PoCs and benchmarks, guide customers through deployme...

Lead System Solutions Architect – AI & HPC Clusters
AMD • San Jose, CA, United States
Full-time
A leading technology company in San Jose is seeking an experienced System Solutions Architect focused on large clusters for AI workloads. In this role, you will lead customer discovery sessions, tra...

AI Solution Manager
Supermicro • San Jose, CA, United States
Full-time
Supermicro is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop / Big Data, Hyperscale, HPC and IoT / Embedded customers...

Senior Solution Architect
LotusFlare, Inc. • Santa Clara, US
Full-time
LotusFlare employees join and remain at LotusFlare for two simple reasons. First, they can see immediately that their work makes a positive impact on LotusFlare customers, and second, they grow on a...

Senior Solution Architect, HPC and AI - NVIS
NVIDIA Corporation • Santa Clara, CA, United States
Full-time
NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest ...

System Solutions Architect - Large Clusters for AI & HPC workloads
CareerArc • San Jose, CA, United States
Full-time
WHAT YOU DO AT AMD CHANGES EVERYTHING. At AMD, our mission is to build great products that accelerate next‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded syst...

Senior Cloud Solution Architect
Tencent • Palo Alto, CA, United States
Full-time
Cloud & Smart Industries Group (CSIG) ...

Solution Architect, Strategic Technology Partnerships
JFrog • Sunnyvale, CA, United States
Full-time
Talent @ JFrog 🐸 Pushing Talents Frogward! At JFrog, we’re reinventing DevOps to help th...

Azure Solution Architect
Technology Credit Union (Tech CU) • San Jose, CA, United States
Full-time
Azure Solution Architect at Technology Credit Union (Tech CU). Technology Credit Union (Tech CU) provided pay range. This range is provided by Technology Credit Union (Tech CU). Your actual pay will b...

Senior Solution Architect, HPC and AI - NVIS
NVIDIA • Santa Clara, CA, United States
Full-time
Join to apply for the Senior Solution Architect, HPC and AI - NVIS role at NVIDIA. Do you want to be part of the team that brings Artificial Intelligence...

Senior HPC / EDA Solutions Architect
Synopsys, Inc. • Sunnyvale, CA, United States
Full-time
A leading technology company in Sunnyvale, California is seeking a visionary technical leader to architect robust, scalable solutions in high-performance computing and electronic design automation....

(US) Databricks Solution Architect
Codvo Private Limited • Santa Clara, CA, United States
Full-time
At Codvo, we are committed to building scalable, future-ready data platforms that power business impact. We believe in a culture of innovation, collaboration, and growth, where engineers can experim...

Senior Solutions Architect - Enterprise Networking & AI
Presidio • Pleasanton, CA, United States
Full-time
A leading technology firm in Pleasanton, CA is seeking a skilled Senior Solutions Architect to join their Pre-Sales Engineering Team. The role involves meeting with clients to gather business requir...

Solution Architect, Strategic Technology Partnerships
JFrog Ltd • Sunnyvale, CA, United States
Full-time
At JFrog, we’re reinventing DevOps to help the world’s greatest companies innovate and we want you along for the ride. This is a special plac...

Solution Architect
TradeJobsWorkForce • 95190 San Jose, CA, US
Full-time
Solution Architect Job Duties: Responsible for assisting in the establishment of an IT Architectur...

AI & HPC Infra Architect for Large Clusters
Advanced Micro Devices, Inc. • San Jose, CA, United States
Full-time
A leading technology company in San Jose is seeking an experienced System Solutions Architect to drive AI infrastructure projects. The ideal candidate will have a strong background in Kubernetes-bas...

Solutions Architect
7wdata • Santa Clara, CA, United States
Full-time
We are looking for a Machine Learning Engineer / Solution Architect with experience in deploying Machine Learning (ML), Deep Learning (DL) models on prem and in the cloud. As part of the Solution Arch...

AI Solutions Architect — Integrations & POC Lead
Thelevel • Mountain View, CA, United States
Full-time
A cutting-edge AI startup in Mountain View is seeking a Solutions Architect to lead integrations between their platform and customer systems. The role requires a combination of technical expertise, ...