Talent.com
AI System Solution Architect

Cango Inc. • San Francisco, California, United States

Responsibilities

Design end-to-end technical architecture for LLM and Diffusion model inference on large-scale GPU clusters.

Develop innovative solutions in KV cache management, distributed scheduling, pipelining and batching strategies, memory allocation, and P2P / IB communication.

Architect a multi-tenant serving framework that balances throughput, latency, and cost.

Define product positioning and differentiation based on industry trends and company strategy.

Develop technical evolution plans (e.g., token streaming like vLLM, syntax parsing like SGLang, Diffusion acceleration).

Align closely with internal GPU infrastructure and business teams to ensure timely product delivery.

Lead performance engineering efforts, including NCCL tuning, NUMA binding, and CUDA kernel optimization.

Drive cross-team collaboration (GPU kernel, compiler, distributed system, frontend APIs) to ensure system stability and scalability.

Organize benchmarking and performance testing against industry leaders (vLLM, SGLang, TensorRT, etc.).

Guide the engineering team on implementation strategies, experimental methodologies, and optimization pathways.

Engage with open-source communities and contribute core components to enhance technical influence.

Communicate directly with North America-based clients to understand their needs for AI inference, training, and deployment.

Translate customer needs into internal implementation plans and coordinate across operations, engineering, and delivery teams.

Qualifications

5+ years of experience in computer infrastructure, GPU cloud, or large-scale cloud computing in the U.S., with a deep understanding of the North American tech ecosystem.

Master’s or Ph.D. in Computer Science, Electrical Engineering, or related fields preferred.

5+ years of hands‑on experience in deep learning systems or GPU optimization, including leading the design of at least one large‑scale AI inference or training system.

Proficiency with PyTorch, CUDA, NCCL, Triton, TensorRT, MPI / IB / RDMA, etc.

Deep understanding of projects like vLLM, SGLang, DeepSpeed, FasterTransformer.

Core Competencies

Practical experience in LLM inference optimization (e.g., KV Cache, P2P vs CPU routing, batching strategies).

Ability to integrate system‑level optimization with product usability (API and serving layers).

Strong architectural thinking and cross‑functional communication skills to translate complexity into clear product roadmaps.

Preferred

Open‑source contributions (e.g., to vLLM, DeepSpeed, Ray, Triton‑Server, or SGLang).

Experience launching GPU cloud or AI infrastructure products (e.g., RunPod, Lambda, Modal, SageMaker).

Familiarity with emerging LLM inference trends such as speculative decoding, continuous batching, and streaming inference.

What We Offer

Hands‑on opportunity to manage and optimize GPU clusters at multi‑thousand‑card scale, operating at the forefront of global compute infrastructure.

Strategic partner role in both product architecture and business decisions alongside the core leadership team.

Key role in building the next‑generation GPU‑based AI inference infrastructure.

High degree of autonomy in product and architectural decisions.

Competitive compensation package with equity incentives.

Global team and access to cross‑regional GPU cluster resources.

Job Details

Seniority Level: Mid‑Senior Level

Employment Type: Full‑time

Job Function: Information Technology

Industries: Technology, Information and Internet
