Talent.com
Senior Staff Machine Learning Engineer - Site Reliability Engineer
ServiceNow • Santa Clara, California, United States
Posted 30 days ago
Job type
  • Full-time
  • Permanent

Company Description

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today — ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.

Job Description

This position requires passing a ServiceNow background screening, USFedPASS (US Federal Personnel Authorization Screening Standards). This includes a credit check, criminal / misdemeanor check and taking a drug test. Any employment is contingent upon passing the screening. Due to Federal requirements, only US citizens, US naturalized citizens or US Permanent Residents, holding a green card, will be considered.

PLATO (Platform Engineering and AI Technology Organization) at ServiceNow is a customer-focused, innovative group building intelligent software using a variety of technology stacks to enable end-to-end, industry-leading work experiences for our customers. We are people deeply invested in the success of our customers who also bring expertise in advanced technologies and software engineering best practices. We are data-driven, structured, and committed, and we enjoy what we do. We prioritize robustness, performance, and user experience over the technology stack and tools.

We are a group of technology professionals and platform engineers with a dual mission. We build and evolve the AI platform, and partner with teams to build products and end-to-end AI-powered work experiences. In equal measure, we lay the foundations, research, experiment, and de-risk AI technologies that unlock new work experiences in the future.

As a Senior Staff Machine Learning Engineer - Site Reliability Engineer, you will:

  • Contribute to the design, development and implementation of infrastructure, platform, deployment and observability features that power AI workloads.
  • Collaborate with researchers, AI engineers, and infrastructure teams to ensure our GPU clusters perform efficiently, scale well, and remain reliable.
  • Contribute to the continuous improvement of the SRE practice by turning operational use cases into requirements for software tooling.
  • Contribute to the execution of deployment and support activities for AI / ML developers.
  • Build high-quality, clean, scalable, and reusable code by enforcing best practices around software engineering architecture and processes (code reviews, unit testing, etc.).
  • Work with product owners to understand detailed requirements and own your code from design through implementation, test automation, and delivery of a high-quality product to our users.
  • Operate LLMs on NVIDIA GPUs.
  • Be a mentor for colleagues and help promote knowledge-sharing.

Qualifications

To be successful in this role you have:

  • Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
  • Experience with infrastructure and platform operations, deployments, SRE, and DevOps, with a continued focus on improving platform health.
  • Experience operating highly available distributed workloads on Kubernetes following a DevOps approach.
  • Development experience with Python, GoLang, Java, or similar languages.
  • Experience with DevOps tooling (e.g. Helm / Ansible / Kubernetes / Prometheus / Splunk / GitLab CI).
  • Strong working experience operating distributed systems built on Linux and J2EE.
  • Experience with software-defined networking, infrastructure as code, and configuration management.
  • Experience building software for compliance and security in regulated environments.
  • Ability to drive outcomes in projects with material technical risk.

    Not sure if you meet every qualification? We still encourage you to apply! We value inclusivity, welcoming candidates from diverse backgrounds, including non-traditional paths. Unique experiences enrich our team, and the willingness to dream big makes you an exceptional candidate!

    For positions in this location, we offer a base pay of […], plus equity (when applicable), variable / incentive compensation and benefits. Sales positions generally offer a competitive On Target Earnings (OTE) incentive compensation structure. Please note that the base pay shown is a guideline, and individual total compensation will vary based on factors such as qualifications, skill level, competencies, and work location. We also offer health plans, including flexible spending accounts, a 401(k) Plan with company match, ESPP, matching donations, a flexible time away plan and family leave programs.

    Compensation is based on the geographic location in which the role is located and is subject to change based on work location.

    Additional Information

    Work Personas

    We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work.

    Equal Opportunity Employer

    ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.

    Accommodations

    We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact [email protected] for assistance.

    Export Control Regulations

    For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.

    From Fortune. ©2024 Fortune Media IP Limited. All rights reserved. Used under license.
