Talent.com
Director, Site Reliability Engineering - Infrastructure Platform

Okta • San Francisco, CA, United States
Job type: Permanent

Director, Site Reliability Engineering – Infrastructure Platform

Okta is The World’s Identity Company. Okta provides secure access, authentication, and automation, placing identity at the core of business security and growth.

The Infrastructure Platform and Shared Services Team

Okta authenticates, authorizes and provisions millions of users a day. The service is hosted on Amazon Web Services (AWS) across multiple availability zones and geographically separated regions. The service is designed for high throughput and 99.999% availability. We’re looking for a technical leader to help us continue to scale the service with great people and reliable, cost-effective, and efficient infrastructure, processes, and tooling.

What You’ll Be Doing

  • Lead the infra platform and shared services org and various initiatives across the SRE & Infrastructure organization.
  • Lead the DevOps transformation, microservice journey, and next generation infra platform capabilities in partnership with architects and product engineering.
  • Build a world‑class observability platform and monitoring capabilities enabled with self‑service.
  • Accelerate the velocity of SRE and product engineering by developing robust platforms, powerful tooling, and intuitive self‑service capabilities.
  • Own the design and operation of scalable, self‑service Cloud infrastructure platforms (e.g., Kubernetes, service mesh, CI/CD pipelines, IaC & Edge Infrastructure).
  • Lead, mentor, and grow a high‑performing team of engineers and managers across platform, infrastructure, and shared services domains.
  • Perform engineering design evaluations and ensure the completion of projects within resource, budget, and scheduling constraints.
  • Improve SDLC processes for Cloud infrastructure as code, including the maturity of CI/CD pipelines and change and release management.
  • Manage service and business expectations and prioritize resource allocation.
  • Maintain a deep knowledge of industry best practices, evolving trends, and technologies.

What You’ll Bring To The Role

  • 8+ years of experience in technical leadership & people management.
  • Extensive experience using Agile and DevOps methodologies to build product infrastructure and shared services at scale.
  • 3+ years of experience running large‑scale infrastructure platforms supporting a SaaS/Cloud service in a public Cloud, preferably AWS. Experience supporting a multi‑Cloud environment is a plus.
  • Strong expertise in cloud‑native architectures, containerization (Kubernetes), IaC (Terraform), and CI/CD pipelines.
  • Strong background and hands‑on experience in software development, PaaS, and automation.
  • Deep experience with building and operating observability platforms and monitoring tools (Grafana, Splunk, APM, etc.) in a large‑scale environment.
  • Demonstrated ability to lead cross‑functional teams and manage large‑scale programs.
  • Effective verbal and written communication and interpersonal skills.
  • Computer Science degree, a related degree, or equivalent experience.

Additional Requirements

  • This position requires the ability to access federal environments and/or have access to protected federal data. As a condition of employment for this position, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.
Compensation and Benefits

Annual base salary range for candidates located in California: $266,000–$398,000 USD. Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave). To learn more about our Total Rewards program please visit: https://rewards.okta.com/us.

End‑of‑Job Legal Statements

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws. If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this Form to request an accommodation.
