Talent.com
Principal Site Reliability Engineer
Varda Space Industries • El Segundo, California, United States
Job type: Full-time, Permanent

About Varda

Low Earth orbit is open for business. Varda is accelerating the development of commercial space infrastructure, from in-orbit pharmaceutical processing to reliable and economical reentry capsules.

From life-saving pharmaceuticals to more powerful fiber optics, there is a world of products used on Earth today that can only be manufactured in space. Varda is accelerating innovation in the orbital economy by creating both the products and infrastructure needed so space can directly benefit life on Earth. Our mission is to expand the economic bounds of humankind.

Our team is uniquely suited to accomplishing this goal, with leadership and staff composed of veterans from SpaceX, Blue Origin, major pharmaceutical companies, and Silicon Valley. Varda was founded in January 2021 by Will Bruey and Delian Asparouhov with significant backing from world-class investors including Khosla Ventures, Lux Capital, Founders Fund, Caffeinated Capital, General Catalyst, and Also Capital.

Varda is headquartered in El Segundo, California, where we have offices and a production facility in which our vehicles, equipment, and materials are built, integrated, and tested. Varda also has offices in Washington, DC and Huntsville, AL (coming soon).

Join Varda, and work to create a bustling in-space ecosystem.

About This Role

As a Principal Site Reliability Engineer, you will help set the technical vision and strategy for reliability across spacecraft, ground systems, and enterprise platforms. You’ll define standards, mentor senior engineers, and drive cross-organizational initiatives to ensure systems are highly operable, secure, and mission-ready. This role combines deep technical expertise with the ability to influence architectural direction at the company level.

Responsibilities

  • Lead and contribute hands-on to the deployment, maintenance, and operations of mission-critical applications and infrastructure supporting spacecraft, ground systems, and company-wide platforms.
  • Design, execute, and manage highly scalable, reliable, and operable software and infrastructure platforms, applying Infrastructure as Code (IaC) principles to drive automation, consistency, and repeatability across Kubernetes environments.
  • Collaborate closely with software and hardware teams to align reliability best practices, CI/CD pipelines, and compliance with their workflows, enabling faster, more secure deployments for mission-critical systems.
  • Anticipate and address reliability risks, capacity challenges, and performance bottlenecks; develop long-term strategies in partnership with leadership.
  • Rotate through the team’s on-call schedule to keep critical systems healthy and responsive.
  • Occasionally travel to customer sites and other Varda locations to troubleshoot, deploy, or test critical infrastructure.

Basic Qualifications

  • 10+ years of experience in SRE, DevOps, or systems engineering, including leadership of large-scale, mission-critical systems.
  • Experience leading technical direction and architecture for large-scale systems.
  • Hands-on experience with observability stacks and telemetry pipelines—including metrics collection, alerting, and dashboards—for Linux systems and Kubernetes workloads (e.g., Prometheus and Grafana).
  • Strong background in systems architecture and software-defined networking (VPC, subnets, firewalls, VPNs, etc.).
  • Proficiency in automation and scripting with Python, Bash, or similar languages.
  • Positive and strong communication skills, both written and oral.

Preferred Skills and Experience

  • Expertise in time-series databases (e.g., InfluxDB) for large-scale telemetry pipelines.
  • Expertise in provisioning and managing scalable Azure cloud infrastructure using native tools and best practices (Azure GCC High preferred).
  • Experience with IaC tools such as Terraform and Ansible, and with CI/CD systems such as Git and ArgoCD.
  • Experience building and maintaining dynamic system configurations with templating frameworks such as YAML and Helm.
  • Strong understanding of Linux systems, containerization technologies, and Kubernetes internals.

Pay Range

  • Senior Site Reliability Engineer: $153,000.00 - $185,000.00 per year
  • This role is on-site in El Segundo, CA.
  • Leveling and base salary are determined by job-related skills, education level, experience level, and job performance.
  • You will be eligible for long-term incentives in the form of stock options and/or long-term cash awards.

ITAR Requirements

Varda, like all employers, must ensure that its employees working in the United States are lawfully authorized to work in the U.S. Additionally, our employees are exposed to and have access to certain export-controlled items. At present, some of our technology to which employees have access requires a license to be exported to individuals other than “U.S. Persons” as defined in U.S. export regulations. Because our employees are provided access to export-controlled items, our current policy is to only hire “U.S. persons” who are permitted to have access to our technology without an export license.

“U.S. person” means: U.S. citizen, U.S. lawful permanent resident, or protected individual as defined by 8 U.S.C. 1324b(a)(3) (i.e., individual admitted to the U.S. as a refugee or granted asylum in the U.S.)

Learn more about the ITAR here.

Benefits

  • Exciting team of professionals at the top of their field working by your side
  • Equity in a fully funded space startup with potential for significant growth (interns excluded)
  • 401(k) matching (interns excluded)
  • Unlimited PTO (interns excluded)
  • Health insurance, including Vision and Dental
  • Lunch and snacks provided on site every day. Dinners provided twice a week.
  • Maternity / Paternity leave (interns excluded)

Varda Space Industries is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Candidates and employees are always evaluated based on merit, qualifications, and performance. We will never discriminate on the basis of race, color, gender, national origin, ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability, or any other legally protected status.

E-Verify Statement

Varda Space Industries, Inc. participates in the U.S. Department of Homeland Security E-Verify program. The E-Verify program is an Internet-based employment eligibility verification system operated by the U.S. Citizenship and Immigration Services. Learn more about the E-Verify program.

