Senior Site Reliability Engineer
GreenLite • New York, New York, United States

Our Company

Founded in 2022, GreenLite is revolutionizing development in America by streamlining the collaboration between developers, builders, and local regulatory authorities. GreenLite’s software powers its Private Plan Review offering, serving many of the nation’s largest public retailers, developers, and production home builders. By leveraging GreenLite’s technology, its customers save months on each project, significantly accelerating their timelines and staying within budget.

GreenLite is founded by experts in technology, development, and within the AEC (Architecture, Engineering, and Construction) industry, and backed by leading venture capital firms. GreenLite is at the forefront of the privatization of construction permitting and plan review, reshaping a multi-hundred billion dollar industry.

GreenLite has raised nearly $40M from the country’s leading venture capital investors, including Craft Ventures, who led GreenLite’s $28.5M Series A. We’re well capitalized to achieve our mission of revolutionizing the plan review and construction permitting process across the country.

The role and why it matters

Reliability is a product for our customers: city officials rely on us to be available whenever a submission deadline looms, and builders stake millions of dollars on predictable turnaround. As our first dedicated SRE, you will establish the patterns, tooling and culture that keep our systems fast, observable and resilient while we 10x traffic over the next 18 months.

Our operating principles (Winning Mentality, Speed & Urgency, Disagree & Commit, Ownership & Integrity, Customer Centricity) are not wall art; they guide hiring, architecture and on‑call decisions.

What you’ll do

Design & harden production infrastructure: AWS ECS / Fargate via AWS Copilot (migrating to Terraform), RDS / Postgres, S3, EventBridge, Bedrock.

Lead reliability engineering: SLO / SLA definition, error‑budget policies, capacity planning and load testing ahead of major launches.

Own CI / CD: advance our GitHub Actions pipeline, and introduce progressive delivery and automated rollbacks to maintain and steadily improve deployment frequency and lead time for changes.

Instrument & observe: deploy metrics, tracing and logging (Datadog) and drive an on‑call culture focused on MTTR and learning reviews, not blame.

Security & compliance: partner with engineers to automate patching, secrets management & rotation, least‑privilege IAM and SOC 2 controls.

Coach & collaborate: mentor engineers on SRE best practices, work closely with ML and product squads, and influence architecture decisions through strong opinions loosely held.

Continuously improve: identify systemic bottlenecks, build tooling that eliminates toil and scale our platform without scaling pager fatigue.
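For candidates less familiar with the term, an error‑budget policy of the kind mentioned above can be sketched roughly as follows. This is an illustration only, not GreenLite's actual policy: the `ErrorBudget` class, the `deploys_allowed` helper, the 99.9% target and the request counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Request-based error budget for one SLO window (hypothetical model)."""
    slo_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int    # requests observed in the SLO window
    failed_requests: int   # requests that violated the SLO

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 = budget untouched, 0.0 = exhausted, negative = overspent.
        if self.allowed_failures == 0:
            return 0.0
        return 1.0 - self.failed_requests / self.allowed_failures


def deploys_allowed(budget: ErrorBudget, freeze_below: float = 0.0) -> bool:
    """Assumed policy: freeze feature deploys once the budget is spent."""
    return budget.remaining_fraction > freeze_below


# Illustrative numbers only.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(round(budget.remaining_fraction, 3))
print(deploys_allowed(budget))
```

With these made‑up numbers, about 60% of the budget remains, so the assumed policy would still allow deploys. A real policy would also define the window length and what happens after a freeze.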

What you’ll bring

Must‑have:

6+ yrs building and operating production systems in AWS, GCP or Azure (AWS preferred).

Demonstrated ownership of SLOs, incident response and post‑incident analysis.

Expert in IaC (Terraform, CDK, Pulumi) and container orchestration (ECS, EKS or K8s).

Proficient with at least one modern language (Python, Rust, Go) and strong bash skills.

Deep familiarity with observability stacks (Datadog, Grafana, Prometheus, OTEL).

Track record of raising the bar for security, compliance and cost optimisation.

Nice‑to‑have:

Experience with infrastructure for ML workflows (model training, feature stores).

Prior work in construction‑tech, gov‑tech or other regulated domains.

Certification: AWS Solutions Architect or DevOps Pro.

Experience introducing chaos engineering or game‑days.

Public track record (blog posts, OSS) advancing the SRE discipline.

Leadership in defining hiring / on‑call processes at a high‑growth startup.

In your first 180 days you will

30 days – Stand up staging / production dashboards, own the on‑call rotation and deliver a gap‑analysis of our reliability posture. Take ownership of our migration into AWS Control Tower, and contribute to architecture for hosting our production applications, including AI engineering.

60 days – Roll out error‑budget policies, automated canary deploys and service‑level telemetry across all micro‑services. Complete migration off AWS Copilot. Plan migration from RDS Postgres to Aurora Postgres, including metrics. Establish production infrastructure for AI engineering.

90 days – Reduce p95 latency by ≥20 %, cut mean time‑to‑recovery (MTTR) to

180 days – Mentor two mid‑level engineers into effective first responders and establish infrastructure for ML products. Train the team on the disaster recovery plan, and do a dry run of restoration from backups.

What success looks like

99.95 % customer‑visible uptime with clearly defined SLAs.

Engineering velocity accelerates because infrastructure just works and developers ship confidently.

Post‑incident reviews focus on learning; recurring classes of incidents drop each quarter.

Stakeholders (product, customer success, city reviewers) describe reliability as a core differentiator.
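To put the 99.95 % uptime figure in concrete terms, the short sketch below converts an availability target into a downtime allowance. The posting does not state a measurement window, so the 30‑day and 365‑day windows are assumptions, and the `allowed_downtime_minutes` helper is hypothetical.

```python
def allowed_downtime_minutes(availability: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability target permits within a window."""
    window_minutes = window_days * 24 * 60
    return (1.0 - availability) * window_minutes


# Assumed windows: a 30-day month and a 365-day year.
print(round(allowed_downtime_minutes(0.9995), 1))
print(round(allowed_downtime_minutes(0.9995, 365), 1))
```

Under these assumptions, 99.95 % allows about 21.6 minutes of customer‑visible downtime per 30 days, or about 262.8 minutes per year.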

Our team thrives on collaboration, so we’re in the office 4 days per week. In the summer, from Memorial Day to Labor Day, we switch to a 3-day in-office schedule to give everyone extra flexibility.

Our hiring process

Intro with Talent Partner

Values & architecture deep‑dive with Head of Engineering

On‑site: Practical systems‑design exercise (real scenarios we face)

On‑site: On‑call simulation & retrospective with two engineers

On‑site: Cross‑functional panel interview

Final exec conversation and offer discussion

Thrive With GreenLite

Competitive Compensation - Generous base salary & access to our Employee Equity Program, so you can grow with us.

Performance-Based Annual Bonuses - Rewards for high-impact results and contributions that move the needle.

Premium Health Coverage - Comprehensive medical, dental, and vision insurance for full-time team members: 100% of premiums covered under our HDHP plan & 98% coverage for employees and their spouses.

401(k) Retirement Plan - Helping you invest in your future with smart saving options.

Parental Leave - Generous parental leave for all parents to support your growing family.

Wellness Support - Monthly Wellness Stipend and full access to Wellhub, Talkspace, & Teladoc for your physical and mental well-being.

Weekly Team Lunches - Enjoy catered lunches every week in our NYC office. Great food, better company.

Company-Wide Team All Hands - Held twice a year, fostering transparency, alignment, and inspiration.

Team-Building Events - Regular opportunities to connect, collaborate, and celebrate as a team.

Unlimited PTO - Flexible time off so you can recharge, travel, or take care of life as needed.

Equal Opportunity Statement

GreenLite values people from all walks of life and professional backgrounds. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about the construction industry or solving the housing crisis in America, and want the opportunity to grow in your career, we encourage you to apply.

GreenLite is an equal employment opportunity employer, committed to an inclusive workplace where we do not discriminate on the basis of race, sex, gender, national origin, religion, sexual orientation, gender identity, marital or familial status, age, ancestry, disability, genetic information, or any other characteristic protected by applicable laws. We believe in diversity and encourage any qualified individual to apply.
