U.S. citizenship required / No clearance needed / 100% remote within the US / Eastern Time Zone (EST)
Staff Site Reliability Engineer / Cloud SME
Location: 100% remote in the continental US
Type: Long-term contract (3+ years)
Role Summary
As the Staff SRE / Cloud SME, you will be a critical technical leader driving the rearchitecting of our existing monolithic system into a resilient, cloud-native architecture. This role requires deep expertise across multiple cloud platforms (Azure and AWS) and container orchestration (Kubernetes) to ensure the next-generation platform meets the highest standards of scalability, reliability, and security.
Key Responsibilities
Architecture & Transformation Leadership
- Lead the technical rearchitecting efforts, transforming a large-scale monolithic system into a modern microservices-based, cloud-native application.
- Collaborate with cross-functional teams (Engineering, Architecture, Product) to define and implement the new system architecture using domain-driven design (DDD) principles.
- Conduct technology evaluations and provide recommendations for new tools, frameworks, and cloud services to enhance our infrastructure.
Reliability Engineering & Cloud Operations
- Utilize Kubernetes (K8S) for container orchestration and management, ensuring extreme scalability, reliability, and high availability of the system.
- Implement robust, highly resilient, and highly available components for the system.
- Develop and implement comprehensive monitoring, logging, and alerting mechanisms to ensure optimal system performance and availability.
- Drive the adoption of DevOps principles and practices throughout the software development lifecycle, ensuring seamless integration and continuous deployment processes.
Technical Expertise & Mentorship
- Stay up-to-date with emerging technologies, frameworks, and industry trends related to systems and cloud computing.
- Mentor and provide technical guidance to junior team members, fostering a culture of continuous learning and professional growth.
Required Qualifications
- Cloud Platforms: 7+ years of experience with cloud computing platforms. Strong multi-cloud expertise required with AWS and Azure.
- Cloud-Native Transformation: 7+ years of experience in rearchitecting large-scale monolithic applications to cloud-native architectures.
- Container Orchestration: Strong expertise in Kubernetes (K8S) is required, including hands-on experience with both AKS (Azure Kubernetes Service) and EKS (Elastic Kubernetes Service).
- Networking: Strong experience with Cloud Networking, with the ability to design and resolve complex cloud networking architecture problems.
- IaC: Expert knowledge of Terraform for infrastructure-as-code deployment and management.
- Security: Must possess strong knowledge of security best practices for containers and Kubernetes clusters.
- Education: Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
- Bonus Knowledge: Knowledge of load balancing algorithms.
Thanks for applying!