Talent.com
Site Reliability Engineer (SRE)
Openkyber • GA, United States

TEKsystems is hiring for a fully remote, Level 5 SRE for one of our clients. The role can sit in any US state and any timezone.

This is a short-term contract role with funding through the end of January 2026; it may extend beyond that.

Our client, a digital asset exchange platform where users can buy, sell, and store cryptocurrencies, is seeking a high-level, Senior SRE to join their AI Infrastructure team.

The following experience is REQUIRED:

  • Site Reliability Engineering (SRE) background
  • AI infrastructure familiarity (nice-to-have, not mandatory)
  • Strong Go and Python scripting skills
  • Terraform for infrastructure as code
  • GCP or AWS cloud infrastructure (logging, observability, pub/sub, cloud syncs)
  • Vector.dev and Datadog for observability pipeline
  • Security risk assessment and remediation
  • Ability to own projects end-to-end with minimal supervision

Description

We are looking for a Site Reliability Engineer (SRE) to join the IT AI Infrastructure team to deploy, manage, and optimize AI-powered productivity tools and in-house AI solutions that enhance employee efficiency at scale. A successful candidate will have demonstrated success in similar roles within high-growth, security-conscious environments, bringing deep expertise in public cloud infrastructure (AWS/GCP), backend development (Python, Go, or Java), and automation tooling. The right person is passionate about building scalable and reliable AI infrastructure, driving automation, and collaborating across disciplines to integrate AI systems while maintaining strong security and compliance standards.

  • Deployment and Management of AI Tools: Deploy, configure, and manage AI-powered employee productivity tools and in-house-built AI solutions.
  • Reliability and Performance: Ensure high availability, reliability, and optimal performance of AI platforms and services. Implement monitoring, alerting, and incident response procedures.
  • Scalability and Infrastructure: Design and implement scalable infrastructure to support the growing demands of AI tools and the user base. Optimize resource utilization and manage capacity planning.
  • Automation and Tooling: Develop and maintain automation scripts and tools to streamline deployment, monitoring, and maintenance tasks. Contribute to the experimental sandbox environments for testing new AI solutions.
  • Collaboration and Support: Collaborate with cross-functional teams (Machine Learning, HR, Security, Data Science, Developer Experience) to support the development and integration of AI solutions. Provide technical support and troubleshooting for AI-related issues.
  • Security and Compliance: Adhere to security and privacy policies while deploying and managing AI tools. Ensure compliance with regulatory requirements.
  • Monitoring and Metrics: Implement comprehensive monitoring and metrics to track the performance and health of AI systems. Analyze data to identify areas for improvement and optimization.
  • Incident Response: Participate in incident response and troubleshooting for AI-related outages or performance issues. Develop and maintain incident response plans.
  • Backend Development: Contribute to backend development tasks to support the integration and functionality of AI tools.
  • Public Cloud Management: Deploy and manage AI solutions on public cloud platforms (AWS/GCP), leveraging cloud-native services and best practices.
  • Written and Verbal Communication: Excellent communication skills and experience presenting technical information to non-technical audiences, including senior leadership.

Skills

  • Proven experience as a Site Reliability Engineer (SRE) or in a similar role
  • Strong understanding of AI technologies and platforms
  • Experience deploying and managing applications in a cloud environment (AWS/GCP)
  • Solid backend development experience with programming languages such as Python, Java, or Go
  • Strong proficiency in managing and configuring public cloud services (AWS/GCP) for scalability and reliability
  • Experience with automation tools and scripting (e.g., Ansible, Terraform, Bash, Python)
  • Excellent troubleshooting and problem-solving skills
  • Strong communication and collaboration skills
  • Strong understanding of security and compliance
  • Experience working in a highly regulated environment
  • Experience in a fast-paced, high-growth company

    Additional Skills & Qualifications

    Role: AI Site Reliability Engineer (Contractor, IC5 level)

    Team: IT EMPA (Employee Productivity & Automation)

    Duration: Open until end of January (possible extension)

    Location: Remote

    Responsibilities:

  • Manage and enhance AI-driven employee productivity tools (e.g., Glean, Google Workspace, Slack AI)
  • Implement observability solutions (logging, metrics, dashboards)
  • Automate infrastructure tasks using Terraform
  • Assess and mitigate security risks in AI systems
  • Build scaffolding APIs for unsupported Glean features
  • Collaborate with engineering teams to deliver production-ready solutions quickly

    Job Type & Location

    This is a Contract position based out of Oakland, CA.

    Pay and Benefits

    The pay range for this position is $90.00 - $100.00/hr.

    Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:

  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan (pre-tax and Roth post-tax contributions available)
  • Life Insurance (Voluntary Life & AD&D for the employee and dependents)
  • Short and long-term disability
  • Health Spending Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off / Leave (PTO, Vacation or Sick Leave)

    Workplace Type

    This is a fully remote position.

    Application Deadline

    This position is anticipated to close on Dec 5, 2025.

    About TEKsystems:

    We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

    The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

    About TEKsystems and TEKsystems Global Services

    We're a leading provider of business and technology services. We accelerate business transformation for our customers. Our expertise in strategy, design, execution and operations unlocks business value through a range of solutions. We're a team of 80,000 strong, working with over 6,000 customers, including 80% of the Fortune 500 across North America, Europe and Asia, who partner with us for our scale, full-stack capabilities and speed. We're strategic thinkers, hands-on collaborators, helping customers capitalize on change and master the momentum of technology. We're building tomorrow by delivering business outcomes and making positive impacts in our global communities. TEKsystems and TEKsystems Global Services are Allegis Group companies. Learn more at TEKsystems.com.

