Talent.com
Reliability Engineer
Prestige Development Group • Ashburn, Illinois, USA

Job Title : Reliability Engineer

Location : On-Site

Job Type : Full-Time

About Prestige Development Group (PDG)

Prestige Development Group (PDG) specializes in providing innovative human capital management solutions tailored to meet the needs of both private and public sector organizations. We are a certified SBA HUBZone and Economically Disadvantaged Woman-Owned Small Business dedicated to fostering diversity, inclusion, and operational excellence.

Position Summary

We are seeking a skilled Reliability Engineer to support our client's mission by enhancing Production Monitoring and ensuring optimal service delivery for their applications. This role involves proactive issue identification, incident resolution, and system health optimization within a 24x7x365 operational environment. The ideal candidate will lead monitoring solutions, manage ITIL engineers, automate processes, and collaborate across IT and business teams to improve service reliability. Expertise in AWS environments, root cause analysis, and technical troubleshooting is essential, along with strong communication and leadership skills to drive continuous improvement.

Key Responsibilities

  • Proactive and early notification of potential and actual issues impacting service delivery.
  • Frequent and succinct communication to PSPD leadership during and post incident.
  • Identification of trends and corrective measures.
  • Provide needed metrics to PSPD leadership team.
  • The enhanced Production Monitoring Services Branch will provide resources to staff the operation 24x7x365. The resources should provide additional technical support and diagnosis.

Customer Facing :

  • Build monitoring and production support solutions that give customers visibility into our services.
  • Manage ITIL engineers.
  • Triage and resolve production incidents related to the cloud platform and participate in root cause analysis and postmortem discussions.
  • Function as a solution manager in support of the Manager Production Support by leading the implementation of short-term and long-term solutions, automating manual processes, and building alerts to monitor the operation of services.
  • Assess initial severity, gather impacts, create tickets, engage support teams, and escalate issues properly as they arrive.

Optimizes Work Processes :

  • Participate in the creation and maintenance of technical and knowledge base documentation.
  • Troubleshoot production issues and collaborate in developing simple technical solutions.
  • Use diagnostic tools to maintain, troubleshoot, and restore standard service or data to systems.
  • Lead implementation of production support activities in an Amazon Web Services environment.
  • Lead technical and design discussions with IT to help enterprises speed their adoption of new technologies and practices.
  • Perform system health monitoring and optimize performance.
  • Define and establish processes and tooling for monitoring and for routine system health checks to keep the application optimized and stable.

Collaborates :

  • Work as a technical leader alongside business, development, and infrastructure teams.
  • Work effectively with IT and business teams, as well as external customers, to lead the resolution of production incidents and provide communication during outages.
  • Collaborate with other members of IT and business in streamlining production support processes.
  • Work closely with other teams and recommend improvements to current production support processes that reflect the business needs, security, and SLAs of our production services.
  • Work closely with the Infrastructure team and other support staff to identify and resolve incidents and to create and implement long-term remediation techniques and fixes.
  • Provide support and coach other members of the Production Support team.

Communicates Effectively :

  • Communicate clearly and effectively across IT, business process owners, and customers at all levels of the organization.
  • Communicate progress and any challenges to management.
  • Communicate overall status and health of the application to business and application support teams.
  • Active CBP / BI or Top Secret clearance is highly desired. Must be open to working 2nd or 3rd shift in a 24 / 7 / 365 environment.

Qualifications

Required :

  • Experience in Production Monitoring & Support within a 24x7x365 operational environment.
  • Strong expertise in incident management, root cause analysis, and problem resolution for cloud-based applications.
  • Hands-on experience with Amazon Web Services (AWS) and cloud-based monitoring tools.
  • Proficiency in ITIL processes and managing ITIL engineers for efficient service delivery.
  • Ability to build and implement monitoring solutions, automate manual processes, and create alerts to ensure system stability.
  • Experience with system health monitoring, performance optimization, and troubleshooting production issues.
  • Strong leadership skills to collaborate with IT, business, and infrastructure teams to improve production support processes.
  • Effective communication skills to provide incident reports and status updates to leadership and stakeholders.
  • Ability to develop and maintain technical documentation and knowledge base resources for production support.
  • Experience in triaging and resolving production incidents, assessing severity, and properly escalating issues.

Equal Employment Opportunity (EEO) Statement

Prestige Development Group (PDG) is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. PDG prohibits discrimination and harassment of any kind, including based on race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, disability, genetic information, or any other protected characteristic as outlined by federal, state, or local laws.

Americans with Disabilities Act (ADA) Statement

PDG is committed to providing reasonable accommodations for individuals with disabilities in our job application and hiring process.

Background Check Policy

Employment is contingent upon the successful completion of a background check. PDG complies with all applicable laws regarding background checks.

How to Apply

Interested candidates are encouraged to submit their resume. Applications will be reviewed on a rolling basis until the position is filled.

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type : Full Time

Experience : years

Vacancy : 1
