Lead: Site Reliability Engineer

Details of the offer

Let's Write Africa's Story Together!
Old Mutual is a firm believer in the African opportunity and our diverse talent reflects this.

Job Description

ROLE OVERVIEW
The Head of Site Reliability Engineering (SRE) is a critical leadership position responsible for ensuring the bank's technology systems and services are reliable, scalable, and resilient. This role requires a deep understanding of infrastructure, monitoring, incident management, and automation, as well as a strong ability to lead and inspire a team of SRE engineers. The successful candidate will play a pivotal role in driving operational excellence, optimizing service delivery, and fostering a culture of reliability across the bank's digital ecosystem.

KEY RESULT AREAS

Strategy & Leadership
Define and implement the SRE strategy, ensuring alignment with the bank's business and technology goals.
Lead initiatives to enhance the reliability, availability, and performance of the bank's services.
Promote and embed SRE principles across engineering and operations teams.

Operational Reliability
Establish and maintain Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve service reliability.
Oversee the development and operation of monitoring, logging, and alerting systems to detect and resolve issues proactively.
Manage incident response and post-mortem processes, driving root cause analysis and preventive actions.

Automation & Efficiency
Drive automation of operational tasks to reduce manual effort and improve efficiency.
Lead initiatives to optimize system performance, reduce latency, and enhance system resilience.
Champion the use of infrastructure as code and other modern engineering practices.

Collaboration & Stakeholder Management
Partner with development, infrastructure, and security teams to ensure seamless integration of SRE practices.
Collaborate with business units to understand priorities and ensure reliability initiatives align with their needs.
Act as the primary point of contact for SRE-related discussions with internal and external stakeholders.

Team Leadership & Development
Build, mentor, and manage a high-performing SRE team, fostering a culture of collaboration and innovation.
Drive continuous learning and skill development within the team to stay ahead of technological advancements.
Identify and address resource gaps to ensure effective delivery of SRE initiatives.

ROLE REQUIREMENTS
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
10+ years of experience in infrastructure, operations, or site reliability engineering, with at least 3 years in a leadership role.
Strong expertise in monitoring tools (e.g., Datadog, Prometheus, Grafana) and incident management platforms (e.g., PagerDuty).
Experience in cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
In-depth knowledge of automation tools, scripting languages, and CI/CD pipelines.
Proven track record in driving system reliability, scalability, and performance improvements.
Exceptional leadership and people management skills, with a focus on team development and motivation.
Excellent problem-solving and analytical abilities, with strong attention to detail.
Outstanding communication and stakeholder management skills.

Closing Date
09 January 2025, 23:59

Old Mutual Limited is pro-vaccination and encourages its workforce to be fully vaccinated against Covid-19. All prospective employees are required to disclose their vaccination status as part of the recruitment process.


Nominal Salary: To be agreed

Source: Whatjobs_Ppc

