Lead: Site Reliability Engineer
Location: Johannesburg
Time type: Full time
Posted: Yesterday
Job requisition ID: JR-61582
Let's Write Africa's Story Together! Old Mutual is a firm believer in the African opportunity and our diverse talent reflects this.
Job Description

ROLE OVERVIEW
The Head of Site Reliability Engineering (SRE) is a critical leadership position responsible for ensuring the bank's technology systems and services are reliable, scalable, and resilient. This role requires a deep understanding of infrastructure, monitoring, incident management, and automation, as well as a strong ability to lead and inspire a team of SRE engineers. The successful candidate will play a pivotal role in driving operational excellence, optimizing service delivery, and fostering a culture of reliability across the bank's digital ecosystem.
KEY RESULT AREAS

Strategy & Leadership
- Define and implement the SRE strategy, ensuring alignment with the bank's business and technology goals.
- Lead initiatives to enhance the reliability, availability, and performance of the bank's services.
- Promote and embed SRE principles across engineering and operations teams.

Operational Reliability
- Establish and maintain Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve service reliability.
- Oversee the development and operation of monitoring, logging, and alerting systems to detect and resolve issues proactively.
- Manage incident response and post-mortem processes, driving root cause analysis and preventive actions.

Automation & Efficiency
- Drive automation of operational tasks to reduce manual effort and improve efficiency.
- Lead initiatives to optimize system performance, reduce latency, and enhance system resilience.
- Champion the use of infrastructure as code and other modern engineering practices.

Collaboration & Stakeholder Management
- Partner with development, infrastructure, and security teams to ensure seamless integration of SRE practices.
- Collaborate with business units to understand priorities and ensure reliability initiatives align with their needs.
- Act as the primary point of contact for SRE-related discussions with internal and external stakeholders.

Team Leadership & Development
- Build, mentor, and manage a high-performing SRE team, fostering a culture of collaboration and innovation.
- Drive continuous learning and skill development within the team to stay ahead of technological advancements.
- Identify and address resource gaps to ensure effective delivery of SRE initiatives.

ROLE REQUIREMENTS
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of experience in infrastructure, operations, or site reliability engineering, with at least 3 years in a leadership role.
- Strong expertise in monitoring tools (e.g., Datadog, Prometheus, Grafana) and incident management platforms (e.g., PagerDuty).
- Experience in cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
- In-depth knowledge of automation tools, scripting languages, and CI/CD pipelines.
- Proven track record in driving system reliability, scalability, and performance improvements.
- Exceptional leadership and people management skills, with a focus on team development and motivation.
- Excellent problem-solving and analytical abilities, with strong attention to detail.
- Outstanding communication and stakeholder management skills.

The appointment will be made from the designated group in line with the Employment Equity Plan of Old Mutual South Africa and the specific business unit.

Closing Date: 09 January 2025, 23:59
Old Mutual Limited is pro-vaccination and encourages its workforce to be fully vaccinated against Covid-19. All prospective employees are required to disclose their vaccination status as part of the recruitment process. Please refer to Old Mutual's Covid-19 vaccination policy for further detail. Kindly note that Old Mutual reserves the right to reinstate the requirement to vaccinate at any point if it is of the view that it is imperative to do so.

About Us
Old Mutual is a premium African financial services organisation that offers a broad spectrum of financial solutions to retail and corporate customers across key market segments in 14 countries. The lines of business include Life and Savings, Property and Casualty, Asset Management, and Banking and Lending.
We are rooted in our purpose of Championing Mutually Positive Futures Every Day and believe that a great customer experience is anchored in a great employee experience.