Senior Incident Response Manager

JOB PURPOSE
The Senior Incident Response Manager is responsible for overseeing the coordination and execution of the organization's incident response processes, ensuring the timely and efficient resolution of major incidents. This role involves developing strategies for handling incidents, coordinating with key stakeholders, and maintaining operational stability. The individual will be expected to lead high-impact situations, improve incident management processes, and work with cross-functional teams to resolve complex technical issues.
RESPONSIBILITIES
Incident Management

Lead the incident response team in identifying, diagnosing, and resolving high-severity incidents in a timely manner
Manage the entire lifecycle of incidents from detection to post-incident review, ensuring appropriate measures are taken to prevent future occurrences
Coordinate with internal departments, third-party vendors, and external partners during major incidents
Escalate unresolved issues to senior management and drive the resolution process
Ensure incidents are identified promptly through proactive monitoring or incident reports
Assign priority levels to incidents based on business impact, urgency, and severity
Lead the incident resolution process from detection through closure, maintaining oversight and ensuring swift action
Implement corrective actions to prevent recurrence and minimize downtime
Deliver timely resolution of incidents within established SLAs (Service Level Agreements)
Ensure all incidents are properly logged, tracked, and documented in the ticketing system
Ensure communication of status updates to stakeholders throughout the incident lifecycle

Incident Reporting & Documentation

Ensure accurate and comprehensive documentation of all incidents, including root cause analysis, impact analysis, and post-incident reviews
Develop detailed incident reports for management, outlining incident resolution timelines, mitigation actions, and lessons learned
Develop incident reports that capture the full lifecycle of each incident, from detection to resolution
Perform in-depth root cause analysis for major incidents, identifying both technical and process-related factors
Present incident reports to senior leadership, including potential risks and suggestions for process improvements
Establish metrics (MTTR, number of incidents, etc.) for tracking the performance of the incident response process
Ensure timely and accurate reporting on incidents for internal and external stakeholders
Deliver detailed reports that can be used to improve system resilience and incident management
Maintain transparency with senior leadership on incident impacts and outcomes

Response Coordination

Serve as the central point of contact for all incident response activities, coordinating between various teams (e.g., IT, DevOps, Security, Vendors)
Lead regular incident response meetings, war rooms, or conference calls to facilitate real-time problem-solving
Ensure that the right stakeholders are involved based on the nature of the incident (e.g., business continuity, legal, communications)
Escalate incidents to senior management when critical thresholds are met or breached
Ensure seamless coordination across teams, minimizing delays in incident resolution
Ensure accurate escalation paths are followed based on the severity of the incident
Keep stakeholders informed with real-time updates, ensuring they understand the potential impact and mitigation efforts

Emergency Response

Serve as the primary leader during high-severity incidents or business-impacting crises
Mobilize and direct resources rapidly to respond to incidents, minimizing downtime and impact on operations
Ensure contingency plans are activated and business continuity protocols are followed in extreme cases
Communicate incident status and resolution plans with clarity, particularly when operations are significantly impacted
Ensure that response to critical incidents is swift, minimizing impact on customers and operations
Ensure that emergency protocols are followed accurately, minimizing operational risks
Maintain a high state of readiness within the incident response team for major incidents or crises

Continuous Improvement

Analyse past incidents to identify trends, root causes, and recurring issues
Lead post-incident reviews (PIRs) and after-action meetings to gather insights and lessons learned
Propose and implement improvements to incident response processes, tools, and methodologies based on feedback from PIRs
Stay updated on new incident management frameworks, tools, and practices and introduce relevant innovations into the team's workflows
Ensure a measurable reduction in the recurrence of incidents over time
Deliver clear, actionable recommendations from post-incident reviews to prevent similar incidents
Keep the incident response process current with industry best practices and technological advancements

Cross-functional Collaboration

Work closely with IT operations, network teams, security teams, and software development to ensure cohesive incident response
Collaborate with external vendors and service providers as needed for incident resolution
Maintain close relationships with business leaders to align incident response priorities with overall business objectives
Facilitate communication between technical and non-technical stakeholders, ensuring clear understanding of incidents
Ensure alignment of incident response activities with business priorities, ensuring minimal disruption to core operations
Maintain effective communication and collaboration between internal teams and external partners
Ensure stakeholder expectations are managed, and clear, concise updates are provided

Policy & Procedure Development

Develop and maintain incident management policies, procedures, and runbooks based on industry best practices (e.g., ITIL, NIST)
Ensure that all documentation is up to date, covering standard operating procedures for various incident types
Work with compliance and security teams to ensure incident management practices align with regulatory and legal requirements
Regularly review and update policies to incorporate new risks, tools, and business requirements
Ensure that incident response documentation is clear, actionable, and followed during incidents
Maintain compliance with regulatory requirements related to incident management
Ensure that all team members are familiar with and adhere to incident management policies

Process Improvement & Strategy Development

Continuously review and improve the incident management processes, frameworks, and protocols to enhance operational efficiency
Develop and maintain incident response plans, ensuring the organization is prepared to address critical incidents quickly and effectively
Collaborate with service delivery and IT operations teams to ensure alignment between incident management and overall business objectives

Team Leadership & Mentorship

Lead and mentor a team of Incident Response Specialists, ensuring professional development and technical proficiency
Provide guidance and training to team members, promoting best practices in incident management and technical troubleshooting
Ensure team performance aligns with the business objectives and targets
Maintain high levels of team morale and engagement, fostering a collaborative and accountable work environment
Identify skills gaps within the team and facilitate upskilling initiatives

Stakeholder Engagement

Serve as the primary point of contact for incident escalation and resolution, communicating with internal and external stakeholders to ensure they are informed throughout the incident lifecycle
Maintain strong working relationships with cross-functional teams including Service Delivery, IT Infrastructure, Application Support, and third-party vendors

Technology & Tools Management

Ensure the appropriate tools, systems, and resources are in place for effective incident detection, tracking, and resolution
Stay updated on emerging technologies and tools relevant to incident management and recommend improvements to the organization's incident response capabilities



BEHAVIOURAL COMPETENCIES

Tech Savvy
Customer-focused
Evaluating problems
Investigate issues
Information seeking
Processing details and information
Communicating information
Showing resilience
Adjusting to change
Learning ability
Teamwork
Business knowledge and approach
Instils Trust
Plans and Aligns

EDUCATION

Matric
Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced certifications in Incident Management or IT Service Management (e.g., ITIL, CISSP) are a plus.
Strong knowledge of Microsoft Office productivity tools

EXPERIENCE

Minimum of 10 years of experience in a similar role
Experience working with technology systems and processes
Experience in creating incident processes within SLAs
Extensive experience in a service management function
Strong stakeholder management experience



Nominal Salary: To be agreed
