Roles and Responsibilities:

System Delivery and Deployment:
- Deploy and configure on-premise and cloud Linux and Windows servers, and related services.
- Set up and manage virtualized environments, including Proxmox and Hyper-V hypervisors.
- Install and configure monitoring and observability tools such as Grafana, Prometheus, ELK Stack, SaltStack, and Telegraf.
- Integrate databases like PostgreSQL, Mimir, and Elasticsearch to support the data infrastructure.

System Maintenance and Monitoring:
- Monitor infrastructure performance and availability, using observability tools to ensure continuous operation and health of all systems.
- Manage updates, patches, and lifecycle maintenance for Linux and Windows systems.
- Troubleshoot system and application issues, providing quick, accurate resolutions to maintain uptime.

System Improvement and Optimization:
- Continuously optimize systems and infrastructure for enhanced performance, reliability, and scalability.
- Develop and maintain automation scripts and configurations (using SaltStack, Terraform, and Ansible) to streamline system processes and reduce manual intervention.
- Analyze logs, metrics, and data trends to identify potential system enhancements.

Customer and Product Support:
- Serve as a technical point of contact for customers, providing Tier 2 and Tier 3 support as needed.
- Work with cross-functional teams to troubleshoot, escalate, and resolve issues impacting customer experience.
- Communicate proactively with customers about system improvements, updates, and troubleshooting processes.

Documentation and Knowledge Sharing:
- Create and update comprehensive documentation for system configurations.
- Participate in knowledge-sharing sessions to keep the team updated on best practices, new features, and changes.

Deliverables:

Operational Excellence:
- Uptime metrics for core services meet or exceed 99.9%.
- Timely completion of scheduled system maintenance with minimal disruptions.
- Rapid and accurate resolution of issues, tracked via support and incident metrics.

Deployment and Configuration:
- Successful deployment of new systems or enhancements within project deadlines.
- Configuration management files and scripts stored securely and maintained in version control.

Documentation:
- Up-to-date system documentation, including configurations, deployment steps, and troubleshooting guides.
- Clear and comprehensive incident reports for major support cases or outages.

Customer Satisfaction:
- Positive customer feedback on system reliability and responsiveness to support requests.
- Completion of customer-facing updates, notifications, and support queries in a timely manner.

Qualifications:
- Bachelor's degree in Computer Science, Information Systems, or a related field, or equivalent practical experience.
- 3 years of experience in system engineering, infrastructure, or observability solutions.
- Proficiency in Linux and Windows operating systems and virtualized environments.
- Experience with monitoring and observability tools (Grafana, Prometheus, ELK Stack) and automation tools (SaltStack).
- Strong knowledge of SQL and NoSQL databases (PostgreSQL, Elasticsearch).