https://bayt.page.link/E9b8gh8jfc3gNSAL8

Job Description

Job Summary:


We are seeking a highly skilled and motivated Infrastructure Manager to lead and manage our Infrastructure Operations (InfraOps), Application Operations (AppOps), and Site Reliability Engineering (SRE) teams. This is a pivotal role within our engineering department, tasked with ensuring our platform's reliability, scalability, and security while fostering a high-performing team culture.


The ideal candidate will bring a strong technical background in infrastructure and operations management, coupled with exceptional leadership and organizational skills.


Main Areas of Responsibility:


1. Team Leadership and Development


  • Lead and mentor three teams: InfraOps, AppOps, and SRE.
  • Recruit, develop, and retain top talent to ensure a high-performing team.
  • Foster a collaborative culture with a strong focus on accountability, innovation, and continuous improvement.
  • Define team goals and KPIs aligned with organizational objectives.

2. Infrastructure Management


  • Oversee the design, deployment, and maintenance of scalable, reliable, and secure infrastructure.
  • Meet the 99.99% uptime SLA through proactive monitoring and incident management.
  • Drive automation initiatives to reduce manual work and improve efficiency.
  • Manage capacity planning and cost optimization strategies.

3. Application Operations (AppOps)


  • Ensure the seamless operation of deployed applications and services.
  • Optimize application performance and reliability, working closely with engineering teams.
  • Oversee release management processes to minimize downtime and ensure smooth rollouts.

4. Site Reliability Engineering (SRE)


  • Implement and uphold SRE practices to enhance platform reliability and scalability.
  • Oversee observability initiatives, including logging, monitoring, and alerting frameworks.
  • Drive post-incident reviews to identify root causes and implement preventive measures.

5. Security and Compliance


  • Collaborate with security teams to enforce best practices across infrastructure and applications.
  • Ensure compliance with industry standards and regulations (e.g., ISO 27001, GDPR).

6. Cross-functional Collaboration


  • Work closely with engineering, product, and business stakeholders to align infrastructure initiatives with organizational goals.
  • Serve as a point of escalation for critical infrastructure and operational issues.