Site Reliability Engineer - Incident Management, Troubleshooting, Debugging, Scripting (Python/Ruby/Bash), Exp: 8-12 Yrs

Posted 2026/05/10
Other Business Support Services

Job description

Meet the Team



The SRE Incident Commander team is a pivotal and constantly evolving team in Cisco’s Network Platform. This team is focused on influencing and scaling global incident management capabilities, actively empowering engineering teams to improve their incident response and adopt centralized, efficient workflows. Operating with a self-motivated culture, the team continuously improves its processes, culture, and collective system reliability.
 



Your Impact



As an SRE with an Incident Management focus, you will be a critical supporter of our production environment, directly influencing the stability and performance of Cisco’s Network Platform. You'll apply your engineering experience not only to participate in Incident Response with peers and achieve swift restoration of service, but also to build the tools and automation that enable fellow responders and help prevent incidents. This role offers the opportunity to blend deep technical problem-solving with strategic process improvement, making a tangible difference in our MTTR and overall customer satisfaction.



  • Lead real-time incident response efforts, partnering with on-callers, engineering teams, product, and support to restore service quickly.
  • Apply engineering principles to understand, fix, and resolve issues within complex production systems.
  • Develop and maintain incident management scripts and tools to improve system observability, automation, and incident response capabilities.
  • Own Incident Commander responsibilities, including coordinating response preparedness activities with engineering/product teams.
  • Contribute to and recommend incident management process enhancements and overall reliability improvements.
     

Minimum Qualifications



  • Proven experience with system troubleshooting and debugging in production environments.
  • Familiarity with software development, production code management, and IT operations, gained through real-world development experience.
  • Experience writing scripts and developing tools for automation (e.g., Python/Ruby/Bash).
  • Experience with incident management, collaboration, and monitoring tools such as Jira, Confluence, PagerDuty, Splunk, ELK, Prometheus, and Grafana.
  • Willingness to be on-call, including nights and/or weekends, as part of a rotation.
     

Preferred Qualifications



  • Strong curiosity about incident management principles, tools, and processes.
  • Experience supporting an externally facing production environment, ideally in a globally distributed team.
  • Eagerness to learn and grow expertise in incident command and SRE practices.
  • Outstanding ability to translate complex technical issues into clear, concise, and impactful summaries for diverse audiences, including leadership.
  • Strong critical thinking and influencing skills, particularly in evaluating technical trade-offs and aligning incident resolution with business objectives.


Why Cisco? 

At Cisco, we’re revolutionizing how data and infrastructure connect and protect organizations in the AI era – and beyond. We’ve been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.



Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you’ll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere. 



We are Cisco, and our power starts with you. 





This job post has been translated by AI and may contain minor differences or errors.
