DevOps / Systems Resilience Engineer
About Standard Chartered
We are a leading international bank focused on helping people and companies prosper across Asia, Africa and the Middle East.
To us, good performance is about much more than turning a profit. It's about showing how you embody our valued behaviours - do the right thing, better together and never settle - as well as our brand promise, Here for good.
We're committed to promoting equality in the workplace and creating an inclusive and flexible culture - one where everyone can realise their full potential and make a positive contribution to our organisation. This in turn helps us to provide better support to our broad client base.
The Role Responsibilities
Strategy
- As a team member, enhance application service and infrastructure resilience through self-healing and automated failovers, targeting 99.99% uptime for customers.
- Assist in running planned, randomised disruptions of production infrastructure to ensure accountability for building resilient, always-on systems.
- Build resilience into the application so that underlying system failures are handled gracefully and do not impact end users. Influence design and development teams to always consider rainy-day scenarios.
- Optimize monitoring to reduce false positive alerts.
- Creatively deepen monitoring capabilities leveraging the 3 tenets of observability - logs, metrics and traces.
- Ensure all critical user service journeys are traceable end to end.
- Ensure Production Solutions are fit for purpose. Where gaps are identified, put a plan in place to uplift the toolset.
Business
- Design, code and implement break fixes to improve service availability based on outcomes of thematic reviews.
- Participate in post-mortem reviews, helping to ensure each exercise is a blameless "adjust" opportunity.
- Monitor SLIs/SLOs in partnership with Product Teams to achieve the optimal development velocity.
Capacity Planning
- Enhance application and infrastructure scalability via iterative capacity management with the goal of reducing the effort required for capacity reviews through deep monitoring and auto-scale properties.
- Continuously monitor capacity for any discrepancies or spikes.
Efficiency
- Identify opportunities to eliminate all manual and repeatable activities (toil) via tooling and automation.
- Reduce the number of repeat incidents by permanently fixing the underlying root cause of issues.
Our Ideal Candidate
- Postgraduate degree with knowledge of information technology.
- Solid IT experience. Banking domain experience is desirable.
- Agile Trained
- Excellent oral and written communication skills, ability to interact with business representatives.
- Good at stakeholder communication and able to liaise with senior management.
Apply now to join the Bank for those with big career ambitions.
To view information on our benefits, including flexible working, please visit our career pages. We welcome conversations on flexible working.