Site Reliability Engineer
About the Role
Are you looking for challenges that help you reach your full potential? Are you eager to work in a multicultural environment? Do you want to play a critical role in supporting the community of global financial institutions? Then apply now!
From within the SWIFT Central Control Centre, we manage global, critical financial networks and services. We are currently looking for a Site Reliability Engineer (SRE) who will balance time between active participation in a service operation team and work on improving day-to-day routine activities, our tools, and the use of big data to optimize operations.
With oversight from more senior staff, you will be on the front line to manage the service and keep availability and security at the highest standards. In collaboration with product teams and other SREs, you will propose and implement improvements to the way we operate and support our products.
What to expect:
- Exert technical influence to improve the reliability of our production products and systems
- Design, develop, test and maintain automation tools for use by our team and in problem management investigations
- Work periodically on weekends in support of production deployments
- Represent the department in system and network projects and enhancements, providing technical input and ensuring adherence to documented processes, procedures and risk mitigation efforts
- Participate in day-to-day monitoring and control activities, problem management and change implementation
- Identify problems, use procedures and documentation to determine the best course of action, and participate in mitigation or resolution
- Implement changes to enhance products or to mitigate problems in our products or the underlying infrastructure
- Identify and automate repetitive and manual tasks in day-to-day service operations
- Optimize tools and identify opportunities to use big data to proactively detect anomalies before they affect our service
- Work with other SREs to standardize product monitoring dashboards and identify opportunities for synergy between those products
- Collaborate with other departments such as network services, software systems engineering and development teams to restore availability of services and to identify and correct problems
- Rotate on weekend and holiday shifts, with leave and pay compensation
What will make you successful:
- Degree in IT, computer science or a related field
- Minimum 4 years of related work experience in a similar role
- Fluent in English (spoken and written)
- Ability to work under pressure and as a team player in a multicultural team
- Process- and results-oriented; analytical and methodical in problem investigations
- Proven experience with HP-UX and/or Red Hat Enterprise Linux (RHEL)
- Proven experience with big data analytics tools such as Kibana and Elasticsearch
- Experience with Oracle or other database platforms
- Experience with container and cloud technologies
- Experience with automation/scripting to optimize operational product management
- Programming and scripting languages, including ksh, Python, Perl and Bash
- Familiarity with configuration and deployment management software such as Bitbucket, Jenkins and Ansible
- Knowledge of network technologies: TCP/IP, DNS, firewalls, ADC, VPN
What we offer
We put you in control of your career
We give you a competitive package
We help you perform at your best
We help you make a difference
We give you the freedom to be yourself
We are creating an environment of unique individuals - like you - with different perspectives on the financial industry and the world. An environment in which everyone's voice counts and where you can reach your full potential regardless of age, background, culture, colour, disability, gender, nationality, race, religion, or veteran/military status.