Site Reliability Engineer - Team Lead
Senior Software Engineer (SRE) - UI Engineering
Digital Platform UI Engineering
Your work will be at the core of everything we build. Develop globally deployed and adopted systems, frameworks and tooling to help J.P. Morgan build best-in-class products.
The Digital Platform team builds the technical foundation behind JPMorgan's flagship products. We are owners and advocates for the underlying design elements, developer platforms and product components at JPMorgan. These are the essential building blocks for excellent, safe, and coherent experiences for our users and drive the pace of innovation for every developer. We look across JPMorgan's products to build central solutions, break down technical barriers and strengthen existing systems.
We are looking for multi-disciplined, forward-looking technologists like you with diverse backgrounds and experiences.
The Role
We are looking for an experienced Software Engineer to lead our newly formed Site Reliability Engineering team. You'll apply Software Engineering practices to help us run, maintain and improve our platform against Service Level Objectives that you'll help define.
The SRE team is responsible for the availability, performance, change management, monitoring, and capacity management of our platform in order to meet the reliability expectations of our internal users, and those of our external clients.
Responsibilities
- Defines Service Level Objectives for our Digital Platform and ensures they are met through engineering improvements and efficiencies in our systems
- Designs, develops, tests and delivers the software to automate manual operational work
- Troubleshoots priority incidents, conducts blameless post-mortems and ensures permanent closure of the incidents
- Engages with development teams throughout the life cycle to help develop software for reliability
- Applies analytics to historical data, such as incidents and usage patterns, to predict issues and take proactive action
- Drives adoption of self-healing and resiliency patterns
- Designs and conducts performance tests to identify bottlenecks, optimization opportunities and capacity demand
- Defines and drives adoption of best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
- Helps define and deliver continuous delivery best practices
- Adds value to team delivery, works with the team to complete tasks to a high standard, and actively learns new skills
- Facilitates maximum speed of delivery by objectively adhering to the service's error budgets
- Manages the effort split between manual operational work and engineering work
- Coaches other team members and manages teams as needed
Qualifications
- Experience in performance engineering and monitoring using tools such as Grafana, Kibana, Splunk, etc.
- Experience working in an Agile Development environment.
- Experience/knowledge administering application servers, web servers, and databases (Tomcat, Apache Web Server, Nginx, Cassandra, etc.)
Ready to use your expertise and experience to drive change? Apply today.