IaaS Operations Lead
Departmental Overview
Hosting Services (HS) provides global infrastructure hosting and application services, spanning product management, engineering and L2 support. Infrastructure hosting covers capabilities such as Compute and Storage; application services include Database and Web Services. With over 1,500 staff globally across our core office locations (both BC and BDC), HS is dedicated to providing a premium service to its customers across IT within Credit Suisse.
HS is driving the largest internal IT project, Project Cloud Serve, which is delivering a step change in customer experience for firm-wide compute, storage and connectivity services. By adopting a hybrid cloud strategy to replace existing/legacy infrastructure, the project enables greater flexibility, agility and scalability, supports Infrastructure Rationalization and helps prepare Credit Suisse for the digital bank transformation.
The role sits within the newly formed IaaS function, which is driving innovation across the bank through the use of Private and Public Cloud (Azure). This IaaS tower is responsible for the design, implementation and ongoing operational support of Cloud services within CS. This role will report to the Global Operations Lead, who reports to the Global IaaS Lead.
Team Overview
As the manager of the Americas IaaS Operations Team you will lead the regional team for both operations and capacity & service improvement, reporting into the Global IaaS Operations Manager. This regional team is composed of 4 IaaS Operations Admins and 3 Capacity & Service Improvement staff, based out of Raleigh, NY and Pune. Although it is a small team right now, there is a similar operations setup in EMEA, APAC and CH, and global harmonization of operations is critical to the success of this environment. Teams are composed of SMEs across Compute, Storage, Networking, Capacity and Service Management. This is an opportunity for someone to come in, take ownership of the Americas environment and drive it forward within CS.
Key Accountabilities
The Americas IaaS Operations Lead will be accountable for:
- Day-to-day management of the Americas Operations Team and the Cloud Infrastructure.
- Ensuring stability of the platform and ensuring all regional projects/Book of Work are delivered on time and within budget.
- Global Service availability and capacity planning of the Cloud Platforms.
- Global Service level reporting on the platforms to ensure efficient utilization.
- Global Implementation and delivery of cloud service improvement initiatives.
- Govern and lead the risk and controls posture for the IaaS Service.
Key Responsibilities
Operations
- Have a consistent track record leading a team in a highly technical area with a strong background of handling a 24x7 support operation at an enterprise scale within a critical environment.
- Guiding the career and technical development of direct reports based within region and remote locations and building a strong team/work ethic.
- Be able to demonstrate management of complex projects to time, cost and quality.
- Have a consistent record of continuous service improvement.
- Promote stability and proactively take steps to reduce risk to the production environment.
- Lead incident management for business-impacting application server outages and partner with infrastructure and application teams to resolve incidents.
- Partner with Product Management & Engineering managers to ensure new builds and management tools meet strategic, client & operational needs.
- Customer Focus - the ability to respond professionally and courteously to client needs and partner with the Technology Account Management Organization (TAMs) to build relationships within the Application Development world.
Capacity & Service Management
- Overall service performance of the Cloud platform services, ensuring an accurate cloud inventory is maintained.
- Providing regular reporting to senior management on the health of the cloud platforms, including performance, capacity and compliance reporting.
- Working with the wider Operations team on incident and problem resolution.
- Monitoring service performance and identifying areas for improvement to enhance scalability, availability and security.
- Reviewing service-impacting incidents to ensure corrective actions are implemented to avoid future outages, and coordinating post-incident reviews for major incidents.
- Maintaining a view of service costs for cloud platforms, working with SRE/Cloud Engineers/Architects to optimize costs, and supporting the prioritization of work for the Site Reliability Engineering team.
- Reporting on and reviewing the Audit, SOX and Risk portfolio along with upcoming commitments & deadlines.
- Reporting on and reviewing open security vulnerabilities along with remediation plans, as well as handling EoL infrastructure owned by the service.
You Offer Desired Skills and Experience
- You possess good experience of managing an operations environment supporting large scale production deployments.
- You possess strong problem solving and analytical skills.
- You are experienced in supporting infrastructure transition from Development to Production.
- You have a strong customer and business partner focus, with the ability to build strong relationships with Cloud Services, Application DevOps, cross-functional IT and global/local IT teams.
- You possess proven leadership and teamwork skills, working collaboratively with Ops and Engineering to support service onboarding.
- You have experience with cloud finance models and how to drive value for money, as well as experience using cloud sizing and costing calculators.
- Risk management effectiveness - manages and highlights risks in line with Credit Suisse risk management policies and standards.
- You possess superb communication (oral and written) skills.
- You have good interpersonal skills.
- Ability to learn business situations and technology quickly.
- You are confident to work in a challenging and changing environment.
- Good Troubleshooting and Problem management skills are critical.
- Self-motivated, good planner and confident individual.
- Knowledge of VMware ESXi & vRealize software stack is desirable.
- Knowledge of Dell Enterprise Hybrid Cloud Infrastructure is desirable.
- Experience in a Service Management role, preferably within a cloud environment.
- Demonstrable experience of working in a DevOps/Agile environment.
- Demonstrable understanding of the challenges of working within a commodity cloud environment.
- Demonstrable experience of leading teams and delivering on a global platform.
- You have deep knowledge of storage systems, preferably EMC.
- ITIL Practitioner.
- Azure knowledge and certifications.
- You have a deep understanding of audit and risk compliance, with the ability to lead deliverables in this area.