Sr Tech Operations Analyst
As an Operations Engineer at Ameriprise India, you will support containerized applications running on Kubernetes/EKS on AWS, Docker, and Mesos, as well as some applications running on Apache and Java middleware technologies in the production environment, by proactively monitoring and quickly responding to middleware incidents for one or more technologies within the technical area of expertise. Frequently collaborate with vendor/contractor partners to develop and implement detailed design, configuration and engineering strategies/solutions to resolve issues/incidents while remaining focused on security, uptime and performance. Provide troubleshooting and resolution for routine/semi-complex problems.
Responsibilities
• Experience in building and operationalizing EKS clusters and related infrastructure (e.g. ECR) in AWS accounts including:
o Creation of dev, test, and production EKS clusters for hosting shared services and applications owned by Technology Infrastructure.
o Creation and maintenance of provisioning and configuration automation for the above clusters, such as CloudFormation templates and Ansible playbooks
o Deployment of application containers onto the cluster
o Creation of automated jobs for operational and maintenance activities (e.g. remove a member from cluster, patching, container stop/start, etc.)
o Creation of service control policies, permission boundaries, custom AWS Config rules, and Lambda code to govern the environment and enforce compliance with configuration and security standards
• Experience in deploying, patching and maintaining applications deployed on Apache Tomcat, IBM WebSphere, Apache Webserver and IIS.
Troubleshooting & Incident Management
- Perform moderately difficult and independent assignments in the troubleshooting, problem diagnosis, problem resolution and ongoing production support for one or more technologies within the technical area of expertise.
- Responsible for designing, reviewing, approving and deploying robust, stable and manageable solutions while minimizing middleware downtime.
- Periodically assist in the procurement, configuration, and integration of new technologies.
Proactive Monitoring & Preventative Maintenance
- Ensure the uptime and response time SLAs/OLAs for services are met or exceeded.
- Proactively monitor the stability and performance of various technologies within the area of expertise and take appropriate corrective action before an incident or problem occurs.
- Ensure patching and regular maintenance is performed as required.
- Actively collaborate with fellow members of the team and contractors/vendors on bridge calls to prevent or resolve incidents/problems in an expeditious manner.
- Recommend, deploy and document strategies and solutions for software/hardware/network engineering problems/incidents based upon comprehensive and thoughtful analysis of business goals, objectives, requirements and existing technologies.
- Independently identify key issues, patterns and deviations during the analysis.
- Recommend robust solutions utilizing pragmatic judgment, creativity, and in-depth technical knowledge and evaluation that comprehensively meet the needs of the business.
Leadership & Partnerships
- Maintain effective relationships and work in partnership with leadership, team members, vendors, and contractors to deliver robust technical solutions, ensuring that service level commitments and project timelines are maintained.
Processes, Standards & Best Practices
- Participate in and provide input to the continual refinement of processes, policies and best practices to ensure the highest possible performance and availability of technologies.
- Create, maintain and update documentation of detailed design documents, diagrams, engineering specifications, build changes, models, troubleshooting and support guides, systems metrics and Standard Operating Procedures as required to ensure operational excellence.
- Continuously develop specialized knowledge and technical subject matter expertise by remaining apprised of industry trends, the direction of emerging technologies, and their potential value to the business.
Required Qualifications
Bachelor's degree in Computer Science, Engineering or related field; or equivalent work experience.
- 5-7 years of proven engineering expertise within the subject matter domain.
- Primary area of subject matter expertise: Apache Webserver, Apache Tomcat, IBM WebSphere, IIS.
- Good hands-on experience with Linux and Windows administration.
- Experience in deploying, patching and maintaining applications deployed on Apache Tomcat, IBM WebSphere, Apache Webserver and IIS.
- 2+ years of experience with Docker containers, registries, and container orchestration solutions (e.g. Kubernetes)
- 1+ years of experience with AWS Config, Lambda, AWS IAM, and Service Control Policies
- Experience with container management platforms (DC/OS, Kubernetes).
- Proficiency in shell (Bash), Python, and PowerShell scripting.
- Knowledge of log aggregation tools, preferably Sumo Logic.
- Knowledge of monitoring tools such as SiteScope.
- Experience using version control systems, including Git.
- Deep understanding of TCP/IP networking concepts and debugging.
- Practical experience with the design and operation of enterprise-scale computing systems is essential.
- Ability to support or lead work outside of normal shifts, providing after-hours or "on-call" support when necessary for incidents, problems, and change execution.
- Highly innovative problem solver with strong analytical and customer service abilities required.
- Ability to communicate and articulate technical/functional information across various organizational levels.
- High reasoning aptitude and ability to quickly understand complex operating environments.
- Exposure to release automation products, preferably CA Release Automation.
- Understanding of Single Sign-On concepts.
- Availability for 24x7 team operations.
Cab facility provided.