• Competitive
  • Gurgaon, Haryana, India
  • Permanent, Full time
  • Moody's
  • 2019-05-24

Site Reliability Engineer

Location: Gurgaon, Haryana, India

Role:
Enterprise Monitoring Service (EMS) is the program responsible for assessing, setting up, maintaining, continuously improving and reporting on the monitoring needs of Moody's IT technology stack (applications, databases, middleware, infrastructure, cloud and on-premises data centers). The team is also responsible for the life cycle management of the monitoring tools in use (for example, IPCenter, Dynatrace, AppDynamics, Splunk and SCOM) and for maximizing the benefits of each tool. The EMS team works closely with stakeholders across Moody's IT and business teams to design and implement monitoring, fine-tune alerts, establish standardized thresholds and provide KPI reporting via dashboards.

The EMS team is looking for a Senior Systems Engineer focused on Site Reliability Engineering to maximize the benefits of the Enterprise Monitoring program and to enhance the overall availability of the IT infrastructure by setting up and improving monitoring solutions. The Site Reliability Engineer will examine the existing monitoring setup, including the technologies in use at Moody's, collaborate with stakeholders and build automated solutions that help detect, log and resolve events that could cause service disruptions. The Site Reliability Engineer will be a Subject Matter Expert (SME) in a variety of enterprise monitoring technologies and solutions.

RESPONSIBILITIES:
• Analyze and transform operational and/or functional needs of the organization into monitoring solutions, while remaining compliant with the standard IT policies and procedures
• Build a catalog with detailed descriptions of system monitoring parameters and integrate them to optimize overall effectiveness and efficiency.
• Manage the life cycle (onboarding, maintenance, migration and retirement) of all monitoring tools and platforms under ownership. Administer and provide software support for monitoring tools, and perform the necessary customization and implementations with any of the tool suites.
• Handle the day-to-day administration of the monitoring platform, with a focus on improvements that reduce alert volumes without compromising system stability and availability.
• Maintain and support the infrastructure monitoring environment to ensure the highest availability while reducing the impact of incidents.
• Conduct in-depth evaluations of monitoring and alert data to assist with the diagnosis of infrastructure and application problems.
• Test, recommend and implement new monitoring technologies. Retire underused and outdated monitoring technologies that carry higher costs and/or diminishing returns.
• Collaborate with stakeholders across Moody's IT and business teams on projects and support initiatives
• Develop reports and perform analysis of the IT performance data using Tableau or Power BI
• Maintain documentation, reports and other artifacts in a central repository (the ServiceNow knowledge base)

Qualifications
Required Qualifications:
• At least 7 years of industry experience with a minimum of 3 years of relevant work experience as a Site Reliability Engineer
• Experience with installing, configuring and maintaining monitoring software such as IPCenter (or equivalent), Dynatrace, AppDynamics, Splunk, SCOM, VMware vROps, AWS CloudWatch, Nagios, Azure Monitoring, etc.
• Solid working knowledge of both Windows and Linux Operating Systems, file and directory structures, commands, command-line interfaces and utilities.
• Knowledge of IT Best Practices as they relate to the following areas: IT Infrastructure Monitoring, Data Networks, IT Security, Virtualization, Web Servers, Cloud and Storage technologies
• Proficiency in scripting and query languages such as Python, Perl, Shell, VBScript and SQL, and experience working with web services, containers and APIs
• Ability to use Excel (pivot tables, charts, tables, macros/VBA) and tools like Tableau or Power BI to analyze data and produce charts and reports
• Experience in cloud environments such as Azure, AWS, Google or private cloud would be a plus
• Understanding of containerization (such as Docker and Kubernetes) and microservices would be a plus
• Proficiency in ITSM (ITIL v3 Foundation knowledge)
• Working knowledge of ServiceNow would be preferred
• Strong communication, presentation, analytical and problem solving skills required. Must have the ability to effectively understand and communicate technical issues and their solutions to multiple stakeholder groups and influence their outcome
• Strong customer focus and follow-up skills
• This is not a 9am to 5pm job. Candidates must be willing to work non-standard business hours and weekends, on demand and onsite if necessary

Moody's is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status, sexual orientation, gender expression, gender identity or any other characteristic protected by law.

Candidates for Moody's Corporation may be asked to disclose securities holdings pursuant to Moody's Policy for Securities Trading and the requirements of the position. Employment is contingent upon compliance with the Policy, including remediation of positions in those holdings as necessary.