Core Engineering - (Network) Site Reliability Engineering
MORE ABOUT THIS JOB
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the requirements of our internal and external users. We look for engineers who are motivated to collaborate with our businesses to build and run sustainable production systems that can evolve and adapt to changes in our fast-paced, global business environment.
Network Platforms is a group within the SRE organization that is responsible for building the systems that monitor the health of the global network at Goldman Sachs. We implement the systems and software responsible for network monitoring, logging and alerting on a global scale. We look for engineers who are motivated to explore how to monitor our network using a variety of tooling and to build upon the software we use to control those tools.
RESPONSIBILITIES AND QUALIFICATIONS
HOW YOU WILL FULFILL YOUR POTENTIAL
- Work closely with the network resolution and engineering teams to better understand the network and resolve noisy, inefficient alarms while driving focused alert signaling.
- Streamline the network monitoring pipeline through automation to programmatically control the network monitoring systems.
- Develop tools and services to automate the mitigation and remediation of network-specific events.
- Participate in system design consulting, platform management, and capacity planning.
- Drive vendor engagements where applicable.
SKILLS AND EXPERIENCE WE ARE LOOKING FOR
- BS degree in Computer Science or a related technical field involving coding and/or systems engineering.
- Proficiency in one or more of the following: Go, Python, C, C++, Java, Perl, Ruby or shell scripting.
- Experience with algorithms, data structures and software design.
- Experience with network systems and Unix operating systems.
- Experience using CI/CD to manage the project SDLC.
- Coding beyond simple scripts in Go and/or Python.
- Hands-on experience with debugging and optimizing code.
- Experience with distributed systems design, maintenance, and troubleshooting.
- Experience writing software for network automation beyond simple scripts.
- Strong communication skills, drive, and ownership.
- Agile development experience.
- Network infrastructure engineering experience: Cisco, Juniper, BGP, OSPF, etc.
- Familiarity with one or more network/system monitoring or telemetry tools: MicroFocus, SevOne, Splunk, Smarts, NSO, Forward, Prometheus, Grafana.
ABOUT GOLDMAN SACHS
The Goldman Sachs Group, Inc. is a leading global investment banking, securities and investment management firm that provides a wide range of financial services to a substantial and diversified client base that includes corporations, financial institutions, governments and individuals. Founded in 1869, the firm is headquartered in New York and maintains offices in all major financial centers around the world.
© The Goldman Sachs Group, Inc., 2020. All rights reserved.
Goldman Sachs is an equal employment/affirmative action employer Female/Minority/Disability/Vet.