VP/AVP, Data Engineer, Group Consumer Banking and Big Data Analytics Technology, Technology & Operations
Business Function
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality and control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.

As a Senior Data Engineer, you'll help us discover the information hidden in vast amounts of data, make smarter decisions to deliver even better products, and apply statistical analysis to build high-quality prediction systems integrated with our products.

Responsibilities
- Analyse source data and data flows, working with structured and unstructured data.
- Manipulate high-volume, high-dimensionality data from varying sources to highlight patterns, anomalies, relationships and trends.
- Understand Teradata or SAS ETLs and develop alternative processes using Spark SQL and big data technologies.
- Analyse and visualize diverse sources of data, interpret results in the business context and report results clearly and concisely.
- Discover data sources, get access to them, import them, clean them up, and make them "model-ready". You need to be willing and able to do your own ETL.
- Gather data, perform statistical analysis, draw conclusions on the impact of your optimizations and communicate results to peers and leaders.
- Create and refine features from the underlying data. You'll enjoy developing just enough subject matter expertise to have an intuition about what features might make your model perform better, and then you'll lather, rinse and repeat.
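The "do your own ETL, then create and refine features" loop above can be sketched as follows. This is a toy illustration in plain Python, not the bank's actual pipeline; the record fields (`customer`, `amount`) and feature names are hypothetical, and real work at this scale would run on Spark.

```python
from statistics import mean

def clean(records):
    """Make raw records 'model-ready': drop rows with missing amounts
    and coerce the amount field to float."""
    cleaned = []
    for r in records:
        if r.get("amount") is None:
            continue  # dirty record: no amount to work with
        cleaned.append({"customer": r["customer"], "amount": float(r["amount"])})
    return cleaned

def features(records):
    """Derive per-customer features: transaction count and mean amount."""
    by_customer = {}
    for r in records:
        by_customer.setdefault(r["customer"], []).append(r["amount"])
    return {c: {"txn_count": len(a), "avg_amount": mean(a)}
            for c, a in by_customer.items()}

raw = [
    {"customer": "A", "amount": "10.0"},
    {"customer": "A", "amount": "20.0"},
    {"customer": "A", "amount": None},   # dropped during cleaning
    {"customer": "B", "amount": "4.5"},
]
feats = features(clean(raw))
print(feats)
# {'A': {'txn_count': 2, 'avg_amount': 15.0}, 'B': {'txn_count': 1, 'avg_amount': 4.5}}
```

Refining features is then a matter of iterating on the `features` step (lather, rinse, repeat) as subject matter intuition develops.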
Apply Now
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.

Requirements
- 7-10 years of experience in one or more areas of big data.
- Hands-on development with key technologies including Spark, Scala and other relevant distributed computing languages, frameworks, and libraries.
- Experience with Teradata SQL, Exadata SQL, T-SQL
- Experience with SAS, Python and other tools for data analysis and processing
- Experience in developing and scheduling jobs in Airflow
- Experience in migrating SQL workloads from traditional RDBMS to Spark and big data technologies.
- Mastery of key development tools such as Git, and familiarity with collaboration tools such as Jira and Confluence.
- Experience in optimizers and automatic code generation
- In-depth knowledge of database internals and Spark SQL Catalyst engine
- Strong experience in graph and stream processing
- Experience using high-throughput, distributed message queueing systems such as Kafka.
- An ability to periodically deploy systems to on-prem environments.
- Practical experience in clustering high dimensionality data using a variety of approaches
- The ability to work with loosely defined requirements and exercise your analytical skills to clarify questions, share your approach and build/test elegant solutions in weekly sprint/release cycles.
- Independence and self-reliance while being a pro-active team player with excellent communication skills.
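To illustrate one of the requirements above, "clustering high-dimensionality data", here is a minimal k-means sketch in plain Python. It is a teaching-sized example under assumed toy data, not a production approach; at the volumes described in this role, a distributed implementation such as Spark MLlib's KMeans would be the realistic choice.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: repeatedly assign each point to its nearest
    centroid (squared Euclidean distance), then recompute each centroid
    as the mean of its assigned points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # mean of each cluster; keep the old centroid if a cluster emptied
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids, clusters

# two well-separated groups of 5-dimensional points (hypothetical data)
pts = [
    (0, 0, 0, 0, 0), (1, 0, 0, 0, 0),
    (10, 10, 10, 10, 10), (11, 10, 10, 10, 10),
]
centroids, clusters = kmeans(pts, k=2)
print(sorted(centroids))
# [(0.5, 0.0, 0.0, 0.0, 0.0), (10.5, 10.0, 10.0, 10.0, 10.0)]
```

In practice, plain k-means degrades in very high dimensions (distances concentrate), which is why the role asks for experience with "a variety of approaches", e.g. dimensionality reduction before clustering, or density- and graph-based methods.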