COMMENT: The routes to the best machine learning jobs in banking

30 May 2019

I was asked recently for my view of the most genuinely promising areas in artificial intelligence (AI) and machine learning (ML). After deep learning, natural language processing and semantic databases were the two close cousins that sprang to mind.

There's NLP, NLU, NLL and NLG

If you want the best jobs in machine learning in banking, you'll need to know about natural language processing (NLP), understanding (NLU), learning (NLL) and generation (NLG). These terms have all been with us since the 1950s, but only recently have we seen more widespread use and adoption in real life.

Most people would agree that NLP refers to a range of computer science techniques aimed at processing human (natural) languages in an effective, often interpretive, manner. Allied to this is natural language understanding (NLU), an AI-hard problem that is aimed at machine comprehension. Natural language learning (NLL) aims to trigger specific responses to a language automatically, using the rules that define that language. Natural language generation (NLG) seeks to generate natural language from a machine representation.

A true AI with all such capabilities would certainly blur the boundaries between humans and machines. Think of the fictional Ava in the film Ex Machina rather than Siri or Alexa. We are, however, still many years away from such science fact. One only has to read automated language translations to realize that nuance in prose is often lost in the machine.

The future will be all about semantic data models

Today’s classical, largely relational, databases are in use throughout the finance industry. They have proved themselves to be useful workhorses and will continue to be for years to come. They are, however, limited in scope as they cannot cater for a conceptual definition of data.

By comparison, a semantic data model (SDM) seeks to relate meaning or connotation in language with the real world. It is a grand aim. Such an approach seeks to avoid the limitations of traditional databases in terms of how relationships can be queried. This is possible because an SDM is not bound by data normalization or a strict schema.

Among the advantages of using an SDM are flexibility in the way new interpretations of the same object can be added to existing ones and the fact that complex database queries can be handled extremely efficiently. In my opinion, semantic data models are the way forward for any more advanced machine learning system.
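To make the flexibility point concrete, here is a minimal sketch in Python, assuming a semantic model can be approximated by a set of subject-predicate-object statements. The bond, issuer and predicate names are hypothetical illustrations, not taken from any real system, and a production semantic store would work quite differently under the hood.

```python
# A toy store of (subject, predicate, object) statements.
# All names and data below are hypothetical, for illustration only.
triples = set()

def add(subject, predicate, obj):
    """Record one statement; no schema has to be declared first."""
    triples.add((subject, predicate, obj))

def match(subject=None, predicate=None, obj=None):
    """Return all statements matching the given parts (None matches anything)."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# First interpretation of the object: it is a bond with an issuer.
add("ACME 5% 2030", "is_a", "Bond")
add("ACME 5% 2030", "issued_by", "ACME Corp")

# A second interpretation is added later without altering what is already stored.
add("ACME 5% 2030", "is_a", "Collateral")

interpretations = match(subject="ACME 5% 2030", predicate="is_a")
```

Here `interpretations` holds both views of the same object. In a relational design, the second interpretation would typically have required a schema change or an additional table.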

Graph databases

You'll also need to know about graph databases. A graph database uses mathematical graph structures for semantic queries with nodes representing data items and edges between the nodes to represent relationships.
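The node-and-edge idea can be sketched in a few lines of Python. This is only an in-memory illustration under simple assumptions, not how any particular graph database is implemented; the class name `TinyGraph`, the relationship labels and the fund and company names are all hypothetical.

```python
from collections import defaultdict

class TinyGraph:
    """A toy labelled graph: nodes are data items, edges are named relationships."""

    def __init__(self):
        # Maps each node to a list of (relationship, target node) pairs.
        self._edges = defaultdict(list)

    def add_edge(self, source, relationship, target):
        self._edges[source].append((relationship, target))

    def neighbours(self, node, relationship):
        """Nodes one step away from `node` along edges with this relationship."""
        return [t for r, t in self._edges[node] if r == relationship]

    def reachable(self, start, relationship):
        """All nodes reachable from `start` by repeatedly following one relationship."""
        seen, stack = [], [start]
        while stack:
            node = stack.pop()
            for nxt in self.neighbours(node, relationship):
                if nxt not in seen:
                    seen.append(nxt)
                    stack.append(nxt)
        return seen

# Hypothetical data for illustration only.
graph = TinyGraph()
graph.add_edge("Fund A", "HOLDS", "Bond X")
graph.add_edge("Bond X", "ISSUED_BY", "Company Y")
graph.add_edge("Company Y", "OWNED_BY", "Group Z")
graph.add_edge("Group Z", "OWNED_BY", "Holding Q")

owners = graph.reachable("Company Y", "OWNED_BY")
```

Following the ownership chain is a plain walk along edges, which gives `owners` as Group Z followed by Holding Q. The equivalent question in a relational database would usually need a join for every level of the chain.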

Querying relationships within a graph database is fast and relationships can be visualized very easily. A quick web search will reveal the names of a number of very good graph databases in common use.

The bottom line is that we have made our world semantic. Semantic technology exists, is developing rapidly and is the way forward for people who want to do clever things with data. It is only a matter of time before mainstream financial services firms are competing fiercely with each other for people with semantic technology skills.

Richard Saldanha co-heads Oxquant, a successful consulting business that helps companies navigate today’s complex challenges and opportunities by providing expertise and advice on artificial intelligence, statistical machine learning and related areas. He has over 20 years’ experience in asset management and investment banking in the areas of quantitative trading and investment risk. In addition to his consulting activities, Richard lectures on statistical machine learning at Queen Mary University of London, is an independent adviser to Oxford Portfolio Advisers Ltd and sits on the governing board of Magdalen College School, Oxford. He attended Oriel College, University of Oxford and holds a doctorate in statistics.