The popularity of OpenAI’s ChatGPT has generated a great deal of enthusiasm and concern as well as a multi-billion dollar investment from Microsoft, which will be using the advanced natural language processing (NLP) model in services like the Azure cloud. We speak with Causality Link CEO and co-founder, Pierre Haren, about the nuances when it comes to applying NLP tech for financial markets.
Haren has been in the field of artificial intelligence (AI) for over 40 years, at one stage founding ILOG, which introduced enterprise software that uses code to pull complex business rules out of applications and which continues to operate after its 2008 sale to IBM. In 2016, he founded Causality Link with a focus on explainable AI. The startup’s research platform analyzes millions of documents and other text-based sources to inform investors and analysts on companies, industries, and macroeconomic drivers, explicitly linking cause-and-effect relationships between market and company key performance indicators.
ChatGPT, said Haren, is a major step forward in AI, not necessarily for the quality of the answers it automatically generates after being trained on huge corpora of data, but rather for the breadth of concepts it can tap into. Haren compares the impact of using ChatGPT to a formative point in his own career, when he sat in an MIT auditorium in 1978 and heard the first random, algorithmically created music. Today, he noted, that kind of access extends across the world’s internet.
He expects the immediate impact to be a pull of global talent: “This will balloon. All this human energy will add to the momentum (and) it’s shock value to the best and the brightest. I am sure it will make a very big wave.”
What it all means for finance, however, is nuanced, which is not surprising. For example, ChatGPT makes plenty of mistakes in generated text at this stage, and expert practitioners are still needed to identify them. Haren noted that training the model on a new corpus costs single-digit millions and has a long lead time, and it is unclear how improvements from one-to-one interactions will translate into modifications of the model.
“The technology (ChatGPT) uses is not conducive to real-time retraining of the model,” he said. “There is a potential that they will do that, but for the financial industry now, you not only need as real-time as possible, you need somewhat of a guarantee that the percentage of errors is reasonable for mission critical stuff.”
In the near term, the major implications will be for programmers embedded in financial institutions, he added, because of the way code behaves differently from text.
“(ChatGPT’s) code generation enables a lot of people to very quickly test different pieces of code. It is not able to build a large risk analysis trading system, but it is able to accelerate the speed at which elements of it will be built, and so software developers inside the financial institution that are using ChatGPT will have an edge,” he said.
And that’s a safe edge, he added, because programmers will be able to rapidly test their system and avoid and detect errors: “It’s rare to have an asset manager that is doing enough trading that doesn’t have three to five programmers, and so if their performance increases by a factor of 30%, this is pretty good.”
Causal vs. correlation analysis
Building causal relationships is one of the most sought-after goals of financial AI, and there are currently two ways of doing this: one is brute-force gathering and crunching of data to make the leap from correlation to causation, and the other uses natural language processing (NLP).
Detecting causation from data alone is hampered because it often leads to spurious associations, a well-understood grievance in finance. Causality Link uses the NLP route, which ultimately aims to extract causation from statements people make.
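The point about spurious associations can be illustrated with a minimal sketch, which is not drawn from the interview. It assumes price-like series can be modeled as independent Gaussian random walks, and the function names, the 0.5 threshold, and the trial counts are all hypothetical choices made for illustration. It compares how often two unrelated random walks look strongly correlated against how often two unrelated noise series do.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_walk(rng, steps):
    """Cumulative sum of independent Gaussian shocks (a trending, price-like series)."""
    level, path = 0.0, []
    for _ in range(steps):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(rng, steps):
    """Independent Gaussian draws with no memory (a returns-like series)."""
    return [rng.gauss(0.0, 1.0) for _ in range(steps)]


def share_strongly_correlated(make_series, trials=200, steps=250,
                              threshold=0.5, seed=0):
    """Fraction of independently generated pairs whose |correlation| exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = make_series(rng, steps)
        b = make_series(rng, steps)  # generated independently of a
        if abs(pearson(a, b)) > threshold:
            hits += 1
    return hits / trials


walk_share = share_strongly_correlated(random_walk)
noise_share = share_strongly_correlated(white_noise)
print(f"random walks with |r| > 0.5: {walk_share:.0%}")
print(f"white noise with  |r| > 0.5: {noise_share:.0%}")
```

Under these assumptions, a sizable share of the random-walk pairs clears the threshold even though no pair has any causal link, while the noise pairs almost never do. That gap is one reason correlation found by crunching trending data is a weak basis for causal claims.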
Securities financing is no stranger to NLP, and vendors like Bloomberg have been using it in combination with other technologies to, for example, automate repo trades: algorithms understand the way in which markets are discussed between dealers and clients, and are able to build a structured representation of unstructured content to help take next-step actions such as ticket creation or further analytics. In an interview with Bloomberg in 2020, SFM highlighted how the tech is being used and how it is evolving as repo market volumes become increasingly electronic and automated. Automation is also spurred by firms keen to avoid a repeat of situations such as those during the pandemic disruption, which caused firms to pull in former operations staff to close books and records.
On a broad level, what is going to be difficult to achieve is “detection of the novelty of the causation”, which Haren believes is going to be an important next step in applying NLP technology. In a recent paper, Causality Link described a method to leverage the “wisdom of crowds” in real time by analyzing causal statements expressed in text and tracking them over time. In the two case studies used, the causal statement method produced an early alert by detecting “faint signals”. The team also introduced tools that investors or managers can use either for real-time supervision or to analyze the evolution of a crisis post-mortem and improve detection of similar future events.
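As a rough illustration of what tracking causal statements over time could involve, the sketch below flags cause-and-effect pairs that start appearing only after a cutoff date. It is not Causality Link’s method, which the article does not detail. It assumes the causal statements have already been extracted from text into (date, cause, effect) records, and the sample data, the `novel_links` function, and the `min_recent` parameter are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical, hand-made records: (publication date, cause, effect).
# A real pipeline would extract these from documents with NLP.
statements = [
    (date(2023, 1, 2), "interest rates", "bond prices"),
    (date(2023, 1, 9), "interest rates", "bond prices"),
    (date(2023, 2, 1), "interest rates", "bond prices"),
    (date(2023, 2, 2), "bank deposits", "bank liquidity"),
    (date(2023, 2, 3), "bank deposits", "bank liquidity"),
    (date(2023, 2, 3), "oil supply", "energy prices"),
]


def novel_links(statements, cutoff, min_recent=2):
    """Return (cause, effect) pairs stated at least min_recent times on or
    after the cutoff date and never stated before it."""
    before = Counter()
    recent = Counter()
    for day, cause, effect in statements:
        bucket = recent if day >= cutoff else before
        bucket[(cause, effect)] += 1
    return sorted(
        pair for pair, count in recent.items()
        if count >= min_recent and before[pair] == 0
    )


alerts = novel_links(statements, cutoff=date(2023, 2, 1))
print(alerts)  # [('bank deposits', 'bank liquidity')]
```

In this toy data the long-discussed link is ignored because it was already known before the cutoff, and the single mention is ignored because it falls below `min_recent`. Only the newly repeated link is surfaced, which is the kind of “faint signal” a novelty detector would aim to raise early.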
Causality Link’s method was recently selected by a firm whose CEO wanted “a system to see around the corner”, Haren said: “In finance what you want (is) a process by (which) a causal potential signal is detected as early as possible, and then track it potentially with your data system, but also with your NLP system…We are able to detect novelty so well that we are able to give you the needle in the haystack story.”