URSABLOG: Computer Narratives
In a recent paper, three scholars at the University of Chicago’s business school, Alex Kim, Maximilian Muhn and Valeri Nikolaev, described how they fed ChatGPT thousands upon thousands of balance sheets and income statements, covering 15,000 companies in a database spanning 1968 to 2021 and stripped of dates and company names, and then asked it to predict each company’s performance. The results were striking, as this rather dry extract from the abstract shows:
Even without any narrative or industry-specific information, the LLM [Large Language Model] outperforms financial analysts in its ability to predict earnings changes. The LLM exhibits a relative advantage over human analysts in situations when the analysts tend to struggle…LLM prediction does not stem from its training memory. Instead, we find that the LLM generates useful narrative insights about a company’s future performance…Taken together, our results suggest that LLMs may take a central role in decision-making.
In other words, the LLM turned those statements into earnings predictions that were more accurate than analysts’ and the predictions formed the basis for model portfolios which generated substantial excess returns.
For those of us in roles where our opinions can influence investment decisions – ship sale and purchase brokers for example – this has existential implications. If LLMs can predict the future, what use will a traditional S&P broker be in the marketplace? All you have to do is feed the data into the model, and it will come up with the answer, quicker and more efficiently than a broker trying to persuade you to buy and sell a ship because they want to make a commission.
I wonder. Let me start with the abstract’s statement that the prediction does not stem from its training memory but that the LLM “generates useful narrative insights about a company’s future performance.” How did the LLM generate these narrative insights? Well, not from historical performance; the model was not “told” about the longer-term history of the company, or even what it did, but instead was prompted to perform standard financial analyses (“What has changed in the accounts from last year?”, “Calculate the liquidity ratio”, “What is the gross margin?”). But then the LLM was prompted to write economic narratives that explained the outputs of the financial analysis. In other words, how and why did the results occur the way they did? Then the model was asked to predict whether each company’s earnings in the next year would be up or down and whether the change would be small, medium-sized, or large, and, crucially, how certain it was of its prediction.
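For readers who like to see the mechanics, the “standard financial analyses” mentioned above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Statement` fields and the `analyse` helper are hypothetical names, the liquidity ratio is taken to be the ordinary current ratio, and none of it is the authors’ actual code or prompts.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """One year of simplified accounts (hypothetical fields, for illustration only)."""
    revenue: float
    cost_of_goods_sold: float
    current_assets: float
    current_liabilities: float


def gross_margin(s: Statement) -> float:
    # Gross profit as a share of revenue.
    return (s.revenue - s.cost_of_goods_sold) / s.revenue


def current_ratio(s: Statement) -> float:
    # Assumption: "liquidity ratio" means current assets over current liabilities.
    return s.current_assets / s.current_liabilities


def analyse(prev: Statement, curr: Statement) -> dict:
    """Answer the three example questions: what changed, liquidity, gross margin."""
    return {
        "revenue_change": (curr.revenue - prev.revenue) / prev.revenue,
        "gross_margin": gross_margin(curr),
        "gross_margin_change": gross_margin(curr) - gross_margin(prev),
        "current_ratio": current_ratio(curr),
    }


# Made-up figures, not taken from any real company.
last_year = Statement(100.0, 60.0, 50.0, 25.0)
this_year = Statement(120.0, 66.0, 60.0, 40.0)
analysis = analyse(last_year, this_year)
```

The numbers alone say that revenue and margin rose while liquidity fell; the narrative step is the part that asks why, and what that implies for next year.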
I am intrigued that the LLM was asked how and why the results occurred, and by how these narratives helped the model to be correct. Human analysts’ predictions were accurate about 57% of the time (when measured half-way through the year), better than the LLM before it was prompted to provide narratives. After being prompted to write economic narratives, however, the LLM increased its accuracy to 60%.
So crunching the numbers is not enough. Analysing the how and why, and then coming up with a narrative is more significant to improving results, even for – or perhaps, especially for – a computer.
Armed with these narratives, the LLM built long and short model portfolios based on the companies for which it foresaw significant changes in earnings with the highest confidence. In back tests using historic data, these portfolios outperformed the broad stock market significantly.
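The selection step described above can be sketched roughly as follows. The `Prediction` record, the `build_portfolios` function and the rule that “significant” means anything other than a small change are my own assumptions for illustration; the paper’s actual portfolio construction may well differ.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One model forecast (hypothetical structure, for illustration only)."""
    company: str
    direction: str     # "up" or "down"
    magnitude: str     # "small", "medium" or "large"
    confidence: float  # 0.0 to 1.0, as reported by the model


def build_portfolios(predictions, size=2):
    """Pick the most confident significant calls: long the 'up's, short the 'down's."""
    # Assumption: a "significant" change is any forecast that is not "small".
    significant = [p for p in predictions if p.magnitude != "small"]
    ranked = sorted(significant, key=lambda p: p.confidence, reverse=True)
    longs = [p.company for p in ranked if p.direction == "up"][:size]
    shorts = [p.company for p in ranked if p.direction == "down"][:size]
    return longs, shorts


# Made-up forecasts for six anonymous companies.
forecasts = [
    Prediction("A", "up", "large", 0.90),
    Prediction("B", "up", "small", 0.95),
    Prediction("C", "down", "medium", 0.80),
    Prediction("D", "up", "medium", 0.60),
    Prediction("E", "down", "large", 0.85),
    Prediction("F", "up", "large", 0.70),
]
longs, shorts = build_portfolios(forecasts)
```

Note that company B, the model’s most confident call, is left out because the predicted change is small: confidence alone is not enough, the change has to matter too.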
I am a great believer in telling stories, in the narrative, when thinking about price movements in the short term and the overall health of the market in the long term. Testing whether my forecasts have proven correct is difficult for me: ship sale and purchase is a more complex and illiquid market than publicly quoted stocks, and the supporting data, the equivalent of balance sheets and income statements, are hard to come by on a ship-by-ship basis unless the ship is yours. Most of the time market performance is measured very crudely – how much was the ship bought for and what is the value today? – which takes no account of repairs, finance costs, freight income, delays and so on. Only the owners will know, and they will be very unlikely to share detailed information with anyone, even within the same company. Nevertheless, I like the idea of developing LLMs to become – as the authors of this paper say – a central part of decision-making.
Maybe they already have? Whatever the evangelists of all things techie say, with their continued insistence on sharing data, this is such a potentially powerful tool that a company that has already started to investigate or even employ such a model would surely not share even the knowledge of its existence with its competitors.
I for one, almost continually baffled by the abundance of data, stories and news, would welcome some assistance, human or machine, in this area. Constructing an acceptable narrative to invest or divest at any particular time, then focusing on a ship type and size, and then finding the actual opportunity to invest or divest would certainly give me a competitive advantage over my brother and sister brokers. Or again, maybe I would just cut out the middleman, i.e. myself, and go directly into the shipowning business?
Because, and I am indebted to the thoughts of Robert Armstrong of the Financial Times for this, it should not be entirely surprising that an LLM, or even a powerful number-crunching computer just processing the data, can process more information than the average human brain can. Linear regression analysis can highlight trends that the human mind cannot. But what is even more surprising is that although humans are prey to biases – not least, market sentiment – the LLM can outperform humans (and more standard data-only models) after being prompted to provide narratives for why the data is the way it is, and then to predict likely results. The story matters, after all.
Perhaps I am exaggerating the role of S&P brokers in the market, and our opinions and views are treated with contempt, or at least as the necessary noise of a competitive marketplace. And the suspicion will always lurk – I am sure of it, even though I have tried very hard to dispel it, at least in the relationships that I have with shipowners – that a broker is pushing something because they want the commission, not because they believe in the deal, or even the details of the deal.
But the success of a broker who forms a relationship with a client depends on whether he or she can find the opportunity that matches the desires and needs of the client. Sometimes the opportunity matches – or even creates – a desire or need that the client didn’t know that they had. Or perhaps more realistically, the broker comes up with a solution to a problem that the client hadn’t been able to express.
I have been largely uninspired and unthreatened by all the hype surrounding AI and LLMs, until now at least. I am old enough to have outlived the oft-predicted death of shipbroking by tech, IT, platforms, or whatever. For the most part, these predictions come from people who don’t really know what shipbroking is, or all the services that a shipbroker provides: we are not just middle-men or women, but provide a necessary and continual service to our clients whether they are active or not, from finding opportunities through to negotiating contracts and executing them. They also assume that the principals in the business want to cut out the middleman or woman and the fees they earn for providing the service, when the fees themselves are a minuscule proportion of the total money involved, at least compared with other financial services.
I am however inspired and threatened – in more or less equal measure – after reading this paper and the comments emanating from it. This reaction comes with the realisation that AI and LLMs can provide narratives for how and why things have happened, and therefore could likely predict with a fairly good chance of success what could happen next. Again, the variables involved in tramp shipping are far more diverse and complicated than the evidence revealed by balance sheets and income statements, but how many shipbrokers, or indeed shipowners, can think about all these variables all the time, and come to the right conclusions in an uncertain world?
I am swiftly coming to the conclusion that these tools will be able to compete with, or support, the smartest people in the shipping markets, and indeed are already doing so. This will not necessarily remove the need for shipbrokers – shipping is all about relationships, after all – but neither does it mean we should dismiss the power of AI and LLMs to change our world, and our place within it. When computers start to tell us stories, perhaps we would be wise to listen to them.
Simon Ward