
# Large language models like GPT-3 aren’t good enough for pharma and finance


Natural language processing (NLP) is among the most exciting subsets of machine learning. It lets us talk to computers like they’re people, and vice versa. Siri, Google Translate, and the helpful chatbot on your bank’s website are all powered by this kind of AI — but not all NLP systems are created equal.

In today’s AI landscape, smaller, targeted models trained on essential data are often the better fit for business. There are, however, massive NLP systems capable of incredible feats of communication. Called ‘large language models’ (LLMs), these can answer plain-language queries and generate novel text. Unfortunately, they’re mostly novelty acts, unsuited for the kind of specialty work most professional organizations need from AI systems.

OpenAI’s GPT-3, one of the most popular LLMs, is a mighty feat of engineering. But it’s also prone to outputting text that’s subjective, inaccurate, or nonsensical. This makes these huge, popular models unfit for industries where accuracy is important.

## A lucrative outlook


While there’s no such thing as a sure bet in the world of STEM, the forecast for NLP technologies in Europe is bright and sunny for the foreseeable future. The global market for NLP is estimated at about $13.5 billion today, and experts believe the market in Europe alone will swell to more than $21 billion by 2030.

This indicates a wide-open market for new startups to form alongside established industry actors such as Dataiku and Arria NLG. The former was founded in Paris but performed extremely well on the global funding stage and now has offices around the world; the latter is essentially a University of Aberdeen spinout that has expanded well beyond its Scottish origins. Both companies have reached massive success by focusing on data-centric natural language processing solutions that produce verifiable, accurate results for enterprise, pharma, and government services.

One reason for the massive success of these particular companies is that it’s extremely difficult to build and train AI models that are trustworthy. An LLM trained on a massive general dataset, for example, will tend to output ‘fake news’ in the form of plausible-sounding but fabricated statements. This is useful when you’re looking for writing ideas or inspiration, but it’s entirely untenable when accuracy and factual outputs are important.

I spoke with Emmanuel Walckenaer, the CEO of one such company, Yseop. His Paris-based outfit is an AI startup that specializes in using NLP for natural language generation (NLG) in standardized industries such as pharma and finance. According to him, when it comes to building AI for these domains, there’s no margin for error. “It has to be perfect,” he told TNW.
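To see why this kind of NLG can be made "perfect" in a way free-form LLM generation cannot, consider a minimal sketch (purely illustrative, not Yseop's actual technology — the field names and template are invented): deterministic, template-based generation ties every word of the output to a verified data field, so nothing can be hallucinated.

```python
# Illustrative sketch only: template-based NLG keeps every generated
# statement traceable to verified input data, unlike free-form LLM
# output. The trial fields below are hypothetical.

def render_summary(report: dict) -> str:
    """Fill a fixed template from verified report data only."""
    template = (
        "{drug} reduced {endpoint} by {effect:.1f}% "
        "(n={patients}, p={p_value})."
    )
    return template.format(**report)

trial = {
    "drug": "DrugX",
    "endpoint": "LDL cholesterol",
    "effect": 12.34,
    "patients": 480,
    "p_value": 0.01,
}

print(render_summary(trial))
# → DrugX reduced LDL cholesterol by 12.3% (n=480, p=0.01).
```

The trade-off is flexibility: this approach can only say what its templates allow, which is exactly why it suits standardized reporting in pharma and finance, where every claim must trace back to the data.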

Yseop CEO Emmanuel Walckenaer