Posted on October 25, 2023
This article provides exploratory commentary and is not intended as definitive legal advice.
Background
Large language models (LLMs) are a subset of “generative artificial intelligence (AI)” models. Other examples of generative AI include music generators and image generators. Some models, such as GPT-4, can process both text and images.
LLMs are formed by training a “deep” neural network (machine learning model) on a vast quantity of text data. The text in the data is split into chunks called “tokens”, which can be anything from a single character to several words long. Each token is mapped to an integer, and the model learns which tokens are likely to follow one another. In this way, the model learns to predict text. What is interesting (and perhaps unexpected) is that LLMs are not merely good text predictors: they can also be used to produce human-like answers in response to prompts.
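To make the tokenisation step concrete, the following minimal sketch (ours, and only an illustration) uses OpenAI’s open-source tiktoken library to show a sentence being split into tokens and mapped to integers:

```python
# Minimal sketch using OpenAI's open-source "tiktoken" library
# (pip install tiktoken) to show how text becomes integer tokens.
import tiktoken

# "cl100k_base" is the encoding used by GPT-4 and GPT-3.5-turbo.
enc = tiktoken.get_encoding("cl100k_base")

text = "Patent attorneys review prior art."
token_ids = enc.encode(text)  # a list of integers, one per token
print(token_ids)

# Decoding each integer individually shows that a token may be a
# single character, a word fragment or a whole word.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))
```

An LLM never sees raw text: it sees, and predicts, sequences of these integers.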
Recently, the press and other observers have subjected AI generally, and LLMs specifically, to intense scrutiny. Media speculation has noticeably increased since the release of OpenAI’s GPT-4 in March 2023. Meanwhile, governmental and inter-governmental mechanisms are whirring into action, as exemplified in the UK by the Government’s AI regulation white paper (updated most recently on 3 August 2023).
Readers may be aware of the use of LLMs by students to prepare work products such as school and university essays, leading to concerns over a new category of plagiarism. Meanwhile, one US attorney has achieved notoriety through the use of ChatGPT to prepare a court brief, wherein the model generated (“hallucinated”) fake legal citations, thoroughly undermining the case (Mata v. Avianca, Inc.).
Yet one cannot ignore that LLMs have a remarkable ability to generate human-like responses to well-engineered prompts; to summarise large swathes of text into digestible chunks; to translate between languages; and to offer a glorified spell check or tone moderator. Beyond these relatively mainstream tasks, GPT-4 has, for instance, been tested for its ability to identify and purchase a novel compound similar to dasatinib, a tyrosine kinase inhibitor used to treat leukaemia (and a drug which, coincidentally, is known in the world of patent law for a series of European Patent Office decisions regarding dasatinib patents). Several enterprising figures have designed AI tools specifically for the drafting of legal documents which – while promising – raise serious questions over confidentiality, reliability and client care.
Thus, we should engage thoughtfully with the possibilities LLMs offer many professionals, including solicitors and patent attorneys, without downplaying the risks.
The state of play
The current generation of LLMs has many advantages, including multi-modal capabilities (e.g., explaining a humorous feature of an image in text, as GPT-4 has demonstrated). The advent of user-friendly plug-ins (such as those for Bing or Wolfram Alpha) also enables us to combine the power of internet search (or the Wolfram database) with the human-like responses of an LLM such as that underlying ChatGPT (GPT-3.5 or GPT-4).
Meanwhile, we know that new LLMs are constantly in development by various enterprises. For example, GPT-3.5 is the OpenAI model immediately preceding GPT-4. It has been much publicised that, on a simulated US bar exam, GPT-4 achieves a score in the top 10% of test takers, whereas GPT-3.5 scored in the bottom 10%.
Nevertheless, the GPT-4 technical report gives a fuller picture of the performance of GPT-4, compared to GPT-3.5, across several different US exams. While performance is improved in some instances, in others it is not.
The report shows an unpredictable smorgasbord of improvements and non-improvements from one generation of GPT to the next. Predicting the future of AI generally, and LLMs specifically, is therefore very hard (as the GPT-3.5 vs GPT-4 comparison shows). It is also thought that later generations of LLMs, despite or perhaps because of being more powerful, will take longer to develop.
If we had to summarise the state of play for LLMs in one word, it would be uncertain. LLMs provide rewards, but those rewards come with risks.
“Thinking, Fast and Slow”
LLMs have been praised for their solution-finding and fact-finding capabilities (although it is important not to treat them in quite the same way as search engines). A particularly notable example is found in coding, where LLMs can be used to devise solutions for human developers. However, LLMs will not always provide a solution that is reasonable, or facts that are correct. Why?
It is instructive to compare how LLMs “think” with how humans think. Even cursory inspection reveals that LLMs tend to provide shallow and often misleading answers to questions if not prompted correctly. This has led some in the field to compare LLMs’ “thinking” with “System 1” thinking as defined in Nobel laureate Daniel Kahneman’s seminal work, “Thinking, Fast and Slow”. (“System 1” thinking is shallow thinking – near instantaneous and low effort. In contrast, “System 2” is the more analytical, deliberate thinking we engage in when we need to solve problems.)
This has led to the emergence of so-called “prompt engineering”, wherein a human must carefully design the question posed to an LLM in order to obtain an answer of any value. The human mastermind carries out “hard” thinking before the model can whirr into action. Consequently, even when using an LLM as a crutch, there is no substitute for reasoned logic and the expertise to spot when the model has gone wrong.
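By way of illustration only, the sketch below contrasts a bare question with a more carefully engineered prompt. It assumes the pre-v1 openai Python client and the GPT-4 chat model; the prompts themselves are hypothetical examples of ours, not a recommended formula:

```python
# Illustrative sketch: a bare prompt vs. an "engineered" prompt.
# Assumes the pre-v1 "openai" Python client (pip install openai)
# and an OPENAI_API_KEY environment variable. The prompts and model
# choice are hypothetical examples, not the authors' method.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def ask(prompt: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

# A bare, "System 1"-style question invites a shallow answer.
naive = ask("Is my invention patentable?")

# An engineered prompt supplies a role, context and constraints, and
# asks the model to reason step by step before concluding.
engineered = ask(
    "You are assisting a European patent attorney. The invention "
    "combines a known sensor with a known pump. List, step by step, "
    "the questions an examiner would raise on novelty and inventive "
    "step before offering a provisional view, and flag anything you "
    "are unsure about rather than guessing."
)

print(naive)
print(engineered)
```

Even with such a prompt, of course, the output is a starting point for the human expert, not a conclusion.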
Telling you what you want to hear
It is well known that LLMs can “hallucinate” specific individual facts (as evidenced by Mata v. Avianca). These hallucinations can be easy to spot: for example, when an LLM is asked for examples of patent applications filed by NASA, one can simply search Espacenet to determine whether the answers are true. If the model tells you the capital of France is London, you know that is a hallucination.
However, easy-to-spot hallucinations are a symptom of a deeper, more troubling issue: LLMs say what they think you want to hear. The resultant errors can be quite subtle and therefore treacherous, and they can cause considerable problems if they appear in a setting such as a legal document. Nice prose – which is what ChatGPT can generate – is no substitute for the insight and experience that form the basis of considered advice.
No mental scratchpad and no teamwork
Without wishing to sound over-negative, there is another problem: LLMs do not think iteratively. This has been described as an inability to use a “mental scratchpad”. There is therefore a deficit in their ability to replicate team thinking – such as when attorneys from different disciplines collaborate, or when an inventor, a company CEO and their legal advisor brainstorm ideas together. In this connection, LLMs have been described as prone to going down “dead ends” from which they need to be rescued by good prompt engineers. This makes it dangerous to rely on an LLM for advice on which a critical decision turns, whether that decision is commercial or personal.
Copyright
As a final but critical point, and one that is particularly relevant in the world of intellectual property law, there has been uncertainty around the reaction of copyright owners to the use of their works to train large language models. At the time of writing, we have just been offered an insight into how matters may progress: suits alleging copyright infringement have now been formally lodged by certain copyright owners against OpenAI and Microsoft.
Moreover, the law regarding ownership of copyright in AI-generated works themselves – while perhaps better developed – remains in flux in many countries.
Until we have clarity on these issues, we should treat with deep caution – it would not be unreasonable to say, rule out using – tools, such as ChatGPT, that rest on models trained at the risk of copyright infringement.
Conclusion
The advent of LLMs is without doubt exciting. It comes as part of a broader picture in which the automation of routine or low-stakes written tasks can leave us all with more time for important, strategic matters. However, there appear to be limits on the capabilities of LLMs, and it is important always to bear this in mind. Their future is uncertain, and over-reliance on the technology can be embarrassing or even disastrous. It will be interesting to see how regulation in this area develops in different jurisdictions, especially when it comes to copyright.
A balanced view, with appropriate caution and reserve, should be adopted as we continue to monitor developments in the world of LLMs, including how they might assist legal professionals such as patent attorneys as the models become more sophisticated over the years to come.
We are a European firm and assist our clients to protect their IP rights in the UK, Europe and worldwide from our offices in the UK and The Netherlands and through our international network of trusted local attorneys.
Get in touch if you would like to discuss your innovations and brand protection further.
Anne-Marie Conn Associate
Samuel Read Associate