Oxford University research set to improve reliability of AI tools like ChatGPT

Oxford research is set to make AI-generated material more accurate. (Image: PA)

Researchers from the University of Oxford have made a significant advance towards ensuring that information produced by generative artificial intelligence (AI) is robust and reliable.

Currently, so-called hallucinations, where an AI tool invents facts that sound plausible but are imaginary, are a critical factor holding back wider adoption of large language models (LLMs) like ChatGPT or Gemini.

These errors can make LLMs unreliable: the researchers point to the example of a US lawyer who got into legal trouble for citing a case invented by ChatGPT. They can also be dangerous when the tools are used in medical diagnosis.


In a new study published today in Nature, the Oxford researchers demonstrated a new way to detect when an LLM is likely to ‘hallucinate’.

This advance could open up new ways to deploy LLMs in situations where "careless errors" are costly, such as legal or medical question-answering.

The researchers focused on a type of hallucination known as confabulation, where an LLM gives different answers each time it is asked a question, even when the wording of that question is identical.

Study author Dr Sebastian Farquhar said: “LLMs are highly capable of saying the same thing in many different ways, which can make it difficult to tell when they are certain about an answer and when they are literally just making something up.

“With previous approaches, it wasn’t possible to tell the difference between a model being uncertain about what to say versus being uncertain about how to say it.

"But our new method overcomes this.”