ChatGPT better than trainee doctors at diagnosing respiratory diseases, study shows

Trainee doctors were given an hour to research symptoms on the internet - PeopleImages/iStockphoto

ChatGPT is better at diagnosing respiratory diseases than trainee doctors, a new study suggests.

The findings, presented to the European Respiratory Society in Austria, also showed that Google’s Bard performed better than humans in some aspects, while Microsoft’s Bing performed as well as the trainee doctors.

The study indicates that these kinds of large language models could help medical staff to assess patients more efficiently and reduce the significant pressures facing the NHS.

Ten trainee doctors with less than four months of clinical experience in paediatrics were given one hour to use the internet, but not AI chatbots, to solve scenarios created by experts in child respiratory medicine for which there was no obvious diagnosis.

The doctors’ answers were graded and compared with the answers given by the chatbots.

ChatGPT version 3.5 scored the highest and was believed to be more human-like in its responses than other chatbots.

The study was presented by Dr Manjith Narayanan, a consultant in paediatric pulmonology at the Royal Hospital for Children and Young People, in Edinburgh.

“These tools have several potential applications in medicine. My motivation to carry out this research was to assess how well large language models (LLMs) are able to assist clinicians in real life,” Dr Narayanan said.

‘Hallucinations’

The researchers did not find any clear instances of so-called “hallucinations”, when language models seemingly make up information, with any of the three chatbots.

But Dr Narayanan cautioned that it is important to guard against this in future, as there is always a possibility of it happening.

Bing and Bard did give some answers that were deemed to be irrelevant to the questions asked, but so too did the trainee doctors.

More than half of the public in the UK and three-quarters of NHS staff said they support the use of artificial intelligence for patient care, according to a survey released in July by the healthcare think tank the Health Foundation.

But both the public and NHS staff selected the fact that AI systems cannot show “real empathy” or “kindness” as the biggest disadvantage of using the technology.

Hilary Pinnock, a professor of primary care respiratory medicine at the University of Edinburgh, said: “It is encouraging, but maybe also a bit scary, to see how a widely available AI tool like ChatGPT can provide solutions to complex cases of respiratory illness in children.

“It certainly points the way to a brave new world of AI-supported care,” she added.

Dr Narayanan is now planning to test chatbots against more senior doctors and to look at newer and more advanced large language models.