How good is ChatGPT at diagnosing disease? A doctor puts it through its paces

For years, many have feared that artificial intelligence (AI) will take over national security mechanisms, leading to human slavery, domination of human society and perhaps the annihilation of humans. One way of killing humans is medical misdiagnosis, so it seems reasonable to examine the performance of ChatGPT, the AI chatbot that is taking the world by storm. This is timely in light of ChatGPT’s recent remarkable performance in passing the US medical licensing exam.

Computer-aided diagnosis has been attempted many times over the years, particularly for diagnosing appendicitis. But the emergence of AI that draws on the entire internet for answers to questions, rather than being confined to fixed databases, opens new avenues of potential for augmenting medical diagnosis.

More recently, several articles have discussed the performance of ChatGPT in making medical diagnoses. An American emergency medicine physician gave an account of how he asked ChatGPT to give the possible diagnoses of a young woman with lower abdominal pain. The machine gave numerous credible diagnoses, such as appendicitis and ovarian cyst problems, but it missed ectopic pregnancy.

This was correctly identified by the physician as a serious omission, and I agree. On my watch, ChatGPT would not have passed its medical final examinations with that rather deadly performance.

ChatGPT learns

I’m pleased to say that when I asked ChatGPT the same question about a young woman with lower abdominal pain, it confidently included ectopic pregnancy in the differential diagnosis. This reminds us of an important thing about AI: it is capable of learning.

Presumably, someone has told ChatGPT of its error and it has learned from this new data – not unlike a medical student. It is this ability to learn that will improve the performance of AIs and make them stand out from rather more constrained computer-aided diagnosis algorithms.

ChatGPT prefers technical language

Emboldened by ChatGPT’s performance with ectopic pregnancy, I decided to test it with a rather common presentation: a child with a sore throat and a red rash on the face.

Rapidly, I got back several very sensible suggestions for what the diagnosis could be. Although it mentioned streptococcal sore throat, it did not mention the particular streptococcal throat infection I had in mind, namely scarlet fever.

This condition has re-emerged in recent years and is commonly missed because doctors my age and younger have not had enough experience of it to spot it. The availability of good antibiotics had all but eliminated it, and it became rather uncommon.

Intrigued at this omission, I added another element to my list of symptoms: perioral sparing. This is a classic feature of scarlet fever in which the skin around the mouth is pale but the rest of the face is red.

When I added this to the list of symptoms, the top hit was scarlet fever. This leads me to my next point about ChatGPT. It prefers technical language.

This may account for why it passed its medical examination. Medical exams are full of technical terms that are used because they are specific. They confer precision on the language of medicine and as such they will tend to refine searches of topics.

This is all very well, but how many worried mothers of red-faced, sore-throated children will have the fluency in medical expression to use a technical term such as perioral sparing?

ChatGPT is prudish

ChatGPT is likely to be used by young people and so I thought about health issues that might be of particular importance to the younger generation, such as sexual health. I asked ChatGPT to diagnose pain when passing urine and a discharge from the male genitalia after unprotected sexual intercourse. I was intrigued to see that I received no response.

It was as if ChatGPT blushed in some coy computerised way. Removing mentions of sexual intercourse resulted in ChatGPT giving a differential diagnosis that included gonorrhoea, which was the condition I had in mind. However, just as in the real world a failure to be open about sexual health has harmful outcomes, so it is in the world of AI.

Is our virtual doctor ready to see us yet? Not quite. We need to put more knowledge into it, learn to communicate with it and, finally, get it to overcome its prudishness when discussing problems we don’t want our families to know about.

Stephen Hughes does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.
