Asking an AI service like ChatGPT for advice, both privately and at work, is becoming increasingly common. Studies have shown that AI is often a helpful support and sometimes even better than doctors at making diagnoses.
But now, American researchers have scrutinized some of the most common large language models – chatbots – and concluded that they are not cognitively reliable. They exhibit clear signs of early dementia, according to a study.
The researchers had several chatbots – ChatGPT versions 4 and 4o, Claude 3.5 "Sonnet", and Gemini versions 1 and 1.5 – take a test used to detect dementia in humans.
Numbers in Order
The test includes tasks such as drawing a line connecting different numbers, drawing a clock face with the numbers in the correct order and the hands set to a specific time, copying a geometric shape, and a memory exercise.
None of the chatbots achieved a full score, and most landed just below the threshold for what is classified as mild cognitive impairment in humans. Just as in humans, older versions of the chatbots performed worse than newer ones.
When it came to language, attention, and abstract thinking, the chatbots excelled. They fared worse at interpreting complex visual images, and tasks requiring both visual focus and abstract thinking proved difficult. They were also poor at showing empathy.
No Replacement Yet
The researchers point out that they are fully aware of the differences between human brains and language models. But, they write, the findings highlight how weak the models are at executive and visual tasks, which suggests that chatbots should not replace human doctors for now.
The study has been published in the Christmas issue of the scientific journal BMJ, known for featuring research of a more lighthearted nature.