Claims of AI outperforming human doctors are exaggerated, study finds

LONDON — It feels like each year we move closer to a reality straight out of a science fiction film. Artificial intelligence and robotics continue to advance at a rapid rate, and it seems like we’ll all be living in a Jetsons-like landscape sooner rather than later. When it comes to the integration of AI into medicine, though, a new study is advising everyone to pump the brakes.

A number of recent research projects have concluded that artificial intelligence is already just as good as, if not better than, highly trained human doctors and medical professionals at interpreting medical images. Now, however, an international team of American and British researchers is warning that such claims are exaggerated, inaccurate, and potentially dangerous for patients the world over.

To put it simply, robots & AI aren’t quite ready to take the medical reins from people just yet.

Beyond just the medical implications of these findings, this new study is calling into serious question the design and reporting standards of these previous projects.

The branch of AI known as deep learning has shown a lot of promise in the field of medical imaging. That being said, it’s still a relatively new concept. So, while recent media headlines have already proclaimed deep learning & AI to be superior to human doctors when it comes to medical imaging, those outlets are very much jumping the gun. Moreover, the potential methodological deficiencies and possible biases behind deep learning studies had never been formally investigated until now.

So, this study’s authors took it upon themselves to analyze all studies that had focused on medical imaging & AI over the past 10 years. They used this data to compare the performance of deep learning algorithms with human doctors.

Ultimately they decided to work with two eligible randomized clinical trials, as well as 81 non-randomized studies. Of the 81 non-randomized studies, only nine tracked and collected data over long periods of time, and only six were based on real-world clinical settings (hospitals, doctors’ offices, etc.). That means the vast majority of these claims are based on less-than-thorough research projects.

All in all, over two-thirds of those studies (58 out of 81) were found to be at “high risk of bias.” Also, many followed conventional reporting standards very poorly.

Of the 81 studies, 61 concluded that AI was at the very least comparable to clinician performance, and only 31 stated that further trials or tests were needed to back up their claims.

“Many arguably exaggerated claims exist about equivalence with (or superiority over) clinicians, which presents a potential risk for patient safety and population health at the societal level,” the study reads.

Such claims “leave studies susceptible to being misinterpreted by the media and the public, and as a result the possible provision of inappropriate care that does not necessarily align with patients’ best interests.”

“Maximizing patient safety will be best served by ensuring that we develop a high quality and transparently reported evidence base moving forward,” the researchers conclude.

The study is published in BMJ.
