Many studies claiming that artificial intelligence (AI) is as good as or better than human experts at interpreting medical images are of poor quality and are arguably exaggerated, warn the authors of an analysis published in the BMJ today.
The analysis reviewed studies published over the past 10 years that compared the performance of deep learning algorithms in medical imaging with that of expert clinicians.
Only 10 records were found for deep learning randomised clinical trials: two had been published (with low risk of bias, except for lack of blinding, and high adherence to reporting standards) and eight were ongoing. Of the 81 non-randomised clinical trials identified, only nine were prospective and just six had been tested in a real-world clinical setting.
More than two thirds (58 of 81) of the studies were judged to be at high risk of bias, and adherence to recognised reporting standards was often poor.
Three quarters (61 studies) concluded that AI performed as well as or better than clinicians.
The researchers point to some limitations, such as the possibility of missed studies and the focus on deep learning in medical imaging.
Nevertheless, they say, “many arguably exaggerated claims exist about equivalence with or superiority over clinicians”. They add that overpromising language “leaves studies susceptible to being misinterpreted by the media and the public, and as a result the possible provision of inappropriate care that does not necessarily align with patients’ best interests”.