No single artificial intelligence (AI) algorithm outperforms radiologists, a diagnostic study in JAMA Network Open reveals. Mammography screening has decreased breast cancer mortality but leads to a large number of false positives.
Recent advances have renewed interest in whether AI-based models can match human readers and improve the performance of mammography interpretation.
In this diagnostic accuracy study, conducted between September 2016 and November 2017, an international challenge was held to foster AI algorithm development for mammography interpretation. Algorithms used images, previous examinations and risk factor data to produce a score indicating whether cancer would be diagnosed within the next 12 months.
A total of 144,231 screening mammograms from the United States (952 cancer positive within 12 months of screening) and 166,578 examinations from Sweden (780 cancer positive) were used.
The top-performing algorithm achieved 66.2 per cent (US) and 81.2 per cent (Sweden) specificity, lower than radiologists’ specificity of 90.5 per cent (US) and 98.5 per cent (Sweden). No single algorithm outperformed US radiologist benchmarks, and including clinical data and prior mammograms did not improve AI performance. Combining top-performing algorithms with US radiologist assessments achieved a significantly improved specificity of 92.0 per cent, suggesting that integrating AI into mammography interpretation could yield significant performance improvements, with the potential to reduce healthcare costs, the authors concluded.