The latest study trained and validated a deep learning convolutional neural network using images of skin lesions and their accompanying diagnoses. The trained network was then tested on 100 images that were also shared with 58 dermatologists, 30 of whom had more than five years of experience. The study compared diagnostic accuracy when the images were analyzed with and without clinical information.
Dermatologists classified potentially cancerous lesions with a sensitivity of 86.6% and specificity of 71.3% when given just the images. The figures increased by around two and four percentage points, respectively, when clinical information was provided. The 30 experienced dermatologists performed slightly better than the overall group.
The neural network outperformed the dermatologists. Using the dermatologists’ 86.6% sensitivity as the benchmark, the neural network achieved a specificity of 82.5%. That compared favorably to the 71.3% specificity of the dermatologists. A similar gap in performance was seen when dermatologists had access to clinical information.
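The comparison above hinges on two standard metrics: sensitivity (the share of truly malignant lesions correctly flagged) and specificity (the share of truly benign lesions correctly cleared). As a rough illustration of how such figures arise from a test set, here is a minimal sketch with hypothetical counts; these numbers are invented for illustration and are not from the study.

```python
# Sensitivity and specificity from a hypothetical confusion matrix.
# All counts below are illustrative only, not taken from the paper.

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly malignant lesions flagged as malignant."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Fraction of truly benign lesions classified as benign."""
    return tn / (tn + fp)

# Hypothetical split: 20 malignant and 80 benign lesions in a 100-image set.
tp, fn = 17, 3    # malignant lesions caught vs missed
tn, fp = 57, 23   # benign lesions cleared vs wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

Holding sensitivity fixed, as the study does at 86.6%, makes specificity the axis of comparison: a classifier that clears more benign lesions at the same sensitivity generates fewer unnecessary referrals.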
The findings echo the results of a similar study that was published in Nature early last year. That study also trained a neural network and found it held its own against dermatologists. Collectively, the papers add to the impression that lesion classification may become an early test case for AI in healthcare.
As the papers note, the availability of lesion-classifying neural networks and the ubiquity of smartphones equipped with cameras could reshape how lesions are assessed initially. Technology could move lesion classification out of the clinic, thereby expanding access, driving down costs and increasing the frequency and timeliness of assessments.
That rosy vision is undermined somewhat by the limitations of the latest study, which are noted in the paper. At 58, the number of dermatologists against whom the neural network was compared is fairly small. There is also a question over whether dermatologists would act differently in the real world, where the consequences of diagnostic errors are far more serious.
The study's broader applicability is further constrained by the limitations of the images used. The test set lacked certain lesion types, such as pigmented basal cell carcinoma, and showed less genetic and skin type diversity than is seen in the real world.
Larger studies would address many of these issues but the truest test of the effectiveness of the tool in the real world will come from prospective trials. This is the setting that will start to show whether doctors and patients will accept neural networks and act on their advice. AI will underperform if doctors do not trust the technology and therefore do not follow its recommendations.