This year, Delft Imaging Systems will launch the long-anticipated version 6 of the CAD4TB software. The improved version has some exciting new features.
The input and output are still the same: CAD4TB processes frontal chest radiographs from any type of digital X-ray equipment and produces a heatmap, indicating with colours which parts of the lungs are likely abnormal, and a score between 0 and 100. The higher the score, the more likely that the subject on the image has tuberculosis (TB).
The main features of CAD4TB 6 are improved performance, a big reduction in processing time per image, from around 60 seconds to less than 15 seconds, and the ability to process images of subjects 4 years and older, whereas the lower age limit for previous versions was 16 years.
Under the hood, the major change is that the computer analysis is now mainly based on deep learning. This powerful new technology is responsible for performance improvements in many applications of artificial intelligence, such as self-driving cars. For CAD4TB, the switch to deep learning has resulted in higher accuracy and a shorter processing time per image.
To develop the new version, we collected considerably more data, from more countries and from a wider variety of X-ray equipment. Our performance tests for CAD4TB 6 were based on an independent validation data set that was not used for training the software. The set contained over 7000 images from 10 countries and more than 10 different types of X-ray equipment. All validation images were read independently by two human experts. We have also made our image reading protocol stricter, and as a result the agreement between the human readers is very high. We used one of the readers as the reference standard: the verdict of this reader determines whether we consider a validation image radiologically abnormal.
Compared to this reference, the second reader achieved a score of 0.963 (96.3%). This score is the area under the receiver operating characteristic (ROC) curve, the internationally accepted standard for testing software that classifies images. It means that if one randomly takes a normal and an abnormal image from the set of test images, the second reader correctly identifies which of the two is the abnormal one in 96.3% of all cases. We tested the new and the earlier versions of CAD4TB against this very high bar. Version 3 of the software, the first version that could process X-rays from different types of machines, achieved a score of 82.8%. This increased with versions 4 and 5 to 86.9% and 92.3%, respectively. The new version 6, based on deep learning, makes a big step forward and achieves a score of 96.5%, slightly higher than, but not significantly different from, that of the second human expert reader.
Figure 1. Performance of the new version of CAD4TB is on par with a human expert reader.
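The pairwise interpretation of the area under the ROC curve described above can be made concrete with a small sketch. The function name and the scores below are invented for illustration only; they are not CAD4TB output and do not reflect how Delft evaluates the software. The sketch simply counts, over all possible (normal, abnormal) pairs, how often the abnormal image receives the higher score.

```python
from itertools import product


def pairwise_auc(normal_scores, abnormal_scores):
    """Fraction of (normal, abnormal) pairs in which the abnormal image
    scores higher. Ties count as half a correct ranking."""
    wins = 0.0
    for n, a in product(normal_scores, abnormal_scores):
        if a > n:
            wins += 1.0
        elif a == n:
            wins += 0.5
    return wins / (len(normal_scores) * len(abnormal_scores))


# Hypothetical toy scores on the 0-100 scale (assumption, not real data).
normal = [12, 22, 35, 48, 60]
abnormal = [41, 55, 72, 86, 93]

auc = pairwise_auc(normal, abnormal)
print(f"AUC = {auc:.2f}")  # 22 of the 25 pairs are ranked correctly: 0.88
```

A score of 1.0 would mean every abnormal image outranks every normal one, while 0.5 corresponds to random guessing.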
We also evaluated how accurately the software and the human observers determine whether an X-ray is from a subject with bacteriologically confirmed TB. Here we used a validation data set of almost 2500 images from 6 sites in 6 different countries, with a culture or GeneXpert result available for every case. We found that CAD4TB 4 and CAD4TB 5 already outperformed human readers (85.9% and 87.0%, respectively, versus 85.2% for the human readers). With CAD4TB 6, performance goes up a bit further, to 87.9%.
Figure 2. Performance of the new version of CAD4TB for predicting bacteriologically confirmed TB has further improved. It outperforms human observers, but the difference is not statistically significant.
One may wonder why the scores for the bacteriological reference are lower than for the radiological reference (both for CAD4TB and the human readers). The answer is that it is not always possible to detect TB on a chest X-ray (some TB patients have a normal chest radiograph), and on the other hand, some subjects have an X-ray with clear abnormalities suggestive of TB, but they are not bacteriologically positive. They may have another disease, or the X-ray shows signs of old and healed TB, or it may even be possible that the bacteriological reference standard is not correct.
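The effect of the reference standard can be illustrated with a toy calculation. In the sketch below, the same eight scores are evaluated against two sets of labels. All values are invented for illustration: the "bacteriological" labels differ from the "radiological" ones for one subject with a normal-looking X-ray who is TB positive and one subject with an abnormal X-ray who is TB negative. The helper `auc` is a hypothetical name, not part of any CAD4TB tooling.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores for eight subjects (assumption, not real data).
scores = [10, 20, 30, 40, 60, 70, 80, 90]

# Radiological reference: is the X-ray abnormal?
radiological = [0, 0, 0, 0, 1, 1, 1, 1]

# Bacteriological reference: is TB confirmed? The subject scoring 20 has
# TB with a normal X-ray; the subject scoring 70 has an abnormal X-ray
# but no confirmed TB.
bacteriological = [0, 1, 0, 0, 1, 0, 1, 1]

auc_radiological = auc(scores, radiological)
auc_bacteriological = auc(scores, bacteriological)
print(auc_radiological, auc_bacteriological)  # 1.0 0.75
```

Even a reader that ranks every image perfectly against the radiological reference loses points against the bacteriological one, because the two references disagree for some subjects.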
CAD4TB for children
The availability of more data also allowed us to measure CAD4TB’s performance on younger subjects. Our tests revealed that CAD4TB is reliable for children 4 years and older. CAD4TB 6 can therefore be used on subjects 4 years and older.
Figure 3. Two examples of chest radiographs of children, both four years old. The left image is normal, and no abnormal regions are seen in the heatmap. The score of CAD4TB version 6 for this case is 22. The right image has clear abnormalities, accurately detected by CAD4TB 6. The image gets a score of 86.
Upgrades and faster processing
Existing customers who use the CAD4TBbox can upgrade their software to version 6 after the official release, scheduled for the summer of 2018. For the new version of the software you will pay the same low bundle prices. You will notice immediately that the software runs faster. CAD4TB 6 uses advanced deep learning technologies, such as convolutional neural networks, but has been optimized so that no special hardware, such as graphics processing unit (GPU) cards, is required.
Beyond version 6
Delft’s CAD4TB development team at Thirona, collaborating with the Diagnostic Image Analysis Group at Radboud University Medical Center, a worldwide leading research group in application of deep learning to medical images, is already working on the next version of the software. We expect to add new features, including the detection of hallmark signs of TB such as cavities, detection of other lesions like lung nodules that could indicate lung cancer, and a specific version of CAD4TB for children below 4 years of age.