Medical image recognition through Deep Learning algorithms

Data Science Methods

Deep Learning is a machine learning algorithm often used in image recognition. This method reduces a medical image (MRI, CT, plain radiographs) to a numeric matrix, and an algorithm then analyzes this matrix in search of certain features associated with specific radiologic signs and diagnosis. It presents several advantages over the traditional way of image classification by clinical experts. Below we outline the comparison between the two, how the algorithm works, and what you would need if you would like to create a Deep Learning algorithm for your image recognition task.

Shortcomings of image classification by clinical specialists and where the Deep Learning-based analysis excels

Studies having a clinical endpoint evaluated through medical images will often have many limitations if specialists read and classify the images into different categories for disease diagnosis or further research. Examples are:

  1. Inter and intra-observer reliability measures in image recognition are often weak when evaluated by specialists. The resulting measurement error leads to decreased statistical power, an increased required sample size, and a lower probability of demonstrating a difference in the evaluated interventions even if this difference is clinically significant.

  2. The number of radiologic signs that can be reliably extracted by specialists is often limited, especially since less common characteristics will often lead to disagreement among specialists.

  3. The extraction of radiologic signs by clinical experts is often not directly related to the prediction of prognosis for individual patients. In other words, although radiologic signs are often associated with prognostic categories - for example, a particular tumor stage is associated with an average five-year survival - these predictions are estimated based on groups of patients rather than provided for individual patients with unique characteristics.

In contrast, since the Deep Learning algorithm always performs the same evaluation on a given image, it addresses all the previous limitations of human clinical experts, namely:

  1. Since a fixed Deep Learning algorithm conducts the assessment, there is no inter or intra-observer reliability, decreasing variability, and increasing statistical power.

  2. The Deep Learning algorithm can be trained to extract any number of radiologic signs and to evaluate any number of images consistently.

  3. The Deep Learning algorithm can be used to not only extract radiologic signs from images but also to use those same signs to predict prognosis and response to treatment. Of importance, the algorithm can perform these predictions for individual patients rather than just merely classifying a patient according to a given category.

Design of an image recognition Deep Learning algorithm

Here we provide a schematic representation for the general overview of the development of a Deep Learning algorithm along with its deployment based on a standard machine learning protocol for image classification and prediction. Steps include: first, we present raw medical images (MRI, CT, plain radiographs) to the computer that converts it into a matrix of numerical (pixel) values. This matrix serves as the input for multiple layers of a convolutional neural network (CNN) framework that identifies specific patterns (features) in the presented image. Then, these patterns get transformed into a numerical value (class probabilities) based on which the image is classified (for example, the chance of the image belonging to a tumor or non-tumor class). The resulting image recognition algorithm after this will then be passed through classic machine learning stages of training, validation, and testing so that we can finally deploy it as a prediction model. Multiple labeled images relevant to the research question will be used to train the algorithm for identifying specific feature patterns to classify any image. The trained algorithm is then presented with unlabeled images in the validation stage to check the precision (comparison of what the algorithm is saying versus the actual image classification) of the algorithm. We reiterate the training and validation steps until the algorithm achieves the required precision for image classification. Finally, in the testing phase, the algorithm performance is evaluated based on its accuracy to classify when presented with an external set of images. This step will be a simulation for the model performance in the “real world.”

What you need to get started

The first step would be to formulate a research question (for example, a Deep Learning-based image recognition model to predict breast cancer outcomes) to develop the algorithm. However, there are few other critical considerations that you need to investigate before venturing into the development of an image recognition prediction model. Some technical intricacies of the process are beyond the scope of this short article, but, here are the things you will need:

  1. The research question should be preferably about a common condition instead of any rare condition. Deep Learning-based algorithms require a large number of images for training and validation purposes. And thus, acquiring so much imaging data for any rare disease is usually not feasible.

  2. In case your clinic is part of a network, then the imaging techniques must be standardized across all the participating clinics. Following inconsistent imaging techniques across different sites will result in a heterogeneous sample. And ultimately, it would be difficult for the algorithm to distinguish between image artifacts versus features about the condition.

  3. Another drawback of inconsistent imaging techniques would be that the algorithm can not be generalized. In simple words, an algorithm based on inconsistent imaging data would be deemed unreliable for any further research.

  4. At the start, you should develop an algorithm with simple images that can be easily labeled by the expert or the support team. Complex exams that are difficult to label cannot serve the model training purpose. Additionally, it would also not allow the model to reach the required accuracy.

As you can see, some of the requirements are more practical than just scientific, and therefore, should be taken into account for the development of an accurate image recognition model. Ultimately, these Deep Learning-based models can serve as a reliable decision-support tool from the perspective of a radiologist.

References