Examining the Potential of Deep Learning Applications in Medical Imaging

John Lawless
7 min read · Dec 14, 2020
Image source: Phillips.com

In this article, I discuss my most recent data science project. Using x-ray images as data, I investigate the possibilities, pitfalls, and limitations of using machine learning algorithms as an assistant to a radiology team.

Medical imaging is a vital and widely used diagnostic tool in clinical health care, allowing physicians to detect abnormalities in a patient's anatomy and physiology that would otherwise be difficult or impossible to confirm. Rapidly advancing technology in the field leads to ever-increasing use and expansion of medical imaging to improve the quality of patient care. However, with all of this advancement and escalation of use comes a serious challenge. A growing physician shortage, both current and projected, is creating a serious imbalance: the number of patients in need of care is rapidly outgrowing the number of physicians able to provide it. In radiology in particular, this burden can lead to diagnostic errors, with time-sensitive findings missed or misclassified as the demand for image interpretation surpasses the workload that even the most skilled radiologists can reasonably accommodate.

With these factors as a primary motivation, I was inspired to conduct a study exploring the possibilities of machine learning in the interpretation of clinical x-rays. Before transitioning into data science, I worked as a medical radiographer, one of the trained technologists who position patients and acquire x-ray images. Through that career I learned a great deal about how images are created, digitally stored, and passed to a radiologist for interpretation, which gave me a working understanding of this project's data problem. The 480 hours of training in General Assembly's Data Science Immersive bootcamp gave me the tools to properly undertake this investigation. The following are a few highlights from my study and its findings (and you can read more about my study here).

What can machine learning do for radiology?

First and foremost, I will state what machine learning should not do. I have seen enough of the skill and experience of radiologists in my past work to say that there is simply no replacing their trained eyes and medical knowledge. A good radiologist brings patient history, the simple but important ability to communicate directly with ordering physicians, and a wide range of outside knowledge to every interpretation, going far beyond scanning the pixels of an image to diagnose a patient. An AI algorithm, no matter how well trained, cannot see beyond the image it has been fed as data. Further, there are serious ethical dilemmas in entrusting the health, safety, and wellbeing of a patient entirely to the care of a mathematical function, even a very precise one.

With this said, the possibilities of predictive algorithms trained on medical images are enormous. They have the potential to pick up on changes so subtle that they are easily missed by the human eye, or even impossible for us to perceive. They can operate 24/7, allowing some form of interpretation to arrive quickly even when, say, a remote site takes an emergency image at 2 AM. And even after an initial model is deployed, the data science team behind it can evaluate its performance and make improvements, allowing for specialized models fine-tuned to the needs of individual healthcare centers that continually adapt to changing trends.

There are many ways that the team of physician and algorithm can greatly improve the quality of patient care. Every image a radiologist reads gets the benefit of a second look by a model, which can help catch those rare but serious events where a critical, time-sensitive diagnosis is missed. There is also the potential for faster reading of time-sensitive findings. Routinely acquired images sit in a queue until a radiologist can read them; when an ordering physician suspects a critical finding, they can order a "wet read" (a term from the days when x-rays were printed on film) that prioritizes the image to be read sooner. A model, however, can flag potential findings as soon as an image is processed, regardless of whether the patient history dictated urgency. Models can therefore be trained specifically to identify critical, time-sensitive findings, creating a tiered system in which images are escalated in the queue when the model sees a high likelihood of an urgent finding (a simple version of this triage rule is sketched below). Beyond all this, late-night emergency department and ICU images, or those taken in remote locations, can get some form of immediate results even when a radiologist is not yet available to interpret them personally. These are just some of the ways machine learning can benefit the entire care team, and why I am driven to understand and be involved in this vital frontier of technology and healthcare.
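To make the tiered idea concrete, here is a minimal sketch of such a triage rule. It assumes a trained model that outputs a probability per finding; the finding names and thresholds are hypothetical illustrations, not values from my study.

```python
# Hypothetical sketch of the tiered flagging idea described above.
# The finding names and thresholds are illustrative assumptions,
# not values from the study.

URGENT_FINDINGS = {"Pneumothorax": 0.5, "Pneumonia": 0.6}

def triage(probabilities: dict) -> str:
    """Assign a reading priority from per-finding model probabilities."""
    for finding, threshold in URGENT_FINDINGS.items():
        if probabilities.get(finding, 0.0) >= threshold:
            return "urgent"      # escalate for immediate review
    if probabilities and max(probabilities.values()) >= 0.5:
        return "elevated"        # likely positive; read sooner
    return "routine"             # standard queue

print(triage({"Pneumothorax": 0.72, "Atelectasis": 0.10}))  # -> urgent
```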

Overview of the study

For this project, I utilized data gathered by the NIH Clinical Center: over 100,000 chest x-rays from more than 30,000 patients, along with findings that were text-mined from the original radiologists' reports. All identifying patient data were appropriately anonymized by the NIH. Over half of the images present no positive finding, while the rest show some combination of 14 possible findings (which are not mutually exclusive; more than one may appear on the same image), such as atelectasis, pneumonia, or a mass.
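For a sense of how those labels look in practice, the sketch below turns the text-mined findings into multi-hot targets. The file and column names are my assumptions about the NIH release; adjust them to the actual download.

```python
import pandas as pd

# Assumed file and column names; "Finding Labels" is taken to hold
# pipe-separated findings, e.g. "Cardiomegaly|Emphysema".
labels = pd.read_csv("Data_Entry_2017.csv")

# One column per finding; an image can be positive for several at once.
targets = labels["Finding Labels"].str.get_dummies(sep="|")
targets = targets.drop(columns=["No Finding"], errors="ignore")

print(targets.sum().sort_values(ascending=False))  # images per finding
```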

Through the Keras API in the TensorFlow library, I constructed a convolutional neural network, a class of neural networks well suited to analyzing visual data. By using gradient descent to optimize the weights and biases in the network, the model can be trained to assign scores and penalties to portions of an image, which essentially allows it to separate a specific element of an image from the others. The network was then tasked with image classification: using the features extracted from an image, the model attempted to classify each image by the presence of a positive finding, as well as which finding or findings it might contain. The resulting predictions were evaluated against the labels provided with the data to gauge accuracy.
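As a simplified illustration, not the exact architecture from the study, a small convolutional network for this multi-label task might look like the following in Keras. The layer sizes and input resolution are placeholders; note the sigmoid output, since the findings are not mutually exclusive.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_FINDINGS = 14  # the 14 possible findings in the NIH data

# A deliberately small stand-in for the study's network; layer sizes
# and the 128x128 input resolution are illustrative assumptions.
model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(128, 128, 1)),
    layers.MaxPooling2D(),                 # downsample feature maps
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    # sigmoid, not softmax: findings are not mutually exclusive
    layers.Dense(NUM_FINDINGS, activation="sigmoid"),
])

# binary cross-entropy treats each finding as its own yes/no question
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```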

Despite the limitations of this project, both in the data and in the fact that the study was restricted to free, open-source computational resources, the model performed significantly better than baseline accuracy. This is encouraging evidence that such models can be trained to accuracy levels sufficient for production use in clinical applications. In the coming weeks, I intend to continue this work, identifying which factors contribute most to higher accuracy and how best to clean and prepare the data for better results.

What comes next?

There are a number of avenues I intend to explore to follow up on these initial findings. For example, I will work to reduce and better balance the distribution of classes in the data. Severely imbalanced classes (say, a class that appears only rarely versus others that are very common) often cause predictive algorithms to fail to identify the rare classes accurately; one common remedy is sketched below. I will also seek more data, since fully training a model benefits from as many samples as can be acquired. As accuracy scores improve, I will also construct a secondary model designed to localize any positive findings on the image.
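One common remedy, sketched here as an assumption about where the work may go rather than a description of what was done, is to weight each class inversely to its frequency so that rare findings are not drowned out during training.

```python
import numpy as np

def inverse_frequency_weights(targets: np.ndarray) -> np.ndarray:
    """targets: (n_images, n_findings) multi-hot label matrix."""
    counts = targets.sum(axis=0)                  # positives per finding
    weights = targets.shape[0] / (counts + 1e-8)  # rare -> large weight
    return weights / weights.mean()               # center around 1.0

# Toy example: finding 0 is common, finding 1 is rare.
y = np.array([[1, 0], [1, 0], [1, 1], [1, 0]])
print(inverse_frequency_weights(y))  # the rare finding gets the larger weight
```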

It is my hope to bring my background in clinical healthcare and my more recently developed skills in data science to the design and implementation of highly accurate neural networks for diagnostic medical imaging. With access to more powerful resources, I could construct more accurate networks. As a brief example, allowing larger input sizes and passing larger batches of images through the network at once can increase a model's capacity to distinguish subtle changes across the large number of classes, but both parameters were limited by memory constraints in this study.

In conclusion

This post discusses my most recent data science study at a very high level for introductory purposes; many specifics are omitted for quick reading and to keep the potential I explored accessible to less technical audiences. For more information on the specifics of this study, and to follow any updates I post as I continue this work, please feel free to look over my GitHub repository, which houses my relevant code and findings.

With the right team of skilled individuals, sufficient resources, and determination, I believe we will soon see explosive advancement in this exciting frontier at the edge of technology and clinical practice. With my background in healthcare, my commitment to continually honing my data science skills, and an understanding of both the need for and the potential of these advancements, I intend to do what I can to help push forward the application of deep learning in medical imaging, for the benefit of our often overworked medical care teams and the continual improvement of patient care.
