Detection and Semantic Segmentation of Pneumothorax Disease from X-Ray Images using Deep Learning

Build a binary image classification model to detect if the image contains pneumothorax. If yes, then pass it through a semantic segmentation model to identify and mark the affected part.

Source: https://www.svhlunghealth.com.au/conditions/pneumothorax

Table of Contents:
1. Introduction
2. Types of Pneumothorax Disease
3. Symptoms
4. Diagnosis
5. Business Problem
6. DL Formulation
7. Business Constraints
8. Dataset Column Analysis
9. Performance metric
10. Exploratory Data Analysis
11. Existing Approaches and Improvements in my model
12. Data Preprocessing
13. Deep Learning Models
14. Final Data Pipeline
15. Error Analysis
16. Future Work
17. LinkedIn and GitHub Repository
18. References

1. Introduction:

A pneumothorax (collapsed lung) occurs when air leaks into the pleural space, the space between the lung and the chest wall. Its severity depends on how much air is trapped: a small amount of trapped air can usually resolve by itself, provided there are no other complications, while larger amounts can be serious and lead to death if medical treatment is not obtained.

2. Types of Pneumothorax Disease

a) Primary spontaneous:
Primary spontaneous pneumothorax (PSP) occurs in young people (aged 15–34) without any history of lung disease. The direct cause of PSP is unknown. People at risk include smokers, tall men, and those with a family history of pneumothorax.

b) Secondary spontaneous:
Secondary spontaneous pneumothorax (SSP) can be caused by a variety of lung diseases (such as chronic obstructive pulmonary disease, cystic fibrosis, tuberculosis, pneumonia, lung cancer, sarcoidosis, pulmonary fibrosis, or cystic lung diseases) and tissue disorders (such as Marfan's syndrome). SSP carries more serious symptoms than PSP and is more likely to cause death.

c) Traumatic pneumothorax:
A traumatic pneumothorax is the result of an impact or injury. Potential causes include blunt trauma or an injury that damages the chest wall and pleural space. One of the most common ways this occurs is when someone fractures a rib. The sharp points of the broken bone can puncture the chest wall and damage lung tissue. Other causes include sports injuries, car accidents, and puncture or stab wounds.

d) Tension pneumothorax:
A tension pneumothorax is caused by an air leak into the pleural space that behaves like a one-way valve. As a person inhales, air leaks into the pleural space and becomes trapped; it cannot escape during an exhale. The resulting build-up of air pressure in the pleural space is life-threatening and needs immediate treatment.

3. Symptoms:

4. Diagnosis:

5. Business Problem:

Our objective is to build an automated method to detect pneumothorax in X-rays and segment the affected area. This will help prioritize the treatment of patients with pneumothorax. Automatic image segmentation can assist doctors in diagnosing and treating the disease with higher accuracy, accelerate the diagnosis process, and improve efficiency.

6. DL Formulation:

What is image segmentation?
Image segmentation is the task of classifying each pixel of an image into an object class. Based on how these pixels are classified, there are broadly two types of segmentation:
I) Semantic segmentation and II) Instance segmentation.

I) Semantic Segmentation:
In semantic segmentation, every pixel is assigned to a particular class (for example, background or person), and all pixels belonging to the same class are represented by the same color (background as black and person as pink).

II) Instance Segmentation:
In instance segmentation, every pixel also belongs to a particular class, but different objects of the same class receive different labels and therefore different colors (Person 1 as red, Person 2 as green, background as black, etc.).

As mentioned earlier, our problem is a semantic segmentation problem where every pixel has to be classified as either mask (pneumothorax) or background.

7. Business Constraints:

8. Dataset Column Analysis:

Given Dataset

The given data consists of ImageId and EncodedPixels. For every ImageId, we have an image in DICOM format. An EncodedPixels value of "-1" indicates an image without pneumothorax. Images with pneumothorax have masks in run-length-encoded (RLE) format, which we have to decode to create the masks. A small sketch of deriving class labels from this column is shown below.
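
As a rough illustration (the CSV file name and exact column names below are assumptions based on the description above, not confirmed by the source), the binary class labels can be derived from the EncodedPixels column like this:

    import pandas as pd

    # Assumed file and column names; "-1" marks images without pneumothorax.
    df = pd.read_csv("train-rle.csv", names=["ImageId", "EncodedPixels"], header=0)

    # Binary class label: 1 = pneumothorax present (an RLE mask), 0 = absent ("-1").
    df["label"] = (df["EncodedPixels"].str.strip() != "-1").astype(int)
    print(df["label"].value_counts())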

9. Performance metric:

a) Classification Model:
For the classification model, I measure performance using recall, since failing to flag an X-ray that actually contains pneumothorax is the costliest error; the confusion matrix at different probability thresholds is also examined (see Section 13).

b) Segmentation Model:
I am given the images along with their masks and have to train a model on this data to predict masks for the test data, so this is a semantic image segmentation problem. I measure the performance of the segmentation model with the "IoU score" and use a combination of "binary_crossentropy" and "dice_loss" as the loss function. These terms are explained below.

I. Intersection over Union (IoU) Score:
The Intersection over Union (IoU) metric, also referred to as the Jaccard index, quantifies the percent overlap between the target mask and our prediction output. This metric is closely related to the Dice coefficient.
The IoU metric measures the number of pixels common between the target and prediction masks divided by the total number of pixels present across both masks.

II. Pixel-wise cross-entropy loss:
This loss examines each pixel individually, comparing the class predictions (depth-wise pixel vector) to our one-hot encoded target vector.
Pixel-wise loss is calculated as the log loss summed over all possible classes.

III. Dice loss:
Dice Loss = 1-Dice Coefficient

where Dice Coefficient (D) = 2 × Σ(pi × gi) / (Σpi + Σgi)

Here, pi = predicted pixel values.
gi = ground truth pixel values.
In the image segmentation scenario, the values of pi and gi are either 0 or 1.
1 → pixel is a boundary
0 → pixel is not a boundary
In the dice coefficient,
Numerator → 2 × the sum of correctly predicted boundary pixels (pixels where pi and gi are both 1).
Denominator → the sum of the total boundary pixels of both the predicted and ground truth masks.
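
To make these definitions concrete, here is a minimal sketch of the IoU score and the combined loss, assuming TensorFlow and binary masks with values in [0, 1]; it is an illustration, not the exact code used in this project:

    import tensorflow as tf

    def iou_score(y_true, y_pred, smooth=1e-6):
        # Binarize predictions, then compute intersection / union.
        y_pred = tf.cast(y_pred > 0.5, tf.float32)
        intersection = tf.reduce_sum(y_true * y_pred)
        union = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) - intersection
        return (intersection + smooth) / (union + smooth)

    def dice_loss(y_true, y_pred, smooth=1e-6):
        # Dice loss = 1 - 2*sum(pi*gi) / (sum(pi) + sum(gi)).
        intersection = tf.reduce_sum(y_true * y_pred)
        dice = (2.0 * intersection + smooth) / (
            tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + smooth)
        return 1.0 - dice

    def bce_dice_loss(y_true, y_pred):
        # Combined loss: pixel-wise binary cross-entropy plus dice loss.
        bce = tf.keras.losses.binary_crossentropy(y_true, y_pred)
        return tf.reduce_mean(bce) + dice_loss(y_true, y_pred)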

10. Exploratory Data Analysis:

Out of 12,954 image IDs, only 12,047 are unique, which means there are duplicates. So, I have to remove the duplicate image IDs.

The images are given in DICOM format, and the information needed for EDA has to be extracted from their metadata.

From this metadata, I extract the age, sex, modality, body part, and view position of every image and use these fields for the EDA; a minimal extraction sketch is shown below.
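
One common way to read these attributes is the pydicom package; the tooling below is an assumption (the TensorFlow I/O DICOM tutorial in the references is an alternative), but the attribute names are standard DICOM tags:

    import pydicom

    def extract_metadata(dcm_path):
        # Read one DICOM file and pull out the fields used in the EDA below.
        ds = pydicom.dcmread(dcm_path)
        return {
            "age": ds.PatientAge,
            "sex": ds.PatientSex,
            "modality": ds.Modality,
            "body_part": ds.BodyPartExamined,
            "view_position": ds.ViewPosition,
        }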

a) Distributions of Class Labels:

If the RLE mask field for an image is "-1", the image is labeled negative (no pneumothorax); otherwise it is labeled positive.

Observation:
This is an imbalanced dataset: among all the X-ray images, 77.85% are without pneumothorax and 22.15% are with pneumothorax.

b) Distribution of Gender:

Observation:
There are 55% male patients and 45% female patients in the given dataset.

c) Distribution of Gender along with Class Label:

Observation:
Among all the male patients, 77.53% are without pneumothorax and 22.47% are with pneumothorax; among all the female patients, 78.23% are without pneumothorax and 21.77% are with pneumothorax. The pneumothorax distribution is almost the same for male and female patients.

d) Distribution of View Position:

Posteroanterior view (PA):
The x-ray source is positioned so that the x-ray beam enters through the posterior (back) aspect of the chest and exits out of the anterior (front) aspect, where the beam is detected.
Anteroposterior view (AP):
The x-ray source and detector are reversed: the x-ray beam enters through the anterior aspect and exits through the posterior aspect of the chest. AP chest x-rays are harder to read than PA x-rays and are therefore generally reserved for situations where it is difficult for the patient to get an ordinary chest x-ray, such as when the patient is bedridden.

Observation:
In the dataset, the view position is PA for 60.38% of the images and AP for 39.62%.

e) Distributions of Patient’s Age for different Class Labels:

Observation:
a) Patients aged 0–6 years and 90–100 years show no pneumothorax in this dataset.
b) For patients aged 16, the count of images with pneumothorax is higher than the count without.

11. Existing Approaches and Improvements in my model:

As 78% of the given images do not contain pneumothorax, I have split the solution into two parts:
a) Classification: Firstly, I will build a binary image classification model using pre-trained models and transfer learning to classify the image into positive pneumothorax or negative pneumothorax. I will use images along with their class labels to train the classification model.
b) Segmentation: I will build an image segmentation model to segment the pneumothorax-affected area, training it only on the images that contain pneumothorax and their corresponding masks. I will use the U-Net architecture with DenseNet121 as the encoder. If the classification model predicts positive pneumothorax, the image is passed through the segmentation model for mask prediction.

12. Data Preprocessing:

a) Decode DICOM images:
The images are given as DICOM files, which are first decoded into image arrays (see reference 3) so they can be fed to the models.

b) Convert RLE to PNG mask:
The masks are given in Run Length Encoded (RLE) format and have to be converted to PNG masks. The organizers provide a helper function for this conversion; a sketch of such a decoder is given at the end of this section.

What is Run Length Encoding?
Run-length encoding (RLE) is a very simple form of lossless data compression: a stream of data is given as input (e.g., "AAABBCCCC") and the output is a sequence of counts of consecutive data values (e.g., "3A2B4C"). When decompressed, all of the original data is recovered. Its simplicity in both encoding (compression) and decoding (decompression) is one of its most attractive features.
In the mask encoding used here, the numbers come in pairs: the first number of each pair gives a pixel start offset and the second gives the run length, i.e., the number of consecutive masked pixels.
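
Below is a sketch of an RLE-to-mask decoder in the spirit of the organizers' utility. It assumes the pairs are (relative start offset, run length) over a flattened image; treat the details as assumptions rather than the official code:

    import numpy as np

    def rle2mask(rle, width, height):
        mask = np.zeros(width * height, dtype=np.uint8)
        array = np.asarray([int(x) for x in rle.split()])
        starts, lengths = array[0::2], array[1::2]
        position = 0
        for start, length in zip(starts, lengths):
            position += start                       # offsets are relative
            mask[position:position + length] = 255  # mark the masked run
            position += length
        return mask.reshape(width, height)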

13. Deep Learning Models:

a) Classification Model: Firstly, I have to build a data pipeline for the classification model using the decoded images and their corresponding labels. Below is the code snippet for the data pipeline for the classification model.
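
The original snippet was shown as an image; the sketch below is a minimal tf.data pipeline under assumed PNG paths, image size (256×256), and batch size:

    import tensorflow as tf

    def load_image(path, label):
        img = tf.io.read_file(path)
        img = tf.image.decode_png(img, channels=3)
        img = tf.image.resize(img, (256, 256))
        img = tf.cast(img, tf.float32) / 255.0  # scale pixels to [0, 1]
        return img, label

    # image_paths: decoded PNG file paths; labels: 0 = negative, 1 = positive.
    def make_dataset(image_paths, labels, batch_size=16, shuffle=True):
        ds = tf.data.Dataset.from_tensor_slices((image_paths, labels))
        if shuffle:
            ds = ds.shuffle(buffer_size=len(image_paths))
        ds = ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
        return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)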

Now, I will create my classification model using the VGG19 architecture with pre-trained ImageNet weights, setting all layers of the VGG19 base to "trainable=False". I tried VGG16 as well but obtained a better recall value with VGG19.

Now, I will compile and train this model and save the best one using checkpoints; a minimal sketch of this setup is shown below.
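
The sketch below illustrates the frozen-VGG19 classifier and its training setup; the input size, classification head, and file names are illustrative assumptions, not the article's exact configuration:

    import tensorflow as tf
    from tensorflow.keras.applications import VGG19

    base = VGG19(weights="imagenet", include_top=False, input_shape=(256, 256, 3))
    base.trainable = False  # freeze all VGG19 layers, as described above

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(pneumothorax)
    ])

    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.Recall(name="recall")],
    )

    # Save the weights of the epoch with the best validation recall.
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        "best_classifier.h5", monitor="val_recall", mode="max",
        save_best_only=True, save_weights_only=True,
    )
    # model.fit(train_ds, validation_data=val_ds, epochs=20, callbacks=[checkpoint])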

The graphs obtained from TensorBoard are displayed below.

As we can see from the graph, the best validation recall is obtained at epoch 7; the weights for this epoch are saved using model checkpoints.

As I said earlier, I tried VGG16 as well. Below is the comparison for both models.

From the above table, we can see that VGG19 gives better recall compared to VGG16.

Now I define a function that predicts the class labels for the validation data using this classification model and plots the confusion matrix.

Now we have to check which threshold value, ranging from 0.1 to 0.9, gives the best predictions; a sketch of this sweep is shown below.
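
A small sketch of such a sweep (val_labels and val_probs are assumed to hold the validation labels and predicted probabilities):

    import numpy as np
    from sklearn.metrics import confusion_matrix, recall_score

    def evaluate_thresholds(val_labels, val_probs):
        # Try thresholds 0.1, 0.2, ..., 0.9 and report recall + confusion matrix.
        for threshold in np.arange(0.1, 1.0, 0.1):
            preds = (val_probs >= threshold).astype(int)
            print(f"threshold={threshold:.1f} "
                  f"recall={recall_score(val_labels, preds):.3f}")
            print(confusion_matrix(val_labels, preds))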

From the output of the above code snippet, I found that threshold=0.3 gives the best result in terms of all the parameters. The confusion matrix for this threshold value is plotted below.

Below is the distribution of predicted probabilities for different class labels from the classification model.

It is observed that there is a large overlap between the probability scores of positive and negative class labels.

b) Semantic Segmentation Model: The semantic segmentation model is trained only on the positive-pneumothorax images along with their corresponding masks. As with the classification model, I have to build a data pipeline for the segmentation model. Below is the code snippet.
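
As before, the original snippet was an image; this is a rough (image, mask) pipeline sketch, assuming the masks were saved as PNGs in the preprocessing step and a 256×256 input size:

    import tensorflow as tf

    def load_pair(img_path, mask_path):
        img = tf.image.decode_png(tf.io.read_file(img_path), channels=3)
        img = tf.image.resize(img, (256, 256)) / 255.0
        mask = tf.image.decode_png(tf.io.read_file(mask_path), channels=1)
        mask = tf.image.resize(mask, (256, 256)) / 255.0  # binary mask in [0, 1]
        return img, mask

    def make_seg_dataset(img_paths, mask_paths, batch_size=8):
        ds = tf.data.Dataset.from_tensor_slices((img_paths, mask_paths))
        ds = ds.map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
        return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)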

I have used the U-Net architecture for this semantic segmentation task, replacing the encoder part of the U-Net with a pre-trained DenseNet121 backbone (ImageNet weights) and keeping the decoder part the same. Below is the code snippet.
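
The article builds the decoder by hand; as a convenient stand-in, the segmentation_models package can produce an equivalent U-Net with a DenseNet121 encoder. This is a sketch, not necessarily how the model here was constructed:

    import os
    os.environ["SM_FRAMEWORK"] = "tf.keras"  # use the tf.keras backend
    import segmentation_models as sm

    unet = sm.Unet(
        backbone_name="densenet121",
        encoder_weights="imagenet",   # pre-trained encoder, as described above
        input_shape=(256, 256, 3),
        classes=1,
        activation="sigmoid",         # one output channel: mask vs. background
    )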

Next, define the callbacks, compile and train the model, and save the best model using checkpoints.
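
A sketch of that training setup, reusing the `unet` model from the snippet above (the callback settings are illustrative assumptions):

    import tensorflow as tf
    import segmentation_models as sm

    unet.compile(
        optimizer="adam",
        loss=sm.losses.bce_dice_loss,    # binary cross-entropy + dice loss
        metrics=[sm.metrics.iou_score],  # IoU score, as defined in Section 9
    )

    callbacks = [
        tf.keras.callbacks.ModelCheckpoint(
            "best_unet.h5", monitor="val_iou_score", mode="max",
            save_best_only=True, save_weights_only=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", patience=3),
    ]
    # unet.fit(train_seg_ds, validation_data=val_seg_ds, epochs=30,
    #          callbacks=callbacks)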

Below are the graphs of the IoU score and the loss obtained from TensorBoard.

The best validation IoU score of 0.3066 is obtained at epoch 17; the weights are saved using checkpoints for future use.

Below are some of the images and their corresponding original and predicted masks from the above model.

Below is the distribution of the IoU score for all the predicted masks.

Observation:
a) There are around 200 images whose IoU score is less than 0.1.
b) The model needs to be trained on more images similar to those with very low IoU scores so that it can learn them better.

14. Final Data Pipeline:

Final Pipeline for Mask Prediction
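
A minimal sketch of the two-stage pipeline: classify first, segment only on a positive prediction. The 0.3 threshold comes from Section 13; the image size and model handles are assumptions:

    import numpy as np
    import tensorflow as tf

    def predict_pneumothorax(image_path, clf_model, seg_model, threshold=0.3):
        # Load and preprocess the X-ray exactly as during training.
        img = tf.image.decode_png(tf.io.read_file(image_path), channels=3)
        img = tf.image.resize(img, (256, 256)) / 255.0
        batch = tf.expand_dims(img, 0)

        # Stage 1: binary classification.
        prob = float(clf_model.predict(batch)[0][0])
        if prob < threshold:
            print("THIS IMAGE DOES NOT CONTAIN PNEUMOTHORAX")
            return None

        # Stage 2: semantic segmentation of the affected area.
        print("THIS IMAGE CONTAINS PNEUMOTHORAX")
        mask = seg_model.predict(batch)[0]
        return (mask > 0.5).astype(np.uint8)  # binarized predicted mask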

Below are some sample predictions received from the final pipeline given an image path as input.

When the classification model gives a negative result, only the image is displayed, with the title "THIS IMAGE DOES NOT CONTAIN PNEUMOTHORAX".

When the classification model gives a positive result, the image is passed through the segmentation model, which predicts the mask. The image is then displayed along with the mask under the title "THIS IMAGE CONTAINS PNEUMOTHORAX".

15. Error Analysis:

a) Classification Model:
Find out the false-negative points and display a few of them.

False Negative

Conclusion:
1. False-negative points whose probability scores are very low (near zero) are classified completely wrongly. To fix this, we need to oversample this data so that the model can learn from similar images.
2. False-negative points whose probability score is just below the threshold are also classified wrongly, but these can be fixed by training the model further.

Find out the false positive points and display a few of them.

Conclusion:
1. False-positive points whose probability score is high (near one) are classified completely wrongly. We need to oversample this data and retrain the model to get better results.
2. False-positive points whose probability score is just above the threshold are also classified wrongly, but these can be fixed by training the model further.

b) Segmentation Model:
First, store the IoU score of the predicted mask for every image. Then sort the DataFrame by IoU score in descending order and display a few images, along with their original and predicted masks, from the top of the DataFrame, i.e., those with the best IoU scores. A minimal sketch of this bookkeeping is shown below.
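
A small sketch of that bookkeeping (the IoU helper mirrors the metric from Section 9; the names are illustrative):

    import numpy as np
    import pandas as pd

    def mask_iou(true_mask, pred_mask):
        # Binary IoU between two 0/1 masks.
        intersection = np.logical_and(true_mask, pred_mask).sum()
        union = np.logical_or(true_mask, pred_mask).sum()
        return intersection / union if union > 0 else 1.0

    def build_iou_table(image_ids, true_masks, pred_masks):
        # One row per image, sorted best-to-worst by IoU.
        rows = [{"image": i, "iou": mask_iou(t, p)}
                for i, t, p in zip(image_ids, true_masks, pred_masks)]
        return pd.DataFrame(rows).sort_values("iou", ascending=False)

    # df = build_iou_table(ids, y_true, y_pred)
    # df.head() -> best-segmented images; df.tail() -> worst (IoU < 0.1 cases)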

Display a few images along with their original and predicted masks from the bottom of the DataFrame, i.e., those with very low IoU scores.

Conclusion: Images whose IoU score is below 0.1 produce very poor segmentation results.

16. Future Work:

  1. Through error analysis, we have identified the images that produce false-negative and false-positive predictions from the classification model. If we oversample these images so that the model can learn more from them, we may get better results.
  2. For the segmentation model, I have similarly filtered out the images with very low IoU scores. If these images are oversampled and the model is retrained, we may get better results.

17. LinkedIn and GitHub Repository:

18. References:

  1. U-Net: Convolutional Networks for Biomedical Image Segmentation
    https://www.semanticscholar.org/paper/U-Net%3A-Convolutional-Networks-for-Biomedical-Image-Ronneberger-Fischer/6364fdaa0a0eccd823a779fcdd489173f938e91a
  2. Metrics to Evaluate your Semantic Segmentation Model
    https://towardsdatascience.com/metrics-to-evaluate-your-semantic-segmentation-model-6bcb99639aa2
  3. Decode DICOM files for medical imaging
    https://www.tensorflow.org/io/tutorials/dicom
  4. www.appliedaicourse.com
