Description

This repository provides the code for, and describes, a deep learning tumor segmentation model that I developed by fine-tuning Meta's foundation model MedSAM on the publicly available LIDC-IDRI dataset. The project is part of a larger diagnostic pipeline designed for use by UHN Princess Margaret Cancer Centre.

Preprocessing

Preprocessing was required to make the publicly available LIDC-IDRI dataset compatible with Meta's foundation model MedSAM. Several issues had to be tackled:

Converting the 3D lung DICOM files to 2D images.
Converting the 2D lung tumor annotations to 2D images.
Matching the lung scan images to their corresponding annotations using the DICOM metadata.
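
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: it assumes the CT volume has already been loaded into a NumPy array of Hounsfield units, that each slice carries a unique identifier taken from the DICOM metadata, and that a common lung window (center -600, width 1500) is used for intensity scaling. All function names and the toy identifiers are hypothetical.

```python
import numpy as np

def window_to_uint8(slice_hu, center=-600.0, width=1500.0):
    """Clip a 2D slice of Hounsfield units to a window and rescale to 0-255."""
    low, high = center - width / 2.0, center + width / 2.0
    clipped = np.clip(slice_hu.astype(np.float32), low, high)
    return ((clipped - low) / (high - low) * 255.0).round().astype(np.uint8)

def volume_to_slices(volume_hu, slice_ids):
    """Split a (depth, height, width) volume into {slice_id: 2D uint8 image}."""
    if volume_hu.shape[0] != len(slice_ids):
        raise ValueError("one identifier is required per slice")
    return {sid: window_to_uint8(volume_hu[i]) for i, sid in enumerate(slice_ids)}

def pair_slices_with_masks(images, masks):
    """Keep only slices that have an annotation mask under the same identifier."""
    return [(sid, images[sid], masks[sid]) for sid in sorted(images) if sid in masks]

# Toy data standing in for one CT series and its annotations (assumption).
volume = np.random.default_rng(0).integers(-1000, 400, size=(3, 8, 8))
images = volume_to_slices(volume, ["uid-a", "uid-b", "uid-c"])
masks = {"uid-b": np.zeros((8, 8), dtype=np.uint8)}
pairs = pair_slices_with_masks(images, masks)
```

Keying both images and masks by the same metadata identifier keeps the matching step independent of file order on disk.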

Visualization

Below is a comparison of the performance of the MedSAM model before and after being fine-tuned on a subset of the LIDC-IDRI data. The subset included 240 lung slices, about 2.5% of the total dataset.

The image on the left is the ground truth.
The image in the middle is a lung slice passed through the MedSAM model without fine-tuning. The resulting Dice coefficient is 0.287.
The image on the right is the same lung slice passed through the fine-tuned MedSAM model. The Dice coefficient improved substantially, reaching 0.873.
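
For reference, the Dice coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (perfect overlap). A minimal sketch of that computation on binary masks is shown below; the function name is illustrative, and returning 1.0 when both masks are empty is an assumed convention rather than something specified in this project.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * intersection / total)

# Toy masks: two 2x2 squares that overlap in one column.
truth = np.zeros((4, 4), dtype=np.uint8)
truth[1:3, 1:3] = 1
prediction = np.zeros((4, 4), dtype=np.uint8)
prediction[1:3, 2:4] = 1
score = dice_coefficient(prediction, truth)
```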

Preliminary results

I started by training MedSAM on a subset of the LIDC-IDRI dataset. This subset only included tumors larger than 14 mm, which resulted in a dataset of 550 lung slices. After performing 5-fold cross-validation, I found that the model achieves an average Dice coefficient of 0.893. These results are promising for my next step, which is to train MedSAM on the full LIDC-IDRI dataset for tumors 3 mm or larger, which represents about 10,500 lung images.
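
The 5-fold procedure can be sketched as below. This is only an outline under stated assumptions: the splitting is done per slice with a fixed random seed, and `placeholder_evaluate` is a hypothetical stand-in for the real step of fine-tuning on the training indices and measuring the mean Dice coefficient on the validation indices.

```python
import random
from statistics import mean

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal parts
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

def cross_validate(n_samples, evaluate_fold, k=5, seed=0):
    """Return the mean score and the list of per-fold scores."""
    scores = [evaluate_fold(train, val)
              for train, val in k_fold_splits(n_samples, k, seed)]
    return mean(scores), scores

# Hypothetical stand-in: a real evaluate_fold would fine-tune on `train`
# and return the mean Dice coefficient measured on `val`.
def placeholder_evaluate(train, val):
    return len(val) / (len(train) + len(val))

average, per_fold = cross_validate(550, placeholder_evaluate)
```

Because neighbouring slices from the same patient look alike, splitting by patient rather than by slice may give a more conservative estimate; the sketch above does not do this.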

However, these preliminary results were obtained by training the model only on lung slices that contained tumors. Ultimately, I want the model to take the entire 3D lung scan as input, which will inevitably include lung slices that do not contain any tumors. Next steps are detailed below.

Next Steps

Further preprocessing

As mentioned above, the goal is for the model to perform well on lung slices both with and without tumors. I am working on balancing the dataset so that 60% of the paired lung slices and annotations contain no tumors and 40% contain tumors.
Additionally, filtering out 'closed' lung images, which sit at the beginning and the end of the scan where the lung begins to close, will improve the efficiency of the model.
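
One way to reach the 60/40 split is to subsample both groups to the largest dataset size that still satisfies the target ratio. The sketch below illustrates that idea only; the function name, the item representation, and the seeded random subsampling are assumptions, and it does not cover the filtering of 'closed' lung images.

```python
import random

def balance_dataset(tumor_items, empty_items, empty_fraction=0.6, seed=0):
    """Subsample so that about `empty_fraction` of the result has no tumor."""
    if not 0.0 < empty_fraction < 1.0:
        raise ValueError("empty_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Largest total size for which both groups have enough samples.
    total = int(min(len(tumor_items) / (1.0 - empty_fraction),
                    len(empty_items) / empty_fraction))
    n_empty = round(total * empty_fraction)
    n_tumor = total - n_empty
    balanced = rng.sample(empty_items, n_empty) + rng.sample(tumor_items, n_tumor)
    rng.shuffle(balanced)
    return balanced

# Toy items: (slice identifier, contains_tumor) pairs (assumption).
tumor_slices = [("t%d" % i, True) for i in range(400)]
empty_slices = [("e%d" % i, False) for i in range(1000)]
balanced = balance_dataset(tumor_slices, empty_slices)
empty_share = sum(1 for _, has_tumor in balanced if not has_tumor) / len(balanced)
```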

Widening the scope

I am currently working on training MedSAM on the full LIDC-IDRI dataset with tumors 3 mm or larger, which represents about 10,500 lung images.

Acknowledgments

Thank you to Meta AI for making the foundation model MedSAM publicly available. The link to its official repository
I am also grateful to have been able to use the open-source Lung Image Database Consortium image collection (LIDC-IDRI) to fine-tune this model. Access the dataset here

Running the code

This code runs on CPU by default. To run the model on GPU, change pre_grey_rgb2D.py and DL_model.py accordingly.

Installation

Create a virtual environment with conda create -n medsamtumour python=3.10 -y and activate it with conda activate medsamtumour
Install PyTorch 2.0
Install MONAI with pip install monai
Clone the repository with git clone https://github.com/charlottevedrines/TumorSegMedSam
Enter the MedSAM folder with cd MedSAM and run pip install -e .

Getting Started

Download the model checkpoint and place it in work_dir/SAM/
Download a subset of the LIDC-IDRI dataset and place it in MergedImages

To start, run the script CentralScript_g.py. This will run the model on a sample of the LIDC-IDRI dataset included in this repository.

