Use Case: Retinal OCT

Martin Isaksson
4 min read · Aug 10, 2021

The human eye is a complex sensing organ that reacts to light to provide us with a field of view into the world around us. Taking care of our eyes is an important aspect of our overall healthcare. However, our eyes are susceptible to a number of diseases, some of which stem from other ailments or simply from aging.

One method that medical practitioners use to detect such issues is Optical Coherence Tomography (OCT). OCT uses light waves to construct cross-sectional images of the retina, allowing the practitioner to map and measure the retina’s layers. Millions of these scans are conducted each year, and analyzing them all takes a significant amount of time.

With the growing use of machine learning (ML) in healthcare, we set out to build an image recognition model in PerceptiLabs that could help automate the analysis of retinal OCT scans. A model like this could help doctors, researchers, and other healthcare practitioners diagnose eye diseases more quickly and accurately.

Dataset

To train our model, we used images from the Retinal OCT Images dataset on Kaggle. The original dataset comprises over 80,000 .jpeg images of varying resolutions divided into four classifications. One classification represents normal OCT scans (i.e., no diseases detected), and the others are OCT scans for three diseases: Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), and Drusen (i.e., multiple drusen present in early age-related macular degeneration (AMD)).

The original dataset is unbalanced: 44.6% of the images are for CNV, 31.5% for normal, 13.6% for DME, and 10.3% for Drusen. To eliminate potential bias during training, we created a balanced dataset using a subset of 4,000 images per classification.
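
As a minimal sketch of how such a balanced subset could be drawn, assuming the Kaggle dataset’s standard one-folder-per-classification layout (the directory path here is hypothetical):

```python
import random
from pathlib import Path

# Assumed layout: the Kaggle download ships one folder per classification.
SOURCE_DIR = Path("OCT2017/train")  # hypothetical path -- adjust to your download
CLASSES = ["CNV", "DME", "DRUSEN", "NORMAL"]
IMAGES_PER_CLASS = 4000

random.seed(42)  # make the sampling reproducible

balanced = {}
for label in CLASSES:
    files = sorted((SOURCE_DIR / label).glob("*.jpeg"))
    # Draw the same number of images from every class to remove the imbalance.
    balanced[label] = random.sample(files, IMAGES_PER_CLASS)
```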

Figure 1 shows some example images from this dataset:

Figure 1: Examples of images from the dataset.

To map the classifications to the images, we created a .csv file that associates each image file with its classification label (CNV, DME, DRUSEN, or NORMAL), which PerceptiLabs’ Data Wizard uses to load the data. Below is a partial example of how the .csv file looks:

Example of the .csv file to load data into PerceptiLabs that maps the image files to their classification labels.
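
The full .csv isn’t shown here, but a short script along these lines could generate it from the balanced sample above. The column headers are an assumption; use whatever names your Data Wizard configuration expects.

```python
import csv

# Continuing from the sampling sketch above; the header names are an
# assumption -- match the columns your PerceptiLabs Data Wizard setup expects.
with open("retinal_oct.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image_path", "label"])
    for label, files in balanced.items():
        for path in files:
            writer.writerow([str(path), label])
```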

Model Summary

Our model was built with four Components:

Component 1: Merge, Three inputs from the same Input source.

Component 2: MobileNetV2, include_top=false, pretrained=imagenet

Component 3: Dense, Activation=ReLU, Neurons=128

Component 4: Dense, Activation=ReLU, Neurons=4

The model uses transfer learning via MobileNetV2, with a Merge component that converts each grayscale image to RGB by concatenating three copies of the same input. This is required because the pre-trained MobileNetV2 has frozen weights, and its first convolution layer was trained on three-channel (RGB) inputs. Figure 2 shows the model’s topology in PerceptiLabs:

Figure 2: Topology of the model in PerceptiLabs.
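
PerceptiLabs assembles this visually, but a rough Keras sketch of the same topology may help clarify it. The 224×224 input size is an assumption, input preprocessing is omitted, and the softmax on the output layer is our substitution for the ReLU listed above, since softmax is the standard choice for a four-class cross-entropy setup.

```python
import tensorflow as tf

# Grayscale input; the Merge component is mimicked by concatenating the
# single channel three times to produce a pseudo-RGB image.
inputs = tf.keras.Input(shape=(224, 224, 1))  # assumed input resolution
rgb = tf.keras.layers.Concatenate()([inputs, inputs, inputs])

# Pre-trained MobileNetV2 backbone with frozen ImageNet weights (include_top=False).
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
backbone.trainable = False

x = backbone(rgb, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(128, activation="relu")(x)
outputs = tf.keras.layers.Dense(4, activation="softmax")(x)  # four classifications

model = tf.keras.Model(inputs, outputs)
```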

Training and Results

We trained the model in batches of 32 across 10 epochs, using the Adam optimizer, a learning rate of 0.001, and a cross-entropy loss function. With a training time of around 11.5 minutes, we achieved a training accuracy of 95.78% and a validation accuracy of 83.69%.
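
In code form, the equivalent training configuration might look like the following sketch. Here train_ds and val_ds are hypothetical tf.data pipelines built from the .csv above, and categorical cross-entropy assumes one-hot labels.

```python
# train_ds / val_ds are placeholder tf.data pipelines built from the CSV,
# already batched at 32 (batching belongs in the pipeline, not in fit()).
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",  # assumes one-hot encoded labels
    metrics=["accuracy"],
)
history = model.fit(train_ds, validation_data=val_ds, epochs=10)
```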

Figure 3 shows PerceptiLabs’ Statistics view during training:

Figure 3: PerceptiLabs’ Statistics View during training.

Figures 4 and 5 below show the accuracy and loss across the 10 epochs during training:

Figure 4: Accuracy during training.
Figure 5: Loss during training.

In Figure 4 we can see that both training and validation accuracy started at approximately 79% to 80%. Training accuracy continued to climb before stabilizing at around the seventh epoch, while validation accuracy gained little throughout. Interestingly, training loss (shown in Figure 5) dropped the most during the first epoch or so and then declined steadily, with a couple of small, temporary increases along the way. Validation loss, on the other hand, remained fairly stable until about the fifth epoch, after which it slowly increased through the final epoch. Together with the gap between training and validation accuracy, the rising validation loss suggests the model was starting to overfit.

Vertical Applications

A model like this could be used by both medical practitioners and researchers who need to detect retinal diseases. The model could be used to analyze large quantities of images, flagging cases that need further attention. The model itself could also be used as the basis for transfer learning to create additional models for detecting ailments from other types of medical scans.

Summary

This use case is an example of how image recognition can be used in healthcare. If you want to build a deep learning model similar to this, run PerceptiLabs and check out the repo we created for this use case on GitHub. Also be sure to check out our other eye-related use case: Ocular Disease Recognition.

Martin Isaksson is Co-Founder and CEO of PerceptiLabs, a startup focused on making machine learning easy.