ResU-Net: Retinal Vessel Segmentation Project by Sahil Baranwal

Hello everyone, this is Sahil Baranwal from IIEST, Shibpur. Last summer I did a project on segmenting retinal blood vessels with deep learning techniques. This post gives a general overview of what I did in that project. Feedback will be appreciated.


Marking the vessels of the retina is important for diagnosing and treating various cardiovascular and ophthalmological diseases, which is why automatic vessel segmentation is needed to monitor these chronic diseases and their related complications. In recent years, deep learning methods have reached state-of-the-art performance in retinal vessel segmentation. Retinal images still vary a lot: noisy backgrounds, the optic disk and thin vessels all cause difficulties. To overcome these problems, we came up with a U-Net-like model with a weighted Res-Net. Our model is applied to the publicly available DRIVE dataset, and our proposal also uses data augmentation techniques, which allow us to exploit more of the information in the fundus images during training.


The retina is one of the most important parts of the human body, and it can be observed through non-invasive fundus imaging. Segmented retinal vessels serve as a crucial signal for diagnosing and treating chronic eye problems, cardiovascular diseases and, most importantly, diabetic retinopathy. Retinal vessel segmentation decides which pixels belong to vessels based on their intensity. Manual segmentation is time-consuming and not very accurate. Deep learning methods based on CNNs help reduce medical cost and increase efficiency.

The factors that make vessel segmentation challenging are:

  1. Small vessels at the ends of the blood vessels are generally difficult to detect.
  2. The optic disk region is somewhat brighter than its surroundings, which makes vessel segmentation there problematic.
  3. Poor or overexposed illumination reduces the contrast of the retinal image, which results in irregularly shaped vessel boundaries, among other problems.

To address these problems, we propose an efficient and accurate retinal vessel segmentation model named weighted Res-Unet. The model builds on the original U-Net and adds residual networks with skip connections. In this way, it can learn vessel segmentation in a more efficient and descriptive way. The model is trained and tested with Keras and TensorFlow. We use the DRIVE dataset, which is already split into training and testing sets. Each folder contains the fundus images, the ground truth (segmented vessels) for each fundus image, and the mask of each fundus image. The training and testing sets contain 20 images each.


Methods for retinal blood vessel segmentation can be classified as unsupervised or supervised. Unsupervised methods attempt to find built-in patterns of blood vessels without any labeled information; they include thresholding, conventional matched filtering, vessel tracking, etc. Supervised methods train a model to segment retinal vessels from training data annotated by professional ophthalmologists. With the development of Convolutional Neural Networks (CNNs) in deep learning, an effective way to solve the segmentation problem has emerged.

Several earlier methods have been used for vessel segmentation. Jose Ignacio Orlando et al. proposed a method that addresses the problem of segmenting thin and elongated vessel structures. It uses a conditional random field (CRF) with more expressive potentials, taking advantage of recent results enabling inference of fully connected models almost in real time. Huazhu Fu et al. applied a multi-scale and multi-level CNN with a side-output layer to learn a hierarchical representation, and at the same time used a CRF to model long-range interactions between pixels. Americo Filipe Moreira Oliveira et al. proposed a method that copes with the varying width and direction of vessel structures by combining multiscale analysis from a wavelet transform with a fully convolutional neural network. M2U-Net is an encoder-decoder model inspired by the U-Net architecture. It adds pre-trained components of MobileNetV2 in the encoder and contractive blocks in the up-sampling part of the decoder.


Our proposed method is an accurate and robust U-Net model in which a residual network is attached after every down-sampling and up-sampling step, in addition to concatenating each down-sampled layer with the corresponding up-sampled layer. The architecture is described in detail below.

A. Retinal Fundus Image pre-processing
Fundus images suffer from many problems related to contrast, illumination and brightness, but a CNN can reach better accuracy and performance when it is given suitably pre-processed input. In addition, the training set has only 20 images, so we need to augment it to make fuller use of the fundus image information and train efficiently. First we load the data and convert it to grayscale. Then we augment each fundus image with operations such as rotation, random cropping, random shearing, random zoom, etc., which enlarges the training set many times over. After the grayscale conversion we apply CLAHE (contrast limited adaptive histogram equalization) and normalize the image as:

i = (i − μ) / σ

where σ and μ are the standard deviation and the mean of the extracted gray image i.
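
The normalization above, together with a very reduced form of the augmentation, can be sketched in NumPy as follows. This is only an illustration and not the project's actual pipeline: the function names, the grayscale weights (the common 0.299/0.587/0.114 luma weights), the 48-pixel crop size and the restriction to 90-degree rotations are all assumptions made for the example, and CLAHE, shearing and zoom are left out. The random arrays only stand in for a fundus image and its vessel label.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to grayscale (luma weights are an assumption)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb @ np.array([0.299, 0.587, 0.114])

def standardize(gray, eps=1e-8):
    """Apply i = (i - mu) / sigma; eps guards against a constant image."""
    gray = np.asarray(gray, dtype=np.float64)
    return (gray - gray.mean()) / (gray.std() + eps)

def random_augment(image, label, crop=48, rng=None):
    """Random crop plus a random multiple-of-90-degree rotation.

    The same transform is applied to the image and to its label so that
    the vessel annotation stays aligned with the pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    img = image[top:top + crop, left:left + crop]
    lab = label[top:top + crop, left:left + crop]
    k = int(rng.integers(0, 4))  # number of quarter turns
    return np.rot90(img, k).copy(), np.rot90(lab, k).copy()

# Placeholder data standing in for one fundus image and its vessel label.
rng = np.random.default_rng(0)
fundus = rng.integers(0, 256, size=(584, 565, 3))
vessels = rng.integers(0, 2, size=(584, 565))

gray = standardize(to_gray(fundus))
patch, patch_label = random_augment(gray, vessels, crop=48, rng=rng)
```

Calling `random_augment` repeatedly on the same image is one simple way to turn 20 training images into many training patches.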

Retina fundus
Mask Image
Grey Image after pre-processing

B. Architecture of proposed method
The overall model is the Res-Unet shown in the figure. It is similar to the original U-Net, but Res-Net blocks are used to recover the information lost during each down-sampling and up-sampling step, and skip connections concatenate the corresponding down-sampling and up-sampling layers. After each layer we use the ReLU activation function, and the layer before the output uses the sigmoid activation function. We provide the fundus image, its ground truth and its mask image as input, and after the 10th layer we get the segmented image as output.

Proposed Res-Unet Model

• U-Net : This network architecture is divided into two parts: a contracting path followed by an expansive path. The contracting path, on the left side of the architecture, is a convolutional network. It consists of repeated 3×3 convolution layers (unpadded), each followed by a ReLU (rectified linear unit) activation function, and a 2×2 max pooling operation with a stride of 2 for down-sampling. At each down-sampling step the number of feature channels is doubled. Similarly, each step of the expansive path, on the right side of the model, up-samples the feature map with a 2×2 convolution that halves the number of feature channels, concatenates the result with the correspondingly cropped feature map from the contracting path, and applies 3×3 convolutions followed by ReLU activation functions. The cropping is necessary because border pixels are lost in every unpadded convolution, and the concatenation brings back spatial detail lost during down-sampling. In the final layer a 1×1 convolution maps each 64-component feature vector to the given number of classes.
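
To make the effect of unpadded convolutions concrete, here is a small sketch that only tracks feature-map sizes and channel counts through a U-Net of this kind. The defaults (four pooling steps, two 3×3 convolutions per level, 64 base channels, a 572-pixel input) are taken from the original U-Net paper and are assumptions here; they are not necessarily the settings of our Res-Unet, and the function names are made up for the example.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3):
    """Return (output size, per-side crop of each skip map) for an unpadded U-Net."""
    shrink = convs_per_level * (kernel - 1)  # each unpadded conv removes kernel-1 pixels
    size = input_size
    skip_sizes = []
    for _ in range(depth):                   # contracting path
        size -= shrink
        skip_sizes.append(size)              # feature map handed to the skip connection
        if size % 2:
            raise ValueError("feature map size must be even before 2x2 pooling")
        size //= 2                           # 2x2 max pooling, stride 2
    size -= shrink                           # convolutions at the bottom of the "U"
    crops = []
    for skip in reversed(skip_sizes):        # expansive path
        size *= 2                            # 2x2 up-convolution doubles the size
        crops.append((skip - size) // 2)     # border cropped from each side of the skip map
        size -= shrink
    return size, crops

def channels_per_level(base=64, depth=4):
    """Feature channels double at every down-sampling step."""
    return [base * 2 ** level for level in range(depth + 1)]

out_size, crops = unet_output_size(572)
channels = channels_per_level()
print(out_size, crops, channels)
```

With these assumed settings a 572×572 input shrinks to a 388×388 output, which is why the skipped feature maps have to be cropped before they can be concatenated.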

Basic U-Net Model

• CNN : A Convolutional Neural Network (CNN), also known as a ConvNet, is a deep learning algorithm that takes an input image, assigns learnable weights and biases to various aspects of the image, and is able to differentiate one image from another. A CNN is able to capture the spatial and temporal dependencies in an image through the use of relevant filters. The architecture fits image data sets better because it reduces the number of parameters and reuses weights. The role of the ConvNet is to reduce the image into a form that is easier to process, without losing the features that are important for making a good prediction. The element that carries out the convolution operation in the first part of a convolutional layer is called the kernel or filter, K. The values of the resulting feature map are calculated as follows, where g is the input image and w is the kernel. The indexes of rows and columns of the result matrix are marked with m and n respectively.
G[m, n] = (g ∗ w)[m, n] = Σ_j Σ_k w[j, k] · g[m − j, n − k]
• ReLU : ReLU stands for Rectified Linear Unit. It is an activation function defined as the positive part of its argument:
g(y) = y⁺ = max(0, y)
where y is the neuron's input. It is also known as a ramp function and is analogous to half-wave rectification. A unit employing the rectifier is also called a rectified linear unit. Hahnloser et al. first introduced this activation function to a dynamical network in 2000, with strong mathematical justification and biological motivations. It was first used in 2011 to enable better training of deeper neural networks. To date, ReLU is the most popular activation function for deep learning. The ReLU function is shown in the figure.
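
The feature-map computation for an input image g and kernel w, followed by ReLU, can be sketched in NumPy like this. It is a deliberately slow, loop-based illustration with made-up function names, and it keeps only the positions where the kernel fully overlaps the image (the unpadded case). Note that it implements a true convolution, which flips the kernel; deep learning frameworks usually compute the unflipped version (cross-correlation), which makes no practical difference because the kernel is learned.

```python
import numpy as np

def conv2d_valid(g, w):
    """True 2-D convolution of image g with kernel w, unpadded ("valid") output."""
    g = np.asarray(g, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)
    kh, kw = w.shape
    out_h, out_w = g.shape[0] - kh + 1, g.shape[1] - kw + 1
    flipped = w[::-1, ::-1]  # convolution flips the kernel; cross-correlation would not
    out = np.empty((out_h, out_w))
    for m in range(out_h):          # row index of the result
        for n in range(out_w):      # column index of the result
            out[m, n] = np.sum(g[m:m + kh, n:n + kw] * flipped)
    return out

def relu(y):
    """g(y) = max(0, y), applied element-wise."""
    return np.maximum(0.0, y)

g = np.arange(9.0).reshape(3, 3)
w = np.array([[1.0, 0.0],
              [0.0, -1.0]])
feature_map = conv2d_valid(g, w)  # each entry is g[m+1, n+1] - g[m, n] = 4
activated = relu(-feature_map)    # negative responses are clipped to 0
```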

ReLU Activation Function

• Sigmoid : A sigmoid function is a mathematical function with a characteristic S-shaped curve, the sigmoid curve. A common example is the logistic function, σ(x) = 1 / (1 + e^(−x)), which maps any real input to a value between 0 and 1.
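
As a small illustration, the logistic form of the sigmoid, 1 / (1 + e^(−x)), can be written in NumPy as below. The split into positive and negative inputs is just a standard trick to avoid overflow, and the 0.5 threshold used to turn probabilities into a vessel mask is an assumption for the example, not a value from this project.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)) for array input, computed stably."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))  # safe: exp of a non-positive number
    exp_x = np.exp(x[~pos])                   # safe: exp of a negative number
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

logits = np.array([-1000.0, 0.0, 1000.0])
probs = sigmoid(logits)      # per-pixel vessel probabilities in [0, 1]
vessel_mask = probs > 0.5    # assumed threshold for a binary segmentation
```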

Sigmoid Activation Function

• Skip Connection : Both short and long skip connections are widely used in medical image segmentation. In models like U-Net, long skip connections carry features from the down-sampling path over to the up-sampling path and help recover the spatial information lost during down-sampling. In some models the up-sampled feature maps are summed with the feature maps skipped from the contracting path, while in others they are concatenated, with convolutions and non-linearities between each up-sampling step. The spatial information lost along the contracting path can thus be recovered by skipping features of equal resolution from the earlier layers to the later ones.
Short skip connections are added around non-linearities. They create shortcuts through which parameters deeper in the network can be updated, thanks to the uninterrupted gradient flow. Moreover, many results have shown that these skip connections lead to better convergence during training.
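
A long skip connection of the concatenating kind can be sketched in NumPy as follows. The function names and the feature-map shapes are hypothetical and chosen only for illustration; the center crop is needed when the encoder feature map is larger than the decoder one, as happens with unpadded convolutions.

```python
import numpy as np

def center_crop(features, target_h, target_w):
    """Crop an (H, W, C) feature map around its center to (target_h, target_w, C)."""
    h, w, _ = features.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return features[top:top + target_h, left:left + target_w, :]

def long_skip_concat(encoder_features, decoder_features):
    """Concatenate cropped encoder features onto decoder features along the channel axis."""
    h, w, _ = decoder_features.shape
    cropped = center_crop(encoder_features, h, w)
    return np.concatenate([cropped, decoder_features], axis=-1)

encoder = np.ones((64, 64, 8))    # feature map saved on the contracting path
decoder = np.zeros((56, 56, 8))   # up-sampled feature map on the expansive path
merged = long_skip_concat(encoder, decoder)
print(merged.shape)
```

The merged map has the decoder's spatial size and the channels of both inputs, so the following convolutions can use the high-resolution encoder features directly.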

• Res-Net : A residual neural network (Res-Net) is an artificial neural network that builds on constructs known from pyramidal cells. Its main ingredient is skip connections, or shortcuts, that jump over some layers. Typical Res-Net models skip two or three layers that contain ReLU non-linearities and batch normalisation. The main reason for using Res-Net is to reduce the problem of vanishing gradients by reusing the activations from a previous layer. During training, the weights adapt to mute the upstream layer and amplify the previously skipped layer. In the simplest case, only the weights for the adjacent layer's connection are adapted, with no explicit weights for the upstream layer.
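
The idea of a residual block, y = ReLU(x + F(x)), can be illustrated with a tiny NumPy sketch. To keep it short it uses two dense layers in place of convolutions and leaves out batch normalisation, so it is a simplification and not the block used in our model; the names and sizes are assumptions for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = ReLU(x + F(x)) with a two-layer residual branch F(x) = ReLU(x @ w1) @ w2."""
    fx = relu(x @ w1) @ w2   # residual branch
    return relu(x + fx)      # shortcut adds the input back before the activation

rng = np.random.default_rng(0)
x = rng.random((4, 16))      # non-negative input features

# With an all-zero residual branch the block simply passes its input through.
w_zero = np.zeros((16, 16))
identity_out = residual_block(x, w_zero, w_zero)

# With small random weights the block learns a correction on top of the input.
w1 = rng.normal(scale=0.1, size=(16, 16))
w2 = rng.normal(scale=0.1, size=(16, 16))
out = residual_block(x, w1, w2)
```

Because the shortcut carries x forward unchanged, the gradient also has a direct path backward through the addition, which is what eases the vanishing gradient problem.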

Res-Net

• Optimizer : In this model we use Adam, a stochastic optimizer based on adaptive estimates of lower-order moments and one of the most popular algorithms for first-order gradient-based optimization. The method is easy to implement, is computationally efficient and, most importantly, requires very little memory. It is well suited to problems with large data sets and many parameters. The algorithm also copes well with noisy and sparse gradients. Its hyperparameters have intuitive interpretations and typically require little tuning. The hyperparameters used for our model are: lr (learning rate) = 1e-3, β1 = 0.9, β2 = 0.999, epsilon = None.
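
For readers who want to see what these hyperparameters do, here is a minimal pure-Python sketch of the Adam update rule on a one-dimensional toy problem. It is not the Keras implementation; the function name and the toy objective f(x) = x² are made up for the example, and eps = 1e-7 is an assumption standing in for epsilon = None (where the framework falls back to its own default).

```python
import math

def adam_minimize(grad_fn, x0, steps, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """Run `steps` Adam updates on a scalar parameter and return the result."""
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def grad(x):
    return 2.0 * x  # gradient of the toy objective f(x) = x^2

after_one_step = adam_minimize(grad, x0=1.0, steps=1)
after_many_steps = adam_minimize(grad, x0=1.0, steps=5000)
```

On the first step the bias-corrected ratio m_hat / sqrt(v_hat) has magnitude close to 1, so the parameter moves by roughly the learning rate regardless of the gradient's scale.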

Experiment – We trained our proposed model on the augmented training set for 100 epochs, which took almost an hour; the best training accuracy was 0.9571. After 100 epochs, the model was tested on the test set provided with DRIVE. On predicting the segmented retinal vessels of the test fundus images, the AUC ROC score turned out to be 0.9812.
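
For reference, the area under the ROC curve can be computed from per-pixel labels and predicted probabilities with a rank-based formula, as in the NumPy sketch below. This is a generic illustration and not the evaluation script behind the score above; the function name and the optional `mask` argument (to restrict the score to pixels inside the field-of-view mask) are assumptions for the example.

```python
import numpy as np

def roc_auc(y_true, y_score, mask=None):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    y_score = np.asarray(y_score, dtype=np.float64).ravel()
    if mask is not None:                       # keep only pixels inside the mask
        keep = np.asarray(mask).astype(bool).ravel()
        y_true, y_score = y_true[keep], y_score[keep]
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one vessel and one background pixel")
    # 1-based ranks of the scores; tied scores share their average rank.
    _, inverse, counts = np.unique(y_score, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    mean_rank = last_rank - (counts - 1) / 2.0
    ranks = mean_rank[inverse]
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))

labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc(labels, scores)
auc_tied = roc_auc([0, 1], [0.5, 0.5])
```

The value can be read as the probability that a randomly chosen vessel pixel gets a higher score than a randomly chosen background pixel, with ties counted as one half.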

Segmented Retinal Vessels

