Journal of Information and Computational Science, ISSN: 1548-7741, Volume 13, 2020, www.joics.net

Detection of Diabetic Retinopathy from Fundus Images using Deep Learning: A Review

Supriya Mishra, Seema Hanchate
Department of Electronics & Communication, Usha Mittal Institute of Technology, SNDT University, Mumbai, India

Abstract

Diabetic retinopathy (DR) is a human eye disease observed in people with diabetes. It is caused by internal damage to the retinal blood vessels of the light-sensitive tissue at the back of the eye (the retina). DR can be treated effectively if it is detected early: detection in the early stages prevents blindness or loss of vision. Many physical tests are available for detection, but they are very time-consuming. One automated solution is the convolutional neural network (CNN), a deep-learning algorithm. In this paper we present a comparative study of CNN architectures for detecting DR. There are five stages of DR. To detect the disorder and monitor its changes over time, a fundus camera is used to capture fundus images, which provide colour photographs of the interior surface of the eye. The reviewed algorithms are compared on the accuracy and precision they achieve for this detection task.

Keywords: Deep Learning, CNN, Fundus images, Diabetic Retinopathy (DR), Architectures

I. INTRODUCTION

If the body cannot process insulin properly, or the pancreas does not secrete enough of it, diabetes may result; diabetes is a chronic organ disease. As time passes, diabetes affects the circulatory system, including that of the retina. DR is the medical condition in which the retina is damaged because fluid leaks from the retinal blood vessels. It is one of the most common diabetic eye diseases and a leading cause of blindness. It occurs when diabetes damages the small blood
vessels inside the retina; these vessels leak blood and fluid onto the retina, forming features such as micro-aneurysms, haemorrhages, hard exudates, cotton-wool spots, and venous loops. DR is often classified as NPDR (non-proliferative diabetic retinopathy) or PDR (proliferative diabetic retinopathy). The stages of DR are classified according to the condition of these retinal features. NPDR is the stage in which the disease can advance from mild through moderate to severe, with various levels of features but little growth of new blood vessels. PDR is the advanced stage, in which the fluids sent by the retina for nourishment trigger the growth of new blood vessels; if these vessels leak blood, severe vision loss and even blindness may result.

Currently, detecting DR is a time-consuming, manual process that requires a trained clinician to examine and evaluate digital colour fundus photographs of the retina. By the time human readers submit their reviews, often a day or two later, the delayed results cause lost follow-ups, miscommunication, and delayed treatment.

According to [1], the International Diabetes Federation (IDF) estimated the number of adults with diabetes worldwide at 366 million in 2011, expected to rise to 552 million by 2030. The number of people with type 2 diabetes is increasing in every country; 80% of people with diabetes live in low- and middle-income countries, with India ranking first, and prevalence is now increasing in rural areas as well.

There are five stages of DR, graded 0 to 4. The table below describes them:

Level   DR stage        Severity
0       No DR           Normal eye
1       Mild NPDR       Balloon-like swelling in the retina's blood vessels
2       Moderate NPDR   Blood vessels nourishing the retina swell and may even become blocked
3       Severe NPDR     An increasing number of blood vessels nourishing the eye become blocked
4       PDR             Last stage of DR: new blood vessels grow inside the retina, begin to leak and bleed, causing spotty vision and vision loss
FIG. 1: STAGES OF DR

A CNN model applies mathematical operations to the input, such as the pixel values within an image; the network is trained by presenting it with data. Deep learning, and the CNN in particular, is one of the best-performing methods in the field of DR detection, and deep learning plays an important role in the healthcare domain more broadly. For DR, applications range from binary classification (No DR / DR) to multi-level classification (No DR / Mild NPDR / Moderate NPDR / Severe NPDR / PDR). Because the datasets used in deep learning are large, a CPU (Central Processing Unit) cannot support training and testing in reasonable time, so a GPU (Graphics Processing Unit) is used instead.

FIG. 2: Severity stages of DR

Figure 2 illustrates the severity stages of DR, graded 0 to 4.

The rest of the paper is organized as follows: Section I introduces DR, Section II reviews papers on the various architectures, Section III describes the CNN architectures used for DR detection, and Section IV concludes the paper.

II. LITERATURE REVIEW

Supervised classification labels a test image dataset using training data with labelled classes. Generally, classification is performed by extracting features from the images and then identifying the categories supported by the labelled training data. Several popular methodologies have been applied to DR detection.

According to [1], the solution is to find an improved and optimized way of classifying fundus images with pre-processing techniques. A DCNN (Deep Convolutional Neural Network) is a more complex architecture; deployed with dropout-layer techniques, it yields around 94–96 percent accuracy.
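The dropout-layer technique mentioned in [1] can be sketched in a few lines of NumPy. This is a generic illustration of inverted dropout, not the specific configuration used in [1]; the function name and rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the
    survivors by 1/(1-rate) so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate     # keep each unit with prob 1-rate
    return x * mask / (1.0 - rate)

a = np.ones((4, 4))
d = dropout(a, rate=0.5)                   # entries are either 0.0 or 2.0
inference = dropout(a, rate=0.5, training=False)   # identity at test time
```

At test time dropout is disabled, so the layer behaves as the identity; the rescaling during training keeps activation magnitudes comparable between the two modes.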
The approach was tested on publicly available databases such as STARE, DRIVE, and the Kaggle fundus-image dataset.

According to [2], automatic segmentation of blood vessels in fundus images is of great importance, as eye diseases cause observable pathological modifications. A CNN method was tested on the publicly available DRIVE dataset, and the results demonstrated the high effectiveness of the deep-learning approach. A GPU was used to implement deep max-pooling convolutional neural networks for segmenting blood vessels. The method achieved an average accuracy and AUC of 0.9466 and 0.9749, respectively.

In [3], an R-CNN (Region-based Convolutional Neural Network) approach was used to diagnose DR from digital fundus images. All images were classified into two groups, with DR and without DR. A new method was introduced in which the whole image was segmented and only the regions of interest were taken for further processing. The R-CNN comprised 10 layers, was trained on 130 fundus images, and was tested on 110 images. The approach was found to be efficient in terms of speed and accuracy, obtaining approximately 93.8% accuracy.

In [4], a CNN-based approach was used to automate DR stage classification. In this work, DR is classified into five stages from colour fundus retinal images using CNNs. Three CNNs are deployed for stage classification, with images labelled into five groups according to the opinion of expert ophthalmologists. By concatenating the three networks, VGG16, AlexNet, and InceptionNet V3, an accuracy of 80.1% is obtained.

In [5], each patient in the dataset is represented by two images, one per eye. The algorithm was trained on over 70,000 labelled retinal images. It performs quick, reliable detection of anomalies in retinal images, diagnoses the stage of DR, and provides the location of the anomalies detected in the images.
Grading is done for each eye image separately. The model was evaluated on over 10,000 fundus images from 5,000 patients taken from the Kaggle DR Detection Challenge dataset, provided by the California Healthcare Foundation (CHF). The algorithm achieves an area under the ROC curve (AUROC) of 0.946, with 96.2% sensitivity (95% CI: 95.8–96.5) and 66.6% specificity (95% CI: 65.7–67.5).

According to [6], AlexNet, VggNet, GoogleNet, and ResNet were analysed to determine how well these models perform on DR image classification. CNNs are applied to the major challenges in DR detection: classification, segmentation, and detection. The publicly available Kaggle platform was employed for training the models. The best classification accuracy was 95.68%, and the results demonstrated the effectiveness of CNNs and transfer learning for DR image classification.

In [7], a CNN was used to diagnose DR from digital fundus images and to classify its severity accurately. A network with a CNN architecture and data augmentation was developed that can identify the intricate features involved in the classification task on the retina and consequently provide a diagnosis automatically, without user input. The network was trained using a high-end GPU on the publicly available Kaggle dataset and gave impressive results for a classification task: the CNN achieved 95% sensitivity and 75% accuracy on 5,000 test images.

In [8], a CNN approach was used to automate DR screening with colour fundus retinal photographs as input. The network combines a CNN with de-noising to identify features such as micro-aneurysms and haemorrhages on the retina. It was trained using a high-end GPU on the publicly available Kaggle dataset. On a dataset of over 30,000 images, the proposed model achieved around 95% accuracy for two-class classification and around 85% accuracy for five-class classification on roughly 3,000 validation images.
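Metrics such as the sensitivity and specificity reported in [5], or the two-class accuracy in [8], are derived from the confusion counts of a binary No DR / DR decision. A minimal sketch with toy labels (not the papers' data):

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity (recall on positives), specificity (recall on
    negatives), and accuracy from binary labels: 1 = DR, 0 = no DR."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

# Toy example: 4 eyes with DR, 4 healthy eyes.
m = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                   [1, 1, 1, 0, 0, 0, 1, 1])
# sensitivity 0.75, specificity 0.5, accuracy 0.625
```

The asymmetry in [5] (high sensitivity, low specificity) corresponds to a decision threshold tuned to miss few DR cases at the cost of more false alarms.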
Sehrish Qummar et al. [9] used the publicly available Kaggle dataset of retinal images to train an ensemble of five deep CNN models (ResNet50, Inception-v3, Xception, DenseNet-121, DenseNet-169) to encode rich features and improve the classification of the different stages of DR. The experimental results showed that, unlike earlier methods, the proposed model detected all stages of DR, and it performed better than state-of-the-art methods on the same Kaggle dataset.

In [10], an automatic DR analysis algorithm based on a two-stage DCNN is proposed. Compared to existing DCNN-based DR detection methods, it has the following advantages: (1) it can not only point out the lesions in colour fundus images but also give the severity grade of DR; and (2) by introducing an imbalanced weighting scheme, more attention is paid to lesion patches during DR grading, which significantly improves grading performance under the same implementation setup. In this study, 12,206 lesion patches were labelled and the DR grades of 23,595 fundus images from the Kaggle competition dataset were re-annotated under the guidance of clinical ophthalmologists. The experimental results showed that the lesion-detection network achieved performance comparable to trained human observers, and the imbalanced weighting scheme was also shown to significantly enhance the DCNN-based DR grading algorithm.

In [11], an application system is built that takes the patient's details along with a fundus image of the eye as input. A DCNN model was trained on a large dataset of around 35,000 images to automatically diagnose and classify high-resolution fundus images of the retina into five stages based on severity. The trained DCNN extracts features from the fundus images, and with the help of activation functions such as ReLU and softmax an output is obtained.
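The effect of an imbalanced weighting scheme like the one in [10] can be illustrated generically with a per-class weighted cross-entropy. The weights and probabilities below are hypothetical, chosen only to show the mechanism, and are not the actual scheme of [10]:

```python
import math

def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy -w_c * log p_c with per-class weights that
    up-weight under-represented classes (e.g. rare lesion grades)."""
    return -class_weights[label] * math.log(probs[label])

probs = [0.7, 0.1, 0.1, 0.05, 0.05]   # softmax output over 5 DR grades
weights = [0.5, 2.0, 1.0, 1.0, 1.0]   # hypothetical: up-weight grade 1

loss_majority = weighted_cross_entropy(probs, 0, weights)  # common class
loss_minority = weighted_cross_entropy(probs, 1, weights)  # rare class
```

The up-weighted class incurs a larger loss for the same predicted probability, so its gradients dominate updates and the network pays "more attention" to it during training.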
The output from the CNN model, together with the patient details, forms a standardized report [11].

In [12], the classification of severe cases of pathological indications in the eye achieved over 90% accuracy; mild cases, however, remain challenging to detect because of the CNN's inability to identify the subtle features discriminative of the disease. The data used (annotated fundus photographs) were obtained from the publicly available Messidor and Kaggle sources. The experiments were conducted with 13 CNN architectures pre-trained on the large-scale ImageNet database using transfer learning, and the results were measured with the standard accuracy metric on the test dataset. This comprehensive evaluation of numerous CNN architectures was conducted to facilitate early DR detection. After extensive experimentation, the maximum accuracy of 86% on the No DR / Mild DR classification task was obtained for a fine-tuned ResNet50 model (layers from 100 onwards unfrozen and re-trained) with the RMSProp optimizer, trained on the combined Messidor + Kaggle (augmented) datasets. Several performance-improvement techniques were also assessed to address the CNN's limitations in identifying subtle eye lesions. The model was exposed to various levels of image quality (low/high resolution, under-/over-exposure, out-of-focus, etc.) to prove its robustness and ability to adapt to real-world conditions.

III. ARCHITECTURES

A. CNN (Convolutional Neural Network)

The CNN has proven to perform well in applications such as image processing, pattern recognition, and video recognition. A CNN takes an image as input and classifies it into one of several categories.

Fig.: Block diagram of CNN

Convolution layer: Convolution is a specialized kind of linear operation. Convolutional networks are simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers.
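A minimal single-channel convolution makes the operation concrete. This is a simplified sketch: real convolutional layers handle multiple channels, a batch dimension, stride, and padding:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as implemented
    in most deep-learning frameworks) of a single-channel image."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 patch
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging filter
fmap = conv2d(image, kernel)
# A 6x6 input convolved with a 3x3 kernel gives a 4x4 feature map.
```

Each output element is a weighted sum over one receptive-field window, which is exactly the local connectivity described in the text below.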
The name "convolutional neural network" indicates that the network employs the mathematical operation called convolution. When programming a CNN, the input is a tensor of shape (number of images) x (image width) x (image height) x (image depth). After passing through a convolutional layer, the image is abstracted to a feature map of shape (number of images) x (feature map width) x (feature map height) x (feature map channels). A convolutional layer within a neural network has the following attributes:

- convolutional kernels defined by a width and a height (hyper-parameters);
- a number of input channels and output channels (hyper-parameters);
- a filter depth (input channels) that must equal the number of channels (depth) of the input feature map.

Convolutional layers convolve the input and pass the result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus. Each convolutional neuron processes data only for its receptive field.

Receptive field: In a convolutional layer, each neuron receives input from only a restricted subarea of the previous layer, typically a square (e.g., 5 by 5). This input area is called the neuron's receptive field; in a convolutional layer it is smaller than the entire previous layer.

ReLU layer: ReLU is the abbreviation of rectified linear unit, which applies the non-saturating activation function f(x) = max(0, x). It effectively removes negative values from an activation map by setting them to zero. Other nonlinear functions, such as tanh or sigmoid, can also be used instead of ReLU, but most practitioners use ReLU since it performs better than the other two.

Pooling layer: Pooling layers reduce the number of parameters when the images are large.
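The attributes listed above fix a convolutional layer's parameter count: each output channel owns one kernel of size width x height x input-channels, plus an optional bias. A small helper makes this concrete (illustrative only; the layer sizes below are examples, not values from the reviewed papers):

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of trainable parameters in one convolutional layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# A VGG-style first layer: 3x3 kernels, RGB input, 64 output channels.
p = conv_params(3, 3, in_channels=3, out_channels=64)
# 3*3*3*64 + 64 = 1792 parameters
```

Summing this quantity over all layers is how totals such as 60M (AlexNet) or 138M (VGG) parameters, discussed later in this section, are obtained.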
Spatial pooling, also called subsampling or downsampling, reduces the dimensionality of each feature map while retaining the important information. Spatial pooling can be of different types:

- Max pooling takes the largest element in each window of the rectified feature map.
- Average pooling takes the average of the elements in each window.
- Sum pooling takes the sum of all elements in each window.

Flattening layer: The next step is to flatten the output. Between the convolutional layers and the fully connected layer there is a 'Flatten' layer: flattening transforms the entire pooled feature-map matrix into a single column, which is then fed to the neural network for processing.

Fully connected layer: This step is made up of the input layer, the fully connected layer, and the output layer, where the predicted classes are obtained. The objective of a fully connected layer is to take the results of the convolution/pooling process and use them to classify the image into a label (in a simple classification example). The fully connected part of the CNN goes through its own backpropagation process to determine the most accurate weights; each neuron receives weights that prioritize the most appropriate label. The information is passed through the network, the prediction error is calculated, and backpropagation is run through the system to improve the prediction. It is important that the outputs are brought down to numbers between zero and one that represent the probability of each class; this is the role of the softmax activation function.

The summary of the CNN algorithm is as follows:

- Provide the input image to the convolution layer.
- Choose parameters and apply filters with strides and, if required, padding; perform convolution on the image and apply the ReLU activation to the matrix.
- Perform pooling to reduce the dimensionality.
- Add as many convolutional layers as needed.
- Flatten the output and feed it into a fully connected layer (FC layer).
- Output the class using an activation function (logistic regression with cost functions) and classify the image.
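The steps above can be strung together into a toy forward pass. This is a sketch of the data flow with random weights, not a trained model; all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_valid(img, k):
    """Valid 2-D convolution of a single-channel image."""
    kh, kw = k.shape
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                      for j in range(img.shape[1] - kw + 1)]
                     for i in range(img.shape[0] - kh + 1)])

def softmax(z):
    e = np.exp(z - z.max())                 # shift for numerical stability
    return e / e.sum()

img = rng.random((8, 8))                                   # grey-level patch
fmap = np.maximum(conv_valid(img, rng.random((3, 3))), 0)  # conv + ReLU -> 6x6
pooled = fmap.reshape(3, 2, 3, 2).max(axis=(1, 3))         # 2x2 max pool -> 3x3
flat = pooled.reshape(-1)                                  # flatten -> length 9
logits = flat @ rng.random((9, 5))                         # FC layer, 5 DR grades
probs = softmax(logits)                                    # class probabilities
```

The softmax at the end is what brings the outputs down to numbers between zero and one that sum to 1, as described above.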
B. VGG-E

Three VGG-E models were proposed, VGG-11, VGG-16, and VGG-19, with 11, 16, and 19 layers respectively. The Visual Geometry Group (VGG) was the runner-up of the 2014 ILSVRC. The main contribution of this work is to show that the depth of a network is a critical component in achieving better recognition or classification accuracy with CNNs. The VGG architecture consists of stacks of convolutional layers using the ReLU activation function, each stack followed by a max-pooling layer, and a number of fully connected layers also employing ReLU; the final layer of the model is a softmax layer for classification. In VGG-E the convolution filter size is reduced to 3x3 with a stride of 1. All versions of the VGG-E models end identically with three fully connected layers, but the number of convolution layers varies: VGG-11 contains 8 convolution layers, VGG-16 has 13, and VGG-19 has 16. VGG-19, the most computationally expensive model, contains 138M weights and requires 15.5G MACs [13], [14].

C. VggNet

The Visual Geometry Group also produced VGG-16, which has 13 convolutional and 3 fully connected layers, carrying over the ReLU tradition from AlexNet. This network stacks more layers onto AlexNet and uses smaller filters (2x2 and 3x3). It consists of 138M parameters and takes up about 500MB of storage. The group also designed a deeper variant, VGG-19. By this point it is apparent that CNNs were getting deeper and deeper: the most straightforward way of improving the performance of deep neural networks is to increase their size.

D. AlexNet

AlexNet was
the first architecture to implement rectified linear units (ReLUs) as activation functions. With 60M parameters, AlexNet has 8 layers: 5 convolutional and 3 fully connected. AlexNet essentially stacked a few more layers onto LeNet-5. At the time of publication, the authors noted that their architecture was "one of the largest convolutional neural networks to date on the subsets of ImageNet."

E. GoogleNet

GoogLeNet used far fewer network parameters than its predecessors AlexNet and VGG: 7M, compared with 60M for AlexNet and 138M for VGG-19. The computations for GoogLeNet were also much lower, at 1.53G MACs. GoogLeNet, the winner of ILSVRC 2014, was proposed by Christian Szegedy of Google with the goal of reducing computational complexity compared to the conventional CNN [15].

F. LeNet

As computational hardware improved in capability, CNNs started becoming popular as an efficient learning approach in the computer-vision and machine-learning communities. Although LeNet was proposed in the 1990s, limited computation capability and memory capacity made the algorithm difficult to implement until about 2010. LeCun, however, proposed CNNs with the back-propagation algorithm and experimented on a handwritten-digit dataset to achieve state-of-the-art accuracy. His architecture is documented as LeNet-5; its total numbers of weights and multiply-and-accumulates (MACs) are 431k and 2.3M respectively [16].

G. ResNet

ResNet has been developed with many different depths: 34, 50, 101, 152, and even 1202 layers. The popular ResNet50 contains 49 convolution layers and a fully connected layer at the end of the network; the total numbers of weights and MACs for the whole network are 25.5M and 3.9G respectively. The winner of ILSVRC 2015 was this residual architecture, ResNet, developed by Kaiming He with the intent of designing ultra-deep networks that do not suffer from the vanishing-gradient problem that afflicted their predecessors.
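The skip connection that lets ResNet avoid the vanishing-gradient problem can be sketched as y = ReLU(F(x) + x): the block learns only a residual F(x) on top of the identity. A NumPy illustration (toy sizes, random input; not the actual ResNet block, which uses convolutions and batch normalization):

```python
import numpy as np

def residual_block(x, w1, w2):
    """Identity-shortcut block: the branch computes a residual F(x)
    and the skip connection adds the input back before the final ReLU."""
    relu = lambda z: np.maximum(z, 0.0)
    fx = relu(x @ w1) @ w2      # two-layer residual branch F(x)
    return relu(fx + x)         # skip connection: F(x) + x

rng = np.random.default_rng(1)
x = rng.standard_normal(8)

# With zero weights the residual branch vanishes, so the block reduces
# to ReLU(x) -- the identity on the positive part of the input.
out = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```

Because the identity mapping is trivially representable (zero residual), stacking many such blocks does not degrade training the way plain deep stacks do, and gradients flow back through the additive shortcut.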
H. DenseNet

DenseNet, developed by Gao Huang et al. in 2017, consists of densely connected CNN layers: within a dense block, the outputs of each layer are connected to all successor layers. It is thus formed with dense connectivity between the layers, rewarding it the name "DenseNet". This idea is efficient for feature reuse, which dramatically reduces network parameters. DenseNet consists of several dense blocks, with transition blocks placed between adjacent dense blocks.

I. Inception

Inception-v1: The 22-layer architecture with 5M parameters is named Inception-v1. Here the Network-in-Network approach is heavily used, by means of "Inception modules"; the design of the Inception-module architecture is a product of research on approximating sparse structures. The motivation for Inception-v2 and Inception-v3 was to avoid representational bottlenecks (drastically reducing the input dimensions of the next layer) and to make computations more efficient by using factorization methods.

Inception-v3: Inception-v3 is a successor to Inception-v1, with 24M parameters. Inception-v2 was an earlier prototype of v3, very similar to it but not commonly used; after Inception-v2, many further experiments were run and successful tweaks recorded, and Inception-v3 is the network that incorporates these tweaks.

IV. CONCLUSION

Diabetic retinopathy is a complication of diabetes that damages the retina, causing vision loss. Diabetes harms the retinal blood vessels and can result in blindness. DR is preventable, and to avoid vision loss, early detection is important. Manual DR detection takes a long time; to reduce it, the CNN algorithm is used with different
architectures, which also increase the speed of the algorithm. The general CNN architecture gives only average accuracy, which is not up to the mark; to increase accuracy, specialized architectures are used. An accuracy of 93.8% is obtained using R-CNN; the combined architectures VGG16, AlexNet, and InceptionNet V3 reach an accuracy of 80.1%; and architectures such as AlexNet, VggNet, GoogleNet, and ResNet reach 95.68%. Many activation functions, such as ReLU and softmax, are used. To obtain better accuracy, the numbers of layers (convolution, pooling, activation) are increased, which is known as the DCNN.

REFERENCES

[1] T. Chandrakumar and R. Kathirvel, Classifying Diabetic Retinopathy using Deep Learning Architecture, International Journal of Engineering Research & Technology (IJERT), Issue 06, June 2016.
[2] M. Melinscak, P. Prentasic and S. Loncaric, Retinal Vessel Segmentation using Deep Neural Networks, VISAPP (1), (2015): 577-582.
[3] Athira T. R., Athira Sivadas, Aleena George, Amala Paul and Neethu Radha Gopan, Automatic Detection of Diabetic Retinopathy using R-CNN, International Research Journal of Engineering and Technology (IRJET), Volume 06, Issue 05, May 2019.
[4] Nikhil M. N. and Angel Rose A., Diabetic Retinopathy Stage Classification using CNN, International Research Journal of Engineering and Technology (IRJET), Volume 06, Issue 05, May 2019.
[5] Colas, E., Besse, A., Orgogozo, A., Schmauch, B., Meric, N. and Besse, E., 2016, Deep learning approach for diabetic retinopathy screening, Acta Ophthalmologica, 94.
[6] Wan, S., Liang, Y. and Zhang, Y., Deep convolutional neural networks for diabetic retinopathy detection by image classification, Computers & Electrical Engineering, 72, pp. 274-282, 2018.
[7] Pratt, H., Coenen, F., Broadbent, D. M., Harding, S. P. and Zheng, Y., Convolutional neural networks for diabetic retinopathy, Procedia Computer Science, 90, 200-205, DOI:
10.1016/j.procs.2016.07.014, 2016.
[8] Ratul Ghosh, Kuntal Ghosh and Sanjit Maitra, Automatic Detection and Classification of Diabetic Retinopathy Stages using CNN, 2017 4th International Conference on Signal Processing and Integrated Networks (SPIN).
[9] Sehrish Qummar, Fiaz Gul Khan, Sajid Shah, Ahmad Khan, Shahaboddin Shamshirband, Zia Ur Rehman, Iftikhar Ahmed Khan and Waqas Jadoon, A Deep Learning Ensemble Approach for Diabetic Retinopathy Detection, received August 27, 2019, accepted October 4, 2019, published October 15, 2019.
[10] Y. Yang, T. Li, W. Li, H. Wu, W. Fan and W. Zhang, Lesion detection and grading of diabetic retinopathy via two-stages deep convolutional neural networks, in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., Cham, Switzerland: Springer, 2017, pp. 533-540.
[11] Pranay Liya, Vaibhavi Shirodkar, Aashish Kapadia and Prashant Sawant, Detection of Diabetic Retinopathy using Convolutional Neural Network, International Research Journal of Engineering and Technology (IRJET), Volume 06, Issue 04, April 2019.
[12] Rubina Sarki, Sandra Michalska, Khandakar Ahmed, Hua Wang and Yunchun Zhang, Convolution Neural Network for mild diabetic retinopathy detection: an experimental study, doi: https://doi.org/10.1101/763136.
[13] Simonyan, Karen, and Andrew Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556 (2014).
[14] Md Zahangir Alom, Improved Deep Convolutional Neural Networks (DCNN) Approaches for Computer Vision and Bio-Medical Imaging, University of Dayton, December 2018.
[15] Szegedy, Christian, et al., Going deeper with convolutions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[16] Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, Gradient-based learning applied to document recognition, in Proceedings of the
IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[17] E. M. Shahin, T. E. Taha, W. Al-Nuaimy, S. El Rabaie, O. F. Zahran and F. E. A. El-Samie, Automated detection of diabetic retinopathy in blurred digital fundus images, 2012 8th International Computer Engineering Conference (ICENCO), Cairo, 2012, pp. 20-25.
[18] V. Zeljković, M. Bojic, C. Tameze and V. Valev, Classification algorithm of retina images of diabetic patients based on exudates detection, 2012 International Conference on High Performance Computing & Simulation (HPCS), Madrid, 2012, pp. 167-173.
