Technology is developing rapidly and has become an indispensable part of daily life, and applying deep learning to fruit recognition brings clear benefits. From raising production efficiency in agriculture to providing new products and services in the food industry, the potential of this topic is very large. Research and development in this area not only meets practical needs but also poses new challenges in developing suitable deep learning algorithms and models. Work on this topic is both an opportunity to explore new applications of the technology and an important step in contributing to the development of artificial intelligence and machine learning. The topic "Fruit recognition system using deep learning" is a practical problem that reflects an interest in, and a willingness to explore, new technologies, as well as an awareness of their application potential and of the important role technology plays in modern society.
Classification of Fruits using Deep Learning
Keywords: Adam, Fruit classification, Gaussian filter, MobileNetV2, TSR transformation
Posted Date: April 1st, 2022
DOI:https://doi.org/10.21203/rs.3.rs-1495878/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License
In this work, various combinations of fruits are classified into their proper variety, and the quality of each fruit is then checked to determine whether it is defective or non-defective. For the first part, a Convolutional Neural Network, AlexNet and MobileNetV2 are employed. MobileNetV2 achieved 100% accuracy for fruit type classification. In the second part, fruits of the same kind are fed into a classifier for quality checking. The same classifiers are also used for defect classification. For defect classification, MobileNetV2 gives 99.89% accuracy for orange and 100% for apple.
1 Introduction
Throughout the world, people rely on fruits for survival. Fruits are used in a range of businesses, including the food industry, which aims to deliver a high-quality product. The quality of fruits is vitally important to people's health. The most difficult part of the process is manually identifying fruit flaws. Researchers are using automated image processing techniques to overcome this obstacle. Consumers benefit from this type of research, and such inspections also help determine market pricing. Computer vision and image processing are burgeoning academic fields, and they are a crucial component of fruit analysis. There are numerous methods and techniques for determining image quality. Three types of CNN models are used in this research to determine the best accuracy for fruit classification and fruit defect classification.
Juan M Pounce et al. [1] proposed marker-controlled watershed segmentation for olive fruit variety classification. They investigated six models: AlexNet, Inception-ResNetV2, Inception V1, Inception V3, ResNet-50 and ResNet-101. Among the six models, Inception-ResNetV2 gave an accuracy of 95.91%. Hamdi Altaheri et al. [2] organized their work into three classification models for date fruit type, maturity stage and harvesting decision. Multiclass classifiers were used for type and maturity classification, based on the AlexNet and VGGNet models. Of these, VGGNet achieved 99% accuracy. M.Shamim Hossain et al. [3] proposed two deep learning architectures for classifying fruit varieties: a light model of six CNN layers and VGG16. They categorized the fruit species based on their shape, colour and texture. Mohammed Faisal et al. [4] proposed three subsystems to estimate date type, maturity and weight. The date type and maturity subsystems made use of four models: ResNet, VGG-19, Inception V3 and NasNet. SVM regression was used for date weight estimation. ResNet achieved 99% accuracy. Aifeng Ren et al. [5] gathered data from apple and mango fruit slices in order to analyse and classify the moisture content over time. Their feature extraction covers three domains: the frequency domain, the time domain and the time-frequency domain. The wavelet is a time-frequency domain characteristic that was used to analyse short-duration pulses with quick and unpredictable variations. SVM, KNN and Decision Tree are the machine learning models used for the classification. Among the three, the Decision Tree gave an accuracy of 95.45% for apple and for
framework (CAE-ADN). They experimented with three different models: ResNet-50, DenseNet-169 and ADN. The CAE-ADN model has an accuracy of 95.86%. Jose Naranjo-Torres et al. [8] proposed recognizing fruit using several models: AlexNet, VGG16, MobileNet, Inception V3 and ResNet-50. They trained their models over 10 epochs using the stochastic gradient descent (SGD) algorithm. AlexNet gave an accuracy of 95.86%. Himer Avila-George et al. [9] classified gooseberry fruit by ripeness level using three colour spaces: RGB, HSV and L*a*b*. These were used to classify the fruit visually. ANN, Decision Tree, SVM and KNN models were trained; among these, the SVM classifier gave an accuracy of 92.47%. Shital A et al. [10] proposed autonomous vision-based technologies using OpenCV. This technique was used for fruit sorting, grading and flaw identification. Shushree et al. [11] applied the SFTA algorithm to extract features from the dataset. A DNN approach was used to train on the dataset. Ananthi N et al. [12] proposed examining diseases found in fruits. Preprocessing, feature extraction and image de-noising techniques were investigated for analyzing fruit infections. The median filter was employed in the image de-noising process. Blob detection was used to improve the image and convert it to binary format. The common agglomeration problem was solved by K-Means segmentation. A DNN classification technique was used to classify fruits based on how tainted they were. Meshwa Patel et al. [13] identified fruit quality in oranges using image processing with a Support Vector Machine classifier. The first stage in image processing was image preparation. The Gray Level Co-occurrence Matrix feature was used in feature extraction to remove extraneous components and simplify the process. In the image de-noising process, the median filter was applied. Morphological image processing was used to enhance the images. Pushpavalli M [14] advocated a computer vision technique to grade mango fruits. Image preprocessing, segmentation, feature extraction, selection and classification were applied for better results. To minimize noise, a median filter was utilized. After converting the image to binary representation, segmentation was performed. The Otsu method was used to transform the data. SoummoSupriya et al. [15] developed a machine vision-based expert system for fruit recognition in order to retrieve the cultural past. Image segmentation was carried out using the K-Means clustering algorithm. Statistical and Grey Level Co-occurrence Matrix features were extracted from the images.
Yogesh et al. [16] introduced three primary steps: model building, model testing and model configuration. Apple and mangosteen fruits were analyzed using the CNN algorithm. For pear categorization, the ANN classifier was used. Strawberries were classified using SVM, with a high level of precision. Deepti.C et al. [17] set out to discover the most efficient and cost-effective method of detecting artificially ripened fruits. Functional requirements, non-functional requirements, assumptions and dependencies, and restrictions were the four kinds of system requirements. To determine whether the fruit is fake or real, a CNN algorithm with appropriate distance and orientation was applied. RashminPriya et al. [18] proposed identifying the disease name in orange fruits. Shape, color and texture are taken into account for feature extraction. Two types of filters are employed in segmentation: the median filter and the box filter. Segmentation was done using the Otsu method. For data mining and cluster analysis, the K-Means method is utilized. ShaliniGnanavel et al. [19] set out to determine fruit quality and provide organic consumption levels. A conductivity sensor can be used for a variety of purposes, including quality monitoring, detecting artificially ripened fruit, and detecting pesticide residue levels in fruits.
Nareshkumar.C et al. [20] proposed an automatic vision-based technology for fruit problem detection to replace the manual approach. They covered three procedures: image acquisition, image noise removal and image segmentation. To remove background noise, the Gaussian filter was applied. The solution proposed by Neha et al. [21] used computer vision to detect bruising and flaws in tomatoes in a non-destructive manner. The computer vision system has a total of 12 layers: one for input, another for output, and ten hidden layers. Convolutional neural networks are utilized to extract features from the image. A temperature sensor, a humidity sensor and a light-dependent resistor make up the IoT-based food storage unit monitoring system. The Wi-Fi module connects to the internet and updates the room's status. The CNN model extracts and categorizes features and patterns in images.
Ramya.C et al. [22] proposed an image processing-based technique for identifying apple fruit quality loss. There are three steps. Data gathering, CNN development and data augmentation were all part of the data collection process. The pooling layer shrinks the image in the second step. The 7*7 matrix was converted into a 4*4 matrix by stacking the layers. The fully connected layer is useful for training the network, predicting outcomes and categorizing inputs. The fruit disease was identified in the final stage. IshdeepSingla et al. [23] compared a number of approaches. Image acquisition, image preprocessing, image segmentation, feature extraction and classification were the five steps required to train on apple images. Filters, contrast, stretching, grouping, histogram equalization, grey scale conversion and RGB-to-binary conversion are used to improve the image. In the segmentation phase, morphological filtration, thresholding and clustering were utilized. To help categorize the fruits, features such as color, shape, texture and size were extracted. Sumati.M et al. [24] presented an automatic vision-based system for sorting and grading fruits by color and size. Color-based fuzzy reasoning is used to sort the mango fruit. A.K.Dubey et al. [25] defined segmentation methodologies for detecting exterior faults in pome fruits. The notion of marker-controlled watershed segmentation was founded on flooding. The approach of Manjesh.R et al. [26] was capable of recognizing fruit based on shape, color and texture. The texture was examined using GLCM, which considers pairings of pixels with certain values. The color histogram converts a color image to an HSV image while keeping the hue and saturation. For object detection, the HOG feature was used in vision and image processing. Vibhute.A.S et al. [27] proposed a technique for evaluating fruits based on their quality. Color detection was evaluated using the fruit's RGB values. The RGB image was transformed into HSV during color detection. The major and minor axis lengths were calculated using the Euclidean distance method for size detection. The external parameters were used to grade and sort the objects. Color and size were detected via threshold detection. Kumar Mondal et al. [28] used a cutting-edge detection framework in their work. A CNN-based fruit recognition system was demonstrated on visual data captured by a smartphone. During feature extraction, the network used a variety of convolution and pooling methods to detect the features. The fully connected layer was utilized as a classifier. DeepikaBairwa et al. [29] set out to classify fruits based on shape, color and texture using image processing techniques. Pre-processing eliminates noise and corrects sorted or graded data. GLCM was used in feature extraction. Malini V.L et al. [30] created their data by filming the fruits while they were being spun by a motor and then eliminating frames. The technique proposed by Misha et al. [31] was completely automated and capable of grading a large number of fruits. It was used to predict the maturity and quality of purchased fruits. During image acquisition and pre-processing, a Gaussian filter was employed to reduce the noise. Fuzzy segmentation reduces the appearance of the image. To improve the image quality, the intensity value is changed. Feature extraction and color features were considered by this system. The sparse representative classifier sparsely represented the fruit image using a part of the training data. The classifier can assign a class label to test samples directly based on training data.
2 Proposed Work
Farmers, purchasers and shopkeepers can utilize the deep learning techniques used in this work to identify fruit type and quality. The purpose of this work is to eliminate health concerns associated with tainted fruits.
2.1 Fruit type classification
To expand the size of the datasets, the input images are augmented. The images are then preprocessed before being categorized using classifier models such as DCNN, AlexNet and MobileNetV2. Figure 1 shows the block diagram for fruit type classification.
The datasets for fruit type classification are sourced from Google Images. Apple, orange and banana are taken as the classes. There are 668 images in total for this framework.
2.1.1 Preprocessing
The original input image is converted to a grayscale image, as shown in Fig. 2. Next, the images undergo Translation, Scaling and Rotation (TSR transformation), as shown in Fig. 3.
Translation is the movement of an object in a straight line from one point to another, calculated using equations (1) and (2). The object is moved from one coordinate location to another in this step.
$$x_1 = x + T_x \qquad (1)$$
$$y_1 = y + T_y \qquad (2)$$
Assume P is a point with coordinates $(x, y)$. To translate the point from one coordinate position $(x, y)$ to another $(x_1, y_1)$, we add the translation distances $T_x$ and $T_y$ to the original coordinates. The translated point is $(x_1, y_1)$.
Scaling adjusts the size of objects. The changes are made with scaling factors, using equations (3) and (4).
$$x_1 = x \cdot s_x \qquad (3)$$
$$y_1 = y \cdot s_y \qquad (4)$$
Here $s_x$ and $s_y$ are the scaling factors in the x-direction and the y-direction. If the starting point is $(x, y)$, the coordinates after scaling are $(x_1, y_1)$.
Rotation is the process of changing an object's angle. The rotation can be done in either a clockwise or anticlockwise direction. For rotation, we must give the angle of rotation and the rotation point. A pivot point is another name for the rotation point. It is the point about which the object is turned, and the amount of turning is denoted as an angle.
Figure 3(a) shows the translated image, Figure 3(b) shows the scaled image and Figure 3(c) shows the rotated image.
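The three TSR operations can be sketched on a single point as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, translation and scaling follow equations (1) to (4), and the rotation uses the standard 2D rotation formula about a pivot point, which is an assumption because the text gives no rotation equation.

```python
import math

def translate(point, tx, ty):
    """Shift a point by (tx, ty): x1 = x + Tx, y1 = y + Ty."""
    x, y = point
    return (x + tx, y + ty)

def scale(point, sx, sy):
    """Scale a point about the origin: x1 = x * sx, y1 = y * sy."""
    x, y = point
    return (x * sx, y * sy)

def rotate(point, angle_deg, pivot=(0.0, 0.0)):
    """Rotate a point anticlockwise by angle_deg about a pivot point.

    Assumption: the standard 2D rotation formula; the paper gives no equation.
    """
    x, y = point
    px, py = pivot
    theta = math.radians(angle_deg)
    dx, dy = x - px, y - py
    return (px + dx * math.cos(theta) - dy * math.sin(theta),
            py + dx * math.sin(theta) + dy * math.cos(theta))

p = (2.0, 3.0)
moved = translate(p, 5.0, -1.0)    # (7.0, 2.0)
resized = scale(p, 2.0, 0.5)       # (4.0, 1.5)
turned = rotate((1.0, 0.0), 90.0)  # approximately (0.0, 1.0)
```

The rotation is anticlockwise in a coordinate system whose y axis points up. In image coordinates, where y usually points down, the same formula appears as a clockwise turn. Applying these functions to every pixel coordinate gives the transformed image.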
The Gaussian filter was applied to minimize noise, as shown in Figure 4. A Gaussian blur of an image removes outlier pixels or high-frequency components. After the preprocessing stages, normalization is carried out, as shown in Figure 5. It adjusts the pixel range and intensity values.
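The effect of Gaussian filtering and normalization can be illustrated with a small pure-Python sketch. The kernel size, sigma, border handling and the [0, 1] target range are all assumptions chosen for illustration; the paper does not state the values it used.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Build a size x size Gaussian kernel whose weights sum to 1."""
    half = size // 2
    raw = [[math.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
            for j in range(-half, half + 1)]
           for i in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def gaussian_blur(image, kernel):
    """Blur a 2D grayscale image (list of rows); border pixels are clamped."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i, krow in enumerate(kernel):
                for j, weight in enumerate(krow):
                    rr = min(max(r + i - half, 0), h - 1)
                    cc = min(max(c + j - half, 0), w - 1)
                    acc += weight * image[rr][cc]
            out[r][c] = acc
    return out

def normalize(image, max_value=255.0):
    """Rescale pixel values from [0, max_value] to [0, 1]."""
    return [[v / max_value for v in row] for row in image]

noisy = [[10, 10, 10],
         [10, 250, 10],   # a single bright outlier pixel
         [10, 10, 10]]
kernel = gaussian_kernel(3, 1.0)
blurred = gaussian_blur(noisy, kernel)
scaled = normalize(blurred)
```

The outlier in the centre is pulled towards its neighbours after blurring, which is the noise-reduction effect described above, and the normalized values all lie between 0 and 1.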
2.2 Classification
After preprocessing, the images are fed into classifiers. In this work, a Deep CNN (DCNN), AlexNet and MobileNetV2 are employed as classifiers.
2.2.1 Deep convolutional neural network
A CNN is one of the most popular deep learning models. Using deep convolutional networks, it learns local and spatial features and patterns directly from raw data such as pictures, video, text and sound. A DCNN learns features from data automatically, removing the need to extract them manually. Through a sequence of successive convolutional layers, it can create complex features by combining simple ones. The early layers learn to recognize low-level elements such as edges and curves, while the deeper layers learn to recognize complicated high-level features such as complete objects in a picture.
Figure 6 displays the DCNN architecture. The Conv2D function accepts four arguments: the first is the number of filters, which is 32; the second is the shape of each filter, which is 3x3; the third is the input shape, which is 224; and the fourth is the type of each image (RGB). The activation function 'ReLU' stands for Rectified Linear Unit. If the function receives a negative value, it returns 0, but if it receives a positive value, it returns that value. It helps to keep the compute required to run the neural network from growing exponentially. The formula for the ReLU activation function is y = max(0, x). To avoid overfitting, three convolution layers followed by max-pooling layers are used, and a dropout layer is inserted after the max-pooling operation. The model is compiled with Sparse Categorical Crossentropy as the loss function. For a smoother curve, a low learning rate of 0.000001 is used. The model is trained, the images are classified and the accuracy rate is then calculated.
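The two formulas mentioned in this subsection, the ReLU activation and the sparse categorical cross-entropy loss, can be sketched in plain Python. This is an illustrative sketch rather than the authors' code; the function names and the example probabilities are hypothetical.

```python
import math

def relu(x):
    """Rectified Linear Unit: y = max(0, x)."""
    return max(0.0, x)

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: -log of the probability given to the true class.

    `probs` is a list of class probabilities; `label` is an integer class index.
    """
    eps = 1e-12  # guards against log(0)
    return -math.log(max(probs[label], eps))

activations = [relu(v) for v in [-2.0, 0.0, 3.5]]   # [0.0, 0.0, 3.5]

# Hypothetical prediction over three classes (apple, orange, banana).
probs = [0.7, 0.2, 0.1]
loss_correct = sparse_categorical_crossentropy(probs, 0)  # true class rated likely
loss_wrong = sparse_categorical_crossentropy(probs, 2)    # true class rated unlikely
```

The "sparse" variant takes the label as an integer index rather than a one-hot vector. The loss is small when the network assigns high probability to the true class and grows as that probability falls.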
27x27x256 output. The next step is to use MaxPooling once more, this time lowering the size to 13x13x256. Another convolutional layer with 384 (3,3) filters and the same padding is applied twice, yielding 13x13x384 as the output, followed by another convolutional layer with 256 (3,3) filters and the same padding, yielding 13x13x256 as the output. MaxPool is used again, and the dimensions are lowered to 6x6x256. The layer is then flattened and two fully connected layers with 4096 units each are created, the last of which is connected to a 1000-unit softmax layer, as shown in Figure 7. The original network is designed to classify a large number of classes. However, because we have to categorize into six classes, we build the output softmax layer with six units. The softmax layer calculates the probability of each class that an input image could belong to. The Adam optimizer is used in this model. The model is trained, the images are classified and the accuracy rate is then calculated.
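The layer sizes quoted above can be checked with a few lines of arithmetic, and the six-unit softmax output can be illustrated alongside. The sketch assumes 3x3 max pooling with stride 2 and no padding, as in the original AlexNet design; the text does not state the pooling parameters, and the example scores are hypothetical.

```python
import math

def pool_out(size, window=3, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - window) // stride + 1

def conv_same_out(size):
    """A stride-1 convolution with 'same' padding keeps the spatial size."""
    return size

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

size = 27                       # 27x27x256 feature map
size = pool_out(size)           # max pooling -> 13
size = conv_same_out(size)      # 384 filters, 3x3, same padding -> 13
size = conv_same_out(size)      # 384 filters, 3x3, same padding -> 13
size = conv_same_out(size)      # 256 filters, 3x3, same padding -> 13
size = pool_out(size)           # max pooling -> 6
flat_units = size * size * 256  # flattened length fed to the dense layers

# Hypothetical scores from the six-unit output layer.
probs = softmax([2.0, 1.0, 0.5, 0.1, -1.0, -2.0])
```

Under these assumptions the arithmetic reproduces the 13x13 and 6x6 sizes given in the text, and the flattened 6x6x256 map has 9216 values before the first 4096-unit layer.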
2.2.3 MobileNetV2
MobileNetV2 is a CNN architecture for image classification and mobile vision. It requires relatively little computational power to run or to apply transfer learning. This makes it well suited to mobile devices, embedded systems and PCs with limited computing efficiency or no GPU, without affecting the accuracy of the results. It is also well suited to web browsers, which have limitations in terms of compute, graphics processing and storage.
MobileNetV2 is based on a streamlined design that uses depthwise separable convolutions to build lightweight deep neural networks for mobile and embedded vision applications. The architecture provides two simple global hyper-parameters that efficiently trade off latency and accuracy. Depthwise separable filters, also known as depthwise separable convolutions, are the foundation of MobileNet, as shown in Figure 8. Another component that improves performance is the network structure. The fruit type and defect datasets are split 80:20 into training and testing sets. The batch size is 32 and the images are resized to 224*224. For fruit type classification, the last fully connected layer is frozen and three layers are placed before it; the resulting model is called the fruit model. There are 157 layers in total, with 3.5 million parameters. The original MobileNetV2 output layer has 1000 neurons, but only three are used in this study, matching the categories learned. For fruit defect classification, two new layers are included, and the model is called the fruit defect model. There are 156 layers in total, with 3.5 million parameters. The model is recompiled using Adam as the optimizer and categorical cross-entropy as the loss function. In neural network models that predict a multinomial probability distribution, the softmax function is used as the activation function in the output layer. The model is trained, the images are classified and the accuracy rate is calculated.
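To see why depthwise separable convolutions make the network lightweight, it helps to count the weights in one layer. The sketch below compares a standard convolution with its depthwise separable counterpart; the layer dimensions are hypothetical and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)         # 18432
separable = depthwise_separable_params(3, 32, 64)  # 2336
ratio = separable / standard
```

The ratio equals 1/c_out + 1/k^2, so for 3x3 kernels the separable form needs roughly one eighth to one ninth of the weights of a standard convolution when the number of output channels is large.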
2.2.4 Adam
Adam is a neural network optimization solver that is computationally efficient, requires little memory and is suited to problems with a lot of data and parameters. Adam is an extension of stochastic gradient descent. It adjusts the learning rate for each weight of the neural network using estimates of the first and second moments of the gradient. The Adam optimizer is used in all three models: DCNN, AlexNet and MobileNetV2.
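A single Adam update can be sketched for one scalar weight as follows. The default hyper-parameters are the commonly used values (beta1 = 0.9, beta2 = 0.999, eps = 1e-8), and the toy quadratic objective is an assumption for illustration; neither is taken from the paper.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar weight.

    m and v are running estimates of the first and second moments of the
    gradient; t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem (an assumption for illustration): minimise f(w) = (w - 3)^2.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
```

Because the step is the ratio of the two moment estimates, its size stays close to the learning rate regardless of the gradient's scale, which is how Adam adapts the effective learning rate for each weight.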
The images are used to train these three models, and the fruit type is classified into three classes: apple, orange and banana.
3 Classification Of Images As Defect Or Non-defect
Fruit quality categorization helps with import and export, as well as numerous fruit-related businesses. Fruit defects are spotted easily, which saves time for workers and helps ensure that the fruits are fresh. This fruit quality classification method is used to locate defective fruits in a short amount of time.
3.1 Data Augmentation
The datasets for fruit defect classification are acquired from Google Images. The fruits chosen for this framework are apple and orange. There are 1728 images in the apple dataset and 2331 images in the orange dataset, for a total of 4059 images. This framework distinguishes between defective and non-defective fruits. To expand the dataset, data augmentation is performed, which reduces overfitting. Rotation, shear, zoom, brightness adjustment and horizontal flip are applied to produce the augmented images.
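A few of the augmentations listed above can be sketched on a tiny grayscale image stored as nested lists. This is an illustration only: the function names are hypothetical, the rotation is restricted to 90 degrees for simplicity, and shear and zoom are omitted because they need interpolation.

```python
def horizontal_flip(image):
    """Mirror a 2D grayscale image (list of rows) left to right."""
    return [row[::-1] for row in image]

def adjust_brightness(image, factor, max_value=255):
    """Multiply pixel values by `factor`, clipping to [0, max_value]."""
    return [[min(max_value, max(0, int(round(v * factor)))) for v in row]
            for row in image]

def rotate_90(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[10, 20, 30],
         [40, 50, 60]]

flipped = horizontal_flip(image)          # [[30, 20, 10], [60, 50, 40]]
brighter = adjust_brightness(image, 1.5)  # [[15, 30, 45], [60, 75, 90]]
rotated = rotate_90(image)                # [[40, 10], [50, 20], [60, 30]]
augmented = [image, flipped, brighter, rotated]
```

Each transformed copy keeps the label of the original image, so one labelled photograph yields several training samples.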
3.2 Classification
The images are fed into the three classifier models described above: DCNN, AlexNet and MobileNetV2. In this framework, apple and orange fruits are each classified into two classes, defect and non-defect.
4 Results And Discussion
4.1 Result for fruit type classification
Three models are trained in the proposed framework for fruit type classification: DCNN, AlexNet and MobileNetV2. The Deep Convolutional Neural Network model uses a learning rate of 0.000001 and is trained over 110 epochs in 3 hours. Its validation accuracy is 87.5 percent and its training accuracy is 100 percent. The AlexNet model is trained over 100 epochs, with a total time of 1 hour 40 minutes. Its validation accuracy is 99.25 percent and its training accuracy is 99.81 percent. The MobileNetV2 model is trained over 10 epochs in 12 minutes. Its validation accuracy is 100 percent and its training accuracy is 100 percent. Of these three models, MobileNetV2 has the best fruit type accuracy. It recognizes and categorizes the fruit type as shown in Figure 9.
Figure 10 shows the training and validation accuracy over epochs for fruit type classification.
Figure 11 shows the training and validation loss (cross-entropy) over epochs for fruit type classification.
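Table 1 Comparison of accuracy for fruit type classification with different epochs
Model | Epochs | Training time | Validation accuracy
Deep convolutional neural network | 110 | 3hrs | 87.5%
AlexNet | 100 | 1hr40mins | 99.25%
MobileNetV2 | 10 | 12mins | 100%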
Table 1 compares the accuracy of the three models for fruit type classification. Of the three, the MobileNetV2 model achieved 100 percent accuracy in the shortest time, which makes it the best model for fruit type classification. Figure 12 gives a graphical comparison of the models based on accuracy.
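The figures reported in this subsection can be put side by side with a short script that computes two derived quantities: the gap between training and validation accuracy, and the average time per epoch. The dictionary layout and function names are hypothetical; the numbers are those stated in the text.

```python
# Figures reported in the text for fruit type classification.
results = {
    "DCNN":        {"epochs": 110, "minutes": 180, "train_acc": 100.0, "val_acc": 87.5},
    "AlexNet":     {"epochs": 100, "minutes": 100, "train_acc": 99.81, "val_acc": 99.25},
    "MobileNetV2": {"epochs": 10,  "minutes": 12,  "train_acc": 100.0, "val_acc": 100.0},
}

def generalization_gap(entry):
    """Training accuracy minus validation accuracy, in percentage points."""
    return entry["train_acc"] - entry["val_acc"]

def minutes_per_epoch(entry):
    """Average wall-clock minutes spent on one epoch."""
    return entry["minutes"] / entry["epochs"]

best_model = max(results, key=lambda name: results[name]["val_acc"])

for name, entry in results.items():
    print(f"{name}: gap={generalization_gap(entry):.2f} points, "
          f"{minutes_per_epoch(entry):.2f} min/epoch")
```

A large gap between training and validation accuracy, as for the DCNN here, is a common sign of overfitting.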
4.2 Result for defect classification
Three models are trained in this work for classifying fruits as defect or non-defect: DCNN, AlexNet and MobileNetV2. For the apple dataset, the Deep Convolutional Neural Network model uses a learning rate of 0.000001 and is trained over 300 epochs in 3 hours and 25 minutes. Its validation accuracy is 68 percent and its training accuracy is 84.55 percent. The AlexNet model is trained over 100 epochs in 1 hour and 6 minutes. Its validation accuracy is 70.59 percent and its training accuracy is 99.36 percent. The MobileNetV2 model is trained over 20 epochs in 10 minutes. Its validation accuracy is 100 percent and its training accuracy is 100 percent. Comparing the three models, MobileNetV2 has the best accuracy. This model is quite useful for distinguishing between defective and non-defective fruits, as shown in Figure 13.
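Table 2 Comparison of accuracy for the defect classification of apple fruit with different epochs
Model | Epochs | Training time | Validation accuracy
Deep convolutional neural network | 300 | 3hrs25mins | 68%
AlexNet | 100 | 1hr6mins | 70.59%
MobileNetV2 | 20 | 10mins | 100%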
Table 2 compares the three proposed models for fruit defect and non-defect classification based on accuracy. Apple and orange are the two fruits for which the three models were developed. For apple, the MobileNetV2 model provided the most accurate results, reaching 100 percent accuracy in a short amount of time. Figure 16 gives a graphical comparison of the models for apple defect classification based on accuracy.
For the orange dataset, the Deep Convolutional Neural Network model uses a learning rate of 0.000001 and takes 35 minutes to train over 10 epochs. Its training accuracy is 72.63 percent and its validation accuracy is 90 percent. The AlexNet model is trained over 10 epochs in 40 minutes. Its training accuracy is 97 percent and its validation accuracy is 53 percent. The MobileNetV2 model is trained over 10 epochs in 22 minutes. Its training accuracy is 100 percent and its validation accuracy is 99.89 percent. When the accuracy of the three models is compared, the MobileNetV2 model comes out on top. This model is very effective for determining whether a fruit is defective or not, as shown in Figure 17.
Figure 18 shows the training and validation accuracy over epochs for defect classification of orange.