HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

NGUYEN THI THANH NHAN

INTERACTIVE AND MULTI-ORGAN BASED PLANT SPECIES IDENTIFICATION

Major: Computer Science
Code: 9480101

THESIS ABSTRACT
COMPUTER SCIENCE

Hanoi, 2020

This dissertation was completed at Hanoi University of Science and Technology.
Supervisors: Assoc. Prof. Dr. Le Thi Lan and Prof. Dr. Hoang Van Sam
Reviewer 1: Assoc. Prof. Dr. Nguyen Thi Thuy
Reviewer 2: Assoc. Prof. Dr. Tran Quang Bao
Reviewer 3: Assoc. Prof. Dr. Pham Van Cuong
The dissertation will be defended before the approval committee at Hanoi University of Science and Technology at 13:30 on May 6, 2020.
The dissertation can be found at the Ta Quang Buu Library and the Vietnam National Library.

INTRODUCTION

Motivation

Plant identification, which aims at matching a given specimen to a known taxon, is considered a key step in assessing flora knowledge. Nowadays, the availability of relevant technologies (e.g., digital cameras and mobile devices), image datasets, and advanced techniques in image processing and pattern recognition has made automated plant species identification a reality. Automatic plant identification can be defined as the process of determining the name of a species based on observed images. In recent years, we have witnessed significant improvement in automatic plant identification performance, in terms of both accuracy and the number of species covered [1, 2]. However, the use of plant identification in practice still has to overcome the following limitations. First, the number of covered plant species (e.g., 10,000 in LifeCLEF [2]) is relatively small compared with the number of plant species on Earth (e.g., 400,000 [3]). Second, the accuracy of automatic plant identification still needs to be improved.

Objective

The main aim of this thesis is to address the second limitation of automatic plant identification (low recognition accuracy) by proposing novel and robust methods for plant recognition. First, we focus on improving the recognition accuracy of plant identification based on images of one sole organ. Among the different organs of a plant, we select the leaf, as it is the most widely used organ in the literature. Second, using one sole organ is not always adequate, because a single organ cannot fully reflect all information of a plant due to large inter-class similarity and large intra-class variation; therefore, multi-organ plant identification is also studied in this dissertation. Finally, one more objective of the dissertation is to contribute to spreading the knowledge of one specific kind of plants, medicinal plants, in Vietnam by developing an application for Vietnamese medicinal plant retrieval based on plant identification. To this end, the concrete objectives are to: (1) develop a new method for leaf-based plant identification that is able to recognize the plants of interest even in complex background images; (2) propose a fusion scheme for multiple-organ plant identification; (3) develop an image-based plant search module in a Vietnamese medicinal plant retrieval application.

Contributions

The dissertation has three main contributions:

Contribution 1: A complex-background leaf-based plant identification method has been proposed. The proposed method combines the advantages of interactive segmentation, which helps to determine the leaf region with very few user interactions, with the representative power of the Kernel Descriptor (KDES).

Contribution 2: A fusion scheme for two-organ based plant identification has been introduced. The fusion is an integration of a product rule and a classification-based approach.
Contribution 3: An image-based plant search module has been developed and deployed in a Vietnamese medicinal plant retrieval application named VnMed.

Dissertation outline

Introduction: This section describes the main motivations and objectives of the study. We also present the critical points of the research context, its constraints and challenges, together with the general framework and main contributions of the dissertation.
Chapter 1, Literature Review: This chapter surveys existing works and approaches proposed for automatic plant identification.
Chapter 2: In this chapter, a method for plant identification based on leaf images is proposed. To extract the leaf region from images, we apply interactive segmentation; the improved KDES (Kernel DEScriptor) is then employed to extract leaf characteristics.
Chapter 3: This chapter focuses on multi-organ plant identification. We propose different strategies for determining the result of multi-organ identification based on those of single-organ identification.
Chapter 4: In this chapter, we propose a method for organ detection and use it to develop an application for a Vietnamese medicinal plant retrieval system.
Conclusion: We give some conclusions, discuss the limitations of the proposed methods, and describe research directions for future work.

CHAPTER 1. LITERATURE REVIEW

1.1 Plant identification from images of a single organ

There are a large number of automatic plant identification methods. Among the different organs of a plant, the leaf is the most widely used [4], because leaves are usually available throughout the year, and identification results on leaf scans often give the best results compared with other organs [5]. Another popular organ is the flower, because its appearance (e.g., color, shape, texture) is highly distinctive [6]. In addition, other organs such as fruit, stem and branch are used to identify plants.

There are two main approaches for plant identification based on images of plant organs: the first uses hand-designed features, while the second employs deep learning. A hand-designed feature-based method consists of two main stages, training and testing, and each stage consists of four main components: image acquisition, preprocessing, feature extraction and classification [7]. Feature extraction can be considered the most important component of such a system; its purpose is to reduce the dimensionality of the data while representing it well. Features include global features (color, texture, shape) and local, organ-specific features. For example, a leaf has organ-specific features such as its vein structure, margin and teeth, and the shape of the leaf plays the most important role [4], while shape and color are the important features for a flower. Previous studies often combine two or more feature types for each organ, because no single feature is strong enough to separate all categories.

The second approach employs deep learning methods. Recently, learning feature representations using Convolutional Neural Networks (CNNs) has shown a number of successes in different topics in computer vision, such as object detection, segmentation and image classification [8]. A CNN learns features automatically: each layer extracts features from the output of the previous layer. The first layers of the network extract simple patterns such as lines, curves or blobs in the input image; this information is used as input for the next layers, which have the harder task of extracting the components of the object in the image; finally, the highest layers of the network perform the classification of the objects in the image. Typical CNNs include AlexNet, VGG, GoogLeNet and ResNet. Teams utilizing deep learning techniques have been the top winners of the LifeCLEF competition.
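To make this layered feature extraction concrete, the sketch below defines a minimal convolutional classifier in PyTorch. It is a toy illustration, not an architecture used in the dissertation; the layer sizes and the 50-class output are made-up assumptions.

```python
import torch
import torch.nn as nn

class TinyPlantCNN(nn.Module):
    """Toy CNN: early layers respond to edges/blobs, deeper layers to object parts."""
    def __init__(self, num_classes=50):  # 50 classes is an illustrative choice
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # low-level edges
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # textures/motifs
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # object parts
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)  # final species scores

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

logits = TinyPlantCNN()(torch.randn(1, 3, 224, 224))  # -> shape (1, 50)
```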
1.2 Plant identification from images of multiple organs

The state-of-the-art results of plant identification using a single organ are still far from practical requirements. Recently, plant identification has moved from one sole organ to multiple organs, and more research has been dedicated to plant identification from images of multiple organs, especially since the release of a large multi-organ image dataset by ImageCLEF in 2013 [1, 5, 2]. This is motivated by a real scenario and also reflects the identification process of botanists: observing several organs simultaneously allows botanists to disambiguate species that could be confused when using only one organ. Multi-organ plant identification methods can be divided into two groups: the first group takes the organ type into account, while the second does not. In the first approach, each organ is trained separately; in the second, all images are trained together regardless of the organ they belong to. Early or late fusion techniques are then used to combine the results.

1.3 Plant data collection and identification systems

There are a number of image-based plant identification applications deployed on mobile devices, such as Pl@ntNet, iNaturalist, iSpot, Leafsnap, FlowerChecker, PlantSnapp and Plantifier [9, 10]. These applications often provide three main functions: exploring, identifying, and collecting data. The identification and data collection functions support each other: when the identification function is highly accurate, it attracts more people to use the system and to collect more data; the collected data become more diverse and cover more species, and are then used to retrain the system. One well-known problem in classification is overfitting; to avoid it, images of the same species should be diverse and taken under different protocols. This shows the important role of crowdsourcing systems.

CHAPTER 2. LEAF-BASED PLANT IDENTIFICATION METHOD BASED ON KERNEL DESCRIPTOR

2.1 The framework of the leaf-based plant identification method

The framework of leaf-based plant identification is illustrated in Figure 2.1. The method consists of three main modules: image preprocessing, feature extraction and classification.

Figure 2.1. The complex-background leaf image plant identification framework.

2.2 Interactive segmentation

Figure 2.2. The interactive segmentation scheme.

The interactive segmentation scheme used in our work consists of four steps: inner/outer marker determination, watershed segmentation, selection of the leaf of interest, and leaf-shape normalization (see Figure 2.2). In the first step, the user manually marks lines inside and outside the leaf of interest. Then, the watershed method is used for image segmentation [11]. From the returned segments, the user selects the leaf region of interest in the third step. Finally, the leaf shape is normalized.
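A minimal sketch of the marker-driven watershed step with OpenCV might look as follows. In the real system the inner and outer markers come from the user's strokes; the file name and the hard-coded marker rectangles here are placeholder assumptions.

```python
import cv2
import numpy as np

img = cv2.imread("leaf.jpg")                      # BGR image with a complex background
markers = np.zeros(img.shape[:2], dtype=np.int32)

# The user would draw strokes; we fake them with two boxes for illustration.
markers[100:120, 100:140] = 1                     # inner marker: inside the leaf
markers[5:15, 5:45] = 2                           # outer marker: background

cv2.watershed(img, markers)                       # labels pixels 1 or 2; -1 = boundary
leaf_mask = (markers == 1).astype(np.uint8) * 255

# Crop to the selected leaf region; shape normalization would follow downstream.
x, y, w, h = cv2.boundingRect(leaf_mask)
leaf = cv2.bitwise_and(img, img, mask=leaf_mask)[y:y + h, x:x + w]
```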
2.3 Feature extraction

The kernel descriptor (KDES), first proposed by Bo et al. [12], allows one to capture various visual attributes of a pixel (e.g., gradient, color and binary shape) and to learn compact features from match kernels via kernel approximation. In [13], Nguyen et al. proposed three improvements to KDES extraction. Like the original KDES, the improved KDES is extracted through three steps: pixel-level feature extraction, patch-level feature extraction, and image-level feature extraction. We employ the improved KDES for feature extraction.

a) Pixel-level feature extraction. At this level, a gradient vector is computed for each pixel of the image. The gradient vector at a pixel z is defined by its magnitude m(z) and its orientation θ(z). In [12], the normalized orientation is defined as:

\tilde{\theta}(z) = [\sin(\theta(z)), \cos(\theta(z))]   (2.8)

b) Patch-level feature extraction. First, a set of patches with adaptive size is generated from the image: we create patches with an adaptive size instead of a fixed size, so as to obtain a similar number of patches along both the horizontal and the vertical axes. Figure 2.5 illustrates the uniform patches of the original KDES and the adaptive patches of our method.

Figure 2.5. An example of the uniform patch in the original KDES and the adaptive patch in our method: (a, b) two images of the same leaf at different sizes divided using uniform patches; (c, d) the same two images divided using adaptive patches.

Second, the patch-level features are computed based on the kernel method. Derived from a match kernel representing the similarity of two patches, a feature vector can be extracted for each patch using an approximate patch-level feature map. Given a designed patch-level match kernel function, the approximate feature over an image patch P is constructed as [13]:

F_{gradient}(P) = \sum_{z \in P} \tilde{m}(z)\, \phi_o(\tilde{\omega}(z)) \otimes \phi_p(z)   (2.17)

where \tilde{m}(z) is the normalized gradient magnitude, \phi_o(\tilde{\omega}(z)) and \phi_p(z) are the approximate feature maps for the orientation kernel and the position kernel respectively, and \otimes is the Kronecker product.

c) Image-level feature extraction. Once the patch-level features are computed for each patch, the remaining work is to compute a feature vector representing the whole image. To do this, a spatial pyramid structure divides the image into cells using horizontal and vertical lines at several layers; we then compute the feature vector for each cell of the pyramid and concatenate them into a final descriptor (Figure 2.7). The feature map on the pyramid structure is:

\bar{\phi}_P(X) = [w^{(1)} \bar{\phi}_S(X^{(1,1)}); \ldots; w^{(l)} \bar{\phi}_S(X^{(l,t)}); \ldots; w^{(L)} \bar{\phi}_S(X^{(L,n_L)})]   (2.21)

where w^{(l)} is the weight associated with level l and \bar{\phi}_S(X^{(l,t)}) is the feature map of the t-th cell in the l-th level. We thus obtain the final representation of the whole image, which we call the image-level feature vector. This vector is the input of a multiclass SVM for training and testing.

Figure 2.7. Construction of the image-level feature by concatenating the feature vectors of the cells in the layers of the spatial pyramid structure.
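To make the first two levels of this pipeline concrete, the sketch below computes the pixel-level orientation encoding of Eq. (2.8) and an adaptive patch grid in numpy/scipy. The Sobel gradients and the target grid of 8x8 patches are illustrative assumptions, not the dissertation's exact parameters.

```python
import numpy as np
from scipy import ndimage

def pixel_level_features(gray):
    """Per-pixel gradient magnitude m(z) and orientation encoding (Eq. 2.8)."""
    gx = ndimage.sobel(gray, axis=1, output=float)
    gy = ndimage.sobel(gray, axis=0, output=float)
    m = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)
    # theta_tilde(z) = [sin(theta(z)), cos(theta(z))]
    return m, np.stack([np.sin(theta), np.cos(theta)], axis=-1)

def adaptive_patches(gray, target_grid=8):
    """Split the image into roughly target_grid x target_grid patches whose
    size adapts to the image, so images of different sizes yield a similar
    number of patches along each axis (the adaptive-patch idea of Fig. 2.5)."""
    h, w = gray.shape
    ph, pw = max(1, h // target_grid), max(1, w // target_grid)
    return [gray[r:r + ph, c:c + pw]
            for r in range(0, h - ph + 1, ph)
            for c in range(0, w - pw + 1, pw)]

gray = np.random.rand(120, 200)        # stand-in for a normalized leaf image
m, theta_tilde = pixel_level_features(gray)
patches = adaptive_patches(gray)       # ~64 patches regardless of image size
```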
2.4 Experimental results

2.4.1 Datasets

We conduct experiments on the following public datasets. Subset of ImageCLEF 2013: 5,540 and 1,660 leaf images of 80 species of ImageCLEF 2013, for training and testing respectively. Flavia: 1,907 leaf images of 32 species on a simple background. LifeCLEF 2015: Table 2.1 details the leaf/leafscan subset.

Table 2.1. Leaf/leafscan dataset of LifeCLEF 2015.

                   Leaf    Leafscan
Training           13,367  12,605
Testing            2,690   221
Number of species  899     351

2.4.2 Experimental results

Results on the ImageCLEF 2013 dataset. The results are shown in Table 2.2. They show that our improvements to kernel descriptor extraction yield a significant increase of the performance on both interactively and automatically segmented images, and that the proposed method obtains the best result. On the same segmentation setting, the improved KDES outperforms the original KDES; with the same KDES variant, interactive segmentation significantly improves the accuracy.

Table 2.2. Accuracy obtained in six experiments on the ImageCLEF 2013 dataset.

Method                                        Accuracy (%)
Improved KDES with interactive segmentation   71.5
Original KDES with interactive segmentation   63.4
Improved KDES with no segmentation            43.68
Original KDES with no segmentation            43.25
Improved KDES with automatic segmentation     42.3
Original KDES with automatic segmentation     35.5

Results on the Flavia dataset. The accuracy is 99.06%. We compare with other methods on the Flavia dataset in Table 2.4: our method is the best, improving on the other results by between 0.36 and 6.86 percentage points. The results are very high because this is a simple image dataset.

Results on the LifeCLEF 2015 dataset. The evaluation measure is the score at image level [1]. The proposed method was integrated in our submissions named Mica Run 1, Mica Run 2 and Mica Run 3. Figure 2.12 shows the results of all participating teams in LifeCLEF 2015. KDES performs very well on the LeafScan category, with an identification score better than most of the runs based on GoogLeNet, such as those of Inria.

CHAPTER 3. FUSION SCHEMES FOR MULTI-ORGAN BASED PLANT IDENTIFICATION

Given a query observation q consisting of N images I_1, ..., I_N of different organs, let s_i(I_k) denote the confidence score of the i-th species for image I_k, C the number of species, and c the predicted class of the species for the query q.

Basic combination techniques. The three rules widely used in basic combination techniques are the max, sum and product rules. Using these rules, the class c of the query q is defined as follows:

Max rule:      c = \arg\max_i \max_{k=1,\ldots,N} s_i(I_k)   (3.1)
Sum rule:      c = \arg\max_i \sum_{k=1}^{N} s_i(I_k)   (3.2)
Product rule:  c = \arg\max_i \prod_{k=1}^{N} s_i(I_k)   (3.3)

The basic combination approaches do not always guarantee good performance. However, as they are simple and do not require a training process, most current multi-organ plant identification methods adopt them.

Classification-based fusion (CBF). The main idea of classification-based fusion is that the multiple scores are treated as feature vectors and a classifier is constructed to discriminate each class; the signed distance from the decision boundary is usually regarded as the fused score. We adopt this idea for plant identification from images of two organs, choosing the SVM (Support Vector Machine) as the classifier because it is a powerful classification technique. CBF is performed as follows. First, we define the positive and negative samples in the training dataset: for each pair of images, we have one positive sample and (C - 1) negative samples, as illustrated in Figure 3.3. In the test phase, the feature vector of the query q is computed from the single-organ identification models. The CBF method then produces two predicted probabilities for each species i: a positive probability P_pos(i, q) and a negative one P_neg(i, q). The list of plants is ranked by the confidence score s_i(q) of species i for the query q:

s_i(q) = P_{pos}(i, q)   (3.4)

The class c is predicted as follows, where 1 <= i <= C:

c = \arg\max_i s_i(q)   (3.5)

Figure 3.3. Explanation of positive and negative samples.
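A compact numpy sketch of the three basic rules (Eqs. 3.1-3.3) might look as follows; the 2x3 example scores are made up for illustration, with `scores[k, i]` holding the single-organ confidence s_i(I_k).

```python
import numpy as np

def fuse_basic(scores, rule="product"):
    """scores: (N, C) array, scores[k, i] = s_i(I_k) for query image I_k.
    Returns the predicted class c under the max, sum, or product rule."""
    if rule == "max":          # Eq. (3.1)
        fused = scores.max(axis=0)
    elif rule == "sum":        # Eq. (3.2)
        fused = scores.sum(axis=0)
    elif rule == "product":    # Eq. (3.3)
        fused = scores.prod(axis=0)
    else:
        raise ValueError(rule)
    return int(np.argmax(fused))

# Example: two organ images, three species.
scores = np.array([[0.2, 0.5, 0.3],   # s_i(I_1), e.g., a leaf image
                   [0.1, 0.6, 0.3]])  # s_i(I_2), e.g., a flower image
print(fuse_basic(scores, "product"))  # -> 1
```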
Robust Hybrid Fusion (RHF). The above classification-based approach can lose the distribution characteristics of each species, because the positive and negative samples of all species are merged and represented in a single metric space. Therefore, we build for each species an SVM model based on its own positive and negative samples. When a pair of organ images is input to the model, these SVM classifiers give the probability that the pair belongs to each species, and we combine this probability with the confidence score of each organ. Let q be a query consisting of a pair of organ images, s_i(I_k) the confidence score of the i-th species for image I_k, and s_i(q) the confidence score of the query q for the i-th species computed by the SVM model. The robust hybrid fusion model is:

c = \arg\max_i \, s_i(q) \prod_{k=1}^{2} s_i(I_k)   (3.6)

This model is an integration of a product rule and a classification-based approach. We expect the positive probability of the query q to affect the fusion result: if the positive probability of q for the i-th species is high, the probability that q belongs to the i-th species is high, too.
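A sketch of RHF under these definitions is given below: one SVM per species produces s_i(q) from the concatenated organ score vectors, and Eq. (3.6) multiplies it with the per-organ scores. The scikit-learn usage, data shapes and probability calibration are illustrative assumptions, not the thesis's exact training setup.

```python
import numpy as np
from sklearn.svm import SVC

class RobustHybridFusion:
    """One SVM per species, combined with the product rule (Eq. 3.6)."""
    def __init__(self, n_species):
        self.models = [SVC(probability=True) for _ in range(n_species)]

    def fit(self, pair_features, labels):
        # pair_features: (n_samples, 2*C) concatenated organ score vectors;
        # for species i, its own pairs are positives, all others negatives.
        for i, svm in enumerate(self.models):
            svm.fit(pair_features, (labels == i).astype(int))

    def predict(self, scores):
        # scores: (2, C) single-organ scores s_i(I_k) for the query pair.
        q = scores.reshape(1, -1)
        svm_score = np.array([m.predict_proba(q)[0, 1] for m in self.models])
        return int(np.argmax(svm_score * scores.prod(axis=0)))  # Eq. (3.6)
```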
3.3 The choice of classification model for single-organ plant identification

For single-organ plant identification we employ well-known CNNs: AlexNet [21], ResNet [22] and GoogLeNet [23]. Two schemes are proposed, as illustrated in Figure 3.10: (1) one proper CNN for each organ, and (2) one CNN for all organs. The first scheme allows an explicit fusion for each organ, while the second does not require knowing the type of organ and consumes less computational resources.

Figure 3.10. Single-organ plant identification schemes.

In our experiments, for both schemes the network weights are pretrained on the ImageNet dataset and the chosen networks are fine-tuned on the working dataset.

3.4 Experimental results

3.4.1 Dataset

To evaluate the proposed fusion scheme, we use a dataset containing images of 50 species extracted from LifeCLEF 2015 and augmented from the Internet (Table 3.2). The dataset is divided into three parts: CNN Training is the training data for single-organ identification; SVM Input is the training data of the SVM models; Testing is used to evaluate the performance of the CNNs and the late fusion methods.

Table 3.2. The collected dataset of 50 species with four organs.

              Flower  Leaf   Entire  Branch  Total
CNN Training  1,650   1,930    825   1,388    5,793
SVM Input       986   1,164    495     833    3,478
Testing         673     776    341     553    2,343
Total         3,309   3,870  1,661   2,774   11,614

3.4.2 Single-organ plant identification results

The results obtained for the two schemes with the three networks are shown in Table 3.3.

Table 3.3. Single-organ plant identification accuracy (%) with two schemes: (1) a CNN for each organ; (2) a CNN for all organs.

             AlexNet        ResNet         GoogLeNet
Organ        (1)    (2)     (1)    (2)     (1)    (2)
Leaf (Le)    66.2   63.8    73.4   70.6    75.0   76.6
Flower (Fl)  73.0   72.2    75.6   75.4    82.2   78.4
Branch (Br)  43.2   47.4    48.6   54.6    53.2   54.8
Entire (En)  32.4   33.8    32.4   39.0    36.4   35.2

GoogLeNet obtains better results than AlexNet and ResNet in both schemes and for most organs. Interestingly, scheme 1 is suitable for highly discriminative, salient organs such as leaf and flower, while scheme 2 is a good choice for the other organs, branch and entire. The branch and entire results improve under scheme 2 because some flower and leaf images may also contain branch and entire information. A further advantage of scheme 2 is that it does not require determining the type of organ. The results also show that flower is the organ that obtains the best result, while entire gets the lowest.

3.4.3 Evaluation of the proposed fusion scheme in multi-organ plant identification

Tables 3.4, 3.5 and 3.6 show the performance obtained when combining a pair of organs for plant identification. The experimental results show that almost all fusion techniques markedly improve the accuracy compared with using images of one sole organ. When scheme 1 is applied for single-organ identification with AlexNet, the best single-organ performance is 73.0% (flower images), whereas applying the proposed RHF to the leaf-flower combination raises the accuracy by 16.8%, to 89.8%. With ResNet, the leaf-flower (Le-Fl) combination improves by +17% over the best single organ, and by +13.6% with GoogLeNet. Not only for the leaf-flower pair but for all six organ pairs, RHF retains high performance, and almost all other fusion results are also higher than the single-organ ones.

Table 3.4. Accuracy (%) at rank 1 (R1) and rank 5 (R5) when combining each pair of organs with different fusion schemes, using AlexNet for single-organ identification.

Scheme 1 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   66.2      67.2      75.6          74.0   76.6
       R5   88.6      88.8      93.2          81.8   94.6
En-Fl  R1   73.8      74.4      78.8          77.2   81.2
       R5   92.6      92.8      94.2          84.2   94.4
Le-Fl  R1   81.6      82.0      88.6          86.2   89.8
       R5   96.8      96.8      98.2          90.4   98.4
Br-Le  R1   70.2      71.0      76.8          73.8   78.4
       R5   89.6      90.0      93.4          79.6   93.8
Br-Fl  R1   74.2      75.4      80.8          79.0   81.4
       R5   90.8      91.4      95.2          83.0   95.4
Br-En  R1   51.6      52.2      58.0          58.0   58.6
       R5   76.8      77.6      83.6          81.4   83.8

Scheme 2 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   66.8      67.2      77.4          71.4   78.6
       R5   88.4      88.2      93.6          80.2   94.4
En-Fl  R1   73.84     73.6      78.8          76.24  80.4
       R5   88.8      89.2      94.8          83.6   95.6
Le-Fl  R1   78.8      81.2      89.6          83.2   89.6
       R5   95.6      96.0      99.2          88.8   99.2
Br-Le  R1   66.4      68.2      78.2          73.6   78.2
       R5   92.0      93.0      95.6          81.6   96.0
Br-Fl  R1   70.2      70.6      80.6          76.6   81.4
       R5   90.4      90.6      95.4          84.6   95.6
Br-En  R1   52.4      52.8      60.6          60.6   61.6
       R5   78.2      78.6      83.6          83.4   84.9

Table 3.5. Accuracy (%) at rank 1 (R1) and rank 5 (R5) when combining each pair of organs with different fusion schemes, using ResNet for single-organ identification.

Scheme 1 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   70.4      72.2      75.2          73.2   78.0
       R5   91.8      92.6      92.8          90.6   93.2
En-Fl  R1   73.8      75.4      80.0          76.4   83.2
       R5   93.2      93.6      95.0          89.2   95.4
Le-Fl  R1   90.0      91.4      92.4          91.4   92.6
       R5   98.0      98.8      99.0          96.0   99.2
Br-Le  R1   77.8      79.2      82.0          79.4   83.2
       R5   91.8      92.2      94.0          90.4   94.6
Br-Fl  R1   80.0      81.0      84.4          82.0   86.4
       R5   93.6      94.4      97.6          91.4   97.8
Br-En  R1   52.4      54.4      62.2          55.0   60.6
       R5   82.0      83.4      86.6          80.4   87.4

Scheme 2 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   73.6      75.4      80.8          73.2   80.8
       R5   94.2      94.4      94.8          90.6   95.2
En-Fl  R1   74.6      76.0      80.2          76.4   83.2
       R5   94.4      95.0      95.8          89.2   95.2
Le-Fl  R1   85.8      87.6      89.2          91.4   92.6
       R5   98.4      98.4      99.0          96.0   99.2
Br-Le  R1   79.8      81.4      83.6          79.4   83.2
       R5   94.4      94.4      96.4          90.4   94.6
Br-Fl  R1   78.8      80.4      85.6          81.0   86.0
       R5   95.6      96.0      96.2          91.4   97.6
Br-En  R1   60.4      66.2      69.0          55.0   69.0
       R5   84.8      85.6      89.6          80.4   87.6
Table 3.6. Accuracy (%) at rank 1 (R1) and rank 5 (R5) when combining each pair of organs with different fusion schemes, using GoogLeNet for single-organ identification.

Scheme 1 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   74.6      75.0      79.2          79.4   80.6
       R5   94.0      93.8      93.6          84.0   94.4
En-Fl  R1   79.2      79.8      83.4          83.8   84.2
       R5   95.8      96.0      97.0          89.2   96.8
Le-Fl  R1   91.4      92.0      95.4          93.8   95.8
       R5   99.6      99.6      99.6          96.0   99.8
Br-Le  R1   79.8      81.0      84.6          80.2   84.6
       R5   94.4      94.6      97.4          84.8   97.4
Br-Fl  R1   85.0      86.0      90.2          87.2   91.6
       R5   97.0      97.4      99.2          90.2   99.0
Br-En  R1   58.0      58.8      61.8          60.2   64.2
       R5   81.4      81.8      86.8          70.4   87.0

Scheme 2 for single-organ identification:
Pair        Max rule  Sum rule  Product rule  CBF    RHF
En-Le  R1   77.8      78.0      79.4          81.2   82.0
       R5   91.4      91.4      96.2          85.6   95.8
En-Fl  R1   77.6      78.0      81.0          80.2   81.0
       R5   93.6      93.8      95.8          84.4   96.2
Le-Fl  R1   90.6      90.2      92.6          91.8   92.8
       R5   98.6      98.8      99.0          93.8   99.0
Br-Le  R1   81.2      81.8      85.6          81.6   86.6
       R5   96.8      96.8      96.8          86.0   97.0
Br-Fl  R1   80.0      80.4      86.8          83.2   87.2
       R5   96.0      96.0      97.6          86.8   97.0
Br-En  R1   57.8      58.4      65.6          59.2   66.4
       R5   82.2      82.0      87.0          68.4   87.0

Comparison with MCDCNN (Multi-Column Deep Convolutional Neural Networks). To show the effectiveness of the proposed fusion scheme, we compare its performance with that of MCDCNN [24]. The results obtained on the same dataset (Table 3.7) show that the proposed method outperforms MCDCNN in all combinations; the improvement is up to 14.4% for the combination of branch and leaf.

Table 3.7. Comparison of the proposed RHF fusion scheme with the state-of-the-art method MCDCNN [24].

                  RHF, scheme 1                RHF, scheme 2
Pair       AlexNet  ResNet  GoogLeNet   AlexNet  ResNet  GoogLeNet   MCDCNN [24]
En-Le R1   76.6     78.0    80.6        78.6     80.8    82.0        70.0
      R5   94.6     93.2    94.4        94.4     95.2    95.8        91.0
En-Fl R1   81.2     83.2    84.2        80.4     83.2    81.0        75.6
      R5   94.4     95.4    96.8        95.6     95.2    96.2        94.2
Le-Fl R1   89.8     92.6    95.8        89.6     92.6    92.8        86.6
      R5   98.4     99.2    99.8        99.2     99.2    99.0        98.4
Br-Le R1   78.4     83.2    84.6        78.2     83.2    86.6        72.2
      R5   93.8     94.6    97.4        96.0     94.6    97.0        93.0
Br-Fl R1   81.4     86.4    91.6        81.4     86.0    87.2        76.8
      R5   95.4     97.8    99.0        95.6     97.6    97.0        93.0
Br-En R1   58.6     60.6    64.2        61.6     69.0    66.4        55.2
      R5   83.8     87.4    87.0        84.0     87.6    87.0        80.6

3.5 Conclusion

This chapter presented the fusion scheme proposed for multi-organ based plant identification. Combining two organs usually gives better results than using one organ, and the experiments show that the fusion techniques increase the performance dramatically. The robust hybrid fusion model gives the best result in all evaluations, obtaining from +3.2% to +14.8% improvement at rank 1 over the MCDCNN method. In future work, we will investigate a method to identify species for observations with an unfixed number of organs.

CHAPTER 4. TOWARDS BUILDING AN AUTOMATIC PLANT RETRIEVAL SYSTEM BASED ON PLANT IDENTIFICATION

4.1 The framework for building an automatic plant identification system

We propose a new framework, based on deep learning, for building automatic plant identification from images, as illustrated in Figure 4.3. It consists of four steps:
- Plant data collection: collect images from different sources.
- Plant organ detection: build an organ detector (leaf, flower, fruit, stem, branch, non-plant) based on the LifeCLEF 2015 dataset and use it as an automatic data filter (a sketch follows this list).
- Data validation: remove invalid plant images while keeping valid ones.
- Plant identification: once the dataset has been processed by the data validation step, different identification models can be trained.

Figure 4.3. The proposed framework for building automatic plant identification.
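As an illustration of the organ detector acting as an automatic data filter, a minimal sketch is given below; the rejection threshold, the loader format yielding (paths, image tensors), and the class order are assumptions, not the thesis's exact configuration.

```python
import torch

ORGAN_CLASSES = ["leaf", "flower", "fruit", "stem", "branch", "non-plant"]

def validate_images(model, loader, reject_threshold=0.5):
    """Keep images whose top prediction is a plant organ with sufficient
    confidence; drop 'non-plant' or low-confidence images."""
    kept, dropped = [], []
    model.eval()
    with torch.no_grad():
        for paths, batch in loader:            # loader yields (paths, tensors)
            probs = torch.softmax(model(batch), dim=1)
            conf, pred = probs.max(dim=1)
            for path, c, p in zip(paths, conf, pred):
                organ = ORGAN_CLASSES[int(p)]
                if organ == "non-plant" or float(c) < reject_threshold:
                    dropped.append(path)       # removed by data validation
                else:
                    kept.append((path, organ))
    return kept, dropped
```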
4.2 Plant organ detection

We apply GoogLeNet and transfer learning to build the organ detector. The training data comprises images from the LifeCLEF 2015 dataset (leaf, flower, fruit, stem, branch) [1] and a non-plant dataset collected from the Internet.

Experiment: Table 4.4 presents the results for two weight initialization strategies. Initializing with weights pretrained on a large dataset such as ImageNet yields improvements of +5.08% at rank 1 and +2.54% at rank 2 over random weight initialization. This result is very promising, as the working images are mainly captured against complex backgrounds; it shows that deep learning is capable of learning well from natural images.

Table 4.4. Organ detection performance of GoogLeNet with different weight initializations.

Weight initialization strategy   Acc. rank 1 (%)   Acc. rank 2 (%)
Randomly generated weights       82.10             94.92
Pretrained on ImageNet           87.18             97.46

4.3 Case study: development of image-based plant retrieval in the VnMed application

The aim of this section is to develop the image-based plant retrieval functionality of VnMed by applying the proposed framework. We first collect images of 100 medicinal plants following two acquisition modes, manual acquisition and crowdsourcing, and organize them into four datasets:
- VnDataset1: images captured by manual acquisition.
- VnDataset2: VnDataset1 plus images collected through crowdsourcing.
- VnDataset3: the images of VnDataset2 remaining after the plant organ detection method built in the previous section has removed invalid images.
- VnDataset4: VnDataset3 after manually removing the remaining invalid images.
These training datasets are summarized in Table 4.8. We perform two evaluations, named evaluation 1 and evaluation 2. Evaluation 1 contains 972 images captured by manual acquisition, while evaluation 2 uses 3,163 images, comprising the images of evaluation 1 and further images collected through crowdsourcing.

Table 4.8. Four Vietnamese medicinal species training datasets.

        VnDataset1  VnDataset2  VnDataset3  VnDataset4
Train   3,901       16,513      15,652      15,150

We fine-tune GoogLeNet pretrained on ImageNet; four models are generated, one per dataset (denoted M1 to M4). The results are shown in Table 4.9. The training data plays an important role in the performance of plant identification: the more heterogeneous the training data, the more robust the model. Among the models, M1 outperforms the others on evaluation 1 (81.58% accuracy at rank 1). However, on the images of evaluation 2 its performance decreases dramatically. The other models obtain results somewhat lower than M1 on evaluation 1 but keep high accuracies on evaluation 2; the M1 model is therefore not suitable for data collected through crowdsourcing. Among M2, M3 and M4, the results on both evaluations, ranked from high to low, are M4, M3, M2. This shows the important role of data validation. It is also worth noting that the automatic data validation based on plant organ detection removes a significant portion of the invalid images.

Table 4.9. Results for Vietnamese medicinal plant identification (accuracy, %).

                        M1     M2     M3     M4
Evaluation 1, rank 1    81.58  76.03  78.70  79.63
Evaluation 1, rank 5    90.64  88.48  83.54  84.77
Evaluation 2, rank 1    29.62  56.50  57.73  58.46
Evaluation 2, rank 5    34.62  66.42  67.31  79.48

At the time of writing, a second dataset containing 75,405 images of 596 Vietnamese medicinal plants has been built by applying the proposed framework. Training GoogLeNet on this dataset gives an accuracy of 66.61% at rank 1 and 87.52% at rank 10. The identification model trained on this collected dataset has been integrated in the VnMed application.
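The fine-tuning step of this case study could look like the torchvision sketch below (assumes torchvision >= 0.13); the optimizer settings and the 100-class head are illustrative assumptions rather than the thesis's exact training configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Fine-tune an ImageNet-pretrained GoogLeNet for the 100 medicinal species.
model = models.googlenet(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 100)   # replace the 1000-way head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    """One pass over a loader yielding (image batch, label batch) tensors."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```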
4.4 Conclusion

In this chapter, a framework for automatically building a plant identification system has been proposed. The core step of the framework is data validation with the help of the proposed plant organ detection. We have confirmed the validity of the framework by building the image-based plant retrieval function of the VnMed application. As a result, an image dataset of 596 medicinal plants in Vietnam has been collected and carefully annotated with the help of botanists. Moreover, the identification model trained on this dataset has been integrated in the VnMed application.

CONCLUSIONS AND FUTURE WORKS

Conclusions

This dissertation has made three contributions: (1) a complex-background leaf-based plant identification method; (2) a fusion scheme for two-organ based plant identification; (3) a framework for automatic plant identification without a dedicated dataset, and the application of this framework to the Vietnamese medicinal plant retrieval system.

For plant identification based on complex-background leaf images, we have proposed combining an interactive segmentation method with the improved KDES. To evaluate the robustness of the proposed method, we performed experiments on different datasets. The obtained results show that the combination of improved KDES and interactive image segmentation outperforms the original KDES and various state-of-the-art hand-designed feature-based methods on both the ImageCLEF 2013 and Flavia datasets. On a more challenging dataset such as LifeCLEF 2015, the obtained results remain competitive with deep learning-based methods for leaf images.

Concerning the fusion scheme for multiple-organ based plant identification, a scheme named RHF has been proposed; it determines the result of multi-organ plant identification from the results of single-organ identification. For single-organ identification, two schemes (one CNN for each organ, and one CNN for all organs) have been evaluated with three deep networks: AlexNet, ResNet and GoogLeNet. The obtained results show that the proposed method outperforms the conventional fusion schemes (basic combination and classification-based fusion) and the state-of-the-art MCDCNN method on a 50-species subset of LifeCLEF 2015. The results also confirm that using two organs considerably improves the accuracy compared with one sole organ; among the combinations, leaf and flower is the most robust. With scheme 1 for single-organ identification and AlexNet, the best single-organ performance is 73.0% (flower images), whereas the proposed RHF increases the accuracy of the leaf-flower combination by 16.8%.

When deploying a plant identification system in practice, one issue we have to face is the lack of plant datasets. In this thesis, we have introduced a framework for automatic plant identification comprising four main steps: plant data collection, plant organ detection, data validation and plant identification. The core step of the framework is data validation with the help of the proposed plant organ detection network. We have applied the framework to build the image-based plant retrieval function of the VnMed application. The experiments on a dataset of 100 medicinal plants show that filtering the data collected from different sources is very important and makes the trained models more robust. Based on the proposed framework, an image dataset containing 75,405 images of 596 Vietnamese medicinal plants has been built, and the identification model trained on it has been integrated in the VnMed application.
Future works

In this thesis, we have proposed several improvements for plant identification. However, these improvements contribute only a small portion of the progress needed to deploy automatic plant identification systems in real environments. A number of research questions and ideas emerged while finishing this dissertation; we summarize the selected future works in two categories, short term and long term.

Short term

Evaluate the proposed fusion scheme for more than two organs: the RHF fusion scheme has been proposed for two-organ based plant identification, but it can theoretically be applied to any number of organs. In the near future, we will extend the proposed fusion scheme and evaluate its robustness for other organs.

Deploy a multiple-organ search module for VnMed: in the current deployment, the image-based plant retrieval takes only one image of the plant. We would like to deploy two-organ plant retrieval in a first period and then multiple-organ retrieval in this application. For this, an interface that allows selecting/capturing several images, as well as the fusion scheme, has to be implemented.

Long term

Although several improvements have been obtained for plant identification, the current accuracy is still limited, especially when working with a heterogeneous dataset and a high number of species; for example, the accuracy at rank 1 for 596 Vietnamese medicinal plants is 66.61%. Therefore, the first long-term future work is to improve the identification accuracy along the following directions.

Enrich the image dataset through the use of the system by end-users: we will collect plant images from end-users; these images will be validated by the proposed system, labeled by automatic identification, and eventually verified by experts, and will then be used to enrich the training data. Our experiments have shown that the accuracy of the method increases when the training dataset is enriched.

Design appropriate CNN architectures/loss functions for plant identification: since the focus of this thesis is the fusion scheme, for single-organ identification we simply applied transfer learning on available CNNs. In the future, we would like to investigate and design CNN architectures or loss functions appropriate for the plant identification problem.

Develop multimodal plant identification: images are an important cue to identify plants, but using only images can produce errors. Besides images, experts and botanists rely on different cues (e.g., smell, touch) to secure a high identification accuracy. In the future, we will study and develop multimodal plant identification.

Secondly, we would like to extend this Ph.D. work to other kinds of species in Vietnam, such as economically and ecologically important species. Finally, in order to spread the knowledge of our flora to the wide public, especially the next generation, we aim to develop game-based and augmented reality-based applications using the plant identification results.
Bibliography

[1] Goëau H., Bonnet P., and Joly A. (September 2015). LifeCLEF plant identification task 2015. In CLEF: Conference and Labs of the Evaluation Forum, volume 1391 of CLEF2015 Working Notes, CEUR-WS, Toulouse, France.
[2] Goëau H., Bonnet P., and Joly A. (2017). Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017). CLEF Working Notes, 2017.
[3] Govaerts R. (2001). How many species of seed plants are there? Taxon, 50(4), pp. 1085-1090.
[4] Wäldchen J. and Mäder P. (2018). Plant species identification using computer vision techniques: A systematic literature review. Archives of Computational Methods in Engineering, 25(2), pp. 507-543.
[5] Goëau H., Joly A., Bonnet P., Selmi S., Molino J.F., Barthélémy D., and Boujemaa N. (2014). LifeCLEF plant identification task 2014. In CLEF2014 Working Notes, Sheffield, UK, September 15-18, 2014, pp. 598-615.
[6] Nilsback M.E. and Zisserman A. (2009). An automatic visual flora: segmentation and classification of flower images. Ph.D. thesis, Oxford University.
[7] Aakif A. and Khan M.F. (2015). Automatic classification of plants based on their leaves. Biosystems Engineering, 139, pp. 66-75.
[8] Yoo H.J. (2015). Deep convolution neural networks in computer vision. IEIE Transactions on Smart Processing & Computing, 4(1), pp. 35-43.
[9] Joly A., Goëau H., Bonnet P., Bakic V., Barbe J., Selmi S., Yahiaoui I., Carré J., Mouysset E., Molino J.F., et al. (2014). Interactive plant identification based on social image data. Ecological Informatics, 23, pp. 22-34.
[10] http://www.inaturalist.org/ (retrieved 15 January 2017).
[11] Meyer F. and Beucher S. (1990). Morphological segmentation. Journal of Visual Communication and Image Representation, 1(1), pp. 21-46.
[12] Bo L., Ren X., and Fox D. (2010). Kernel descriptors for visual recognition. In Advances in Neural Information Processing Systems, pp. 244-252.
[13] Nguyen V.T. (2015). Visual interpretation of hand postures for human-machine interaction. Ph.D. thesis, Université de La Rochelle.
[14] Chaki J., Parekh R., and Bhattacharya S. (2015). Recognition of whole and deformed plant leaves using statistical shape features and neuro-fuzzy classifier. In 2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS), pp. 189-194.
[15] Chaki J., Parekh R., and Bhattacharya S. (2015). Plant leaf recognition using texture and shape features with neural classifiers. Pattern Recognition Letters, 58, pp. 61-68.
[16] Wang Z., Sun X., Ma Y., Zhang H., Ma Y., Xie W., and Zhang Y. (2014). Plant recognition based on intersecting cortical model. In 2014 International Joint Conference on Neural Networks (IJCNN), pp. 975-980.
[17] Kheirkhah F.M. and Asghari H. (2018). Plant leaf classification using GIST texture features. IET Computer Vision, 13(4), pp. 369-375.
[18] Tsolakidis D.G., Kosmopoulos D.I., and Papadourakis G. (2014). Plant leaf recognition using Zernike moments and histogram of oriented gradients. In Hellenic Conference on Artificial Intelligence, pp. 406-417. Springer.
[19] Du J.X., Zhai C.M., and Wang Q.P. (2013). Recognition of plant leaf image based on fractal dimension features. Neurocomputing, 116, pp. 150-156.
[20] Priya C.A., Balasaravanan T., and Thanamani A.S. (2012). An efficient leaf recognition algorithm for plant classification using support vector machine. In International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME-2012), pp. 428-432. IEEE.
[21] Krizhevsky A., Sutskever I., and Hinton G.E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097-1105.
[22] He K., Zhang X., Ren S., and Sun J. (2015). Deep residual learning for image recognition. CoRR, abs/1512.03385.
[23] Szegedy C., Liu W., Jia Y., Sermanet P., Reed S., Anguelov D., Erhan D., Vanhoucke V., and Rabinovich A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9.
[24] He A. and Tian X. (2016). Multi-organ plant identification with multi-column deep convolutional neural networks. In 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2020-2025.

PUBLICATIONS

[1] Thi-Lan Le, Duong-Nam Duong, Van-Toi Nguyen, Hai Vu, Van-Nam Hoang and Thi Thanh-Nhan Nguyen (2015). Complex Background Leaf-based Plant Identification Method Based on Interactive Segmentation and Kernel Descriptor. Proceedings of the 2nd International Workshop on Environmental Multimedia Retrieval, ISBN: 978-1-4503-3274-3, pp. 3-8.
[2] Thi-Lan Le, Duong-Nam Duong, Hai Vu and Thanh-Nhan Nguyen (2015). MICA at LifeCLEF 2015: Multi-organ plant identification. In CEUR-WS.org/Vol-1391, CLEF2015 Working Note Proceedings, ISSN: 1613-0073, vol. 1391.
[3] Thi Thanh Nhan Nguyen, Van Tuan Le, Thi Lan Le, Hai Vu, Natapon Pantuwong and Yasushi Yagi (2016). Flower species identification using deep convolutional neural networks. AUN/SEED-Net Regional Conference on Computer and Information Engineering 2016, Yangon, Myanmar, ISBN: 978-99971-0-231-7, pp. 51-56.
[4] Thi Thanh-Nhan Nguyen, Thi-Lan Le, Hai Vu, Huy-Hoang Nguyen and Van-Sam Hoang (2017). A Combination of Deep Learning and Hand-Designed Feature for Plant Identification Based on Leaf and Flower. In Asian Conference on Intelligent Information and Database Systems, Studies in Computational Intelligence, volume 710, Springer, ISBN: 978-3-319-56659-7, pp. 223-233.
[5] Nguyen Thi Thanh Nhan, Do Thanh Binh, Nguyen Huy Hoang, Vu Hai, Tran Thi Thanh Hai, Thi-Lan Le (2018). Score-based Fusion Schemes for Plant Identification from Multi-organ Images. VNU Journal of Science: Computer Science and Communication Engineering, Vol. 34, No. 2, ISSN 2588-1086, pp. 1-15.
[6] Thi Thanh Nhan Nguyen, Thi-Lan Le, Hai Vu, Van-Sam Hoang, Thanh-Hai Tran (2018). Crowdsourcing for botanical data collection towards automatic plant identification: A review. Computers and Electronics in Agriculture (SCIE), vol. 155, ISSN: 0168-1699, pp. 412-425.
[7] Nguyen Thi Thanh Nhan, Le Thi Lan, Vu Hai, Hoang Van Sam (2018). Automatic Plant Organ Detection from Images using Convolutional Neural Networks. Journal of Research and Development on Information and Communication Technology (in Vietnamese), vol. V-1, No. 39, ISSN: 1859-3526, pp. 17-25.
[8] Thi Thanh-Nhan Nguyen, Thi-Lan Le, Hai Vu, Van-Sam Hoang (2019). Towards an automatic plant identification system without dedicated dataset. International Journal of Machine Learning and Computing (Scopus), vol. 9, No. 1, ISSN: 2010-3700, pp. 26-34.
