Mobile Netw Appl (2014) 19:618–625
DOI 10.1007/s11036-014-0526-7

Content-Based Image Retrieval Using Moments of Local Ternary Pattern

Prashant Srivastava & Nguyen Thanh Binh & Ashish Khare

Published online: 18 July 2014
© Springer Science+Business Media New York 2014

Abstract Due to the availability of a large number of digital images, an efficient content-based indexing and retrieval method is required. The emergence of smartphones and modern PDAs has further substantiated the need for such systems. This paper proposes a combination of Local Ternary Pattern (LTP) and moments for Content-Based Image Retrieval. The image is divided into blocks of equal size and the LTP codes of each block are computed. Geometric moments of the LTP codes of each block are then computed, followed by the distance between the moments of LTP codes of the query and database images. A threshold on the distance values is then applied to retrieve images similar to the query image. The performance of the proposed method is compared with other state-of-the-art methods on the Corel-1,000 database. The comparison shows that the proposed method gives better precision and recall than the other state-of-the-art image retrieval methods.

Keywords Image retrieval · Content-based image retrieval · Local ternary pattern · Geometric moments

P. Srivastava · A. Khare (*)
Department of Electronics and Communication, University of Allahabad, Allahabad, Uttar Pradesh, India
e-mail: ashishkhare@hotmail.com

A. Khare
e-mail: khare@allduniv.ac.in

P. Srivastava
e-mail: prashant.jk087@gmail.com

N. T. Binh
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Ho Chi Minh, Vietnam
e-mail: ntbinh@cse.hcmut.edu.vn

1 Introduction

With the advent of numerous digital image libraries containing huge numbers of images of different types, it has become necessary to develop systems capable of efficient browsing and retrieval of images.
Also, with the emergence of mobile phones and smartphones, the number of images is increasing day by day. Pure text-based image retrieval systems are prevalent but are unable to retrieve visually similar images. It is also practically difficult to annotate a large number of images manually. Hence, a pure text-based approach is insufficient for image retrieval. Content-Based Image Retrieval (CBIR), the retrieval of images on the basis of features present in the image, is an important problem of Computer Vision. Instead of keywords and text, content-based image retrieval uses visual features such as colour, texture and shape to search for an image in a large database [1,2]. These features form a feature set which acts as an indexing scheme for searching an image database. The feature sets of query images are compared with those of database images to retrieve visually similar images. Since retrieval is based on the contents of the image, arranging and classifying images is easier, as it does not require manual annotation. Automatic classification groups similar images together and makes them easier for users to access.

Early image retrieval systems were based on primitive features such as colour, texture and shape. The field has witnessed substantial work on the colour feature. Colour is a visible property of an object and a powerful descriptor of it. Colour-based CBIR systems use the conventional colour histogram to perform retrieval. Texture is another feature that has been used extensively for image retrieval. Texture features represent the structural arrangement of a region and describe characteristics such as its smoothness, coarseness and roughness. One such feature is the Local Binary Pattern (LBP) [3], which is applied to gray-level images. LBP is a very powerful descriptor, as it is easy to compute and is invariant to gray-level transformations. However, being based on the bit values 0 and 1, the LBP operator
fails to discriminate between multiple patterns. The LBP operator is also highly sensitive to noise in the image. Tan and Triggs [4] extended LBP to the Local Ternary Pattern (LTP). LTP thresholds neighbourhood pixels to three values and is less sensitive to noise than LBP. However, LTP is not invariant to gray-level transformations.

Content-based retrieval methods based on the shape feature have also been used extensively. Shape here does not mean the shape of the whole image but the shape of a particular object or region in the image. Shape features generally act as global features: they consider the whole image when extracting features, but they do not consider local variations in the image. Unlike colour and texture, shape features are generally used after objects have been segmented from the image [5]. Since segmentation is a difficult problem, shape features have not been exploited much. Shape is nevertheless considered a powerful descriptor.

A single feature is insufficient to construct the efficient feature vector that is essential for efficient image retrieval. Combining more than one feature attempts to solve this problem. Combinations of colour and texture [6], colour and shape [7], and colour, texture and shape [8] have been widely used for this purpose. Modern image retrieval methods combine local and global features of an image, exploiting the advantages of both. This property has motivated us to combine the local feature LTP with the global feature moments.

This paper combines LTP and moments in the form of moments of LTP. Grayscale images are divided into blocks of equal size and the LTP codes of each block are computed. Geometric moments of these LTP codes are then computed to form the feature vector. The Euclidean distance between blocks of the query image and the database images is computed to measure similarity, followed by computation of threshold values to
find images similar to the query image.

The rest of the paper is organized as follows. Section 2 discusses some of the related work in the field of image retrieval. Section 3 describes the fundamentals of LTP and image moments along with their properties. Section 4 presents the proposed method. Section 5 discusses the experimental results and Section 6 concludes the paper.

2 Related work

Over the past few decades, the field of image retrieval has witnessed a number of approaches to improve retrieval performance. Text-based approaches are still in use and almost all web search engines follow this approach. Early CBIR systems were based on colour features; later, colour-based techniques adopted colour histograms. Texture features then caught the attention of researchers and were used extensively for image retrieval. Texture features such as LBP and LTP are considered powerful descriptive features and have been used for various applications. Pietikäinen et al. [9] proposed a block-based method for image retrieval using LBP. Murala et al. [10] proposed two new features based on the concept of LBP, namely Local Tetra Patterns (LTrP) and Directional Local Extrema Pattern (DLEP) [11], as features for image retrieval. Liu et al. [12] proposed the Multi-texton Histogram (MTH), which is considered an improvement of the Texton Co-occurrence Matrix (TCM) [13] and works for natural images. The Microstructure Descriptor (MSD) is described in [14]. This feature computes local features by identifying colours that have similar edge orientations.

Shape has also been exploited, both as a single feature and in combination with other features. Zhang et al. [15] proposed a region-based shape descriptor, the Generic Fourier Descriptor (GFD). A two-dimensional Fourier transform was applied to a polar-raster sampled shape image to extract the GFD, which was then applied to the image to determine the shape of the object.
Lin et al. [16] proposed a rotation-, translation- and scale-invariant method for shape identification which is also applicable to objects with a modest level of deformation. Yoo et al. [17] proposed a histogram of edge directions, called edge angles, to perform shape-based retrieval. Srivastava et al. [18] used moments for CBIR. Their method divided images into blocks and computed the geometric moments of each block. The Euclidean distance between blocks of the query image and the database image was computed, followed by computation of a threshold to retrieve visually similar images.

However, these features have been exploited as single features, which are not sufficient for constructing a powerful feature vector. The combination of two or more features therefore emerged as a promising direction in image retrieval, as it combines the advantages of all the features involved. Yu et al. [19] proposed combining SIFT, LBP and HOG descriptors in a bag-of-features model in order to exploit both local and global features of the image. The combination of wavelets with other features has also been exploited for image retrieval. A combination of Gabor filter and Zernike moments was proposed in [20]: the Gabor filter performs texture extraction while the Zernike moments perform shape extraction. This method has been applied to face recognition, fingerprint recognition and shape recognition. Wavelets have also been used with colour, as the wavelet correlogram, in [21]. Wavelets offer the powerful characteristic of multiresolution analysis, and it is because of this property that they have been used extensively for image retrieval. The combination of the à trous wavelet with the microstructure descriptor (MSD), as the à trous gradient structure descriptor, was proposed in [22]. Wang et al. [8] incorporated colour, texture and shape features for image retrieval. The colour feature is exploited using fast colour quantization, texture features are extracted using filter decomposition and, finally, shape features have been exploited using
pseudo-Zernike moments. Li et al. [23] proposed the use of both the phase and the magnitude of Zernike moments for image retrieval. Deselaers et al. [24] compared a number of features for image retrieval on different databases.

3 Features used and their properties

3.1 Local ternary patterns

Local Ternary Pattern (LTP) is an extension of Local Binary Pattern (LBP). Whereas the LBP operator thresholds a pixel to the 2-valued codes 0 and 1, LTP thresholds a pixel to 3-valued codes. Gray levels in a zone of width ±t around the centre pixel c are quantized to 0, those above this zone are quantized to +1 and those below it are quantized to −1. That is,

\[
LTP(p, c, t) =
\begin{cases}
1, & p \ge c + t \\
0, & |p - c| < t \\
-1, & p \le c - t
\end{cases}
\tag{1}
\]

where p is the gray level of a neighbouring pixel, c is the gray level of the centre pixel and t is a user-specified threshold. In order to eliminate negative values, the LTP values are divided into two channels, the upper LTP (ULTP) and the lower LTP (LLTP). The ULTP is obtained by replacing the negative values by 0. The two channels of LTP are treated as separate entities, for which separate histograms and similarity metrics are computed, and the results are combined only at the end. An example of LTP computation (with t = 5) is shown in Fig. 1.

3.2 Properties of LTP

LTP holds the following important properties:
1. LTPs are less sensitive to noise than LBP.
2. LTP is not invariant to gray-level transformations.

3.3 Moments

A moment is a measure of the shape of an object. Image moments are useful for describing objects after segmentation. Image moments and various types of moment-based invariants play an important role in object recognition and shape analysis. The (p+q)th order geometric moment M_pq of a gray-level image f(x,y) is defined as

\[
M_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^{p} y^{q} f(x, y) \, dx \, dy
\tag{2}
\]

In the discrete case [25], the integral in Eq. (2) reduces to a summation and Eq. (2) becomes

\[
M_{pq} = \sum_{x=1}^{n} \sum_{y=1}^{m} x^{p} y^{q} f(x, y)
\tag{3}
\]

where n × m is the size of the gray-level image f(x,y). Simple properties of an image which are found via image moments include its area, its centroid and information about the
orientation. Moment features are invariant to geometric transformations. Such features are useful for identifying objects with unique shapes regardless of their size and orientation. Being invariant under linear coordinate transformations, moment invariants are useful features in pattern recognition problems. Moments have been used for distinguishing between the shapes of different aircraft, for character recognition and for scene matching.

The following properties of image moments are very useful in image retrieval:
1. Moment features are invariant to geometric transformations.
2. Moment features provide enough discrimination power to distinguish among objects of different shapes.
3. Moment features provide efficient local descriptors for identifying the shape of objects.
4. An infinite sequence of moments uniquely identifies an object.

3.4 Local ternary patterns and moments

A single feature fails to capture the complete information of an image. A combination of features is required to incorporate the fine details of an image when constructing the feature vector. Combining local and global features is one such approach. Local features help in capturing local variations, whereas global features capture a holistic view of the image, so the approach combines the advantages of both. The combination of LTP and moments fulfils these criteria. LTP, a local feature, captures texture details and acts as a powerful classifier. Moment, a global feature, determines the shape of an object in the image and is invariant to geometric transformations. The advantages of this combination are summarized as follows:
1. LTP is less sensitive to noise than LBP, and hence the combination of LTP with moments is less affected by the presence of noise.
2. The use of geometric moments as a single feature creates numerical instabilities, as higher-order moments take very high values [26]. But the combination of LTP and moments overcomes this
disadvantage, as the moment values of LTP are not very high.
3. Geometric moments are invariant to geometric transformations. Hence their combination with LTP brings this advantage to the LTP-Moment feature vector.

Fig. 1 Computation of LTP

4 The proposed method

The proposed method consists of three steps. The first step divides the image into blocks and computes the LTP codes of each block. In the second step, we compute geometric moments of the LTP codes of the query image and the database images. In the third step, a threshold is computed to perform retrieval. The schematic diagram of the proposed method is shown in Fig. 2.

4.1 Computation of LTP codes

The algorithm for computation of LTP codes is as follows:
1. Convert the image into grayscale.
2. Rescale the image to 252 × 252.
3. Divide the image into blocks of 84 × 84 and compute the LTP codes of each block.
Computation of LTP yields two values: upper LTP (ULTP) and lower LTP (LLTP).

4.2 Computation of moments

Geometric moments of the ULTP and LLTP codes are computed using Eq. (3). The sequence of moments chosen here is up to 15. The moment values of ULTP and LLTP are computed separately.

4.3 Distance measurement

Let the moments of LTP codes for the different blocks of the query image be represented as $m_Q = (m_{Q_1}, m_{Q_2}, \ldots, m_{Q_n})$. Let the moments of LTP codes for the different blocks of a database image be represented as $m_{DB} = (m_{DB_1}, m_{DB_2}, \ldots, m_{DB_n})$. Then the Euclidean distance between the block LTP moments of the query and the database image is given as

\[
D(m_Q, m_{DB}) = \sqrt{\sum_{i=1}^{n} \left( m_{Q_i} - m_{DB_i} \right)^2}
\tag{4}
\]

4.4 Computation of threshold

A threshold is used to perform retrieval. Using a threshold improves the retrieval results compared with those obtained without one. The basic idea behind threshold computation is to find the range of distance values which returns images similar to the query image. The Euclidean distance values computed using Eq. (4) are sorted in ascending order so that images are
arranged according to similarity to the query image, that is, the most similar image first and the others after it. The index of similar images is stored along with their distance values to identify the minimum and maximum values of the range. This determines the range of similarity to a query image. This procedure is repeated for every image of the database to find the range of similarity. Finally, the minimum and maximum over all ranges of values are determined. These values determine the threshold of the entire category of similar images. This is done for all categories of images in the database. The threshold values for upper LTP and lower LTP are computed separately.

To compute the threshold, let
(i) N be the total number of relevant images in the database and NDB be the total number of images in the database,
(ii) sortmat be the sorted matrix (ascending order) of distance values and minix be the first N indices of images in the sortmat matrix,
(iii) start_range and end_range be the range of relevant images in the database,
(iv) maxthreshold and minthreshold be, respectively, the maximum and minimum distance values of each query image,
(v) mthreshmat be the maximum of all the values of maxthreshold.
The algorithm to compute the threshold operates on these quantities.

Fig. 2 Schematic diagram of the proposed method

5 Experiment and results

To evaluate the proposed method, images from the Corel-1K database [27] have been used. The images in this database are classified into ten categories, namely Africans, Beaches, Buildings, Buses, Dinosaurs, Elephants, Flowers, Horses, Mountains and Food. Each image is of size either 256 × 384 or 384 × 256, and each category consists of 100 images. Each image has been rescaled to 252 × 252 to ease the computation. Sample images from each category are shown in Fig. 3. Each image of this database is taken as a query image. If the retrieved images belong to the same category as the query image, the retrieval is considered to be
successful; otherwise the retrieval fails.

Fig. 3 Sample images from Corel-1,000 database

5.1 Performance evaluation

The performance of the proposed method has been measured in terms of precision and recall. Precision is defined as the ratio of the total number of relevant images retrieved to the total number of images retrieved. Mathematically, precision can be formulated as

\[
P = \frac{I_R}{T_R}
\tag{5}
\]

where $I_R$ denotes the total number of relevant images retrieved and $T_R$ denotes the total number of images retrieved. Recall is defined as the ratio of the total number of relevant images retrieved to the total number of relevant images in the database. Mathematically, recall can be formulated as

\[
R = \frac{I_R}{C_R}
\tag{6}
\]

where $I_R$ denotes the total number of relevant images retrieved and $C_R$ denotes the total number of relevant images in the database. In this experiment, $T_R = 10$ and $C_R = 100$.

5.2 Retrieval results

For the experiments, each image is divided into blocks of size 84 × 84. The Local Ternary Pattern codes of each block are computed, followed by the geometric moments of the LTP codes. The distance between the block moments of the query image and the database image is determined. Retrieval is then performed using the threshold obtained from the threshold algorithm.

Table 1 Average precision and recall values for each category of image

Category     Precision (%)   Recall (%)
Africans     41.50           66.12
Beaches      33.70           51.82
Buildings    33.40           66.60
Buses        54.80           74.80
Dinosaurs    94.50           91.82
Elephants    42.50           74.56
Flowers      87.60           85.88
Horses       79.30           83.46
Mountains    27.90           53.42
Food         41.80           72.20
Average      53.70           72.09

Table 2 Comparison of the proposed method with other methods

Methods                     Precision (%)
Block-based LBP [9]         23.00
CBIR using moments [18]     35.94
Gabor histogram [24]        41.30
Image-based HOG-LBP [19]    46.00
LF SIFT histogram [24]      48.20
Color histogram [24]        50.50
The proposed method         53.70

The computation of local ternary pattern yields two values, namely upper LTP and lower LTP. These two values are treated as separate entities
of LTP codes. Separate moment distance and threshold values are computed, which are subsequently combined at the end of the threshold computation. After the distances for the two sets of moment values have been computed, the threshold is computed for the purpose of retrieval. This produces two sets of similar images. The union of these two sets is taken to produce the final set of similar images. Recall is computed by counting the total number of relevant images in the final set. Similarly, for precision, the top n matches of each image set are taken and the union of these two sets produces the final set. Mathematically, this can be formulated as follows. Let $f_{ULTP}$ be the set of similar images obtained from moments of upper LTP codes and $f_{LLTP}$ be the set of similar images obtained from moments of lower LTP codes. Then the final set of similar images, denoted by $f_{RS}$, is given by

\[
f_{RS} = f_{ULTP} \cup f_{LLTP}
\tag{7}
\]

Similarly, let $f^{n}_{ULTP}$ and $f^{n}_{LLTP}$ be the sets of top n images obtained from moments of upper LTP codes and moments of lower LTP codes respectively. Then the final set of top n images, denoted by $f^{n}_{PS}$, is given as

\[
f^{n}_{PS} = f^{n}_{ULTP} \cup f^{n}_{LLTP}
\tag{8}
\]

Fig. 4 (a) Precision vs. category plot (b) Recall vs. category plot

Fig. 5 Comparison of the proposed method (PM) with other methods in terms of average precision

Retrieval is considered to be good if the values of precision and recall are high. Table 1 shows the performance of the proposed method for each category of image in the database in terms of precision and recall. Fig. 4 shows the plots of precision and recall values for the different image categories.

The proposed method is compared with other state-of-the-art methods such as the Block-based LBP method [9], Image-based HOG-LBP [19] and LF SIFT Histogram [24]. Table 2 shows the performance comparison of the proposed method with other methods in terms of average precision. Fig. 5 shows the plot of precision against methods. The values of precision and recall were computed on the same Corel-1K image database. From Table 2
and Fig. 5 it can be observed that the proposed method outperforms, in terms of average precision, Block-based LBP [9] by 30.70 percentage points, CBIR using Moments [18] by 17.76, Gabor Histogram [24] by 12.4, Image-based HOG-LBP [19] by 7.7, LF SIFT Histogram [24] by 5.5 and Color Histogram [24] by 3.2 percentage points.

6 Conclusion

In this paper, we have presented the combination of LTP and moments. The Local Ternary Pattern codes of blocks of a gray-level image are computed, and the geometric moments of the resulting LTP codes are then computed. The method then computes the distance between blocks of the query and database images, and finally retrieval is performed on the basis of a threshold. This method combines the low noise sensitivity of LTP with the invariance of moments to geometric transformations. It also exploits the advantages of fusing local and global features of an image. The performance of the proposed method was measured in terms of precision and recall. The experimental results showed that the proposed method outperformed the other state-of-the-art methods considered. The results of the proposed method can be further improved by dividing the moments into a larger number of sequences.

References

1. Long H, Zhang H, Feng DD (2003) Fundamentals of content-based image retrieval. Multimedia information retrieval and management. Springer, Berlin, Heidelberg, pp 1–26
2. Rui Y, Huang TS, Chang S (1999) Image retrieval: current techniques, promising directions, and open issues. J Vis Commun Image Represent 10:39–62
3. Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
4. Tan X, Triggs B (2010) Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans Image Process 19(6):1635–1650
5. Khare M, Srivastava RK, Khare A (2013) Moving object segmentation in Daubechies complex wavelet domain. Signal, Image and Video Processing. Accepted, doi: 10.1007/s11760-013-0496-4,
Springer
6. Wang X, Zhang B, Yang H (2002) Content-based image retrieval by integrating color and texture features. Multimedia Tools Appl 1–25
7. Gevers T, Smeulders AW (2000) Pictoseek: combining color and shape invariant features for image retrieval. IEEE Trans Image Process 33(1):102–119
8. Wang X, Yu Y, Yang H (2011) An effective image retrieval scheme using color, texture and shape features. Comput Stand Interfaces 33(1):59–68
9. Pietikäinen M, Takala V, Ahonen T (2005) Block-based methods for image retrieval using local binary patterns. 14th Scandinavian Conference on Image Analysis 882–891
10. Murala S, Maheshwari RP, Balasubramanian R (2012) Local tetra patterns: a new descriptor for content-based image retrieval. IEEE Trans Image Process 21(5):2874–2886
11. Murala S, Maheshwari RP, Balasubramanian R (2012) Directional local extrema patterns: a new descriptor for content-based image retrieval. Int J Multimedia Inf Retrieval 1(3):191–203
12. Liu G, Zhang L, Hou Y, Yang J (2008) Image retrieval based on multi-texton histogram. Pattern Recogn 43(7):2380–2389
13. Liu G, Yang Y (2008) Image retrieval based on texton co-occurrence matrix. Pattern Recogn 41(12):3521–3527
14. Liu G, Li Z, Zhang L, Xu Y (2011) Image retrieval based on microstructure descriptor. Pattern Recogn doi:10.1016/j.patcog.2011.02.003
15. Zhang D, Lu G (2002) Shape-based image retrieval using generic Fourier descriptor. Signal Process-Image Commun 17(10):825–848
16. Lin H, Kao Y, Yen S, Wang C (2004) A study of shape-based image retrieval. In Proc 24th International Conference on Distributed Computing Workshops 118–123
17. Yoo H, Jang D, Jung S, Park J, Song K (2002) Visual information retrieval via content-based approach. J Pattern Recognit Soc 35:749–769
18. Srivastava P, Binh NT, Khare A (2013) Content-based image retrieval using moments. In Proc 2nd International Conference on Context-Aware Systems and Applications 228–237
19. Yu J, Qin Z, Wan T, Zhang X (2013) Feature integration analysis of bag-of-features model for
image retrieval. Neurocomputing 120:355–364
20. Fu X, Li Y, Harrison R, Belkasim S (2006) Content-based image retrieval using Gabor-Zernike features. 18th International Conference on Pattern Recognition, Hong Kong 2:417–420
21. Moghaddam HA, Khajoie TT, Rouhi AH, Tarzjan MS (2005) Wavelet correlogram: a new approach for image indexing and retrieval. Pattern Recogn 38:2506–2518
22. Agarwal M, Maheshwari RP (2012) À trous gradient structure descriptor for content based image retrieval. Int J Multimedia Inf Retr 1(2):129–138
23. Li S, Lee MC, Pun CM (2009) Complex Zernike moments shape-based image retrieval. IEEE Trans Syst Man Cybern Part A: Syst Hum 39(1):227–237
24. Deselaers T, Keysers D, Ney H (2008) Features for image retrieval: an experimental comparison. Inf Retr 11:77–107
25. Flusser J (2005) Moment invariants in image analysis. Enformatika 11
26. Kotoulas L, Andreadis I (2005) Image analysis using moments. 5th International Conference on Technology and Automation, Thessaloniki, Greece 360–364
27. http://wang.ist.psu.edu/docs/related/
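
As an illustration of the feature extraction described above (LTP codes per block, geometric moments of the codes as in Eq. (3), and the Euclidean distance of Eq. (4)), the following is a minimal pure-Python sketch. It is not the authors' implementation. Several details are assumptions made only for this example: the function names, the clockwise bit order of the eight neighbours, the encoding of the lower channel (a bit is set where the ternary value is −1), and the particular moment orders and block size passed in by the caller.

```python
import math

# Eight neighbours, clockwise from the top-left pixel.
# The bit order is an assumption of this sketch, not taken from the paper.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def ltp_channels(block, t=5):
    """Return (upper, lower) LTP code maps for the interior pixels of a block."""
    h, w = len(block), len(block[0])
    upper = [[0] * (w - 2) for _ in range(h - 2)]
    lower = [[0] * (w - 2) for _ in range(h - 2)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = block[r][c]
            for bit, (dr, dc) in enumerate(OFFSETS):
                p = block[r + dr][c + dc]
                if p >= centre + t:        # ternary value +1 -> upper channel
                    upper[r - 1][c - 1] |= 1 << bit
                elif p <= centre - t:      # ternary value -1 -> lower channel
                    lower[r - 1][c - 1] |= 1 << bit
    return upper, lower


def geometric_moment(f, p, q):
    """Discrete geometric moment M_pq with 1-based coordinates, as in Eq. (3)."""
    return sum(
        (x ** p) * (y ** q) * value
        for x, row in enumerate(f, start=1)
        for y, value in enumerate(row, start=1)
    )


def block_moments(gray, block, orders, t=5):
    """Moments of the upper and lower LTP codes for every block of an image."""
    up_feats, lo_feats = [], []
    for r in range(0, len(gray) - block + 1, block):
        for c in range(0, len(gray[0]) - block + 1, block):
            tile = [row[c:c + block] for row in gray[r:r + block]]
            up, lo = ltp_channels(tile, t)
            up_feats.append([geometric_moment(up, p, q) for p, q in orders])
            lo_feats.append([geometric_moment(lo, p, q) for p, q in orders])
    return up_feats, lo_feats


def euclidean_distance(m_q, m_db):
    """Euclidean distance between two moment vectors, as in Eq. (4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m_q, m_db)))
```

For a 252 × 252 grayscale image stored as a list of rows, `block_moments(gray, 84, orders)` would yield nine moment vectors per channel; the two channels are kept separate, in line with the text, so that distances and thresholds can be computed for each before the result sets are merged. The threshold step is omitted here because its algorithm is not reproduced in the text.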