3 Proposal and Results

Figure 3.10: Accuracy vs. the number of selected histograms on the STex dataset by the sparse score in supervised and unsupervised selection contexts in four different color spaces (RGB (a); HSV (b); I1I2I3 (c); YCbCr (d)). [figure panels omitted]
Unsupervised LBP histogram selection for color texture classification via sparse representation
Vinh Truong Hoang
Ho Chi Minh City Open University
97 Vo Van Tan Street, District 3, Ho Chi Minh City, Vietnam. E-mail: vinh.th@ou.edu.vn
Abstract-In recent years, LBP and its variants have led to significant progress in applying texture methods to different applications. However, this operator tends to produce high-dimensional feature vectors, especially when the number of considered neighboring pixels increases or when it is applied to color images. Various approaches have been proposed to obtain more discriminative, robust LBP features with reduced feature dimensionality. LBP histogram selection is a method to reduce the number of histograms used to characterize a color image. In this paper, we propose to construct the sparse similarity matrix in an unsupervised way for LBP histogram selection.
Keywords-color LBP, texture classification, sparse representation, histogram selection, unsupervised sparse score
I. INTRODUCTION
Data with high dimensionality decreases the performance of the learning process due to the curse of dimensionality and the existence of irrelevant, redundant and noisy features [1]. Processing and storing such amounts of high-dimensional data becomes a challenge. It is necessary to choose a small subset of relevant features from the original ones in order to reduce both the computing time and the memory needed for data storage. In order to solve the class discrimination problem in color texture classification, feature selection is applied by using the class labels to identify a subset of the most discriminative features. In the last decade, various discriminative and computationally efficient local and global texture descriptors have been introduced, which has led to significant progress in the analysis of color texture for many computer vision problems. Among them, the LBP operator is a good candidate as a local image texture descriptor due to its low computational complexity, ease of implementation and invariance to monotonic illumination changes [2]. Despite its wide applications, LBP features still have some limitations, since this operator results in a high-dimensional feature vector. Thus, a dimensionality reduction method for LBP is needed to address this problem. Various approaches have been proposed to obtain more discriminative, robust LBP features with reduced feature dimensionality.
We classify the LBP-feature dimensionality reduction techniques into two strategies: (1) the first one reduces the feature length based on some rules or the predefinition of patterns of interest (like uniform patterns), and (2) the second one exploits feature selection methods to identify the discriminative patterns, with similar motivations as the beam-search LBP variants [3]. Smith and Windeatt apply the Fast Correlation-Based Filtering (FCBF) algorithm [4] to select the LBP patterns that are the most correlated to the target class [5]. Lahdenoja et al. define a discrimination concept of symmetry for uniform patterns to reduce the feature dimensionality [6]. Maturana et al. use a heuristic algorithm to select the neighbors used in the computation of LBP [7].
In terms of availability of supervised information, feature selection techniques can be roughly classified into three groups: supervised, unsupervised and semi-supervised methods [8]. Based on the different evaluation strategies, feature selection can also be classified into three groups: filter, wrapper and hybrid methods [1]. Similarly, histogram selection approaches can be grouped into the same categories. Filter approaches consist in computing a score for each histogram in order to measure its efficiency; the histograms are then ranked according to the proposed score. In wrapper approaches, histograms are evaluated thanks to a specific classifier, and the selected ones are those which maximize the classification rate. Hybrid approaches combine the reduced processing time of a filter approach with the high performance of a wrapper approach.
Sparse representation has received a great deal of attention in computer vision in recent years, especially in image representation. It has many effective applications such as image compression and coding [9], pattern recognition, and image and signal processing [10]. Recently, Qiao et al. presented a new method to design the similarity matrix based on a modified sparse representation [11]. This matrix allows us to determine a soft similarity value between 0 and 1 for two corresponding images. This soft value can reflect the intrinsic geometric properties of classes, which may lead to natural discriminating information. Inspired by the approach proposed by Porebski, Kalakech et al. propose to adapt the well-known supervised Laplacian score for feature ranking and selection, to select and rank histograms in the supervised context [12]. This score, namely the Adapted Supervised Laplacian (ASL) score, evaluates the relevance of a histogram using the local properties of the image data, based on the Jeffrey distance and a similarity matrix deduced from the class labels. This similarity is a hard value, either 0 or 1. A value between 0 and 1 would instead measure the similarity in a more subtle way, which may lead to more powerful discriminating information: the soft similarity obtained by the sparse representation can better reflect the geometric structure of the different classes. Instead of using the hard values 1 or 0, V. T. Hoang et al. propose the Sparse Adapted Supervised Laplacian (SpASL) score, which constructs the sparse similarity matrix based on the sparse representation in a supervised way. In this paper, we extend this idea to the unsupervised context, in order to gain the same performance as the supervised context without using any training label data.
The rest of this paper is organized as follows. Section II introduces the definition of LBP and its extension to color. The proposed histogram selection score is introduced in Section III. We then present the experimental results on several color texture databases in Section IV, before the perspectives and conclusion.
the ith image histogram among the N color images. Porebski et al. first proposed an approach which selects the most discriminant whole LBP histograms [17]. In this approach, the most discriminant LBP histograms are selected in their entirety from among the different LBP histograms extracted from a color texture.
II. LBP HISTOGRAM SELECTION
A. Color LBP
The definition of the original LBP operator has been generalized to explore the intensity values of points on a circular neighborhood, defined by considering a radius R and P neighbors around the central pixel. The LBP_{P,R}(x_c, y_c) code of each pixel (x_c, y_c) is computed by comparing the gray value g_c of the central pixel with the gray values {g_i}_{i=0}^{P-1} of its P neighbors, as follows:

LBP_{P,R}(x_c, y_c) = \sum_{i=0}^{P-1} \Phi(g_i - g_c) \times 2^i    (1)

where \Phi is the threshold function, defined as:

\Phi(g_i - g_c) = \begin{cases} 1 & \text{if } (g_i - g_c) \geq 0, \\ 0 & \text{otherwise.} \end{cases}    (2)
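As a concrete illustration of equations (1) and (2), a minimal Python sketch of the LBP_{8,1} operator is given below. The neighbor ordering is a convention chosen here for illustration; it is not specified by the text.

```python
import numpy as np

def lbp_code(patch):
    """LBP_{8,1} code of the center pixel of a 3x3 patch (equations (1)-(2)).

    Each neighbor whose gray value is >= the center value contributes 2^i
    to the code (threshold function Phi of equation (2)).
    """
    gc = patch[1, 1]
    # fixed visiting order of the 8 neighbors (a convention, assumed here)
    coords = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    code = 0
    for i, (r, c) in enumerate(coords):
        if patch[r, c] >= gc:   # Phi(g_i - g_c) = 1
            code += 2 ** i      # weighted by 2^i as in equation (1)
    return code

def lbp_image(img):
    """Map every interior pixel of a grayscale image to its LBP_{8,1} code."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r - 1, c - 1] = lbp_code(img[r - 1:r + 2, c - 1:c + 2])
    return out
```

A constant patch yields code 255 (all eight neighbors pass the threshold), while a strictly dominant center yields code 0.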
III. HISTOGRAM SELECTION SCORE

A. Adapted supervised Laplacian score
Kalakech et al. propose to adapt the supervised Laplacian score used in the literature for feature ranking and selection, namely the Adapted Supervised Laplacian (ASL) score, to select and rank histograms in the supervised context [12]. The ASL-score evaluates the relevance of a histogram using the local properties of the image data. The basic idea is to assume that the pairwise histogram similarities measured in the original histogram space are preserved in the relevant histogram subspace. So, similar images with the same class labels have to be close when they are represented by a relevant histogram.
In [12], the Jeffrey distance is used to construct the Adapted Supervised Laplacian score ASL^r of the rth histogram. The Jeffrey distance has the advantage of being positive and symmetric, and the value of the Jeffrey divergence between two histograms is low when their corresponding images are similar to each other. For two Q-bin histograms, it is defined by:

D_{Jef}(H_i^r, H_j^r) = \sum_{k=1}^{Q} \left( H_i^r(k) \ln \frac{H_i^r(k)}{m(k)} + H_j^r(k) \ln \frac{H_j^r(k)}{m(k)} \right), \quad m(k) = \frac{H_i^r(k) + H_j^r(k)}{2}    (4)

Using this measure, the ASL-score of the histogram H^r is then defined as follows:

ASL^r = \frac{\sum_{i,j=1}^{N} D_{Jef}(H_i^r, H_j^r)\, S_{ij}}{\sum_{i=1}^{N} D_{Jef}(H_i^r, \bar{H}^r)\, d_i}    (5)
The original LBP computation is based on grayscale images. However, it has been demonstrated that color information is very important to represent texture, especially natural textures [14]. In the literature, the extension of LBP to color follows several strategies of color and texture combination. In order to describe color texture, the Opponent Color LBP (OCLBP) was defined [15]; this strategy consists in taking into account the spatial interactions within and between color components. The EOCLBP significantly improved color texture classification, as shown by several authors in the state of the art [12], [16]. For this purpose, the LBP operator is applied on each pixel and for each pair of components (C_k, C_{k'}), k, k' ∈ {1, 2, 3}.
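The pairwise strategy above can be sketched as follows. This is an illustrative reading of the opponent-color idea (thresholding the neighbors taken from one component against the center pixel of another), not the authors' exact EOCLBP implementation; the 256-bin histogram size corresponds to the P = 8 configuration.

```python
import numpy as np
from itertools import product

def opponent_lbp_histograms(img):
    """Sketch of opponent-color LBP: one normalized 256-bin histogram per
    ordered pair (Ck, Ck') of color components, k, k' in {1, 2, 3},
    which yields S = 9 histograms per color image.
    """
    h, w, _ = img.shape
    coords = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    histograms = []
    for k, kp in product(range(3), repeat=2):
        codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
        for r in range(1, h - 1):
            for c in range(1, w - 1):
                gc = img[r, c, k]        # center pixel taken from component Ck
                code = 0
                for i, (dr, dc) in enumerate(coords):
                    # neighbors taken from component Ck'
                    if img[r - 1 + dr, c - 1 + dc, kp] >= gc:
                        code += 2 ** i
                codes[r - 1, c - 1] = code
        hist = np.bincount(codes.ravel(), minlength=256).astype(float)
        histograms.append(hist / hist.sum())   # normalize to unit sum
    return histograms
```

Each image is thus characterized by 9 histograms, matching the S = 9 histograms per image used in the selection context below.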
B. LBP histogram selection context

In the considered LBP histogram selection context, the database is composed of N color texture images. Each image I_i, i ∈ {1, …, N}, is characterized by S histograms (S = 9) in a single 3D color space. Let H^r be the rth histogram family to evaluate. The data are summarized by the matrix H^r, defined as:
H^r = \begin{bmatrix} H_1^r(1) & \ldots & H_1^r(k) & \ldots & H_1^r(Q) \\ \vdots & & \vdots & & \vdots \\ H_i^r(1) & \ldots & H_i^r(k) & \ldots & H_i^r(Q) \\ \vdots & & \vdots & & \vdots \\ H_N^r(1) & \ldots & H_N^r(k) & \ldots & H_N^r(Q) \end{bmatrix}    (6)

where:

• S_ij is an element of the similarity matrix S. In a supervised context, a class label y_i is associated with each image I_i, and the similarity between two images I_i and I_j is defined by:

S_{ij} = \begin{cases} 1 & \text{if } y_i = y_j, \\ 0 & \text{otherwise.} \end{cases}    (7)
• d_i is the degree of the image I_i:

d_i = \sum_{j=1}^{N} S_{ij}    (8)
• \bar{H}^r is the mean histogram weighted by the degrees:

\bar{H}^r = \frac{\sum_{i=1}^{N} H_i^r d_i}{\sum_{i=1}^{N} d_i}
The histograms are sorted according to the ascending order of the ASL-score in order to select the most relevant ones.
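The ASL pipeline of equations (4), (5), (7) and (8) can be sketched in Python as follows. The small constant used to avoid log(0) on empty bins is an implementation choice, not part of the paper.

```python
import numpy as np

def jeffrey(h1, h2, eps=1e-12):
    """Jeffrey divergence between two normalized histograms (equation (4))."""
    m = (h1 + h2) / 2.0
    return float(np.sum(h1 * np.log((h1 + eps) / (m + eps))
                        + h2 * np.log((h2 + eps) / (m + eps))))

def asl_score(H, labels):
    """ASL-score of one histogram family H (N x Q) given class labels.

    S_ij = 1 iff images i and j share a label (eq. (7)); d_i = sum_j S_ij
    (eq. (8)); the score is the ratio of the within-class Jeffrey spread to
    the spread around the degree-weighted mean histogram (eq. (5)).
    """
    N = H.shape[0]
    S = (labels[:, None] == labels[None, :]).astype(float)   # equation (7)
    d = S.sum(axis=1)                                        # equation (8)
    Hbar = (H * d[:, None]).sum(axis=0) / d.sum()            # weighted mean
    num = sum(jeffrey(H[i], H[j]) * S[i, j]
              for i in range(N) for j in range(N))
    den = sum(jeffrey(H[i], Hbar) * d[i] for i in range(N))
    return num / den
```

A family whose histograms are identical within each class gets a score of zero, i.e. it is ranked as highly relevant under the ascending-order selection.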
TABLE I: Summary of the considered databases

Dataset name      Image size   #class   #training   #test   Total
New BarkTex       64 x 64      6        816         816     1632
Outex-TC-00013    128 x 128    68       680         680     1360
USPTex            128 x 128    191      1146        1146    2292
B. Sparse score
Sparse representation allows finding the most compact representation of the original data. The graph adjacency structure and the corresponding graph weights are built simultaneously by solving an l1-norm minimization problem. This is, in fact, fundamentally different from the traditional ways (like Euclidean distance, cosine distance, etc.) of measuring the similarity between different data points.
The sparse representation of H_i^r is constructed by using as few entries of H^r as possible. It is obtained by solving the l1-norm minimization problem of equation (9).
It is interesting to note that the sparse similarity matrix can also be constructed by using all histograms globally. In this case, the class labels are not incorporated into the construction and we are in the unsupervised learning setting. The SpASL and ASL scores are now extended to the unsupervised learning context by introducing a new score, namely the Sparse Adapted Unsupervised Laplacian (SpAUNL) score, which is defined as follows:

SpAUNL^r = \frac{\sum_{i,j=1}^{N} D_{Jef}(H_i^r, H_j^r)\, s_{ij}}{\sum_{i=1}^{N} D_{Jef}(H_i^r, \bar{H}^r)\, d_i}    (13)
where:

• \|\cdot\|_1 is the l1-norm of a vector;
• \|\cdot\|_2 denotes the l2-norm of a vector;
• s_i is an N-dimensional vector in which the ith element is equal to zero, implying that H_i^r is removed from the reconstruction; it is defined by equation (10).
Several measures have been proposed for evaluating the difference between two histograms [18]. Firstly, we propose to study the impact of the selected distance on the SpASL-score. Common distances, namely histogram intersection, χ², Jeffrey, Euclidean and Kullback-Leibler, are considered. Figure 1 represents the classification rate obtained with the different distances associated with the SpASL-score on the New BarkTex database. The results indicate that the Jeffrey and Euclidean distances reach the same performance. As they give close results, we propose to use the Jeffrey distance for the proposed SpAUNL-score hereafter.
Since the color space can have an impact on the performance, we conduct the experiment in four different color spaces (RGB, HSV, I1I2I3 and YCbCr). Figures 2, 3 and 4
B. Results
Given C classes, the SpASL-score needs C sparse similarity matrices in the supervised context, while the SpAUNL-score only uses one in the unsupervised context.
The histogram selection consists in computing, for each histogram H^r, an associated SpASL or SpAUNL score, and ranking these scores in ascending order. The following section presents the experimental results of the proposed score on several benchmark color texture databases.
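The ranking step itself is straightforward; in the sketch below, `score_fn` is an assumed callable standing for any of the scores above (lower score means more relevant under the ascending-order convention).

```python
def select_histograms(histogram_families, score_fn, k):
    """Rank the S histogram families by their score, ascending, and keep
    the indices of the k most relevant ones.

    histogram_families: list of S items, each accepted by score_fn
    score_fn: callable mapping one histogram family to a float score
    """
    scores = [score_fn(H) for H in histogram_families]
    # indices sorted by ascending score: most relevant families first
    ranked = sorted(range(len(scores)), key=lambda r: scores[r])
    return ranked[:k]
```
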
IV. EXPERIMENTAL RESULTS

A. Color texture databases considered

In order to evaluate the efficiency of the proposed score, we perform the evaluation on three benchmark color texture image databases: OuTex-TC-00013, New BarkTex and USPTex. Each database is divided into a training set and a testing set by the holdout method (as shown in Table I). The L1-distance is associated with the 1-NN classifier, and the classification performance is evaluated by the accuracy rate (AC). Table I summarizes these databases. All these test suites can be downloaded at https://www.lisic.univ-littoral.fr/~porebski.
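The evaluation protocol of this section (1-NN classification with the L1 distance, performance measured by the accuracy rate) can be sketched as:

```python
import numpy as np

def nn1_classify(train_X, train_y, test_X):
    """1-NN classifier with the L1 (city-block) distance: each test sample
    gets the label of its closest training sample."""
    preds = []
    for x in test_X:
        dists = np.abs(train_X - x).sum(axis=1)   # L1 distance to all train samples
        preds.append(train_y[int(np.argmin(dists))])
    return np.array(preds)

def accuracy(y_true, y_pred):
    """Accuracy rate (AC): fraction of correctly classified test images."""
    return float(np.mean(y_true == y_pred))
```

Here each row of `train_X`/`test_X` would be the concatenation of the selected LBP histograms of one image.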
The l1-norm minimization problem is:

\min_{s_i} \|s_i\|_1 \quad \text{subject to} \quad \|H_i^r - H^r s_i\|_2 < \epsilon, \quad 1 = \mathbf{1}^T s_i    (9)

where \mathbf{1} \in \mathbb{R}^N is a vector of all ones, \epsilon represents the error tolerance, and s_i is the similarity vector:

s_i = [s_{i1}, \ldots, s_{i(i-1)}, 0, s_{i(i+1)}, \ldots, s_{iN}]^T    (10)

For each histogram H^r, we can compute the similarity vector s_i of each image I_i, and then get the sparse similarity matrix:

S = [s_1, s_2, \ldots, s_N]^T    (11)

where s_i is the optimal solution of equation (9). The matrix S determines both the graph adjacency structure and the sparse similarity weights simultaneously. Note that the sparse similarity matrix is generally asymmetric.

Instead of using the hard similarity S_ij given by the class labels, V. T. Hoang et al. define the similarity matrix based on the sparse representation, and then integrate it into equation (5) in the supervised histogram selection context [13].

Given a database of N images belonging to C classes, each class c, c = 1, \ldots, C, contains N_c images. For each class, we construct the sparse similarity matrix using only the images within the same class by equation (9). We denote by S^c the sparse similarity matrix of class c, and by H_i^{rc} the rth histogram of image I_i in class c. This leads to the Sparse Adapted Supervised Laplacian (SpASL) score, defined by:

SpASL^r = \frac{\sum_{c=1}^{C} \sum_{i,j=1}^{N_c} D_{Jef}(H_i^{rc}, H_j^{rc})\, s_{ij}^c}{\sum_{c=1}^{C} \sum_{i=1}^{N_c} D_{Jef}(H_i^{rc}, \bar{H}^{rc})\, d_i^c}    (12)
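The paper does not detail the solver used for the l1-minimization of equation (9). As an illustration only, the sketch below solves a Lasso-style relaxation (min ½||H_i^r − H^r s||² + λ||s||₁, dropping the sum-to-one constraint for simplicity) by iterative soft-thresholding (ISTA); the step size and the value of λ are assumptions, not taken from the paper.

```python
import numpy as np

def sparse_similarity_vector(H, i, lam=0.01, n_iter=500):
    """Sparse coefficients reconstructing histogram H[i] from the others.

    Solves a Lasso relaxation of equation (9) with ISTA, forcing the ith
    coefficient to zero as in equation (10).
    H: (N, Q) matrix whose rows are the histograms; returns s_i of length N.
    """
    A = H.T.astype(float).copy()    # (Q, N): columns are candidate histograms
    A[:, i] = 0.0                   # remove H[i] from its own reconstruction
    b = H[i].astype(float)
    L = np.linalg.norm(A, 2) ** 2 + 1e-12   # Lipschitz constant of the gradient
    s = np.zeros(H.shape[0])
    for _ in range(n_iter):
        g = A.T @ (A @ s - b)                # gradient of the quadratic term
        z = s - g / L
        s = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    s[i] = 0.0
    return s

def sparse_similarity_matrix(H, lam=0.01):
    """Stack the vectors of equation (11): S = [s_1 ... s_N]^T (asymmetric)."""
    return np.vstack([sparse_similarity_vector(H, i, lam)
                      for i in range(H.shape[0])])
```

On duplicated histograms, the solver concentrates the weight on the duplicate, illustrating the soft, data-driven similarity that replaces the hard 0/1 labels.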
Fig. 2: Accuracy vs. the desired number of selected histograms on the New BarkTex database by the sparse score in the supervised and unsupervised contexts, from top to bottom: RGB, HSV, I1I2I3 and YCbCr color spaces. [figure panels omitted]
Fig. 1: Accuracy rates vs. the number of selected histograms of the SpASL-score by using different distances on the New BarkTex database. [figure omitted]

V. CONCLUSION

In this paper, we proposed an extension of the sparse score for LBP histogram selection to an unsupervised context. Experimental results achieved on several color texture databases have shown that the proposed approach reaches the same color texture classification results as in the supervised learning context. This work is interesting for evaluation strategies based on wrapper and hybrid approaches, since we determine the optimum dimension based on the highest accuracy. Future work will extend this approach by combining several scores in order to obtain a more stable classification performance.
show the classification results on the New BarkTex, OuTex-TC-00013 and USPTex databases for the two sparse histogram selection scores in the supervised and unsupervised contexts in different color spaces.

We observe that both the SpASL and SpAUNL-scores reach the same highest accuracy rate in most cases on the three benchmark databases. For example, the SpAUNL-score is better than the SpASL-score on New BarkTex (in the RGB space) and on OuTex-TC-00013 (in the HSV and YCbCr spaces). In the framework of evaluation methods based on wrapper and hybrid approaches, the highest classification rate is preferred to determine the optimal dimension. It is worth repeating that the SpAUNL-score does not use class labels to compute the score of each histogram via the sparse similarity matrix. The experimental results thus show the efficiency of the proposed score.
Fig. 3: Accuracy vs. the desired number of selected histograms on the OuTex-TC-00013 database by the sparse score in the supervised and unsupervised contexts in (RGB (a); HSV (b); I1I2I3 (c); YCbCr (d)) color spaces. [figure panels omitted]
Fig. 4: Accuracy vs. the desired number of selected histograms on the USPTex database by the sparse score in the supervised and unsupervised contexts in (RGB (a); HSV (b); I1I2I3 (c); YCbCr (d)) color spaces. [figure panels omitted]