Information Sciences 317 (2015) 202–223

Contents lists available at ScienceDirect. Information Sciences journal homepage: www.elsevier.com/locate/ins

A novel kernel fuzzy clustering algorithm for Geo-Demographic Analysis

Le Hoang Son ⇑
VNU University of Science, Vietnam National University, 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam

Article info

Article history: Received January 2014; Received in revised form 19 April 2015; Accepted 25 April 2015; Available online May 2015.

Keywords: Fuzzy clustering; Geo-Demographic Analysis; Intuitionistic possibilistic fuzzy clustering; Kernel-based clustering; Spatial Interaction – Modification Model

Abstract

Geo-Demographic Analysis (GDA) is a major focus of various interdisciplinary research efforts and has been used in many decision-making processes regarding the provision and distribution of products and services in society. Machine learning methods, namely Principal Component Analysis, Self-Organizing Map, K-Means, fuzzy clustering and fuzzy geographically weighted clustering, have been proposed to enhance the quality of GDA. Among them, the state-of-the-art method, Modified Intuitionistic Possibilistic Fuzzy Geographically Weighted Clustering (MIPFGWC), has some drawbacks: (i) using the Euclidean similarity measure often results in a high error rate and sensitivity to noise and outliers; (ii) updating the membership matrix by the Spatial Interaction – Modification Model (SIM2) leads to new centers that are not "geographically aware". In this paper, we present a novel fuzzy clustering algorithm, named Kernel Fuzzy Geographically Clustering (KFGC), that utilizes both a kernel similarity function and a new update mechanism for the SIM2 model to remedy the disadvantages of MIPFGWC. Some supporting properties and theorems of KFGC are also examined in the paper. Specifically, the differences between the solutions of KFGC and those of MIPFGWC and of some variants of KFGC are theoretically validated. Lastly, experimental
analysis is performed to compare the performance of KFGC with that of the relevant algorithms in terms of clustering quality.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

Prior to the definition of the Geo-Demographic Analysis (GDA) problem, let us consider an example to demonstrate the role of GDA in practical applications.

Example 1. A hot-spot analysis of the number of viral hemorrhagic fever cases in Vietnam in 2011 is examined in Fig. 1. The results are expressed in a map showing various groups determined by intervals of cases such as [0, 47] and [48, 233]. From this, decision makers can observe the most dangerous places and issue appropriate medical measures to prevent such situations in the future. The distribution can be expressed by linguistic labels such as "High cases of viral hemorrhagic fever" and "Low cases of viral hemorrhagic fever" to eliminate the limitations of boundary points in the intervals. The hot-spot analysis in this example is a kind of Geo-Demographic Analysis concerning the classification of a geographical area according to a given subject, e.g. viral hemorrhagic fever cases. The classification can be done for spatial data both in point

⇑ Tel.: +84 904171284; fax: +84 0438623938.
E-mail addresses: sonlh@vnu.edu.vn, chinhson2002@gmail.com
http://dx.doi.org/10.1016/j.ins.2015.04.050
0020-0255/© 2015 Elsevier Inc. All rights reserved.

List of abbreviations

GIS: Geographical Information Systems
GDA: Geo-Demographic Analysis
SIM: Spatial Interaction Model
SIM-PF: Spatial Interaction Model with Population Factor
SIM2: Spatial Interaction – Modification Model
SOM: Self-Organizing Maps
PCA: Principal Component Analysis
K-Means: a hard partition-based clustering algorithm
FCM: Fuzzy C-Means
NE: Neighborhood Effects
FGWC: Fuzzy Geographically Weighted Clustering
IPFGWC: Intuitionistic Possibilistic Fuzzy Geographically Weighted Clustering
MIPFGWC: Modified Intuitionistic
Possibilistic Fuzzy Geographically Weighted Clustering
KFGC: Kernel Fuzzy Geographically Clustering
IFV: a spatial clustering quality validity index
UNO: United Nations Organization

and region standards of Geographical Information Systems (GIS); a number of points/regions that share common characteristics in both the spatial and the attribute data form a group marked by a unique symbol and color¹ in the map. Intuitively, GDA can be regarded as spatial clustering in GIS.

Definition 1. Given a geo-demographic dataset X consisting of N data points, where each data point is equivalent to a point/region of spatial data in GIS and is characterized by many geo-demographic attributes, each of which can be considered as a subject for clustering, the objective of GDA is to classify X into C clusters so that

J = Σ_{k=1}^{N} Σ_{j=1}^{C} u_{kj} ||X_k − V_j||² → min,   (1)

subject to

u_{kj} ∈ [0, 1];  Σ_{j=1}^{C} u_{kj} = 1;  u_{kj} = u_{kj}(w_j);  V_j = V_j(w_j);  k = 1..N, j = 1..C,   (2)

where u_{kj} is the membership value of data point X_k (k = 1..N) to the j-th cluster (j = 1..C), V_j is the center of the j-th cluster, and w_j is the weight of the j-th cluster showing the influence of spatial relationships in a map. The weight is often calculated through a spatial model such as the Spatial Interaction Model (SIM) [6], the Spatial Interaction Model with Population Factor (SIM-PF) [14,25,27] or the Spatial Interaction – Modification Model (SIM2) [26].

GDA is widely used in the public and private sectors for the planning and provision of products and services. In GDA, geo-demographic attributes are used to characterize essential information about the population of a certain geographical area at a specific point in time. Common geo-demographics include, to name but a few, gender, age, ethnicity, knowledge of languages, disabilities, mobility, home ownership, and employment status. One of the most useful functions of GDA is the capability to visualize geo-demographic trends by locations and time
stamps, such as the study of the average age of a population over time or the investigation of the migration trends of local people in a town. As illustrated in Example 1, the results of GDA are depicted on a map that shows the distribution of several distinct groups. Various distribution maps can be combined or overlaid into a single one so that users can observe the tendency of a certain group over time for the analysis of geo-demographic trends. Both the distributions and the trends of values within a geo-demographic variable are of interest in GDA. By providing essential information about geo-demographic distributions and trends, GDA effectively assists many decision-making processes involving the provision and distribution of products and services in society, the determination of common population characteristics, and the study of population variation. This clearly demonstrates the important role of GDA in practical applications nowadays.

¹ For interpretation of color in Figs. 1 and 3, the reader is referred to the web version of this article.

Fig. 1. The distribution of viral hemorrhagic fever cases in Vietnam in 2011.

In GDA, improving the clustering quality, especially in terms of theoretical results over the relevant existing methods, has been the major concentration of researchers for many years [19]. Finding a more accurate distribution of distinct groups of geo-demographic data is essential and significant for the expression of and reasoning about events and phenomena concerning the characteristics of a population, and for better support of decision-making. Some of the first methods, Self-Organizing Maps (SOM) [13] and Principal Component Analysis (PCA) [34], rely on basic principles of statistics and neural networks to determine the underlying demographic and socio-economic phenomena. However, their disadvantages are the requirement of large memory space and high computational complexity. For these reasons, researchers tended to use
clustering algorithms such as Agglomerative Hierarchical Clustering [5] and K-Means [16] to classify geo-demographic datasets into clusters represented in the forms of hierarchical trees and isolated groups. Nonetheless, using hard clustering for GDA often leads to the issue of ecological fallacy, which can be briefly understood as follows: statistics that accurately describe group characteristics do not necessarily apply to individuals within that group. Thus, fuzzy clustering algorithms like Fuzzy C-Means (FCM) and its variants were considered appropriate methods to determine the distribution of a demographic feature on a map [10,27,28,39]. Since the results of FCM are independent of geographical factors, improvements were made by coupling FCM with a spatial model such as SIM [6], SIM-PF [14,25,27] or SIM2 [26]. The comparative experiments in [26] showed that the MIPFGWC algorithm, which combines the SIM2 model with an intuitionistic possibilistic fuzzy geographically clustering algorithm, has better clustering quality than other relevant algorithms such as NE [6], FGWC [14] and IPFGWC [25]. In addition to MIPFGWC, there are some related works concerning applications and algorithms for GDA, such as [4,7,12,15,17,18,20–24,29,31,36,37,40].

Among all existing works, MIPFGWC is considered the state-of-the-art algorithm for the GDA problem. MIPFGWC calculates new centers through the Euclidean similarity measure and uses the SIM2 model to update the membership matrix. However, there are some problems in this process that may decrease the clustering quality. Let us make a deeper analysis of this consideration.

(a) The similarity measure for clustering is the Euclidean function. According to Keogh and Ratanamahatana [11], using the Euclidean similarity measure in clustering algorithms often yields a high error rate and sensitivity to noise and outliers, since this measure is not suitable for ordinal data,
where preferences are listed according to rank instead of actual real values. Furthermore, the Euclidean measure cannot determine the correlation between user profiles that have similar trends in tastes but different ratings for some of the same items. Gu et al. [8] stated that the Euclidean measure makes the contributions of all features to the clustering results the same, which leads to a strong dependence of the clustering on the sample space. For those reasons, a better similarity measure should be used instead of the Euclidean function to obtain high clustering quality.

(b) Updating the membership matrix by the SIM2 model is another problem of concern. MIPFGWC calculates the values of the membership matrix, the hesitation level and the typicality by solving the optimization problem. The membership matrix is then updated by the SIM2 model. The problem is that the hesitation level and the typicality values are not updated by any geographical model, so the new centers, which are calculated from those values and the updated membership matrix, are not "geographically aware". Thus, the final clustering results could be far from the accurate ones.

We clearly recognize that these two drawbacks prevent MIPFGWC from achieving high clustering quality, so they need thorough enhancement and improvement. Our motivation in this article is to investigate a new method that can handle those limitations; more specifically:

(a) For the first problem, we consider kernel functions, which are used in many applications as they provide a simple bridge from linearity to non-linearity for algorithms that can be expressed in terms of dot products. This type of similarity measure makes very good sense as a measure of difference between samples in the context of certain data, e.g. geo-demographic datasets.

(b) For the second problem, we convert the activities of the SIM2 model into the objective function of the problem, thus making the model more closely related to
spatial relationships, since all elements such as the cluster memberships, the hesitation level, the typicality values and the centers are "geographically aware".

Our contributions in this paper are:

(a) A novel fuzzy clustering algorithm, named Kernel Fuzzy Geographically Clustering (KFGC), that utilizes both the kernel function and the new update mechanism of the SIM2 model to remedy the disadvantages of MIPFGWC.

(b) Some supporting properties and theorems of KFGC. Specifically, the differences between the solutions of KFGC and those of MIPFGWC and of some variants of KFGC are theoretically validated.

(c) An experimental analysis comparing the performance of KFGC with that of the relevant algorithms in terms of clustering quality.

These findings both ameliorate the quality of results for the GDA problem and enrich the knowledge of developing clustering algorithms based on kernel distances and the spatial model SIM2 for practical applications. In other words, the findings are significant on both the theoretical and the practical side.

Before we close this section and move to the detailed model and solutions, let us raise an important question: "How can we apply kernel functions to the clustering algorithm in order to handle the problem of similarity measures?" To answer this question, a survey of kernel-based fuzzy clustering algorithms in [1,2,8,9,30,35,38] was conducted in order to find the appropriate algorithm and kernel function for our problem. Through this survey, it is clear that the kernel function is most often applied directly to the objective function, with the most frequently used function being the Gaussian kernel-induced distance [38]. Moreover, most spatial kernel-based fuzzy clustering methods employ a spatial bias correction in the objective function, which gives us a hint of how to apply the activities of the SIM2 model to handle the second limitation of MIPFGWC.

The rest of the paper is structured as follows.
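As a concrete illustration of the Gaussian kernel-induced distance 1 − K(x, y) discussed above [38], the following minimal sketch shows how it replaces the Euclidean distance; Python is used here only for illustration (the paper's own experiments are implemented in C), and the bandwidth parameter `sigma` is an assumed free parameter not fixed by the text:

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    x, v = np.asarray(x, dtype=float), np.asarray(v, dtype=float)
    return np.exp(-np.sum((x - v) ** 2) / sigma ** 2)

def kernel_distance(x, v, sigma=1.0):
    """Kernel-induced distance 1 - K(x, v): zero for identical points and
    bounded above by 1, unlike the unbounded Euclidean distance, which is
    one reason it is less sensitive to outliers."""
    return 1.0 - gaussian_kernel(x, v, sigma)
```

Because the distance saturates at 1, a far-away outlier cannot dominate the objective function the way a large squared Euclidean distance can.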
Section 2 presents our main contributions, consisting of a new objective function and its solutions, some supporting properties and theorems, and the details of the new algorithm, KFGC. Specifically, in Section 2.1 we introduce a new objective function that integrates some results of MIPFGWC and of Yang and Tsai [38], aiming to handle the two limitations of MIPFGWC. By the Lagrangian method, the optimal solutions, including the new centers, the membership matrix, the hesitation level and the typicality values, are found accordingly. In Section 2.2 we examine some interesting properties and theorems of the solutions, such as: some characteristics of KFGC's solutions, e.g. the limits of the membership values when the fuzzifier is large; and the estimation of the difference between the solutions of KFGC and those of MIPFGWC and of the variants of KFGC. In Section 2.3, the proposed algorithm KFGC is presented in detail. Section 3 validates the proposed approach through a set of experiments involving real-world data. Finally, Section 4 draws the conclusions and delineates future research directions.

2. Kernel Fuzzy Geographically Clustering

2.1. The proposed model and solutions

Suppose there is a geo-demographic dataset X consisting of N data points. Let us divide the dataset into C groups satisfying the objective function (3):

J = Σ_{k=1}^{N} Σ_{j=1}^{C} (a_1 (u'_{kj})^m + a_2 (t'_{kj})^g + a_3 (h'_{kj})^s)(1 − K(X_k, V_j)) + Σ_{j=1}^{C} c_j Σ_{k=1}^{N} (1 − t_{kj})^g → min,   (3)

where K(x, y) is the Gaussian kernel-induced function, and u'_{kj}, t'_{kj} and h'_{kj} are the membership values, typicality values and hesitation level updated by the SIM2 model in Eqs. (4)–(9), respectively:

u'_k = a_u u_k + b_u Σ_{j=1}^{k−1} w_{kj} u'_j + (c_u / A_u) Σ_{j=k}^{C} w_{kj} u_j,   (4)

t'_k = a_t t_k + b_t Σ_{j=1}^{k−1} w_{kj} t'_j + (c_t / A_t) Σ_{j=k}^{C} w_{kj} t_j,   (5)

h'_k = a_h h_k + b_h Σ_{j=1}^{k−1} w_{kj} h'_j + (c_h / A_h) Σ_{j=k}^{C} w_{kj} h_j,  ∀k = 1..C,   (6)

a_u + b_u + c_u = 1,   (7)
a_t + b_t + c_t = 1,   (8)
a_h + b_h + c_h = 1,   (9)

and w_{kj} is the weighting function showing the influence of the k-th area on the j-th area, defined through Eqs. (10) and (11):

w_{kj} = (pop_k × pop_j)^b × pc_{kj} × IM_{kj} / d_{kj}^a  if k ≠ j, and w_{kj} = 0 otherwise,   (10)

where Eq. (11) normalizes the populations (Σ_{k=1}^{C} pop_k = N_0) and defines the auxiliary terms pc_{kj} and IM_{kj} from p_{kj}, d_{kj} and pop_k + pop_j, as in the SIM2 model [26].   (11)

The parameters A_u, A_t and A_h are scaling variables that force the membership values, typicality values and hesitation level to satisfy constraints (12)–(18):

u_{kj}, t_{kj}, h_{kj} ∈ [0, 1],   (12)
u_{kj} + h_{kj} + t_{kj} = 1,   (13)
Σ_{j=1}^{C} ((1/C) Σ_{i=1}^{C} w_{ij}) u_{kj} = 1,   (14)
Σ_{j=1}^{C} ((1/C) Σ_{i=1}^{C} w_{ij}) h_{kj} = 1,   (15)
m, g, s > 1,   (16)
a_i > 0, i = 1..3,   (17)
c_j > 0 (j = 1..C).   (18)

The proposed model KFGC in Eqs. (3)–(18) relies on the principles of intuitionistic fuzzy sets, possibilistic fuzzy clustering, weighted clustering, the SIM2 model and kernel-based clustering. In order to analyze the difference and the improvement of this model in comparison with MIPFGWC [26], let us review some points below.

The objective function of KFGC in (3) employs the Gaussian kernel function K(x, y) instead of the traditional Euclidean function of MIPFGWC. This handles the first limitation of MIPFGWC shown above.

In Eq. (3), u'_{kj}, t'_{kj}, h'_{kj} are used instead of the original membership values, typicality values and hesitation level of MIPFGWC. This tackles the second problem of MIPFGWC, where the typicality values and the hesitation level are not updated by any geographical model, so that the next center is not correctly calculated. According to this
amendment, the membership values, typicality values and hesitation level are all updated by the SIM2 model, as shown in Eqs. (4)–(9). The weighting function is kept intact for the update of those values.

Inspired by the spatial bias correction of the group of kernel-based fuzzy clustering methods, especially the work of Yang and Tsai [38], an improvement in the constraints of the proposed model was made, as described in Eqs. (14) and (15). The role of the average weight for the j-th cluster, (1/C) Σ_{i=1}^{C} w_{ij}, is equivalent to that of r_j in the work of Yang and Tsai; nonetheless, since weights derived from the SIM2 model are already available, they are utilized for this task. By bringing the weights into the constraints, the proposed model becomes more closely related to spatial relationships.

Now, let us find the optimal solutions of model (3)–(18).

Theorem 1. The optimal solutions of system (3)–(18) are the closed-form expressions (19)–(22): the membership values u_{kj} (k = 1..N, j = 1..C) in (19) combine the kernel ratios ((1 − K(X_k, V_j))/(1 − K(X_k, V_i)))^{1/(m−1)}, the averaged constraint weights (1/C) Σ_i w_{ij} of (14), and correction terms built from the SIM2 parameters (a_u, b_u, c_u, A_u) of Eq. (4); the hesitation levels h_{kj} in (20) take the analogous form with exponent 1/(s−1) and the parameters (a_h, b_h, c_h, A_h) of Eq. (6); the typicality values t_{kj} in (21) follow from the possibilistic terms c_j and the parameters (a_t, b_t, c_t, A_t) of Eq. (5); and each center V_j in (22) is the weighted mean of the data points X_k, with weights formed from a_1 times the updated memberships raised to the power m, a_2 times the updated typicalities raised to the power g, and a_3 times the updated hesitation levels raised to the power s.

2.2. Some supporting properties and theorems

2.2.1. Properties of the solutions

Property 1. The limits of the membership values are lim_{m→1+} u_{kj} = 1 (23), while the limits (24) and (25) as m → ∞ are expressed through b_u, the weights w_{ji} and the scaling term c_u/A_u. The results in (24) and (25) differ from those in [26] by a quantity of constants; this means that applying the SIM2 model inside the objective function makes the limits (24) and (25) dependent on the parameters of the model.

Property 2. Similarly, the limits of the hesitation level are lim_{s→1+} h_{kj} = 1 (26) and lim_{s→∞} h_{kj} = 0 (27), while (28) expresses a further limit of h_{kj} through the averaged weights (1/C) Σ_i w_{ij} and the parameters (a_h, b_h, c_h, A_h).

Property 3. The corresponding limits (29)–(31) of the typicality values t_{kj} as g → 1+ and g → ∞ are obtained in the same manner; they depend on the possibilistic terms c_j and on the parameters (a_t, b_t, c_t, A_t) and the weights w_{ji}.

The remarks from Properties 2 and 3 are similar to those in Property 1: the limits are more dependent on the parameters of the model than those of the MIPFGWC algorithm [26].

Property 4. If a_2 = 0 then t_{kj} = 1, ∀k = 1..N, j = 1..C. Contrary to the result in [26], the typicality values do not exist in the case a_2 = 0. This recommends that the value a_2 = 0 be avoided in order to obtain the best clustering quality of the algorithm.

Property 5. The limit of the centers is

lim_{(m,g,s)→(∞,∞,∞)} V_j = (1/N) Σ_{k=1}^{N} X_k.   (32)

When the parameters are quite large, all cluster centers tend to move to the central point of the dataset.

Property 6. The limits of the ratio between u_{kj} and h_{kj} are lim_{m→1+, s→1+} (u_{kj}/h_{kj}) = ∞̃ (33) and lim_{m→1+, s→∞} (u_{kj}/h_{kj}) = 1 (34), where ∞̃ is complex infinity; the limit (35) as m → ∞ and s → ∞ is the ratio of the corresponding membership and hesitation limits of Properties 1 and 2, expressed through the weights w_{ij} and the parameters (a_u, b_u, c_u, A_u) and (a_h, b_h, c_h, A_h).

2.2.2. The difference of solutions between algorithms

Firstly, the difference between the solutions of KFGC and MIPFGWC
[26] is measured.

Theorem 2. Suppose that similar inputs and initialization are given to both KFGC and MIPFGWC. Then the upper bounds of the differences between the solutions of the two algorithms are given by Eqs. (36)–(38): the differences ||U^(KMIPFGWC) − U^(MIPFGWC)||, ||H^(KMIPFGWC) − H^(MIPFGWC)|| and ||T^(KMIPFGWC) − T^(MIPFGWC)|| are each bounded by N × C times a constant determined by the averaged constraint weights (1/C) Σ_i w_{ij}, the SIM2 parameters (a_u, b_u, c_u, A_u), (a_h, b_h, c_h, A_h) and (a_t, b_t, c_t, A_t), and the possibilistic terms c_j.

From those results, the difference between the clustering qualities of the two algorithms can be estimated through the IFV index [3]. IFV was used to evaluate the clustering qualities of MIPFGWC and other algorithms in [26]; it was shown to be robust and stable when clustering spatial data. The definition of IFV is stated below:

IFV = (1/C) Σ_{j=1}^{C} { (1/N) Σ_{k=1}^{N} u_{kj}² × [ log₂ C − (1/N) Σ_{k=1}^{N} log₂ u_{kj} ]² } × (SD_max / σ_D),   (39)

SD_max = max_{k≠j} ||V_k − V_j||²,   (40)

σ_D = (1/C) Σ_{j=1}^{C} (1/N) Σ_{k=1}^{N} ||X_k − V_j||².   (41)

When IFV →
max, the value of IFV is said to be the most optimal for the dataset. The difference between the IFV values of KFGC and MIPFGWC is estimated as

IFV_KMIPFGWC − IFV_MIPFGWC = (1/C) Σ_{j=1}^{C} { (1/N) Σ_{k=1}^{N} (u_{kj}^{KMIPFGWC})² [log₂ C − (1/N) Σ_{k=1}^{N} log₂ u_{kj}^{KMIPFGWC}]² − (1/N) Σ_{k=1}^{N} (u_{kj}^{MIPFGWC})² [log₂ C − (1/N) Σ_{k=1}^{N} log₂ u_{kj}^{MIPFGWC}]² } × (SD_max / σ_D).   (42)–(43)

Based upon the results in Theorem 2, we can recognize that IFV_KMIPFGWC − IFV_MIPFGWC ≥ 0. In other words, the clustering quality of KFGC is generally better than that of MIPFGWC.

Secondly, the effectiveness of KFGC with and without the Gaussian kernel function is verified.

Theorem 3. In the objective function (3) of KFGC, the kernel function is replaced with the Euclidean function ||X_k − V_j||², and a proof similar to that of Theorem 1 determines the new optimal solutions of KFGC without the Gaussian kernel function, as in Eqs. (44)–(47); these solutions have the same structure as (19)–(22) with every kernel term 1 − K(X_k, V_j) replaced by ||X_k − V_j||², so the exponents 1/(m−1), 1/(s−1) and 1/(g−1) now act on ratios of squared Euclidean distances.

Theorem 4. The upper bounds of the differences between the solutions of KFGC with (K1) and without (K2) the Gaussian kernel function are given by Eqs. (48)–(50): ||U^(K1) − U^(K2)||, ||H^(K1) − H^(K2)|| and ||T^(K1) − T^(K2)|| are each bounded by N × C times a constant determined by the weights w_{ji} and the SIM2 parameters. Thus,

IFV^(K1) ≥ IFV^(K2).   (51)

Thirdly, the effectiveness of plugging the SIM2 model into
the objective function (3) is verified.

Theorem 5. In the objective function (3) of KFGC, the updated membership values, the updated typicality values and the updated hesitation level are replaced with their original ones, and a proof similar to that of Theorem 1 determines the new optimal solutions of KFGC without the SIM2 model in the objective function, as in Eqs. (52)–(55):

u_{kj} = [ ((1/C) Σ_{i=1}^{C} w_{ij}) Σ_{j'=1}^{C} ( (1 − K(X_k, V_j)) / (1 − K(X_k, V_{j'})) )^{1/(m−1)} ]^{−1},  k = 1..N, j = 1..C,   (52)

h_{kj} = [ ((1/C) Σ_{i=1}^{C} w_{ij}) Σ_{j'=1}^{C} ( (1 − K(X_k, V_j)) / (1 − K(X_k, V_{j'})) )^{1/(s−1)} ]^{−1},  k = 1..N, j = 1..C,   (53)

t_{kj} = 1 / ( 1 + ( a_2 (1 − K(X_k, V_j)) / c_j )^{1/(g−1)} ),  k = 1..N, j = 1..C,   (54)

V_j = Σ_{k=1}^{N} (a_1 u_{kj}^m + a_2 t_{kj}^g + a_3 h_{kj}^s) X_k / Σ_{k=1}^{N} (a_1 u_{kj}^m + a_2 t_{kj}^g + a_3 h_{kj}^s),  j = 1..C.   (55)

Theorem 6. The upper bounds of the differences between the solutions of KFGC with (K1) and without (K3) the SIM2 model in the objective function are given by Eqs. (56)–(58): ||U^(K1) − U^(K3)||, ||H^(K1) − H^(K3)|| and ||T^(K1) − T^(K3)|| are each bounded by N × C times a sum over j of terms built from b_u, b_h, b_t, the weights w_{ji}, and the parameters (a_u, c_u, A_u), (a_h, c_h, A_h) and (a_t, c_t, A_t, c_j). Thus,

IFV^(K1) ≥ IFV^(K3).   (59)

Fourthly, it needs to be checked how using the spatial bias correction in constraints (14) and (15) changes the optimal solutions of the system.

Theorem 7. The objective function (3) is kept intact, the standardized weights (1/C) Σ_{i=1}^{C} w_{ij} are removed from constraints (14) and (15), and the new optimal solutions of KFGC without the standardized weights are found as in Eqs. (60)–(63); they have the same structure as (19)–(22) with the averaged weight factor removed from the normalization terms.

Theorem 8. The upper bounds of the differences between the solutions of KFGC with (K1) and without (K4) the standardized weights of constraints (14) and (15) are given by Eqs. (64)–(66), again of the form N × C times constants determined by the weights and the SIM2 parameters. Thus,

IFV^(K1) ≥ IFV^(K4).   (67)

From Theorems 2 to 8, it is clear that the
clustering quality of KFGC is better than those of MIPFGWC and the other variants of KFGC.

2.3. The proposed algorithm

In this section, the KFGC algorithm is presented in detail.

Kernel Fuzzy Geographically Clustering (KFGC)

Input: data X with N elements in r dimensions; number of clusters C; threshold ε and other parameters: a_u, b_u, c_u, a_t, b_t, c_t, a_h, b_h, c_h; m, g, s > 1; a_i > 0 (i = 1..3); c_j > 0 (j = 1..C); a, b, c, d, r.
Output: matrices U, T, H and centers V.

KFGC:
1: V_j^(0) ← random (j = 1..C); t = 0;
2: u_{kC}^(0) ← random; h_{kC}^(0) ← random; t_{kC}^(0) ← random (k = 1..N), satisfying (12) and (13);
3: Repeat
4:   t = t + 1;
5:   u_{k1}^(t) ← u_{kC}^(t−1); h_{k1}^(t) ← h_{kC}^(t−1); t_{k1}^(t) ← t_{kC}^(t−1);
6:   Calculate u_{kj}^(t) (k = 1..N, j = 1..C) by Eq. (19);
7:   Calculate h_{kj}^(t) (k = 1..N, j = 1..C) by Eq. (20);
8:   Calculate t_{kj}^(t) (k = 1..N, j = 1..C) by Eq. (21);
9:   Update V_j^(t) (j = 1..C) by Eq. (22);
10: Until ||V^(t) − V^(t−1)|| ≤ ε.

3. Results

3.1. Experimental environments

In this part, the experimental environments are described.

Experimental tools: the proposed algorithm KFGC was implemented, in addition to FGWC [14] and MIPFGWC [26], in the C programming language, and the algorithms were executed on a PC with an Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10 GHz (2 CPUs) and 2048 MB RAM under the Windows Professional 32-bit operating system. The experimental results are taken as the average values after 10 runs.

Experimental datasets:
– A real dataset of socio-economic demographic variables from the United Nations Organization (UNO) [33], which was used for the experiments in [25,26]. It contains statistics of 230 nations on population size and composition, births, deaths, marriage and divorce on an annual basis, economic activity, educational attainment, household characteristics, etc. UNO is used in Sections 3.2 and 3.3 to compare the clustering qualities of the algorithms and to examine the characteristics of KFGC, respectively.
– A benchmark UCI Machine Learning dataset called Coil 2000 [32], consisting of 9000 socio-demographic instances of
86 variables describing information of customers of an insurance company It will be used in Section 3.4 to validate the capabilities of KFGC to produce results that are more closely related to spatial relationships – All attributes/variables of these datasets are used concurrently for the best evaluation of the algorithms Cluster validity measurement: the IFV validity function in Eqs (57)–(59) Parameters setting: some parameters of KFGC such as the threshold e are set up similar to those of MIPFGWC [26] 214 L.H Son / Information Sciences 317 (2015) 202–223 Objective: – To evaluate the clustering qualities and the computational time of all algorithms – To examine the characteristics of KFGC by various cases and parameters of the Gaussian kernel function – To validate the capabilities of KFGC to produce results which are more closely related to spatial relationships 3.2 The comparisons of clustering quality and computational time In Table 1, the IFV values of all algorithms are measured by various numbers of clusters and parameters ða; b; cÞ on the UNO dataset The first remark from this table is that the IFV values of KFGC are larger and better than those of MIPFGWC and FGWC even if the number of clusters or the parameters ða; b; cÞ changes Specifically, In the first case of ða; b; cÞ when the number of clusters is 2, the IFV values of KFGC, MIPFGWC and FGWC are 5.1, 4.3 and 0.8, respectively When the number of clusters increases to 3, the IFV values of all algorithms are also larger, but again we see that the IFV value of KFGC is better than those of other algorithms, i.e 29.8, 22.2 and 8, respectively The remark is recognized in other cases of ða; b; cÞ, for instance in the second case a; b; cị ẳ ð0:35; 0:4; 0:25Þ when the number of clusters is 5, the IFV values of KFGC, MIPFGWC and FGWC are 44.8, 37.4, 8.9, respectively Those experimental results have shown that the clustering quality of KFGC is better than those of other algorithms Secondly, the impact of the 
number of clusters on the IFV values of all algorithms is investigated. Clearly, Table 1 indicates that there is a slight increment in the IFV values of all algorithms when the number of clusters increases: in the first case of (a, b, c), when the number of clusters changes from 3 to 4, the IFV value of KFGC increases from 29.8 to 32.9. Meanwhile, the IFV value of MIPFGWC (resp. FGWC) varies from 22.2 (resp. 3.8) to 30.6 (resp. 6.0). When we test with 5 clusters, the IFV value of KFGC increases from 32.9 to 40.7, and the IFV value of MIPFGWC (resp. FGWC) also changes from 30.6 (resp. 6.0) to 37.6 (resp. 7.9). The average increment ratios of KFGC, MIPFGWC and FGWC in the first case of (a, b, c) are 19%, 24.1% and 31.5%, respectively. In the second case, (a, b, c) = (0.35, 0.4, 0.25), the average increment ratios of KFGC, MIPFGWC and FGWC are 20.6%, 29.2% and 28.9%, respectively. The average increment ratios of KFGC in the third, fourth, fifth and sixth cases are 39.7%, 34.5%, 21.7% and 20%, respectively. Similarly, the values of MIPFGWC (resp. FGWC) are 36.3% (resp. 40.8%), 41.1% (resp. 54.6%), 26.4% (resp. 24.9%) and 26.1% (resp. 29.6%), respectively. The average increment ratios of KFGC, MIPFGWC and FGWC over all cases are 25.9%, 30.5% and 35%, respectively. These ratios help us to predict the IFV values of the algorithms for a given number of clusters.

Table 1. IFV values by geographic parameters and C.

C:                 2        3        4        5        6        7
(a, b, c) = (0.3, 0.25, 0.45)
  KFGC(a)       5.0585  29.8394  32.8705  40.6905  52.7738  59.3786
  MIPFGWC       4.2773  22.1791  30.5811  37.6089  42.9346  52.1952
  FGWC(b)       0.7674   3.7982   6.0314   7.8566   9.3534  11.0136
(a, b, c) = (0.35, 0.4, 0.25)
  KFGC(a)       6.4445  26.2245  33.8837  44.7920  55.4991  53.8379
  MIPFGWC       2.0574  19.5822  29.6247  37.4115  45.8490  53.4887
  FGWC(b)       0.8783   4.3664   7.2173   8.8986   9.7352  11.4477
(a, b, c) = (0.7, 0.2, 0.1)
  KFGC(a)      11.0862  17.2164  31.4931  48.8323  49.0544  59.1806
  MIPFGWC       6.7451  16.4691  28.3652  39.1279  43.2684  53.7999
  FGWC(b)       2.0891   7.0151  11.3263  18.7527  22.6014  26.0993
(a, b, c) = (0.55, 0.15, 0.3)
  KFGC(a)       8.3532  20.0305  39.2364  43.5082  54.0188  57.7399
  MIPFGWC       4.0304  15.2013  31.0666  37.5862  47.4099  53.6075
  FGWC(b)       0.6440   4.1440  10.3852  13.4101  16.6708  19.0490
(a, b, c) = (0.34, 0.33, 0.33)
  KFGC(a)       7.8089  24.5602  30.5453  37.4965  51.8698  52.5315
  MIPFGWC       2.0337  20.6662  28.3496  36.5597  46.4185  52.1431
  FGWC(b)       0.4040   4.6698   6.7717   8.7119   9.4352  11.1229
(a, b, c) = (0.5, 0.3, 0.2)
  KFGC(a)       8.07516 26.2767  35.7169  43.7499  44.5557  53.4076
  MIPFGWC       3.4353  20.7994  31.2177  38.3869  42.0896  51.2836
  FGWC(b)       1.0529   6.3679  10.1296  12.0988  14.4043  17.4037

(a) a_u = a_t = a_h = a; b_u = b_t = b_h = b; c_u = c_t = c_h = c.
(b) The b value in FGWC is equal to the sum of b and c.

Thirdly, the impact of the parameters (a, b, c) on the IFV values of all algorithms is investigated. The results show that the IFV values of KFGC and MIPFGWC are stable across the various cases of parameters. Some evidence is given as follows. The average IFV values of KFGC from the first to the sixth case are 36.8, 36.8, 36.1, 37.1, 34.1 and 35.3, respectively; there is roughly a one-IFV-value gap between the maximal and minimal average IFV values of KFGC over those cases. Analogously, the average IFV values of MIPFGWC from the first to the sixth case are 31.6, 31.3, 31.3, 31.5, 31.0 and 31.2, respectively. The FGWC algorithm is not stable, since the difference between its maximal and minimal average IFV values spans several IFV values. From these numbers, we can recognize that the average IFV values of KFGC are still better than those of MIPFGWC and FGWC. Besides, the effectiveness of KFGC is independent of changes in the parameters.

Fourthly, it should be determined which case of parameters yields the best IFV values of KFGC. From the average IFV values of KFGC above, the conclusion is that the fourth case, (a, b, c) = (0.55, 0.15, 0.3), is the best case of parameters, which means that we should set a medium value of parameter a, a low value of b and a high value of c in order to obtain large IFV values in KFGC. However, if each IFV value of the fourth case is observed, the difference between the results of two consecutive numbers
of clusters is irregular. On the contrary, the third case, (a, b, c) = (0.7, 0.2, 0.1), makes the difference between the results of two consecutive numbers of clusters nearly equal. Thus, our recommendation is to choose a large value of a, a medium value of b and a low value of c in order to achieve the best IFV values of KFGC.

Lastly, the computational time of all algorithms is recorded in Table 2. The results show that KFGC runs longer than MIPFGWC and FGWC. Some evidence is as follows. In the first case, when the number of clusters is 2, the computational time of KFGC, MIPFGWC and FGWC is 0.67, 0.18 and 0.11 s, respectively. The average computational time of KFGC from the first to the sixth case is 2.7, 3.4, 3.1, 3.6, 2.7 and 2.8 s, respectively. The corresponding values of MIPFGWC (resp. FGWC) are 1.29 (resp. 0.62), 1.27 (resp. 0.6), 1.31 (resp. 0.59), 1.14 (resp. 0.59), 1.23 (resp. 0.61) and 1.28 (resp. 0.58) seconds, respectively. The average computational time of KFGC over the various cases of parameters and numbers of clusters is 2.43 times larger than that of MIPFGWC and roughly five times larger than that of FGWC. Nonetheless, each run processing a given number of clusters and a case of parameters takes only a few seconds.

Table 2. The computational time of the algorithms by geographic parameters and C (s).

C:                 2        3        4        5        6        7
(a, b, c) = (0.3, 0.25, 0.45)
  KFGC(a)       0.6744   2.1971   2.7883   2.8166   3.1938   4.7668
  MIPFGWC       0.1775   0.5597   1.2513   1.3622   1.7202   2.6462
  FGWC(b)       0.1092   0.3247   0.7456   0.7084   0.8612   0.9814
(a, b, c) = (0.35, 0.4, 0.25)
  KFGC(a)       1.6702   2.7612   3.0743   3.5081   3.9346   5.5056
  MIPFGWC       0.2180   0.6485   1.0134   1.3873   1.8120   2.5304
  FGWC(b)       0.1048   0.3039   0.6639   0.6798   0.9009   1.0003
(a, b, c) = (0.7, 0.2, 0.1)
  KFGC(a)       2.5559   2.8355   2.9155   3.1376   3.4095   3.4632
  MIPFGWC       0.2523   0.6563   1.2005   1.2903   1.9696   2.4851
  FGWC(b)       0.1201   0.3713   0.5420   0.6177   0.8820   1.0611
(a, b, c) = (0.55, 0.15, 0.3)
  KFGC(a)       1.6098   3.4167   3.6922   3.9845   4.2604   4.3596
  MIPFGWC       0.2231   0.7059   0.9596   1.3778   1.6378   1.9538
  FGWC(b)       0.0942   0.3178   0.5180   0.6467   0.7749   1.2478
(a, b, c) = (0.34, 0.33, 0.33)
  KFGC(a)       1.2190   2.2414   2.9427   3.1366   3.3171   3.5604
  MIPFGWC       0.1786   0.6310   0.9881   1.3555   1.7106   2.5288
  FGWC(b)       0.0870   0.3028   0.6041   0.6989   0.8970   1.0985
(a, b, c) = (0.5, 0.3, 0.2)
  KFGC(a)       1.3499   2.1719   2.39459  2.9959   3.7792   4.1296
  MIPFGWC       0.2124   0.6338   0.9905   1.2886   2.4830   2.1251
  FGWC(b)       0.1314   0.3411   0.5347   0.6416   0.7734   1.0946

(a) a_u = a_t = a_h = a; b_u = b_t = b_h = b; c_u = c_t = c_h = c.
(b) The b value in FGWC is equal to the sum of b and c.

Table 3. The IFV values of KFGC in various cases.

C        Case 1   Case 2   Case 3   Case 4   Case 5   Case 6   Case 7
2       11.7049  20.0233  10.1541   8.2307   9.2380  10.4698  11.0862
3       17.9523  25.4080  15.7234  16.3287  15.9463  16.9254  17.2164
4       35.0811  39.7139  30.3731  29.1589  30.7203  31.0940  31.4931
5       51.5421  56.8321  47.8757  48.7489  48.8075  46.9292  48.8323
6       50.4248  59.0267  47.9637  46.4649  47.6161  46.1987  49.0544
7       60.0569  59.6235  57.9032  57.1424  58.7163  56.4657  59.1806

Table 4. The IFV values of KFGC by σ of the Gaussian kernel function.

C       σ = 1.0  σ = 1.5  σ = 2.0  σ = 2.5  σ = 3.0  σ = 3.5  σ = 4.0
2       10.5376   9.4920  11.0862  11.6872  13.7415  14.4181  14.1753
3       15.6419  15.3285  17.2164  17.6214  18.4386  19.8629  18.1979
4       30.6946  29.9652  31.4931  32.9895  33.1232  33.8038  35.4039
5       45.5547  48.0115  48.8323  49.0008  49.6653  50.1130  50.8741
6       46.6749  47.5884  49.0544  50.8182  51.3159  50.0033  52.6601
7       56.5632  58.2222  59.1806  61.0639  60.6796  60.5061  64.1375

Thus, the computational time of KFGC is not too large and can be considered acceptable. The final conclusions of Section 3.2 are:
– The clustering quality of KFGC is better than those of MIPFGWC and FGWC.
– KFGC is stable across various cases of parameters.
– We should choose a large value of parameter a, a medium value of b and a low value of c in order to achieve the best IFV values of KFGC.
– The computational time of KFGC is acceptable.

3.3. The characteristics of KFGC

In this part, the characteristics of KFGC are investigated for the various cases defined below and for different values of the parameter σ of the Gaussian kernel function on the UNO dataset. The aim is to verify the impact of these parameters on the IFV values of KFGC. The results are shown in Tables 3 and 4, respectively.
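To make the role of the Gaussian kernel and its parameter σ concrete, the following Python sketch (an illustration, not the paper's code: the kernel form follows the standard Gaussian kernel of kernel fuzzy c-means, e.g. [38], and all function names are assumptions) computes the kernel similarity K(X_k, V_j) and the kernel-induced distance 1 − K(X_k, V_j) that KFGC uses in place of the Euclidean distance:

```python
import math

def gaussian_kernel(x, v, sigma=2.0):
    """Gaussian kernel similarity K(x, v) = exp(-||x - v||^2 / sigma^2).

    NOTE: this is the standard form used in Gaussian-kernel fuzzy c-means
    (e.g. Yang & Tsai [38]); the exact normalisation inside KFGC's update
    equations (19)-(22) may differ, so treat this as a sketch.
    """
    squared_distance = sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    return math.exp(-squared_distance / sigma ** 2)

def kernel_distance(x, v, sigma=2.0):
    """Kernel-induced distance 1 - K(x, v), bounded in [0, 1]: a remote
    outlier contributes at most 1 to the objective function, unlike the
    unbounded squared Euclidean distance."""
    return 1.0 - gaussian_kernel(x, v, sigma)

# A larger sigma flattens the kernel, so every point looks more similar
# to every center; with a small sigma the same point looks dissimilar.
center = [0.0, 0.0]
point = [1.0, 1.0]
print(gaussian_kernel(point, center, sigma=4.0) >
      gaussian_kernel(point, center, sigma=1.0))  # True
```

Because 1 − K never exceeds 1, a far-away outlier cannot dominate the clustering objective the way an unbounded Euclidean term can, which is the intuition behind KFGC's robustness to noise and outliers; the flattening effect of a larger σ is consistent with the observation below that higher σ yields higher IFV values.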
Case 1 (a_u > a_t > a_h): (a_u, b_u, c_u) = (0.7, 0.2, 0.1); (a_t, b_t, c_t) = (0.6, 0.15, 0.25); (a_h, b_h, c_h) = (0.5, 0.2, 0.3).
Case 2 (a_u > a_h > a_t): (a_u, b_u, c_u) = (0.7, 0.2, 0.1); (a_t, b_t, c_t) = (0.5, 0.2, 0.3); (a_h, b_h, c_h) = (0.6, 0.15, 0.25).
Case 3 (a_t > a_u > a_h): (a_u, b_u, c_u) = (0.6, 0.15, 0.25); (a_t, b_t, c_t) = (0.7, 0.2, 0.1); (a_h, b_h, c_h) = (0.5, 0.2, 0.3).
Case 4 (a_t > a_h > a_u): (a_u, b_u, c_u) = (0.5, 0.2, 0.3); (a_t, b_t, c_t) = (0.7, 0.2, 0.1); (a_h, b_h, c_h) = (0.6, 0.15, 0.25).
Case 5 (a_h > a_u > a_t): (a_u, b_u, c_u) = (0.6, 0.15, 0.25); (a_t, b_t, c_t) = (0.5, 0.2, 0.3); (a_h, b_h, c_h) = (0.7, 0.2, 0.1).
Case 6 (a_h > a_t > a_u): (a_u, b_u, c_u) = (0.5, 0.2, 0.3); (a_t, b_t, c_t) = (0.6, 0.15, 0.25); (a_h, b_h, c_h) = (0.7, 0.2, 0.1).
Case 7 (a_u = a_t = a_h): (a_u, b_u, c_u) = (a_t, b_t, c_t) = (a_h, b_h, c_h) = (0.7, 0.2, 0.1).

From Table 3, we recognize that Case 2 produces the best results among all cases. When C = 2, the IFV value of Case 2 is 20, which is 1.71, 1.97, 2.43, 2.17, 1.91 and 1.81 times larger than those of Case 1, Case 3, Case 4, Case 5, Case 6 and Case 7, respectively. The average IFV value of Case 2 over the numbers of clusters is 43.4, whilst those of Case 1 and of Case 3 to Case 7 are 37.79, 34.99, 34.34, 35.17, 34.68 and 36.14, respectively. These average IFV values point out the order of the cases in terms of best IFV: Case 2, Case 1, Case 7, Case 5, Case 3, Case 6 and Case 4. This order gives us two remarks:
– In order to achieve the best IFV values of KFGC, the parameters should be set as in Case 2 (a_u > a_h > a_t).
– The changing of IFV in KFGC can be observed through this order.

Now, the IFV values of KFGC for different values of σ of the Gaussian kernel function are examined in Table 4. The results reveal that the IFV values of KFGC are in direct proportion to the value of σ; in other words, the higher the value of σ, the larger the IFV that KFGC can achieve. For example, when C = 4, the IFV values of KFGC from σ = 1.0 to σ = 4.0 are 30.7, 29.9, 31.5, 32.9, 33.1, 33.8 and 35.4, respectively. Thus, a high value of σ should be set in order to achieve the best IFV values of KFGC. The final conclusions of Section 3.3 are:
– In order to obtain the best clustering quality in KFGC, either the parameters should be set as a_u > a_h > a_t or the value of parameter σ should be high;
– The changes of the IFV values of KFGC across the various cases and parameters can be referenced in Tables 3 and 4.

3.4. The validation of spatial relationships and outlier elimination in KFGC

This section validates the capability of KFGC to produce results which are (i) more closely related to the spatial relationships and (ii) contain fewer outliers than other relevant algorithms, as stated in Section 1. The experiments were conducted on the Coil 2000 dataset, where the first 43 attributes relate to socio-demographic data and the last 43 attributes describe product ownership. The target variable has two classes, "0" and "1", showing the number of mobile home policies. The distribution of the dataset according to the attribute "Customer Subtype" is depicted in Fig. 2. The accurate classification results of the Coil 2000 dataset, comprising 9236 instances of class "0" (Group 1) and 586 instances of class "1" (Group 2) according to the description of the dataset, are shown in Fig. 3.

Using the FGWC method [14], the classification results on the Coil 2000 dataset are depicted in Fig. 4. Some analyses follow. Obviously, the number of wrong predictions increases remarkably in comparison with the accurate distribution in Fig. 3. The number of data points in "Group 2" (blue color) is larger than in Fig. 3 and tends to expand over the entire space instead of a small right-corner sub-space. Some data points in the left corner were changed from "Group 1" to "Group 2", so the number of data points belonging to "Group 2" in this area of Fig. 4 rises dramatically. The number of boundary points belonging to "Group 2" is also added up. The reason for this fact is
that FGWC was constructed on the basis of the traditional fuzzy sets and the SIM model, so the classification results contain large numbers of outliers. As mentioned in Section 1, the SIM model considers the update of the (old) neighboring groups only and does not take into account the newly updated neighboring groups, so the classification results of FGWC are less closely related to the spatial relationships and contain a large number of outliers. Analogously, the classification results of the MIPFGWC algorithm [26] are illustrated in Fig. 5. Several crucial remarks can be recognized from this figure.

Fig. 2. The distribution of the Coil 2000 dataset by "Customer Subtype".

Fig. 3. The accurate distribution of the Coil 2000 dataset.

Fig. 4. The classification results of FGWC.

The number of wrong predictions in MIPFGWC reduces remarkably in comparison with that in FGWC. Even though the data points of "Group 2" are still located over the entire space, it can be seen that the problem of data concentration in the left corner and on the boundary sides does not exist in MIPFGWC, as illustrated in Fig. 5. The interactions between cluster memberships in the SIM2 model re-calculate the membership values and produce the correct labels for data points. The number of correct classification results of this algorithm is 8068 over 9822 labeled data points. MIPFGWC improves FGWC in two crucial points: (i) the algorithm is deployed on intuitionistic fuzzy sets instead of traditional fuzzy sets in order to handle the vagueness in the membership function and to process the hesitation levels, thus giving more accurate clustering results; (ii) the SIM2 model, which takes into account the newly updated neighboring groups, is used instead of the SIM model. Thus, MIPFGWC not only has better clustering quality than FGWC but also contains fewer outliers than that algorithm. Nevertheless, if the results in Fig. 5 are observed carefully, it can be seen that some data points, such as those in the upper-right corner (the blanks and the boundaries), are not labeled. These points are the outliers of the MIPFGWC algorithm. Since the SIM2 model does not update the typicality values and the hesitation levels, the classification results are incorrect, because the calculation of the membership matrix is performed through those values.

Fig. 5. The classification results of MIPFGWC.

Fig. 6. The classification results of KFGC.

In the intuitionistic possibilistic fuzzy model, the typicality values and the hesitation levels have great impact on the decision of which cluster a data point belongs to. Thus, the local update with only the previous cluster memberships makes the next memberships diversified, and outliers can occur as a result. Fig. 6 shows the classification results of the KFGC algorithm, with some remarks as follows. By comparing the data points of "Group 2" in Figs. 3–6, it is clear that the distribution of those points for KFGC in Fig. 6 is more similar to the accurate distribution in Fig. 3 than those of MIPFGWC (Fig. 5) and FGWC (Fig. 4), which are either irregular, misclassified or data-concentrated. This proves the capability of KFGC to produce results which are more closely related to spatial relationships. In addition, the number of outliers is reduced, and the KFGC algorithm produces better classification results than the other algorithms, as shown in Fig. 6. In this case, the number of correct classification results of this algorithm is 9244 over 9822 labeled data points. By providing the new SIM2 update scheme in the objective function, the spatial bias correction and the Gaussian kernel function, the classification results of KFGC are shown to be closely related to spatial relationships. From these figures, it is obvious that KFGC achieves better classification results than the other algorithms, as proven by the following facts. The quantities of "Group 1" in both KFGC and MIPFGWC are nearly equal and larger than that of FGWC, with the numbers being 8680, 8660 and 7602, respectively. Those numbers are smaller than the actual value, which is 9236 data points belonging to "Group 1". Similar facts are found for "Group 2". The classification accuracies of KFGC, MIPFGWC and FGWC are 94.1%, 87.6% and 83.1%, respectively. The final conclusions of Section 3.4 are:
– KFGC is more efficient than the other relevant algorithms in terms of robustness to outliers and spatially-guaranteed results;
– The accuracy of KFGC is approximately 94%.

4. Conclusions

In this paper, a novel kernel-based intuitionistic possibilistic fuzzy clustering algorithm called KFGC was introduced for the Geo-Demographic Analysis problem. The objective function of KFGC employs the Gaussian kernel function instead of the traditional Euclidean distance, uses membership values, typicality values and hesitation levels updated by the SIM2 model, and makes the spatial bias correction through standardized weights. By doing so, KFGC can produce results which are more closely related to spatial relationships and can eliminate outliers, since the membership values, the hesitation levels, the typicality values and the centers are "geographically aware". Some properties of KFGC's solutions and the comparison of clustering quality between KFGC and other algorithms were theoretically validated. The experimental results on a benchmark dataset also re-affirmed that the clustering quality of KFGC is better than those of other relevant algorithms. Further research will investigate the use of context variables in KFGC and variants of this algorithm in distributed environments.

Acknowledgements

The author is greatly indebted to the editor-in-chief, Prof. W. Pedrycz, and the anonymous reviewers for their comments and valuable suggestions that improved the quality and clarity of the paper. Other thanks
are sent to Mr. Nguyen Van Canh and Ms. Hoang Thi Thu Huong, FPT, for some experimental work and language editing, respectively. This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.05-2014.01.

Appendix A

Proof of Theorem. Fix T, H and V and consider the kth column U_k of U. The objective function reduces to a constrained minimization over the memberships u_kj, as in (A.1)–(A.2), in which each u_kj enters through the SIM2-weighted combination of the memberships of the neighboring groups and through the kernel-induced distance 1 − K(X_k, V_j). Introducing a Lagrange multiplier λ_k for the constraint (14) gives the Lagrangian L(U_k, λ_k) in (A.3). Setting ∂L(U_k, λ_k)/∂u_kj = 0 yields (A.4)–(A.5); substituting the value of λ_k obtained from the constraint in (A.6) produces the closed-form update of u_kj in (A.7), whose leading factor has the familiar fuzzy c-means form

u_kj ∝ (1 − K(X_k, V_j))^(−1/(m−1)) / Σ_{l=1..C} (1 − K(X_k, V_l))^(−1/(m−1)),

corrected by the SIM2 neighborhood terms, for k = 1, N and j = 1, C. Similarly, fixing U, T and V for the kth column H_k of H and using a Lagrange multiplier gives the update of the hesitation levels h_kj in (A.8), with the exponent 1/(m − 1) replaced by 1/(s − 1). Next, fixing U, H and V for the typicality value t_kj gives the reduced problem (A.9)–(A.11); setting ∂J(t_kj)/∂t_kj = 0 in (A.12) yields the closed-form update of t_kj in (A.13), which involves the ratio of the penalty coefficient c_j to a term containing (1 − K(X_k, V_j)) raised to the power 1/(g − 1). Finally, taking the derivative of J_{m,g,s}(V) with respect to each V_j in (A.14) and setting it to zero gives the center update (A.15): V_j is a weighted mean of the data points X_k, where the weight of X_k aggregates the mth power of its SIM2-corrected membership term, the sth power of its hesitation term and the gth power of its typicality term, multiplied by the kernel value K(X_k, V_j), so that the new centers remain "geographically aware". □

Appendix B

Proof of Theorem. Starting from the closed-form updates above, the squared deviation between the membership matrices of KMIPFGWC and MIPFGWC, ||U^(KMIPFGWC) − U^(MIPFGWC)||², is bounded above by N × C times the squared deviation of a single entry; estimating that entry through the SIM2 neighborhood terms and the kernel distances 1 − K(X_k, V_j) gives the bounds (B.1)–(B.2). The same argument applied to the hesitation levels yields the estimate (B.3), and applied to the typicality values, using the closed form of t_kj from (A.13), yields (B.4)–(B.6), whose final bound involves the ratios c_j / (c_j + (c_t / A_t)-weighted terms). □

References

[1] M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, T. Moriarty, A modified fuzzy c-means algorithm for bias field estimation and segmentation of MRI data, IEEE Trans. Med. Imaging 21 (2002) 193–199.
[2] S.C. Chen, D.Q. Zhang, Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure, IEEE Trans. Syst. Man Cybernet. Part B 34 (2004) 1907–1916.
[3] H. Chunchun, M. Lingkui, S. Wenzhong, Fuzzy clustering validity for spatial data, Geo-spatial Inform. Sci. 11 (3) (2008) 191–196.
[4] B.C. Cuong, L.H. Son, H.T.M. Chau, Some context fuzzy clustering methods for classification problems, in: Proceedings of the 2010 ACM Symposium on Information and Communication
Technology, 2010, pp. 34–40.
[5] P. Day, J. Pearce, D. Dorling, Twelve worlds: a geo-demographic comparison of global inequalities in mortality, J. Epidemiol. Community Health 62 (2008) 1002–1010.
[6] Z. Feng, R. Flowerdew, Fuzzy geodemographics: a contribution from fuzzy clustering methods, in: S. Carver (Ed.), Innovations in GIS 5, Taylor & Francis, London, 1998, pp. 119–127.
[7] H. Fritz, L.A. García-Escudero, A. Mayo-Iscar, Robust constrained fuzzy clustering, Inform. Sci. 245 (2013) 38–52.
[8] C. Gu, S. Zhang, K. Liu, H. Huang, Fuzzy kernel k-means clustering method based on immune genetic algorithm, J. Comput. Inform. Syst. (1) (2011) 221–231.
[9] H.C. Huang, Y.Y. Chuang, C.S. Chen, Multiple kernel fuzzy clustering, IEEE Trans. Fuzzy Syst. 20 (1) (2012) 120–134.
[10] J. Ji, W. Pang, C. Zhou, X. Han, Z. Wang, A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data, Knowl.-Based Syst. 30 (2012) 129–135.
[11] E. Keogh, C.A. Ratanamahatana, Exact indexing of dynamic time warping, Knowl. Inform. Syst. (3) (2005) 358–386.
[12] V. Loia, W. Pedrycz, S. Senatore, P-FCM: a proximity-based fuzzy clustering for user-centered web applications, Int. J. Approx. Reason. 34 (2) (2003) 121–144.
[13] M. Loureiro, F. Bação, V. Lobo, Fuzzy classification of geodemographic data using self-organizing maps, in: Proceedings of the 4th International Conference of GIScience 2006, 2006, pp. 123–127.
[14] G.A. Mason, R.D. Jacobson, Fuzzy geographically weighted clustering, in: Proceedings of the 9th International Conference on GeoComputation (Electronic Proceedings on CD-ROM), 2007.
[15] W. Pedrycz, B.J. Park, S.K. Oh, The design of granular classifiers: a study in the synergy of interval calculus and fuzzy sets in pattern recognition, Pattern Recognit. 41 (12) (2008) 3720–3735.
[16] J. Petersen, M. Gibin, P. Longley, P. Mateos, P. Atkinson, D. Ashby, Geodemographics as a tool for targeting neighbourhoods in public health campaigns, J. Geogr. Syst. 13 (2011) 173–192.
[17] W. Pedrycz, Granular Computing: Analysis and Design of Intelligent Systems, CRC Press, 2013.
[18] G. Peters, Rough clustering utilizing the principle of indifference, Inform. Sci. 277 (2014) 358–374.
[19] D.K. Rossmo, Recent developments in geographic profiling, Policing (2) (2012) 144–150.
[20] M. Rostam Niakan Kalhori, M.H. Fazel Zarandi, I.B. Turksen, A new credibilistic clustering algorithm, Inform. Sci. 279 (2014) 105–122.
[21] L.H. Son, Enhancing clustering quality of geo-demographic analysis using context fuzzy clustering type-2 and particle swarm optimization, Appl. Soft Comput. 22 (2014) 566–584.
[22] L.H. Son, HU-FCF: a hybrid user-based fuzzy collaborative filtering method in recommender systems, Expert Syst. Appl. 41 (15) (2014) 6861–6870.
[23] L.H. Son, Optimizing municipal solid waste collection using chaotic particle swarm optimization in GIS based environments: a case study at Danang City, Vietnam, Expert Syst. Appl. 41 (18) (2014) 8062–8074.
[24] L.H. Son, DPFCM: a novel distributed picture fuzzy clustering method on picture fuzzy sets, Expert Syst. Appl. 42 (1) (2015) 51–66.
[25] L.H. Son, B.C. Cuong, P.L. Lanzi, N.T. Thong, A novel intuitionistic fuzzy clustering method for geo-demographic analysis, Expert Syst. Appl. 39 (10) (2012) 9848–9859.
[26] L.H. Son, B.C. Cuong, H.V. Long, Spatial interaction – modification model and applications to geo-demographic analysis, Knowl.-Based Syst. 49 (2013) 152–170.
[27] L.H. Son, P.L. Lanzi, B.C. Cuong, H.A. Hung, Data mining in GIS: a novel context-based fuzzy geographically weighted clustering algorithm, Int. J. Mach. Learn. Comput. (3) (2012) 235–238.
[28] L.H. Son, N.D. Linh, H.V. Long, A lossless DEM compression for fast retrieval method using fuzzy clustering and MANFIS neural network, Eng. Appl. Artif. Intell. 29 (2014) 33–42.
[29] L.H. Son, N.T. Thong, Intuitionistic fuzzy recommender systems: an effective tool for medical diagnosis, Knowl.-Based Syst. 74 (2015) 133–150.
[30] P. Thakur, C. Lingam, Generalized spatial kernel based fuzzy c-means clustering algorithm for image segmentation, Int. J. Sci. Res. (5) (2013) 165–169.
[31] P.H. Thong, L.H. Son, A new approach to multi-variables fuzzy forecasting using picture fuzzy clustering and picture fuzzy rules interpolation method, in: Proceedings of the 6th International Conference on Knowledge and Systems Engineering, 2014, pp. 679–690.
[32] UCI Machine Learning Repository, COIL 2000, 2000 (accessed 06.01.14).
[33] UNSD Statistical Databases, Demographic Yearbook, 2011 (accessed 14.07.12).
[34] N. Walford, An Introduction to Geodemographic Classification (Census Learning), 2011.
[35] Z. Wu, W.X. Xie, J.P. Yu, Fuzzy c-means clustering algorithm based on kernel method, in: Proceedings of the Fifth International Conference on Computational Intelligence and Multimedia Applications, 2003, pp. 49–56.
[36] W. Wang, X. Liu, Fuzzy forecasting based on automatic clustering and axiomatic fuzzy set classification, Inform. Sci. 294 (2015) 78–94.
[37] H.J. Xing, M.H. Ha, Further improvements in feature-weighted fuzzy c-means, Inform. Sci. 267 (2014) 1–15.
[38] M.S. Yang, H.S. Tsai, A Gaussian kernel-based fuzzy c-means algorithm with a spatial bias correction, Pattern Recognit. Lett. 29 (12) (2008) 1713–1725.
[39] S.M.R. Zadegan, M. Mirzaie, F. Sadoughi, Ranked k-medoids: a fast and accurate rank-based partitioning algorithm for clustering large datasets, Knowl.-Based Syst. 39 (2013) 133–143.
[40] M. Zarinbal, M.H. Fazel Zarandi, I.B. Turksen, Relative entropy fuzzy c-means clustering, Inform. Sci. 260 (2014) 74–97.

Lists of abbreviations:
GIS: Geographical Information Systems
GDA: Geo-Demographic Analysis
SIM: Spatial Interaction Model
SIM-PF: Spatial ...

... determination of common population's characteristics and the study of population variation. This clearly demonstrates the important role of GDA in practical applications nowadays. ... decrease the clustering quality. Let us make a deeper analysis of this consideration. (a) The similarity measure for clustering is the Euclidean function. According to Keogh and Ratanamahatana [11],