2.3 Optimization Methods for Energy Efficient Routing Protocols
2.3.1 Clustering Algorithms for Cluster-based Protocols
Although clustering techniques are commonly used for data management and pattern recognition, they can be applied to a large number of problems in different research areas. As mentioned above, these techniques provide a method to reduce the size of the network to be controlled by grouping sensor nodes into clusters of manageable size, which assists the improvement of energy efficiency [16]. Clustering techniques deal with grouping objects according to a given measure, imposing a high degree of structure on an otherwise unstructured set of objects. Many clustering algorithms have been proposed in the literature. However, the main problem with data clustering algorithms is that they cannot be standardised: an algorithm may give the best result with one type of data set but fail, or give a poor result, with a data set of another type [73]. Clustering algorithms can be broadly classified into two categories: (1) parametric clustering and (2) non-parametric clustering [74]. Parametric clustering methods attempt to minimize a cost function or an optimality criterion, which associates a cost to each instance-cluster assignment. This type of method usually includes some assumptions about the underlying data structure; it is assumed that a finite set of parameters is sufficient to describe the data distribution. Algorithms of this kind, such as k-means and fuzzy C-means, are classified into parametric clustering algorithms [75, 77, 78]. Meanwhile, the non-parametric type assumes that the data distribution cannot be defined in terms of such a finite set of parameters; it can often be defined only by assuming an infinite-dimensional parameter space. No assumption about the underlying data distribution is required.
Furthermore, non-parametric methods do not require an explicit representation of the data in a Euclidean form; they only require a matrix of pair-wise similarities based on a predefined distance. However, such a pair-wise similarity matrix can require a great deal of storage space, making these algorithms inapplicable to many real-life problems, where the data set to be clustered is typically large.
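To make this storage concern concrete, the following sketch (our own illustration; the function name and data sizes are arbitrary) builds the full pair-wise distance matrix for N = 1000 two-dimensional points and prints its memory footprint, which grows as O(N²):

```python
import numpy as np

# Our own illustration of the storage cost of a pair-wise distance matrix.
def pairwise_distances(points):
    """Full N x N Euclidean distance matrix for an (N, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]    # shape (N, N, d)
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
pts = rng.random((1000, 2))       # 1000 points, e.g. 2-D sensor positions
D = pairwise_distances(pts)
print(D.shape)                    # (1000, 1000)
print(D.nbytes / 1e6, "MB")       # 8.0 MB -- quadratic growth in N
```

At one million points the same matrix would occupy terabytes, which is why methods that need only a cost function per instance-cluster assignment scale more gracefully.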
Density-based Spatial Clustering of Applications with Noise (DBSCAN) and hierarchical clustering algorithms, which include agglomerative methods such as BIRCH, CURE and ROCK, and divisive methods such as DIANA and MONA, belong to this type [79]. Table 2.1 shows a comparison of some typical clustering algorithms.
Table 2.1: Typical clustering algorithms

K-means
  Advantages:
  - It is fast, robust and easy to understand.
  - It is relatively efficient.
  - It gives the best results when the data sets are distinct or well separated from each other.
  Disadvantages:
  - The learning algorithm requires a-priori specification of the number of clusters.
  - The learning algorithm provides the local optima of the squared-error function.
  - If there are two highly overlapping sets of data, k-means will not be able to resolve that there are two clusters.
  Complexity: O(NKd)

Fuzzy C-means
  Advantages:
  - It gives the best results for overlapped data sets and is comparatively better than the k-means algorithm.
  - Unlike k-means, here a data point may belong to more than one cluster.
  Disadvantages:
  - A-priori specification of the number of clusters is required.
  - Euclidean distance measures can unequally weight underlying factors.
  Complexity: O(N)

Gaussian Mixture Density Decomposition
  Advantages:
  - It gives extremely useful results for real-world data sets.
  - It is easy to implement and gives the best results in some cases.
  Disadvantages:
  - The algorithm is highly complex in nature.
  Complexity: -

Hierarchical clustering algorithms
  Advantages:
  - No a-priori information about the number of clusters is required.
  Disadvantages:
  - The algorithms are sensitive to noise and outliers, break large clusters, and have difficulty handling clusters of different sizes and convex shapes.
  - No objective function is directly minimized.
  - Sometimes it is difficult to identify the correct number of clusters from the dendrogram.
  - It is more complex than parametric methods.
  Complexity: O(N² log N)

Density-based Spatial Clustering of Applications with Noise (DBSCAN)
  Advantages:
  - It does not require a-priori specification of the number of clusters.
  - It is able to identify noise data while clustering.
  - It is able to find clusters of arbitrary size and arbitrary shape.
  Disadvantages:
  - It fails in the case of varying-density clusters.
  - It is more complex than parametric methods.
  Complexity: O(N log N)
In cluster-based WSNs, clustering techniques are essential for network formation, which defines the structure of the network and the connectivity among the sensor nodes for data transmission. However, WSNs and their fundamental components, the sensor nodes, have very limited computation ability, small memory storage and finite energy sources. Therefore, the clustering algorithms used for cluster formation should be simple, fast and efficient. Amongst the algorithms mentioned above, K-means and Fuzzy C-means are good candidates for improving the cluster formation of cluster-based WSNs.
K-means is one of the most well-known and simplest clustering techniques [75, 76]. The objective of k-means is to organize a given number of objects into k disjoint groups in a hard-partitioning manner. The main idea is to define k centroids, one for each cluster. Starting from the initial selection of k centroids, k-means iteratively updates the clustering until the algorithm converges, i.e., until a stopping criterion such as a maximum number of iterations or no change of the fitness value is satisfied. In every iteration, each data point (object) is associated with the nearest centroid. When all points have been allocated to clusters, the present iteration is finished, and the algorithm recalculates the k centroids based on the configuration obtained in that iteration. The next iteration then starts to improve the clustering. This iterative process updates the centroids and the associated clusters in each step until no further improvement can be achieved.
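The assignment and update steps described above can be sketched in a few lines of Python. This is our own minimal illustration of plain k-means on 2-D points (which could stand in for sensor positions); it is not a WSN protocol, and all names are our own:

```python
import random

def kmeans(points, k, max_iter=100):
    """Plain k-means sketch: points is a list of (x, y) tuples."""
    centroids = random.sample(points, k)              # initial choice of k centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: associate each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2
                                      + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        # Update step: recompute each centroid as the mean of its cluster.
        new_centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:                # stopping criterion: no change
            break
        centroids = new_centroids
    return centroids, clusters

# Usage: group 20 random sensor positions into k = 3 clusters.
random.seed(1)
sensors = [(random.random(), random.random()) for _ in range(20)]
cents, cls = kmeans(sensors, 3)
print(len(cents), sum(len(c) for c in cls))   # 3 20
```

Note the hard-partitioning property: every point ends up in exactly one cluster, which is the behaviour FCM relaxes below.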
k-means has already been successfully used in the context of wireless ad-hoc networks by Fernandess and Malkhi [20] in order to limit the amount of routing information stored and maintained at individual hosts. In WSNs, Tan et al. [21] have employed k-means in their proposed BPK-means protocol to assist network formation.
FCM is another well-known fuzzy clustering algorithm, first proposed by Bezdek [77]. Like k-means, the objective of the FCM clustering algorithm is to group a set of objects into a given number of clusters. However, it differs from k-means in that, instead of hard-partitioning the data points (objects) into exactly one cluster each, FCM establishes overlapping clusters: each object is assigned a degree of belonging to every cluster rather than being a member of just one cluster. This also makes the algorithm a promising candidate for WSNs, which exhibit a high degree of randomness.
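The degree-of-belonging idea can be sketched with the standard FCM update rules. This is our own illustrative NumPy code, not the exact formulation of [77]; the fuzzifier value m = 2 and all parameter names are our own choices:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal FCM sketch. X is an (n, d) array of points; the returned U is
    an (n, c) membership matrix where U[i, j] is the degree to which point i
    belongs to cluster j, and each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                   # normalise memberships
    for _ in range(max_iter):
        Um = U ** m                                     # fuzzified memberships
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted cluster means
        # Distances from every point to every centre (epsilon avoids /0).
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-10
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)    # membership update rule
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centres, U

# Usage: two separated groups of 2-D sensor positions, clustered with c = 2.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (15, 2)), rng.normal(1.0, 0.1, (15, 2))])
centres, U = fuzzy_c_means(X, c=2)
print(centres.shape)   # (2, 2)
```

Unlike the k-means sketch, no point is assigned exclusively to one group: the rows of U carry the graded memberships, which is what makes FCM attractive when cluster boundaries in a randomly deployed network are fuzzy.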