When this assignment process is over, a new centroid is calculated for each cluster from the pixels assigned to it. One drawback of K-means is that it is sensitive to the initially selected points, so it does not always produce the same output. Moreover, it is sensitive to noise and outlier data points, since a small number of such points can substantially influence the mean value. To mitigate this problem, the algorithm may be run many times, taking the average of the values over all runs, or at least the median. The initialization phase randomly generates the initial population P0 of Z solutions, which might contain illegal strings. Obviously, to obtain a restructuring model of the modified software system under these conditions, the HAC clustering algorithm in our approach can be applied from scratch every time the set of application classes changes.
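The effect of a single outlier on the mean, and the relative stability of the median, can be illustrated with a small sketch. The intensity values below are hypothetical and chosen only for illustration; they do not come from the data discussed here.

```python
from statistics import mean, median

# Hypothetical one-dimensional cluster of pixel intensities (illustrative values).
values = [10, 11, 12, 13, 14]

# The same cluster with one outlier added.
with_outlier = values + [200]

# The mean moves far away from the bulk of the data once the outlier is present,
# while the median barely changes.
print("mean without outlier:  ", mean(values))
print("median without outlier:", median(values))
print("mean with outlier:     ", mean(with_outlier))
print("median with outlier:   ", median(with_outlier))
```

This is why a centroid computed as a mean can be pulled away from its cluster by a few noisy points, and why a median is sometimes preferred as a more robust summary.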

However, it is hard to generate optimal clusters. According to Figure 2, class1 and class2 have greater similarity, that is, a smaller distance, and are merged together at the first level. Because many clustering techniques are initialized randomly, giving each point an equal chance of being an initial point, this randomness is considered the main weakness that has to be addressed.

This process iterates until the criterion function converges.
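The iteration described here can be sketched as a minimal K-means loop. This is an illustrative sketch under stated assumptions, not the implementation used in the source: the function names `squared_distance` and `kmeans`, the tolerance `tol`, the iteration cap `max_iter`, and the choice of random data points as initial centroids are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed=0, tol=1e-9, max_iter=100):
    """Minimal K-means sketch; stops when the squared error stops decreasing."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    previous_error = float("inf")
    for _ in range(max_iter):
        # Assignment step: each point goes to the closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
        # Criterion function: sum of squared distances to the assigned centroid.
        error = sum(
            squared_distance(p, centroids[i])
            for i, members in enumerate(clusters)
            for p in members
        )
        if previous_error - error <= tol:
            break
        previous_error = error
    return centroids, clusters, error

# Usage with two well-separated hypothetical groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters, error = kmeans(points, k=2)
print(sorted(centroids), error)
```

Because the starting centroids are random, different seeds may follow different paths; on data with less clearly separated groups they may also end in different final partitions.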

In the following, we give a brief description of the three genetic operators. Each input data point is then allocated to one of the existing clusters according to the squared Euclidean distance from the clusters, choosing the closest one.

We also introduce algorithms that integrate the ideas of several clustering methods.

There are a number of directions in which research on ant-based clustering can be continued.

Further enhancements will include the study of higher-dimensional data sets and large data sets for clustering. This process is repeated until there is no change in the centroids. Chapter 7, Conclusion and Future Work: this chapter presents the conclusion and future scope of the dissertation.

The algorithm attempts to determine K partitions that minimize the squared-error function.
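For reference, the squared-error function is commonly written as follows. The notation is an assumption chosen for this illustration: C_i is the i-th cluster, m_i is its mean, and p ranges over the points assigned to C_i.

```latex
E = \sum_{i=1}^{K} \sum_{p \in C_i} \lvert p - m_i \rvert^{2}
```

A smaller value of E means the points lie closer to their cluster means, so minimizing E favors compact clusters.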

Four widely used measures for the distance between clusters are as follows, where |p - p'| is the distance between two objects or points p and p', m_i is the mean of cluster C_i, and n_i is the number of objects in C_i [5].
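The four measures themselves are not listed in the text. Assuming they are the ones usually given with this notation in textbook treatments of hierarchical clustering (minimum, maximum, mean, and average distance), they can be written as:

```latex
\begin{align*}
d_{\min}(C_i, C_j)  &= \min_{p \in C_i,\; p' \in C_j} \lvert p - p' \rvert \\
d_{\max}(C_i, C_j)  &= \max_{p \in C_i,\; p' \in C_j} \lvert p - p' \rvert \\
d_{\text{mean}}(C_i, C_j) &= \lvert m_i - m_j \rvert \\
d_{\text{avg}}(C_i, C_j)  &= \frac{1}{n_i n_j} \sum_{p \in C_i} \sum_{p' \in C_j} \lvert p - p' \rvert
\end{align*}
```

Under this assumption, the minimum distance corresponds to single-linkage merging and the maximum distance to complete-linkage merging.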

