Clustering analysis is a method of categorizing a set of objects into groups such that the objects within a group have similar attribute values. The clustering process should not rely on the intervention of a human expert or on additional information that guides the grouping of the objects. In this study, two clustering algorithms are examined: the k-means algorithm and the genetic algorithm. The simple iris dataset is used to analyse the performance of both algorithms. Suitable designs and implementations of the two algorithms that enhance their performance and produce better results are discussed. Several methods are used to measure the performance of the algorithms; the main measure is how well the resulting clusters match the given class labels. Note that the class labels are used only for verification and are not involved in the clustering process of either algorithm. The implementations of the two algorithms are tailored specifically to the iris dataset: the number of attributes is fixed at four, and the value of k for the k-means algorithm is fixed at three. The dataset contains 150 instances in total. The main components of the k-means algorithm are the initialization of the centroids, the assignment of every object to a cluster, and the recomputation of the centroids from the clustered objects. For the genetic algorithm, the main components are the problem encoding scheme, the penalty evaluation function, the selection of parents and children, and the genetic operations of crossover and mutation.
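The three k-means components named above (centroid initialization, assignment of every object to a cluster, and centroid recomputation) can be illustrated with a minimal sketch. This is not the implementation used in the study: the squared Euclidean distance, the random choice of initial centroids from the data, the stopping rule, and the names `k_means`, `squared_distance` and `demo_points` are all assumptions made for illustration, and the demo points are made-up four-attribute values, not iris measurements.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k=3, init=None, max_iter=100, seed=0):
    """Cluster `points` into k groups; returns (assignments, centroids).

    `init` optionally supplies the k starting centroids. When it is omitted,
    k distinct objects are drawn at random from the data (an assumed strategy).
    """
    rng = random.Random(seed)
    # Component 1: initialization of the centroids.
    if init is None:
        init = rng.sample(points, k)
    centroids = [list(c) for c in init]

    assignments = None
    for _ in range(max_iter):
        # Component 2: assign every object to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Assumed stopping rule: stop once no object changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Component 3: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical data: nine objects with four attributes each, in three
# well-separated groups. These values are invented for the demonstration.
demo_points = [
    (1.0, 1.0, 1.0, 1.0), (1.2, 0.9, 1.1, 1.0), (0.9, 1.1, 1.0, 1.2),
    (5.0, 5.0, 5.0, 5.0), (5.1, 4.9, 5.2, 5.0), (4.9, 5.1, 5.0, 4.8),
    (9.0, 9.0, 9.0, 9.0), (9.2, 8.9, 9.1, 9.0), (8.8, 9.1, 9.0, 9.1),
]

# One starting centroid taken from each group, so the run is deterministic.
demo_labels, demo_centroids = k_means(
    demo_points, k=3, init=[demo_points[0], demo_points[3], demo_points[6]]
)
```

The cluster indices returned by such a sketch are arbitrary, so comparing them with class labels (as the study does for verification) would first require mapping each cluster to a class, for example by majority vote within the cluster.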