Wednesday, December 28, 2011

K-Means Algorithm (Clustering Algorithm)

1) Basic Algorithm (based on the relocation algorithm)


Step 1. Select K data points as the initial representatives (centroids).
Step 2. For i = 1 to N, assign item xi to the most similar centroid (this gives K clusters).
Step 3. For j = 1 to K, recalculate the cluster centroid Cj.
Step 4. Repeat steps 2 and 3 until there is little or no change in the clusters.
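
The four steps above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: it assumes items are numeric tuples, treats "most similar" as smallest squared Euclidean distance, and takes the centroid to be the coordinate-wise mean. The names k_means, squared_distance, mean_point, max_iters and seed are made up for this sketch.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(items, k, max_iters=100, seed=0):
    """Return (assignment, centroids) for a list of numeric tuples.

    Assumption: similarity is measured by squared Euclidean distance.
    """
    rng = random.Random(seed)
    # Step 1: select K data points as the initial representatives.
    centroids = rng.sample(items, k)
    assignment = None
    for _ in range(max_iters):
        # Step 2: assign every item to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(x, centroids[j]))
            for x in items
        ]
        # Step 4: stop once no item changed cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recalculate each cluster centroid.
        for j in range(k):
            members = [x for x, c in zip(items, assignment) if c == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return assignment, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
assignment, centroids = k_means(points, k=2)
```

Here assignment[i] is the index of the cluster that points[i] ended up in. The result depends on the initial representatives, so a different seed can number the clusters differently.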

2) Example (Clustering Terms):

Step 1: Initial arbitrary assignment:
C1 = {T1, T2}, C2 = {T3, T4}, C3 = {T5, T6}

Step 2:
Doc - Document
T - Terms in Doc
C - Clusters



Step 3: Cluster-Term Similarity Matrix



Step 4: Recompute the similarities using the new cluster centroids and the original Document-Term Matrix



Step 5: The process repeats until no further changes are made to the clusters.
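
As a rough illustration of Steps 1 to 5, the sketch below clusters six terms starting from the same initial assignment. Everything else is assumed for illustration only: the document-term matrix is made up (it is not the data from this example), each term is represented by its column of that matrix, a cluster centroid is the mean of its members' columns, and similarity is a plain dot product. The names doc_term, term_vector, centroid, dot, reassign and cluster_terms are hypothetical.

```python
# Hypothetical document-term matrix: rows are Doc1..Doc4, columns are T1..T6.
doc_term = [
    [2, 1, 0, 2, 1, 0],
    [1, 2, 0, 2, 0, 0],
    [0, 0, 3, 0, 0, 1],
    [0, 1, 0, 0, 2, 3],
]
terms = ["T1", "T2", "T3", "T4", "T5", "T6"]

# Step 1: initial arbitrary assignment.
initial_clusters = {"C1": ["T1", "T2"], "C2": ["T3", "T4"], "C3": ["T5", "T6"]}


def term_vector(term):
    """Column of the document-term matrix that belongs to a term."""
    col = terms.index(term)
    return [row[col] for row in doc_term]


def centroid(members):
    """Step 2: mean of the member terms' vectors."""
    vectors = [term_vector(t) for t in members]
    return [sum(values) / len(vectors) for values in zip(*vectors)]


def dot(a, b):
    """Assumed similarity measure: dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def reassign(clusters):
    """Steps 3 and 4: build the cluster-term similarity matrix and
    move every term to its most similar cluster."""
    centroids = {name: centroid(m) for name, m in clusters.items() if m}
    similarity = {
        t: {name: dot(term_vector(t), c) for name, c in centroids.items()}
        for t in terms
    }
    new_clusters = {name: [] for name in clusters}
    for t in terms:
        best = max(similarity[t], key=similarity[t].get)
        new_clusters[best].append(t)
    return similarity, new_clusters


def cluster_terms(clusters, max_iters=20):
    """Step 5: repeat until the clusters stop changing."""
    for _ in range(max_iters):
        similarity, new_clusters = reassign(clusters)
        if new_clusters == clusters:
            break
        clusters = new_clusters
    return clusters, similarity


final_clusters, similarity = cluster_terms(initial_clusters)
```

With this made-up matrix, T4 moves from C2 to C1 on the first pass and the clusters are stable after that. A different similarity measure (for example cosine similarity) could produce a different grouping.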