Clustering of collinear data points in lower dimensions

Terence Johnson; Jervin Zen Lobo

Date

2012-12-31

Authors

Terence Johnson

Jervin Zen Lobo

Publisher

IOSR Journals

Abstract

The basic K-Means algorithm begins by randomly selecting K cluster centers, assigns each point to the cluster whose mean is closest in the Euclidean sense, computes the mean vector of the points assigned to each cluster, and uses these means as the new centers in the next iteration. This suggests that if we can identify points in the dataset that represent the final, unchanging means, clustering reduces to assigning each remaining point to the cluster whose final mean is closest to it under the Euclidean distance. Taking a cue from this result of the K-Means algorithm, this paper presents an approach to collinear clustering based on the idea that values in a dataset can be placed into different clusters according to which points in the dataset lie at maximum distance from each other. The clusters are formed by assigning every point in the dataset to the maximally separated data point from which its Euclidean distance is smallest.
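
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pick_far_apart_seeds` and `assign_to_nearest` are hypothetical, the sample points are made up for illustration, and the rule used to choose more than two seeds (greedy farthest-first selection) is an assumption, since the abstract does not say how the maximally separated points are found for K > 2.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def pick_far_apart_seeds(points, k):
    """Pick k mutually distant points (hypothetical helper, needs >= 2 points)."""
    # Start with the pair of points that lie farthest from each other.
    a, b = max(combinations(points, 2), key=lambda pair: dist(*pair))
    seeds = [a, b]
    # Assumption, not taken from the paper: for k > 2, repeatedly add the
    # point whose distance to its nearest chosen seed is largest.
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds[:k]


def assign_to_nearest(points, seeds):
    """Place each point in the cluster of the seed closest to it."""
    clusters = [[] for _ in seeds]
    for p in points:
        nearest = min(range(len(seeds)), key=lambda i: dist(p, seeds[i]))
        clusters[nearest].append(p)
    return clusters


# Illustrative collinear points on the line y = x, in two visible groups.
points = [(1, 1), (2, 2), (3, 3), (10, 10), (11, 11), (12, 12)]
seeds = pick_far_apart_seeds(points, k=2)
clusters = assign_to_nearest(points, seeds)
print(seeds)
print(clusters)
```

With these sample points the two seeds are the endpoints (1, 1) and (12, 12), and each remaining point joins the endpoint nearest to it. Unlike basic K-Means, this sketch makes a single assignment pass and never recomputes the centers.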