e-ISSN 2231-8526
ISSN 0128-7680
Paul Inuwa Dalatu and Habshah Midi
Pertanika Journal of Science & Technology, Volume 26, Issue 4, October 2018
Keywords: Clustering, estimators, K-Means, simulation, weighted
Published on: 24 Oct 2018
Clustering is one of the primary tools of data mining. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning: objects in the same cluster are similar, and objects that belong to different clusters differ significantly with respect to their attributes. However, the classical Standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in distance-based clustering, has been criticized by many scholars for being sensitive to outliers, lacking robustness, and having a 0% breakdown point. It also has low efficiency under the normal distribution. To remedy the problem, we suggest two statistical estimators with 50% breakdown points, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based clustering analysis.
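As an illustration of the idea described above, the following is a minimal sketch of how the Sn and Qn scale estimators might replace the standard deviation when scaling features in a Euclidean distance. It is not the authors' implementation. The function names (`sn_scale`, `qn_scale`, `scaled_euclidean`) are hypothetical, the consistency constants 1.1926 and 2.2219 are the usual Gaussian-consistency factors for Sn and Qn, and the sketch assumes plain medians (rather than low and high medians) with no finite-sample correction. The exact weighting scheme used in the paper is not given in the abstract, so dividing each feature difference by its scale estimate is an assumption.

```python
import numpy as np


def sn_scale(x, c=1.1926):
    """Naive O(n^2) Sn estimate: c * med_i( med_j |x_i - x_j| ).

    Simplification (assumption): plain medians, no finite-sample correction.
    """
    x = np.asarray(x, dtype=float)
    diffs = np.abs(x[:, None] - x[None, :])  # all pairwise absolute differences
    return c * np.median(np.median(diffs, axis=1))


def qn_scale(x, d=2.2219):
    """Naive O(n^2) Qn estimate: d * k-th smallest |x_i - x_j| for i < j,
    with h = n // 2 + 1 and k = h * (h - 1) / 2.

    Simplification (assumption): no finite-sample correction.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    h = n // 2 + 1
    k = h * (h - 1) // 2
    i, j = np.triu_indices(n, k=1)  # index pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    return d * diffs[k - 1]  # k is 1-based


def scaled_euclidean(a, b, scale):
    """Euclidean distance with each feature difference divided by its scale.

    Passing per-feature standard deviations gives the classical standardized
    distance; passing Sn or Qn values gives a robust variant (assumed form).
    """
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))


if __name__ == "__main__":
    feature = [1, 2, 3, 4, 100]  # one extreme value in the feature
    print("std:", np.std(feature, ddof=1))
    print("Sn :", sn_scale(feature))
    print("Qn :", qn_scale(feature))
```

In this toy feature, the single extreme value inflates the standard deviation, while the Sn and Qn estimates stay close to the spread of the bulk of the data. That is the behaviour a 50% breakdown point is meant to provide.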