Journal of Engineering and Applied Sciences

Year: 2019
Volume: 14
Issue: 17
Page No. 6317 - 6335

Real World Data Clustering using a Hybrid of Normalized Particle Swarm Optimization and Density–Sensitive Distance Measure

Authors: Temitayo Fagbola, Olugbara Oludayo and Surendra Thakur

Abstract: k–means is among the most widely used classical partitional clustering algorithms, mainly because of its quick convergence, its adaptability to sparse data and its simplicity of implementation. However, it only guarantees convergence of the sum–of–squares objective function to a local minimum, while convergence to the global optimum appears NP–hard for large, noisy and non–convex structures. This in turn increases its error margin. Most existing improvements on k–means adopt techniques that introduce additional challenges, including inaccurate clustering results, high space and time complexities and, sometimes, premature convergence. However, high accuracy on large datasets, robustness to noisy data, low clustering time and low sum–of–squared error are sought–after capabilities of good clustering algorithms. In this study, a hybrid Normalized Particle Swarm Optimized–Density Sensitive (NPSO–DS) k–means algorithm is developed to address these limitations of k–means. The proposed NPSO–DS k–means algorithm combines the global stability of a normalized Particle Swarm Optimization (PSO) technique, which incorporates a min–max technique and uses clustering error as its objective function, with the stable properties of a density–sensitive k–means to realize convergence of particles to the global optimum on large and noisy real–world datasets. Using clustering accuracy, sum–of–squared error and clustering time as performance metrics, experimental evaluation on the Educational Process Mining (EPM) and wine datasets indicates that the developed algorithm is capable of consistently yielding high–quality results. Furthermore, the developed NPSO–DS k–means algorithm could identify non–convex clustering structures and offers appreciable robustness to noisy data, thus generalizing the application areas of the baseline k–means algorithm.
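The abstract names three ingredients without giving their formulas: min–max normalization, a clustering–error objective and a particle swarm search over centroids. The Python sketch below illustrates one common reading of each, and it is not the authors' NPSO–DS implementation. It assumes the objective is the plain sum–of–squared Euclidean error and that PSO follows the standard global–best update. The function names and the parameters `w`, `c1`, `c2`, `n_particles` and `n_iters` are hypothetical choices made for illustration, and the density–sensitive distance measure is not reproduced because the abstract does not define it.

```python
import random


def min_max_normalize(rows):
    """Rescale every feature (column) to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )


def pso_centroids(points, k, n_particles=20, n_iters=100,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over k centroids, minimizing SSE.

    Assumes `points` is already min-max normalized, so positions are
    clamped to [0, 1]. All hyperparameter defaults are illustrative.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    size = k * dim

    def unflatten(x):
        # A particle is a flat list holding k centroids of `dim` coordinates each.
        return [x[i * dim:(i + 1) * dim] for i in range(k)]

    # Initialise each particle from k distinct data points, with zero velocity.
    particles = [[v for p in rng.sample(points, k) for v in p]
                 for _ in range(n_particles)]
    velocities = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in particles]
    pbest_cost = [sse(points, unflatten(p)) for p in particles]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - particles[i][d])
                                    + c2 * r2 * (gbest[d] - particles[i][d]))
                moved = particles[i][d] + velocities[i][d]
                particles[i][d] = min(1.0, max(0.0, moved))
            cost = sse(points, unflatten(particles[i]))
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = particles[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = particles[i][:], cost
    return unflatten(gbest), gbest_cost


if __name__ == "__main__":
    raw = [[1, 10], [2, 12], [1.5, 11], [8, 50], [9, 52], [8.5, 51]]
    data = min_max_normalize(raw)
    centroids, cost = pso_centroids(data, k=2)
    print(centroids, cost)
```

Normalizing first keeps features with large ranges from dominating the squared error, and it gives the swarm a bounded search space. In a hybrid of the kind the abstract describes, the best centroids found by the swarm would typically seed a k–means refinement step.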

How to cite this article:

Temitayo Fagbola, Olugbara Oludayo and Surendra Thakur, 2019. Real World Data Clustering using a Hybrid of Normalized Particle Swarm Optimization and Density–Sensitive Distance Measure. Journal of Engineering and Applied Sciences, 14: 6317-6335.
