Ariyaratne, M., & Fernando, T. G. I. (2022). A Comprehensive Review of the Firefly Algorithms for Data Clustering. In A. Biswas, C. B. Kalayci, & S. Mirjalili (Eds.), Advances in Swarm Intelligence: Variations and Adaptations for Optimization Problems (pp. 217–239). Springer International Publishing. https://doi.org/10.1007/978-3-031-09835-2_12

Abstract:

Separating a given data set into groups (clusters) based on naturally shared characteristics is one of the main concerns in data clustering. A cluster can be defined as a collection of objects that are “homogeneous” among themselves and “heterogeneous” with respect to the objects belonging to other clusters. Clustering-based applications arise in many areas, including medicine, image processing, engineering, economics, social sciences, biology, machine learning, and data mining. Clustering falls under unsupervised learning, where no labels are given to the learning algorithm, leaving it on its own to find structure in its input. Although many classical clustering algorithms exist, most of them suffer from severe drawbacks such as sensitivity to the initial cluster centroids, and hence they can easily be trapped in locally optimal solutions. The other main problem with data clustering algorithms is that they cannot be standardized. Clustering can also be treated as an optimization problem, one that falls into the category of NP-hard optimization, which makes it more difficult to solve. Meta-heuristics play a remarkable role in addressing such NP-hard problems. Since its introduction more than a decade ago, the Firefly Algorithm (FA), a stochastic, nature-inspired meta-heuristic, has shown significant performance on many optimization problems. Hence, FA has been used in research addressing the problem of clustering optimization. This chapter examines the ability of the firefly algorithm to solve the data clustering problem. It presents an introduction to clustering and the performance of FA, and briefly reviews and summarizes some of the recent firefly-based algorithms used for data clustering, with emphasis on how FA has been combined or hybridized with other methods to contribute to the problem of data clustering.
Further, it discusses the different representations, initializations, and cluster validation criteria used in FA-based clustering methods. The chapter also discusses why FA is considered more useful for clustering than other methods and which features make it more suitable for handling the clustering problem than other meta-heuristics. Finally, it focuses on the limitations reported in the literature on FA-based clustering applications and discusses possible avenues for future work.