BiModalClust: Fused Data and Neighborhood Variation for Advanced K-Means Big Data Clustering


Mussabayev R., Mussabayev R.
February 2025
Multidisciplinary Digital Publishing Institute (MDPI)

Applied Sciences (Switzerland)
2025, Volume 15, Issue 3

K-means clustering is a fundamental tool in data mining, yet its scalability and efficacy decline when faced with massive datasets. In this work, we introduce BiModalClust, a novel clustering algorithm that leverages a bimodal optimization paradigm to overcome these challenges. Our approach simultaneously optimizes two interdependent modalities: the input data stream and the neighborhood structure of the solution landscape, which emerges from iterative restrictions of the Minimum Sum-of-Squares Clustering (MSSC) objective function to sampled subsets of the data. By integrating the Variable Neighborhood Search (VNS) metaheuristic, we systematically explore and refine these landscapes through dynamic reinitialization of degenerate centroids and adaptive exploration of expanding neighborhoods. This dual-stream optimization not only transforms traditional local search into a more global and robust process but also ensures computational scalability and precision. Extensive experimentation on diverse real-world datasets demonstrates that BiModalClust achieves superior clustering performance among K-means-based methods in big data environments.
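The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the shaking rule, the acceptance test, or the neighborhood schedule, so all of those are assumptions here. In particular, the sketch assumes that the k-th neighborhood means "re-seed p centroids", that re-seeding draws uniformly from the current sample, and that a candidate is accepted when it lowers the MSSC objective (the sum of squared distances from each point to its nearest centroid) on the current sample. The names `bimodal_sketch`, `sample_size`, `p_max`, and `n_rounds` are hypothetical.

```python
import numpy as np


def assign(X, C):
    """Return the nearest-centroid label and squared distance for each row of X."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return labels, d2[np.arange(len(X)), labels]


def mssc(X, C):
    """MSSC objective: sum of squared distances to the nearest centroid."""
    return assign(X, C)[1].sum()


def kmeans(X, C, n_iter=20):
    """Plain Lloyd iterations; an empty (degenerate) cluster keeps its old centroid."""
    C = C.copy()
    for _ in range(n_iter):
        labels, _ = assign(X, C)
        for j in range(len(C)):
            members = X[labels == j]
            if len(members):
                C[j] = members.mean(axis=0)
    return C


def bimodal_sketch(X, k, sample_size, p_max=3, n_rounds=50, seed=0):
    """Assumed VNS-style loop: vary the data sample and the neighborhood size p."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    p_max = min(p_max, k)
    p = 1
    for _ in range(n_rounds):
        # Modality 1: a fresh sample restricts the MSSC objective to a subset.
        S = X[rng.choice(len(X), size=sample_size, replace=False)]
        labels, _ = assign(S, C)
        counts = np.bincount(labels, minlength=k)
        # Modality 2 (shaking): re-seed degenerate centroids plus p random ones.
        shaken = set(np.flatnonzero(counts == 0).tolist())
        shaken |= set(rng.choice(k, size=p, replace=False).tolist())
        cand = C.copy()
        for j in shaken:
            cand[j] = S[rng.integers(len(S))]
        # Local search on the sample, starting from the shaken centroids.
        cand = kmeans(S, cand)
        if mssc(S, cand) < mssc(S, C):
            C, p = cand, 1          # improvement: return to the smallest neighborhood
        else:
            p = p % p_max + 1       # no improvement: expand, wrapping back to 1
    return C
```

Because acceptance is judged on a sample rather than on the full dataset, each round costs time proportional to `sample_size` rather than to the number of points, which is the property the abstract points to for scalability. The full-data objective is not guaranteed to decrease at every round in this sketch.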

big data, BiModalClust algorithm, clustering, data streaming, decomposition, global optimization, high-performance computing, K-means, K-means++, large-scale datasets, minimum sum-of-squares, multi-start local search, unsupervised learning, variable neighborhood search, VNS


AI Research Lab, Department of Software Engineering, Satbayev University, Satbayev Str. 22, Almaty, 050013, Kazakhstan
Laboratory for Analysis and Modeling of Information Processes, Institute of Information and Computational Technologies, Pushkin Str. 125, Almaty, 050010, Kazakhstan
