BiModalClust: Fused Data and Neighborhood Variation for Advanced K-Means Big Data Clustering
Mussabayev R., Mussabayev R.
February 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Applied Sciences (Switzerland)
2025, Volume 15, Issue 3
K-means clustering is a fundamental tool in data mining, yet its scalability and efficacy decline when faced with massive datasets. In this work, we introduce BiModalClust, a novel clustering algorithm that leverages a bimodal optimization paradigm to overcome these challenges. Our approach simultaneously optimizes two interdependent modalities: the input data stream and the neighborhood structure of the solution landscape, which emerges from iterative restrictions of the Minimum Sum-of-Squares Clustering (MSSC) objective function to sampled subsets of the data. By integrating the Variable Neighborhood Search (VNS) metaheuristic, we systematically explore and refine these landscapes through dynamic reinitialization of degenerate centroids and adaptive exploration of expanding neighborhoods. This dual-stream optimization not only transforms traditional local search into a more global and robust process but also ensures computational scalability and precision. Extensive experimentation on diverse real-world datasets demonstrates that BiModalClust achieves superior clustering performance among K-means-based methods in big data environments.
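The interplay described above (resampling the data, restricting the MSSC objective to each sample, and shaking centroids in neighborhoods of growing size) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sq_dists`, `mssc`, `kmeans_local`, `sample_vns_kmeans`), the parameters (`sample_size`, `p_max`, `rounds`), the uniform sampling, and the acceptance rule are all assumptions chosen for illustration.

```python
import numpy as np

def sq_dists(X, C):
    """Squared Euclidean distances between every point in X and every centroid in C."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def mssc(X, C):
    """MSSC objective: sum of squared distances from each point to its nearest centroid."""
    return sq_dists(X, C).min(axis=1).sum()

def kmeans_local(S, C, iters=20):
    """Plain Lloyd iterations on sample S starting from C (the local search step)."""
    C = np.array(C, dtype=float)
    for _ in range(iters):
        labels = sq_dists(S, C).argmin(axis=1)
        for j in range(len(C)):
            members = S[labels == j]
            if len(members):  # empty clusters are left in place here
                C[j] = members.mean(axis=0)
    return C

def sample_vns_kmeans(X, k, sample_size=256, p_max=3, rounds=30, seed=0):
    """Hypothetical VNS-style loop over random samples of X (illustrative only)."""
    rng = np.random.default_rng(seed)
    best = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    p = 1  # neighborhood size: how many centroids the shake reinitializes
    for _ in range(rounds):
        # Data modality: draw a fresh sample, which defines a restricted objective.
        S = X[rng.choice(len(X), size=min(sample_size, len(X)), replace=False)]
        incumbent_cost = mssc(S, best)

        # Neighborhood modality: reinitialize degenerate (empty) centroids
        # plus p randomly chosen ones at random sample points.
        labels = sq_dists(S, best).argmin(axis=1)
        empty = np.flatnonzero(np.bincount(labels, minlength=k) == 0)
        shaken = rng.choice(k, size=min(p, k), replace=False)
        idx = np.union1d(empty, shaken)
        cand = best.copy()
        cand[idx] = S[rng.choice(len(S), size=len(idx), replace=False)]

        # Local search on the sample, then accept only if the sample objective improves.
        cand = kmeans_local(S, cand)
        if mssc(S, cand) < incumbent_cost:
            best, p = cand, 1      # success: return to the smallest neighborhood
        else:
            p = p % p_max + 1      # failure: widen the neighborhood, wrapping around
    return best
```

Calling `sample_vns_kmeans(X, k=3)` on an `(n, d)` NumPy array returns a `(3, d)` array of centroids. Each round touches only `sample_size` points, so the per-round cost in this sketch does not grow with the size of the full dataset.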
big data, BiModalClust algorithm, clustering, data streaming, decomposition, global optimization, high-performance computing, K-means, K-means++, large-scale datasets, minimum sum-of-squares, multi-start local search, unsupervised learning, variable neighborhood search, VNS
AI Research Lab, Department of Software Engineering, Satbayev University, Satbayev Str. 22, Almaty, 050013, Kazakhstan
Laboratory for Analysis and Modeling of Information Processes, Institute of Information and Computational Technologies, Pushkin Str. 125, Almaty, 050010, Kazakhstan