Comparative Analysis of Audio Features for Unsupervised Speaker Change Detection
Toleu A., Tolegen G., Mussabayev R., Krassovitskiy A., Zhumazhanov B.
December 2024, Multidisciplinary Digital Publishing Institute (MDPI)
Applied Sciences (Switzerland)
2024, Volume 14, Issue 24
This study examines how ten audio features, including MFCC, mel-spectrogram, chroma, and spectral contrast, influence speaker change detection (SCD) performance. The analysis uses two unsupervised methods: Bayesian information criterion with Gaussian mixture model (BIC-GMM), a model-based approach, and Kullback-Leibler divergence with Gaussian mixture model (KL-GMM), a metric-based approach. The evaluation combines a statistical analysis of how feature changes relate to speaker changes, and vice versa, with comprehensive experimental validation. Experimental results show MFCC to be the most effective feature, performing consistently well with both methods. Zero crossing rate, chroma, and spectral contrast were also notably effective within the BIC-GMM framework, while mel-spectrogram consistently ranked as the least influential feature in both approaches. Further analysis revealed that BIC-GMM is more stable under variations in feature performance, whereas KL-GMM is more sensitive to threshold optimization. Nevertheless, KL-GMM achieved competitive results when paired with specific features, such as MFCC and zero crossing rate. These findings clarify how feature selection affects unsupervised SCD and offer guidance for developing more robust and accurate algorithms for practical applications.
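The two families of detectors named above can be illustrated with a minimal sketch. This is not the authors' implementation: as a simplifying assumption it fits a single Gaussian to each window instead of a Gaussian mixture model, and the function names `delta_bic` and `symmetric_kl`, the `penalty_weight` parameter, and the small variance floor are hypothetical choices made for illustration. Each function scores two adjacent windows of feature frames (for example MFCC vectors); a larger score suggests a speaker change between them.

```python
import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Model-based score: Delta-BIC between two windows of shape (frames, dims).

    Assumption: one full-covariance Gaussian per window (not a GMM).
    A positive value favours modelling the windows separately, i.e. a change.
    """
    z = np.vstack([x, y])
    n1, n2, n = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet(a):
        # Sample covariance with a small diagonal floor for numerical stability.
        cov = np.atleast_2d(np.cov(a, rowvar=False)) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Extra parameters of the two-model hypothesis: one mean and one covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    gain = 0.5 * (n * logdet(z) - n1 * logdet(x) - n2 * logdet(y))
    return gain - penalty


def symmetric_kl(x, y):
    """Metric-based score: symmetric KL divergence between two windows.

    Assumption: one diagonal-covariance Gaussian per window (not a GMM),
    which gives a closed-form divergence.
    """
    m1, m2 = x.mean(axis=0), y.mean(axis=0)
    v1, v2 = x.var(axis=0) + 1e-6, y.var(axis=0) + 1e-6
    diff2 = (m1 - m2) ** 2
    return 0.5 * np.sum(v1 / v2 + v2 / v1 - 2.0 + diff2 * (1.0 / v1 + 1.0 / v2))


if __name__ == "__main__":
    # Synthetic stand-in for feature frames: two "speakers" with shifted means.
    rng = np.random.default_rng(0)
    speaker_a = rng.normal(0.0, 1.0, size=(200, 4))
    speaker_b = rng.normal(3.0, 1.0, size=(200, 4))
    print("Delta-BIC:", delta_bic(speaker_a, speaker_b))
    print("Symmetric KL:", symmetric_kl(speaker_a, speaker_b))
```

In this sketch the Delta-BIC score carries its own decision boundary at zero, while the KL score must be compared against a tuned threshold, which is consistent with the abstract's remark that the metric-based approach is more sensitive to threshold optimization.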
audio features, speaker change detection, unsupervised approach
Laboratory of Analysis and Modelling of Informational Processes, Institute of Information and Computational Technologies, Almaty, 050010, Kazakhstan
AI Research Lab, Satbayev University, Almaty, 050040, Kazakhstan