The Automated Method of Collecting and Labeling Data for Speech Emotion Recognition based on Face Emotion Recognition
Shoiynbek A., Kuanyshbay D., Menezes P., Assunção G., Meraliyev B., Mukhametzhanov A., Shoiynbek T., Sklyar S.
September 2025, Natural Sciences Publishing
Applied Mathematics and Information Sciences
2025, Volume 19, Issue 5, pp. 1067-1077
Speech Emotion Recognition (SER) is vital for enabling natural and effective human-machine interactions, yet its advancement is constrained by the scarcity of richly annotated emotional speech corpora, the laborious nature of manual labeling, and the difficulty of eliciting genuine expressions. We propose an automated data-collection and labeling pipeline that synchronizes video-based facial emotion recognition (FER) with audio capture to annotate speech recordings according to speakers’ natural facial expressions. Applying this method, we processed 1 243 YouTube videos (1 058 hours of raw footage) and extracted 218 359 candidate utterances, which, after FER-guided filtering, yielded a high-quality corpus of 45 459 recordings (33 h 15 min of audio) across seven basic emotions in Kazakh (15 076 utterances) and Russian (30 383 utterances). We trained a deep neural network on the combined dataset and achieved 86.84% overall test accuracy for seven-way emotion classification, with per-language accuracies of 89.00% (Kazakh) and 85.20% (Russian); a support vector machine reached 82.47% under the same conditions. By reducing manual annotation effort by over 80% while maintaining consistent labels, our approach delivers a scalable, language-agnostic solution for generating authentic emotional speech datasets, paving the way for more robust, real-world SER systems.
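The FER-guided labeling step described above can be illustrated with a minimal sketch. The abstract does not state the filtering rule, so everything here is an assumption made for illustration: the per-frame prediction format, the majority-vote rule, the `min_confidence` and `min_agreement` thresholds, and all names (`FramePrediction`, `Utterance`, `label_utterance`) are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FramePrediction:
    """One FER output for a video frame (hypothetical format)."""
    time_s: float      # frame timestamp in seconds
    emotion: str       # predicted facial emotion label
    confidence: float  # classifier confidence in [0, 1]


@dataclass
class Utterance:
    """A candidate speech segment cut from the audio track."""
    start_s: float
    end_s: float


def label_utterance(
    utt: Utterance,
    frames: List[FramePrediction],
    min_confidence: float = 0.6,  # assumed threshold, not from the paper
    min_agreement: float = 0.8,   # assumed threshold, not from the paper
) -> Optional[str]:
    """Return the facial emotion that dominates the utterance, or None.

    Frames are matched to the utterance by timestamp; low-confidence
    frames are ignored, and the utterance is rejected unless one emotion
    accounts for at least `min_agreement` of the remaining frames.
    """
    in_span = [
        f for f in frames
        if utt.start_s <= f.time_s < utt.end_s
        and f.confidence >= min_confidence
    ]
    if not in_span:
        return None  # no usable face evidence: discard the candidate
    counts = Counter(f.emotion for f in in_span)
    emotion, votes = counts.most_common(1)[0]
    if votes / len(in_span) < min_agreement:
        return None  # facial expression too inconsistent to trust
    return emotion


# Toy data: three confident "happy" frames, then mixed evidence.
frames = [
    FramePrediction(0.2, "happy", 0.90),
    FramePrediction(0.6, "happy", 0.80),
    FramePrediction(1.0, "happy", 0.95),
    FramePrediction(1.4, "neutral", 0.40),  # dropped: low confidence
    FramePrediction(2.2, "sad", 0.90),
    FramePrediction(2.6, "angry", 0.90),
]
consistent = label_utterance(Utterance(0.0, 2.0), frames)
mixed = label_utterance(Utterance(2.0, 3.0), frames)
no_face = label_utterance(Utterance(5.0, 6.0), frames)
```

Under these assumed thresholds the first utterance is labeled `happy`, while the other two are discarded. Rejecting ambiguous candidates rather than guessing trades corpus size for label consistency, which matches the direction of the reported numbers: 45 459 of 218 359 candidate utterances (roughly 21%) were kept after filtering.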
Face emotion recognition, labeling, machine learning, speech emotion recognition
School of Digital Technology, Narxoz University, Almaty, 050035, Kazakhstan
Faculty of Engineering and Natural Sciences, SDU University, Kaskelen, 040900, Kazakhstan
Institute of Systems and Robotics, University of Coimbra, Coimbra, 3030-788, Portugal
General and Applied Psychology Department, Faculty of Philosophy and Political Science, Al-Farabi KazNU, Almaty, 050040, Kazakhstan