A Kazakh language Dataset of Lip Movements for Command Recognition
Kenzheakhmetov B., Amankos A., Amirgaliyev B., Zhanibekova Z., Zhalgas A., Yedilkhan D.
December 2025, Nature Research
Scientific Data
2025, #12, Issue 1
Lip reading systems infer the content of speech by visually tracking the speaker's lips, and so offer an alternative communication channel when acoustic information is unavailable. Training robust lip reading models requires specialised corpora that capture both linguistic variability and visual articulatory variability. In this work, we introduce a new visual corpus specific to Kazakh lip reading. The corpus covers 102 nouns that occur most commonly in the Kazakh language and have specific articulatory patterns, recorded by 26 participants spanning a wide age range. The resulting collection comprises about 34,000 short video clips, corresponding to about 1.2 million individual frames. The dataset is extensively annotated and is therefore a useful resource for improving Kazakh lip reading technologies. It also opens prospects for further research in this area and in multimodal speech recognition.
School of Artificial Intelligence and Data Science, Astana IT University, Astana, 010001, Kazakhstan
School of Software Engineering, Astana IT University, Astana, 010001, Kazakhstan
Research and Innovation Center “Smart City”, Astana IT University, Astana, 010001, Kazakhstan