Creating a Parallel Corpus for the Kazakh Sign Language and Learning
Yerimbetova A., Sakenov B., Sambetbayeva M., Daiyrbayeva E., Berzhanova U., Othman M.
March 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Applied Sciences (Switzerland)
2025, Volume 15, Issue 5
Kazakh Sign Language (KSL) is a crucial communication tool for individuals with hearing and speech impairments. Deep learning, particularly Transformer models, offers a promising approach to improving accessibility in education and communication. This study analyzes the syntactic structure of KSL, identifying its unique grammatical features and deviations from spoken Kazakh. A custom parser was developed to convert Kazakh text into KSL glosses, enabling the creation of a large-scale parallel corpus. Using this resource, a Transformer-based machine translation model was trained, achieving high translation accuracy and demonstrating the feasibility of this approach for enhancing communication accessibility. The research highlights key challenges in sign language processing, such as the limited availability of annotated data. Future work directions include the integration of video data and the adoption of more comprehensive evaluation metrics. This paper presents a methodology for constructing a parallel corpus through gloss annotations, contributing to advancements in sign language translation technology.
deep learning, Kazakh sign language, machine translation, parallel corpus, sequence to sequence model, sign language
Laboratory of Computer Engineering of Intelligent Systems, Institute of Information and Computational Technologies, Almaty, 050010, Kazakhstan
Global Education and Training, University of Illinois Urbana-Champaign, Champaign, 61820, IL, United States
Department of Information Systems, L.N. Gumilyov Eurasian National University, Astana, 010008, Kazakhstan
Department of Software Engineering, Satbayev University, Almaty, 050010, Kazakhstan
Faculty of Information Technology, Al-Farabi Kazakh National University, Almaty, 050038, Kazakhstan
Department of Communication Technology and Network, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang 43400, Malaysia