Development of Visual Data Processing Algorithm for the Sign Language Recognition System Based on CBFT-BiLSTM
Zholshiyeva L., Zhukabayeva T., Berdiyeva M.
2025, Institute of Electrical and Electronics Engineers Inc.
International Conference on Computer Science and Engineering, UBMK
2025, Issue 2025, pp. 987-991
This paper presents the development of a visual data processing algorithm for a real-time Kazakh Sign Language (KazSL) recognition system using the CBFT-BiLSTM model. The system leverages multimodal vision-language models, specifically CLIP and BERT, to generate semantic embeddings for both video and text. A projection module is proposed to align these embeddings in a shared semantic space, enabling accurate matching by cosine similarity. A fine-tuned BiLSTM classifier then recognizes the sentence class. The model is evaluated on two KazSL datasets (KRSL20 and QazSL) and achieves high accuracy (94.98% and 76%, respectively) with low confusion rates between similar signs. The system operates effectively in real time, with an average latency of less than 100 milliseconds per sentence. The proposed method demonstrates the potential of adapting vision-language models to low-resource languages.
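The matching step described in the abstract (project both embeddings into a shared space, then compare them by cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the CLIP and BERT encoders are omitted, the projection is assumed to be a plain linear map, and the function names, weight matrices, and toy vectors are all hypothetical.

```python
import math


def project(vec, weights):
    """Apply a linear projection: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(video_vec, text_vecs, w_video, w_text):
    """Return (index, score) of the text embedding closest to the video embedding."""
    v = project(video_vec, w_video)
    scores = [cosine_similarity(v, project(t, w_text)) for t in text_vecs]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


# Toy data (assumed): a 3-d "video" embedding and 2-d "text" embeddings,
# both mapped into a shared 2-d space by hypothetical projection weights.
W_VIDEO = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_TEXT = [[1.0, 0.0], [0.0, 1.0]]
video = [0.9, 0.1, 0.5]
texts = [[0.0, 1.0], [1.0, 0.0]]

index, score = best_match(video, texts, W_VIDEO, W_TEXT)
print(index, round(score, 3))
```

In a real system the projection weights would be learned so that matching video and sentence pairs score higher than mismatched ones; here they are fixed only to keep the example self-contained.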
BERT, CLIP, LSTM, sign language recognition, vision-language model
Astana International University, International Science Complex Astana, Astana, Kazakhstan
L.N. Gumilyov Eurasian National University, Department of Information Systems, Astana, Kazakhstan
South Kazakhstan Medical Academy, Department of Medical Biophysics and Information Technology, Shymkent, Kazakhstan