Kazakh Text Classification using Deep Learning Approaches
Mukazhanov N., Batura T., Yerimbetova A., Turdalyuly M., Sakenov B., Bayekeyeva A.
2025, Institute of Electrical and Electronics Engineers Inc.
International Conference on Computer Science and Engineering, UBMK
2025, Issue 2025, pp. 495-500
This study focuses on the classification of Kazakh text data using deep learning models. Text classification is a foundational component of natural language processing (NLP) and frequently serves as a preliminary step in other text-processing applications. The research addresses sentiment analysis, one of the primary text classification tasks, and proposes deep learning models for automatically classifying texts into positive and negative opinions. Since Kazakh is a low-resource language, it is necessary to create specialised datasets, preprocess the text, and carry out annotation. The study used the KazSAnDRA dataset together with a dataset prepared by the authors. A range of deep learning models, including CNN, LSTM, GRU, and DistilBERT, were evaluated experimentally. Among the models examined, DistilBERT and LSTM achieved the highest accuracy and were identified as promising approaches for the automatic classification of Kazakh texts. The findings may provide a theoretical and practical foundation for future research on the analysis of texts in the Kazakh language.
deep learning, Kazakh language, NLP, sentiment analysis, text classification, transformer models
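The abstract mentions text preprocessing before classification but does not describe the steps. The following is a minimal Python sketch of one plausible pipeline for preparing Kazakh text as input to sequence models such as CNN, LSTM, or GRU. Everything here is an assumption for illustration and not the authors' method: the function names `preprocess`, `build_vocab`, and `encode`, the `<pad>`/`<unk>` tokens, and the toy sentence are all hypothetical.

```python
import re
import unicodedata
from collections import Counter

# Runs of Unicode letters only; this keeps Kazakh Cyrillic letters
# (e.g. ә, ғ, қ, ң, ө, ұ, ү, һ, і) and drops digits and punctuation.
_WORD = re.compile(r"[^\W\d_]+")

PAD_ID, UNK_ID = 0, 1  # hypothetical reserved ids for padding / unknown words


def preprocess(text):
    """Normalise Unicode, lowercase, and split into word tokens."""
    text = unicodedata.normalize("NFC", text).lower()
    return _WORD.findall(text)


def build_vocab(token_lists, min_count=1):
    """Map each sufficiently frequent token to an integer id."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tok, count in counts.most_common():
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Toy example (not taken from KazSAnDRA): "This film is very good!"
tokens = preprocess("Бұл фильм өте ЖАҚСЫ!")
vocab = build_vocab([tokens])
ids = encode(tokens, vocab, max_len=6)
```

The fixed-length id sequences produced by `encode` are the kind of input an embedding layer would typically consume. A transformer model such as DistilBERT would normally use its own subword tokeniser instead of a word-level vocabulary like this one.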
Institute of Information and Computational Technologies of the Committee of Science, Ministry of Science and Higher Education of the Republic of Kazakhstan, Almaty, Kazakhstan
A.P. Ershov Institute of Informatics Systems, Russian Academy of Sciences, Siberian Branch, Novosibirsk, Russian Federation
Maqsut Narikbayev University, Institute of Information and Computational Technologies of the Committee of Science, Ministry of Science and Higher Education of the Republic of Kazakhstan, Astana, Kazakhstan