Hate Speech Detection in the Kazakh Language Utilizing an Attention-Based BERT Model
Suieuova N., Adali E., Tynyshbek K., Shynzhigit B., Toktarova A., Beissenova G.
2025, Institute of Electrical and Electronics Engineers Inc.
International Conference on Computer Science and Engineering, UBMK
2025, Issue 2025, pp. 416-421
This article examines methods for identifying and mitigating offensive and abusive language in online content on social networks. The study addresses a critical problem: the identification of violent and harmful content on Kazakh social media platforms. Kazakh is currently regarded as a low-resource language, and few automated systems process its texts. The study introduces a novel hybrid model that applies contemporary machine learning and deep learning techniques tailored to Kazakh texts. The proposed model uses BERT transformer embeddings and combines multi-head self-attention with convolutional neural network layers. This approach enables the detection of both explicit and implicit hate speech. The stages of natural language processing (NLP), including pre-processing, lemmatization, and tokenization of textual data, are examined, and a specialized Kazakh dataset is created. Volunteer annotators labeled the data by class. The model's results showed higher accuracy and stability than conventional approaches (e.g., Naive Bayes, LSTM, FastText). The proposed architecture was particularly effective at identifying texts that convey hate implicitly and carry strong emotional content. The study's findings are expected to improve digital security for Kazakh-language content and to foster ethical conversation among social media users. In future work, the model is intended to run in real time and to be extended with multilingual data.
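The pipeline named in the abstract (token embeddings, then multi-head self-attention, then convolutional layers, then a classifier) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the random matrix stands in for BERT output embeddings, and every dimension, the filter width, the two-class output, and all function names are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x is seq_len x d_model; returns a matrix of the same shape."""
    d_head = len(x[0]) // num_heads
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    concat = [[] for _ in x]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        for i, qi in enumerate(qh):
            # Scaled dot-product scores of token i against every token.
            scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_head) for kj in kh]
            weights = softmax(scores)
            context = [sum(w * vj[c] for w, vj in zip(weights, vh)) for c in range(d_head)]
            concat[i].extend(context)  # concatenate the heads per token
    return matmul(concat, wo)  # output projection

def conv1d_max_pool(x, filters, width):
    """Slide each filter over the token axis, apply ReLU, max-pool over positions."""
    pooled = []
    for f in filters:  # each filter is width x d_model
        activations = []
        for start in range(len(x) - width + 1):
            window = x[start:start + width]
            s = sum(w * v for frow, xrow in zip(f, window) for w, v in zip(frow, xrow))
            activations.append(max(0.0, s))
        pooled.append(max(activations))
    return pooled

def classify(embeddings, p):
    """Attention -> CNN features -> linear layer -> class probabilities."""
    attended = multi_head_self_attention(
        embeddings, p["wq"], p["wk"], p["wv"], p["wo"], p["num_heads"])
    features = conv1d_max_pool(attended, p["filters"], p["width"])
    logits = [sum(w * f for w, f in zip(row, features)) for row in p["w_out"]]
    return softmax(logits)

# Assumed toy sizes; real BERT embeddings are far larger (e.g. 768-dimensional).
SEQ_LEN, D_MODEL, NUM_HEADS, NUM_FILTERS, WIDTH, NUM_CLASSES = 6, 8, 2, 4, 3, 2
rng = random.Random(0)
embeddings = random_matrix(SEQ_LEN, D_MODEL, rng)  # stand-in for BERT output
params = {
    "wq": random_matrix(D_MODEL, D_MODEL, rng),
    "wk": random_matrix(D_MODEL, D_MODEL, rng),
    "wv": random_matrix(D_MODEL, D_MODEL, rng),
    "wo": random_matrix(D_MODEL, D_MODEL, rng),
    "num_heads": NUM_HEADS,
    "filters": [random_matrix(WIDTH, D_MODEL, rng) for _ in range(NUM_FILTERS)],
    "width": WIDTH,
    "w_out": random_matrix(NUM_CLASSES, NUM_FILTERS, rng),
}
probs = classify(embeddings, params)
print(probs)  # untrained weights, so the probabilities carry no meaning
```

With untrained weights the output only demonstrates the data flow and tensor shapes. In practice the same structure would be built in a deep learning framework and trained on labeled data.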
attention mechanism, BERT, detection, hate speech, online content
Yessenov University, Department of Information Communication Technologies, Aktau, Kazakhstan
Istanbul Technical University, Faculty of Computer and Informatics, Istanbul, Turkey
M.Auezov South-Kazakhstan Univ., Department of Computer Science, Shymkent, Kazakhstan
Central Asian Innovation University, Department of Information Communication Technologies, Shymkent, Kazakhstan
M.Auezov South-Kazakhstan University, Department of Information Communication Technologies, Shymkent, Kazakhstan
M.Auezov South-Kazakhstan University, Department of Computer Science, Shymkent, Kazakhstan