Extremist Ideology Classification in Kazakh: A Multi-Class Approach Using Machine Learning and Psycholinguistic Analysis
Mussiraliyeva S., Bolatbek M., Baisylbayeva K.
2025, Institute of Electrical and Electronics Engineers Inc.
IEEE Access
2025, Vol. 13, pp. 140500-140518
This paper presents a new approach for analyzing extremist content in the Kazakh language on social media using advanced machine learning and natural language processing techniques. With the rapid growth of online data, especially on social networks, there is an urgent need for tools that can identify and classify extremist ideologies. Our study focuses on four primary categories: propaganda, recruitment, radicalization, and neutral content. We employ a hybrid methodology that combines traditional text vectorization techniques, machine learning algorithms, and a psycholinguistic analysis module (PLAM) specifically adapted for the Kazakh language. To improve the accuracy of classification and capture subtle emotional signals, we integrate psycholinguistic features extracted through PLAM. The experimental results demonstrate the effectiveness of our hybrid approach. The combination of CountVectorizer + Logistic Regression + PLAM achieved the highest performance among traditional models (F1-score: 0.9305, Accuracy: 0.9308, ROC AUC: 0.9892). Among deep learning models, the BERT + LSTM model yielded the best results (F1-score: 0.9481, Accuracy: 0.9485, ROC AUC: 0.9918), followed by the standalone BERT model (F1-score: 0.9412, Accuracy: 0.9414, ROC AUC: 0.9901). These findings confirm that combining contextual embeddings with sequential modeling improves classification performance, particularly for ideologically complex categories. This research provides an effective framework for multilingual text analysis. It also contributes to improved monitoring and prevention of extremist content in underrepresented languages, such as Kazakh. Future work will focus on refining these methods and exploring their application in other domains for robust content moderation and security in the digital space.
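The hybrid idea described in the abstract (bag-of-words counts concatenated with psycholinguistic features, then fed to Logistic Regression) can be illustrated with a minimal sketch. This is not the authors' implementation. The function `toy_psycholinguistic_features` is a hypothetical stand-in for PLAM, whose actual features are not given here, and the texts are synthetic placeholder tokens rather than real data. It assumes scikit-learn, SciPy, and NumPy are available.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression


def toy_psycholinguistic_features(docs):
    """Hypothetical stand-in for PLAM: three crude surface cues per document.

    The real PLAM features are not specified in this record; these are
    placeholders chosen only to show how extra features are concatenated.
    """
    rows = []
    for doc in docs:
        tokens = doc.split()
        n = max(len(tokens), 1)
        rows.append([
            doc.count("!") / n,                        # exclamation density
            sum(tok.isupper() for tok in tokens) / n,  # share of all-caps tokens
            sum(len(tok) for tok in tokens) / n,       # mean token length
        ])
    return csr_matrix(np.array(rows, dtype=float))


# Synthetic placeholder corpus: two documents per class, four classes.
texts = [
    "propa propb propc !", "propa propd PROPB",
    "recra recrb recrc", "recra recrd recrb !",
    "radia radib radic", "radia RADID radib",
    "neuta neutb neutc", "neuta neutd neutb",
]
labels = [
    "propaganda", "propaganda",
    "recruitment", "recruitment",
    "radicalization", "radicalization",
    "neutral", "neutral",
]

# Bag-of-words counts (the CountVectorizer part of the pipeline).
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(texts)

# Concatenate count features with the stand-in psycholinguistic features.
X = hstack([X_counts, toy_psycholinguistic_features(texts)]).tocsr()

# Multi-class Logistic Regression over the combined feature matrix.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)

# Class probabilities for each document (one column per class).
proba = clf.predict_proba(X)
print(clf.classes_)
print(proba.round(3))
```

On real data, the combined matrix would be built the same way for held-out documents (using `vectorizer.transform`), and metrics such as F1 and ROC AUC would be computed on that held-out split rather than on the training texts.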
counter-extremism, ideology of extremist messages, Kazakh language, machine learning, NLP, social network analysis, text classification
Al-Farabi Kazakh National University (KazNU), Department of Cybersecurity and Cryptology, Almaty, 050040, Kazakhstan