Applying machine learning techniques for religious extremism detection on online user contents

Mussiraliyeva S., Omarov B., Yoo P., Bolatbek M.
2021 Tech Science Press

Computers, Materials and Continua
2021, Vol. 70, Issue 1, pp. 915-934

In this paper, we propose a corpus for detecting religious extremism in social networks and open sources, and we compare several machine learning algorithms on the binary classification problem using a previously created corpus, thereby testing whether extremist messages in the Kazakh language can be detected. To this end, we trained models with six classic machine learning algorithms: Support Vector Machine, Decision Tree, Random Forest, K-Nearest Neighbors, Naive Bayes, and Logistic Regression. To increase the accuracy of detecting extremist texts, we used several feature types, including statistical features, TF-IDF, POS, and LIWC, and applied oversampling and undersampling techniques to handle imbalanced data. As a result, we achieved 98% accuracy in detecting religious extremism in Kazakh texts on the collected dataset. Testing the developed models on datasets of content commonly encountered in everyday life (“Jokes”, “News”, “Toxic content”, “Spam”, “Advertising”) also showed high rates of extremism detection.

Extremism, Machine learning, Natural language processing, NLP, Religious extremism, Social media, Social network


Al-Farabi Kazakh National University, Almaty, Kazakhstan
CSIS, Birkbeck College, University of London, London, United Kingdom
